Compare commits

..

3419 Commits

Author SHA1 Message Date
Chad Versace
8cb2b5c37a CHROMIUM: i965: Implement EGL_KHR_mutable_render_buffer
Tested with a low-latency handwriting application on Android Nougat on
the Chrome OS Pixelbook (codename Eve) with Kabylake.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: Ia816fa6b0a1158f81e5b63477451bf337c2001aa
2018-07-30 10:50:41 -07:00
Chad Versace
6158e78240 CHROMIUM: egl/android: Implement EGL_KHR_mutable_render_buffer
Specifically, implement the extension DRI_MutableRenderBufferLoader.
However, the loader enables EGL_KHR_mutable_render_buffer only if the
DRI driver implements its half of the extension,
DRI_MutableRenderBufferDriver.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: I7fe68a5a674d1707b1e7251d900b3affd5dd7660
2018-07-30 10:50:41 -07:00
Chad Versace
c689a23c77 CHROMIUM: egl/main: Add bits for EGL_KHR_mutable_render_buffer
A follow-up patch enables EGL_KHR_mutable_render_buffer for Android.
This patch is separate from the Android patch because I think it's
easier to review the platform-independent bits separately.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: I07470f2862796611b141f69f47f935b97b0e04a1
2018-07-30 10:50:41 -07:00
Chad Versace
0cbe8ad97d CHROMIUM: dri: Add param driCreateConfigs(mutable_render_buffer)
If set, then the config will have __DRI_ATTRIB_MUTABLE_RENDER_BUFFER,
which translates to EGL_MUTABLE_RENDER_BUFFER_BIT_KHR.

Not used yet.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: Icdf35794f3e9adf31e1f85740b87ce155efe1491
2018-07-30 10:50:41 -07:00
Chad Versace
f57e2bf44e CHROMIUM: dri: Define DRI_MutableRenderBuffer extensions
Define extensions DRI_MutableRenderBufferDriver and
DRI_MutableRenderBufferLoader. These are the two halves for
EGL_KHR_mutable_render_buffer.

Outside the DRI code there is one additional change.  Add
gl_config::mutableRenderBuffer to match
__DRI_ATTRIB_MUTABLE_RENDER_BUFFER. Neither are used yet.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: I4ca03d81e4557380b19c44d8d799a7cc9365d928
2018-07-30 10:50:40 -07:00
Chad Versace
3274c56585 CHROMIUM: egl/dri2: In dri2_make_current, return early on failure
This pulls an 'else' block into the function's main body, making the
code easier to follow.

Without this change, the upcoming EGL_KHR_mutable_render_buffer patch
transforms dri2_make_current() into spaghetti.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: I26be2b7a8e78a162dcd867a44f62d6f48b9a8e4d
2018-07-30 09:32:54 -07:00
Chad Versace
0a957c2822 CHROMIUM: egl: Simplify queries for EGL_RENDER_BUFFER
There exist *two* queryable EGL_RENDER_BUFFER states in EGL:
eglQuerySurface(EGL_RENDER_BUFFER) and
eglQueryContext(EGL_RENDER_BUFFER).

These changes eliminate potentially very fragile code in the upcoming
EGL_KHR_mutable_render_buffer implementation.

* eglQuerySurface(EGL_RENDER_BUFFER)

  The implementation of eglQuerySurface(EGL_RENDER_BUFFER) contained
  abstruse logic which required comprehending the specification
  complexities of how the two EGL_RENDER_BUFFER states interact.  The
  function sometimes returned _EGLContext::WindowRenderBuffer, sometimes
  _EGLSurface::RenderBuffer. Why? The function tried to encode the
  actual logic from the EGL spec. When did the function return which
  variable? Go study the EGL spec, hope you understand it, then hope
  Mesa mutated the EGL_RENDER_BUFFER state in all the correct places.
  Have fun.

  To simplify eglQuerySurface(EGL_RENDER_BUFFER), and to improve
  confidence in its correctness, flatten its indirect logic. For pixmap
  and pbuffer surfaces, simply return a hard-coded literal value, as the
  spec suggests. For window surfaces, simply return
  _EGLSurface::RequestedRenderBuffer.  Nothing difficult here.

* eglQueryContext(EGL_RENDER_BUFFER)

  The implementation of this suffered from the same issues as
  eglQuerySurface, and the solution is the same.  confidence in its
  correctness, flatten its indirect logic. For pixmap and pbuffer
  surfaces, simply return a hard-coded literal value, as the spec
  suggests. For window surfaces, simply return
  _EGLSurface::ActiveRenderBuffer.

BUG=b:77899911
TEST=No android-cts-7.1 regressions on Eve.

Change-Id: Ic5f2ab1952f26a87081bc4f78bc7fa96734c8f2a
2018-07-30 09:18:28 -07:00
Chad Versace
6ac5d83a6c FROMLIST: gallium/auxiliary: Fix Autotools on Android (v2)
Problem 1: u_debug_stack_android.cpp transitively included
"pipe/p_compiler.h", but src/gallium/include was missing from the C++
include path.

Problem 2: Add -std=c++11 to AM_CXXFLAGS. Android's libbacktrace headers
require C++11, but the Android toolchain (at least in the Chrome OS SDK)
does not enable C++11 by default.

v2: Add -std=c++11.

Cc: Gurchetan Singh <gurchetansingh@chromium.org>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2018-July/200979.html
(am from https://patchwork.freedesktop.org/patch/240805/)

Fixes the build for betty-arcnext.

BUG=b:77235812
TEST=emerge-betty-arcnext arc-mesa

Change-Id: Idd30463f3941d7b119207aa5afd7e37e69b0a5ae
Reviewed-on: https://chromium-review.googlesource.com/1149449
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
2018-07-25 01:54:44 +00:00
Chad Versace
53ba2797ee FROMLIST: drisw: Fix build on Android Nougat
In commit cf54bd5e8, dri_sw_winsys.c began using <sys/shm.h> to support
the new functions putImageShm, getImageShm in DRI_SWRastLoader. But
Android began supporting System V shared memory only in Oreo. Nougat has
no shm headers.

Fix the build by ifdef'ing out the shm code on Nougat.

Fixes: cf54bd5e8 "drisw: use shared memory when possible"
Cc: Marc-André Lureau <marcandre.lureau@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Archived-At: https://lists.freedesktop.org/archives/mesa-dev/2018-July/200705.html
(am from https://patchwork.freedesktop.org/patch/240173/)

Fixes the build for betty.

BUG=b:77235812
TEST=emerge-betty arc-mesa

Change-Id: I5fc7214cd30fdea49b69cdc860cefd622466c7da
Reviewed-on: https://chromium-review.googlesource.com/1146074
Tested-by: Chad Versace <chadversary@chromium.org>
Trybot-Ready: Chad Versace <chadversary@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Commit-Queue: Ilja H. Friedel <ihf@chromium.org>
2018-07-21 16:51:26 +00:00
Lepton Wu
58cbc48eac UPSTREAM: virgl: Fix flush in virgl_encoder_inline_write.
The current code is buggy: if there are only 12 dwords left in cbuf,
we emit a zero data length command which will be rejected by virglrenderer.
Fix it by calling flush in this case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry-picked from commit 04e278f793)

BUG=b:111521569

Change-Id: Ie952ece5e31ee94c5fb7e6ecb3e1de807a21763b
Reviewed-on: https://chromium-review.googlesource.com/1140419
Tested-by: Lepton Wu <lepton@chromium.org>
Trybot-Ready: Lepton Wu <lepton@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Commit-Queue: Lepton Wu <lepton@chromium.org>
2018-07-17 23:25:10 +00:00
Lepton Wu
c884f74c5f CHROMIUM: platform_android: use general fallback.
Instead of handling software fallback inside platform_android, just
let EGL framework to handle it. With this, we still can fall back
to software driver when hardware driver actually doesn't work.

BUG=b:77302150
TEST=manual - make sure betty still boot without virgl driver.

Change-Id: I5d514f67c9dc6f68661e03fd9fc9546acd7277bd
Reviewed-on: https://chromium-review.googlesource.com/1004006
Commit-Ready: Lepton Wu <lepton@chromium.org>
Tested-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
(cherry picked from commit b16fbdb135)
[chadversary: Simplify by dropping unneeded 'swrast' param.]

BUG=b:77235812
TEST=emerge-grunt arc-mesa; emerge-eve arc-mesa

Change-Id: Ib896c3f16dcd5fdddc0a7eb9e72caa7f2037c2f3
Reviewed-on: https://chromium-review.googlesource.com/1105708
Commit-Queue: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-07-12 18:17:24 +00:00
Emil Velikov
ca6dc3b7f4 FROMLIST: egl/android: remove HAL_PIXEL_FORMAT_BGRA_8888 support
As said in the EGL_KHR_platform_android extensions

    For each EGLConfig that belongs to the Android platform, the
    EGL_NATIVE_VISUAL_ID attribute is an Android window format, such as
    WINDOW_FORMAT_RGBA_8888.

Although it should be applicable overall.

Even though we use HAL_PIXEL_FORMAT here, those are numerically
identical to the  WINDOW_FORMAT_ and AHARDWAREBUFFER_FORMAT_ ones.

Barring the said format of course. That one is only listed in HAL.

Keep in mind that even if we try to use the said format, you'll get
caught by droid_create_surface(). The function compares the format of
the underlying window, against the NATIVE_VISUAL_ID of the config.

Unfortunatelly it only prints a warning, rather than error out, likely
leading to visual corruption.

While SDL will even call ANativeWindow_setBuffersGeometry() with the
wrong format, and conviniently ignore the [expected] failure.

Cc: mesa-stable@lists.freedesktop.org
Cc: Chad Versace <chadversary@google.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Tomasz Figa <tfiga@chromium.org>
(am from https://patchwork.freedesktop.org/patch/166176/)
(tfiga: Remove only respective EGL config, leave EGL image as is.)

BUG=b:33533853
TEST=dEQP-EGL.functional.*.rgba8888_window tests pass on eve

Change-Id: I8eacfe852ede88b24c1a45bff1445aacd86f6992
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/582263
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 42c96d393b)

BUG=b:77235812
TEST=emerge-grunt arc-mesa; emerge-eve arc-mesa

Change-Id: I74e91a3621fcc9c80633cf512a8d08782ac6e1d7
Reviewed-on: https://chromium-review.googlesource.com/1105707
Commit-Queue: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-07-12 18:17:23 +00:00
Stéphane Marchesin
f7f64ad0ac CHROMIUM: egl/android: Support opening render nodes from within EGL
This patch adds support for opening render nodes directly from within
display initialization, Instead of relying on private interfaces
provided by gralloc.

In addition to having better separation from gralloc and being able to
use different render nodes for allocation and rendering, this also fixes
problems encountered when using the same DRI FD for gralloc and Mesa,
when both stepped each over another because of shared GEM handle
namespace.

BUG=b:29036398
TEST=No significant regressions in dEQP inside the container

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/367215
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>
(cherry picked from commit 1171bddb74)

BUG=b:77235812
TEST=emerge-grunt arc-mesa; emerge-eve arc-mesa

Change-Id: Ib76d2f02077e3f679ca86d31fc999bff3a46edf0
Reviewed-on: https://chromium-review.googlesource.com/1105706
Commit-Queue: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-07-12 18:17:22 +00:00
Tapani Pälli
97636c85f9 FROMLIST: glcpp: Hack to handle expressions in #line directives.
GLSL ES 320 technically allows #line to have arbitrary expression trees
rather than integer literal constants, unlike the C and C++ preprocessor.
This is likely a completely unused feature that does not make sense.

However, Android irritatingly mandates this useless behavior, so this
patch implements a hack to try and support it.

We handle a single expression:

    #line <line number expression>

but we avoid handling the double expression:

    #line <line number expression> <source string expression>

because this is an ambiguous grammar.  Instead, we handle the case that
wraps both in parenthesis, which is actually well defined:

    #line (<line number expression>) (<source string expression>)

With this change following tests pass:

   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_expression_vertex
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_expression_fragment
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_and_file_expression_vertex
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_and_file_expression_fragment

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>

BUG=b:33352633
BUG=b:33247335
TEST=affected tests passing on CTS 7.1_r1 sentry

Reviewed-on: https://chromium-review.googlesource.com/427305
Tested-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Commit-Queue: Haixia Shi <hshi@chromium.org>
Trybot-Ready: Haixia Shi <hshi@chromium.org>
(cherry picked from commit 5ec83db3fc)

BUG=b:77235812
TEST=emerge-grunt arc-mesa; emerge-eve arc-mesa

Change-Id: Ie12a03e7642ac4f5d79e0b6b20e16d7653c3395c
Reviewed-on: https://chromium-review.googlesource.com/1105705
Commit-Queue: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-07-12 18:17:21 +00:00
Stéphane Marchesin
f408dc0e46 CHROMIUM: glsl: Avoid crash when overflowing the samplers array
Fixes a crash when we have too many samplers.

BUG=chromium:141901
TEST=by hand

Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
(cherry picked from commit 403ab71152)

BUG=b:77235812
TEST=emerge-grunt arc-mesa; emerge-eve arc-mesa

Change-Id: I8ffebdae3bdab68da4277193fe367959ab719796
Reviewed-on: https://chromium-review.googlesource.com/1105702
Commit-Queue: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-07-12 18:17:20 +00:00
Chad Versace
15656ccb9e UPSTREAM: anv/android: Fix type error in call to vk_errorf()
In a single call to vk_errorf() in the Android code, the arguments were
swapped. The bug has existed since day one. Chrome OS used to forgive
the warning, but it is now a compilation error.

CC: <mesa-stable@lists.freedesktop.org>
Fixes: 053d4c32 "anv: Implement VK_ANDROID_native_buffer (v9)"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit be5fc0d7f1)

BUG=b:77235812
TEST=emerge-eve arc-mesa

Change-Id: I0dde2e53eeb09af34d629b17d0692f0e00151fdf
Reviewed-on: https://chromium-review.googlesource.com/1133766
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
2018-07-12 18:13:03 +00:00
Chad Versace
09a9aafa81 UPSTREAM: anv/android: Fix Autotools build for VK_ANDROID_native_buffer
Changes to vk.xml and anv_entrypoints_gen.py broke the Autotools build
on Android. The changes undef'd the VK_ANDROID_native_buffer entrypoints
in anv_entrypoints.h.

Fix it with CPPFLAGS += -DVK_USE_PLATFORM_ANDROID_KHR.

CC: <mesa-stable@lists.freedesktop.org>
See-Also: 63525ba7 "android: enable VK_ANDROID_native_buffer"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 8e403bc959)

BUG=b:77235812
TEST=none (because anv doesn't build yet)

Change-Id: Ic3b8dbeab7ee8663ca6dcca0401dc6340dddf832
Reviewed-on: https://chromium-review.googlesource.com/1133765
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
2018-07-12 18:13:02 +00:00
Stéphane Marchesin
d346f5b8bd CHROMIUM: st/mesa: Do not flush front buffer on context flush
Make gallium work again with new chrome.

BUG=none
TEST=compile

Signed-off-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Prince Agyeman <prince.agyeman@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
(cherry picked from commit 8940a624ae)

BUG=b:77235812
TEST=emerge-grunt arc-mesa

Change-Id: I450f99e8e9c83a55626a28b80d5ed4ce884c8298
Reviewed-on: https://chromium-review.googlesource.com/1105701
Commit-Queue: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-07-12 18:13:02 +00:00
Chad Versace
ab8ac2c3e9 UPSTREAM: gallium: Fix automake for Android (v2)
Chromium OS uses Autotools and pkg-config when building Mesa for
Android. The gallium drivers were failing to find the headers and
libraries for zlib and Android's libbacktrace.

v2:
  - Don't add a check for zlib.pc. configure.ac already checks for
    zlib.pc elsewhere. [for tfiga]
  - Check for backtrace.pc separately from the other Android libs.
    [for tfiga]

Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit dc6665422a)

BUG=b:79872606
TEST=emerge-grunt arc-mesa
CQ-DEPEND=CL:1105634

Change-Id: If2e7fd953786723d6b7194f9174f5856a55c3b14
Reviewed-on: https://chromium-review.googlesource.com/1105700
Tested-by: Chad Versace <chadversary@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2018-07-12 18:13:00 +00:00
Tomasz Figa
d069832666 CHROMIUM: Add PRESUBMIT.cfg to disable various checks
This makes it so that we don't need to run 'repo upload --no-verify'.

BUG=b:26574868
TEST=ran upload on some CLs

Change-Id: I0b22a5ee0321dc454affeefcfac0eee75490bd6e
Reviewed-on: https://chromium-review.googlesource.com/780587
Tested-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Commit-Queue: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 7cc5c96a9a)

BUG=b:77235812
TEST=none

Change-Id: Ia4285d3e178787874f439b76aaf8717f50fd7df2
Reviewed-on: https://chromium-review.googlesource.com/1105699
Reviewed-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Commit-Queue: Ilja H. Friedel <ihf@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Trybot-Ready: Ilja H. Friedel <ihf@chromium.org>
2018-06-22 23:38:47 +00:00
Timothy Arceri
a9114b5e3e util: add allow_glsl_relaxed_es to drirc for Google Earth VR
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 12:09:56 +10:00
Timothy Arceri
725b1a406d mesa/util: add allow_glsl_relaxed_es driconfig override
This relaxes a number of ES shader restrictions allowing shaders
to follow more desktop GLSL like rules.

This initial implementation relaxes the following:

 - allows linking ES shaders with desktop shaders
 - allows mismatching precision qualifiers
 - always enables standard derivative builtins

These relaxations allow Google Earth VR shaders to compile.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 12:09:56 +10:00
Timothy Arceri
781c23ece6 util: add allow_glsl_builtin_const_expression to drirc for Google Earth VR
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 12:09:56 +10:00
Timothy Arceri
90dbab0f9a mesa/util: add allow_glsl_builtin_const_expression driconf override
Google Earth VR shaders uses builtins in constant expressions with
GLSL 1.10. That feature wasn't allowed until GLSL 1.20.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-19 12:09:56 +10:00
Timothy Arceri
de93f546a7 util: manually extract the program name from program_invocation_name
Glibc has the same code to get program_invocation_short_name. However
for some reason the short name gets mangled for some wine apps.

For example with Google Earth VR I get:

program_invocation_name:
"/home/tarceri/.local/share/Steam/steamapps/common/EarthVR/Earth.exe"

program_invocation_short_name:
"e"

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-19 12:09:56 +10:00
Bas Nieuwenhuizen
1a8501a9dd ac/surface: Set compressZ for stencil-only surfaces.
We HTILE compress stencil-only surfaces too.

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-19 02:52:01 +02:00
Jason Ekstrand
0146d79636 anv: Use a single global API patch version
The Vulkan API has only one patch version shared among all of the
major.minor versions.  We should also advertise the same patch version
regardless of major.minor.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106941
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-18 17:11:52 -07:00
Timothy Arceri
68bf94a8b0 radeonsi: enable OpenGL 3.3 compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-19 09:21:33 +10:00
Timothy Arceri
89a5d6f715 mesa: add ff fragment shader support for geom and tess shaders
This is required for compatibility profile support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-06-19 09:21:33 +10:00
Eric Anholt
e636199c1c v3d: Set the SO offsets correctly if we have to re-emit.
This should fix TF across a glFlush() or TF pause/restart.  Fixes
dEQP-GLES3.functional.transform_feedback.array.interleaved.lines.highp_float
and many, many others.
2018-06-18 14:54:16 -07:00
Marek Olšák
94178044d5 gallium/hud: = should rename the last added data source
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-18 17:53:15 -04:00
Rafael Antognolli
ba2c18763b anv: Disable constant buffer 0 being relative.
If we are on gen8+ and have context isolation support, just make that
constant buffer address be absolute, so we can use it for push UBOs too.

v2: Do not duplicate constant_buffer_0_is_relative flag (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-18 14:41:38 -07:00
Rafael Antognolli
be18d5a0ce anv/device: Check for kernel support of context isolation.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-18 14:41:38 -07:00
Rafael Antognolli
056214ebfc intel/genxml: Add bitmasks for CS_DEBUG_MODE2/INSTPM.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-18 14:41:38 -07:00
Alok Hota
a678f40e46 swr/rast: Clang-Format most rasterizer source code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-06-18 13:57:38 -05:00
Eric Engestrom
d85fef1e34 radv: fix reported number of available VGPRs
It's a bit late to round up after an integer division.

Fixes: de88979413 "radv: Implement VK_AMD_shader_info"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
2018-06-18 17:08:22 +01:00
Eric Engestrom
9a4bd6b45f mesa: add missing return in error path
Fixes: 67f40dadaa "mesa: add support for ARB_sample_locations"
Cc: Rhys Perry <pendingchaos02@gmail.com>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-06-18 16:19:48 +01:00
Bas Nieuwenhuizen
a3d93eec7c radv: Use less conservative approximation for context rolls.
Drops the number of time we set the scissor by 4x for F1 2017,
which results in a consistent performance improvement of about 4%.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-18 16:21:10 +02:00
Eric Engestrom
4d08c1e7d1 radv: fix bitwise check
Fixes: 922cd38172 "radv: implement out-of-order rasterization when it's safe on VI+"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-18 12:15:18 +01:00
Eric Engestrom
e8eb84826e meson: fix i965/anv/isl genX static lib names
Shouldn't make any functional difference, just that `liblibanv_gen90.a`
will now be called `libanv_gen90.a`.

Fixes: 3218056e0e "meson: Build i965 and dri stack"
Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-18 12:03:24 +01:00
Timothy Arceri
66673bef94 mesa: Unconditionally enable floating-point textures
ARB_texture_float references US Patent #6,650,327 [1] which has a filing date
of June 16 1998.

According to [2], patents filed after 1995 expire 20 years from the filing
date, giving an expiration of June 17 2018.

[1] https://www.google.com/patents/US6650327
[2] https://en.wikipedia.org/wiki/Term_of_patent_in_the_United_States

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-18 09:29:38 +10:00
Jose Maria Casanova Crespo
b8e099e7d5 intel/fs: shuffle_64bit_data_for_32bit_write is not used anymore
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
a4965842d6 intel/fs: Use new shuffle_32bit_write for all 64-bit storage writes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
a4d445b93c intel/fs: shuffle_32bit_load_result_to_64bit_data is not used anymore
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
71b319a285 intel/fs: Use shuffle_from_32bit_read for 64-bit FS load_input
As the previous use of shuffle_32bit_load_result_to_64bit_data
had a source/destination overlap for 64-bit. Now a temporary destination
is used for 64-bit cases to use shuffle_from_32bit_read that doesn't
handle src/dst overlaps.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
8003ae87f4 intel/fs: shuffle_from_32bit_read at load_per_vertex_input at TCS/TES
Previously, the shuffle function had a source/destination overlap that
needs to be avoided to use shuffle_from_32bit_read. As we can use for
the shuffle destination the destination of removed MOVs.

This change also avoids the internal MOVs done by the previous shuffle
to deal with possible overlaps.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
5565630f85 intel/fs: Use shuffle_from_32bit_read at VS load_input
shuffle_from_32bit_read manages 32-bit reads to 32-bit destination
in the same way that the previous loop so now we just call the new
function for all bitsizes, simplifying also the 64-bit load_input.

v2: Add comment about future 16-bit support (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
152bffb69b intel/fs: Use shuffle_from_32bit_read for 64-bit gs_input_load
This implementation avoids two unneeded MOVs for each 64-bit
component. One was done in the old shuffle, to avoid cases of
src/dst overlap but this is not the case. And the removed MOV
was already being being done in the shuffle.

Copy propagation wasn't able to remove them because shuffle
destination values are defined with partial writes because they
have stride == 2.

v2: Reword commit log summary (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
8b26a2d96d intel/fs: shuffle_from_32bit_read for 64-bit do_untyped_vector_read
do_untyped_vector_read is used at load_ssbo and load_shared.

The previous MOVs are removed because shuffle_from_32bit_read
can handle storing the shuffle results in the expected destination
just using the proper offset.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
c2297bdf19 intel/fs: Remove old 16-bit shuffle/unshuffle functions
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
fd3d8a8f79 intel/fs: Use shuffle_for_32bit_write for 16-bits store_ssbo
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
20e4732f7d intel/fs: Use shuffle_from_32bit_read to read 16-bit SSBO
Using shuffle_from_32bit_read instead of 16-bit shuffle functions
avoids the need of retype. At the same time new function are
ready for 8-bit type SSBO reads.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
a0891eabca intel/fs: Use shuffle_from_32bit_read at VARYING_PULL_CONSTANT_LOAD
shuffle_from_32bit_read can manage the shuffle/unshuffle needed
for different 8/16/32/64 bit-sizes at VARYING PULL CONSTANT LOAD.
To get the specific component the first_component parameter is used.

In the case of the previous 16-bit shuffle, the shuffle operation was
generating not needed MOVs where its results where never used. This
behaviour passed unnoticed on SIMD16 because dead_code_eliminate
pass removed the generated instructions but for SIMD8 they cound't be
removed because of being partial writes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
22c654941b intel/fs: New shuffle_for_32bit_write and shuffle_from_32bit_read
These new shuffle functions deal with the shuffle/unshuffle operations
needed for read/write operations using 32-bit components when the
read/written components have a different bit-size (8, 16, 64-bits).
Shuffle from 32-bit to 32-bit becomes a simple MOV.

shuffle_src_to_dst takes care of doing a shuffle when source type is
smaller than destination type and an unshuffle when source type is
bigger than destination. So this new read/write functions just need
to call shuffle_src_to_dst assuming that writes use a 32-bit
destination and reads use a 32-bit source.

As shuffle_for_32bit_write/from_32bit_read components take components
in unit of source/destination types and shuffle_src_to_dst takes units
of the smallest type component, we adjust components and first_component
parameters.

To enable this new functions it is needed than there is no
source/destination overlap in the case of shuffle_from_32bit_read.
That never happens on shuffle_for_32bit_write as it allocates a new
destination register as it was at shuffle_64bit_data_for_32bit_write.

v2: Reword commit log and add comments to explain why first_component
    and components parameters are adjusted. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo
a5665056e5 intel/fs: general 8/16/32/64-bit shuffle_src_to_dst function
This new function takes care of shuffle/unshuffle components of a
particular bit-size in components with a different bit-size.

If source type size is smaller than destination type size the operation
needed is a component shuffle. The opposite case would be an unshuffle.

Component units are measured in terms of the smaller type between
source and destination. As we are un/shuffling the smaller components
from/into a bigger one.

The operation allows to skip first_component number of components from
the source.

Shuffle MOVs are retyped using integer types avoiding problems with
denorms and float types if source and destination bitsize is different.
This allows to simplify uses of shuffle functions that are dealing with
these retypes individually.

Now there is a new restriction so source and destination can not overlap
anymore when calling this shuffle function. Following patches that migrate
to use this new function will take care individually of avoiding source
and destination overlaps.

v2: (Jason Ekstrand)
    - Rewrite overlap asserts.
    - Manage type_sz(src.type) == type_sz(dst.type) case using MOVs
      from source to dest. This works for 64-bit to 64-bits
      operation that on Gen7 as it doesn't support Q registers.
    - Explain that components units are based in the smallest type.
v3: - Fix unshuffle overlap assert (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-16 22:39:08 +02:00
Jose Fonseca
d882331f7a appveyor: Consume LLVM 5.0.1.
https://ci.appveyor.com/project/jrfonseca/mesa/build/47

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-06-16 18:09:20 +01:00
Bas Nieuwenhuizen
c4714f698b ac: Clear meminfo to avoid valgrind warning.
Somehow valgrind misses that the value is initialized by the ioctl.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-16 19:03:47 +02:00
Samuel Pitoiset
5917761e3d radv: fix emitting the TCS regs on GFX9
The primitive ID is NULL and this generates an invalid
select instruction which crashes because one operand is NULL.

This fixes crashes in The Long Journey Home, Quantum Break
and Just Cause 3 with DXVK.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106756
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-16 10:18:51 +02:00
Ian Romanick
355868dbfc nir: Document a couple instances of parent_instr
nir_ssa_def::parent_instr and nir_src::parent_instr have the same name,
but they mean really different things.  I choose to save the next person
the hour+ that I just spent figuring that out.  Even now that I know, I
doubt I'd notice in code review that someone typed foo->parent_instr
when they actually meant foo->ssa->parent_instr.

v2: Minor wording tweak in nir_ssa_def::parent_instr.  Suggested by
Jason.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-15 17:36:51 -07:00
Ian Romanick
4467040cb6 i965/fs: Propagate conditional modifiers from not instructions
Skylake
total instructions in shared programs: 14399081 -> 14399010 (<.01%)
instructions in affected programs: 26961 -> 26890 (-0.26%)
helped: 57
HURT: 0
helped stats (abs) min: 1 max: 6 x̄: 1.25 x̃: 1
helped stats (rel) min: 0.16% max: 0.80% x̄: 0.30% x̃: 0.18%
95% mean confidence interval for instructions value: -1.50 -0.99
95% mean confidence interval for instructions %-change: -0.35% -0.25%
Instructions are helped.

total cycles in shared programs: 532978307 -> 532976050 (<.01%)
cycles in affected programs: 468629 -> 466372 (-0.48%)
helped: 33
HURT: 20
helped stats (abs) min: 3 max: 360 x̄: 116.52 x̃: 98
helped stats (rel) min: 0.06% max: 3.63% x̄: 1.66% x̃: 1.27%
HURT stats (abs)   min: 2 max: 172 x̄: 79.40 x̃: 43
HURT stats (rel)   min: 0.04% max: 3.02% x̄: 1.48% x̃: 0.44%
95% mean confidence interval for cycles value: -81.29 -3.88
95% mean confidence interval for cycles %-change: -1.07% 0.12%
Inconclusive result (%-change mean confidence interval includes 0).

All Gen6+ platforms, except Ivy Bridge, had similar results. (Haswell shown)
total instructions in shared programs: 12973897 -> 12973838 (<.01%)
instructions in affected programs: 25970 -> 25911 (-0.23%)
helped: 55
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.07 x̃: 1
helped stats (rel) min: 0.16% max: 0.62% x̄: 0.28% x̃: 0.18%
95% mean confidence interval for instructions value: -1.14 -1.00
95% mean confidence interval for instructions %-change: -0.32% -0.24%
Instructions are helped.

total cycles in shared programs: 410355841 -> 410352067 (<.01%)
cycles in affected programs: 578454 -> 574680 (-0.65%)
helped: 47
HURT: 5
helped stats (abs) min: 3 max: 360 x̄: 85.74 x̃: 18
helped stats (rel) min: 0.05% max: 3.68% x̄: 1.18% x̃: 0.38%
HURT stats (abs)   min: 2 max: 242 x̄: 51.20 x̃: 4
HURT stats (rel)   min: <.01% max: 0.45% x̄: 0.15% x̃: 0.11%
95% mean confidence interval for cycles value: -104.89 -40.27
95% mean confidence interval for cycles %-change: -1.45% -0.66%
Cycles are helped.

Ivy Bridge
total instructions in shared programs: 11679351 -> 11679301 (<.01%)
instructions in affected programs: 28208 -> 28158 (-0.18%)
helped: 50
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.12% max: 0.54% x̄: 0.23% x̃: 0.16%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.27% -0.19%
Instructions are helped.

total cycles in shared programs: 257445362 -> 257444662 (<.01%)
cycles in affected programs: 419338 -> 418638 (-0.17%)
helped: 40
HURT: 3
helped stats (abs) min: 1 max: 170 x̄: 65.05 x̃: 24
helped stats (rel) min: 0.02% max: 3.51% x̄: 1.26% x̃: 0.41%
HURT stats (abs)   min: 2 max: 1588 x̄: 634.00 x̃: 312
HURT stats (rel)   min: 0.05% max: 2.97% x̄: 1.21% x̃: 0.62%
95% mean confidence interval for cycles value: -97.96 65.41
95% mean confidence interval for cycles %-change: -1.56% -0.62%
Inconclusive result (value mean confidence interval includes 0).

No changes on Iron Lake or GM45.

v2: Move 'if (cond != BRW_CONDITIONAL_Z && cond != BRW_CONDITIONAL_NZ)'
check outside the loop.  Suggested by Iago.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
f2d8bb7a7b i965/fs: Rearrange code to remove most of the gotos
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
77f269bb56 i965/fs: Refactor propagation of conditional modifiers from compares to adds
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
22f9fbc0d9 i965/vec4: Optimize OR with 0 into a MOV
All of the affected shaders are geometry shaders... the same ones from
the similar fs changes.

The "No changes on any other platforms" comment below is not quite
right.  Without the previous change to register coalescing, this
optimization caused quite a few regressions in tests that either used
gl_ClipVertex or used different interpolation modes.  I observed that
with both patches applied,
glsl-1.10/execution/interpolation/interpolation-none-gl_BackSecondaryColor-smooth-vertex.shader_test
was one instruction shorter.  I suspect other shaders would be similarly
affected.  Since this is all based on NOS, shader-db does not reflect
it.

Haswell
total instructions in shared programs: 12954955 -> 12954918 (<.01%)
instructions in affected programs: 3603 -> 3566 (-1.03%)
helped: 37
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.21% max: 2.50% x̄: 1.99% x̃: 2.50%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -2.30% -1.69%
Instructions are helped.

total cycles in shared programs: 410012108 -> 410012098 (<.01%)
cycles in affected programs: 3540 -> 3530 (-0.28%)
helped: 5
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28%
95% mean confidence interval for cycles value: -2.00 -2.00
95% mean confidence interval for cycles %-change: -0.28% -0.28%
Cycles are helped.

Ivy Bridge
total instructions in shared programs: 11679387 -> 11679351 (<.01%)
instructions in affected programs: 3292 -> 3256 (-1.09%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.21% max: 2.50% x̄: 2.04% x̃: 2.50%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -2.34% -1.74%
Instructions are helped.

No changes on any other platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
e6a9bd97b9 i965/vec4: Don't register coalesce into source of VS_OPCODE_UNPACK_FLAGS_SIMD4X2
This prevents regressions in a bunch of clipping and interpolation tests
caused by the next patch (i965/vec4: Optimize OR with 0 into a MOV).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Ian Romanick
284b563fb0 i965/fs: Optimize OR with 0 into a MOV
fs_visitor::set_gs_stream_control_data_bits generates some code like
"control_data_bits | stream_id << ((2 * (vertex_count - 1)) % 32)" as
part of EmitVertex.  The first time this (dynamically) occurs in the
shader, control_data_bits is zero.  Many times we can determine this
statically and various optimizations will collaborate to make one of the
OR operands literal zero.

Converting the OR to a MOV usually allows it to be copy-propagated away.
However, this does not happen in at least some shaders (in the assembly
output of shaders/closed/UnrealEngine4/EffectsCaveDemo/301.shader_test,
search for shl).

All of the affected shaders are geometry shaders.

Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14375452 -> 14375413 (<.01%)
instructions in affected programs: 6422 -> 6383 (-0.61%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.14% max: 2.56% x̄: 1.91% x̃: 2.56%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -2.26% -1.57%
Instructions are helped.

total cycles in shared programs: 531981179 -> 531980555 (<.01%)
cycles in affected programs: 27493 -> 26869 (-2.27%)
helped: 39
HURT: 0
helped stats (abs) min: 16 max: 16 x̄: 16.00 x̃: 16
helped stats (rel) min: 0.60% max: 7.92% x̄: 5.94% x̃: 7.92%
95% mean confidence interval for cycles value: -16.00 -16.00
95% mean confidence interval for cycles %-change: -6.98% -4.90%
Cycles are helped.

No changes on earlier platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 17:22:27 -07:00
Eric Anholt
4106f6ce54 v3d: Handle a no-intersection scissor even if it's outside of the VP.
The min/maxes ended up producing a negative clip width/height for
dEQP-GLES3.functional.fragment_ops.scissor.outside_render_line.  Just make
sure they stay at 0 (or v3d 3.x's workaround) if that happens.
2018-06-15 16:09:39 -07:00
Eric Anholt
9aa670e52a v3d: Use the proper depth texture type for sampling.
Fixes failing tests in dEQP-GLES3.functional.texture.shadow
2018-06-15 16:09:39 -07:00
Eric Anholt
778594ae12 v3d: Limit shader threading according to our maximum TMU fifo usage.
Fixes simulator assertion failures in
dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment
and similar complicated cases.
2018-06-15 16:09:39 -07:00
Eric Anholt
e130ada243 v3d: Fix shaders using pixel center W but no varyings.
The docs called this field "uses both center W and centroid W", but
actually it's "do you need center W even if varyings don't obviously call
for it?"

Fixes dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_w
2018-06-15 16:09:39 -07:00
Dylan Baker
0d4f338a11 docs: Update release-notes and calendar 2018-06-15 13:53:25 -07:00
Dylan Baker
3c454fc84a docs: Add release notes for 18.1.2 2018-06-15 13:52:44 -07:00
Rafael Antognolli
9e1f208795 intel/aubinator: Use int to store getopt_long flags.
getopt_long flag parameter is an int pointer, so if we use bool to store
those values, when getopt_long writes to one of them, it might end up
overwriting the next one.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-15 09:03:10 -07:00
Samuel Pitoiset
f8e2c4c57c Revert "radv: always set/load both depth and stencil clear values"
This fixes a rendering regression with RoTR.

This reverts commit 4bdad9fadd.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 16:52:06 +02:00
Samuel Pitoiset
a2f6e72138 radv: don't check for linear images in emit_fast_color_clear()
We don't enable CMASK for linear surfaces and addrlib only
enables DCC for tiling surfaces.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:12 +02:00
Samuel Pitoiset
3befac52db radv: allow RADV_PERFTEST=dccmsaa on GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:10 +02:00
Samuel Pitoiset
bfca15e16a radv: add RADV_DEBUG=checkir
This allows to run the LLVM verifier pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:08 +02:00
Samuel Pitoiset
706d51de7f radv: update ZRANGE_PRECISION in radv_update_bound_fast_clear_ds()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:06 +02:00
Samuel Pitoiset
fa8bc821a8 radv: clean up radv_{set,load}_depth_clear_regs() helpers
And replace _regs by _metadata because it makes more sense.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:04 +02:00
Samuel Pitoiset
4bdad9fadd radv: always set/load both depth and stencil clear values
I don't think that matter much to emit both values and that
makes the code a bit simpler.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:02 +02:00
Samuel Pitoiset
2193a6a828 radv: update the fast ds clear values only if the image is bound
It's unnecessary to update the fast depth/stencil clear values
if the fast cleared depth/stencil image isn't currently bound.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:54:00 +02:00
Samuel Pitoiset
be794fa26b radv: clean up radv_{set,load}_color_clear_regs() helpers
And replace _regs by _metadata because it makes more sense.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:53:58 +02:00
Samuel Pitoiset
d7b772abb4 radv: update the fast color clear values only if the image is bound
It's unnecessary to update the fast color clear values if the
fast cleared color image isn't currently bound.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 15:53:55 +02:00
Christian Gmeiner
efae127993 util/bitset: include util/macro.h
BITSET_FFS(x) macro makes use of ARRAY_SIZE(x) macro which is
defined in util/macro.h. Include it directy to make usage more
straightforward.

Fixes: 692bd4a1ab ("util: replace Elements() with ARRAY_SIZE()")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-15 11:26:30 +01:00
Lukas Rusak
4cfc4cef80 meson: fix private libs when building without glx
I noticed that the generated pkg-config files will include
glx and x11 dependencies even when x11 isn't a selected platform.

This fixes the private libs and was tested by building kmscube

V2:
  - check if gallium-xlib is being used for glx

Fixes: 108d257a16 "meson: build libEGL"
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-15 10:43:22 +01:00
Rhys Perry
30f1ab7a59 docs: document addition of GL_ARB_sample_locations for nvc0
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
2018-06-14 20:09:45 -06:00
Rhys Perry
66ca7e400b nvc0: add support for programmable sample locations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2018-06-14 20:09:45 -06:00
Rhys Perry
9f217facbd st/mesa: add support for ARB_sample_locations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2018-06-14 20:09:45 -06:00
Rhys Perry
51a221e378 gallium: add support for programmable sample locations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2018-06-14 20:09:45 -06:00
Rhys Perry
67f40dadaa mesa: add support for ARB_sample_locations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2018-06-14 20:09:45 -06:00
Eric Anholt
cd2e673abc v3d: Fix polygon offset for Z16 buffers.
Fixes:
dEQP-GLES3.functional.polygon_offset.fixed16_displacement_with_units
dEQP-GLES3.functional.polygon_offset.fixed16_render_with_units
2018-06-14 17:03:16 -07:00
Eric Anholt
d91e06a065 v3d: Fix configuration setup of mixed f32 and f16 render targets.
Fixes dEQP-GLES3.functional.fragment_out.random.26 and 6 others.
2018-06-14 16:52:25 -07:00
Eric Anholt
6784aa9870 v3d: Don't set the first_ez_state to DISABLED if after only UNDECIDED draws.
We need to have the RCL start with EZ enabled, since those undecided draws
had EZ enabled.  But we do need to update from UNDECIDED to LT or GT as
necessary still.

Fixes many simulator assertion fails in deqp
fragment_ops/interaction/basic_shader/*
2018-06-14 16:52:25 -07:00
Eric Anholt
9080642449 v3d: Use the right size for v3d 4.x TEXTURE_SHADER_STATE BO.
This doesn't really matter, since they both get rounded up to 4096.
2018-06-14 16:52:25 -07:00
Eric Anholt
31548187cf v3d: Add static asserts for other packed packet sizes. 2018-06-14 16:52:25 -07:00
Eric Anholt
0eef4d7f8f v3d: Fix the size of the packed attribute state.
Fixes segfaults in dEQP-GLES3.functional.vertex_array_objects.all_attributes.
2018-06-14 16:52:25 -07:00
Eric Anholt
7d8fe50af3 v3d: Remove some unused context fields from vc4. 2018-06-14 16:52:25 -07:00
Eric Anholt
48011c42aa v3d: Remove unused QUNIFORM_STENCIL left over from vc4. 2018-06-14 16:52:25 -07:00
Eric Anholt
4564537222 v3d: Use our #define for max attributes in shader caps. 2018-06-14 16:52:25 -07:00
Eric Anholt
a40bc33b11 v3d: Fix undefined results for a swap_color_rb RT from a float shader output.
Fixes segfaults and undefined behavior in
dEQP-GLES3.functional.fragment_out.basic.fixed.srgb8_alpha8_lowp_float
2018-06-14 16:52:25 -07:00
Dave Airlie
600d34c822 radv: remove multisample bit from shader key.
This wasn't being used anywhere inside the shader from what I can see.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-15 09:33:20 +10:00
Kenneth Graunke
f6898f2b55 intel/compiler: Properly consider UBO loads that cross 32B boundaries.
The UBO push analysis pass incorrectly assumed that all values would fit
within a 32B chunk, and only recorded a bit for the 32B chunk containing
the starting offset.

For example, if a UBO contained the following, tightly packed:

   vec4 a;  // [0, 16)
   float b; // [16, 20)
   vec4 c;  // [20, 36)

then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
which means that we ought to record two 32B chunks in the bitfield.

Similarly, dvec4s would suffer from the same problem.

v2: Rewrite the accounting, my calculations were wrong.
v3: Write a comment about partial values (requested by Jason).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v3]
2018-06-14 14:58:59 -07:00
Ian Romanick
37bd9ccd21 glsl: Don't copy propagate elements from SSBO or shared variables either
Since SSBOs can be written by a different GPU thread, copy propagating a
read can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

The same shader was helped by this patch and the previous.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399119 -> 14399113 (<.01%)
instructions in affected programs: 683 -> 677 (-0.88%)
helped: 1
HURT: 0

total cycles in shared programs: 532973113 -> 532971865 (<.01%)
cycles in affected programs: 524666 -> 523418 (-0.24%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
2018-06-14 11:28:12 -07:00
Ian Romanick
461a5c899c glsl: Don't copy propagate from SSBO or shared variables either
Since SSBOs can be written by other GPU threads, copy propagating a read
can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399120 -> 14399119 (<.01%)
instructions in affected programs: 684 -> 683 (-0.15%)
helped: 1
HURT: 0

total cycles in shared programs: 532978931 -> 532973113 (<.01%)
cycles in affected programs: 530484 -> 524666 (-1.10%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
2018-06-14 11:26:33 -07:00
Lukas Rusak
1d92d6486a meson: only build vl_winsys_dri.c when x11 platform is used
This seems to have been missed in the move from autotools

This fixes the following build issue:

../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory
 #include <X11/Xlib-xcb.h>
          ^~~~~~~~~~~~~~~~

Fixes: b1b65397d0
       ("meson: Build gallium auxiliary")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-14 10:34:51 -07:00
Brian Paul
b9e6438adf st/mesa: add missing switch cases in glsl_to_tgsi_visitor::visit()
To silence compiler warning about unhandled switch cases.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-06-14 11:29:51 -06:00
Bas Nieuwenhuizen
41dabdc475 radv: Fix output for sparse MRTs.
We need to init the cb_shader_format correctly with the changed
col_format, so this moves the col_format adjustment to before the
adjustment to before the cb_shader_mask gets generated.

Fixes: 06d3c65098 "radv: fix a GPU hang when MRTs are sparse"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-14 11:48:24 +02:00
Samuel Pitoiset
68dead112e radv: update the ZRANGE_PRECISION value for the TC-compat bug
On GFX8+, there is a bug that affects TC-compatible depth surfaces
when the ZRange is not reset after LateZ kills pixels.

The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match
the last fast clear value. Because the value is set to 1 by default,
we only need to update it when clearing Z to 0.0.

We also need to set the depth clear regs and to update
ZRANGE_PRECISION when initializing a TC-compat depth image to 0.

Original patch from James Legg.

This fixes random CTS fails with
dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-14 11:38:29 +02:00
Samuel Iglesias Gonsálvez
183adc51f8 anv: reduce maxFragmentInputComponents
If the application asks for the maximum number of fragment input
components (128), use all of them plus some builtins that are
passed in the VUE, then we exceed the maximum number of used VUE
slots (32) and we break one assert that checks this limit.

Also, with separate shader objects, we add CLIP_DIST0, CLIP_DIST1
builtins in brw_compute_vue_map() because we don't know if
gl_ClipDistance is going to be read/write by an adjacent stage.

Fixes VK-GL-CTS CL#2569.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-14 09:54:28 +02:00
Marek Olšák
6d671078a8 radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers
This fixes:
GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-06-13 22:00:12 -04:00
Marek Olšák
a4312742a5 radeonsi/gfx9: update & clean up a DPBB heuristic
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:43 -04:00
Marek Olšák
47b780be21 radeonsi/gfx9: set POPS_DRAIN_PS_ON_OVERLAP due to a hw bug
This may not be needed yet, but let's set it now.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:42 -04:00
Marek Olšák
a152ca70f2 radeonsi/gfx9: remove UINT_MAX array terminators in bin size tables
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:40 -04:00
Marek Olšák
cd0be6cdc8 radeonsi/gfx9: update bin sizes
This is based on our docs (recently updated), not amdvlk.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:39 -04:00
Marek Olšák
2f51081a93 radeonsi/gfx9: update primitive binning code for EQAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:37 -04:00
Marek Olšák
22e994bb75 radeonsi: assume that rasterizer state is non-NULL in draw_vbo
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:36 -04:00
Marek Olšák
f3b3ee6974 radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:34 -04:00
Marek Olšák
d6974feb90 radeonsi: move the guardband registers into a separate state atom
They have a different frequency of updates and don't change when scissors
change.

I think this even fixes something in si_update_vs_viewport_state.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:31 -04:00
Marek Olšák
68b1c669e7 radeonsi/gfx9: implement the scissor bug workaround without performance drop
This might improve performance on Vega10 and Raven.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:27 -04:00
Marek Olšák
73b0d10152 radeonsi: don't set VGT_LS_HS_CONFIG if it doesn't change
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:25 -04:00
Marek Olšák
28ee825e19 radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs
same as amdvlk.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:23 -04:00
Marek Olšák
99e0ba6868 radeonsi: record CLIPVERTEX output usage properly for compatibility profiles
This was missed when adding CLIPVERTEX support into GS & tess.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:20 -04:00
Marek Olšák
47a57a709d radeonsi: fix FBFETCH with 2D MSAA arrays
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:17 -04:00
Marek Olšák
e5e57c3a5e ac: handle undefined EQAA samples in ac_apply_fmask_to_sample
RADV might wanna use this helper too.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-06-13 22:00:12 -04:00
Marek Olšák
a2d4c8ff6d radeonsi: return real memory usage instead of per-process usage
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-13 21:47:36 -04:00
Marek Olšák
95ecde42eb ac/gpu_info: report real total memory sizes
The change from MIN2 to MAX2 is intentional.

Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-13 21:47:36 -04:00
Dave Airlie
f11b664f48 docs: mark virgl GL 4.0 features as complete.
virgl should now expose GL4.1 where it can.
2018-06-14 10:38:11 +10:00
Dave Airlie
7b6f2704eb virgl: add ARB_tessellation_shader support. (v2)
This should add all the pieces to enable tess shaders on virgl.

v2: fixup transform to handle tess and strip out precise.
set default for max patch varyings to work around issue when
tess gets enabled from v1 caps but v2 caps aren't in place. (Elie)

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-06-14 10:36:31 +10:00
Dave Airlie
babd1d526b glsl: allow standalone semicolons outside main()
GLSL 4.60 offically added this but games and older CTS suites actually
had shaders that did this, we may as well enable it everywhere.

Adding stable because it appears apps in the wild do this.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
2018-06-14 10:21:51 +10:00
Samuel Pitoiset
51e23d3419 radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8
This causes rendering issues in Shadow Warrior 2 with DXVK.

Cc: mesa-stable@lists.freedesktop.org
Fixes: ccc64f3133 ("radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-13 20:30:04 +02:00
Andrew Galante
baf16b2ea3 configure.ac: Test for __atomic_add_fetch in atomic checks
Some platforms have 64-bit __atomic_load_n but not 64-bit
__atomic_add_fetch, so test for both of them.

Bug: https://bugs.gentoo.org/655616
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-13 10:09:46 -07:00
Andrew Galante
9d547a7617 meson: Test for __atomic_add_fetch in atomic checks
Some platforms have 64-bit __atomic_load_n but not 64-bit
__atomic_add_fetch, so test for both of them.

Bug: https://bugs.gentoo.org/655616
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-13 10:09:46 -07:00
Matt Turner
b29b5a82a1 meson: Fix -latomic check
Commit 54ba73ef10 (configure.ac/meson.build: Fix -latomic test) fixed
some checks for -latomic, and then commit 54bbe600ec (configure.ac:
rework -latomic check) further extended the fixes in configure.ac but
not in Meson. This commit extends those fixes to the Meson tests.

Fixes: 54bbe600ec (configure.ac: rework -latomic check)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-13 10:09:46 -07:00
Dylan Baker
9cc577761f meson: Remove various completed todos
v3: - Remove "won't do" todos, so only completed todo's are now removed.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
2018-06-13 10:07:03 -07:00
Dylan Baker
0ce3f3538b meson: Make use of optional modules
meson 0.43 gained support for optional modules, which clover wold like
to use. Since we require 0.44.1 now we can rely on them being available
for clover.

compile tested only.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:58 -07:00
Dylan Baker
34bbb24ce7 meson: Add support for ppc assembly/optimizations
v2: - Use -mpower8-vector in compiler test for altivec
    - rename altivec option to power8
    - reword power8 option description to be more clear, originally I
      had made it a boolean, but replaced it with an auto option.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:54 -07:00
Dylan Baker
e26af22143 meson: Add support for SPARC assembly
This was blindly copied from autotools and tested by a helpful gentoo
user.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:25 -07:00
Dylan Baker
6eaa013685 meson: Set include dirs for asm
v2: - split this from the next patch
    - Only include x86-64 and not x86 when buiding x86_64

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:23 -07:00
Dylan Baker
65e447c5df meson: move cc and cpp definitions to top of main meson.build
This just makes using cc and cpp easier.

v2: - Add this patch to fix altivec

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-13 10:06:16 -07:00
Jason Ekstrand
51376cd749 Revert "intel/compiler: Properly consider UBO loads that cross 32B boundaries."
This reverts commit b8fa847c2e.

This broke about 30k Vulkan CTS tests.
2018-06-13 09:23:55 -07:00
Kenneth Graunke
b8fa847c2e intel/compiler: Properly consider UBO loads that cross 32B boundaries.
The UBO push analysis pass incorrectly assumed that all values would fit
within a 32B chunk, and only recorded a bit for the 32B chunk containing
the starting offset.

For example, if a UBO contained the following, tightly packed:

   vec4 a;  // [0, 16)
   float b; // [16, 20)
   vec4 c;  // [20, 36)

then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
which means that we ought to record two 32B chunks in the bitfield.

Similarly, dvec4s would suffer from the same problem.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-06-13 02:07:58 -07:00
Ross Burton
3c288da5ee drivers/dri/i965: add missing #include
brw_bufmgr.h uses time_t without include time.h, so the build fails under musl.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-12 12:08:30 +01:00
Mauro Rossi
fb9ab2fbd3 anv/android: Use an address for each anv_image plane
Fixes to avoid building error after change in image->planes[] structure,
{bo,bo_offset} has to be replaced by address.{bo,offset}
and update is needed also in the assert() for debug builds.

external/mesa/src/intel/vulkan/anv_android.c:188:21:
error: no member named 'bo' in 'struct anv_image::(anonymous at external/mesa/src/intel/vulkan/anv_private.h:2647:4)'
   image->planes[0].bo = bo;
   ~~~~~~~~~~~~~~~~ ^
1 error generated.

Fixes: bf34ef16ac ("anv: Use an address for each anv_image plane")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-12 11:17:43 +03:00
Mauro Rossi
a1220e7311 anv/android: Set the BO flags in bo_cache_import (v2)
Changes to avoid building error:

external/mesa/src/intel/vulkan/anv_android.c:131:72:
error: too few arguments to function call, expected 5, have 4
   result = anv_bo_cache_import(device, &device->bo_cache, dma_buf, &bo);
            ~~~~~~~~~~~~~~~~~~~                                        ^
1 error generated.

(v2) Set the correct bo_flags based on support of 48bit addresses and soft-pin

Fixes: b0d50247a7 ("anv/allocator: Set the BO flags in bo_cache_alloc/import")
Fixes: e7d0378bd9 ("anv: Soft-pin client-allocated memory")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-12 11:16:39 +03:00
Kenneth Graunke
0d5329d626 anv: Disable __gen_validate_value if NDEBUG is set.
We were enabling undefined memory checking for genxml values based on
Valgrind being installed at build time, even for release builds.  This
generates piles and piles of assembly whenever you touch genxml.

With gcc 7.3.1 and -O3 and -march=native on a Kabylake with Valgrind
installed at build time:

      text    data    bss     dec    hex filename
   5978385  262884  13488 6254757 5f70a5 libvulkan_intel.so
   3799377  262884  13488 4075749 3e30e5 libvulkan_intel.so

That's a 36% reduction in text size.

Fixes: 047ed02723 (vk/emit: Use valgrind to validate every packed field)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-11 14:55:32 -07:00
Eric Engestrom
06e8771dec README: wording fix for previous commit
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-11 18:34:58 +01:00
Eric Engestrom
d9f54dceca README: add link to WhosWho for IRC nicks
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-11 18:33:12 +01:00
Eric Engestrom
eadc068406 add project README
Now that we're using GitLab, let's take advantage of the "landing page"
README feature with some minimal information, mostly to point people to
the right resources.

Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-11 18:02:35 +01:00
Eric Engestrom
e43c012433 i965: fix resource leak
v2: intel_miptree_release() already takes care of the planes, no need
    to hand-code the loop (Lionel)

Coverity ID: 1436909
Fixes: 3352f2d746 "i965: Create multiple miptrees for planar YUV images"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2018-06-11 14:54:23 +01:00
Rob Clark
55d1a77c29 freedreno/ir3: use pipe_image_view's cpp
At least for PIPE_BUFFER, we could get the resource used as (for
example) R32F imageBuffer.  So using cpp=1 from the rsc is wrong.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
9bb90a3255 freedreno/ir3: fix image dimensions offset
copy-pasta fail from how SSBO sizes are handled.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
e9fc9c16c9 freedreno/a5xx: correct image/ssbo offset
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
132e5b0b34 freedreno/ir3: use saml always if we have lod
In some cases we get plain tex opcodes (but w/ a lod argument).. in this
case always use the saml instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
cf5dda3349 freedreno/ir3: don't cp absneg into meta:fi
If using a fanin (collect) to collect of consecutive registers together,
we can CP mov's into the fanin, but not (abs) or (neg).  No places that
allow those modifiers are consuming a fanin anyways.  But this caused an
absneg to be lost between a ldgb and stgb for shaders like:

  outputs[n] = abs(input[n])

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
39e7a39e91 freedreno/ir3: rework size/type conversion instructions
With 8b and 16b, there are a lot more to handle.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
a52e698219 freedreno/ir3: propagate HALF flag across fanout
If we have a fanout (split) meta instruction to split the result of a
vector instruction, propagate the HALF flag back to the original
instruction.  Otherwise result ends up in a full precision register
while instruction(s) that use the result look in a half-precision
register.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
fc1690c9d9 freedreno/a5xx: add sample-id/sample-mask-in
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
619d2317cd freedreno/ir3: add sample-id/sample-mask-in
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
a49c87956e freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Rob Clark
067d89c2cd freedreno/ir3: image atomics use image-store path
image reads are handled via tex state, whereas image writes and atomics
are handled via SSBO state block.  Previously we were only considering
image write, and not image atomics which also uses the SSBO state block.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-06-11 09:06:03 -04:00
Kyle Brenneman
41642bdbca egl/glvnd: Fix a segfault in eglGetProcAddress.
If FindProcIndex in egldispatchstubs.c is called with a name that's less than
the first entry in the array, it would end up trying to store an index of -1 in
an unsigned integer, wrap around to 2^32, and then crash when it tries to look
that up.

Change FindProcIndex so that it uses bsearch(3) instead of implementing its own
binary search, like the GLX equivalent FindGLXFunction does.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-11 12:17:07 +01:00
Jordan Justen
e266b32059 mesa/program_binary: add implicit UseProgram after successful ProgramBinary
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106810
Fixes: b4c37ce214 "i965: Add ARB_get_program_binary support using nir_serialization"
Ref: 3fe8d04a6d "mesa: don't always set _NEW_PROGRAM when linking"
Ref: c505d6d852 "mesa: use gl_program for CurrentProgram rather than gl_shader_program"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-10 21:12:46 -07:00
Dave Airlie
525cfe5dab features.txt: update virgl GL4.1 status.
All the features for GL4.1 are done (64-bit attribs were part of
the fp64 enable).

Once tessellation shaders land this will be advertised
2018-06-11 10:49:14 +10:00
Dave Airlie
77d7d7acab virgl: enable ARB_gpu_shader_fp64
This enables ARB_gpu_shader_fp64 if the host provides it.

Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-06-11 08:35:03 +10:00
Samuel Pitoiset
135e4d434f radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold
Workaround for bug in llvm that causes the GPU to hang in presence
of nested loops because there is an exec mask issue. The proper
solution is to fix LLVM but this might require a bunch of work.

This fixes a bunch of GPU hangs that happen with DXVK.

Vega10:
Totals from affected shaders:
SGPRS: 110456 -> 110456 (0.00 %)
VGPRS: 122800 -> 122800 (0.00 %)
Spilled SGPRs: 7478 -> 7478 (0.00 %)
Spilled VGPRs: 36 -> 36 (0.00 %)
Code Size: 9901104 -> 9922928 (0.22 %) bytes
Max Waves: 7143 -> 7143 (0.00 %)

Code size slightly increases because it inserts more branch
instructions but that's expected. I don't see any real performance
changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105613
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-09 14:16:49 +02:00
Samuel Pitoiset
94706f0de4 radv: fix missing ZRANGE_PRECISION(1) for GFX9+
ZRANGE_PRECISION(1) seems to be the default optimal value, but
it was only set for VI and older chips.

This fixes a rendering issue with Banished through DXVK, and
might fix more than that.

There is still the ZRANGE_PRECISION bug that we need to handle
but that can be fixed later.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-09 10:57:01 +02:00
Gustavo Lima Chaves
7dfaf025c5 anv: enable VK_EXT_shader_stencil_export
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-08 11:16:01 -07:00
Gustavo Lima Chaves
7cc5178bba spirv: add/hookup SpvCapabilityStencilExportEXT
v2:
An attempt to support SpvExecutionModeStencilRefReplacingEXT's behavior
also follows, with the interpretation to said mode being we prevent
writes to the built-in FragStencilRefEXT variable when the execution
mode isn't set.

v3:
A more cautious reading of 1db44252d0 led
me to a missing change that would stop (what I later discovered were)
GPU hangs on the CTS test written to exercise this.

v4:
Turn FragStencilRefEXT decoration usage without StencilRefReplacingEXT
mode into a warning, instead of trying to make the variable read-only.
If we are to follow the originating extension on GL, the built-in
variable in question should never be readable anyway.

v5/v6: rebases.

v7:
Fix check for gen9 lost in rebase. (Ilia)
Reduce the scope of the bool used to track whether
SpvExecutionModeStencilRefReplacingEXT was used. Was in shader_info,
moved to vtn_builder. (Jason)

v8:
Assert for fragment shader handling StencilRefReplacingEXT execution
mode. (Caio)
Remove warning logic, since an entry point might not have
StencilRefReplacingEXT execution mode, but the global output variable
might still exist for another entry point in the module. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-08 11:15:37 -07:00
Eric Anholt
22cc83cf87 travis: Add the v3d driver to the automake build.
Hopefully this reduces the number of fixup commits we need for the
automake build.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-08 09:50:38 -07:00
Eric Anholt
3db39d84d2 travis: Do our automake build tests with srcdir != builddir.
This will catch many automake bugs that end-users get to experience first,
otherwise.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-06-08 09:50:28 -07:00
Eric Engestrom
37eb56d239 autotools/meson: compile against wayland-egl-*backend*
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106861
Fixes: 1db4ec0546 "egl: rewire the build systems to use libwayland-egl"
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Andreas Hartmetz <ahartmetz@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-08 16:45:43 +01:00
Cameron Kumar
cb03803253 vulkan/wsi: Destroy swapchain images after terminating FIFO queues
The queue_manager thread can access the images from x11_present_to_x11,
hence this reorder prevents dereferencing of dangling pointers.

Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Fixes: e73d136a02 ("vulkan/wsi/x11: Implement FIFO mode.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-08 14:06:46 +01:00
Sonny Jiang
ce64c1b70a radeonsi: emit_dpbb_state packets optimization
Remembering latest states of registers to eliminate redunant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:40 -04:00
Sonny Jiang
7dcfa1f46e radeonsi: emit_clip_state packets optimization
Remembering latest states of registers to eliminate redunant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:36 -04:00
Sonny Jiang
06b47005d3 radeonsi: emit_msaa_sample_locs packets optimization
Remembering latest states of registers to eliminate redunant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:36 -04:00
Sonny Jiang
a1b4b00ce2 radeonsi: emit_msaa_config packets optimization
Remembering latest states of registers to eliminate redunant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:36 -04:00
Sonny Jiang
2bad413f55 radeonsi: emit_cb_render_state packets optimization
Remembering latest states of registers to eliminate redunant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:25 -04:00
Sonny Jiang
43b0269ce3 radeonsi: emit_db_render_state packets optimization
Remembering latest states of registers to eliminate redunant SET_CONTEXT_REG packets

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-06-07 23:26:25 -04:00
Jan Vesely
d797f1f47e drisw: Fix invalid pointer arithmetic
Use of void * in pointer arithmetic is illegal, use char * instead.
Fixes: cf54bd5e83 ("drisw: use shared memory when possible")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-06-07 21:01:29 -04:00
Timothy Arceri
03c370d2f1 radeonsi: fix possible truncation on renderer string
Fixes truncation warning in gcc 8.1

Fixes: 8539c9bf31 ("gallium/radeon: add the kernel version into the renderer string")
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-06-08 10:07:55 +10:00
Timothy Arceri
fae3b38770 ac: fix possible truncation of intrinsic name
Fixes the gcc warning:
snprintf’ output between 26 and 33 bytes into a destination of size 32

Fixes: d5f7ebda3e ("ac: add LLVM build functions for subgroup instrinsics")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-08 09:24:15 +10:00
Bas Nieuwenhuizen
4fc2d5e141 amd/common: Fix number of coords for getlod.
The LLVM 6 code reduced it to a non-array call. We need to do that
with the new code too.

This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.*array* for radv.

Fixes: a9a7993441 "amd/common: use the dimension-aware image intrinsics on LLVM 7+"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-07 23:59:52 +02:00
Dave Airlie
9be56316cf features: add virgl to the GL features list
This hopefully adds virgl to the correct places and current statuses
of various extensions.

virgl of course relies on two external things
a) host driver that can support the features
b) up to date host virglrenderer library that can support the features.

This list will be maintained as latest (a) + (b) + mesa.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-06-08 07:34:53 +10:00
Matt Turner
a5abb2da74 meson: Add support for read-only text segment on x86
Port of 6dfc5e28f7 (configure.ac: Add support to enable read-only text
segment on x86.) to Meson.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-07 14:16:44 -07:00
Dylan Baker
8f2421d73b meson: work around gentoo applying -m32 to host compiler in cross builds
Gentoo's ebuild system always adds -m32 to the compiler for doing x86_64
-> x86 cross builds, while meson expects it not to do that. This
results in an x86 -> x86 cross build, and assembly gets disabled.

Fixes: 2d62fc0646
       ("meson: disable x86 asm in fewer cases.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-07 11:54:06 -07:00
Jason Ekstrand
e0fa239962 i965/screen: Sanity check that all formats we advertise are useable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
0e7f3febf7 i965/screen: Use RGBA non-sRGB formats for images
Not all of the MESA_FORMAT and ISL_FORMAT helpers we use can properly
handle RGBX formats.  Also, we don't want to make decisions based on
those in the first place because we can't render to RGBA and we use the
non-sRGB version to determine whether or not to allow CCS_E.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
a266934935 i965/screen: Return false for unsupported formats in query_modifiers
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
eeae485149 i965/screen: Refactor query_dma_buf_formats
This reworks it to work like query_dma_buf_modifiers and, in particular,
makes it more flexible so that we can disallow a non-static set of
formats.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
3b54dd87f7 intel/isl: Add bounds-checking assertions for the format_info table
We follow the same convention as isl_format_get_layout in having two
assertions to ensure that only valid formats are passed in.  We also
check against the array size of the table because some valid formats
such as CCS formats will may be past the end of the table.  This fixes
some potential out-of-bounds array access even in valid cases.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Jason Ekstrand
778e2881a0 intel/isl: Add bounds-checking assertions in isl_format_get_layout
We add two assertions instead of one because the first assertion that
format != ISL_FORMAT_UNSUPPORTED is more descriptive and checks for a
real but unsupported enumerant while the second ensures that they don't
pass in garbage values.  We also update some other helpers to use
isl_format_get_layout instead of using the table directly so that they
get bounds checking too.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 11:23:34 -07:00
Dylan Baker
c267f46ef2 meson: Clarify why asm cannot be used in cross compile
This makes the reasoning for why a cross compile is not using asm
clearer (hopefully).

v2: - fix typos

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-07 10:40:35 -07:00
Eric Engestrom
f436ae237b docs: talk about Wayland instead of libwayland
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-07 18:06:40 +01:00
Jason Ekstrand
237c5ac4f9 anv: Set fence/semaphore types to NONE in impl_cleanup
There were some places that were calling anv_semaphore_impl_cleanup and
neither deleting the semaphore nor setting the type back to NONE.  Just
set it to NONE in impl_cleanup to avoid these issues.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106643
Fixes: 031f57eba "anv: Add a basic implementation of VK_KHX_external..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-07 09:46:45 -07:00
Plamena Manolova
3ba16d640e nir: Add global invocation id intrinsic.
Add the missing nir intrinsic for the gl_GlobalInvocationID
compute shader variable.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-07 14:53:12 +01:00
Eric Engestrom
61edad216e travis: bump libwayland to the first version with libwayland-egl
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-07 11:10:11 +01:00
Kenneth Graunke
3ea2d791f3 i965: Require softpin support for Cannonlake and later.
This isn't strictly necessary, but anyone running Cannonlake will
already have Kernel 4.5 or later, so there's no reason to support
the relocation model on Gen10+.

This will let us avoid dealing with them for new features.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-06 19:45:09 -07:00
Kenneth Graunke
a363bb2cd0 i965: Allocate VMA in userspace for full-PPGTT systems.
This patch enables soft-pinning of all buffers, allowing us to skip
relocation processing entirely.  All systems with full PPGTT and > 4GB
of VMA should gain these benefits.  This should be most Gen8+.

Unfortunately, this excludes a few systems:
- Cherryview (only has 32-bit addressing, despite 48-bit pointers)
- Broadwell with a 32-bit kernel
- Anybody running pre-4.5 kernel.

We may enable it for Cherryview in the future, but it would require
some tweaks to the memory zone.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-06 19:45:09 -07:00
Kenneth Graunke
74259b98aa intel/blorp: Emit VF cache invalidates for 48-bit bugs with softpin.
commit 92f01fc5f9 made i965 start emitting
VF cache invalidates when the high bits of vertex buffers change.  But
we were not tracking vertex buffers emitted by BLORP.  This was papered
over by a mistake where I emitted VF cache invalidates all the time,
which Chris fixed in commit 3ac5fbadfd.

This patch adds a new hook which allows the driver to track addresses
and request a VF cache invalidate as appropriate.

v2: Make the driver do the PIPE_CONTROL so it can apply workarounds
    (caught by Jason Ekstrand).  Rebase on anv bug fix.
v3: Don't screw up the boolean (caught by Jason Ekstrand).

Fixes: 92f01fc5f9 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-06 19:45:09 -07:00
Timothy Arceri
2a74296f24 nir: add opt_if_loop_terminator()
This pass detects potential loop terminators and moves intructions
from the non breaking branch after the if-statement.

This enables both the new opt_if_simplification() pass and loop
unrolling to potentially progress further.

Unexpectedly this change speed up shader-db run times by ~3%

Ivy Bridge shader-db results (all changes in dolphin/ubershaders):

total instructions in shared programs: 9995662 -> 9995338 (-0.00%)
instructions in affected programs: 87845 -> 87521 (-0.37%)
helped: 27
HURT: 0

total cycles in shared programs: 230931495 -> 230925015 (-0.00%)
cycles in affected programs: 56391385 -> 56384905 (-0.01%)
helped: 27
HURT: 0

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-07 11:33:04 +10:00
Timothy Arceri
1098bc5e85 nir: move ends_in_break() helper to nir_loop_analyze.h
We will use the helper while simplifying potential loop terminators
in the following patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-07 11:33:04 +10:00
Timothy Arceri
186988e28f radv: fix Coverity no effect control flow issue
swizzle is unsigned so "desc->swizzle[c] < 0" is never true.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-07 10:10:57 +10:00
Jason Ekstrand
44c614843c intel/blorp: Don't vertex fetch directly from clear values
On gen8+, we have to VF cache flush whenever a vertex binding aliases a
previous binding at the same index modulo 4GiB.  We deal with this in
Vulkan by ensuring that vertex buffers and the dynamic state (from which
BLORP pulls its vertex buffers) are in the same 4GiB region of the
address space.  That doesn't work if we're reading clear colors with the
VF unit.  In order to work around this we switch to using MI commands to
copy the clear value into the vertex buffer we allocate for the normal
constant data.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-06 16:32:38 -07:00
Lionel Landwerlin
b28a2510cc dri: add missing 16bits formats mapping
i965 advertises the 16-bit R and RG formats through
eglQueryDmaBufFormatsEXT but falls over when a client tries to use or
asks more information about such a format because
driImageFormatToGLFormat returns MESA_FORMAT_NONE.

Found by Eero Tamminen.

v2: Add G16R16 formats (Lionel)

v3: Fix G16R16 mapping to mesa format (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106642
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-07 00:09:21 +01:00
Eric Anholt
833c404600 nir: Look into uniform structs for samplers when counting num_textures.
mesa/st decides whether to update samplers after a program change based on
whether num_textures is nonzero.  By not counting samplers in a uniform
struct, we would segfault in
KHR-GLES3.shaders.struct.uniform.sampler_vertex if it was run in the same
context after a non-vertex-shader-uniform testcase (as is the case during
a full conformance run).

v2: Implement using two separate pure functions instead of updating
    pointers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-06 13:46:55 -07:00
Eric Anholt
f69473a712 v3d: Work around GFXH-1461/GFXH-1689 by using CLEAR_TILE_BUFFERS.
This doesn't seem to have done anything to my test results.  However,
given that we've still got a class of GPU hangs, following the workarounds
that the closed driver does so that we get the same command sequences
seems like a good idea.
2018-06-06 13:46:55 -07:00
Eric Anholt
9d5860310d v3d: Enable the new NIR bitfield operation lowering paths.
These together get the GLSL 3.00 unorm/snorm pack functions and
MESA_shader_integer operations working.

v2: Fix commit message typo.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
73953b0713 nir: Add lowering for nir_op_bit_count.
This is basically the same as the GLSL lowering path.

v2: Fix typo in the link

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
7afa26d4e3 nir: Add lowering for nir_op_bitfield_reverse.
This is basically the same as the GLSL lowering path.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
6e1597c2d9 nir: Add an ALU lowering pass for mul_high.
This is based on the glsl/lower_instructions.cpp implementation, but
should be much more readable.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
6a0db5f08f nir: Add lowering for find_lsb.
There is a fairly simple relation to turn this into ufind_msb.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
d4c7c3c225 nir: Add lowering for ifind_msb to ufind_msb.
ufind_msb is easily expressed in terms of clz, and we can reduce ifind_msb
to that.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
af88acf4c4 nir: Add lowering from ibitfield_extract/ubitfield_extract to shifts.
V3D doesn't have opcodes for ibfe/ubfe, so we need to lower similarly to
glsl/lower_instructions.cpp.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
74618ccbca nir: Add lowering for bitfieldInsert without using bfi.
If you don't have HW to do bfi, then lowering bitfieldInsert to bfi makes
things harder than keeping the "bits" argument around.

This still uses bfm, but I've added the obvious lowering of bfm if you
need it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Engestrom
735b104707 docs: add note about moving to libwayland-egl in 18.2.0
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Andres Gomez <agomez@igalia.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-06 12:12:03 -07:00
Eric Engestrom
b9361c9df0 egl: remove wayland-egl now that we're using libwayland-egl
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Daniel Stone <daniels@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-06 12:12:01 -07:00
Eric Engestrom
1db4ec0546 egl: rewire the build systems to use libwayland-egl
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Daniel Stone <daniels@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-06 12:11:57 -07:00
zhaowei yuan
67f7a16b59 glsl: Take 'double' as reserved after GLSL ES 1.0
GLSL ES 1.0.17 specifies that "double" is a keyword reserved

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106823
Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-05 23:39:25 -07:00
Marek Olšák
17a42062cc r300g/swtcl: make pipe_context uploaders use malloc'd memory as before
Discovered by Roland Scheidegger.

The resource_create code uses GPU memory for PIPE_BIND_CUSTOM, but
malloc'd memory otherwise. Vertex and index buffers should use malloc'd
memory.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-06-05 22:52:08 -04:00
Jason Ekstrand
01ad2067bb intel/eu: Use a struct copy instead of a memcpy
The memcpy had the wrong size and this was causing crashes on 32-bit
builds of the driver.

Fixes: 6a9525bf67 "intel/eu: Switch to a logical state stack"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106830
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-05 15:51:01 -07:00
Philip Rebohle
cc21e96d5f radv: Use correct color format for fast clears
Using the image format is incorrect when the view has a different
format than the image. Instead, the view format needs to be used.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106687
2018-06-05 23:51:03 +02:00
Eric Anholt
2b1b2cbf61 v3d: Be more explicit about include directory from our generated code.
You'd need src/broadcom/cle/ in the -I previously, for srcdir != builddir.
nir was fine at that, but automake didn't have it.

Bugzilla: https://github.com/anholt/mesa/issues/104
2018-06-05 12:44:49 -07:00
Bas Nieuwenhuizen
2a10fd902d radv: Do not hardcode fast clear formats.
except for the odd one out.

This should support many more formats.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-05 20:53:21 +02:00
Scott D Phillips
6fb22114a0 intel/tools: add intel_sanitize_gpu to EXTRA_DIST
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106778
Fixes: cc41603d6d ("intel/tools: new intel_sanitize_gpu tool")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-05 10:32:35 -07:00
Scott D Phillips
08535dd886 util/tests/vma: Fix warning c++11-narrowing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106801
Fixes: 943fecc569 ("util: Add a randomized test for the virtual memory allocator")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-05 10:32:07 -07:00
Scott D Phillips
4b123fb74b util: tests: vma test depends on C++11 support
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106776
Fixes: 943fecc569 ("util: Add a randomized test for the virtual memory allocator")
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-05 10:13:14 -07:00
Michel Dänzer
6b8f3724c8 glx: Fix number of property values to read in glXImportContextEXT
We were trying to read twice as many as the X server sent us, which
upset XCB:

[xcb] Too much data requested from _XRead
[xcb] This is most likely caused by a broken X extension library
[xcb] Aborting, sorry about that.
glx-free-context: ../../src/xcb_io.c:732: _XRead: Assertion `!xcb_xlib_too_much_data_requested' failed.

Fixing this takes 3 GLX piglit tests from crash to pass.

Fixes: 0852162950 "glx: Be more tolerant in glXImportContext (v2)"
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-06-05 18:56:43 +02:00
Eric Engestrom
c765c39ea7 configure: radv depends on mako
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106784
Fixes: 17201a2eb0 "radv: port to using updated anv
                              entrypoint/extension generator."
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-05 16:32:48 +01:00
Eric Engestrom
5bdc38f356 travis: use correct form for array options
I'd like to eventually drop support for the confusing "an array of
a single empty string is meant to be interpreted as an empty array", so
let's start by not using it anymore.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-05 16:31:23 +01:00
Lionel Landwerlin
9aedee64ac anv: intel: add softpin flag on imported BOs
Looks like we forgot to update this bit of the driver for softpin.

Fixes: 4affeba1e9 ("anv: Soft-pin everything else")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-05 14:18:35 +01:00
Eric Engestrom
66c61797ad autotools: add missing android file to package
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106779
Fixes: ff904978a1 "gallium/util: Android backtrace support"
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-05 10:39:04 +01:00
Eric Engestrom
7c4423cce9 meson: fix platforms check for -D egl=true
Fixes: 0ed6a87a10 "meson: fix platforms=[]"
Reported-by: Christoph Haag <haagch@frickel.club>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-05 10:38:57 +01:00
Mathias Fröhlich
1ac4439d62 mesa: Make sure that imm draws are flushed before other draws execute.
The recent patch

    mesa: Remove FLUSH_VERTICES from VAO state changes.

    Pending draw calls on immediate mode or display list calls do
    not depend on changes of the VAO state. So, remove calls to
    FLUSH_VERTICES and flag _NEW_ARRAY as appropriate.

uncovered a problem that non immediate mode draw calls do only
flush outstanding immediate mode draws if FLUSH_UPDATE_CURRENT
is set in ctx->Driver.NeedFlush.
In that case, due to the sequence of _mesa_set_draw_vao commands
we could end up with the VAO from the FLUSH_VERTICES call set
into gl_context::Array._DrawVAO when the array draw is executed.
So the change pulls FLUSH_CURRENT out of _mesa_validate_* calls
into the array draw calls being validated.
The change introduces a new macro FLUSH_FOR_DRAW beside FLUSH_VERTICES
and FLUSH_CURRENT that flushes on changed current attributes as well
as on outstanding immediate mode draw calls. Use FLUSH_FOR_DRAW
in the non immediate mode draw code paths.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106594
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-06-05 07:05:24 +02:00
gurchetansingh@chromium.org
a7b74a77fa virgl: use bits in caps set v2
Let's add another field to caps v2, that can help report boolean
values.

Suggested-by: Gert Wollny <gert.wollny@collabora.com>
Suggested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-05 14:29:00 +10:00
gurchetansingh@chromium.org
6ce94a50bb virgl: add shader offset alignment to to v2 caps struct
This is the SSBO analogue to fe0647. User supplied data must
be a multiple of GL_SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT.

This fixes 44 GLES31 tests on airlied@'s GLES31 sketch branches with
Nvidia hardware, but this patch standalone can applied to master. The
alignment restriction on Nvidia is 32, hence the default value.

Example tests:
   dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.0
   dEQP-GLES31.functional.ssbo.layout.multi_basic_types.single_buffer.std430

v2: Move to a better place in case statement
v3: Rebase

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-05 14:28:49 +10:00
Kenneth Graunke
1c9053d076 i965: Prepare batchbuffer module for softpin support.
If EXEC_OBJECT_PINNED is set, we don't want to emit any relocations.
We simply want to add the BO to the validation list, and possibly mark
it as writeable.  The new brw_use_pinned_bo() interface does just that.

To avoid having to make every caller consider both the relocation and
softpin cases, we make emit_reloc() call brw_use_pinned_bo() when given
a softpinned buffer.

We also can't grow buffers that are softpinned - the mechanism places a
larger BO at the same offset as the original, which requires moving BOs
around in the VMA.  With softpin, we only allocate enough VMA for the
original size of the BO.

v2: Assert that BOs aren't pinned if the kernel says we should move them
    (feedback from Chris Wilson)

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-04 18:38:41 -07:00
Kenneth Graunke
01058a5522 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
This introduces a new fast virtual memory allocator integrated with our
BO cache bucketing.  For larger objects, it falls back to the simple
free-list allocator (util_vma).

This puts the allocators in place but doesn't enable softpin yet.

v2:
 (feedback from Chris Wilson)
 - Check (bo->kflags & EXEC_OBJECT_PINNED) instead of a global flag
 - Avoid vma_free(0ull) on the err_free path.
 - Only enable if the kernel says we have full PPGTT support
 - Make bucketing allocators more resistant to failing to grow arrays
 (feedback from Scott Phillips)
 - Don't use node after popping it from the list.
 - Avoid undefined behavior in canonicalization by reusing new helper
 - Comment updates
 (feedback from myself)
 - Avoid __vma_alloc vs. vma_alloc by making a zero_high_bits helper
   to return a non-canonical address with the high bits zeroed.
 - Don't shadow loop variable 'i' when destroying things (ugly; worked)
v3:
 - Replace zero_high_bits with new common gen_48b_address helper.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-04 18:38:41 -07:00
Jason Ekstrand
e99b32d4d6 i965: Disable internal CCS for shadows of multi-sampled windows
If window system supports Y-tiling but not CCS_E, we currently create an
internal CCS for any window system buffers and then resolve right before
handing it off to X or Wayland.  In the case of the single-sampled
shadow of a multi-sampled window system buffer, this is pointless
because the only thing we do with it is use it as a MSAA resolve target
so we do MSAA resolve -> CCS resolve -> hand to the window system.
Instead, just disable CCS for the shadow and then the MSAA resolve will
write uncompressed directly into it.  If the window system supports
CCS_E, we will still use CCS_E, we just won't do internal CCS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 15:27:29 -07:00
Jason Ekstrand
6ab9fe7673 i965/miptree: Rename a parameter to create_for_dri_image
Instead of having it be a general "is this a winsys image" boolean, make
it more specific to the actual purpose.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 15:27:16 -07:00
Jason Ekstrand
6a9525bf67 intel/eu: Switch to a logical state stack
Instead of the state stack that's based on copying a dummy instruction
around, we start using a logical stack of brw_insn_states.  This uses a
bit less memory and is way less conceptually bogus.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
db9675f5a4 intel/eu: Set flag [sub]register number differently for 3src
Prior to gen8, the flag [sub]register number is in a different spot on
3src instructions than on other instructions.  Starting with Broadwell,
they made it consistent.  This commit fixes bugs that occur when a
conditional modifier gets propagated into a 3src instruction such as a
MAD.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
2d20303e18 intel/eu: Copy fields manually in brw_next_insn
Instead of doing a memcpy, this moves us to start with a blank
instruction (memset to zero) and copy the fields over one at a time.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jason Ekstrand
381fac2740 intel/eu: Add some brw_get_default_ helpers
This is much cleaner than everything that wants a default value poking
at the bits of p->current directly.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-04 14:03:03 -07:00
Jose Fonseca
db38c3b4ba trace: Fix parsing of recent traces.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-06-04 21:06:31 +01:00
Jose Fonseca
8652ff7cdf trace: Fix trace_context_transfer_unmap methods.
The emitted buffer_subdata/texture_subdata call didn't match the
respective signatures.

v2: Actually emit buffer_subdata call.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-06-04 21:06:31 +01:00
Nicolai Hähnle
a9a7993441 amd/common: use the dimension-aware image intrinsics on LLVM 7+
Requires LLVM trunk r329166.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-06-04 21:34:59 +02:00
Kenneth Graunke
b3ba47c592 i965: Fix batch-last mode to properly swap BOs.
On pre-4.13 kernels, which don't support I915_EXEC_BATCH_FIRST, we move
the validation list entry to the end...but incorrectly left the exec_bo
array alone, causing a mismatch where exec_bos[0] no longer corresponded
with validation_list[0] (and similarly for the last entry).

One example of resulting breakage is that we'd update bo->gtt_offset
based on the wrong buffer.  This wreaked total havoc when trying to use
softpin, and likely caused unnecessary relocations in the normal case.

Fixes: 29ba502a4e (i965: Use I915_EXEC_BATCH_FIRST when available.)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-06-04 09:43:09 -07:00
Samuel Pitoiset
06d3c65098 radv: fix a GPU hang when MRTs are sparse
When the i-th target format is set, all previous target formats
must be non-zero to avoid hangs. In other words, without this
if a fragment shader exports mrt0, mrt2 and mrt3, the GPU hangs
because the target format of mrt1 is zero.

This fixes DXVK GPU hangs with "Seven: The Days Long Gone",
"GTA V" and probably more games.

Cc: "18.0" 18.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-04 14:01:33 +02:00
Bas Nieuwenhuizen
2835b6baf4 radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.
Otherwise on pre-GFX9, if the constant layout allows both TESS_EVAL and
GEOMETRY shaders, but the PIPELINE has only GEOMETRY, it would return the
GEOMETRY shader for the TESS_EVAL shader.

This would cause the flush_constants code to emit the GEOMETRY constants
to the TESS_EVAL registers and then conclude that it did not need to set
the GEOMETRY shader registers.

Fixes: dfff9fb6f8 "radv: Handle GFX9 merged shaders in radv_flush_constants()"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-06-04 13:46:24 +02:00
Samuel Pitoiset
e3e929f8c3 nir: implement the GLSL equivalent of if simplication in nir_opt_if
This pass turns:

   if (cond) {
   } else {
      do_work();
   }

into:

   if (!cond) {
      do_work();
   } else {
   }

Here's the vkpipeline-db stats (from affected shaders) on Polaris10:

Totals from affected shaders:
SGPRS: 17272 -> 17296 (0.14 %)
VGPRS: 18712 -> 18740 (0.15 %)
Spilled SGPRs: 1179 -> 1142 (-3.14 %)
Code Size: 1503364 -> 1515176 (0.79 %) bytes
Max Waves: 916 -> 911 (-0.55 %)

This pass only affects Serious Sam 2017 (Vulkan) on my side. The
stats are not really good for now. Some shaders look quite dumb
but this will be improved with further NIR passes, like ifs
combination.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-04 12:41:10 +02:00
Samuel Pitoiset
e44f90eccf nir: make is_comparison() a non-static helper function
Rename and change the prototype for consistency regarding
nir_tex_instr_is_query(). This function will be used in the
following patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-04 12:41:08 +02:00
Dave Airlie
67eccd6aa2 nir: use num_components wrappers in print/validate.
These wrappers were introduces, so start using them.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-04 05:58:42 +10:00
Juan A. Suarez Romero
bad7332f7c doc: update calendar, add news and link release notes for 18.0.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-06-03 10:19:32 +00:00
Juan A. Suarez Romero
41c01d79ee docs: add sha256 checksums for 18.0.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit aba161e63a)
2018-06-03 10:12:02 +00:00
Juan A. Suarez Romero
a89cb6711b docs: add release notes for 18.0.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ca0037aaef)
2018-06-03 10:12:00 +00:00
Jose Fonseca
8841c2cda5 scons: Fix MinGW cross compilation with LLVM 5.0.
LLVM 5.0 requires additional Win32 libraries, and MinGW with pthreads.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-06-02 09:58:50 +01:00
Jason Ekstrand
64e619674e anv: Don't even bother processing relocs if we have softpin
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 16:34:26 -07:00
Jason Ekstrand
c7be17c8d3 anv: Refactor reloc handling in execbuf_add_bo
This just separates the reloc list vs. BO set cases and lets us avoid an
allocation if relocs->deps->entries == 0.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 16:34:25 -07:00
Jason Ekstrand
7105b7890a anv: Assert that the kernel leaves pinned BO addresses alone
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 16:33:07 -07:00
Scott D Phillips
4affeba1e9 anv: Soft-pin everything else
v2 (Jason Ekstrand):
 - Break up Scott's mega-patch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:13 -07:00
Scott D Phillips
f3dbe0419d anv: Soft-pin batch buffers
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:12 -07:00
Jason Ekstrand
a0b133286a anv/batch_chain: Simplify secondary batch return chaining
Previously, we did this weird thing where we left space and an empty
relocation for use in a hypothetical MI_BATCH_BUFFER_START that would be
added to the secondary later.  Then, when it came time to chain it into
the primary, we would back that out and emit an MI_BATCH_BUFFER_START.
This worked well but it was always a bit hacky, fragile and ugly.  This
commit instead adds a helper for rewriting the MI_BATCH_BUFFER_START at
the end of an anv_batch_bo and we use that helper for both batch bo list
cloning and handling returns from secondaries.  The new helper doesn't
actually modify the batch in any way but instead just adjusts the
relocation as needed.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:12 -07:00
Jason Ekstrand
4f20c665b4 anv/batch_chain: Call batch_bo_finish at the end of end_batch_buffer
The only reason we were calling it in the middle was that one of the
cases for figuring out the secondary command buffer execution type
wanted batch_bo->length which gets set by batch_bo_finish.  It's easy
enough to recalculate and now batch_bo_finish is called in a sensible
location.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:11 -07:00
Jason Ekstrand
e7d0378bd9 anv: Soft-pin client-allocated memory
Now that we've done all that refactoring, addresses are now being
directly written into surface states by ISL and BLORP whenever a BO is
pinned so there's really nothing to do besides enable it.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:11 -07:00
Jason Ekstrand
caf41c78ca anv/allocator: Support softpin in the BO cache
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:11 -07:00
Jason Ekstrand
b0d50247a7 anv/allocator: Set the BO flags in bo_cache_alloc/import
It's safer to set them there because we have the opportunity to properly
handle combining flags if a BO is imported more than once.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:10 -07:00
Scott D Phillips
27cc68d9e9 anv: For pinned BOs, skip relocations, but track bo usage
References to pinned BOs won't need to be relocated at a later
point, so just write the final value of the reference into the bo
directly.

Add a `set` to the relocation lists for tracking dependencies that
were previously tracked by relocations. When a batch is executed, we
add the referenced pinned BOs to the exec list.

v2: - visit bos from the dependency set in a deterministic order (Jason)
v3: - compar => compare, drat (Jason)
    - Reworded commit message, provided by (Jordan)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-01 14:27:10 -07:00
Scott D Phillips
c7db0ed4e9 anv: Use a separate pool for binding tables when soft pinning
Soft pinning lets us satisfy the binding table address
requirements without using both sides of a growing state_pool.

If you do use both sides of a state pool, then you need to read
the state pool's center_bo_offset (with the device mutex held) to
know the final offset of relocations that target the state pool
bo.

By having a separate pool for binding tables that only grows in
the forward direction, the center_bo_offset is always 0 and
relocations don't need an update pass to adjust relocations with
the mutex held.

v2: - don't introduce a separate state flag for separate binding tables (Jason)
    - replace bo and map accessors with a single binding_table_pool accessor (Jason)
v3: - assert bt_block->offset >= 0 for the separate binding table (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-01 14:27:10 -07:00
Scott D Phillips
e662bdb820 anv: Soft-pin state pools
The state_pools reserve virtual address space of the full
BLOCK_POOL_MEMFD_SIZE, but maintain the current behavior of
growing from the middle.

v2: - rename block_pool::offset to block_pool::start_address (Jason)
    - assign state pool start_address statically (Jason)
v3: - remove unnecessary bo_flags tampering for the dynamic pool (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-01 13:49:22 -07:00
Ian Romanick
f00fcfb7a2 nir: Lower !f2b(x) to x == 0.0
Some trivial help now, but it also prevents ~40 regressions caused by
Samuel's "nir: implement the GLSL equivalent of if simplication in
nir_opt_if" patch.

All Gen4+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14369557 -> 14369555 (<.01%)
instructions in affected programs: 442 -> 440 (-0.45%)
helped: 2
HURT: 0

total cycles in shared programs: 532425772 -> 532425743 (<.01%)
cycles in affected programs: 6086 -> 6057 (-0.48%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-06-01 10:14:53 -07:00
Ian Romanick
619c51722b nir: Add some missing "optimization undo" patterns
d8d18516b0 and 03fb13f646 added some patterns to undo conversions like

   (('ior', ('flt', a, b), ('flt', a, c)), ('flt', a, ('fmax', b, c)))

If further optimization cause some of the operands to either be the same
or be constants, undoing the transformation can lead to further savings.

I don't know why these patterns were not added in those patches.  I did
not check to see which specific patterns actually helped.  I just added
all of them for symmetry.  This prevents some loop unrolling regressions
Plane Shift caused by Samuel's "nir: implement the GLSL equivalent of if
simplication in nir_opt_if" patch.

Skylake and Broadwell had similar results. (Skylake shown)
total instructions in shared programs: 14369768 -> 14369557 (<.01%)
instructions in affected programs: 44076 -> 43865 (-0.48%)
helped: 141
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.50 x̃: 1
helped stats (rel) min: 0.07% max: 1.52% x̄: 0.66% x̃: 0.60%
95% mean confidence interval for instructions value: -1.67 -1.32
95% mean confidence interval for instructions %-change: -0.72% -0.59%
Instructions are helped.

total cycles in shared programs: 532430629 -> 532425772 (<.01%)
cycles in affected programs: 1170832 -> 1165975 (-0.41%)
helped: 101
HURT: 5
helped stats (abs) min: 1 max: 160 x̄: 48.54 x̃: 32
helped stats (rel) min: <.01% max: 8.49% x̄: 2.76% x̃: 2.03%
HURT stats (abs)   min: 2 max: 22 x̄: 9.20 x̃: 4
HURT stats (rel)   min: <.01% max: 0.05% x̄: 0.02% x̃: <.01%
95% mean confidence interval for cycles value: -53.64 -38.00
95% mean confidence interval for cycles %-change: -3.06% -2.20%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-01 10:13:16 -07:00
Eric Engestrom
57fbc2ac50 docs/meson: mention how to use array options
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
03a2e7b662 meson: drop unused empty string array element
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
0ed6a87a10 meson: fix platforms=[]
Fixes: 5608d0a2ce ("meson: use array type options")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
a92cdcd598 meson: fix vulkan-drivers=[]
Fixes: 5608d0a2ce ("meson: use array type options")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
a425db4d7d meson: fix gallium-drivers=[]
Fixes: 5608d0a2ce ("meson: use array type options")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
393abd6a57 meson: fix dri-drivers=[]
Fixes: 5608d0a2ce ("meson: use array type options")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Eric Engestrom
8faa22c146 REVIEWERS: add root meson.build to the Meson reviewers group
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-01 17:53:06 +01:00
Juan A. Suarez Romero
cbe4baed1f glsl: Add ir_binop_vector_extract in NIR
Implement ir_binop_vector_extract using NIR operations. Based on SPIR-V
to NIR approach.

This fixes:
dEQP-GLES3.functional.shaders.indexing.moredynamic.with_value_from_indexing_expression_fragment
Piglit's glsl-fs-vec4-indexing-8.shader_test

CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral <itoral@igalia.com>
2018-06-01 18:09:22 +02:00
Dylan Baker
4ad8e2ac82 doc: update calendar, add news and link release notes for 18.1.1 2018-06-01 08:39:17 -07:00
Dylan Baker
55ee53ea19 docs/relnotes: Add sha256 sums for mesa 18.1.1 2018-06-01 08:39:17 -07:00
Dylan Baker
423c4fe954 docs: Add release notes for 18.1.1 2018-06-01 08:39:17 -07:00
Plamena Manolova
939312702e i965: Add ARB_fragment_shader_interlock support.
Adds suppport for ARB_fragment_shader_interlock. We achieve
the interlock and fragment ordering by issuing a memory fence
via sendc.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-06-01 16:36:39 +01:00
Plamena Manolova
60e843c4d5 mesa: Add GL/GLSL plumbing for ARB_fragment_shader_interlock.
This extension provides new GLSL built-in functions
beginInvocationInterlockARB() and endInvocationInterlockARB()
that delimit a critical section of fragment shader code. For
pairs of shader invocations with "overlapping" coverage in a
given pixel, the OpenGL implementation will guarantee that the
critical section of the fragment shader will be executed for
only one fragment at a time.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-06-01 16:36:36 +01:00
Martin Pelikán
53719f818c compiler/spirv: reject invalid shader code properly
After bebe3d626e, b->fail_jump is prepared after vtn_create_builder
which can longjmp(3) to it through its vtx_assert()s.  This corrupts
the stack and creates confusing core dumps, so we need to avoid it.

While there, I decided to print the offending values for debugability.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-01 08:09:35 -07:00
Juan A. Suarez Romero
360bfb619f docs: change release manager for 18.1
Dylan will replace Emil as the release manager for 18.1.x series.

CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-06-01 15:24:02 +02:00
Gert Wollny
ef3a6e3d98 virgl: Always assume that ORIGIN_UPPER_LEFT and PIXEL_CENTER* are supported
The driver must support at least one of

  PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT
  PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT

and one of

  PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER
  PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER

otherwise glsl_to_tgsi will fire an assert.

ORIGIN_UPPER_LEFT is the default convention, and is supported by
all mesa drivers, hence it seems reasonable to always report the caps
to be enabled.  On gles ORIGIN_LOWER_LEFT is generally not supported,
so we rely on the caps reported by the host that depend on whether we
run on an GL or an EGL host.

For PIXEL_CENTER it is completely host driver dependend on what is
supported, and since we do not report the actual host driver capabilities
it is best to mark both as supported, this is how it works for a GL
host too.

Fixes:
   dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_xyz
   dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_1
   dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_2

Reviewed-by: Gurchetan Singh <gurcetansingh@chromium.org>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-06-01 12:04:21 +01:00
Alex Smith
01a2414045 radeonsi: Fix crash on shaders using MSAA image load/store
The value returned by tgsi_util_get_texture_coord_dim() does not
account for the sample index. This means image_fetch_coords() will not
fetch it, leading to a null deref in ac_build_image_opcode() which
expects it to be present (the return value of ac_num_coords() *does*
include the sample index).

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-06-01 08:53:38 +01:00
Alex Smith
dfff9fb6f8 radv: Handle GFX9 merged shaders in radv_flush_constants()
This was not previously handled correctly. For example,
push_constant_stages might only contain MESA_SHADER_VERTEX because
only that stage was changed by CmdPushConstants or
CmdBindDescriptorSets.

In that case, if vertex has been merged with tess control, then the
push constant address wouldn't be updated since
pipeline->shaders[MESA_SHADER_VERTEX] would be NULL.

Use radv_get_shader() instead of getting the shader directly so that
we get the right shader if merged. Also, skip emitting the address
redundantly - if two merged stages are set in push_constant_stages
this change would have made the address get emitted twice.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-01 08:53:34 +01:00
Alex Smith
7ca0167ae9 radv: Consolidate GFX9 merged shader lookup logic
This was being handled in a few different places, consolidate it into a
single radv_get_shader() function.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-01 08:53:31 +01:00
Alex Smith
0fa51bfdbe radv: Set active_stages the same whether or not shaders were cached
With GFX9 merged shaders, active_stages would be set to the original
stages specified if shaders were not cached, but to the stages still
present after merging if they were.

Be consistent and use the original stages.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-01 08:53:01 +01:00
Marek Olšák
9e61147ef6 st/mesa: relax requirements for ARB_ES3_compatibility
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106748

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-01 01:04:17 -04:00
Scott D Phillips
29a139b308 anv/blorp: Write relocated values into surface states
v2 (Jason Ekstrand):
 - Split the blorp bit into it's own patch and re-order a bit
 - Use anv_address helpers

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-31 16:51:47 -07:00
Jason Ekstrand
bf34ef16ac anv: Use an address for each anv_image plane
This is better than having BO and offset fields.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
1f2328c3b7 anv/cmd_buffer: Rework surface relocation helpers
This commit renames add_surface_state_reloc to add_surface_reloc and
makes it takes an address.  We also rename add_image_view_relocs to
add_surface_state_relocs because it takes an anv_surface_state and
doesn't really care about the image view anymore.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
f270a09737 anv: Use an anv_address in anv_buffer
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
8a8bd39d5e anv/cmd_buffer: Use anv_address for handling indirect parameters
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
1029458ee3 anv: Use an anv_address in anv_buffer_view
Instead of storing a BO and offset separately, use an anv_address.  This
changes anv_fill_buffer_surface_state to use anv_address and we now call
anv_address_physical and pass that into ISL.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
de1c5c1b50 anv: Use full anv_addresses in anv_surface_state
This refactors surface state filling to work entirely in terms of
anv_addresses instead of offsets.  This should make things simpler for
when we go to soft-pin image buffers.  Among other things,
add_image_view_relocs now only cares about the addresses in the surface
state and doesn't really need the image view anymore.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
94081ffc80 anv: Add some anv_address helpers
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Scott D Phillips
aaea46242d anv: Add vma_heap allocators in anv_device
These will be used to assign virtual addresses to soft pinned
buffers in a later patch.

Two allocators are added for separate 'low' and 'high' virtual
memory areas. Another alternative would have been to add a
double-sided allocator, which wasn't done here just because it
didn't appear to give any code complexity advantages.

v2 (Scott Phillips):
 - rename has_exec_softpin to use_softpin (Jason)
 - Only remove bottom one page and top 4 GiB from virt (Jason)
 - refer to comment in anv_allocator about state address + size
   overflowing 48 bits (Jason)
 - Mention hi/lo allocators vs double-sided allocator in
   commit message (Chris)
 - assign state pool memory ranges statically (Jason)

v3 (Jason Ekstrand):
 - Use (LOW|HIGH)_HEAP_(MIN|MAX)_ADDRESS rather than (1 << 31) for
   determining which heap to use in anv_vma_free
 - Only return de-canonicalized addresses to the heap

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
6e4672f881 intel/common: Add an address de-canonicalization helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:45 -07:00
Scott D Phillips
943fecc569 util: Add a randomized test for the virtual memory allocator
The test pseudo-randomly makes allocations and deallocations with
the virtual memory allocator and checks that the results are
consistent. Specifically, we test that:

 * no result from the allocator overlaps an already allocated range
 * allocated memory fulfills the stated alignment requirement
 * a failed result from the allocator could not have been fulfilled
 * memory freed to the allocator can later be allocated again

v2: - fix if() in test() to actually run fill()
v3: - add c++11 build flag (Jason)
    - test the full 64-bit range (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-31 16:51:35 -07:00
Jason Ekstrand
f19ad5d31f util: Add a virtual memory allocator
This is simple linear-walk first-fit allocator roughly based on the
allocator in the radeon winsys code.  This allocator has two primary
functional differences:

 1) It cleanly returns 0 on allocation failure

 2) It allocates addresses top-down instead of bottom-up.

The second one is needed for Intel because high addresses (with bit 47
set) need to be canonicalized in order to work properly.  If we allocate
bottom-up, then high addresses will be very rare (if they ever happen).
We'd rather always have high addresses so that the canonicalization code
gets better testing.

v2: - [scott-ph] remove _heap_validate() if NDEBUG is defined (Jordan)

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Tested-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-31 16:17:35 -07:00
Bas Nieuwenhuizen
b9fb2c266a radv: Add startup debug option.
This adds a RADV_DEBUG=startup option to dump more info about
instance creation and device enumeration.

A common question end users have is why the direver is not loading
for them, and this has two common reasons:
1) They did not install the driver.
2) AMDGPU is not used for the card in the kernel.

This adds some info messages so we can easily get a some useful
output from end users.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen
38933c1151 radv: Add option to print errors even in optimized builds.
Errors are not that common of a case so we can eat a slight perf
hit in having to call a function and do a runtime check.

In turn this makes debugging random errors happening for end users
easier, because they don't have to have a debug build on hand.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen
729f7373de radv: Make the sem_info allocate/free functions static.
They are only used in 1 file.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-31 11:51:23 +02:00
Samuel Pitoiset
70f9e2589e nir: optimize iand(ieq(a, 0), ieq(b, 0)) to ieq(ior(a, b), 0)
Totals from affected shaders:
SGPRS: 80 -> 80 (0.00 %)
VGPRS: 48 -> 48 (0.00 %)
Code Size: 2120 -> 2096 (-1.13 %) bytes
Max Waves: 16 -> 16 (0.00 %)

Only two Rise of Tomb Raider shaders are affected on my side.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-31 10:57:16 +02:00
Tapani Pälli
c983c6abaf mesa: don't call Driver.TexEnv with invalid arguments
Patch skips useless and possibly dangerous calls down to the driver
in case invalid arguments were given. I noticed this would be happening
with demo of Darwinia game. AFAIK this does not fix anything but makes
this path safer and more like how other API functions are implemented.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-31 09:24:17 +03:00
Vinson Lee
d511bba2f9 v3d: Fix automake linking error.
CXXLD    gallium_dri.la
../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_dump_packet':
src/broadcom/clif/clif_dump.c:87: undefined reference to `v3d33_clif_dump_packet'
src/broadcom/clif/clif_dump.c:85: undefined reference to `v3d41_clif_dump_packet'
../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_process_worklist':
src/broadcom/clif/clif_dump.c:140: undefined reference to `v3d41_clif_dump_gl_shader_state_record'
src/broadcom/clif/clif_dump.c:144: undefined reference to `v3d33_clif_dump_gl_shader_state_record'

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-30 11:55:09 -07:00
Jakob Bornecrantz
d6cee5a162 virgl: Update virgl_hw.h
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:07:26 +01:00
Dave Airlie
e2b6d830b2 virgl: add ARB_transform_feedback_overflow_query support
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:55 +01:00
Dave Airlie
22b072c194 virgl: add polygon offset clamp
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:51 +01:00
Dave Airlie
49204ff8ad virgl: add derivative control support
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:47 +01:00
Dave Airlie
46fe349af2 virgl: add ARB_conditional_render_inverted support
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:40 +01:00
Dave Airlie
f9eb7e8b76 virgl: update caps bitset to latest version.
This makes this use all 32 bits, so future sets need to be
defined in a new struct.

Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-05-30 17:02:19 +01:00
Timothy Arceri
e8b368ad1c nir: add unsigned comparison simplifications
This avoids loop unrolling regressions in Wolfenstein II on DXVK
with an upcoming optimisation series from Samuel.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-30 22:48:37 +10:00
Bas Nieuwenhuizen
c2799574eb radv: Only expose subgroup shuffles on VI+.
The current implementation depends on bpermute, which
is VI+.

Fixes: f2c6a55061 "radv: enable subgroup capabilities"
Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-30 13:49:46 +02:00
Samuel Pitoiset
02c7916298 radv: fix emitting descriptor pointers with LLVM < 7
This was terribly wrong, I forced use of 32-bit pointers when
emitting shader descriptor pointers. This fixes GPU hangs with
LLVM 5&6 because 32-bit pointers are only supported with LLVM 7.

Fixes: 88d1ed0f81 ("radv: emit shader descriptor pointers consecutively")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-30 11:38:54 +02:00
Ilia Mirkin
04fff21c62 nv30: add a couple of missed shader caps
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-30 02:06:28 -04:00
Ilia Mirkin
30918b77ac nv30: ensure that displayable formats are marked accordingly
Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-30 02:06:28 -04:00
Marek Olšák
858ac8942d mesa: expose ARB_tessellation_shader in the compatibility profile
Gallium drivers don't expose this yet due to:
    "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
16ac832392 mesa: expose AMD_vertex_shader_layer in the compatibility profile
This requires layered FBOs from GL 3.2.

Gallium drivers don't expose this yet due to:
    "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
518d8065ce mesa: expose ARB_gpu_shader5 in the compatibility profile
Gallium drivers don't expose this yet due to:
    "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
dd93bc4f34 st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
34ea55d820 gallium: add PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
e453fc76e7 mesa: update fixed-func state constants for TCS, TES, GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
27a9f27310 mesa: print Compatibility Profile in the version string
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
d3a87537dd glsl: parse #version XXX compatibility
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-29 20:13:24 -04:00
Marek Olšák
a7d0c53ab8 st/mesa: fix assertion failures with GL_UNSIGNED_INT64_ARB (v2)
Bindless texture handles can be passed via vertex attribs using this type.
They use the double codepath, so don't use st_pipe_vertex_format.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-29 20:09:00 -04:00
Marek Olšák
a8e1413876 mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)
Bindless texture handles can be passed via vertex attribs using this type.
This fixes a bunch of bindless piglit tests on radeonsi.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-29 20:09:00 -04:00
Timothy Arceri
1f7a3a1102 mesa: add display list support for glPatchParameter{i,fv}()
This is required for tessellation shader Compat profile support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-30 09:37:35 +10:00
Dave Airlie
d3ff478732 glx/drisw: make the shm/non-shm loader extensions separately.
I disliked removing the const here, function tables are meant
to be const just to avoid having to think about them,
make a second table for the shm vs non-shm paths to use.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
33ce3aa512 drisw/glx: implement getImageShm
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
17b27725fe drisw: use getImageShm() if available
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
9feaf33371 drisw: learn to query shmid handle type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
bcd80be49a drisw/glx: use XShm if possible
Implements putImageShm from DRIswrastLoaderExtension.

If XShm extension is not available, or fails, it will fallback on
regular XPutImage().

Tested on Linux only with 16bpp and 32bpp visual.

(airlied: tested on 24bpp as well)

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
cf54bd5e83 drisw: use shared memory when possible
If drisw_loader_funcs implements put_image_shm, allocates display
target data with shared memory and display with put_image_shm().

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:54 +10:00
Marc-André Lureau
63c427fa71 drisw: use putImageShm if available
If the DRIswrastLoaderExtension implements putImageShm, bind it to
drisw_loader_funcs.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:53 +10:00
Marc-André Lureau
de8085e649 dri: add putImageShm and getImageShm to swrastLoader
Add new API to put and get an image using shared memory. Instead of only
passing the data pointer, 3 arguments are given: the shmid, the data
offset and the shmaddr.

Bump interface version.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-05-30 09:11:53 +10:00
Dave Airlie
b7ac0779e0 gallium/winsys: rename DRM_API_HANDLE_* to WINSYS_HANDLE_*
This just renames this as we want to add an shm handle which
isn't really drm related.

Originally by: Marc-André Lureau <marcandre.lureau@gmail.com>
(airlied: I used this sed script instead)
This was generated with:
 git grep -l 'DRM_API_' | xargs sed -i 's/DRM_API_/WINSYS_/g'

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-30 09:11:53 +10:00
Marc-André Lureau
d2eaff33d0 gallium: move winsys handle to it's own file.
This will be used in the drisw interface later, which isn't
drm specific.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-30 09:11:53 +10:00
Francisco Jerez
4bd2047dee intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot.
When using multiple RT write messages to the same RT such as for
dual-source blending or all RT writes in SIMD32, we have to set the
"Last Render Target Select" bit on all write messages that target the
last RT but only set EOT on the last RT write in the shader.
Special-casing for dual-source blend works today because that is the
only case which requires multiple RT write messages per RT.  When we
start doing SIMD32, this will become much more common so we add a
dedicated bit for it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Francisco Jerez
d3cd6b7215 intel/fs: Replace the CINTERP opcode with a simple MOV
The only reason it was it's own opcode was so that we could detect it
and adjust the source register based on the payload setup.  Now that
we're using the ATTR file for FS inputs, there's no point in having a
magic opcode for this.

v2 (Jason Ekstrand):
 - Break the bit which removes the CINTERP opcode into its own patch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Francisco Jerez
39de901a96 intel/fs: Use the ATTR file for FS inputs
This replaces the special magic opcodes which implicitly read inputs
with explicit use of the ATTR file.

v2 (Jason Ekstrand):
 - Break into multiple patches
 - Change the units of the FS ATTR to be in logical scalars

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Francisco Jerez
4bfa2ac2ea intel/fs: Rename a local variable so it doesn't shadow component()
v2 (Jason Ekstrand):
 - Break the refactor into its own patch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Francisco Jerez
11c71f0e75 intel/eu: Remove brw_codegen::compressed_stack.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Jason Ekstrand
71a86d1fc6 intel/fs: Use groups for SIMD16 LINTERP on gen11+
This is better than compression control because it naturally extends to
SIMD32.

v2:
 - Push/pop instruction state around adjusted codegen (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Jason Ekstrand
a1a850cd34 intel/fs: Assert that the gen4-6 plane restrictions are followed
The fall-back does not work correctly in SIMD16 mode and the register
allocator should ensure that we never hit this case anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-29 15:44:50 -07:00
Jan Vesely
ed834aefa2 travis: Add clover llvm-6.0 build
v2: Don't force build using gcc-4.8
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
2018-05-29 17:36:16 -04:00
Jan Vesely
41b878e1bd clover: Cleanup compat code for llvm < 3.9
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
2018-05-29 17:36:16 -04:00
Jan Vesely
d424be0fed clover: Fix build after llvm r332881.
v2: fix whitespace and indentation

r332881 added an extra parameter to the emit function.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106619
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-By: Aaron Watry <awatry@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2018-05-29 17:36:16 -04:00
Chris Wilson
3ac5fbadfd i965: Only emit VF cache invalidations when the high bits changes
Commit 92f01fc5f9 ("i965: Emit VF cache invalidates for 48-bit
addressing bugs with softpin.") tried to only emit the VF invalidate if
the high bits changed, but it accidentally always set need_invalidate to
true; causing it to emit unconditionally emit the pipe control before
every primitive.

Fixes: 92f01fc5f9 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106708
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-29 12:16:26 -07:00
Eric Engestrom
e4fe2fd3bb vulkan: don't free uninitialised memory
The modifiers array hasn't been initialised by then, much less with data
that would need freeing.
Move the label after the loop to fix this.

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-29 17:44:13 +01:00
Eric Engestrom
51a17e7fee dri: replace two-way switch case with a table lookup
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
---
v2: rebased on top of 432df741e0 "dri_util: Add
R10G10B10{A,X}2 translation between DRI and mesa_format."
2018-05-29 17:44:13 +01:00
Eric Engestrom
d3ca7bd452 dri: fix error value returned by driGLFormatToImageFormat()
0 is not a valid value for the __DRI_IMAGE_FORMAT_* enum.
It is, however, the value of MESA_FORMAT_NONE, which two of the callers
(i915 & i965) checked for.

The other callers (that check for errors, ie. st/dri) already check for
__DRI_IMAGE_FORMAT_NONE.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-29 17:44:13 +01:00
Eric Engestrom
1945231b48 egl/x11: fix build with DRI3 disabled
Fixes: 473af0b541 "egl/x11: deduplicate depth-to-format logic"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>
2018-05-29 17:01:21 +01:00
Emil Velikov
63b95fb291 meson: require shared glapi when using DRI based libGL
Just like we do in the autotools build.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 16:56:19 +01:00
Emil Velikov
728d1da159 meson: remove unreachable with_glx == 'auto' check
Cannot happen since, props to the autodetection further up.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 16:31:46 +01:00
Thierry Reding
9e539012df tegra: Treat resources with modifiers as scanout
Resources created with modifiers are treated as scanout because there is
no way for applications to specify the usage (though that capability may
be useful to have in the future). Currently all the resources created by
applications with modifiers are for scanout, so make sure they have bind
flags set accordingly.

This is necessary in order to properly export buffers for such resources
so that they can be shared with scanout hardware.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-29 16:48:37 +02:00
Thierry Reding
9603d81df0 tegra: Fix scanout resources without modifiers
Resources created for scanout but without modifiers need to be treated
as pitch-linear. This is because applications that don't use modifiers
to create resources must be assumed to not understand modifiers and in
turn won't be able to create a DRM framebuffer and passing along which
modifiers were picked by the implementation.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-29 16:48:34 +02:00
Thierry Reding
bd3e97e5aa tegra: Remove usage of non-stable UAPI
This code path is no longer required with framebuffer modifier support.

Tested-by: Daniel Kolesa <daniel@octaforge.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-29 16:47:45 +02:00
Eric Engestrom
f736be86bb docs: add favicon to the website
favicon.png is just gears.png resized to 64x64, and favicon.ico is
generated using this command, adapted from the ImageMagick example [1]:

  $ convert favicon.png -background black \
      \( -clone 0 -resize 16x16 \) \
      \( -clone 0 -resize 32x32 \) \
      \( -clone 0 -resize 48x48 \) \
      \( -clone 0 -resize 64x64 \) \
      -delete 0 -alpha off -colors 256 favicon.ico

We could edit every html page to add `<link rel="icon" href="favicon.ico" />`,
but there's not much point as pretty much every browser will pick it up
automatically if the file is named `favicon.ico` and is in the root folder.

[1] http://www.imagemagick.org/Usage/thumbnails/#favicon

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 14:48:21 +01:00
Eric Engestrom
e6a1aca0b2 docs: add missing html closing tag
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 14:48:21 +01:00
Eric Engestrom
3b5376330f docs: add missing html tag
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 14:48:21 +01:00
Karol Herbst
56792a0876 nir/print: fix printing of 8/16 bit constant variables
v2 (Jose Maria Casanova Crespo <jmcasanova@igalia.com>): add float16 support

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-05-29 13:43:49 +02:00
Pierre Moreau
f0e80e123c nv50/ir: Extend ImmediateValue::applyLog2 to 64-bit integers
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-29 13:37:45 +02:00
Pierre Moreau
03f592a164 util/u_math: Implement a logbase2 function for unsigned long
v2 (Karol Herbst <kherbst@redhat.com>):
* removed unneeded ll
* ll -> ull

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-29 13:37:45 +02:00
Eric Engestrom
539aa604a0 docs: trivial typo fix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-29 12:10:14 +01:00
Samuel Pitoiset
88d1ed0f81 radv: emit shader descriptor pointers consecutively
This reduces the number of SET_SH_REG packets which are emitted
for applications that use more than one descriptor set per stage.

We should be able to emit more SET_SH_REG packets consecutively
(like push constants and vertex buffers for the vertex stage),
but this will be improved later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-29 10:07:18 +02:00
Samuel Pitoiset
21baf33a94 radv: allow radv_emit_shader_pointer_head() to emit more pointers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-29 10:07:16 +02:00
Samuel Pitoiset
288fe7ec71 radv: split radv_emit_shader_pointer()
This will allow to emit consecutive shader pointers for
reducing the number of emitted SET_SH_REG packets, which
is recommended.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-29 10:07:13 +02:00
Rhys Perry
57e721a456 gm107/ir: prevent WaW hazards in instruction scheduling
Previously, findFirstUse() only considered reads "uses". This fixes that
by making it check both an instruction's sources and definitions. It
also shortens both findFistUse() and findFirstDef() along the way.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-28 13:59:56 -04:00
Bas Nieuwenhuizen
a29bc043ae radv: Implement VK_KHR_draw_indirect_count.
Literally the same as the AMD ext.

Passes *indirect_draw_count* CTS tests.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-28 12:08:26 +02:00
Bas Nieuwenhuizen
b0002e4e05 vulkan: Update header+vk.xml to 1.1.76
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-28 12:08:20 +02:00
Bas Nieuwenhuizen
6914d5a2c0 radv: Implement alternate GFX9 scissor workaround.
This improves dota2 performance for me by 11% when I force the
GPU DPM level to low (otherwise dota2 is CPU limited for 4k on my
threadripper), which should be a large part of the radv-amdvlk gap.
(For me with that was radv 60.3 -> 66.6, while AMDVLK does about 68
fps)

It looks like dota2 rendered the GUI with a bunch of draws with
a SetScissors before almost each draw, causing a lot of pipeline
stalls.

I'm not really happy with the duplication of code, but overriding
radeon_set_context_reg would also be messy since we have the
pre-recorded pipelines and a bunch of si_cmd_buffer code, as well
as some memory->context reg loads for which things would be more
complicated.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-28 12:04:25 +02:00
Eric Anholt
3b6dfcf7ae Revert "st/nir: use NIR for asm programs"
This reverts commit 5c33e8c772.  It broke
fixed function vertex programs on vc4 and v3d, and apparently caused
trouble for radeonsi's NIR paths as well.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
https://bugs.freedesktop.org/show_bug.cgi?id=106673
2018-05-28 14:41:03 +10:00
Scott D Phillips
4714784dae anv: move canonical_address calculation into a separate function
A later patch will make use of this in other places. Also, remove
dependency on undefined behavior of left-shifting a signed value.

v2: - move function into a separate header (Chris)
v3: (by Ken) Add new header to the various build systems.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-27 19:24:33 -07:00
Gert Wollny
1aec4a07d4 r600: Fix SSG when not all components are written
Make sure only those components are written to that are specified in the
write mask.

Fixes:
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_fragment
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_vertex
  dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_fragment
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-28 02:57:46 +01:00
Gert Wollny
42cd2810aa r600: Correct IDIV if DST and SRC use the same temporary
In cases like

  IDIV TEMP[0].xy TEMP[0].xx TEMP[1].yy

the result will be written to the same register that is also a source register.
Since the components are evaluated one by one, this may result in overwriting
the source value for a later operation. Work around this by adding another
temporary to store the result if the destination temporary index is equal to
one of the source temporary indices.

Fixes:
  dEQP-GLES2.functional.shaders.operator.binary_operator.div.*
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-28 02:57:46 +01:00
Kenneth Graunke
58fb613a51 i965: Revert recent tiled memcpy changes.
This reverts commit 79fe00efb4.
This reverts commit f5e8b13f78.
This reverts commit d21c086d81.

They broke the Android build and I'd rather not leave it broken
for the long holiday weekend.
2018-05-26 16:25:50 -07:00
Scott D Phillips
79fe00efb4 i965/miptree: Use cpu tiling/detiling when mapping
Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.

Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.

v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)

v3: Add units to parameter names of tile_extents (Nanley Chery)
    Use _mesa_align_malloc for the shadow copy (Nanley)
    Continue using gtt maps on gen4 (Nanley)

v4: Use streaming_load_memcpy when detiling

v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it
    takes precedence.  Add intel_miptree_access_raw, needed after
    rebasing on commit b499b85b0f.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-25 21:35:50 -07:00
Chris Wilson
f5e8b13f78 i915: Fix streaming loads for intel_tiled_memcpy
We stream from a tiled and aligned source into an unaligned user buffer,
so we need to use _mm_storeu_si128.

Fixes: d21c086d81 (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-25 21:35:50 -07:00
Marek Olšák
18c50498db radeonsi: remove unused variable addr_vec
trivial
2018-05-25 18:37:57 -04:00
Jason Ekstrand
ae514ca695 intel/blorp: Support blits and clears on surfaces with offsets
For certain EGLImage cases, we represent a single slice or LOD of an
image with a byte offset to a tile and X/Y intratile offsets to the
given slice.  Most of i965 is fine with this but it breaks blorp.  This
is a terrible way to represent slices of a surface in EGL and we should
stop some day but that's a very scary and thorny path.  This gets blorp
to start working with those surfaces and fixes some dEQP EGL test bugs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-25 14:01:44 -07:00
Marek Olšák
2f65c67043 radeonsi: fix passing gl_ClipVertex for GS and tess
Also add the fprintf call.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:46:00 -04:00
Marek Olšák
a7d61c0753 radeonsi: fix color inputs/outputs for GS and tess
GS is tested, tessellation is untested.

Have outputs_written_before_ps for HW VS and outputs_written for other
stages. The reason is that COLOR and BCOLOR alias for HW VS, which
drives elimination of VS outputs based on PS inputs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:46:00 -04:00
Marek Olšák
92ea9329e5 radeonsi: fix incorrect parentheses around VS-PS varying elimination
I don't know if it caused issues.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:46:00 -04:00
Marek Olšák
a4ba7cd6a2 st/mesa: simplify lastLevel determination in st_finalize_texture
This fixes shader images where we always bind stObj->pt and not individual
gl_texture_images.

Roughly based on i965 commit 845ad2667a
which does a similar thing but for a different reason.

This fixes GL CTS assertion failures introduced by Ilia.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-25 16:31:36 -04:00
Scott D Phillips
d21c086d81 i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear
The reference for MOVNTDQA says:

    For WC memory type, the nontemporal hint may be implemented by
    loading a temporary internal buffer with the equivalent of an
    aligned cache line without filling this data to the cache.
    [...] Subsequent MOVNTDQA reads to unread portions of the WC
    cache line will receive data from the temporary internal
    buffer if data is available.

This hidden cache line sized temporary buffer can improve the
read performance from wc maps.

v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-25 11:05:46 -07:00
Alok Hota
fb20ae0374 swr/rast: Adjusted avx512 primitive assembly for msvc codegen
Optimize AVX-512 PA Assemble (PA_STATE_OPT). Reduced generated code by
about 4x, MSVC compiler was going crazy making temporaries and
split-loading inputs onto the stack unless explicit AVX-512 load ops
were added

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:57:02 -05:00
Alok Hota
b3360f5c8b swr/rast: Moved memory init out of core swr init
Added two new files for a wrapper function for initialization

v2: added missing include for single architecture builds

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:55 -05:00
Alok Hota
b6b114c1ae swr/rast: Removed superfluous JitManager argument from passes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:49 -05:00
Alok Hota
98d0201577 swr/rast: Renamed MetaData calls
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:43 -05:00
Alok Hota
14b5cac0be swr/rast: Use metadata to communicate between passes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:37 -05:00
Alok Hota
f09636e2e1 swr/rast: Check gCoreBuckets/CORE_BUCKETS equal length at compile time
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:56:01 -05:00
Alok Hota
cfe75cc7b5 swr/rast: Added in-place building to SCATTERPS
SCATTERPS previously assumed it was being used with an existing basic
block

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-25 10:55:37 -05:00
Samuel Pitoiset
45eb24fedf radv: run the EarlyCSEMemSSA LLVM pass
It's recommended by the instruction combining pass, and
RadeonSI also runs it.

This pass used to segfault with one shader of F12017 in the
past, but it no longer crashes. Maybe the LLVM IR generated
by RADV has changed.

Polaris10:
Totals from affected shaders:
SGPRS: 441352 -> 441648 (0.07 %)
VGPRS: 310888 -> 300784 (-3.25 %)
Spilled SGPRs: 13576 -> 12983 (-4.37 %)
Code Size: 22560328 -> 22420544 (-0.62 %) bytes
Max Waves: 40755 -> 41366 (1.50 %)

Vega10:
Totals from affected shaders:
SGPRS: 442848 -> 442000 (-0.19 %)
VGPRS: 310396 -> 300460 (-3.20 %)
Spilled SGPRs: 13708 -> 12906 (-5.85 %)
Code Size: 22479428 -> 22336216 (-0.64 %) bytes
Max Waves: 45783 -> 46506 (1.58 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 14:24:14 +02:00
Samuel Pitoiset
66e38654c9 radv: fix dumping compute shader on the graphics queue
The graphics pipeline can be NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 11:58:07 +02:00
Samuel Pitoiset
de06dfa9ea radv: add radv_dump_pipeline_state() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 11:58:05 +02:00
Samuel Pitoiset
6f0530ecfe radv: rework how shaders are dumped when generating a hang report
Use a flag for the active stages instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 11:58:03 +02:00
Samuel Pitoiset
8c406f0b4d radv: remove unused parameter in radv_dump_annotated_shader()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-25 11:57:59 +02:00
Jose Dapena Paz
6c61c31dc2 mesa: do not leak ctx->Shader.ReferencedProgram references
When glUseProgram is used, references to the included shaders are
added in ctx->Shader.ReferencedProgram. But those references are not
decreased when the shader data is deallocated. Thus, those shaders
are leaked.

Explicitely remove the pending references to these shaders.

Fixes: e6506b3cd2 ("mesa: retain gl_shader_programs after glDeleteProgram if they are in use")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-25 10:38:09 +10:00
Marek Olšák
508b423dd6 radeonsi: set DB_EQAA.MAX_ANCHOR_SAMPLES correctly
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:57 -04:00
Marek Olšák
07e02c8617 radeonsi: round ps_iter_samples in set_min_samples
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:57 -04:00
Marek Olšák
510c88f9d1 radeonsi: remove redundant ps_iter_samples clamp
si_get_ps_iter_samples already does this.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Marek Olšák
25cdf754e4 radeonsi: remove some old gfx 9.x registers
Leftover from bring up.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Marek Olšák
b936f9aa32 radeonsi: disable primitive binning for all blitter ops
same as amdvlk.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Marek Olšák
8c1c451a90 ac/surface/gfx6: don't overallocate mipmapped HTILE
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Eric Engestrom
473af0b541 egl/x11: deduplicate depth-to-format logic
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-24 18:01:45 +01:00
Tapani Pälli
7b54404c9d i965: enable OES_texture_view for gen8+
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-24 12:53:07 +03:00
Tapani Pälli
3ddcdcf94d mesa: changes to expose OES_texture_view extension
Functionality already covered by ARB_texture_view, patch also
adds missing 'gles guard' for enums (added in f1563e6392).

Tested via arb_texture_view.*_gles3 tests and individual app
utilizing texture view with ETC2.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-24 12:53:07 +03:00
Juan A. Suarez Romero
046b2b651e docs: update release calendar for 18.1 series
v2: extend 18.1 series (Andres)
v3: fix copy/paste typo (Engestrom)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-05-24 11:47:47 +02:00
Samuel Pitoiset
38a8c5903b radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS
Do not lower FS inputs because this moves all load_var
instructions at beginning of shaders and because
interp_var_at_sample (and friends) seem broken. That might
be eventually enabled later on if we really want to preload
all FS inputs at beginning.

Polaris10:
Totals from affected shaders:
SGPRS: 54072 -> 54264 (0.36 %)
VGPRS: 38580 -> 38124 (-1.18 %)
Spilled SGPRs: 652 -> 652 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 2128116 -> 2127380 (-0.03 %) bytes
Max Waves: 8048 -> 8086 (0.47 %)

Vega10:
Totals from affected shaders:
SGPRS: 52616 -> 52656 (0.08 %)
VGPRS: 37536 -> 37116 (-1.12 %)
Spilled SGPRs: 828 -> 828 (0.00 %)
Code Size: 2043756 -> 2042672 (-0.05 %) bytes
Max Waves: 9176 -> 9254 (0.85 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-24 09:18:57 +02:00
Samuel Pitoiset
ded1509587 radv: call nir_split_var_copies() before nir_lower_var_copies()
This doesn't nothing special currently because we don't create
any copy_var instructions, but this is needed for the next patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-24 09:18:54 +02:00
Francisco Jerez
936cd3c87a i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.
Instead of directly using intel_obj->buffer.  Among other things
intel_bufferobj_buffer() will update intel_buffer_object::
gpu_active_start/end, which are used by glBufferSubData() to decide
which path to take.  Fixes a failure in the Piglit
ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests,
which could be reproduced with a non-standard glGetTexSubImage
implementation (see bug report).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351
Reported-by: Nanley Chery <nanleychery@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-05-23 16:21:34 -07:00
Francisco Jerez
e989acb03b i965: Handle non-zero texture buffer offsets in buffer object range calculation.
Otherwise the specified surface state will allow the GPU to access
memory up to BufferOffset bytes past the end of the buffer.  Found by
inspection.

v2: Protect against out-of-range BufferOffset (Nanley).
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-05-23 16:21:28 -07:00
Francisco Jerez
156d2c6e62 i965: Move buffer texture size calculation into a common helper function.
The buffer texture size calculations (should be easy enough, right?)
are repeated in three different places, each of them subtly broken in
a different way.  E.g. the image load/store path was never fixed to
clamp to MaxTextureBufferSize, and none of them are taking into
account the buffer offset correctly.  It's easier to fix it all in one
place.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-05-23 16:21:09 -07:00
Francisco Jerez
5a68147803 Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"
This reverts commit c0ed52f614.  It was
preventing the image format validation from being done on buffer
textures, which is required to ensure that the application doesn't
attempt to bind a buffer texture with an internal format incompatible
with the image unit format (e.g. of different texel size), which is
not allowed by the spec (it's not allowed for *any* texture target,
whether or not there is spec wording restricting this behavior
specifically for buffer textures) and will cause the driver to
calculate texel bounds incorrectly and potentially crash instead of
the expected behavior.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-05-23 16:21:09 -07:00
Bas Nieuwenhuizen
699e1f5aac ac: Use DPP for build_ddxy where possible.
WQM is pretty reliable now on LLVM 7, so let us just use
DPP + WQM.

This gives approximately a 1.5% performance increase on the
vrcompositor built-in benchmark.

v2: Use ac_build_quad_swizzle.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-23 21:02:45 +02:00
Miguel Casas
b73b340c37 i965: add {X,A}BGR2101010 to 'intel_image_formats'
This patch adds {X,A}BGR2101010 entries to the list of supported
'intel_image_formats'.

Bug: https://crbug.com/776093
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-23 10:19:04 -07:00
Miguel Casas
432df741e0 dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format.
Add R10G10B10{A,X}2 translation between mesa_format and DRI format
to driGLFormatToImageFormat() and driImageFormatToGLFormat().

Bug: https://crbug.com/776093
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-23 10:17:45 -07:00
Dylan Baker
c8acfd5ab2 bin/get-pick-listh.sh: force git --pretty=medium
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-05-23 09:54:17 -07:00
Dylan Baker
5a639bdb81 bin/bugzilla_mesa.sh: explicitly set the --pretty argument
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-05-23 09:54:00 -07:00
Eric Engestrom
ec986241f3 docs: drop unnecessary out-of-frame target
I'm guessing an earlier version of the website used to have the page
contents in <frames>, but this isn't the case anymore so just drop the
unnecessary `target="_main"` :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-23 16:52:23 +01:00
Eric Engestrom
09a6cb7be6 docs: fix various html tags mistakes
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-23 16:52:23 +01:00
Eric Engestrom
8034f5f623 docs: fix < & > used in html code
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-23 16:52:23 +01:00
Juan A. Suarez Romero
6db0660d08 docs: add news notes to 18.1.0
CC: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-05-23 13:06:55 +02:00
Dave Airlie
f2f464de57 tgsi/scan: add hw atomic to the list of memory accessing files
This fixes 4 out of 5 cases in:
arb_framebuffer_no_attachments-atomic on cayman.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
2018-05-23 03:51:40 +01:00
Roland Scheidegger
7b89fcec41 llvmpipe: improve rasterization discard logic
This unifies the explicit rasterization discard as well as the implicit
rasterization disabled logic (which we need for another state tracker),
which really should do the exact same thing.
We'll now toss out the prims early on in setup with (implicit or
explicit) discard, rather than do setup and binning with them, which
was entirely pointless.
(We should eventually get rid of implicit discard, which should also
enable us to discard stuff already in draw, hence draw would be
able to skip the pointless clip and fallback stages in this case.)
We still need separate logic for only null ps - this is not the same
as rasterization discard. But simplify the logic there and don't count
primitives simply when there's an empty fs, regardless of depth/stencil
tests, which seems perfectly acceptable by d3d10.
While here, also fix statistics for primitives if face culling is
enabled.
No piglit changes.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-05-23 04:23:32 +02:00
Bas Nieuwenhuizen
047438287c ac/surface/gfx6: Don't force a tile index for fmask.
The bpe of the fmask often differs from the bpe of the main
surface. On SI that means it has to get a different tile
index.

addrlib is capable of figuring this out itself, so just pass
-1 instead to let it know that it is not preset.

Fixes: 9bf3570fed "ac/surface/gfx6: compute FMASK together with the color surface"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106511
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106499
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-23 02:23:03 +02:00
Jason Ekstrand
a347a5a12c i965: Remove ring switching entirely
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:39 -07:00
Jason Ekstrand
b499b85b0f i965/miptree: Move the access_raw call to the individual map functions
The only function that doesn't need to call access_raw is map_blit.  If
it takes the blitter path, it will happen as part of intel_miptree_copy.
If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle
doing whatever resolves are needed.  This should save us resolves in
quite a few cases and will probably help performance a bit.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:37 -07:00
Jason Ekstrand
f566a1264c i965: Remove support for the BLT ring
We still support the blitter on gen4-5 but it's on the same ring as 3D.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:35 -07:00
Jason Ekstrand
33affda8bf i965/miptree: Use blorp for blit maps on gen6+
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:34 -07:00
Jason Ekstrand
0eedb0fca9 i965/miptree: Use blorp for validation tex copies on gen6+
It's faster than the blitter and can handle things like stencil properly
so it doesn't require software fallbacks.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:32 -07:00
Jason Ekstrand
80fc3896f3 i965: Delete the blitter path for CopyTexSubImage
The blorp path (called first) can do anything the blitter path can do so
it's just dead code.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:31 -07:00
Jason Ekstrand
8162256b01 i965: Don't fall back to the blitter in BlitFramebuffer
On gen4-5, we try the blitter before we even try blorp.  On newer
platforms, blorp can do everything the blitter can so there's no point
in even having the blitter fall-back path.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:29 -07:00
Jason Ekstrand
e596563b08 i965: Remove some unused includes of intel_blit.h
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:27 -07:00
Jason Ekstrand
a9499374a9 i965/blit: Delete intel_emit_linear_blit
This function is no longer used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:25 -07:00
Jason Ekstrand
7fd962093f i965: Use meta for pixel ops on gen6+
Using meta for anything is fairly aweful and definitely has more CPU
overhead.  However, it also uses the 3D pipe and is therefore likely
faster in terms of GPU time than the blitter.  Also, the blitter code
has so many early returns that it's probably not buying us that much.
We may as well just use meta all the time instead of working over-time
to find the tiny case where we can use the blitter.  We keep gen4-5
using the old blit paths to avoid perturbing old hardware too much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-22 15:46:20 -07:00
Kenneth Graunke
92f01fc5f9 i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.
We'd like to start using soft-pin to assign BO addresses up front, and
never move them again.  Our previous plan for dealing with 48-bit VF
cache bugs was to relocate vertex buffers to the low 4GB, so we'd never
have addresses that alias in the low 32 bits.  But that requires moving
buffers dynamically.

This patch tracks the last seen BO address for each vertex/index buffer,
and emits a VF cache invalidate if the high bits change.  (Ideally, we
won't hit this case very often.)  This should work for the soft-pin
case, but unfortunately won't work in the relocation case, as we don't
actually know the addresses.  So, we have to use both methods.

v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple
    more explicitly (suggested by Scott).  Mention "single batch" too
    (suggested by Chris).

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-22 10:02:28 -07:00
Kenneth Graunke
c7259259d4 i965: Introduce a "memory zone" concept on BO allocation.
We're planning to start managing the PPGTT in userspace in the near
future, rather than relying on the kernel to assign addresses.  While
most buffers can go anywhere, some need to be restricted to within 4GB
of a base address.

This commit adds a "memory zone" parameter to the BO allocation
functions, which lets the caller specify which base address the BO will
be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA.

Eventually, I hope to create a 4GB memory zone corresponding to each
state base address.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-05-22 10:01:09 -07:00
Jason Ekstrand
417b9e5770 intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0
Fixes: d6cd14f213 "i965/fs: Define new shader opcode to..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-05-22 09:53:23 -07:00
Michel Dänzer
fe2edb25dd dri3: Stricter SBC wraparound handling
Prevents corrupting the upper 32 bits of draw->recv_sbc when
draw->send_sbc resets to 0 (which currently happens when the window is
unbound from a context and bound to one again), which in turn caused
loader_dri3_swap_buffers_msc to calculate target_msc with corrupted
upper 32 bits. This resulted in hangs with the Xorg modesetting driver
as of xserver 1.20 (older versions and other drivers ignored the upper
32 bits of the target MSC, which is why this wasn't noticed earlier).

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/106351
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2018-05-22 17:59:53 +02:00
Samuel Pitoiset
75e919c045 radv: fix computation of user sgprs for 32-bit pointers
With 32-bit pointers we only need one user SGPR per desc set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:29 +02:00
Samuel Pitoiset
c5536fc813 radv: drop user_sgpr_info::sgpr_count
It's only used inside allocate_user_sgprs().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:26 +02:00
Samuel Pitoiset
36a4d6d081 radv: add support for 32-bit pointers in user data SGPRs
We still use 64-bit GPU pointers for all ring buffers because
llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit
GPU pointers for now. This can be improved later anyways.

Vega10:
Totals from affected shaders:
SGPRS: 1008722 -> 1026710 (1.78 %)
VGPRS: 706580 -> 707136 (0.08 %)
Spilled SGPRs: 22555 -> 22209 (-1.53 %)
Spilled VGPRs: 75 -> 75 (0.00 %)
Code Size: 34819208 -> 35202140 (1.10 %) bytes
Max Waves: 175423 -> 175086 (-0.19 %)

Polaris10:
Totals from affected shaders:
SGPRS: 1029849 -> 1036517 (0.65 %)
VGPRS: 709984 -> 708872 (-0.16 %)
Spilled SGPRs: 22672 -> 22309 (-1.60 %)
Spilled VGPRs: 82 -> 66 (-19.51 %)
Scratch size: 76 -> 60 (-21.05 %) dwords per thread
Code Size: 34915336 -> 35309752 (1.13 %) bytes
Max Waves: 151221 -> 151677 (0.30 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:22 +02:00
Samuel Pitoiset
b654ef5808 radv: add set_loc_shader_ptr() helper
This helper will hep for switching to 32-bit GPU pointers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:20 +02:00
Samuel Pitoiset
14a7547c08 radv: allocate descriptor BOs in the 32-bit addr space
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:18 +02:00
Samuel Pitoiset
0d1406ad12 radv: allocate the upload BO in the 32-bit addr space
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:17 +02:00
Samuel Pitoiset
d8a61d3232 radv: set amdgpu-32bit-address-high-bits LLVM attribute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:15 +02:00
Samuel Pitoiset
fe2649d3ad radv/winsys: allow to allocate BOs in the 32-bit addr space
This introduces a new flag called RADEON_FLAG_32BIT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:13 +02:00
Samuel Pitoiset
b60e0ee789 radv/winsys: request high address
This is needed for 32-bit GPU pointers. Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-22 15:53:09 +02:00
Anuj Phogat
0748383a60 i965/glk: Add l3 banks count for 2x6 configuration
2x6 configuration with pci-id 0x3185 has same number of
banks (2) as 3x6 configuration (pci-id 0x3184).

Reported-by: Clayton Craft <clayton.a.craft@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: eb23be1d97 "i965: Add and initialize l3_banks field for gen7+"
Cc: Francisco Jerez <currojerez@riseup.net>
2018-05-21 16:43:26 -07:00
Vinson Lee
85f61197df v3d: Include v3d_drm.h path.
Fix build error.

  CC       v3d_blit.lo
In file included from v3d_blit.c:27:0:
v3d_context.h:39:10: fatal error: v3d_drm.h: No such file or directory
 #include "v3d_drm.h"
          ^~~~~~~~~~~

Fixes: 8a793d42f1 ("v3d: Switch the vc5 driver to using the finalized V3D UABI.")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-21 11:15:47 -07:00
Samuel Pitoiset
73df16dcee radv: fix centroid interpolation
It's legal to set the centroid and sample interpolation modes
when MSAA disabled. So, we have to initialize the centroid
inputs because the hardware doesn't.

This fixes rendering issues with DXVK and The Witness, World of
Warcraft, Trackmania and probably more games.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106315
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102390
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-21 13:57:46 +02:00
Bas Nieuwenhuizen
f26b008e28 radv: Cleanup unused prime blit path.
Since we have the common WSI code, we use vkCmdCopyImageToBuffer
instead.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-21 10:33:41 +02:00
Bas Nieuwenhuizen
a63a0960e3 radv: Fix SRGB compute copies.
SRGB stores are broken. We had compensation code in the
resolve path but none in the copy path. Since we don't
want any conversion and it does not matter for DCC,
just make everything UNORM instead.

This happened to cause wrong colors for the PRIME path, as
that uses image->buffer copies which always use the compute
path.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106587
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-21 10:33:41 +02:00
Tapani Pälli
63525ba730 android: enable VK_ANDROID_native_buffer
Patch changes entrypoints generator to not skip this extension even
though it is set as disabled in the xml. We also need compilation
flag VK_USE_PLATFORM_ANDROID_KHR to be enabled.

It looks like this extension got disabled in commit 69f447553c.

v2: just remove the whole 'supported' attrib check + remove
    vk_icd.h compilation fix (fix in VulkanHeaders instead)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-21 09:26:50 +03:00
Tapani Pälli
437acae704 vulkan: update vk_icd.h to current upstream
Import from commit eb0c1fd on branch 'master'
of https://github.com/KhronosGroup/Vulkan-Headers.git.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-21 09:26:50 +03:00
Dave Airlie
bfa74bb44d virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.
The host side hasn't got support for this feature yet, so don't enable it
unless we get the caps from the host.

This makes the texture buffer range piglit tests skip now.

Fixes: fe0647df5a (virgl: add offset alignment values to to v2 caps struct)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2018-05-21 12:44:55 +10:00
Timothy Arceri
2e6c987a85 mesa: stop hiding query parameters from OpenGL compat
Just let the extension detection do its job as we will be adding
compat profile support in future, also we want these to work
with compat profile version overrides.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-21 09:39:03 +10:00
Christoph Haag
549e54270b radv: fix VK_EXT_descriptor_indexing
GetPhysicalDeviceProperties2KHR() was crashing because features was null

Fixes: 0e10790558 "radv: Enable VK_EXT_descriptor_indexing."
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-20 13:36:07 +02:00
Bas Nieuwenhuizen
a1c87235a9 ac/surface: Only align linear power of two fmt textures.
We're not sharing 32_32_32 formats between different GPUs, so we
do not have to align for vega on pre-vega cards.

Fixes: e361970ed7 "radv: Add support for IMG_DATA_FORMAT_32_32_32."
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-20 11:57:59 +02:00
Bas Nieuwenhuizen
62e0e089d7 amd/addrlib: Use defines in autotools build.
Otherwise stuff like NDEBUG would not be passed through.

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106479
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-20 11:57:59 +02:00
Aaron Watry
cfe582f9dc r600/compute: Mark several functions as static
They're not used anywhere else, so keep them private

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-05-19 10:22:16 -05:00
Aaron Watry
d21e64c626 r600/compute: Remove unused compute_memory_pool functions
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-05-19 10:21:57 -05:00
Roland Scheidegger
6f558fb0f7 draw: get rid of special logic to not emit null tris
I've confirmed after 77554d220d we no
longer need this to pass some tests from another api (as we no longer
generate the bogus extra null tris in the first place).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-05-19 02:49:58 +02:00
Dylan Baker
c86e9a5fe5 docs: Add sha sums for release 2018-05-18 16:44:50 -07:00
Dylan Baker
1d46852830 docs: Add release notes for 18.1.0 2018-05-18 16:44:43 -07:00
Alyssa Rosenzweig
5d85a0a55b nir: Implement optional b2f->iand lowering
This pass is required by the Midgard compiler; our instruction set uses
NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.
Normally, this lowering pass would be implemented in a backend-specific
algebraic pass, but this conflicts with the existing iand->b2f pass in
nir_opt_algebraic.py, hanging the compiler. This patch thus makes the
existing pass optional (default on -- all other backends should remain
unaffected), adding an optional pass for lowering the opposite
direction.

v2: Defer lowering until late algebraic optimisations to allow
optimising the b2f instruction itself.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-05-18 22:44:09 +02:00
Jan Vesely
8ed2cabd04 travis: Adapt to radeonsi dropping support for LLVM 4
meson Vulkan, Clover, and autotools Vulkan need to be switched to llvm 5

Fixes: f9eb1ef870
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-18 13:59:37 -04:00
Marek Olšák
3d64ed5785 radeonsi: skip ES output stores for undefined output components
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-18 13:38:07 -04:00
Nanley Chery
0ab25f05ab i965: isl: Move the MCS gen7+ assertion into ISL
This is useful for every user of ISL. Drop the comment along the way to
match similar functions in ISL.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-18 09:53:06 -07:00
Nanley Chery
f88caf2321 i965/miptree: Remove format assertion in alloc_aux
intel_miptree_supports_{ccs,mcs,hiz} ensures the format is valid for the
color or depth miptree before the miptree is assigned an aux_usage.
alloc_aux switches on the aux_usage so don't assert that the format is
valid.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-18 09:53:06 -07:00
Nanley Chery
8007b2d78b i965/miptree: Simplify the switch in supports_ccs
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-18 09:53:06 -07:00
Nanley Chery
da98441fef i965: Make get_ccs_surf succeed in alloc_aux
Synchronize the requirements listed in isl_surf_get_ccs_surf with
intel_miptree_supports_ccs by importing a restriction from ISL. Some
implications:
* We successfully create every aux_surf in alloc_aux
* We only return false from alloc_aux if we run out of memory

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-18 09:53:06 -07:00
Brian Paul
42aee8f4f6 llvmpipe: fix check for a no-op shader
The tgsi_info.num_tokens fix broke llvmpipe's detection of no-op shaders.
Fix the code to check for num_instructions <= 1 instead.

Fixes: 8fde9429c3 ("tgsi: fix incorrect tgsi_shader_info::num_tokens
computation")
Tested-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-05-18 09:09:41 -06:00
Samuel Pitoiset
03c4816093 radv: pass radv_nir_compiler_options directly to create_llvm_function()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-18 11:07:01 +02:00
Christian Gmeiner
2eb3f794d9 st/mesa: only define GLSL 1.4 for compat if driver supports it
Currently GLSL 1.4 is defined for all gallium drivers even only
GLSL 1.2 is supported as seen on etnaviv.

v1 -> v2:
 - use _min(..) as suggested by Lucas Stach and Michel Dänzer

Fixes: 4560aad780 ("mesa: add GLSLVersionCompat constant")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-18 10:46:24 +02:00
Dave Airlie
48e28ab961 vbo: remove MaxVertexAttribStride assert check.
Some drivers (virgl) don't support GL4.4 or GLES3.1 yet,
so never fill in this const.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-05-18 14:58:15 +10:00
Timothy Arceri
c0c69bd8dd mesa: drop GL_EXT_polygon_offset support
glPolygonOffset() has been part of the GL standard since 1.1. Also
niether AMD or Nvidia support this in their binary drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61761
2018-05-18 09:21:24 +10:00
Brian Paul
8fde9429c3 tgsi: fix incorrect tgsi_shader_info::num_tokens computation
We were incrementing num_tokens in each loop iteration while parsing
the shader.  But each call to tgsi_parse_token() can consume more than
one token (and often does).  Instead, just call the tgsi_num_tokens()
function.

Luckily, this issue doesn't seem to effect any current users of this
field (llvmpipe just checks for <= 1, for example).

Reviewed-by: Neha Bhende<bhenden@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-05-17 15:02:05 -06:00
Samuel Pitoiset
fcba3934fc radv: add radv_emit_shader_pointer() helper
For future work (support for 32-bit GPU pointers).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 21:28:59 +02:00
Samuel Pitoiset
9b2c310a70 radv: add some helpers for cleaning up radv_get_preamble_cs()
Because this function looks a bit ugly to me.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 21:28:57 +02:00
Marek Olšák
f9eb1ef870 amd: remove support for LLVM 4.0
It doesn't support GFX9.

Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-17 14:54:41 -04:00
Juan A. Suarez Romero
11a0d5563f docs: update calendar, add news and link release notes to 18.0.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-17 18:45:26 +00:00
Juan A. Suarez Romero
042e21976a docs: add sha256 checksums for 18.0.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 69ef6e4a75)
2018-05-17 18:40:53 +00:00
Juan A. Suarez Romero
bb7750e8da docs: add release notes for 18.0.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3b49ab6219)
2018-05-17 18:40:51 +00:00
Mathias Fröhlich
6fac626193 mesa: The glArrayElement api is independent of the current program.
All the shader program dependent handling is done on the level
of the gl_Context::Array._DrawVAO/_DrawVAOEnabledAttribs.
So, skip array element invalidation on _NEW_PROGRAM.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-17 20:13:40 +02:00
Mathias Fröhlich
984cb4e512 mesa: Flag _NEW_ARRAY only if we are changing ctx->Array.VAO.
For the VAO internal helper functions that may be called
with a non current VAO, flag the _NEW_ARRAY state only
if it is the current ctx->Array.VAO.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-17 20:13:39 +02:00
Mathias Fröhlich
5c7e3a90ed mesa: Remove flush_vertices argument from VAO methods.
The flush_vertices argument is now unused, remove it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-17 20:13:39 +02:00
Mathias Fröhlich
9c7be67968 mesa: Remove FLUSH_VERTICES from VAO state changes.
Pending draw calls on immediate mode or display list calls do
not depend on changes of the VAO state. So, remove calls to
FLUSH_VERTICES and flag _NEW_ARRAY as appropriate.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-17 20:13:39 +02:00
Juan A. Suarez Romero
0a2c947556 docs: add 18.0.5 in the release calendar
Mesa 18.1 series has not been released yet, so let's extend 18.0 lifetime.

v2: Add missing closing TR tags (Eric Engestrom)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-05-17 19:01:19 +02:00
Alok Hota
936ce75285 swr/rast: Added FEClipRectangles event
and also added some comments

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:53:14 -05:00
Alok Hota
a33d376133 swr/rast: Whitespace and tab-to-spaces changes
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:53:10 -05:00
Alok Hota
7970fcff25 swr/rast: fix VCVTPD2PS generation for AVX512
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:53:06 -05:00
Alok Hota
a0dddac1cb swr/rast: Rectlist support for GS
Add rectlist as an option for GS.  Needed to support some driver
optimizations.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:53:01 -05:00
Alok Hota
7926d18fa5 swr/rast: Remove unneeded virtual from methods
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-17 10:52:21 -05:00
Stefan Schake
b0acc3a562 broadcom/vc4: Native fence fd support
With the syncobj support in place, lets use it to implement the
EGL_ANDROID_native_fence_sync extension. This mostly follows previous
implementations in freedreno and etnaviv.

v2: Drop the flags (Eric)
    Handle in_fence_fd already in job_submit (Eric)
    Drop extra vc4_fence_context_init (Eric)
    Dup fds with CLOEXEC (Eric)
    Mention exact extension name (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:30 +01:00
Stefan Schake
44036c354d broadcom/vc4: Store job fence in syncobj
This gives us access to the fence created for the render job.

v2: Drop flag (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:28 +01:00
Stefan Schake
9ed05e2520 broadcom/vc4: Detect syncobj support
We need to know if the kernel supports syncobj submission since otherwise
all the DRM syncobj calls fail.

v2: Use drmGetCap to detect syncobj support (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:26 +01:00
Stefan Schake
4fc0ebdff5 broadcom/vc4: Bump libdrm requirement
Require a version of libdrm with syncobj support.

v2: Don't require a libdrm_vc4, just bump core libdrm if vc4 enabled (by
    anholt)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:24 +01:00
Stefan Schake
580d1f4c60 drm-uapi: Update vc4 header with syncobj submit support
v2: Synchronized with kernel v2
v3: Update for the finalized kernel ABI (pad2 field)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:21 +01:00
Stefan Schake
1ec01a911b broadcom/vc4: Drop libdrm_vc4 requirement
This was missed in the move back to the local uapi copy.
libdrm_vc4 only seems to consist of headers that also exist in the
Mesa tree.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:12 +01:00
Eric Anholt
97894b1267 v3d: Add support for glSampleMask / glSampleCoverage. 2018-05-17 15:09:46 +01:00
Eric Anholt
9bbc3f8cf1 v3d: Enable NaN propagation in the VS and CS as well.
Fixes piglit vs-isnan-*.shader_test at the expense of gl-1.0-spot-light.
2018-05-17 15:09:12 +01:00
Nanley Chery
edfb57c0a0 i965/blorp: Disable BLORP clear color updates
With the previous patches, we now update the indirect clear color buffer
every time the clear color changes. Avoid redundant updates.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:42 -07:00
Nanley Chery
02f5512fed intel/blorp: Add a NO_UPDATE_CLEAR_COLOR batch flag
Allow callers to handle updating the indirect clear color buffer
themselves. This can reduce the number of clear color updates in the
case where a caller performs multiple fast clears with the same clear
color.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:42 -07:00
Nanley Chery
f8ac11d69f i965/blorp: Also skip the fast clear if the clear color differs
If the aux state is CLEAR and clear color value has changed, only the
surface state must be updated. The bit-pattern in the aux buffer is
exactly the same.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:42 -07:00
Nanley Chery
43616404be i965/clear: Drop a stale comment in fast_clear_depth
This comment made more sense when it was above the calls to
intel_miptree_slice_set_needs_depth_resolve(). We stopped using these
functions at commit 554f7d6d02
("i965: Move depth to the new resolve functions").

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
82849fb6d5 i965: Update the indirect buffer in set_clear_color
For depth buffers, we avoid fast-clearing if the aux_state is already
CLEAR. We do the same for color buffers only if the clear color
doesn't change. We require that the clear colors match because, in
that case, we don't update the indirect clear color outside of BLORP.

Update the indirect clear color for color buffers as well. We'll
enable the same depth buffer optimization for color buffers in a
later patch.

Note that we're now actually updating the indirect clear color twice
in the case where we use BLORP to perform the fast-clear. This is
only temporary. In later patches, we'll prevent BLORP from performing
the update.

v2: Add more context to the commit message (Topi).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-17 07:06:41 -07:00
Nanley Chery
5b315f3ad1 i965/clear: Remove an early return in fast_clear_depth
Reduce complexity and allow the next patch to delete some code. With
this change, clear operations will still be skipped and setting the
aux_state will cause no side-effects.

Remove the associated comment which implies an early return.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
6f609ca609 i965: Use set_clear_color for depth miptrees
Reduce code duplication now and prevent it in the following commits.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
92a0a87b6f Revert "i965: Make the miptree clear color setter take a gl_color_union"
This reverts commit 1d94aa1987.

The next patch will make depth miptrees use the clear color setter that
was originally being used for color miptrees. Go back to using the
isl_color_value parameter because it's the same type as the
fast_clear_color field used by color and depth miptrees.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
bb18af82c3 i965/miptree: Unify aux buffer allocation
There isn't much that changes between the aux allocation functions.
Remove the duplicated code.

v2: Inline the switch statement (Jason).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
6c41a2ef3b i965: Prepare to delete intel_miptree_alloc_ccs()
We're going to delete intel_miptree_alloc_ccs() in the next commit. With
that in mind, replace the use of this function in
do_single_blorp_clear() with intel_miptree_alloc_aux() and move the
delayed allocation logic to it's callers.

v2: Duplicate the delayed allocation comment (Topi Pohjolainen).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
beed9c4550 i965/miptree: Drop the mt param from alloc_aux_buffer
Drop an unused parameter.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
6b1836aabe i965/miptree: Drop the alloc_flags param from alloc_aux_buffer
We have enough information to determine the optimal flags internally.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
3dd7f600e0 i965/miptree: Drop the name param from alloc_aux_buffer
A name of "aux-miptree" should be sufficient.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
58d99a21f1 i965/miptree: Initialize the indirect clear color to zero
The indirect clear color isn't correctly tracked in
intel_miptree::fast_clear_color. The initial value of ::fast_clear_color
is zero, while that of the indirect clear color is undefined.

Topi Pohjolainen discovered this issue with MCS buffers. This issue is
apparent when fast-clearing an MCS buffer for the first time with
glClearColor = {0.0,}. Although the indirect clear color is undefined,
the initial aux state of the MCS is CLEAR and the tracked clear color is
zero, so we avoid updating the indirect clear color with {0.0,}.

Make the indirect clear color match the initial value of
::fast_clear_color.

Note: although we only have to drop HiZ's BO_ALLOC_BUSY flag for gen10+,
we also drop it pre-gen10 to keep things simple. We add this flag back
for pre-gen10 in a later patch.

v2: Add a note about dropping HiZ's BO_ALLOC_BUSY flag (Topi).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
b58675e93f i965/miptree: Add and use a memset option in alloc_aux_buffer
Add infrastructure for initializing the clear color BO.
intel_miptree_init_mcs is no longer needed with change.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
8a9491058d i965/miptree: Zero-initialize CCS_D buffers
Before this patch, the aux_state was actually AUX_INVALID because the BO
was never defined. This was fine on single slice miptrees because we
would fast-clear the resource right after creation. For multi-slice
miptrees on SKL+ however, this results in undefined behavior when
accessing a non-base slice. Here's a specific example:

1) Fast clear level 0
   * Undefined CCS_D buffer allocated in "PASS_THROUGH" state.
   * Level 0 transitions to the CLEAR state.
2) Render to level 1
   * Level 1 may have a 2-bit pattern of 2's.
   * Rendering with a 2 in the CCS is undefined.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Nanley Chery
816f2dc67d i965/miptree: Fix handling of uninitialized MCS buffers
Before this patch, if we failed to initialize an MCS buffer, we'd
end up in a state in which the miptree thinks it has an MCS buffer,
but doesn't. We also leaked the clear_color_bo if it existed.

With this patch, we now free the miptree aux buffer resources and let
intel_miptree_alloc_mcs() know that the MCS buffer no longer exists.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-17 07:06:41 -07:00
Samuel Pitoiset
1fba2e10b3 radv: only declare the ESGS rings for pre GFX9 chips
GFX9 uses LDS instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 14:14:20 +02:00
Samuel Pitoiset
d349d4bd24 radv: allow to print GPU info with RADV_DEBUG=info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 14:14:17 +02:00
Samuel Pitoiset
56d53ed1d6 radv: do not emit unnecessary ES output stores
GFX9:
Totals from affected shaders:
SGPRS: 472 -> 464 (-1.69 %)
VGPRS: 576 -> 584 (1.39 %)
Code Size: 45432 -> 44324 (-2.44 %) bytes
Max Waves: 40 -> 40 (0.00 %)

VI:
SGPRS: 720 -> 720 (0.00 %)
VGPRS: 728 -> 728 (0.00 %)
Code Size: 45348 -> 43992 (-2.99 %) bytes
Max Waves: 120 -> 120 (0.00 %)

This affects Rise of Tomb Raider and the three Vulkan demos
that use a geometry shader (geometryshader, deferredshadows
and viewportarray).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 14:14:13 +02:00
Samuel Pitoiset
a6e44d1271 radv: do not emit unnecessary GS output stores
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 14:14:11 +02:00
Samuel Pitoiset
507402ada6 radv: only pass the global BO list at submit time if enabled
That way the winsys might use a faster path when the global
BO list is NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 13:48:27 +02:00
Samuel Pitoiset
6211799aff radv: remove the radv_finishme() when compiling shaders
Having an entrypoint different than "main" doesn't mean we
have multiple shaders per module.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 13:48:24 +02:00
Samuel Pitoiset
1e86eaf7d8 radv: remove radv_device::llvm_supports_spill
It's always true.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-17 13:48:21 +02:00
Timothy Arceri
f71714022b mesa: add glUniform*ui{v} support to display lists
Fixes: a017c7ecb7 "mesa: display list support for uint uniforms"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78097
2018-05-17 13:07:48 +10:00
Dieter Nützel
7f1dc93357 radeonsi: create .gitignore
Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-05-16 21:48:17 -04:00
Dave Airlie
eba4cf797c ac/llvm: use amdgcn.tbuffer.store instead of SI.tbuffer.store intrinsic
Drop the use of the old intrinsic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-17 11:46:53 +10:00
Eric Anholt
b2e7c32703 v3d: Fix wiring filters to NEAREST for 32-bit texture returns.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104626
2018-05-16 21:19:07 +01:00
Eric Anholt
795488d2bf v3d: Enable the driver by default.
Now that we have a stabilized ABI and a fairly conformant driver, turn it
on.
2018-05-16 21:19:07 +01:00
Eric Anholt
01ae6a9181 v3d: Rename driver functions from vc5 to v3d.
This is the final step of the driver rename.
2018-05-16 21:19:07 +01:00
Eric Anholt
8c47ebbd23 v3d: Rename the driver files from "vc5" to "v3d". 2018-05-16 21:19:07 +01:00
Eric Anholt
c4c488a2ae v3d: Rename the vc5_dri.so driver to v3d_dri.so.
This allows the driver to load against the merged kernel DRM driver.  In
the process, rename most of the build system variables and gallium
plumbing functions.
2018-05-16 21:19:07 +01:00
Eric Anholt
8a793d42f1 v3d: Switch the vc5 driver to using the finalized V3D UABI.
In the process of merging to the kernel, I renamed the driver to the
general product line's name (since we have both vc5 and vc6 supported
already).  Since the ABI is finalized, move the header to include/drm-uapi.
2018-05-16 21:19:07 +01:00
Charmaine Lee
33a86acd78 svga: fix incompatible bind flags at buffer validation time
At buffer resource validation time, if the resource handle is not yet
created and if the initial buffer bind flags and the tobind flags are
incompatible, just use the tobind flags to create the resource handle.
On the other hand, if the bind flags are compatible, we can combine
the bind flags for the resource handle creation.

Fixes piglit gl-3.1-buffer-bindings crash.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-05-16 13:04:16 -06:00
jenny.q.cao
1261b34cd5 mesa: cast the GLenum16 to GLint to avoid compile warning on android
Cast the enum to GLint to avoid the compile warning:
/src/mesa/main/get.c:3005:19:
warning: comparison of constant -32768 with expression of type
'GLenum16' (aka 'unsigned short') is always false
-Wtautologicalia-constant-out-of-range-compare

Tests: compilation without this warning
Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-05-16 13:02:43 -06:00
Stuart Young
f806cc9eb6 etnaviv: Fix missing rnndb file in tarballs
Seems that when the rnndb files for etniviv were updated/included back
in Nov 2017, hw/texdesc_3d.xml.h was missed from Makefile.sources and
meson.build. This was all during the conversion to meson, so it apears
to have slipped through the cracks. As such, this file has been missing
from the official tarballs since inclusion in Mesa, so the git trees
and tarballs differ.

Found due to lintian errors in the Debian packages.

Fixes: f1e1c60ff6 ("etnaviv: Update from rnndb")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-05-16 19:36:10 +02:00
Matthias Groß
71892fbe19 gallium/hud: add frametime graph (v2)
Thanks for your comment. This version has an additional boolean in the
fps_info struct to distinguish between fps and frame time calculation.
The struct is initialised in the respecting install functions for this
purpose.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-05-15 19:30:12 -04:00
Jan Vesely
f3521ce2c4 eg/compute: Use reference counting to handle compute memory pool.
Use pipe_reference to release old RAT surfaces.
RAT surface adds a reference to pool bo, so use reference counting for pool->bo
as well.

v2: Use the same pattern for both defrag paths
    Drop confusing comment

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-05-15 19:01:47 -04:00
Roland Scheidegger
e01af38d6f gallivm: Use alloca_undef with array type instead of alloca_array
Use a single allocation of array type instead of the old-style array
allocation for the temp and immediate arrays.
Probably only makes a difference if they aren't used indirectly (so,
if we used them solely because there's too many temps or immediates).
In this case the sroa and early-cse passes can sometimes do some
optimizations which they otherwise cannot.
(As a side note, for the temp reg array, we actually really should
use one allocation per array id, not just one for everything.)
Note that the instcombine pass would actually promote such
allocations to single alloc of array type as well, but it's too late
for some artificial shaders we've seen to help (we don't want to run
instcombine at the beginning due to its cost, hence would need
another sroa/cse pass after instcombine). sroa/early-cse help there
because they can actually eliminate all of the huge shader, reducing
it to a single const output (don't ask...).
(Interestingly, instcombine also removes all the bitcasts we do on that
allocation for single-value gathering, and in the end directly indexes
into the single vector elements, which according to spec is only
semi-valid, but this happens regardless. Another thing instcombine also
does is use inbound GEPs, which is probably something we should do
manually as well - for indirectly indexed reg files llvm may not be
able to figure it out on its own, but we should be able to guarantee
all pointers are always inbound. In any case, by the looks of it
using single allocation with array type seems to be the right thing
to do even for ordinary shaders.)
No piglit change.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-05-16 00:04:48 +02:00
Dieter Nützel
bd0b6b9f17 radv: add generated files to .gitignore(s)
Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-15 22:53:55 +02:00
Samuel Pitoiset
6bde8c5608 spirv: fix visiting inner loops with same break/continue block
We should stop walking through the CFG when the inner loop's
break block ends up as the same block as the outer loop's
continue block because we are already going to visit it.

This fixes the following assertion which ends up by crashing
in RADV or ANV:

SPIR-V parsing FAILED:
In file ../src/compiler/spirv/vtn_cfg.c:381
block->node.link.next == NULL
0 bytes into the SPIR-V binary

This also fixes a crash with a camera shader from SteamVR.

v2: make use of vtn_get_branch_type() and add an assertion

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106090
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106504
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-15 21:38:19 +02:00
Rob Clark
d89f58a6b8 mesa/st: handle vert_attrib_mask in nir case too
Note, actually fixes 9987a072cb, but the problems don't show up until
19a91841c3.

Fixes: 19a91841c3 st/mesa: Use Array._DrawVAO in st_atom_array.c.
Fixes: 9987a072cb st/mesa: Make the input_to_index array available.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-05-15 15:15:33 -04:00
Marek Olšák
3e27b377f2 cso: check count == 0 in cso_set_vertex_buffers
The code didn't expect that, leading to crashes.

Fixes: 86d63b53a2 "gallium: remove aux_vertex_buffer_slot code"

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-05-15 12:36:27 -04:00
Rob Clark
dace607245 vc5: use util_copy_framebuffer_state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-15 08:48:13 -04:00
Rob Clark
dae4c98dd7 vc4: use util_copy_framebuffer_state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-15 08:47:35 -04:00
Rob Clark
f897b67dc1 freedreno/a5xx: remove fd5_shader_stateobj
Extra level of indirection that serves no purpose.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:46 -04:00
Rob Clark
d48a2404a2 freedreno/a4xx: remove fd4_shader_stateobj
Extra level of indirection that serves no purpose.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:46 -04:00
Rob Clark
2c40f2ba32 freedreno/a3xx: remove fd3_shader_stateobj
Extra level of indirection that serves no purpose.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:46 -04:00
Rob Clark
273f7d8404 freedreno: fence should hold a ref to pipe
Since the fence can outlive the context, and all it really needs to wait
on a fence is the pipe, use the new fd_pipe reference counting to hold a
ref to the pipe and drop the ctx pointer.

This fixes a crash seen with (for example) glmark2:

  #0  fd_pipe_wait_timeout (pipe=0xbf48678b3cd7b32b, timestamp=0, timeout=18446744073709551615) at freedreno_pipe.c:101
  #1  0x0000ffffbdf75914 in fd_fence_finish (pscreen=0x561110, ctx=0x0, fence=0xc55c10, timeout=18446744073709551615) at ../src/gallium/drivers/freedreno/freedreno_fence.c:96
  #2  0x0000ffffbde154e4 in dri_flush (cPriv=0xb1ff80, dPriv=0x556660, flags=3, reason=__DRI2_THROTTLE_SWAPBUFFER) at ../src/gallium/state_trackers/dri/dri_drawable.c:569
  #3  0x0000ffffbecd8b44 in loader_dri3_flush (draw=0x558a28, flags=3, throttle_reason=__DRI2_THROTTLE_SWAPBUFFER) at ../src/loader/loader_dri3_helper.c:656
  #4  0x0000ffffbecbc36c in glx_dri3_flush_drawable (draw=0x558a28, flags=3) at ../src/glx/dri3_glx.c:132
  #5  0x0000ffffbecd91e8 in loader_dri3_swap_buffers_msc (draw=0x558a28, target_msc=0, divisor=0, remainder=0, flush_flags=3, force_copy=false) at ../src/loader/loader_dri3_helper.c:827
  #6  0x0000ffffbecbcfc4 in dri3_swap_buffers (pdraw=0x5589f0, target_msc=0, divisor=0, remainder=0, flush=1) at ../src/glx/dri3_glx.c:587
  #7  0x0000ffffbec98218 in glXSwapBuffers (dpy=0x502bb0, drawable=2097154) at ../src/glx/glxcmds.c:840
  #8  0x000000000040994c in CanvasGeneric::update (this=0xfffffffff400) at ../src/canvas-generic.cpp:114
  #9  0x0000000000411594 in MainLoop::step (this=this@entry=0x5728f0) at ../src/main-loop.cpp:108
  #10 0x0000000000409498 in do_benchmark (canvas=...) at ../src/main.cpp:117
  #11 0x00000000004071b0 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.cpp:210

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:46 -04:00
Rob Clark
a8c0daa172 freedreno: batch cache doesn't hold a ref to batch
The cache doesn't hold a (strong) reference to the batch.  So we
shouldn't be trying to drop a reference, as that leads to:

   #0  0x0000ffffbecb37a0 in raise () from /lib64/libc.so.6
   #1  0x0000ffffbeca159c in abort () from /lib64/libc.so.6
   #2  0x0000ffffbecacf48 in __assert_fail_base () from /lib64/libc.so.6
   #3  0x0000ffffbecacfa8 in __assert_fail () from /lib64/libc.so.6
   #4  0x0000ffffbd28def0 in pipe_reference_described (ptr=0x4f47130, reference=0x0, get_desc=0xffffbd2e0f08 <__fd_batch_describe>) at ../src/gallium/auxiliary/util/u_inlines.h:88
   #5  0x0000ffffbd28e188 in fd_batch_reference_locked (ptr=0x4f40de0, batch=0x0) at ../src/gallium/drivers/freedreno/freedreno_batch.h:258
   #6  0x0000ffffbd28e9a8 in fd_bc_invalidate_resource (rsc=0x4f40ca0, destroy=true) at ../src/gallium/drivers/freedreno/freedreno_batch_cache.c:244
   #7  0x0000ffffbd293778 in fd_resource_destroy (pscreen=0xedc170, prsc=0x4f40ca0) at ../src/gallium/drivers/freedreno/freedreno_resource.c:644
   #8  0x0000ffffbd922674 in u_transfer_helper_resource_destroy (pscreen=0xedc170, prsc=0x4f40ca0) at ../src/gallium/auxiliary/util/u_transfer_helper.c:144
   #9  0x0000ffffbd29527c in pipe_resource_reference (ptr=0x4f455d8, tex=0x0) at ../src/gallium/auxiliary/util/u_inlines.h:144
   #10 0x0000ffffbd29548c in fd_surface_destroy (pctx=0x1012720, psurf=0x4f455d0) at ../src/gallium/drivers/freedreno/freedreno_surface.c:78
   #11 0x0000ffffbd1f9c48 in pipe_surface_reference (ptr=0x4f471d0, surf=0x0) at ../src/gallium/auxiliary/util/u_inlines.h:113
   #12 0x0000ffffbd1f9ef4 in util_copy_framebuffer_state (dst=0x4f471c8, src=0x0) at ../src/gallium/auxiliary/util/u_framebuffer.c:114
   #13 0x0000ffffbd2e0e30 in __fd_batch_destroy (batch=0x4f47130) at ../src/gallium/drivers/freedreno/freedreno_batch.c:225
   #14 0x0000ffffbd28e1b0 in fd_batch_reference_locked (ptr=0xfffffffff010, batch=0x0) at ../src/gallium/drivers/freedreno/freedreno_batch.h:262
   #15 0x0000ffffbd28e6b0 in fd_bc_invalidate_context (ctx=0x1012720) at ../src/gallium/drivers/freedreno/freedreno_batch_cache.c:190
   #16 0x0000ffffbd2e2b6c in fd_context_destroy (pctx=0x1012720) at ../src/gallium/drivers/freedreno/freedreno_context.c:139
   #17 0x0000ffffbd2c3280 in fd5_context_destroy (pctx=0x1012720) at ../src/gallium/drivers/freedreno/a5xx/fd5_context.c:56
   #18 0x0000ffffbd5b7a8c in st_destroy_context_priv (st=0xfd72f0, destroy_pipe=true) at ../src/mesa/state_tracker/st_context.c:281

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-05-15 08:46:26 -04:00
Eric Engestrom
37d44e2608 docs/meson: mark code/commands as <code>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:33:39 +01:00
Eric Engestrom
5829f616ec docs/meson: replace plaintext url with a link
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:33:36 +01:00
Eric Engestrom
67c550708a docs/meson: fix various html issues
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:33:34 +01:00
Eric Engestrom
dc2dc1fa30 docs/meson: fix various typos
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:33:28 +01:00
Eric Engestrom
6c5df78d8b meson: fix copyright symbol
Fixes: bd68f1013c "autotools, meson: add tileset.h"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:31:46 +01:00
Juan A. Suarez Romero
bd68f1013c autotools, meson: add tileset.h
Fixes: 4e52cb51b5 ("swr/rast: Thread locked tiles improvement")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-15 10:00:11 +02:00
Thomas Hellstrom
3d0b4979ee st/xa: Bump minor
Bump xa minor to signal that the underlying mesa version is suitable for dri3.

This is a bit ugly since it doesn't relate to a specific xa interface change.
Recently there has been a number of fixes in mesa that helps enabling dri3
without any significant regressions in automated testing and common desktop
usage latency. However, the xf86-video-vmware driver has no other way to tell
but inspecting the xa version.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-05-15 09:27:46 +02:00
Dave Airlie
9585e70206 virgl: enable vertex streams when glsl level is high enough.
This enabled the vertex streams out when the host supports
GL4.0.
2018-05-15 14:56:57 +10:00
Kai Wasserbäch
b691d9192c opencl: autotools: Fix linking order for OpenCL target
Otherwise the build fails with an undefined reference to
clang::FrontendTimesIsEnabled.

Bugzilla: https://bugs.freedesktop.org/106209
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Acked-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: Aaron Watry <awatry@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-05-14 22:45:01 -04:00
Samuel Pitoiset
97b179570c radv: reduce the number of parameters export by the GS copy shader
By using the geometry shader output usage mask.

This improves all Vulkan demos that use a geometry shader
(ie. geometryshader, deferredshadows, viewportarray).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-14 21:38:23 +02:00
Samuel Pitoiset
560bd9eb67 radv: scan the geometry shader output usage mask
For reducing the number of parameters that are exported by
the GS copy shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-14 21:38:21 +02:00
Samuel Pitoiset
ea43d935ab radv: run the shader info pass before emitting the GS copy shader
For further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-14 21:38:19 +02:00
Samuel Pitoiset
7cbc6f2621 radv: check that layout isn't NULL in radv_nir_shader_info_pass()
An upcoming patch will run the shader info pass on the
geometry shader just before emitting the GS copy shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-14 21:38:17 +02:00
Jason Ekstrand
18f8200a99 intel/blorp: Use linear formats for CCS_E clear colors in copies
It's clear that the original code meant to do this and there is even a
10-line comment explaining why.  Originally, we had a simple function
for packing the clear colors which was unaware of sRGB.  However, in
a6b66a7b26, when we started using ISL to do the packing, the wrong
format was used.

Fixes: a6b66a7b26 "intel/blorp: Use ISL instead of bitcast_color..."
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-14 10:41:26 -07:00
Bas Nieuwenhuizen
f944a59996 radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.
The hardware always interprets the alpha as unsigned and fixing it
in the shader is going to add unacceptable overheads.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 18:58:30 +02:00
Bas Nieuwenhuizen
3d4d388e39 radv: Fix up 2_10_10_10 alpha sign.
Pre-Vega HW always interprets the alpha for this format as unsigned,
so we have to implement a fixup to do the sign correctly for signed
formats.

v2: Improve indexing mess.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 18:58:20 +02:00
Bas Nieuwenhuizen
e361970ed7 radv: Add support for IMG_DATA_FORMAT_32_32_32.
Basic sampling support for linear tiling.

No CTS regressions, but it seems the blitting coverage is not very
extensive.

https://bugs.freedesktop.org/show_bug.cgi?id=106331
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 18:58:12 +02:00
Bas Nieuwenhuizen
dd102405de radv: Translate logic ops.
radeonsi could pass them through but the enum changed between
Gallium and Vulkan, so we have to translate.

In progress I made the register defines a bit more readable.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100430
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 16:49:06 +02:00
Bas Nieuwenhuizen
62f50df7b7 radv: Fix multiview queries.
This moves the extra queries to after the main query ended, instead
of doing it after the begin and hence doing nesting.

We also emit only (view count - 1) extra queries, as the main query
is already there for the first view.

This fixes the CTS occasionally getting stuck in
dEQP-VK.multiview.queries* waiting on results.

Fixes: 32b4f3c38d "radv/query: handle multiview queries properly. (v3)"
CC: 18.1 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-14 16:49:06 +02:00
Eric Engestrom
f0cdc39b13 meson: remove dependency antipattern
`dep_valgrind != []` now (0.45) produces a warning that is quite explicit:
  WARNING: Trying to compare values of different types (DependencyHolder, list) using !=.
  The result of this is undefined and will become a hard error in a future Meson release.

`dep_valgrind = []` used to be the recommended way to deal with
non-existant dependency, but these don't work with `.found()`, so now
the recommended way is to declare a impossible dependency, which
null_dep does for us in Mesa.

In short, we don't need and shouldn't check for `!= []` anywhere anymore.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-05-14 14:55:36 +01:00
Samuel Pitoiset
ece398277c radv: remove useless check in radv_create_shaders()
radv_can_dump_shader() already handles if module is NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-14 12:38:01 +02:00
Samuel Pitoiset
8ade3e4684 radv: allow to dump the GS copy shader with RADV_DEBUG="shaders"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-14 12:38:00 +02:00
Samuel Pitoiset
553418af1e radv: move {load,store}_var intrinsics scanning in different functions
These are going to be crazy and we are probably going to add
more scan stuff in the future. Also use switch cases instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-14 12:37:58 +02:00
jenny.q.cao
ff7521c9ba android: change include "cutils/log.h" to "log/log.h" on Android API >=26
There is a compile warning from Android 8 (API version 26) from "include cutils/log.h"
warning: "Deprecated: don't include cutils/log.h, use either android/log.h or log/log.h"-W#warnings,
Change to include "log/log.h" on Android 8 or later major version to avoid this warning

Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-14 08:08:31 +03:00
Roland Scheidegger
cf3fb42fb5 llvmpipe: Fix random number generation for unit tests
We were never producing negative numbers for signed types.
Also fix only producing half the valid range for uint32, and
properly clamp signed values.

Because this now also properly tests snorm with actually negative
values, need to increase eps for such conversions. I believe these
cannot actually be hit in ordinary operation (e.g. if a snorm texture
is sampled and output to snorm RT, it will still go through snorm->float
and float->snorm conversion), so don't bother to do anything to fix
the bad accuracy (might be quite complex).
Basically, the issue is for something like snorm16->snorm8 that in the
end this will just use a 8 bit arithmetic right shift.
But the math behind it says we should actually do a division by 32767 / 127, which
is ~258, not 256. So the result can be one bit off (values have too large
magnitude), and furthermore, the shift has incorrect rounding (always rounds
down). For positive numbers, these errors have different direction, but
for negative ones they have the same, hence for some values the error will
be 2 bit in the end.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=106232
2018-05-14 03:14:00 +02:00
Dave Airlie
5978d54a09 radv: use compute path for multi-layer images.
I don't think the hw resolve path can't handle multi-layer images.

This fixes all the:
dEQP-VK.renderpass.multisample_resolve.layers_*
tests on my VI card.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
2018-05-14 08:57:54 +10:00
Dave Airlie
98dbaa445a radv: resolve all layers in compute resolve path.
This path should iterate across all layers, I've some ideas
for doing this in a single pass, but this is simpler for now.

This passes the tests because we don't use the fragment path
unless we have DCC, and we don't have DCC on layered images.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
2018-05-14 08:57:27 +10:00
Dave Airlie
b16fc6cda1 radv/resolve: do fmask decompress on all layers.
For a multi-layer subpass resolve we want to make sure we flush all
the layers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
2018-05-14 08:56:47 +10:00
Rhys Perry
8f6cbb8c7d nvc0: fix setting of subpixel precision during conservative rasterization
Fixes: 07dac3e040 ("nvc0: add conservative rasterization support")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-13 13:21:41 -04:00
Rhys Perry
c879011c72 anv,nir: add generated files to .gitignore(s)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-12 20:14:49 -07:00
Marek Olšák
86d63b53a2 gallium: remove aux_vertex_buffer_slot code
The slot index is always 0, and is pretty unlikely to change in the future.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-05-12 21:08:09 -04:00
Timothy Arceri
ce188813bf radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT
When VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT is set we skip NIR
linking optimisations and only run over the NIR optimisation loop
once similar to the GLSLOptimizeConservatively constant used by
some GL drivers.

We need to run over the opts at least once to avoid errors in LLVM
(e.g. dead vars it can't handle) and also to reduce the time spent
compiling the IR in LLVM.

With this change the Blacksmith Unity demos compilation times
go from 329760 ms -> 299881 ms when using Wine and DXVK.

V2: add bit to radv_pipeline_key

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106246
2018-05-13 09:58:33 +10:00
Vinson Lee
26ddc4f9e1 scons: Add PROGRAM_NIR_FILES.
Fix SCons build error.

  Linking build/linux-x86_64-debug/gallium/targets/libgl-xlib/libGL.so.1.5 ...
build/linux-x86_64-debug/mesa/libmesa.a(st_program.os): In function `st_translate_prog_to_nir':
src/mesa/state_tracker/st_program.c:392: undefined reference to `prog_to_nir'

Fixes: 5c33e8c772 ("st/nir: use NIR for asm programs")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2018-05-12 00:50:05 -07:00
Timothy Arceri
5c33e8c772 st/nir: use NIR for asm programs
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-12 14:48:21 +10:00
Timothy Arceri
0b3e9564bd st/nir: make st_nir_opts() available externally
The following patch will make use of this for asm style programs.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-12 14:48:21 +10:00
Boyuan Zhang
0907d3ab9c radeon/vce: add firmware support for ver 53 and up
All vce firmwares with major version greater than or equal to 53 are supported

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-05-11 14:59:00 -04:00
Rob Clark
a7c81a7f67 etnaviv: remove pipe_fence_handle::ctx
A fence can outlive the ctx it was created from (see glmark2).. etnaviv
doesn't actually need fence->ctx so lets remove it before someone makes
the mistake of assuming it is a valid pointer.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-05-11 18:42:13 +02:00
George Kyriazis
4e52cb51b5 swr/rast: Thread locked tiles improvement
- Change tilemgr TILE_ID encoding to use Morton-order (Z-order).
- Change locked tiles set to bitset.  Makes clear, set, get much faster.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:26:35 -05:00
George Kyriazis
8238c791dc swr/rast: Add Builder::GetVectorType()
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:25:47 -05:00
George Kyriazis
8cb55dae2e swr/rast: Prepend the console output with a newline
It can get jumbled with output from other threads.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:25:24 -05:00
George Kyriazis
db25fcfcde swr/rast: Add ConcatLists()
for concatenating lists

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:22:57 -05:00
George Kyriazis
dcaca3c7b3 swr/rast: Add constant initializer for uint64_t
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:22:17 -05:00
George Kyriazis
70f0a28b83 swr/rast: Use binner topology to assemble backend attributes
Previously was using the draw topology, which may change if GS or Tess
are active. Only affected attributes marked with constant interpolation,
which limited the impact.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:21:52 -05:00
George Kyriazis
b3b0f0e0ec swr/rast: Change formatting
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-05-11 11:21:22 -05:00
Ville Syrjälä
659910eda0 meson: Fix build for egl platform_x11 with dri3
platform_x11 with dri3 needs inc_loader.

In file included from ../src/egl/drivers/dri2/platform_x11_dri3.c:35:0:
../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory
In file included from ../src/egl/drivers/dri2/platform_x11.c:46:0:
../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory
In file included from ../src/egl/drivers/dri2/egl_dri2.c:61:0:
../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory

Cc: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2018-05-11 17:41:57 +03:00
Samuel Pitoiset
efc10949cc radv: move ac_build_if_state on top of radv_nir_to_llvm.c
These helpers will be needed for future work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-11 12:35:07 +02:00
Samuel Pitoiset
3a410f0afc radv: minor cleanups in radv_fill_shader_variant()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-11 12:35:05 +02:00
Jan Vesely
58272c1ad7 winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.
Fixes memory leak on module unload.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-10 23:23:50 -04:00
Marek Olšák
a2e9d9b4c1 ac/gpu_info: add has_read_registers_query
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:11 -04:00
Marek Olšák
9b1fdfc541 ac/gpu_info: add has_2d_tiling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:10 -04:00
Marek Olšák
d26696283d ac/gpu_info: add has_sparse_vm_mappings
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:08 -04:00
Marek Olšák
125adc92ad ac/gpu_info: add has_unaligned_shader_loads
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:07 -04:00
Marek Olšák
8b9694da4b radeonsi: expose ARB_query_buffer_object on ancient kernels too
It doesn't use indirect dispatches.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:04 -04:00
Marek Olšák
e9c08bc658 ac/gpu_info: add has_indirect_compute_dispatch
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:03 -04:00
Marek Olšák
64265ac8d5 ac/gpu_info: add kernel_flushes_tc_l2_after_ib
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:01 -04:00
Marek Olšák
14c5a93bfa ac/gpu_info: add has_format_bc1_through_bc7
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:40:00 -04:00
Marek Olšák
2bd2c173e8 ac/gpu_info: add has_eqaa_surface_allocator
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:58 -04:00
Marek Olšák
e720cb6135 radeonsi: clean up the reset status query implementation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:57 -04:00
Marek Olšák
3060f62340 ac/gpu_info: add has_bo_metadata
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:56 -04:00
Marek Olšák
09f1bab483 ac/gpu_info: add si_TA_CS_BC_BASE_ADDR_allowed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:54 -04:00
Marek Olšák
8b58a14ef7 ac/gpu_info: add htile_cmask_support_1d_tiling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:53 -04:00
Marek Olšák
b81149e258 ac/gpu_info: add kernel_flushes_hdp_before_ib
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:39:47 -04:00
Marek Olšák
a969f184cf radeonsi: add an environment variable that forces EQAA for MSAA allocations
This is for testing and experiments.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:37 -04:00
Marek Olšák
2309cedf44 radeonsi: set up EQAA image descriptors properly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:36 -04:00
Marek Olšák
7ac4ef097d radeonsi: add EQAA SC,DB,CB register programming
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:34 -04:00
Marek Olšák
9d00580e75 radeonsi: support creating EQAA color textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:32 -04:00
Marek Olšák
912b0163dc ac/surface: add EQAA support
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:34:31 -04:00
Marek Olšák
ee31762ef5 radeonsi: use better sample locations for 8x EQAA
Verified with the piglit MSAA accuracy test.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:32:57 -04:00
Marek Olšák
4b6df225f7 radeonsi: improve quality of 16 sample locations
This results in better 16x and 8x quality when using these locations.
Verified with the piglit MSAA accuracy test.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:29:02 -04:00
Marek Olšák
01fd543c82 radeonsi: use better sample locations for 4x MSAA
Discovered by luck. Verified with the piglit MSAA accuracy test.
It also shows that the worst case EQAA 16s4f results in very good 4x MSAA
in the worst case.

Nine might not like these positions, but they are prettier to the eye and
GL doesn't care.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:28:12 -04:00
Marek Olšák
8d8b71ccfa radeonsi: reorder sample locations as required by EQAA
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:27:46 -04:00
Marek Olšák
5769a5ec01 radeonsi: simplify si_get_sample_position
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
9f456b3a3c radeonsi: simplify arrays of sample locations
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
3d70b5beae radeonsi: set DB_EQAA the same as Vulkan
These never change, but they only affect EQAA, which isn't implemented.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
b5ed039325 radeonsi: remove CM_ prefixes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
656fd607be radeonsi: don't update clear color registers if they don't change
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
835095973d radeonsi: remove r600_fmask_info
radeon_surf contains almost everything.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
bdc3e410f7 ac/surface: unify common legacy and gfx9 fmask fields
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
9bf3570fed ac/surface/gfx6: compute FMASK together with the color surface
instead of invoking FMASK computation separately.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák
276acda835 ac/surface/gfx9: fix a typo in CMASK RB/pipe alignment
No change in behavior because it's always aligned.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
6841845b00 ac: set correct LLVM processor names for Raven & Vega12
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
6f7f10d285 ac: sort raster configs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
e7b82a9978 ac: remove 1 RB raster config for Iceland
Iceland always reports 2 RBs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
cb0f5cddcc ac: move the Fiji kernel workaround for raster config out of the switch
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
ce954ac6f3 ac: enable both RBs on Kaveri
This can result in 2x increase in performance on non-harvested Kaveris.

v2: don't do it on radeon

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Marek Olšák
597b9e8810 radeonsi/gfx9: work around a GPU hang due to broken indirect indexing in LLVM
Fixes: 6d19120da8 "radeonsi/gfx9: workaround for INTERP with indirect indexing"
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:32 -04:00
Jason Ekstrand
b784561c1a intel/isl/storage: Don't lower most UNORM formats on gen11+
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-05-10 14:13:24 -07:00
Jason Ekstrand
399962e7c6 intel/isl: Several UNORM formats support typed writes on gen11+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-05-10 14:12:55 -07:00
Brian Paul
e4211b36bb mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT
Since size can be 3, 4 or GL_BGRA we need to keep these glGet types
as TYPE_INT, not TYPE_UBYTE.

Fixes: d07466fe18 ("mesa: fix glGetInteger/Float/etc queries for
vertex arrays attribs")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106462
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-05-10 09:49:40 -06:00
Andres Rodriguez
34e9e4023f radv: disable DCC for shareable images on GFX9+
This seems to be broken at the moment for opengl interop.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-10 11:27:12 -04:00
Thomas Petazzoni
54bbe600ec configure.ac: rework -latomic check
The configure.ac logic added in commit
2ef7f23820 ("configure: check if
-latomic is needed for __atomic_*") makes the assumption that if a
64-bit atomic intrinsic test program fails to link without -latomic,
it is because we must use -latomic.

Unfortunately, this is not completely correct: libatomic only appeared
in gcc 4.8, and therefore gcc versions before that will not have
libatomic, and therefore don't provide atomic intrinsics for all
architectures. This issue was for example encountered on PowerPC with
a gcc 4.7 toolchain, where the build fails with:

powerpc-ctng_e500v2-linux-gnuspe/bin/ld: cannot find -latomic

This commit aims at fixing that, by not assuming -latomic is
available. The commit re-organizes the atomic intrinsics detection as
follows:

 (1) Test if a program using 64-bit atomic intrinsics links properly,
     without -latomic. If this is the case, we have atomic intrinsics,
     and we're good to go.

 (2) If (1) has failed, then test to link the same program, but this
     time with -latomic in LDFLAGS. If this is the case, then we have
     atomic intrinsics, provided we link with -latomic.

This has been tested in three situations:

 - On x86-64, where atomic instrinsics are all built-in, with no need
   for libatomic. In this case, config.log contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#'
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE=''
   LIBATOMIC_LIBS=''

   This means: atomic intrinsics are available, and we don't need to
   link with libatomic.

 - On NIOS2, where atomic intrinsics are available, but some of them
   (64-bit ones) require using libatomic. In this case, config.log
   contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#'
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE=''
   LIBATOMIC_LIBS='-latomic'

   This means: atomic intrinsics are available, and we need to link
   with libatomic.

 - On PowerPC with an old gcc 4.7 toolchain, where 32-bit atomic
   instrinsics are available, but not 64-bit atomic instrinsics, and
   there is no libatomic. In this case, config.log contains:

   GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE=''
   GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='#'

   With means that atomic intrinsics are not usable.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
2018-05-10 08:13:57 -07:00
Brian Paul
d07466fe18 mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs
The vertex array Size and Stride attributes are now ubyte and short,
respectively.  The glGet code needed to be updated to handle those
types, but wasn't.

Fixes the new piglit test gl-1.5-get-array-attribs test.

v2: fix inadvertant whitespace change, change COLOR_ARRAY_SIZE to UBYTE,
misc fixes suggested by Justin

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106450
Fixes: d5f42f96e1 ("mesa: shrink size of gl_array_attributes (v2)")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-05-10 08:08:11 -06:00
Jan Vesely
45dfa6f4e7 winsys/radeon: Destroy fd_hash table when the last winsys is removed.
Fixes memory leak on module unload.
v2: Use util_hash_table helper function

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-05-10 05:12:48 -04:00
Jan Vesely
d146768d13 gallium/auxiliary: Add helper function to count the number of entries in hash table
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2018-05-10 05:12:43 -04:00
Samuel Pitoiset
0defc55547 radv: move handling nosisched option in a better place
It's a per-application optimization, so it makes more sense
to do that in radv_handle_per_app_options().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-10 10:57:41 +02:00
Grazvydas Ignotas
4fdce205dd radv: assorted typo fixes
Trivial.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-10 11:50:46 +03:00
Mathias Fröhlich
f660683027 mesa/vbo/tnl: Move gl_vertex_array related stuff to tnl.
The only remaining users of gl_vertex_array are tnl based
drivers. So move everything related to that into tnl and
rename it accordingly.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
881d2fcafa mesa: Remove Array._DrawArrays.
Only tnl based drivers still use this array. So remove it
from core mesa and use Array._DrawVAO instead.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
899476b6b1 i965: Remove the now unused gl_vertex_array.
Was meant to be temporary in i965.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
0fabd55306 i965: Remove the gl_vertex_array indirection.
For now store binding and attrib in brw_vertex_element.
The i965 driver still provides lots of opportunity to make use
of the unique binding information in the VAO which is currently not
taken from the VAO.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
172c9a908f i965: Implement all_varyings_in_vbos in terms of Array._DrawVAO.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
79eb6ab7b6 st/mesa: Remove the now unused gl_vertex_array.
Was meant to be temporary in gallium.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
4c77f0d065 st/mesa: Make feedback draw and rasterpos use _DrawVAO.
Instead of playing with Array._DrawArrays, make the feedback draw
path use Array._DrawVAO. Also st_RasterPos needs to use the VAO then.

v2: Use helper methods to get the offset values for array and binding.
    Update comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:16 +02:00
Mathias Fröhlich
19a91841c3 st/mesa: Use Array._DrawVAO in st_atom_array.c.
Finally make use of the binding information in the VAO when
setting up arrays for draw.

v2: Emit less relocations also for interleaved userspace arrays.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:15 +02:00
Mathias Fröhlich
9987a072cb st/mesa: Make the input_to_index array available.
The input_to_index array is already available internally
when preparing vertex programs. Store the map in
struct st_vertex_program.
Also store the bitmask of mesa vertex processing inputs in
struct st_vp_variant.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:15 +02:00
Mathias Fröhlich
f24bf45210 st/mesa: Use _DrawVAO for edgeflag enabled check.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:15 +02:00
Mathias Fröhlich
d1698d4311 mesa: Compute effective buffer bindings in the vao.
Compute VAO buffer binding information past the position/generic0 mapping.
Scan for duplicate buffer bindings and collapse them into derived
effective buffer binding index and effective attribute mask variables.
Provide a set of helper functions to access the distilled
information in the VAO. All of them prefixed with _mesa_draw_...
to indicate that they are meant to query draw information.

v2: Also group user space arrays containing interleaved arrays.
    Add _Eff*Offset to be copied on attribute and binding copy.
    Update comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-05-10 07:06:15 +02:00
Gert Wollny
fb4011ace9 virgl: Add support for passing GL_ANY_SAMPLES_PASSED_CONSERVATIVE
This is needed for fixing CTS:
   dEQP-GLES3.functional.occlusion_query.conservative*

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2018-05-10 12:26:57 +10:00
Dave Airlie
ce027ac5c7 r600: fix constant buffer bounds.
If you have an indirect access to a constant buffer on r600/eg
use a vertex fetch in the shader. However apps have expected
behaviour on those out of bounds accessess (even if illegal).

If the constants were being uploaded as part of a larger
upload buffer, we'd set the range of allowed access to a lot
larger than required so apps would get values back from
other parts of the upload buffer instead of the expected out
of bounds access.

This fixes rendering bugs in Trine and Witcher 1, thanks
to iive for nagging me effectively until I figured it out :-)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91808
Cc: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-05-10 02:14:32 +01:00
Jason Ekstrand
a8a740f272 i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL
From the bspec docs for "Indirect State Pointers Disable":

    "At the completion of the post-sync operation associated with this
    pipe control packet, the indirect state pointers in the hardware are
    considered invalid"

So the ISP disable is a post-sync type of operation which means that it
should be combined with a CS stall.  Without this, the simulator throws
an error.

Fixes: 766d801ca "anv: emit pixel scoreboard stall before ISP disable"
Fixes: f536097f6 "i965: require pixel scoreboard stall prior to ISP disable"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-09 18:03:28 -07:00
Dave Airlie
56766b8515 radv: handle arrays in the fmask descriptor.
This fixes the fmask descriptor generation to handle 2d ms arrays.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-10 10:42:49 +10:00
Matt Turner
0f959215c3 gallium/tests: Fix assignment of EXTRA_DIST
Fixes: 6754c2e83d ("autotools: Include new meson files")
2018-05-09 16:38:47 -07:00
Matt Turner
0097940223 configure.ac: Check for grep with AC_PROG_GREP
Perhaps with a new version of autoconf, I began seeing:

| checking the name lister (/usr/bin/nm -B) interface... ./configure: line 6973: External.*some_variable: command not found
| BSD nm

This is because AC_PROG_NM expands to

	...
	if $GREP 'External.*some_variable' conftest.out > /dev/null; then
	    lt_cv_nm_interface="MS dumpbin"
	fi
	...

I'm not sure if it's a bug in AC_PROG_NM that it doesn't call
AC_PROG_GREP, but it's easy enough for us to do it.
2018-05-09 16:38:47 -07:00
Xiong, James
0ab266dc1b main: fail texture_storage() call if the size is not okay
Signed-off-by: Xiong, James <james.xiong@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 09:34:31 +10:00
Xiong, James
08c1444c95 main: return 0 length when the queried program object's not linked
Signed-off-by: Xiong, James <james.xiong@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-05-10 09:34:19 +10:00
Kenneth Graunke
a83face48a i965: Shut up unused variable warnings.
These are only used in assertions.
2018-05-09 16:20:50 -07:00
Ross Burton
1755654d9f src/intel/Makefile.vulkan.am: add missing MKDIR_GEN
Out of tree builds can try to write into a directory that doesn't exist yet:

| Traceback (most recent call last):
|   File "../../../mesa-18.0.2/src/intel/vulkan/anv_icd.py", line 46, in <module>
|     with open(args.out, 'w') as f:
| IOError: [Errno 2] No such file or directory: 'vulkan/intel_icd.x86_64.json'
| Makefile:4882: recipe for target 'vulkan/intel_icd.x86_64.json' failed

Add missing MKDIR_GEN calls to solve this.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-09 16:08:52 -07:00
Rhys Perry
5ac16ed047 mesa: fix error handling in get_framebuffer_parameteriv
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-09 14:32:40 -07:00
Lionel Landwerlin
766d801ca3 anv: emit pixel scoreboard stall before ISP disable
We want to make sure that all indirect state data has been loaded into
the EUs before disable the pointers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: 78c125af39 ("anv/gen10: Ignore push constant packets during context restore.")
2018-05-09 20:11:57 +01:00
Lionel Landwerlin
f536097f67 i965: require pixel scoreboard stall prior to ISP disable
Invalidating the indirect state pointers might affect a previously
scheduled & still running 3DPRIMITIVE (causing page fault). So stall
on pixel scoreboard before that.

v2: Fix compile issue :(

v3: Stall on pixel scoreboard

v4: Drop the post sync operation (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: ca19ee33d7 ("i965/gen10: Ignore push constant packets during context restore.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106243
2018-05-09 20:11:51 +01:00
Jason Ekstrand
561348caa1 intel/isl: Allow CCS_E on 1010102 formats
On CNL and above, CCS_E supports 1010102 formats and R11G11B10F.  We had
shut them off during early enabling because blorp_copy couldn't handle
them.  Now it can handle 1010102 formats so we can turn them back on.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
ccb44b8a94 intel/blorp: Allow CCS copies of 1010102 formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
1978de66f7 intel/blorp: Add support for more format bitcasting
nir_format_bitcast_uint_vec_unmasked can only be used to cast between
formats with uniform channel sizes.  In particular, it cannot handle
10_10_10_2 formats.  By making use of the NIR helper for uint vector
casts, we should now be able to bitcast between any two uint formats so
long as their channels are in RGBA order (possibly with channels
missing).  In order to do this we need to rework the key a bit to pass
the actual formats instead of just the number of bits in each.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
7998fe268e intel/blorp: Use nir_format_bitcast_uint_vec_unmasked
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
047e68389f nir/format_convert: Add code for bitcasting vectors
This is a fairly direct port from blorp.  The only real change is that
the nir_format_convert version doesn't assume that everything is a vec4.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
a6b66a7b26 intel/blorp: Use ISL instead of bitcast_color_value_to_uint
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
09ced65420 intel/isl: Add format conversion code
This adds helpers to ISL to convert an isl_color_value to and from
binary data encoded with a given isl_format.  The conversion is done
using ISL's built-in format introspection so it's fairly slow as format
conversions go but it should be fine for a single pixel value.  In
particular, we can use this to convert clear colors.

As a side-effect, we now rely on the sRGB helpers in libmesautil so we
need to tweak the build system a bit.  All prior uses of src/util in ISL
were header-only.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
8152c60e01 intel/isl/format: Get rid of the ALPHA colorspace
Alpha-only formats are just linear.  There's no need to specially
deliminate them as being in their own colorspace.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
8ab73790ef intel/isl/format: Add field locations informations to channel_layout
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
96598fbc02 intel/isl/format: Add a column for channel order to the table
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
d08d6a3da8 i965/blorp: Remove a pile of blorp_blit restrictions
Previously, blorp could only blit into something that was renderable.
Thanks to recent additions to blorp, it can now blit into basically
anything so long as it isn't compressed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
465d8566cd i965/blorp: Allow blorp blits for 16x MSAA
BLORP has supported 16x MSAA for quite a while now, we just never
bothered to enable it for CopyTexSubImage.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
09eede9c9d anv: Allow blitting to/from any supported format
Now that blorp handles all the cases, why not?  The only real change we
have to make is to stop using anv_swizzle_for_render() in blorp_blit
because it doesn't work for B4G4R4A4 and blorp now natively handles that.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
8ce31c9cc5 intel/blorp: Support the RGB workaround on more formats
Previously we only supported UINT formats because that's what blorp_copy
required.  If we want to use it in blorp_blit, however, we need to
support everything.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
4e26e3dea9 intel/blorp: Silently convert RGBX destination formats to RGBA
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
08cd834996 intel/isl: Add some helpers for working with RGBX formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
804856fa57 intel/blorp: Handle more exotic destination formats
This commit adds support for the following formats as destination
formats even though the hardware does not support rendering to them:

 - ISL_FORMAT_R24_UNORM_X8_TYPELESS
 - ISL_FORMAT_A4B4G4R4_UNORM
 - ISL_FORMAT_L8_UNORM_SRGB
 - ISL_FORMAT_R9G9B9E5_SHAREDEXP

This is done by using a different format and emitting shader code to
fake it the rest of the way.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
9e492bb92e intel/blorp: Include nir_format_convert.h in blorp_blit.c
nir_mask_shift_or is now defined in nir_format_convert.h so we can
delete the copy in blorp_blit.c.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
9981709d8f nir/format_convert: Add a function to pack RGB9_E5 formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
4e337b42f9 nir/format_convert: Add pack/unpack for R11F_G11F_B10F
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
98156b0019 nir/format_convert: Add linear <-> sRGB helpers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
2fdd966e3d nir: Add the start of a format conversion helper header
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
906c32ce87 intel/blorp: Add swizzle support for all hardware
This commit makes blorp capable of swizzling anything even on hardware
that doesn't support texture swizzle.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
1ef4f5aff1 intel/isl: Add a helper for inverting swizzles
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
242f6f7492 intel/isl: Add a helper for composing swizzles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
dad67cc245 intel/isl: Add an isl_swizzle_supports_rendering helper
This helper encodes more details, specifically about Haswell, than the
previous asserts in isl_surface_state.c.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
23d703de1f i965/surface_state: Use an identity swizzle pre-Haswell
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-09 11:16:33 -07:00
Jason Ekstrand
293b8de161 blorp: Handle the RGB workaround more like other workarounds
The previous version was sort-of strapped on in that it just adjusted
the blit rectangle and trusted in the fact that we would use texelFetch
and round to the nearest integer to ensure that the component positions
matched.  This new version, while slightly more complicated, is more
accurate because all three components end up with exactly the same
dst_pos and so they will get interpolated and sampled at the same
texture coordinate.  This makes the workaround suitable for using with
scaled blits.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-05-09 11:16:33 -07:00
Lionel Landwerlin
3853f1c6f4 i965: silence unused variable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2dc29e095f ("i965: Don't leak blorp on Gen4-5.")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-05-09 18:12:10 +01:00
Lionel Landwerlin
11d36c373a intel: devinfo: silence coverity warning
It's just not possible to have a device with no subslices.

CID: 1433511
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-05-09 15:21:01 +01:00
Michel Dänzer
6f81e07ecb dri3: Only update number of back buffers in loader_dri3_get_buffers
And only free no longer needed back buffers there as well.

We want to stick to the same back buffer throughout a frame, otherwise
we can run into various issues.

Bugzilla: https://bugs.freedesktop.org/105906
Bugzilla: https://bugs.freedesktop.org/106399
Fixes: 3160cb86aa "egl/x11: Re-allocate buffers if format is suboptimal"
Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-05-09 15:40:41 +02:00
Samuel Iglesias Gonsálvez
2cf64fdb46 anv: ignore pColorBlendState if all color attachments of the subpass are unused
According to Vulkan spec:

  "pColorBlendState is a pointer to an instance of the
   VkPipelineColorBlendStateCreateInfo structure, and is ignored if the
   pipeline has rasterization disabled or if the subpass of the render pass the
   pipeline is created against does not use any color attachments."

Fixes tests from CL#2505:

   dEQP-VK.renderpass.*.simple.color_unused_omit_blend_state

v2:
- Check that blend is not NULL before usage.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-09 07:01:10 +02:00
Timothy Arceri
e7a7b712fe mesa: remove hard-coded OpenGL 3.2 compat limit
Just let validate_context_version() do it instead. This fixes
MESA_GL_VERSION_OVERRIDE for compat, it will also allow us to
enable new compat versions on a per driver bases in future.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-09 14:24:43 +10:00
Timothy Arceri
4560aad780 mesa: add GLSLVersionCompat constant
This allows drivers to define what version of GLSL they support
in compat. This will be needed in order to support compat 3.2
without breaking drivers that wont support it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-09 14:24:36 +10:00
Timothy Arceri
be3ee9d141 mesa: dont call _mesa_override_glsl_version() in _mesa_init_constants()
All drivers that support GLSL will later set their default GLSL versions
overriding this override call. They currently all call
 _mesa_override_glsl_version() again later in order to support overrides.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-09 14:24:29 +10:00
Timothy Arceri
2a621acc8d mesa: dont set GLSLVersion in _mesa_init_constants()
Just leave it as 0 and let the drivers set it (as they already do)
to avoid redundantly initialising it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-09 14:24:22 +10:00
Jan Vesely
0783399d79 pipe-loader: Free driver_name in error path
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-08 21:35:07 -04:00
Brian Paul
901db25d5b glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug
Change the size of the bitset from 128 bits to 96.  This works around an
apparent GCC 5.4 bug in which bad SSE code is generated, leading to a
crash in ast_type_qualifier::validate_in_qualifier() (ast_type.cpp:654).

This can be repro'd with the Piglit test tests/spec/glsl-1.50/execution/
varying-struct-basic-gs-fs.shader_test

Bugzilla:https://bugs.freedesktop.org/show_bug.cgi?id=105497
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-08 19:06:09 -06:00
Kenneth Graunke
20f06bc72b i965: Dump validation list on INTEL_DEBUG=bat,submit.
This is really useful when debugging any sort of buffer management
issues, so just printing it during INTEL_DEBUG=bat,submit seems
reasonable.  With bat, we're already spamming so much output that
it doesn't really hurt.  With submit, it's still easy to grep for
the older information, and the new information is nice too.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-08 10:08:16 -07:00
Jason Ekstrand
06d3841882 i965/miptree: Remove redundant fields from intel_miptree_aux_buffer
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:46 -07:00
Jason Ekstrand
4f4779b367 i965: Simplify brw_emit_depthbuffer and brw_emit_depth_stencil_hiz
Now that we're using ISL, a good chunk of brw_emit_depthstencil is
pointless checks which ISL will do for us anyway.  Since we only have
one manual depth buffer emit function, move the useful bits into it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:45 -07:00
Jason Ekstrand
96f01501d7 i965: Move brw_emit_depth_stencil_hiz higher up in the file
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:45 -07:00
Jason Ekstrand
bdbb527a65 i965: Use ISL for emitting depth/stencil/hiz state on gen6+
We leave gen4-5 alone because the ISL code hasn't really been well-
tested on gen4-5 or with combined depth-stencil because we don't use
BLORP for depth operations on gen4-5.  Also, the gen4-5 code has to deal
with intratile offsets for LOD hacks and ISL doesn't handle those yet.
We could make ISL handle gen4-5 capable or we could just not bother.

Among other things, this should make future platform enabling easier
because it means we don't have to update multiple (or hand-rolled!)
depth stencil emit paths.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:44 -07:00
Jason Ekstrand
ccd3dce3c0 i965: Use the brw_depthbuffer atom on all gens
The only reason why we had two atoms was that the one we used for gen7+
depended on _NEW_DEPTH and _NEW_STENCIL as well as _NEW_BUFFERS.  Since
this is no longer true, we can combine them into one atom.  We do add a
dependence on BRW_NEW_AUX_STATE but that should never get set on gen4-5
so adding it is a no-op for those platforms.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:44 -07:00
Jason Ekstrand
514bb6f41e i965: Always set depth/stencil write enables on gen7+
The hardware will AND these fields with the corresponding fields in
DEPTH_STENCIL_STATE so there's no real reason to toggle them on and off
based on state bits.  This removes our reliance on the _NEW_DEPTH and
_NEW_STENCIL state bits and better matches what ISL does.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:43 -07:00
Jason Ekstrand
c4d00da7b7 i965: Re-order depth/stencil/hiz/clear packets to match ISL
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:27:42 -07:00
Jason Ekstrand
6fc3404911 i965: Re-emit depth/stencil/hiz on BRW_NEW_AUX_STATE
Certain things can change the aux usage or fast clear color of a depth
surface and we want to re-emit if that happens.  For instance, if you do
a fast depth clear of an already clear depth surface, we will just set
the clear color and not do anything else.  In that case, we could fail
to re-emit 3DSTATE_CLEAR_PARAMS and not get the new fast-clear color.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 08:23:55 -07:00
Lionel Landwerlin
3cdf1bf97d intel: devinfo: fix assertion on devices with odd number of EUs
I forgot to change the assert in the second helper function in a
previous change.

This hit the assert() on a Broadwell platform with 1 slice, 3
subslices but all EUs disabled in subslice 1 & 2.

Fixes: c1900f5b0f ("intel: devinfo: add helper functions to fill fusing masks values")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-08 15:15:54 +01:00
Bas Nieuwenhuizen
b17cfb08a3 vulkan/wsi: Only use LINEAR modifier for prime if supported.
This was setting the LINEAR modifier if neither the
X server nor the driver supported modifiers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106180
Fixes: c80c08e226 "vulkan/wsi/x11: Add support for DRI3 v1.2"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Abel Garcia Dorta <mercuriete@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-08 15:47:16 +02:00
Jan Vesely
a9e4be9212 eg/compute: Drop reference to kernel_param bo in destructor
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-08 09:02:38 -04:00
Jan Vesely
a1e8fcce3e r600: Cleanup constant buffers on context destruction
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-08 09:02:30 -04:00
Alejandro Piñeiro
b6648798cf mesa/formatquery: remove online compression check on is_resource_supported
is_resource_supported returns if the combination of
target/internalformat is supported in at least one operation. Online
compression is only mandatory for glTexImage2D. Some formats doesn't
support online compression, but can be used in any case, with
glCompressed*D methods.

Without this commit, ETC2 internalformats were returning FALSE, even
for the drivers supporting it. So any other query (like
TEXTURE_COMPRESSED) was returning FALSE/NONE instead of the proper
value.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-08 08:19:38 +02:00
Kenneth Graunke
e6fb8196ce intel/genxml: Assert that genxml field start and ends are sane.
Chris recently fixed a bunch of genxml end < start bugs, as well as
booleans that are wider than a bit.  These are way too easy to write, so
asserting that the fields are sane is a good plan.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 23:06:52 -07:00
Kenneth Graunke
f83fd929b7 intel/genxml: Fix some more fake booleans in genxml.
None of these are actually booleans.  Tile Parameter is a tiling mode
enum.  Display pipes take plane numbers.  Predicate Enable has some
operations (and the default value of 6 was particular bogus).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 23:06:52 -07:00
Kenneth Graunke
33906eeaca intel/genxml: Make assert in gen_pack_header print a message.
Python's assert can take both a condition and a string, which will cause
it to print the string if the assertion trips.  (You can't use parens as
that creates a tuple.)  Doing "condition and string" works in C, but
doesn't have the desired effect in Python.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 23:06:52 -07:00
Kenneth Graunke
2dc29e095f i965: Don't leak blorp on Gen4-5.
We used to only initialize BLORP on Gen6+.  When we added it on Gen4-5,
we forgot to destroy it unconditionally.

Fixes: 752d7af77a (i965: Add blorp support for gen4-5)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-05-07 23:05:59 -07:00
Matt Turner
ed5af94373 nir: Transform discard_if(true) into discard
Noticed while reviewing Tim Arceri's NIR inlining series.

Without his series:

instructions in affected programs: 16 -> 14 (-12.50%)
helped: 2

With his series:

instructions in affected programs: 196 -> 174 (-11.22%)
helped: 22

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 13:50:23 -07:00
Jan Vesely
ea1fff4416 eg/compute: Drop reference on code_bo in destructor.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-07 15:04:03 -04:00
Nicolas Boichat
54ba73ef10 configure.ac/meson.build: Fix -latomic test
When compiling with LLVM 6.0 on x86 (32-bit) for Android, the test
fails to detect that -latomic is actually required, as the atomic
call is inlined.

In the code itself (src/util/disk_cache.c), we see this pattern:
p_atomic_add(cache->size, - (uint64_t)size);
where cache->size is an uint64_t *, and results in the following
link time error without -latomic:
src/util/disk_cache.c:628: error: undefined reference to '__atomic_fetch_add_8'

Fix the configure/meson test to replicate this pattern, which then
correctly realizes the need for -latomic.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
2018-05-07 10:14:53 -07:00
Scott D Phillips
8b519075ea anv: remove unused field anv_queue::pool
The last use of the field was removed in 2015's ("48a87f4ba06
anv/queue: Get rid of the serial")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-07 09:03:46 -07:00
Kenneth Graunke
0b1cfd01ff i965: Set initial kflags on BO creation.
This simplifies kflag initialization, by creating a bufmgr-wide setting
for initial kflags, and just applying it whenever we create a new BO.

This also properly allows 48-bit addresses for imported BOs (via prime
or flink), which I had missed in my earlier 48-bit support series.

This will be useful when adding softpin support, as we'll want to add
EXEC_OBJECT_PINNED to initial_kflags as well.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-05-07 08:47:21 -07:00
Juan A. Suarez Romero
7ee54fc33d docs: update calendar, add news and link release notes to 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-07 11:25:54 +00:00
Juan A. Suarez Romero
78e103da8b docs: add sha256 checksums for 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ae12c5e990)
2018-05-07 11:19:36 +00:00
Juan A. Suarez Romero
6c06d4e17b docs: add sha256 checksums for 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 6dc2658fd6)
2018-05-07 11:19:34 +00:00
Chris Wilson
cf440d85db intel/genxml: Fix a few invalid field widths
A couple of typos found by inspecting field.end - field.start, revealed
a few wide integers declared as bool and some that ended before they
started.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-07 11:34:13 +01:00
Vinson Lee
cd5319a64f swr/rast: Fix include for createInstructionCombiningPass with llvm-7.0.
Fix build error after llvm-7.0.0svn r330669 ("InstCombine: Fix layering
by not including Scalar.h in InstCombine").

  CXX      rasterizer/jitter/libmesaswr_la-blend_jit.lo
rasterizer/jitter/blend_jit.cpp:816:20: error: use of undeclared identifier 'createInstructionCombiningPass'; did you mean 'createInstructionSimplifierPass'?
        passes.add(createInstructionCombiningPass());
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                   createInstructionSimplifierPass

Suggested-by: George Kyriazis <george.kyriazis@intel.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-05 13:20:53 -07:00
Jan Vesely
2f1ad72ac1 clover: Add explicit virtual destructor to argument class
It is needed to destroy the v vector in scalar_argument
Fixes memory leaks on parameter set/bind.

v2: Drop redundant sclara_argument destructor

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-05-05 13:17:08 -04:00
Iago Toral Quiroga
e4c667b9e8 anv/device: expose shaderInt16 support in gen8+
This rollbacks the revert of this patch introduced with
commit 7cf284f18e.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-05 12:41:14 +02:00
Iago Toral Quiroga
5a12bdac09 i965/compiler: handle conversion to smaller type in the lowering pass for that
This rollbacks the revert of this same patch introduced in
commit 7b9c15628a.

And also squahes the following patch to prevent a piglit regression caused
by this change:

intel/compiler: Fix lower_conversions for 8-bit types.
Author: Jose Maria Casanova Crespo <jmcasanova@igalia.com>

For 8-bit types the execution type is word. A byte raw MOV has 16-bit
execution type and 8-bit destination and it shouldn't be considered
a conversion case. So there is no need to change alignment and enter
in lower_conversions for these instructions.

Fixes a regresion in the piglit test "glsl-fs-shader-stencil-export"
that is introduced with this patch from the Vulkan shaderInt16 series:
'i965/compiler: handle conversion to smaller type in the lowering
pass for that'. The problem is caused because there is already a case
in the driver that injects Byte instructions like this:

mov(8)          g127<1>UB       g2<32,8,4>UB

And the aforementioned pass was not accounting for the special
handling of the execution size of Byte instructions. This patch
fixes this.

v2: (Jason Ekstrand)
   - Simplify is_byte_raw_mov, include reference to PRM and not
   consider B <-> UB conversions as raw movs.

v3: (Matt Turner)
   - Indentation style fixes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-05 12:41:02 +02:00
Iago Toral Quiroga
a75f967388 intel/compiler: handle 16-bit to 64-bit conversions in BSW platforms
These are subject to the general restriction that anything that is converted
to 64-bit needs to be aligned to 64-bit.  We had this already in place for
32-bit to 64-bit conversions, so this patch generalizes the implementation
to take effect on any conversion to 64-bit from a source smaller than
64-bit.

Fixes assembly validation errors in the following CTS tests in BSW:
dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_int64
dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint16_to_uint64
dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_uint64

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-05 12:26:37 +02:00
Caio Marcelo de Oliveira Filho
9d1ff2261c intel/genxml: recognize 0x, 0o and 0b when setting default value
Remove the need of converting values that are documented in
hexadecimal. This patch would allow writing

    <field name="3D Command Sub Opcode" ... default="0x1B"/>

instead of

    <field name="3D Command Sub Opcode" ... default="27"/>

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-04 23:58:10 +01:00
Ian Romanick
9a10a2fd5f r200: Enable NV_fog_distance
With the previous fixes in place, it appears to just work.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:29:30 -07:00
Ian Romanick
9d0bf720ed i965: Enable NV_fog_distance
With the previous fixes in place, it appears to just work.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:29:28 -07:00
Ian Romanick
df80ffa4aa ffvertex: Don't try to read output registers in fog calculation
Gallium drivers use _mesa_remove_output_reads() via st_program to lower
output reads away.  It seems better to just generate the right thing in
the first place.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:27:50 -07:00
Ian Romanick
f2db3be620 mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)
Found by inspection, so I made a piglit test too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:27:44 -07:00
Ian Romanick
d350276b03 mesa: Silence an unused parameter warning
main/framebuffer.c: In function ‘update_color_draw_buffers’:
main/framebuffer.c:629:46: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 update_color_draw_buffers(struct gl_context *ctx, struct gl_framebuffer *fb)
                                              ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-04 15:27:40 -07:00
Gert Wollny
e695a35f40 mesa/main/readpix: Correct handling of packed floating point values
Make sure that clamping in the pixel transfer operations is enabled/disabled
for packed floating point values just like it is done for single normal and
half precision floating point values.

This fixes a series of CTS tests with virgl that use r11f_g11f_b10f
buffers as target, and where virglrenderer reads these surfaces back
using the format GL_UNSIGNED_INT_10F_11F_11F_REV.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-05-04 10:47:46 -07:00
Scott D Phillips
5c075b0855 util/set: add a set_clear function
Clear a set back to the state of having zero entries.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-04 10:13:33 -07:00
Tapani Pälli
affe63b1da egl: add EGL_BAD_MATCH error case for surfaceless and android
Just like is done for other backends when suitable config is not
found (added in fd4eba4929).

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-05-04 14:04:03 +03:00
Nicolai Hähnle
c0acb596f4 amd/common: use llvm.amdgcn.wqm for explicit derivatives
To comply with an upcoming change in LLVM, see
https://reviews.llvm.org/D46051

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-04 11:02:48 +02:00
Rhys Perry
b30949a9c2 nv50/ir: fix printing of pixld
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-05-03 22:57:46 -04:00
Drew Davenport
4373dd3215 st/va: Support YUV formats in vaCreateSurfaces
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2018-05-03 15:48:35 -07:00
Mark Janes
7cf284f18e Revert "anv/device: expose shaderInt16 support in gen8+"
This reverts commit 0ba0ac815e.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-03 15:26:59 -07:00
Mark Janes
7b9c15628a Revert "i965/compiler: handle conversion to smaller type in the lowering pass for that"
This reverts commit 96b5153790.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-03 15:26:59 -07:00
Vinson Lee
589622a2fe swr/rast: Fix WriteBitcodeToFile usage with llvm-7.0.
Fix build error after llvm-7.0svn r325155 ("Pass a reference to a module
to the bitcode writer.").

  CXX      rasterizer/jitter/libmesaswr_la-JitManager.lo
rasterizer/jitter/JitManager.cpp:548:30: error: reference to type 'const llvm::Module' could not bind to an lvalue of type 'const llvm::Module *'
    llvm::WriteBitcodeToFile(M, bitcodeStream);
                             ^

Suggested-by: George Kyriazis <george.kyriazis@intel.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-05-03 14:06:09 -07:00
Deepak Rawat
9a21c96126 egl/x11: Send invalidate to driver on copy_region path in swap_buffer
Similar to swap_available path send invalidate to the driver because
egl/X11 is not watching for for server's invalidate events. The
dri2_copy_region path is trigerred when server supports DRI2 version
minor 1.

Tested with piglit egl tests for regression.

V2: Move invalidate from dri2_copy_region to swap_buffer common.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2018-05-03 13:55:58 +02:00
Juan A. Suarez Romero
fd4eba4929 egl: check if colorspace/surface type is supported
According to EGL 1.4 spec, section 3.5.1 ("Creating On-Screen Rendering
Surfaces"), if config does not support the colorspace or alpha format
attributes specified in attrib_list (as defined for
eglCreateWindowSurface), an EGL_BAD_MATCH error is generated.

This fixes dEQP-EGL.functional.wide_color.*_888_colorspace_srgb (still
not merged,
https://android-review.googlesource.com/c/platform/external/deqp/+/667322),
which is crashing when trying to create a windows surface with RGB888
configuration and sRGB colorspace.

v2: Handle the fix in other backends (Tapani)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-03 12:26:12 +02:00
Iago Toral Quiroga
0ba0ac815e anv/device: expose shaderInt16 support in gen8+
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
002cb6f2b3 anv/pipeline: support SpvCapabilityInt16 in gen8+
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
f07c05576f compiler/spirv: add implementation to check for SpvCapabilityInt16 support
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
dd41630d9a intel/compiler: implement 16-bit pack/unpack opcodes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
1dacb56279 compiler/spirv: implement 16-bit bitcasts
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
2d648e5ba3 compiler/lower_64bit_packing: rename the pass to be more generic
It can do 32-bit packing too now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
d2564af842 nir/lower_64bit_packing: extend the pass to handle packing from / to 16-bit.
With 16-bit support we can now do 32-bit packing, a follow-up patch will
rename the pass to something more generic.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
c9653cc14c nir: add opcodes for 16-bit packing and unpacking
Noitice that we don't need 'split' versions of the 64-bit to / from
16-bit opcodes which we require during pack lowering to implement these
operations. This is because these operations can be expressed as a
collection of 32-bit from / to 16-bit and 64-bit to / from 32-bit
operations, so we don't need new opcodes specifically for them.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Iago Toral Quiroga
6318808a05 intel/compiler: fix 16-bit comparisons
NIR assumes that booleans are always 32-bit, but Intel hardware produces
16-bit booleans for 16-bit comparisons. This means that we need to convert
the 16-bit result to 32-bit.

In the future we want to add an optimization pass to clean this up and
hopefully remove the conversions.

v2 (Jason): use the type of the source for the temporary and use
            brw_reg_type_from_bit_size for the conversion to 32-bit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
b11e9425df intel/compiler: lower some 16-bit integer operations to 32-bit
These are not supported in hardware for 16-bit integers.

We do the lowering pass after the optimization loop to ensure that we
lower ALU operations injected by algebraic optimizations too.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
b9a3d8c23e compiler/nir: add a lowering pass to convert the bit size of ALU operations
Not all bit-sizes may be supported natively in hardware for all operations.
This pass allows drivers to lower such operations to a bit-size that is
actually supported and then converts the result back to the original
bit-size.

Compiler backends control which operations and wich bit-sizes require
the lowering through a callback function.

v2: generalize this pass and make it available in NIR core (Rob, Jason)
v3: remove some temporaries and reduce nesting in instruction loop using
    a continue statement (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo
f575277f7e intel/compiler: support negate and abs of half float immediates
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo
f0e6dacee5 intel/compiler: fix brw_imm_w for negative 16-bit integers
16-bit immediates need to replicate the 16-bit immediate value
in both words of the 32-bit value. This needs to be careful
to avoid sign-extension, which the previous implementation was
not handling properly.

For example, with the previous implementation, storing the value
-3 would generate imm.d = 0xfffffffd due to signed integer sign
extension, which is not correct. Instead, we should cast to
uint16_t, which gives us the correct result: imm.ud = 0xfffdfffd.

We only had a couple of cases hitting this path in the driver
until now, one with value -1, which would work since all bits are
one in this case, and another with value -2 in brw_clip_tri(),
which would hit the aforementioned issue (this case only affects
gen4 although we are not aware of whether this was causing an
actual bug somewhere).

v2: Make explicit uint32_t casting for left shift (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo
2a76f03c90 intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate
From Intel Skylake PRM, vol 07, "Immediate" section (page 768):

"For a word, unsigned word, or half-float immediate data,
software must replicate the same 16-bit immediate value to both
the lower word and the high word of the 32-bit immediate field
in a GEN instruction."

This fixes the int16/uint16 negate and abs immediates that weren't
taking into account the replication in lower and upper words.

v2: Integer cases are different to Float cases. (Jason Ekstrand)
    Included reference to PRM (Jose Maria Casanova)
v3: Make explicit uint32_t casting for left shift (Jason Ekstrand)
    Split half float implementation. (Jason Ekstrand)
    Fix brw_abs_immediate (Jose Maria Casanova)

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo
e5fc3c0717 intel/compiler: implement nir_instr_type_load_const for 16-bit constants
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
939501c8ed intel/compiler: implement conversions from 16-bit int/float to bool
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
d5a419176f intel/compiler: implement conversion between float/int 16-bit types
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
96b5153790 i965/compiler: handle conversion to smaller type in the lowering pass for that
The lowering pass was specialized to act on 64-bit to 32-bit conversions only,
but the implementation is valid for other cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Iago Toral Quiroga
5361a87ee7 intel/compiler: fix isign for 16-bit integers
We need to use 16-bit constants with 16-bit instructions,
otherwise we get the following validation error:

"Destination stride must be equal to the ratio of the sizes of
 the execution data type to the destination type"

Because the execution data type is 4B due to the 32-bit integer
constant.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:25 +02:00
Chris Wilson
b5e266765a i965: Always try to create a logical context
Always enable use of HW logical contexts to preserve GPU state between
batches when the kernel supports such constructs, continuing to enforce
the required support for gen6+.

At runtime, this effectively removes the BRW_NEW_CONTEXT flag (and the
upload of invariant state) from the start of every batch for any kernel
supporting contexts. So long as the older atoms are correctly listening
to the right flag (NEW_CONTEXT rather than NEW_BATCH) this should
eliminate a few redundant state uploads for the older platforms.

No piglits were harmed on ctg and ilk, both with and without logical
contexts.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-03 01:39:33 -07:00
Neil Roberts
e17d0ccbbd spirv: Apply OriginUpperLeft to FragCoord
This behaviour was changed in 1e5b09f42f. The commit message
for that says it is just a “tidy up” so my assumption is that the
behaviour change was a mistake. It’s a little hard to decipher looking
at the diff, but the previous code before that patch was:

  if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
     nir_var->data.origin_upper_left = b->origin_upper_left;

  if (builtin == SpvBuiltInFragCoord)
     nir_var->data.pixel_center_integer = b->pixel_center_integer;

After the patch the code was:

  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     /* fallthrough */
  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     break;

Before the patch origin_upper_left affected both builtins and
pixel_center_integer only affected FragCoord. After the patch
origin_upper_left only affects SamplePosition and pixel_center_integer
affects both variables.

This patch tries to restore the previous behaviour by changing the
code to:

  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     /* fallthrough */
  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     break;

This change will be important for ARB_gl_spirv which is meant to
support OriginLowerLeft.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: 1e5b09f42f "spirv: Tidy some repeated if checks..."
2018-05-03 10:08:42 +02:00
Samuel Iglesias Gonsálvez
b291a3a4a3 spirv: convert some operands for bitwise shift and bitwise ops to uint32
SPIR-V allows to define the shift, offset and count operands for
shift and bitfield opcodes with a bit-size different than 32 bits,
but in NIR the opcodes have that limitation. As agreed in the
mailing list, this patch adds a conversion to 32 bits to fix this.

For more info, see:

https://lists.freedesktop.org/archives/mesa-dev/2018-April/193026.html

v2:
- src_bit_size will have zero value for variable bit-size operands (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 07:07:24 +02:00
Timothy Arceri
58c05ede96 mesa: enable geom shaders in OpenGL 3.2 Compat profile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-05-03 12:08:21 +10:00
Bas Nieuwenhuizen
ffa15861ef radv: UseEnumerateInstanceVersion for the default version.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-02 21:57:08 +02:00
Bas Nieuwenhuizen
467c562a29 radv: Don't check the incoming apiVersion on CreateInstance.
This fixes

dEQP-VK.api.device_init.create_instance_invalid_api_version

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-02 21:57:08 +02:00
Bas Nieuwenhuizen
9267ff9883 radv: Allow vkEnumerateInstanceVersion ProcAddr without instance.
Apparently the somewhere between 1.1.70 and 1.1.73 the loader started
depending on this. The loader then creates a 1.0 instance, which gets
into funny situation because we have a 1.1 device.

No idea how to do line wrapping in Mako though, my random guesses
did not work.

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 21:57:08 +02:00
Lionel Landwerlin
336decd67e intel: aubinator: add an option to limit the number of decoded VBO lines
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 19:46:47 +01:00
Lionel Landwerlin
000452aebc intel: decoder: limit to the number decoded lines from VBO
By default we set no limit, but the debug batch decoder in i965 sets
it to 100.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 19:46:47 +01:00
Jason Ekstrand
bd35345e85 anv: Advertise variableMultisampleRate
Initially, I didn't understand this feature.  Turns out that all it
means is that you can switch multisample rates in the middle of a
zero-attachment subpass.  We've been able to do this since forever.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-05-02 10:59:03 -07:00
Rob Clark
28e410f6a5 nir: add missing dependency in meson.build
nir_builder_opcodes.h also depends on nir_intrinsics.py for generating
the system-value builders.

Reported-by: Christoph Haag <haagch@frickel.club>
Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 13:57:51 -04:00
Matthew Nicholls
97d57ef917 radv: fix multisample image copies
Previously before fb077b0728, the LOD parameter was being used in place of the
sample index, which would only copy the first sample to all samples in the
destination image. After that multisample image copies wouldn't copy anything
from my observations.

This fixes some copy_and_blit CTS tests.

v3.1: - set lod to 0 for nir_txf_ms (Samuel)
v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel)
    - updated commit description (Samuel)

Fix this properly by copying each sample in a separate radv_CmdDraw and using a
pipeline with the correct rasterizationSamples for the destination image.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-02 19:32:00 +02:00
Kenneth Graunke
169d8e011a intel: Fix 3DSTATE_CONSTANT buffer decoding.
First, this was iterating over the 3DSTATE_CONSTANT_* instruction
but trying to process fields of the 3DSTATE_CONSTANT_BODY substructure.

Secondly, the fields have been called Buffer[0] and Read Length[0],
for a while now, and we were not handling the subscripts correctly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 10:09:28 -07:00
Lionel Landwerlin
cf1d587879 intel: fix aubinator include
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7c22c150c4 ("intel: Move batch decoder/disassembler from tools/ to common/")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:54:29 +01:00
Kenneth Graunke
0ab423388c i965: Reuse batch decoder infrastructure rather than open coding it.
With the new callback, Jason's newer batch decoder infrastructure
should be able to do just as well as the old open coded INTEL_DEBUG=bat
handling, with much less code.  If there are any limitations, we'd like
to improve the common code rather than doing one-off hacks here.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 09:27:56 -07:00
Kenneth Graunke
bf91b81a0b intel: Give the batch decoder a callback to ask about state size.
Given an arbitrary batch, we don't always know what the size of certain
things are, such as how many entries are in a binding table.  But it's
easy for the driver to track that information, so with a simple callback
we can calculate this correctly for INTEL_DEBUG=bat.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 09:27:56 -07:00
Kenneth Graunke
7c22c150c4 intel: Move batch decoder/disassembler from tools/ to common/
Making these part of libintel_common allows us to use them in the DRI
driver.  The standalone tool binaries already link against the common
library, too, so it's no harder for them.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 09:27:56 -07:00
Kenneth Graunke
5c04971831 i965: Allocate shadow batches to explicitly be the BO size.
This unfortunately makes it malloc/realloc on every new batch, rather
than once at startup.  But it ensures that the shadow buffer's size will
absolutely match the BO size.  Otherwise, as we tune BATCH_SZ/STATE_SZ
or bufmgr cache bucket sizes, we may get a BO size that's rounded up,
and fail to allocate the shadow buffer large enough.

This doesn't fix any bugs today, as BATCH_SZ/STATE_SZ are the size of
a cache bucket, but it's better to be safe than sorry.

Reported-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-02 09:26:55 -07:00
Lionel Landwerlin
ec5df73803 intel: batch-decoder: iterate VERTEX_BUFFER_STATE fields
The gen_field_iterator only iterates the fields of a given gen_group.
If we want to iterate the fields of another gen_group contained as
field, we need to do it manually.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:11:28 +01:00
Lionel Landwerlin
acbce2ac57 intel: decoder: fix starting dword of struct fields
Struct fields might span several dwords, but iter_dword is incremented
up to the last dword of the current field before we print out the
struct's fields. We can't use iter_dword for computing the offset into
the pointer of data to decode.

v2: Fix displayed offset number (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:11:28 +01:00
Lionel Landwerlin
467430ddcc intel: decoder: document when fields should be used
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:10:37 +01:00
Lionel Landwerlin
4f128f7850 intel: decoder: identify groups with fixed length
<register> & <struct> elements always have fixed length. The
get_length() method implies that we're dealing with an instruction in
which the length is encoded into the variable data but the field
iterator uses it without checking what kind of gen_group it is dealing
with.

Let's make get_length() report the correct length regardless of the
gen_group (register, struct or instruction).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:10:37 +01:00
Lionel Landwerlin
3c416a50d8 intel: decoder: make the field iterator use more natural
while (iter_next()) { ... }

instead of

do { ... } while (iter_next());

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-02 17:10:37 +01:00
Vlad Golovkin
967aabca06 nv50: Extract needed value bits without shifting them before calling bitcount
This can save one instruction since bitcount doesn't care about specific
bits' positions.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-05-02 15:12:48 +02:00
Antia Puentes
3a1df14a7b intel: activate the gl_BaseVertex lowering
Surplus code related to the basevertex is removed.

The Vertex Elements contain now:
* VE 1: <firstvertex, BaseInstance, VertexID, InstanceID>
* VE 2: <DrawID, is_indexed_draw, 0, 0>

Also fixes unreachable message.

Fixes OpenGL CTS tests:
* KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
* KHR-GL46.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters

Fixes Piglit tests:
* arb_shader_draw_parameters-drawid-indirect baseinstance
* arb_shader_draw_parameters-basevertex

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
2018-05-02 11:24:46 +02:00
Antia Puentes
0fb204fac1 compiler/nir: Add conditional lowering for gl_BaseVertex
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 11:24:31 +02:00
Antia Puentes
0cbf29fa55 intel: emit is_indexed_draw in the same VE than gl_DrawID
The Vertex Elements are now:
* VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID>
* VE 2: <DrawID, is-indexed-draw, 0, 0>

VE1 is it kept as it was before, VE2 additionally contains the new
system value.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 11:23:34 +02:00
Antia Puentes
6ba9088d9c intel/compiler: Add uses_is_indexed_draw flag
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 11:20:48 +02:00
Antia Puentes
9e6b886cf2 compiler: Add SYSTEM_VALUE_IS_INDEXED_DRAW and instrinsics
This VS system value contains if the draw command used to start the
rendering was an indexed draw command or a non-indexed one
(~0/0 respectively). Useful to calculate the gl_BaseVertex as:
(SYSTEM_VALUE_IS_INDEXED_DRAW & SYSTEM_VALUE_FIRST_VERTEX).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-02 11:20:40 +02:00
Samuel Pitoiset
0737c1e3a6 radv: enable out-of-order rasterization by default
As the implementation is conservative, we can now enable it
by default. It can be disabled with RADV_DEBUG=nooutoforder.

Don't expect much more than 1% of improvements, but the gain
seems consistent.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-02 10:33:24 +02:00
Samuel Pitoiset
1d766b0196 radv: only disable out-of-order rast for perfect occlusion queries
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-02 10:33:22 +02:00
Kenneth Graunke
1122fb2d98 i965: Drop unused gen5 sampler default color struct.
Trivial.
2018-05-01 23:09:25 -07:00
Kenneth Graunke
9f6082f6c7 i965: Make brw_vs_outputs_written static.
Drop a prototype.  Trivial.
2018-05-01 23:09:16 -07:00
Nanley Chery
3e56e4642f i965/tex_image: Avoid the ASTC LDR workaround on gen9lp
Both the internal documentation and the results of testing this in the
CI suggest that this is unnecessary. Add the fixes tag because this
reduces an internal benchmark's startup time by about 17 seconds
(reported by Eero).

Fixes: 710b1d2e66 "i965/tex_image: Flush certain subnormal ASTC channel values"
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-01 16:47:39 -07:00
Eric Anholt
800be7f277 freedreno: Fix ir3_cmdline.c build.
Fixes: 6487e7a30c ("nir: move GL specific passes to src/compiler/glsl")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-05-01 16:38:37 -07:00
Jason Ekstrand
d216ffc604 anv: Allow lookup of vkEnumerateInstanceVersion without an instance
Fixes: cbab2d1da5
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-01 14:45:51 -07:00
Jason Ekstrand
d5a0787f03 anv: Don't advertise Float64 or Int64 on HW without 64-bit types
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-05-01 14:45:50 -07:00
Samuel Pitoiset
d8db5986ce radv: compute the number of subpass attachments correctly
Only count color attachments twice if resolves are used, also
account for the depth stencil attachment if present.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-01 22:18:03 +02:00
Dave Airlie
e66f64c285 radv: set fmask_surf_index on fmask surfaces.
This is needed for gfx9 and later for all fmask surface index.

(Mentioned by Marek on irc)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-05-02 06:01:42 +10:00
Brian Paul
f298ed93d9 gallium/i915: fix PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE typo
Fixes: fffe5e2d14 ("gallium: add initial support for conservative
rasterization")
Trivial.
2018-05-01 09:52:22 -06:00
Rhys Perry
07dac3e040 nvc0: add conservative rasterization support
Subpixel precision bias, dilation and the post-snap mode are supported on
GM200 and newer. The pre-snap mode is supported for triangle primitives on
GP100.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-30 21:13:53 -06:00
Rhys Perry
97f5f399ef st/mesa: add support for nvidia conservative rasterization extensions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-30 21:13:53 -06:00
Rhys Perry
fffe5e2d14 gallium: add initial support for conservative rasterization
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-30 21:13:53 -06:00
Rhys Perry
4580617509 mesa: add support for nvidia conservative rasterization extensions
Although the specs are written against compatibility GL 4.3 and allows core
profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-30 21:13:53 -06:00
Brian Paul
31ab0427a7 glsl/tests: add GLSL_TYPE_UINT8, GLSL_TYPE_INT8 cases to switch statements
To silence warnings about unhandled switch values.
Untested otherwise.

v2: move the INT/UINT8 cases after the INT/UINT16 cases, per Eric.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-30 21:13:53 -06:00
Brian Paul
efec712d51 tgsi: use enums instead of unsigned in ureg code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-30 21:13:53 -06:00
Timothy Arceri
6487e7a30c nir: move GL specific passes to src/compiler/glsl
With this we should have no passes in src/compiler/nir with any
dependencies on headers from core GL Mesa.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-05-01 12:39:33 +10:00
Andres Rodriguez
f56e22e496 radv/winsys: fix leaking resources from bo's imported by fd
A bo's ref_count was not being initialized when imported from an fd.
Therefore, we would fail to free the resource during VkFreeMemory().

This patch fixes applications like hifi VR in threaded mode, which
perform frequent imports/releases of IPC shared memory.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-04-30 18:20:30 -04:00
Scott D Phillips
2a08ae3c7c i965/tiled_memcpy: ytiled_to_linear a cache line at a time
Similar to the transformation applied to linear_to_ytiled, also align
each readback from the ytiled source to a cacheline (i.e. transfer a
whole cacheline from the source before moving on to the next column).
This will allow us to utilize movntqda (_mm_stream_si128) in a
subsequent patch to obtain near WB readback performance when accessing
the uncached ytiled memory, an order of magnitude improvement.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-30 15:18:36 -07:00
Chris Wilson
682bdaa658 i965: Record mipmap resolver for unmapping
When mapping a region of the mipmap_tree, record which complementary
method to use to unmap it afterwards. By doing so we can avoid
duplicating the decision tree used when mapping and thereby eliminate
trivial errors that can be introduced if the two if-chains become out of
sync.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
5367295e1a i965: Move unmap_depthstencil before map_depthstencil
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
ab2825c898 i965: Move unmap_etc before map_etc
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
9e7e88049f i965: Move unmap_s8 before map_s8
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
b3ad6f5ca6 i965: Move unmap_movntdqa before map_movntdqa
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
f348d07a62 i965: Move unmap_blit before map_blit
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Chris Wilson
359624142d i965: Move unmap_gtt before map_gtt
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 14:06:23 -07:00
Dave Airlie
8d3529872c ac/nir: expand 64-bit vec3 loads to fix shuffling.
If loading 64-bit vec3 values, a 4 component load would be followed
by a 2 component load and the resulting shuffle would fail as it
requires 2 4 components. This just expands the second results
vector out to 4 components.

This fixes 100 CTS tests:
dEQP-VK.spirv_assembly.type.vec3.*64*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-05-01 05:58:14 +10:00
Kenneth Graunke
bde12f75e1 i965: Don't stomp initial kflags for program cache.
We want to flag EXEC_OBJECT_CAPTURE, but we ought to preserve any
existing kflags.  Today, there are none (as the program cache doesn't
support 48-bit addressing), but once we start using softpin, we'll
need to preserve EXEC_OBJECT_PINNED.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-30 11:34:19 -07:00
Kenneth Graunke
0cc98522f9 i965: Let batchbuffers be placed anywhere in the 48-bit address space.
We were trying to mark batch buffers with EXEC_OBJECT_CAPTURE, and
accidentally stomped EXEC_OBJECT_SUPPORTS_48B_ADDRESS in the process.

There's no reason to restrict batch buffers to the lower 4GB.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-30 11:34:19 -07:00
Scott D Phillips
8ffc6ee251 intel: fix check for 48b ppgtt support
The previous logic of the supports_48b_addresses wasn't actually
checking if i915.ko was running with full_48bit_ppgtt. The ENOENT
it was checking for was actually coming from the invalid context
id provided in the test execbuffer.  There is no path in the
kernel driver where the presence of
EXEC_OBJECT_SUPPORTS_48B_ADDRESS leads to an error.

Instead, check the default context's GTT_SIZE param for a value
greater than 4 GiB

v2 (Ken): Fix in i965 as well.
v3 Check GTT_SIZE instead of HAS_ALIASING_PPGTT (Chris Wilson)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-30 11:34:19 -07:00
Leo Liu
1c5f4f4e17 st/omx/enc: fix blit setup for YUV LoadImage
The blit here involves scaling since it's copying from I8 format to R8G8 format.
Half of source will be filtered out with PIPE_TEX_FILTER_NEAREST instruction, it
looks that GPU always uses the second half as source. Currently we use "1" as
the start point of x for R, then causing 1 source pixel of U component shift to
right. So "-1" should be the start point for U component.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-30 11:55:36 -04:00
Juan A. Suarez Romero
4d449c94e4 autotools, meson: bump up required VA version
Due using a new VP9 config we use, required VA API 0.39

Fixes: 413c5ca372 ("travis: update libva required version")
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-30 13:59:37 +02:00
Juan A. Suarez Romero
96ed3714fc docs: update calendar, add news and link release notes to 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-28 17:01:48 +00:00
Juan A. Suarez Romero
8f1159bf9a docs: add sha256 checksums for 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit b3eed3ad03)
2018-04-28 16:58:39 +00:00
Juan A. Suarez Romero
14f85260de docs: add release notes for 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit d38da7bd2d)
2018-04-28 16:58:36 +00:00
Marek Olšák
8b7358fe43 radeonsi: increase the number of compiler threads depending on the CPU
The compiler queue was limited to 3 threads, so shader-db running
on a 16-thread CPU would have a bottleneck on the 3-thread queue.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
3f0eaaf6d9 radeonsi: avoid a crash in gallivm_dispose_target_library_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
e75fc8d033 radeonsi: move data_layout into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
797d673c9a radeonsi: move passmgr into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
c1823ff661 radeonsi: move target_library_info into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
5a94f15aa7 radeonsi: use si_compiler::triple in si_llvm_optimize_module
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
43f0a10051 radeonsi: add triple into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
87eb597758 radeonsi: add struct si_compiler containing LLVMTargetMachineRef
It will contain more variables.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
788d66553a radeonsi: rename r600_texture::resource to buffer
r600_resource could be renamed to si_buffer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
6fadfc01c6 radeonsi: use r600_resource() typecast helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
3160ee876a radeonsi: remove unused atom parameter from si_atom::emit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
de344209ad radeonsi: inline 2 trivial state structures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
e395475096 radeonsi: remove function si_init_atom
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
ccebcba893 radeonsi: remove si_atom::id
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
639b673fc3 radeonsi: don't use an indirect table for state atoms
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
9054799b39 radeonsi: rename r600_atom -> si_atom
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
a8abbbb172 radeonsi: remove r600_pipe_common.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
6d19120da8 radeonsi/gfx9: workaround for INTERP with indirect indexing
and clean up the conditions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-04-27 17:56:04 -04:00
Marek Olšák
2d69b485f5 radeonsi: rewrite DCC format compatibility checking code
It might be better to use a slow compressed clear when clearing to 1.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
c732d069b3 radeonsi: implement DCC fast clear swizzle constraints more accurately
Reduce swizzle constraints to the ALPHA_IS_ON_MSB constraint and the clear
value of 1.

This significantly changes the DCC fast clear code, and fixes fast clear
for RGB formats without alpha.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
9ef423f720 radeonsi: rename variables and document stuff around DCC fast clear
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
1cc2e0cc6b radeonsi: fully enable 2x DCC MSAA for array and non-array textures
The clear code is exactly the same as for 1 sample buffers -
just clear the whole thing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
ca33d961a4 radeonsi: enable fast color clear for level 0 of mipmapped textures on <= VI
GFX9 is more complicated and needs a compute shader that we should just
copy from amdvlk.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
174e11c3f5 ac/surface: handle DCC subresource fast clear restriction on VI
v2: require the previous level to be clearable for determining whether
    the last unaligned level is clearable

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
George Kyriazis
838f15650e swr/rast: No need to export GetSimdValidIndicesGfx
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
7caeee3432 swr/rast: Small editorial changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
f276517ebf swr/rast: Use new processor detection mechanism
Use specific avx512 selection mechanism based on avx512er bit instead of
getHostCPUName().  LLVM 6.0.0 has a bug that reports wrong string for KNL
(fixed in 6.0.1).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
8ace547e8d swr/rast: Output rasterizer dir to console since it's process specific
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
c328c5d0f4 swr/rast: Add TranslateGfxAddress for shader
Also add GFX_MEM_CLIENT_SHADER

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
edc41f73b8 swr/rast: jit PRINT improvements.
Sign-extend integer types to 32bit when specifying "%d" and add new %u
which zero-extends to 32bit. Improves  printing of sub 32bit integer types
(i1 specifically).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
5d403178e6 swr/rast: Fix regressions.
Bump jit cache revision number to force recompile.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
577af2bed4 swr/rast: Cleanup old cruft.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
aeab9db50a swr/rast: Package events.proto with core output
However only if the file exists in DEBUG_OUTPUT_DIR. The expectation is
that AR rasterizerLauncher will start placing it there when launching
a workload (which is in a subsequent checkin)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
b97bb0ea6d swr/rast: Fix init in EventHandlerWorkerStats
Make sure we initialize variables.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
9a72d4c03e swr/rast: Fix return type of VCVTPS2PH.
expecting <8xi16> return.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
3f008c5505 swr/rast: WIP Translation handling
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
7986519d50 swr/rast: Use different handing for stream masks
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
6b1c852ebc swr/rast: Silence warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
e6daa62a48 swr/rast: Add support for TexelMask evaluation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
cec1b52cac swr/rast: Internal core change
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
7b343a215e swr/rast: Fix x86 lowering 64-bit float handling
- 64-bit cvt-to-float needs to be explicitly handled
- gathers need the right parameter types to work with doubles

Fixes draw-vertices piglit tests

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
fa4ab7910e swr/rast: Add some SIMD_T utility functors
VecEqual and VecHash

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
18c9cb85d1 swr/rast: Fix wrong type allocation
ALLOCA pointer elements, not pointers.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
1cdbce8805 swr: touch generated files to update timestamp
previous change in generators necessitates this change

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
George Kyriazis
9ceeb671a3 swr/rast: Fix byte offset for non-indexed draws
for the case when USE_SIMD16_SHADERS == FALSE

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-27 14:36:41 -05:00
Marek Olšák
7083ac7290 util/u_queue: fix a deadlock in util_queue_finish
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 13:28:17 -04:00
Dylan Baker
7772de5283 meson: fix race condition revealed by using 0.44
Previously there was a special target that blocked for the generation of
anv_entrypoints.h, with meson 0.44 we don't need this, we can use a new
language feature instead. The problem is that previously that blocking
target would hide a race condition for the generation of another header,
anv_extensions.h. Now the build sometimes fails when anv_extensions.h is
not generated in time.

v2: - clarify the race condition in the commit message (Emil)

CC: Mark Janes <mark.a.janes@intel.com>
Fixes: 92550d9b16
       ("meson: remove workaround for custom target creating .h and .c files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-27 10:24:51 -07:00
Dylan Baker
0c23bd76d1 bin: force git show to use default pretty setting
I have pretty default to short, which breaks this script.

v2: - Fix both places that don't define a --pretty (Emil)

cc: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-27 10:19:55 -07:00
Tapani Pälli
b3ad4b6971 mesa: add TBO support for GL_EXT_texture_norm16
Earlier plumbing missed interaction with texture buffer objects.

Fixes: 7f467d4f73 "mesa: GL_EXT_texture_norm16 extension plumbing"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-27 14:34:43 +03:00
Samuel Pitoiset
d38425ce87 ac: fix texture query LOD for 1D textures on GFX9
1D textures are allocated as 2D which means we only need
one coordinate for texture query LOD.

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 11:15:35 +02:00
Christian Gmeiner
3e69127939 etnaviv: remove not needed includes
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-04-27 09:04:56 +02:00
Christian Gmeiner
2ba587aac7 etnaviv: remove redundant include
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2018-04-27 09:04:53 +02:00
Timothy Arceri
79b0556f29 glsl: replace some asserts with unreachable when processing the ast
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-04-27 10:18:47 +10:00
Timothy Arceri
410f901bee mesa: drop the buffer mode param from the DrawBuffer driver function
No drivers used it.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-27 10:09:10 +10:00
Anuj Phogat
b695a7bd8e anv/icl: Enable Vulkan on Ice Lake
This patch enables the Vulkan driver on Ice Lake h/w
with added warning about preliminary support.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-26 16:31:27 -07:00
Caio Marcelo de Oliveira Filho
c9bdc7f7e2 anv: enable VK_EXT_shader_viewport_index_layer
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 15:32:05 -07:00
Jason Ekstrand
3db93f9128 anv/allocator: Don't shrink either end of the block pool
Previously, we only tried to ensure that we didn't shrink either end
below what was already handed out.  However, due to the way we handle
relocations with block pools, we can't shrink the back end at all.  It's
probably best to not shrink in either direction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105374
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106147
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2018-04-26 13:17:14 -07:00
Eric Anholt
76ee9edcb4 broadcom/vc5: Add support for centroid varyings.
It would be nice to share the flags packet emit logic with flat shade
flags, but I couldn't come up with a good way while still using our pack
macros.  We need to refactor this to shader record setup at compile time,
anyway.

Fixes ext_framebuffer_multisample-interpolation * centroid-*
2018-04-26 11:30:22 -07:00
Eric Anholt
e2f3317801 broadcom/vc5: Add an assert about GFXH-1559.
Our TF outputs always start at 6 or 7 currently, so we don't hit the
broken 8 case.  Let's make sure that doesn't change somehow.
2018-04-26 11:30:22 -07:00
Eric Anholt
77b4f30bae broadcom/vc5: Add validation that we don't violate GFXH-1633 requirements.
We don't use ldunifa yet, but we will eventually for UBOs.
2018-04-26 11:30:22 -07:00
Eric Anholt
089c32eefd broadcom/vc5: Add validation that we don't violate GFXH-1625 requirements.
We don't use TMUWT yet, but we will once we do SSBOs.
2018-04-26 11:30:22 -07:00
Eric Anholt
57ceb95c84 broadcom/vc5: Implement GFXH-1742 workaround (emit 2 dummy stores on 4.x).
This should fix help with intermittent GPU hangs in tests switching
formats while rendering small frames.  Unfortunately, it didn't help with
the tests I'm having troubles with.
2018-04-26 11:30:22 -07:00
Eric Anholt
dc4cb04ee5 broadcom/vc5: Add QPU validation for register writes after thrend.
The next shader gets to start writing the register file during these
slots, so make sure we don't stomp over them.

The only case of hitting this that I could imagine would be dead writes.
2018-04-26 11:30:22 -07:00
Eric Anholt
8adf813f83 st: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type.
GLES's GL_EXT_texture_type_2_10_10_10_REV allows uploading this type to an
unsized internalformat, and it should be non-color-renderable.
fbobject.c's implementation of the check for color-renderable is checks
that the texture has a 2101010 mesa format, so make sure that we have
chosen a 2101010 format so that check can do what it meant to.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgb on vc5.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-26 11:30:22 -07:00
Charmaine Lee
8aef7fccb7 st/mesa: fix missing setting of _ElementSize in new_draw_rasterpos_stage
With this patch, _ElementSize is initialized along with the rest
of the vertex array attributes in new_draw_rasterpos_stage().
This fixes a crash in st_pipe_vertex_format() when running
topogun-1.06-orc-84k-resize trace file with VMware svga driver.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-26 10:29:02 -07:00
Drew Davenport
e923e8151d st/va: Fix typos
s/attibute/attribute/
s/suface/surface/

v2: rebased(Leo)

Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-04-26 11:16:05 -04:00
Drew Davenport
893808006a st/va: Fix potential buffer overread
VASurfaceAttribExternalBuffers.pitches is indexed by
plane. Current implementation only supports single plane layout.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-04-26 11:16:05 -04:00
Boyuan Zhang
deba56accf radeon/vcn: fix mpeg4 msg buffer settings
Previous bit-fields assignments are incorrect and will result certain mpeg4
decode failed due to wrong flag values. This patch fixes these assignments.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-04-26 11:16:05 -04:00
Ian Romanick
bf5e0276b6 radeon: Drop broken front_buffer_reading/drawing optimization
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-26 09:38:51 -04:00
Ian Romanick
0b3231966f radeon: Use _mesa_is_front_buffer_drawing
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-26 09:38:51 -04:00
Samuel Pitoiset
d7ffe3b384 radv: set ac_surf_info::num_channels correctly
num_channels has been introduced since "ac/surface: don't set
the display flag for obviously unsupported cases".

Based on RadeonSI.

Fixes: e29facff31 ("ac/surface: don't set the display flag for obviously unsupported cases (v2)")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-26 15:34:14 +02:00
Samuel Pitoiset
a6fbefa67b radv: fix DCC enablement since partial MSAA implementation
dcc_msaa_allowed is always false on GFX9+ and only true on VI
if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled
in some situations where it should not.

This is likely going to fix a performance regression.

Fixes: 2f63b3dd09 ("radv: enable DCC for MSAA 2x textures on VI under an option")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-26 15:34:11 +02:00
Karol Herbst
227b1af866 nir/opt_constant_folding: fix folding of 8 and 16 bit ints
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 11:16:15 +02:00
Karol Herbst
14943add44 nir: print 8 and 16 bit constants correctly
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 11:16:15 +02:00
Karol Herbst
543a8c66a7 nir: support converting to 8-bit integers in nir_type_conversion_op
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-26 11:16:15 +02:00
Neil Roberts
c4ab1bdcc9 spirv: Don’t check for NaN for most OpFOrd* comparisons
For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware
should probably already return false if one of the operands is NaN so
we don’t need to have an explicit check for it. This seems to at least
work on Intel hardware. This should reduce the number of instructions
generated for the most common comparisons.

For what it’s worth, the original code to handle this was added in
e062eb6415. The commit message for that says that it was to fix
some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t
handle NaNs this patch shouldn’t affect those tests. At any rate they
have since been moved out of the mustpass list. Incidentally those
tests fail on the nvidia proprietary driver so it doesn’t seem like
handling NaNs correctly is a priority.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-26 10:08:14 +02:00
Matt Atwood
3ba5a646e5 Intel: Add a Kaby Lake PCI ID
v2: Branding changed

Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-04-25 13:31:55 -07:00
Eric Anholt
069c409f43 gallium/util: Fix incorrect refcounting of separate stencil.
The driver may have a reference on the separate stencil buffer for some
reason (like an unflushed job using it), so we can't directly free the
resource and should instead just decrement the refcount that we own.
Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
on vc5.

Fixes: e94eb5e600 ("gallium/util: add u_transfer_helper")
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-04-25 12:14:33 -07:00
Eric Anholt
0d4ce00d70 broadcom/vc5: Fix reloads of separate stencil buffers.
Like for stores, we need to emit a separate load_general packet.
2018-04-25 09:21:54 -07:00
Eric Anholt
9f3f4284c0 broadcom/vc5: Fix cpp of MSAA surfaces on 4.x.
The internal-type-bpp path is for surfaces that get stored in the raw TLB
format.  For 4.x, we're storing MSAA as just 2x width/height at the
original format.
2018-04-25 09:21:54 -07:00
Eric Anholt
ac207acb97 broadcom/vc5: Implement stencil blits using RGBA.
Fixes piglit fbo-depthstencil blit default_fb
2018-04-25 09:21:54 -07:00
Eric Anholt
503716fa86 broadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key. 2018-04-25 09:21:54 -07:00
Eric Anholt
5710532e9e broadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x.
For single-sample we have to always program SAMPLE_0, but for multisample
we want to store all the samples.
2018-04-25 09:21:54 -07:00
Juan A. Suarez Romero
413c5ca372 travis: update libva required version
Commit fa328456e8 added VP9 config support, but this needs a newer
libva version, 1.7.0 or above.

Fixes: fa328456e8 ("st/va: add VP9 config to enable profile2")
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-25 16:09:20 +02:00
Tapani Pälli
7f467d4f73 mesa: GL_EXT_texture_norm16 extension plumbing
Patch enables use of short and unsigned short data for texture uploads,
rendering and reading of framebuffers within the restrictions specified
in GL_EXT_texture_norm16 spec.

Patch also enables those 16bit format layout qualifiers listed in
GL_NV_image_formats that depend on EXT_texture_norm16.

v2: expose extension with dummy_true
    fix layout qualifier map changes (Ilia Mirkin)

v3: use _mesa_has_EXT_texture_norm16, other fixes
    and cleanup (Ilia Mirkin)

v4: fix rest of the issues found

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-25 14:26:20 +03:00
Jordan Justen
b0c5774027 meson: Fix with_intel_vk and with_amd_vk variables
Fixes: 5608d0a2ce "meson: use array type options"
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-24 23:12:42 -07:00
Roland Scheidegger
77554d220d draw: fix different sign logic when clipping
The logic was flawed, since mul(x,y) will be <= 0 (exactly 0) when
the sign is the same but both numbers are sufficiently small
(if the product is smaller than 2^-128).
This could apparently lead to emitting a sufficient amount of
additional bogus vertices to overflow the allocated array for them,
hitting an assertion (still safe with release builds since we just
aborted clipping after the assertion in this case - I'm however unsure
if this is now really no longer possible, so that code stays).
Not sure if the additional vertices could cause other grief, I didn't
see anything wrong even when hitting the assertion.

Essentially, both +-0 are treated as positive (the vertex is considered
to be inside the clip volume for this plane), so integrate the logic
determining different sign into the branch there.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-25 04:50:20 +02:00
Roland Scheidegger
98578df27b draw: simplify clip null tri logic
Simplifies the logic when to emit null tris (albeit the reasons why we
have to do this remain unclear).
This is strictly just logic simplification, the behavior doesn't change
at all.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-25 04:50:20 +02:00
Ilia Mirkin
c17ddcb4b4 nvc0/ir: all short immediates are sign-extended, adjust LIMM test
Some analysis suggests that all short immediates are sign-extended. The
insnCanLoad logic already accounted for this, but we could still pick
the wrong form when emitting actual instructions that support both short
and long immediates (with the long form usually having additional
restrictions that insnCanLoad should be aware of).

This also reverses a bunch of commits that had previously "worked
around" this issue in various emitters:

9c63224540: gm107/ir: make use of ADD32I for all immediates
83a4f28dc2: gm107/ir: make use of LOP32I for all immediates
b84c97587b: gm107/ir: make use of IMUL32I for all immediates
d30768025a: gk110/ir: make use of IMUL32I for all immediates

as well as the original import for UMUL in the nvc0 emitter.

Reported-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-04-24 21:37:44 -04:00
Boyan Ding
6695f9d5c5 mesa: call DrawBufferAllocate driver hook in update_framebuffer for windows-system FB
When draw buffers are changed on a bound framebuffer, DrawBufferAllocate()
hook should be called. However, it is missing in update_framebuffer with
window-system framebuffer, in which FB's draw buffer state should match
context state, potentially resulting in a change.

Note: This is needed because gallium delays creating the front buffer,
      i965 works fine without this change.

V2 (Timothy Arceri):
 - Rebased on merged/simplified DrawBuffer driver function
 - Move DrawBuffer call outside fb->ColorDrawBuffer[0] !=
   ctx->Color.DrawBuffer[0] check to make piglit pass.

v3 (Timothy Arceri):
 - Call new DrawBuffaerAllocate() driver function.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v2)
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99116
2018-04-25 09:08:26 +10:00
Timothy Arceri
6ca09f3a60 st/mesa: add new driver function DrawBufferAllocate
Unlike some of the classic drivers the st was only using DrawBuffer()
to allocated some buffers on-demand. Creating a separate function
will allow us to call it from update_framebuffer() in the following
patch without regressing some of the older classic drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-25 09:08:26 +10:00
Timothy Arceri
2554b8cb00 mesa: some C99 tidy ups for framebuffer.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-04-25 09:08:26 +10:00
Dylan Baker
1d01b52d76 meson: Fix no-rtti in llvm detection
Because I clearly wasn't thinking and clearly didn't do a good job
testing. Sigh

Fixes: c5a97d658e
       ("meson: fix builds against LLVM built without rtti")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-24 15:26:51 -07:00
Dylan Baker
be0a2cfc65 meson: use new warning function
Instead of emulating it with message.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
5608d0a2ce meson: use array type options
This option type is nice since it involves less converting strings into
lists, and because it validates the values that are provided.

v2: - Set with_any_vk to true if any vulkan driver is built (Eric)

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
c5a97d658e meson: fix builds against LLVM built without rtti
Building without rtti is a frought with peril, but it's something that
autotools supports so we need to support it too.

Since we've moved to version 0.44 as a whole we can use the meson
functionality for accessing random llvm-config options we can check for
rtti and add -fno-rtti to all C++ code accordingly.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-24 14:08:15 -07:00
Dylan Baker
595021bf1a meson: remove dummy_cpp
meson has gotten pretty smart about tracking C and C++ dependencies
(internal and external), and using the right linker. This wasn't always
the case and we created empty c++ files to force the use of the c++
linker. We don't need that any more.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
db90c8627c meson: allow empty sources when using link_whole
meson used to get grumpy if the sources list was empty, even when using
--whole-archive (link_whole). In more recent versions that's not true,
so remove the workaround.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
92550d9b16 meson: remove workaround for custom target creating .h and .c files
In more modern versions of meson a custom_target returns an index-able
object. This allows us to create accurate dependency models for targets
that rely only on the header and not on the code from anv_entrypoints.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
5a670d08c0 meson: raise required version to 0.44.1
We have already required 0.44 for building clover and swr, so it was
already partially required. This just makes it required across the board
instead of just for clover and swr.

There is a bug in 0.44 which makes it impossible to build mesa in some
configurations, so require 0.44.1 which fixes this.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
1546f76a39 meson: fix graw-xlib after auxiliary consolidation
This one's completely my fault, I didn't do good enough testing after
rebasing and this got missed.

Fixes: d28c246501
       ("meson: build graw tests")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
c73abb4f82 meson: only build mesa_st tests when build-tests is true
Since we have an option to turn test building on and off, we should
honor that.

Fixes: 34cb4d0ebc
       ("meson: build tests for gallium mesa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Dylan Baker
aaab624245 meson: don't build classic mesa tests without dri_drivers
Since mesa_classic is build-on-demand the tests will create a demand and
add a bunch of extra compilation.

Fixes: 43a6e84927
       ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-24 14:08:15 -07:00
Nanley Chery
0e8b16e0a2 i965/meta_util: Re-enable sRGB-encoded fast-clears on CNL
The paths which sample with the clear color are now using a getter which
performs the sRGB decode needed to enable this fast clear.

This path can be exercised by fast-clearing a texture, then performing
an operation which requires sRGB decoding. Test coverage for this
feature is provided with the following tests:

* Shader texture calls:
  - spec@ext_texture_srgb@tex-srgb

* Shader texelfetch calls:
  - spec@arb_framebuffer_srgb@fbo-fast-clear
  - spec@arb_framebuffer_srgb@msaa-fast-clear

* Blending:
  - spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend

* Blitting:
  - spec@arb_framebuffer_srgb@blit texture srgb msaa enabled clear

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
129ad66dd5 i965/miptree: Extend the sRGB-blending WA to future platforms
The blending issue seems to be present on CNL as well.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
7ea013c6d3 i965: Add and use a getter for the clear color
It returns both the inline clear color and a clear address which points
to the indirect clear color buffer (or NULL if unused/non-existent).
This getter allows CNL to sample from fast-cleared sRGB textures
correctly by doing the needed sRGB-decode on the clear color (inline)
and making the indirect clear color buffer unused.

v2 (Rafael):
* Have a more detailed commit message.
* Add a comment on the sRGB conversion process.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Jason Ekstrand
b55077a8bc util/srgb: Add a float sRGB -> linear helper
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
cd5ce363e3 i965/wm_surface_state: Use the clear address if clear_bo is non-NULL
We want to add and use a getter that turns off the indirect path by
returning zero for the clear color bo and offset.

v2: Fix usage of "clear address" in commit message (Jason).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
af4e9295fe i965: Add and use a single miptree aux_buf field
We want to add and use a function that accesses the auxiliary buffer's
clear_color_bo and doesn't care if it has an MCS or HiZ buffer
specifically.

v2 (Jason Ekstrand):
* Drop intel_miptree_get_aux_buffer().
* Mention CCS in the aux_buf field.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 13:41:14 -07:00
Nanley Chery
5503b65103 i965: Add and use a getter for the miptree aux buffer
Make the next patch easier to read by eliminating most of the would-be
duplicate field accesses now.

v2: Update the HiZ comment instead of deleting it (Rafael).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-04-24 13:41:14 -07:00
Karol Herbst
e4f675dc42 gm107/ir/lib: fix sched in div u32 builtin
Imad needs to set a read barrier.

With significant big work groups I was getting wrong results for div u32. Turns
out the issue was with the sched opcodes.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-24 22:31:59 +02:00
Ian Romanick
0d5ce25c1c intel/compiler: Add scheduler deps for instructions that implicitly read g0
Otherwise the scheduler can move the writes after the reads.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95009
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95012
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Cc: Clayton A Craft <clayton.a.craft@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2018-04-24 14:31:21 -04:00
Ian Romanick
cd32a4e5f4 intel/compiler: Silence unused parameter warnings in empty vec4_instruction_scheduler methods
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::count_reads_remaining(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:764:72: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::count_reads_remaining(backend_instruction *be)
                                                                        ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::setup_liveness(cfg_t*)’:
src/intel/compiler/brw_schedule_instructions.cpp:769:51: warning: unused parameter ‘cfg’ [-Wunused-parameter]
 vec4_instruction_scheduler::setup_liveness(cfg_t *cfg)
                                                   ^~~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::update_register_pressure(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:774:75: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::update_register_pressure(backend_instruction *be)
                                                                           ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:779:80: warning: unused parameter ‘be’ [-Wunused-parameter]
 vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction *be)
                                                                                ^~
src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::issue_time(backend_instruction*)’:
src/intel/compiler/brw_schedule_instructions.cpp:1550:61: warning: unused parameter ‘inst’ [-Wunused-parameter]
 vec4_instruction_scheduler::issue_time(backend_instruction *inst)
                                                             ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 14:31:21 -04:00
Ian Romanick
bdb15c2344 intel/compiler: Silence unused parameter warning in compile_cs_to_nir
src/intel/compiler/brw_fs.cpp: In function ‘nir_shader* compile_cs_to_nir(const brw_compiler*, void*, const brw_cs_prog_key*, brw_cs_prog_data*, const nir_shader*, unsigned int)’:
src/intel/compiler/brw_fs.cpp:7205:44: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
                   struct brw_cs_prog_data *prog_data,
                                            ^~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 14:31:21 -04:00
Ian Romanick
d84b2ed1d7 intel/compiler: Silence unused parameter warnings in generate_foo methods
Since all of the fs_generator::generate_foo methods take a fs_inst * as
the first parameter, just remove the name to quiet the compiler.

src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_barrier(fs_inst*, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:743:41: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_barrier(fs_inst *inst, struct brw_reg src)
                                         ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_discard_jump(fs_inst*)’:
src/intel/compiler/brw_fs_generator.cpp:1326:46: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_discard_jump(fs_inst *inst)
                                              ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_pack_half_2x16_split(fs_inst*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:1675:54: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_pack_half_2x16_split(fs_inst *inst,
                                                      ^~~~
src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_shader_time_add(fs_inst*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_fs_generator.cpp:1743:49: warning: unused parameter ‘inst’ [-Wunused-parameter]
 fs_generator::generate_shader_time_add(fs_inst *inst,
                                                 ^~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_set_simd4x2_header_gen9(brw_codegen*, brw::vec4_instruction*, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1412:52: warning: unused parameter ‘inst’ [-Wunused-parameter]
                                  vec4_instruction *inst,
                                                    ^~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_mov_indirect(brw_codegen*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1430:41: warning: unused parameter ‘inst’ [-Wunused-parameter]
                       vec4_instruction *inst,
                                         ^~~~
src/intel/compiler/brw_vec4_generator.cpp:1432:63: warning: unused parameter ‘length’ [-Wunused-parameter]
                       struct brw_reg indirect, struct brw_reg length)
                                                               ^~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-24 14:31:21 -04:00
Eric Anholt
3d21fc193e broadcom/vc5: Set up internal_format for imported resources.
Without this, we'd assertion fail in u_transfer_helper when mapping an
imported resource.
2018-04-24 10:37:29 -07:00
Eric Anholt
f08f477a93 broadcom/vc5: Assert that created BOs have offset != 0.
The kernel shouldn't return a bo at NULL, and the HW special-cases NULL
address values for things like OQs.
2018-04-24 10:37:29 -07:00
Eric Anholt
482f2e24b5 broadcom/vc5: Don't allocate simulator BOs at offset 0.
The kernel won't return us BOs at offset 0 (because things like OQs
wouldn't work there), so we shouldn't in the simulator either.
2018-04-24 10:37:29 -07:00
Eric Anholt
82cdb801fd broadcom/vc5: Add sim support for the GET_BO_OFFSET ioctl.
Otherwise we'd crash immediately upon importing a BO through EGL
interfaces.
2018-04-24 10:37:29 -07:00
Eric Anholt
3cdd055ed2 broadcom/vc5: Treat imports of DRM_FORMAT_MOD_INVALID BOs as linear.
We don't have any kernel metadata about BO tiling, so this probably is all
we should do for the moment.
2018-04-24 10:37:29 -07:00
Tapani Pälli
c2e159d050 i965: expose MESA_FORMAT_R8G8B8A8_SRGB visual
Exposing the visual makes following dEQP tests pass on Android:

   dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb
   dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb

Visual is exposed only when DRI_LOADER_CAP_RGBA_ORDERING is set.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-24 14:55:18 +03:00
Tapani Pälli
fa4d4d97f3 dri: Add __DRI_IMAGE_FORMAT_SABGR8
Add format definition and required plumbing to create images.
Note that there is no match to drm_fourcc definition, just like
with existing _DRI_IMAGE_FOURCC_SARGB8888.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-24 14:55:18 +03:00
Marek Olšák
4559aefb5c Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"
This reverts commit dab02dea34.

It causes crashes of qtcreator and firefox.

Fixes: dab02de "st/dri: Fix dangling pointer to a destroyed dri_drawable"

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-04-24 00:00:20 -04:00
Roland Scheidegger
e8e1d287a3 gallivm: dump bitcode before optimization
If we dump the bitcode for off-line debug purposes, we really want the
pre-optimized bitcode, otherwise it's useless in identifying problems
with IR optimization (if you have a shader which takes an hour to do
IR optimization, it's also nice you don't have to wait that hour...).
Also, print out the function passes for opt which correspond to what
was used for jit compilation (and also the opt level for codegen).
Using opt/llc this way should then pretty much mimic what was done
for jit. (When specifying something like -time-passes
-debug-pass=[Structure|Arguments] (for either opt or llc) that also
gives very useful information in which passes all the time was spent,
and which passes are really run along with the order - llvm will add
passes due to dependencies on its own, and of course -O2 for llc
comes with a ~100 pass list.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-24 04:49:39 +02:00
Roland Scheidegger
e89cf59c27 gallivm: (trivial) do division by 1000 with int64
Conversion to int can otherwise overflow if compile times are over
~71min. (Yes this can happen...)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-24 04:49:39 +02:00
Roland Scheidegger
45b8f620a5 gallivm: remove LICM pass
LICM is simply too expensive, even though it presumably can help quite
a bit in some cases.
It was definitely cheaper in llvm 3.3, though as far as I can tell with
llvm 3.3 it failed to do anything in most cases. early-cse also actually
seems to cause licm to be able to move things when it previously couldn't,
which causes noticeable compile time increases.
There's more loop passes in llvm, but I'm not sure which ones are helpful,
and I couldn't find anything which would roughly do what the old licm in
llvm 3.3 did, so ditch it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-24 04:49:39 +02:00
Roland Scheidegger
8b9ab674b9 gallivm: add early cse pass
This pass is quite cheap, and can simplify the IR quite a bit for our
generated IR.
In particular on a variety of shaders I've found the time saved by
other passes due to the simplified IR more than makes up for the cost
of this pass, and on top of that the end result is actually better.
The only downside I've found is this enables the LICM pass to move some
things out of the main shader loop (in the case I've seen, instanced
vertex fetch (which is constant within the jit shader) plus the derived
instructions in the shader) which it couldn't do before for some reason.
This would actually be desirable but can increase compile time
considerably (licm seems to have considerable cost when it actually can
move things out of loops, due to alias analysis). But blaming early cse
for this seems inappropriate. (Note that the first two sroa / earlycse
passes are similar to what a standard llvm opt -O1/-O2 pipeline would
do, albeit this has some more passes even before but I don't think
they'd do much for us.)
It also in particular helps some crazy shader used for driver
verification (don't ask...) a lot (about factor of 6 faster in compile
time) (due to simplfiying the ir before LICM is run).
While here, also move licm behind simplifycfg. For some shaders there
seems to be very significant compile time gains (we've seen a factor
of 10000 albeit that was a really crazy shader you'd certainly never
see in a real app), beause LICM is quite expensive and there's cases
where running simplifycfg (along with sroa and early-cse) before licm
reduces IR complexity significantly. (I'm not entirely sure if it would
make sense to also run it afterwards.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-04-24 04:49:39 +02:00
Vlad Golovkin
1ff1dc1c63 glsl/glcpp: Handle hex constants with 0X prefix
GLSL 4.6 spec describes hex constant as:

hexadecimal-constant:
    0x hexadecimal-digit
    0X hexadecimal-digit
    hexadecimal-constant hexadecimal-digit

Right now if you have a shader with the following structure:

    #if 0X1 // or any hex number with the 0X prefix
    // some code
    #endif

the code between #if and #endif gets removed because the checking is performed
only for "0x" prefix which results in strtoll being called with the base 8 and
after encountering the 'X' char the strtoll returns 0. Letting strtoll detect
the base makes this limitation go away and also makes code easier to read.

From the strtoll Linux man page:

"If base is zero or 16, the string may then include a "0x" prefix, and the
number will be read in base 16; otherwise, a zero base is taken as 10 (decimal)
unless the next character is '0', in which case it is taken as 8 (octal)."

This matches the behaviour in the GLSL spec.

This patch also adds a test for uppercase hex prefix.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-24 09:55:05 +10:00
Timothy Arceri
295f57e09a mesa: rename api_validate.{c,h} -> draw_validate.{c,h}
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65422
2018-04-24 09:23:30 +10:00
Dave Airlie
a90c9f33cf ac/radv/radeonsi: refactor harvest config register getters.
This refactors the code out to share it between radv and radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-24 09:08:34 +10:00
Dave Airlie
8e4d54505a radv: only set raster_config_1 outside the index registers.
This follows what radeonsi does.

Ported from radeonsi:
    radeonsi: emit PA_SC_RASTER_CONFIG_1 only once

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-24 09:08:34 +10:00
Dave Airlie
f77caa7411 ac/radv/radeonsi: refactor max simd waves into common code.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:08:33 +10:00
Dave Airlie
899df55ee0 ac/radv/radeonsi: refactor raster_config default values getters.
This just makes this common code between the two drivers.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:07:51 +10:00
Dave Airlie
8de7ff91be radeonsi: use common gs_table_depth code
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:05:43 +10:00
Dave Airlie
9afe9c0fe2 radv: use common gs_table_depth code.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:05:43 +10:00
Dave Airlie
5e2ef28390 ac/info: move gs table depth to common code.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:05:38 +10:00
Dave Airlie
b25f6cde89 radeonsi: don't runtime check gs table info
We can just unreachable here, this aligns with radv code, makes
it easier to move to common code.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-24 09:05:29 +10:00
Dave Airlie
40783a7fa3 radv/gfx9: don't use gs_table_depth on gfx9.
Missed this on initial radeonsi port, we shouldn't use this value
on gfx9, but also in gfx8 only for when we have a geom shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-04-24 09:04:42 +10:00
Jason Ekstrand
de1f22d595 i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*
They are send messages and this makes size_read() and mlen agree.  For
both of these opcodes, the payload is just a dummy so mlen == 1 and this
should decrease register pressure a bit.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
2018-04-23 14:04:42 -07:00
Samuel Pitoiset
d136a5fad9 ac: fix the number of coordinates for ac_image_get_lod and arrays
This fixes crashes for the following CTS:
dEQP-VK.glsl.texture_functions.query.texturequerylod.*

Cubemaps are the same as 2D arrays.

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 21:48:38 +02:00
Lionel Landwerlin
2964e16e51 i965: perf: enable GPA query statistics
The combinaison of GPA/MDAPI components expects a particular name &
layout for their pipeline statistics query.

v2: Limit the query GPA/MDAPI statistics to gen7->9 (Lionel)

v3: Add curly braces (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Lionel Landwerlin
2e3025c817 i965: perf: add support for raw queries
The INTEL_performance_query extension provides a list of queries that
a user can select to monitor a particular workload. Each query reports
different sets of counters (roughly looking at different parts of the
hardware, i.e. caches/fixed functions/etc...).

Each query has an associated configuration that we need to program
into the hardware before using the query. Up to now, we provided
predefined queries. This change allows the user to build its own query
(and associated configuration) externally, and have the i965 driver
use that configuration through a new query named :

   Intel_Raw_Hardware_Counters_Set_0_Query

When this query is selected, the i965 driver will report raw counters
deltas (meaning their values need to be interpreted by the user, as
opposed to existing queries that provide human readable values).

This change is also useful for debug purposes for building new
pre-defined queries and verifying the underlying numbers make sense
before writing equations for user readable output.

This change's purpose is also to enable GPA. GPA uses a library called
MDAPI that processes raw counter data. MDAPI expects raw data to have
a certain layout (per generation which is a bit unfortunate...). This
change also embeds the expected data layouts.

v2: Enable raw queries on gen 7->11, v1 had 7->9 (Lionel)

v3: Don't assert on cherryview for gen7... (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Lionel Landwerlin
c61d445a5a i965: perf: read slice/unslice frequencies from OA reports
v2: Add comment breaking down where the frequency values come from (Ken)

v3: More documentation (Ken/Lionel)
    Adjust clock ratio multiplier to reflect the divider's behavior (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Lionel Landwerlin
43fcb72d2c i965: perf: snapshot RPSTAT register
This register contains the current/previous frequency of the GT, it's
one of the value GPA would like to have as part of their queries.

v2: Don't use this register on baytrail/cherryview (Ken)
    Use GET_FIELD() macro (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Lionel Landwerlin
d71b442416 i965: perf: extract utility functions
We would like to reuse a number of the functions and structures in
another file in a future commit.

We also move the previous content of brw_performance_query.h into
brw_performance_query_metrics.h to be included by generated metrics
files.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-23 18:30:10 +01:00
Samuel Pitoiset
e37e643589 ac: teach get_ac_sampler_dim() about subpass attachments
Suggested by Nicolai.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 19:10:56 +02:00
Samuel Pitoiset
84fef802fb ac/nir: add missing round_slice for 1D arrays
This fixes a bunch of CTS fails with 1D arrays:

dEQP-VK.glsl.texture_functions.texture*.sampler1darray_*

Fixes: 625dcbbc45 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 19:10:52 +02:00
Dylan Baker
10e4290524 bin/install_megadrivers: rename a few variables to make things clearer
Originally the "each" variable was just a part of the "drivers"
variable. It's not anymore so it's a bit ambiguous.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-23 09:57:35 -07:00
Dylan Baker
ae3f45c11e bin/install_megadrivers: fix DESTDIR and -D*-path
This fixes -Ddri-drivers-path, -Dvdpau-libs-path, etc. with DESTDIR when
those paths are absolute. Currently due to the way python's os.path.join
handles absolute paths these will ignore DESTDIR, which is bad. This
fixes them to be relative to DESTDIR if that is set.

Fixes: 3218056e0e
       ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-23 09:57:35 -07:00
Dylan Baker
dbf5b772b3 compiler/glsl: close fd's in glcpp_test.py
I would have thought falling out of scope would allow the gc to collect
these, but apparently it doesn't, and this hits an fd limit on macos.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106133
Fixes: db8cd8e367
       ("glcpp/tests: Convert shell scripts to a python script")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2018-04-23 09:55:17 -07:00
Bas Nieuwenhuizen
0e945fdf23 nir: Do not use progress for unreachable code in return lowering.
We seem to use progress for two cases:
1) When we lowered some returns.
2) When we remove unreachable code.

If just case 2 happens we assert as state->return_flag has not
been allocated yet, but we are still trying to do insert all
predicates based on it.

This splits the concerns. We only use progress internally for case 1
and then keep track of 2 in a separate variable to indicate progress
in the return value of the pass.

This is slightly better than transforming the assert into
if (!state->return_flag) return, as the solution in this patch avoids
inserting predicates even if some other part of the might need them.

Fixes: 6e22ad6edc "nir: return early when lowering a return at the end of a function"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106174
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-23 16:55:15 +02:00
Józef Kucia
8328c64eb1 radv: advertise 8 bits of subpixel precision for viewports
This is what radeonsi does.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-23 11:16:11 +02:00
Johan Klokkhammer Helsing
dab02dea34 st/dri: Fix dangling pointer to a destroyed dri_drawable
If an EGLSurface is created, made current and destroyed, and then a second
EGLSurface is created. Then the second malloc in driCreateNewDrawable may
return the same pointer address the first surface's drawable had.
Consequently, when dri_make_current later tries to determine if it should
update the texture_stamp it compares the surface's drawable pointer against
the drawable in the last call to dri_make_current and assumes it's the same
surface (which it isn't).

When texture_stamp is left unset, then dri_st_framebuffer_validate thinks
it has already called update_drawable_info for that drawable, leaving it
unvalidated and this is when bad things starts to happen. In my case it
manifested itself by the width and height of the surface being unset.

This is fixed this by setting the pointer to NULL before freeing the
surface.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106126
Signed-off-by: Johan Klokkhammer Helsing <johan.helsing@qt.io>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
2018-04-23 04:25:40 -04:00
Ilia Mirkin
5428066f5e nv50/ir: make a copy of tex src if it's referenced multiple times
For nv50 we coalesce the srcs and defs into a single node. As such, we
can end up with impossible constraints if the source is referenced
after the tex operation (which, due to the coalescing of values, will
have overwritten it).

This logic already exists for inserting moves for MERGE/UNION sources.
It's the exact same idea here, so leverage that code, which also
includes a few optimizations around not extending live ranges
unnecessarily.

Fixes tests/spec/glsl-1.30/execution/fs-textureSize-components.shader_test

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-22 23:03:16 -04:00
Lepton Wu
6c5abb68c7 virgl: disable virgl when no 3D for virtio gpu.
If users are running mesa under old version of qemu or have turned off
GL at runtime, virtio gpu driver actually doesn't work. Adds a detection
here so mesa can fall back to software rendering.

v2:
 - move detection from loader to virgl (Ilia, Emil)

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-04-23 12:35:29 +10:00
Dave Airlie
a8420e2530 radv: mark const structs as extern in header file to avoid lto damage
The copr repo from che was using LTO and he reported radv broke
recently with it. When testing with lto builds here I noticed
that we weren't seeing any instance extensions reported.

It appears LTO was treating the const without extern as an empty
struct, this is possibly a gcc bug, but we can work around it
just by marking these with extern.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-04-23 05:55:22 +10:00
Dylan Baker
f8c4716854 Bump version after 18.1
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-22 09:35:56 -07:00
Ilia Mirkin
3f1cad48b8 gallium/tests/trivial: fix viewport depth transform
These were getting mapped off into outer space, which would cause nv50
and nvc0 to clip the primitives (as depth_clip was enabled).

These drivers are configured to clip everything outside the [0, 1]
range, even though the hardware supports other view settings.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-04-21 23:31:48 -04:00
Ilia Mirkin
fe8b6d7e1f trace: allow image resource to be null
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-21 23:29:39 -04:00
Karol Herbst
63572091b5 nv50/ir/ra: prefer def == src2 for fma with immediates on nvc0
This helps with the PostRALoadPropagation pass moving long immediates into
FMA/MAD instructions.

changes in shader-db:
total instructions in shared programs : 5894114 -> 5886074 (-0.14%)
total gprs used in shared programs    : 666558 -> 666563 (0.00%)
total shared used in shared programs  : 520416 -> 520416 (0.00%)
total local used in shared programs   : 53524 -> 53524 (0.00%)
total bytes used in shared programs   : 54006744 -> 53932472 (-0.14%)

                local     shared        gpr       inst      bytes
    helped           0           0           2        4192        4192
      hurt           0           0           7           9           9

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
[imirkin: minor edits to separate nv50 and nvc0+ cases]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-21 10:53:59 -04:00
Rhys Perry
cc35b76e99 docs/features: mark GL_ARB_post_depth_coverage as DONE for nvc0
This was done a while ago but never marked on features.txt. Note that
this is only supported on GM200+.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-21 10:02:55 -04:00
Dylan Baker
6754c2e83d autotools: Include new meson files
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:56 -07:00
Dylan Baker
5c8e2501a6 autotools: Add passes.h to sources so it will be included in the tarball
This was introduced in commit 8f848ada8a
but not added to the sources list, which is necessary for it to be
included in release tarballs.

Fixes: 8f848ada8a
       ("swr/rast: Start refactoring of builder/packetizer.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:54 -07:00
Dylan Baker
cfd7d2ba0d autotools: include include/vulkan headers
This is needed to provide vk_android_native_buffer.h for vk_enum_to_str.

v2: - remove accidentally included changes

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-20 20:26:49 -07:00
Rhys Perry
a0e57432b7 nvc0: fix line width on GM20x+
This has the side-effect of fixing polygon-offset piglit test failures.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-20 20:43:59 -04:00
Nanley Chery
7b20329107 i965/miptree: Delete an unused function
We're going to combine ::mcs_buf and ::hiz_buf in later commits. Once
that happens, this function no longer make sense.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-20 17:14:37 -07:00
Nanley Chery
010abacc95 i965/miptree: Don't leak the clear_color_bo
Free the clear_color_bo in addition to freeing the
intel_miptree_aux_buffer which holds the reference to it.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-20 17:14:37 -07:00
Jason Ekstrand
9d2ef3c9ec i965/blorp: Do the gen11 BTI flush
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 16:30:14 -07:00
Jason Ekstrand
185630c6bc anv/blorp: Do the gen11 BTI flush
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 16:30:14 -07:00
Lucas Stach
52e93e309f etnaviv: fix texture_format_needs_swiz
memcmp returns 0 when both swizzles are the same, which means we don't
need any hardware swizzling. texture_format_needs_swiz should return
true when the return value of the memcmp is non-zero.

Fixes: 751ae6afbe ("etnaviv: add support for swizzled texture formats")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2018-04-20 18:54:10 +02:00
Samuel Pitoiset
8f13975713 ac/nir: fix image dimension for subpass attachments
For subpass attachments we need one more coordinate with
the layer, so make them array types.

This fixes a bunch of CTS fails with RADV.

Fixes: 24fb3e6aa1 ("ac/nir: use ac_build_image_opcode for image intrinsics")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 18:44:51 +02:00
Bas Nieuwenhuizen
e1df849c3c radv: Mark GTT memory as device local for APUs.
Otherwise a lot of games complain about not having enough memory,
and it is sort of local so this seems reasonable to me.

CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-20 18:16:16 +02:00
Samuel Pitoiset
fedd0a4215 radv/winsys: allow to submit up to 4 IBs for chips without chaining
The SI family doesn't support chaining which means the maximum
size in dwords per CS is limited. When that limit was reached
we failed to submit the CS and the application crashed.

This patch allows to submit up to 4 IBs which is currently the
limit, but recent amdgpu supports more than that.

Please note that we can reach the limit of 4 IBs per submit
but currently we can't improve that. The only solution is to
upgrade libdrm. That will be improved later but for now this
should fix crashes on SI or when using RADV_DEBUG=noibs.

Fixes: 36cb5508e8 ("radv/winsys: Fail early on overgrown cs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 18:12:26 +02:00
Stefan Schake
ff904978a1 gallium/util: Android backtrace support
We can't use any of the existing implementations in u_debug_stack.
Android technically has libunwind, but it's been modified to the point
where it no longer compiles with the Mesa usage. The library is also
not meant to be referenced by vendor libraries. The officially sanctioned
way of obtaining backtraces is through the Android own libbacktrace, a
C++ library. Access it through a separate C++ source file on Android only.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-20 18:49:49 +03:00
Stefan Schake
2abd4f4b49 gallium/util: Don't stub u_debug_stack on Android
The fallback path for no libunwind ends up being stubs for Android.
Don't compile them in so we can provide our own implementation.

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-20 18:49:37 +03:00
Samuel Pitoiset
dd069e9b41 ac/nir: handle nir_intrinsic_load_first_vertex like base_vertex
This fixes a ton of CTS crashes.

Fixes: c366f422f0 ("nir: Offset vertex_id by first_vertex instead of base_vertex")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 17:07:38 +02:00
Samuel Pitoiset
b21a4efb55 radv/winsys: allow local BOs on APUs
Ported from RadeonSI.

Local BOs ignore BO priorities, and we don't need those on APUs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:24 +02:00
Samuel Pitoiset
5c1233ed62 radv: use a global BO list only for VK_EXT_descriptor_indexing
Maintaining two different paths is annoying but this gets
rid of the performance regression introduced by the global
BO list.

We might find a better solution in the future, but for now
just keeps two paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:18 +02:00
Samuel Pitoiset
7bd5367546 Revert "radv: Don't store buffer references in the descriptor set."
In order to reduce a performance regression introduced by
4b13fe55a4 ("radv: Keep a global BO list for VkMemory."),
we are going to maintain two different paths.

One when VK_EXT_descriptor_indexing is enabled by the
application because we need to have a global BO list, and
one (the old one) when it's not enabled.

With Talos on Polaris, the global BO list reduces performance
by 10% which is too much for me.

This reverts commit ab6cadd3ec.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-20 16:18:13 +02:00
Jose Maria Casanova Crespo
eb96bd57c7 i965/fs: retype offset_reg to UD at load_ssbo
All operations with offset_reg at do_vector_read are done
with UD type. So copy propagation was not working through
the generated MOVs:

mov(8) vgrf9:UD, vgrf7:D

This change allows removing the MOV generated for reading the
first components for 16-bit and 64-bit ssbo reads with
non-constant offsets.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-20 13:30:12 +02:00
Nicolai Hähnle
24fb3e6aa1 ac/nir: use ac_build_image_opcode for image intrinsics
So that we'll use the dimension-aware intrinsics in the future.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:30:07 +02:00
Nicolai Hähnle
74063431f1 radeonsi: generate image load/store/atomic ops using ac_build_image_opcode
In preparation of dimension-aware LLVM image intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:29:57 +02:00
Nicolai Hähnle
625dcbbc45 amd/common: pass address components individually to ac_build_image_intrinsic
This is in preparation for the new image intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:52 +02:00
Nicolai Hähnle
f931583828 amd/common: pass new enum ac_image_dim to ac_build_image_opcode
This is in preparation for the new, dimension-aware LLVM image
intrinsics.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-20 09:23:40 +02:00
Nicolai Hähnle
9cb52d470a radeonsi/nir: fix crash in test involving the sample mask
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:50 +02:00
Nicolai Hähnle
552bc37c6f radeonsi/nir: set FS properties only when scanning a fragment shader
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:47 +02:00
Nicolai Hähnle
a807a9b215 ac/nir: fix atomic compare-and-swap
The LLVM instruction returns { i32, i1 }, where the i1 indicates success.
We're only interested in the first part, which is the loaded value.

Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.*

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:40 +02:00
Nicolai Hähnle
e788b987d8 radeonsi: fix error paths of si_texture_transfer_map
trans is zero-initialized, but trans->resource is setup immediately so
needs to be dereferenced.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:33 +02:00
Nicolai Hähnle
68ee1d5796 glsl: prevent spurious Valgrind errors when serializing NIR
It looks as if the structure fields array is fully initialized below,
but in fact at least gcc in debug builds will not actually overwrite
the unused bits of bit fields.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-20 09:21:23 +02:00
Aaron Watry
354b12681b clover: Fix host access validation for sub-buffer creation
From CL 1.2 Section 5.2.1:
    CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and
    flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with
    CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if
    buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify
    CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY .

Fixes CL 1.2 CTS test/api get_buffer_info

v2: Correct host_access_flags check (Francisco)

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-04-19 20:57:37 -05:00
Neil Roberts
c366f422f0 nir: Offset vertex_id by first_vertex instead of base_vertex
base_vertex will be zero for non-indexed calls and in that case we
need vertex_id to be offset by the ‘first’ parameter instead. That is
what we get with first_vertex. This is true for both GL and Vulkan.

The freedreno driver is also setting vertex_id_zero_based on
nir_options. In order to avoid breakage this patch switches the
relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can
retain the same behavior.

v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from
SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-19 15:57:45 -07:00
Neil Roberts
c4f30a9100 spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX
The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.

v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used
for SpvBuiltInBaseVertex.  Suggested by Jason.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-19 15:57:45 -07:00
Antia Puentes
c32e1035cb intel: Handle firstvertex in an identical way to BaseVertex
Until we set gl_BaseVertex to zero for non-indexed draw calls
both have an identical value.

The Vertex Elements are kept like that:
* VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID>
* VE 2: <Draw ID, 0, 0, 0>

v2 (idr): Mark nir_intrinsic_load_first_vertex as "unreachable" in
emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic.
2018-04-19 15:57:45 -07:00
Neil Roberts
0c8395e15d intel/compiler: Add a uses_firstvertex flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-19 15:57:45 -07:00
Antia Puentes
5ff848df7b compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics
This VS system value will contain the value passed as <basevertex> for
indexed draw calls or the value passed as <first> for non-indexed draw
calls. It can be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.

From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays":

-  Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance
is referred to as its vertex ID, and may be read by a vertex shader as
gl_VertexID.  The vertex ID of the ith element transferred is first +
i."

- Page 355:
"The index of any element transferred to the GL by
DrawElementsOneInstance is referred to as its vertex ID, and may be read
by a vertex shader as gl_VertexID.  The vertex ID of the ith element
transferred is the sum of basevertex and the value stored in the
currently bound element array buffer at offset indices + i."

Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but
this will have to change when the value of gl_BaseVertex is
fixed. Currently its value is broken for non-indexed draw calls because
it must be zero but we are setting it to <first>.

v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).

v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be
generated.  Reformat commit message to 72 columns.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-04-19 15:57:45 -07:00
Mike Lothian
051fddb4a9 meson: Build st_tests_common with gtest
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106131
Fixes: 34cb4d0ebc ("meson: build tests for gallium mesa state tracker")
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-19 09:04:51 -07:00
Bas Nieuwenhuizen
dffdef6737 radv: Add Vega M support.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-19 16:36:21 +02:00
Bas Nieuwenhuizen
d1ce31d36c radv: Add bound checking workaround for dynamic buffers.
I have seen a few applications and games do the dynamic buffer bounds incorrectly, this
make it easier to work around, e.g. for debugging.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-19 16:13:25 +02:00
Thomas Hellstrom
e0c08183fb svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace
When advertizing this extension, egl_dri2 uses the DRI2_RENDERER_QUERY
extension to query whether an sRGB format is supported. That extension will
query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than
PIPE_BIND_DISPLAY_TARGET which is used when building the configs.
We only return the correct value for PIPE_BIND_DISPLAY_TARGET.

The inconsistency causes EGL to crash at surface initialization if sRGB is
not supported. Fix this by supporting both bind flags.

Testing done:
piglit egl_gl_colorspace srgb

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-19 13:42:51 +02:00
Mike Lothian
79487c427e swr: Fix include for createPromoteMemoryToRegisterPass
Include llvm/Transforms/Utils.h with the newest LLVM 7

v2: Include with " " rather than < > (Vinson Lee)

v3: Use LLVM_VERSION_MAJOR rather than HAVE_LLVM (George Kyriazis)

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-04-19 00:39:04 -07:00
Samuel Pitoiset
2f63b3dd09 radv: enable DCC for MSAA 2x textures on VI under an option
This can be enabled with RADV_PERFTEST=dccmsaa.

DCC for MSAA textures is actually not as easy to implement. It
looks like there is some corner cases. I will improve support
incrementally.

Vega support, as well as Polaris improvements, will be added later.

No CTS changes on Polaris using RADV_DEBUG=zerovram and
RADV_PERFTEST=dccmsaa.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:55 +02:00
Samuel Pitoiset
dc3d39771f radv: decompress DCC for multisampled source images before resolving
Multisampled source images (ie. color attachments) can be now
DCC compressed, so the driver needs to perform a DCC decompression
pass before resolving

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:52 +02:00
Samuel Pitoiset
1aefb62f1e radv: add a workaround for fast clears with DCC and MSAA textures
This should be fixed at some point in order to improve
performance.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:50 +02:00
Samuel Pitoiset
373fa0b599 radv: allocate CMASK for DCC fast clear with MSAA
CMASK is required because it should be cleared to
0xCCCCCCCC for MSAA textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:48 +02:00
Samuel Pitoiset
255506c4e0 radv: implement fast color clear for DCC with MSAA
When DCC is enabled with MSAA textures, CMASK should be
cleared to 0xCCCCCCCC.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:10:45 +02:00
Samuel Pitoiset
796b6f4aab radv: make sure to sync after resolving using the compute path
This fixes some random CTS failures:

dEQP-VK.renderpass.multisample.*.

Performing a fast-clear eliminate is still useless, but it
seems that we need to sync.

Found while running CTS with RADV_DEBUG=zerovram.

Fixes: 56a171a499 ("radv: don't fast-clear eliminate after resolving a subpass with compute")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:09:55 +02:00
Samuel Pitoiset
4a698660ae radv: dump the SHA1 of SPIRV in the hang report
Might be useful for debugging purposes, especially when we
want to replace a shader on the fly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-19 09:09:52 +02:00
Bas Nieuwenhuizen
0e10790558 radv: Enable VK_EXT_descriptor_indexing.
This adds everything except non-uniform indexing, which needs a bit
more work and testing.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
5f7ebb5206 spirv: Add support for runtime descriptor array cap.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
c48feaf2d1 spirv: Add support for VK_EXT_descriptor_indexing uniform indexing caps.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
b5e04e9217 radv: Support allocating variable size descriptor sets.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
78c54acbe8 radv: Add support for variable descriptor set layouts.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
082c11e8a5 radv: Fix GetDescriptorSetLayoutSupport.
The continue means we do alignment differently than during creation,
making the buffer smaller than expected.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
d02bbde1a8 radv: Use sorted bindings for set layout creation.
Previously we did not care about havin the set storage in order,
but for variable descriptor count we want the highest binding
at the end of the storage.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
ab6cadd3ec radv: Don't store buffer references in the descriptor set.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
4b13fe55a4 radv: Keep a global BO list for VkMemory.
With update after bind we can't attach bo's to the command buffer
from the descriptor set anymore, so we have to have a global BO
list.

I am somewhat surprised this works really well even though we have
implicit synchronization in the WSI based on the bo list associations
and with the new behavior every command buffer is associated with
every swapchain image. But I could not find slowdowns in games because
of it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen
22d6b89e39 spirv: Update spirv.h to 12f8de9f04327336b699b1b80aa390ae7f9ddbf4
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-18 22:56:54 +02:00
Kenneth Graunke
da25ae92be i965: Fix shadow batches to be the same size as the real BO.
brw_bo_alloc may round up our allocation size to the next bucket size.
In this case, we would malloc a shadow buffer that was the original
intended size, but use bo->size (the larger size) for all of our checks.

This could cause us to run off the end of the shadow buffer.

v2: Actually use the new BO size (caught by Lionel)

Reported-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c7dcee58b5 (i965: Avoid problems from referencing orphaned BOs after growing.)
2018-04-18 13:55:08 -07:00
Marek Olšák
7bd24d951a glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract
This fixes some piglits.

Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 15:34:52 -04:00
Leo Liu
90de03708f radeon/vce: disable vce dual pipe on VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:35 -04:00
Marek Olšák
c6f1d36019 radeonsi: add support for VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:33 -04:00
Marek Olšák
d6a66bc8db amd/addrlib: add support for VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:45:32 -04:00
Marek Olšák
d15fb766aa radeonsi/gfx9: fix a hang with an empty first IB
This packet causes the no-op IB detection to fail, so the IB is always
submitted. Also fix the no-op IB detection by moving the begin call.

Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-18 14:42:06 -04:00
Dylan Baker
d28c246501 meson: build graw tests
This only enables the null and xlib target, so no windows support yet.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
34cb4d0ebc meson: build tests for gallium mesa state tracker
v2: - Fix typo

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
de01018293 meson: build gallium unit tests
v2: - gate unit tests on swrast being enabled (Eric A)
v3: - rebase on libtrace being merged with gallium auxiliary

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
2018-04-18 09:03:57 -07:00
Dylan Baker
4c794c7834 meson: Build gallium trivial tests
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
7fee8fed16 meson: Remove TODO about mesa/main tests
They're already done.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
5d16c86add meson: enable glcpp test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
db8cd8e367 glcpp/tests: Convert shell scripts to a python script
This ports glcpp-test.sh and glcpp-test-cr-lf.sh to a python script that
accepts arguments for each line ending type. This should allow for
better reporting to users.

v2: - Use $PYTHON2 to be consistent with other tests in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
8cb96c4031 glsl/tests: Remove unused compare_ir.py script
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
Dylan Baker
877d250ea1 meson: enable optimization-test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
Dylan Baker
97c28cb082 glsl/tests: Convert optimization-test.sh to pure python
This patch converts optimization-test.sh to python, in this process it
removes external shell dependencies including diff. It replaces the
python script that generates shell scripts with a python library that
generates test cases and runs them using subprocess.

v2: - use $PYTHON2 to be consistent with other tests in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-18 09:03:57 -07:00
Dylan Baker
ad9c2f2018 meson: run glsl compiler warnings test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
Dylan Baker
3b52d29227 glsl/tests: reimplement warnings-test in python
This reimplements the test in python with a shell script wrapper that
allows autotools to continue to run the test without realizing that
anything has changed.

Using python has two advantages, first it's portable so this test can be
run on windows as well as Linux since it just requires python, no more
diff, pwd or sh. It's also no longer tied to autotools implementation
details, like the environment variables $srcdir and $abs_builddir,
though the autotools shell wrapper still uses those, which makes it
possible to run the test in meson.

v2: - Use $PYTHON2 in script to be consistent with other scripts in mesa

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-18 09:03:57 -07:00
George Kyriazis
12a002a3a1 swr/rast: Fix VGATHERPD lowering
Also Implement VHSUBPS in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
99fe90722d swr/rast: Replace x86 VMOVMSK with llvm-only implementation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
0899122c03 swr/rast: Optimize late/bindless JIT of samplers
Add per-worker thread private data to all shader calls
Add per-worker sampler cache and jit context
Add late LoadTexel JIT support
Add per-worker-thread Sampler / LoadTexel JIT

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
ec7154abc0 swr/rast: Implement VROUND intrinsic in x86 lowering pass
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
bb02da3c1b swr/rast: Refactor to improve code sharing.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
94ca1c018f swr/rast: minimize codegen redundant work
Move filtering of redundant codegen operations into gen scripts themselves

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
7f34860125 swr/rast: double-pump in x86 lowering pass
Add support for double-pumping a smaller SIMD width intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
96ad8f5a23 swr/rast: Fix 64bit float loads in x86 lowering pass
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
1ffbbbee97 swr/rast: Add shader stats infrastructure (WIP)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a81c625cb7 swr/rast: Type-check TemplateArgUnroller
Allows direct use of enum values in conversion to template args.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
2966ee1028 swr/rast: Add vgather to x86 lowering pass.
Add support for generic VGATHERPD intrinsic in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
e4929b5d26 swr/rast: fix comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
670a99c233 swr/rast: add cvt instructions in x86 lowering pass
Support generic VCVTPD2PS and VCVTPH2PS in x86 lowering pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
aa482014e5 swr/rast: Fix alloca usage in jitter
Fix issue where temporary allocas were getting hoisted to function entry
unnecessarily. We now explicitly mark temporary allocas and skip hoisting
during the hoist pass. Shuold reduce stack usage.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
81371a5909 swr/rast: Change gfx pointers to gfxptr_t
Changing type to gfxptr for indices and related changes to fetch and mem
builder code.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
71239478d3 swr/rast: Fix byte offset for non-indexed draws
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c57b594317 swr/rast: Add support for setting optimization level
for JIT compilation

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
4f0df5e2f7 swr/rast: Adding translate call to builder_gfx_mem.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f135f54b18 swr/rast: Fix codegen for typedef types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c5d7b37fe7 swr: add x86 lowering pass to fragment shader
Needed because some FP paths (namely stipple) use gather intrinsics
that now need to be lowered to x86.

v2: fix typo in commit message
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
9161c40d14 swr/rast: Enable generalized fetch jit
Enable generalized fetch jit with 8 or 16 wide SIMD target. Still some
work needed to remove some simd8 double pumping for 16-wide target.

Also removed unused non-gather load vertices path.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d73082b98b swr/rast: Add builder_gfx_mem.{h|cpp}
Abstract usage scenarios for memory accesses into builder_gfx_mem.
Builder_gfx_mem will convert gfxptr_t from 64-bit int to regular pointer
types for use by builder_mem.

v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
1eb72673fc swr/rast: Lower VGATHERPS and VGATHERPS_16 to x86.
Some more work to do before we can support simultaneous 8-wide and
16-wide and remove the VGATHERPS_16 version.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
b15fb78df5 swr/rast: Cleanup of JitManager convenience types
Small cleanup. Remove convenience types from JitManager and standardize
on the Builder's convenience types.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d68694016c swr/rast: Lower PERMD and PERMPS to x86.
Add support for providing an emulation callback function for arch/width
combinations that don't map cleanly to an x86 intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
8f848ada8a swr/rast: Start refactoring of builder/packetizer.
Move x86 intrinsic lowering to a separate pass. Builder now instantiates
generic intrinsics for features not supported by llvm. The separate x86
lowering pass is responsible for lowering to valid x86 for the target
SIMD architecture. Currently it's a port of existing code to get it
up and running quickly. Will eventually support optimized x86 for AVX,
AVX2 and AVX512.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
ffc0aeb4ec swr/rast: Simplify #define usage in gen source file
Removed preprocessor defines from structures passed to LLVM jitted code.

The python scripts do not understand the preprocessor defines and ignores
them. So for fields that are compiled out due to a preprocessor define
the LLVM script accounts for them anyway because it doesn't know what
the defines are set to. The sanitize defines for open source are fine
in that they're safely used.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f36026ce2e swr/rast: Move CallPrint() to a separate file
Needed work for jit code debug.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
67c8bb4db7 swr/rast: Fix name mangling for LLVM pow intrinsic
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
7a5054aa1c swr/rast: Add some archrast counters
Hook up archrast counters for shader stats: instructions executed.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
f52a501716 swr/rast: Code cleanup
Removing some code that doesn't seem to do anything meaningful.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
093c1aee88 swr/rast: Add "Num Instructions Executed" stats intrinsic.
Added a SWR_SHADER_STATS structure which is passed to each shader. The
stats pass will instrument the shader to populate this.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
5fbee5e4ef swr/rast: Add MEM_ADD helper function to Builder.
mem[offset] += value

This function will be heavily used by all stats intrinsics.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
9103119cb3 swr/rast: Permute work for simd16
Fix slow permutes in PA tri lists under SIMD16 emulation on AVX

Added missing permute (interlane, immediate) to SIMDLIB

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
4c69823d15 swr/rast: WIP builder rewrite (2)
Finish up the remaining explicit intrinsic uses. At this point all
explicit Intrinsic::getDeclaration() usage has been replaced with auto
generated macros generated with gen_llvm_ir_macros.py. Going forward,
make sure to only use the intrinsics here, adding new ones as needed.

Next step is to remove all references to x86 intrinsics to keep the
builder target-independent. Any x86 lowering will be handled by a
separate pass.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
c2163dc56a swr/rast: Add autogen of helper llvm intrinsics.
Replace sqrt, maskload, fp min/max, cttz, ctlz with llvm equivalent.
Replace AVX maskedstore intrinsic with LLVM intrinsic. Add helper llvm
macros for stacksave, stackrestore, popcnt.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
6427315e43 swr/rast: WIP builder rewrite.
Start removing avx2 macros for functionality that exists in llvm.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a16f8e0554 swr/rast: LLVM 6 fix
for getting masked gather intrinsic (also compatible with LLVM 4)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a92cc09c7a swr/rast: Changes to allow jitter to compile with LLVM5
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
0f6fef9632 swr/rast: Add some archrast stats
Add stats for degenerate and backfacing primitive counts

Wire archrast stats for alpha blend and alpha test.
pass value to jitter, upon return have archrast event increment a value

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
b488028854 swr/rast: Silence some unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
e84bfec4ab swr/rast: Add debug type info for i128
Help support debug info in 16 wide shaders.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
a3edcfe1fb swr/rast: Use blend context struct to pass params
Stuff parameters into a blend context struct before passing down through
the PFN_BLEND_JIT_FUNC function pointer. Needed for stat changes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
be6cf0fd7c swr/rast: Introduce JIT_MEM_CLIENT
Add assert for correct usage of memory accesses

v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
George Kyriazis
d34edffe48 swr/rast: Add some instructions to jitter
VPHADDD, PMAXUD, PMINUD

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-04-18 10:51:38 -05:00
Juan A. Suarez Romero
4aa03581b5 docs: update calendar, add news and link release notes to 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-18 15:29:12 +00:00
Juan A. Suarez Romero
ad51d8871e docs: add sha256 checksums for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit a1c421c638)
2018-04-18 15:25:32 +00:00
Juan A. Suarez Romero
76cadaa1de docs: add release notes for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 8bd719e3fa)
2018-04-18 15:25:30 +00:00
Juan A. Suarez Romero
193d615917 docs: update calendar, add news and link release notes to 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-18 09:45:11 +00:00
Juan A. Suarez Romero
6372227209 docs: add sha256 checksums for 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cf0864dc63)
2018-04-18 09:40:44 +00:00
Juan A. Suarez Romero
6a1261bd09 docs: add release notes for 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 6d88ea9dd4)
2018-04-18 09:40:42 +00:00
Dylan Baker
b9ad5282ba Revert "meson: add wrap for libdrm"
This reverts commit 6217eedc9b.

I was using this for testing and accidentally put it on master

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-17 13:48:55 -07:00
Dylan Baker
efcbcfa7c8 Revert "Add subprojects directory and git ignore"
This reverts commit 21e2e73f71.

I was using this for testing and accidentally put it on master

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-04-17 13:48:43 -07:00
Jan Alexander Steffens (heftig)
5cf752b18b meson: Version libMesaOpenCL like autotools does
This is for parity with autotools. It names the library
libMesaOpenCL.so.1.0.0 and points mesa.icd to the .1 symlink.

opencl_version now matches configure.ac's OPENCL_VERSION.

Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-17 13:46:15 -07:00
Jan Alexander Steffens (heftig)
5bb98cfd92 meson: Add library versions to swr drivers
This is for parity with autotools.

Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-17 13:46:15 -07:00
Dylan Baker
6217eedc9b meson: add wrap for libdrm
Currently this requires libdrm from git, since the version reported by
meson is wrong.
2018-04-17 13:46:15 -07:00
Dylan Baker
21e2e73f71 Add subprojects directory and git ignore
For meson wraps.
2018-04-17 13:46:15 -07:00
Samuel Pitoiset
893e19efb7 radv: fix scissor computation when using half-pixel viewport offset
'scale[i]' can be non-integer.

Original patch by Philip Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074
Fixes: 0f3de89a56 ("radv: Use the guard band.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-17 22:12:14 +02:00
Neil Roberts
608d70bc02 spirv: Accept doubles in FaceForward, Reflect and Refract
The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32- or 16- bit
float regardless of the other operands. However in practice it seems
there is a bug in glslang that makes it convert the argument to 64-bit
if you actually try to pass it a 32-bit value while the other
arguments are 64-bit. This adds an optional conversion of the final
argument in order to support any type.

These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.

The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279

v2: Convert the eta operand of Refract from any size in order to make
    it eventually cope with 16-bit floats.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:11 +02:00
Neil Roberts
6e499572b9 spirv: Add a 64-bit implementation of OpIsInf
The only change neccessary is to change the type of the constant used
to compare against.

This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.

v2: Use nir_imm_floatN_t for the constant.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:06 +02:00
Neil Roberts
696f4abcbc spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
There is an existing macro that is used to choose between either a
float or a double immediate constant based on the bit size of the
first operand to the builtin. This is now changed to use the new
nir_imm_floatN_t helper function to reduce the number of places that
make this decision.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:58:03 +02:00
Neil Roberts
e7b2c125c3 nir/builder: Add a nir_imm_floatN_t helper
This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 20:57:36 +02:00
Timothy Arceri
6e22ad6edc nir: return early when lowering a return at the end of a function
Otherwise we create unused conditional return flags and things
get unnecessarily ugly fast when lowering nested functions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-17 14:17:56 +10:00
Timothy Arceri
d3cafc18fc mesa: merge the driver functions DrawBuffers and DrawBuffer
The extra params we unused by the drivers that used DrawBuffers.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-17 14:17:48 +10:00
Marc Dietrich
268d8f244b glsl: fix gcc 8 parenthesis warning
fixes warnings like this:
[184/1137] Compiling C++ object 'src/compiler/glsl/glsl@sta/lower_jumps.cpp.o'.
In file included from ../src/mesa/main/mtypes.h:48,
                 from ../src/compiler/glsl_types.h:149,
                 from ../src/compiler/glsl/lower_jumps.cpp:59:
../src/compiler/glsl/lower_jumps.cpp: In member function '{anonymous}::block_record {anonymous}::ir_lower_jumps_visitor::visit_block(exec_list*)':
../src/compiler/glsl/list.h:650:17: warning: unnecessary parentheses in declaration of 'node' [-Wparentheses]
    for (__type *(__inst) = (__type *)(__list)->head_sentinel.next; \
                 ^
../src/compiler/glsl/lower_jumps.cpp:510:7: note: in expansion of macro 'foreach_in_list'
       foreach_in_list(ir_instruction, node, list) {
       ^~~~~~~~~~~~~~~

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-17 11:53:59 +10:00
Rob Clark
2a55344e7d compiler: int8/uint8 fixes
A couple spots were missed for handling of the new INT8/UINT8 base type.

Also de-duplicate get_base_type().. get_scalar_type() had nearly the
same switch statement, with the exception that anything with base_type
that was not scalar would return error_type.  So just handle that one
special case in get_scalar_type().

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-16 20:41:18 -04:00
Marek Olšák
60299e9abe radeonsi: don't emit partial flushes for internal CS flushes only
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Marek Olšák
692f550740 winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE
There is a kernel patch that adds the new flag.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Marek Olšák
1b3199d14d radeonsi: implement mechanism for IBs without partial flushes at the end (v6)
(This patch doesn't enable the behavior. It will be enabled in a later
commit.)

Draw calls from multiple IBs can be executed in parallel.

v2: do emit partial flushes on SI
v3: invalidate all shader caches at the beginning of IBs
v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed,
    only do this for flushes invoked internally
v5: empty IBs should wait for idle if the flush requires it
v6: split the commit

If we artificially limit the number of draw calls per IB to 5, we'll get
a lot more IBs, leading to a lot more partial flushes. Let's see how
the removal of partial flushes changes GPU utilization in that scenario:

With partial flushes (time busy):
    CP: 99%
    SPI: 86%
    CB: 73:

Without partial flushes (time busy):
    CP: 99%
    SPI: 93%
    CB: 81%

Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 16:58:10 -04:00
Erico Nunes
d19b488339 nir: fix ir_binop_gequal glsl_to_nir conversion
ir_binop_gequal needs to be converted to nir_op_sge when native integers
are not supported in the driver.
Otherwise it becomes no different than ir_binop_less after the
conversion.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-16 07:59:25 -07:00
Jason Ekstrand
72ab499c9f anv,radv: Drop XML workarounds for VK_ANDROID_native_buffer
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-16 07:59:25 -07:00
Jason Ekstrand
35ef0f767e vulkan: Update the XML and headers to 1.1.73
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-16 07:59:25 -07:00
Samuel Pitoiset
62510846b6 radv: clean up radv_decompress_resolve_subpass_src()
To handle the source color image transitions in the same place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:21:05 +02:00
Samuel Pitoiset
56a171a499 radv: don't fast-clear eliminate after resolving a subpass with compute
That looks useless, and I think radv_handle_image_transition()
will do a fast-clear eliminate because it's called after the
resolve.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:21:02 +02:00
Samuel Pitoiset
7e84d69861 radv: handle CMASK/FMASK transitions only if DCC is disabled
DCC implies a fast-clear eliminate, so I think this sounds
reasonable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:59 +02:00
Samuel Pitoiset
584d1f2711 radv: merge radv_handle_{dcc,cmask}_image_transition() functions
Into radv_handle_color_image_transition().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:56 +02:00
Samuel Pitoiset
d5812b900b radv: add radv_init_color_image_metadata() helper
In order to separate initialization from decompression. In the
future, that will allow us to init DCC/FMASK/CMASK in one shot.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:54 +02:00
Samuel Pitoiset
fde7b90ecf radv: make radv_initialise_cmask() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:51 +02:00
Samuel Pitoiset
790f6e4718 radv: clean up radv_handle_image_transition() a bit
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:49 +02:00
Samuel Pitoiset
6967d32beb radv: add radv_handle_color_image_transition() helper
To handle CMASK, FMASK and DCC transitions in the same place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:45 +02:00
Samuel Pitoiset
c6b1f1c97a radv: handle DCC image transitions before CMASK/FMASK transitions
Mostly because DCC implies a fast-clear eliminate and we
should be able to skip some DCC decompressions by setting
a predicate like for CMASK and FMASK.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:42 +02:00
Samuel Pitoiset
79c87a45b6 radv: disable prediction only if it has been enabled
When decompressing DCC we don't enable it, so it's useless
to disable it. This reduces the number of prediction packets
sent to the GPU when performing color decompression passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-16 14:20:39 +02:00
Bas Nieuwenhuizen
b0e3a9b19f ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.
No clue how I missed those ...

Fixes: 4503ff760c "ac/nir: Add workaround for GFX9 buffer views."
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-16 11:55:48 +02:00
Brian Paul
6a519a157b gallium/osmesa: link with winsock2 library on Windows
To fix the MSVC build.  The build broke because we started to compile
the ddebug code on Windows after the mtypes.h changes.  Building ddebug
caused us to also use the u_network.c code for the first time.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
201c08c463 gallium/util: put (void) in a few function signatures
To match the header file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
65d1040435 ddebug: add PIPE_OS_UNIX/LINUX checks to fix MSVC build
Don't include Unix headers or use Unix functions when building with MSVC.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
6d41edbf8a mesa: protect #include of unistd.h with _MSV_VER check
unistd.h is unix only.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-04-13 19:06:55 -06:00
Brian Paul
bf67fec235 mesa: remove unused 'i' in dimensions_error_check()
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-13 19:06:55 -06:00
Marek Olšák
976db661ff radeonsi: restore si_emit_cache_flush call at the end of IBs
Fixes: 918b798668 "radeonsi: make sure CP DMA is idle at the end of IBs"
2018-04-13 20:05:53 -04:00
Daniel Schürmann
f2c6a55061 radv: enable subgroup capabilities
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:15 +02:00
Daniel Schürmann
4b0616e533 ac: handle subgroup intrinsics
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:15 +02:00
Daniel Schürmann
d5f7ebda3e ac: add LLVM build functions for subgroup instrinsics
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 01:03:09 +02:00
Daniel Schürmann
d19f20e793 ac: make ballot and umsb capable of 64bit inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
79701b414c nir: lower 64bit subgroup shuffle intrinsics
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
fd5b0e0a64 nir/spirv: Fix warning and add missing breaks.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
54937d820d nir: use ballot_bit_size when lowering ballot_bitfield_extract
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Daniel Schürmann
4d802df3aa nir: subgroups instructions for 64bit ballot sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Brian Paul
1098c18af3 glsl: #undef THIS macro to fix MSVC build
THIS is a macro in one of the MSVC header files.  It's also a token
in the GLSL lexer.  This causes a compilation failure with MSVC.
This issue seems to be newly exposed after the recent mtypes.h removal
patches.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:53:12 -06:00
Brian Paul
5dc7233f44 glsl: rename 'interface' var to 'iface' to fix MSVC build
The recent mtypes.h removal patches seems to have exposed a MSVC
issue where 'interface' is defined as a macro in an MSVC header file.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:53:08 -06:00
Brian Paul
73f1e33d34 mesa: remove snprintf macro in imports.h to fix MSVC build
snprintf is a macro in the MSVC stdio.h header and we needed to
include that header before imports.h where we also defined an
snprintf macro.  Otherwise, the MSVC build would fail.  The recent
mtypes.h removal patches seems to have exposed this issue.

This patch simply removes our snprintf macro and replaces one use
of it in teximage.c with _mesa_snprintf().  There are other calls
to snprintf() in DRI drivers, but none of them are built on Windows.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-04-13 13:52:57 -06:00
Lionel Landwerlin
0a6547014f anv: fix number of planes for depth & stencil
We're not counting correctly with depth & stencil images.

Additionally we need to move an assert that is meant just for color
attachments.

v2: Move an assert() (Reported by Craig)
    Change aspect mask checks (Francesco)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a62a979335 ("anv: enable multiple planes per image/imageView")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105994
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-13 11:44:53 -07:00
Marek Olšák
6ff0c6f4eb gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times
which also simplifies the build scripts.
2018-04-13 14:08:14 -04:00
Marek Olšák
918b798668 radeonsi: make sure CP DMA is idle at the end of IBs 2018-04-13 14:07:20 -04:00
Marek Olšák
b6ad7075b9 gallium/hud: add a simple HUD view that only draws text
Add this prefix to the env var: "simple," For example:
    GALLIUM_HUD=simple,fps

The X coordinates are the same, but the Y coordinates are different, because
there is only text.

'+' happens to behave the same as "\n".
',' happens to behave the same as "\n\n".
2018-04-13 14:07:20 -04:00
Dylan Baker
506671594a mesa: Include unistd.h in program_lexer
Which was previously provided implicitly by mtypes.h

CC: Marek Olšák <marek.olsak@amd.com>
CC: Mark Janes <mark.a.janes@intel.com>
Fixes: 43d66c8c2d
       ("mesa: include mtypes.h less")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-13 11:03:37 -07:00
Marek Olšák
9a1363427e radeonsi: always prefetch later shaders after the draw packet
so that the draw is started as soon as possible.

v2: only prefetch the API VS and VBO descriptors

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
e4b7974ec7 radeonsi: emit shader pointers before cache flushes & waits
This code was written with the constant engine in mind.
We can simplify it now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
82799c5035 radeonsi/gfx9: don't use the workaround for gather4 + stencil
it doesn't seem to be needed.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
1372ccfe6f radeonsi: disable TC-compat HTILE on Tonga and Iceland
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
afe0bd2c55 radeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabled
just pass the flag that indicates it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
29a09e1d38 radeonsi: don't flush HTILE if there is no HTILE clear
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
5fb31a1734 radeonsi: merge 2 identical if statements in si_clear
and other cleanups

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
8a28679987 radeonsi: don't do GFX-specific texture decompression for compute
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
307bccc6df radeonsi: simplify generating the renderer string
HAVE_LLVM > 0 is a tautology.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Marek Olšák
a3b785be4d winsys/amdgpu: allow local BOs on APUs
Local BOs ignore BO priorities, and we don't need those on APUs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00
Juan A. Suarez Romero
b37b35a5d2 getteximage: assume texture image is empty for non defined levels
Current code is returning an INVALID_OPERATION when trying to use
getTextureImage() on a level that has not been explicitly defined.

That is, we define a mipmapped Texture2D with 3 levels, and try to use
GetTextureImage() for the 4th levels, and INVALID_OPERATION is returned.

Nevertheless, such case is not listed as an error in OpenGL 4.6 spec,
section 8.11.4 ("Texture Image Queries"), where all the case errors for
this function are defined. So it seems this is a valid operation.

On the other hand, in section 8.22 ("Texture State and Proxy State") it
states:

  "Each initial texture image is null. It has zero width, height, and
   depth, internal format RGBA, or R8 for buffer textures, component
   sizes set to zero and component types set to NONE, the compressed
   flag set to FALSE, a zero compressed size, and the bound buffer
   object name is zero."

We can assume that we are reading this initialized empty image when
calling GetTextureImage() with a non defined level.

With this assumption, we will reach one of the other error cases defined
for the functions. In the end this means that we would end up returning
INVALID_VALUE to the caller.

This fixes arb_get_texture_sub_image piglit tests.

v2: just return INVALID_VALUE if there is no defined level (Iago)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:37 +02:00
Juan A. Suarez Romero
8d411eb6b3 gettextureimage: verify cube map is complete
According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"),
relative to errors for GetTexImage, GetTextureImage, and GetnTexImage:

  "An INVALID_OPERATION error is generated by GetTextureImage if the
   effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY, and
   the texture object is not cube complete or cube array complete,
   respectively."

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:27 +02:00
Juan A. Suarez Romero
42891dbaa1 gettextsubimage: verify zoffset and depth are correct
According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"),
relative to errors for GetTextureSubImage() function:

  "An INVALID_VALUE error is generated if the effective target is
   TEXTURE_1D and either yoffset is not zero, or height is not one.

   An INVALID_VALUE error is generated if the effective target is
   TEXTURE_1D, TEXTURE_1D_ARRAY, TEXTURE_2D or TEXTURE_RECTANGLE, and
   either zoffset is not zero, or depth is not one."

The commit fixes the check for height and depth.

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-13 17:47:27 +02:00
Timothy Arceri
a63e69f5f0 mesa: free debug messages when destroying the debug state
Fixes: 04a8baad37 "mesa: refactor _mesa_PopDebugGroup and _mesa_free_errors_data"

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98281
2018-04-13 22:20:48 +10:00
Timothy Arceri
c500ab2735 mesa: fix x86 builds
Fixes: 43d66c8c2d "mesa: include mtypes.h less"
2018-04-13 22:13:46 +10:00
Marek Olšák
e961824ba8 Fix make check 2018-04-12 20:03:13 -04:00
Marek Olšák
6d6b1b3890 Fix scons build 2018-04-12 19:55:01 -04:00
Marek Olšák
43d66c8c2d mesa: include mtypes.h less
- remove mtypes.h from most header files
- add main/menums.h for often used definitions
- remove main/core.h

v2: fix radv build

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-12 19:31:30 -04:00
Marek Olšák
57f4268da4 mesa: include dispatch.h less
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-12 19:31:28 -04:00
Bas Nieuwenhuizen
6ff98dbf7c radv: Implement VK_EXT_vertex_attribute_divisor.
Pretty straight forward, just pass the divisors through the shader
key and then do a LLVM divide.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-12 22:57:23 +02:00
Bas Nieuwenhuizen
7eff8d7d35 ac/surface: Allow S swizzle for displayable surfaces.
For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts
S swizzles.

At the same time addrlib prefers D swizzles is allowed, so we can
just allow S swizzles as fallback.

Fixes: b64b712558 "ac/surface/gfx9: request desired micro tile mode explicitly"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-12 21:24:55 +02:00
Eric Anholt
7bc77dbb00 broadcom/vc5: Fix a stray '`' in a comment. 2018-04-12 11:20:50 -07:00
Eric Anholt
b225cdcecc broadcom/vc5: Update the UABI for in/out syncobjs
This is the ABI I'm hoping to stabilize for merging the driver.  seqnos
are eliminated, which allows for the GPU scheduler to task-switch between
DRM fds even after submission to the kernel.  In/out sync objects are
introduced, to allow the Android fencing extension (not yet implemented,
but should be trivial), and to also allow the driver to tell the kernel to
not start a bin until a previous render is complete.
2018-04-12 11:20:50 -07:00
Eric Anholt
d9c525ed22 broadcom/vc5: Drop the finished_seqno optimization.
With the DRM scheduler changes, I'm about to remove all seqnos from the
UABI.
2018-04-12 11:20:50 -07:00
Eric Anholt
aedfd8ede4 broadcom/vc5: Drop the throttling code.
Since I'll be using the DRM scheduler, we won't run into the problem of a
runaway client starving other clients of GPU time.
2018-04-12 11:20:50 -07:00
Eric Anholt
dd9c476165 broadcom/vc5: Move flush_last_load into load_general, like for stores.
This should avoid mistakes with not flushing as we change the series of
loads.  Already, it fixes a hopefully unreachable case where we were
emitting just the TILE_COORDINATES and not the dummy store that needs to
go with it.
2018-04-12 11:20:50 -07:00
Eric Anholt
6a21a582fb broadcom/vc5: Rename read_but_not_cleared to loads_pending.
This is a more obvious name for what the variable means, and matches what
it's called for stores.
2018-04-12 11:20:50 -07:00
Eric Anholt
b946218c48 broadcom/vc5: Refactor the implicit coords/stores_pending logic.
Since I just fixed a bug due to forgetting to do these right, do it once
in the helper func.
2018-04-12 11:20:50 -07:00
Eric Anholt
ec60559f97 broadcom/vc5: Emit missing TILE_COORDINATES_IMPLICIT in separate z/s stores.
Fixes a simulator assertion failure in
KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
2018-04-12 11:20:50 -07:00
Eric Anholt
8f2999120d broadcom/vc5: Add checks that we don't try to do raw Z+S load/stores.
This was dying in the simulator on
GTF-GLES3.gtf.GL3Tests.packed_depth_stencil.packed_depth_stencil_blit.
We'll need to do basically the same thing as Z32F/S8 does in the MSAA
Z24S8 case.
2018-04-12 11:20:50 -07:00
Eric Anholt
7553cbfc9d broadcom/vc5: Fix MSAA depth/stencil size setup.
The v3dX(get_internal_type_bpp_for_output_format)() call only handles
color output formats (which overlap in enum numbers with depth output
formats), so for depth we just need to take the normal cpp times the
number of samples.
2018-04-12 11:20:50 -07:00
Leo Liu
fa328456e8 st/va: add VP9 config to enable profile2
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
dac0024b58 radeonsi: use PIPE_FORMAT_P016 format for VP9 profile2
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
f1277dabbc radeon/vcn: add VP9 profile2 support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
e8724bd1e3 vl: add VP9 profile2 support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
d9a31341ec st/va: add VP9 config to enable profile0
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
ef52ba8aa0 st/va: parse VP9 uncompressed frame header
To get some of UVD required parameters.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
bf0f5fe929 st/va: add slice parameter handling for VP9
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
05176fe65e st/va: add picture parameter handling for VP9
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
9ff83d13e5 st/va: add handles for VP9 buffers
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
30438fbf46 st/va: add VP9 picture to context
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
0f373a65e5 radeonsi: cap VP9 support to progressive buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
6adaf6de6d radeonsi: cap VP9 support to Raven
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
905368669d radeon/vcn: add VP9 context buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
e2ce7c0a62 radeon/vcn: get VP9 msg buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
6000bdb75b radeon/vcn: fill probability table to prob buffers
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
93c0f3cc13 radeon/vcn: add VP9 message buffer interface
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:13 -04:00
Leo Liu
caaecf3d3b radeon/vcn: add VP9 prob table buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
b628ea039f vl: add VP9 probability tables
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
eb22785bd8 radeon/vcn: add VP9 dpb buffer size
The current FW has restricted the size to the worse case,
and the new dynamic dpb buffer support is on the way from
firmware side, we will change accordingly.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
f73befdd9b radeon/vcn: add VP9 stream type for decoder
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
ca1646db89 vl: add VP9 picture description
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Leo Liu
29bc354684 vl: add VP9 profile0 and format
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-04-12 11:15:12 -04:00
Samuel Pitoiset
9eac49246c radv: fix radv_layout_dcc_compressed() when image doesn't have DCC
num_dcc_levels means that DCC is supported, but this doesn't
mean that it's enabled by the driver. Instead, we should rely
on radv_image_has_dcc().

This fixes some multisample regressions since 0babc8e5d6
("radv: fix picking the method for resolve subpass") on Vega.
This is because the resolve method changed from HW to FS, but
those fails are totally unexpected, so there might some
differences between Polaris and Vega here.

Fixes: 44fcf58744 ("radv: Disable DCC for GENERAL layout and compute transfer dest.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:46 +02:00
Samuel Pitoiset
ab0e625a67 radv: add radv_decompress_resolve_{subpass}_src() helpers
This helper shares common code before resolving using either
a fragment or a compute shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:44 +02:00
Samuel Pitoiset
ed93d90a67 radv: add radv_init_dcc_control_reg() helper
And add some comments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-12 09:58:41 +02:00
Timothy Arceri
c7e3d31b0b glsl: fix compat shaders in GLSL 1.40
The compatibility and core tokens were not added until GLSL 1.50,
for GLSL 1.40 just assume all shaders built with a compat profile
are compat shaders.

Fixes rendering issues in Dawn of War II on radeonsi which has
enabled OpenGL 3.1 compat support.

Fixes: a0c8b49284 "mesa: enable OpenGL 3.1 with ARB_compatibility"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
2018-04-12 11:51:08 +10:00
Ian Romanick
f3b14ca2e1 mesa: Silence remaining unused parameter warnings in teximage.c
src/mesa/main/teximage.c: In function ‘_mesa_test_proxy_teximage’:
src/mesa/main/teximage.c:1301:51: warning: unused parameter ‘level’ [-Wunused-parameter]
                           GLuint numLevels, GLint level,
                                                   ^~~~~
src/mesa/main/teximage.c: In function ‘texsubimage_error_check’:
src/mesa/main/teximage.c:2186:30: warning: unused parameter ‘dsa’ [-Wunused-parameter]
                         bool dsa, const char *callerName)
                              ^~~
src/mesa/main/teximage.c: In function ‘copytexture_error_check’:
src/mesa/main/teximage.c:2297:32: warning: unused parameter ‘width’ [-Wunused-parameter]
                          GLint width, GLint height, GLint border )
                                ^~~~~
src/mesa/main/teximage.c:2297:45: warning: unused parameter ‘height’ [-Wunused-parameter]
                          GLint width, GLint height, GLint border )
                                             ^~~~~~
src/mesa/main/teximage.c: In function ‘check_rtt_cb’:
src/mesa/main/teximage.c:2679:21: warning: unused parameter ‘key’ [-Wunused-parameter]
 check_rtt_cb(GLuint key, void *data, void *userData)
                     ^~~
src/mesa/main/teximage.c: In function ‘override_internal_format’:
src/mesa/main/teximage.c:2756:55: warning: unused parameter ‘width’ [-Wunused-parameter]
 override_internal_format(GLenum internalFormat, GLint width, GLint height)
                                                       ^~~~~
src/mesa/main/teximage.c:2756:68: warning: unused parameter ‘height’ [-Wunused-parameter]
 override_internal_format(GLenum internalFormat, GLint width, GLint height)
                                                                    ^~~~~~
src/mesa/main/teximage.c: In function ‘texture_sub_image’:
src/mesa/main/teximage.c:3293:24: warning: unused parameter ‘dsa’ [-Wunused-parameter]
                   bool dsa)
                        ^~~
src/mesa/main/teximage.c: In function ‘can_avoid_reallocation’:
src/mesa/main/teximage.c:3788:53: warning: unused parameter ‘x’ [-Wunused-parameter]
                        mesa_format texFormat, GLint x, GLint y, GLsizei width,
                                                     ^
src/mesa/main/teximage.c:3788:62: warning: unused parameter ‘y’ [-Wunused-parameter]
                        mesa_format texFormat, GLint x, GLint y, GLsizei width,
                                                              ^
src/mesa/main/teximage.c: In function ‘valid_texstorage_ms_parameters’:
src/mesa/main/teximage.c:5987:40: warning: unused parameter ‘samples’ [-Wunused-parameter]
                                GLsizei samples, unsigned dims)
                                        ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-11 16:20:56 -07:00
Ian Romanick
fa44941072 mesa: Silence unused parameter warning in compressedteximage_only_format
Passing ctx to compressedteximage_only_format was the only use of the
ctx parameter in _mesa_format_no_online_compression, so that parameter
had to go too.

../../SOURCE/master/src/mesa/main/teximage.c: In function ‘compressedteximage_only_format’:
../../SOURCE/master/src/mesa/main/teximage.c:1355:57: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 compressedteximage_only_format(const struct gl_context *ctx, GLenum format)
                                                         ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-11 16:20:42 -07:00
Nanley Chery
377da9eb78 blorp: Silence unused function warnings
vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from vulkan/genX_blorp_exec.c:35:0:
./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^~~~~~~~~~~~~~~~~
genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from genX_blorp_exec.c:33:0:
../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^~~~~~~~~~~~~~~~~

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-11 13:04:49 -07:00
Caio Marcelo de Oliveira Filho
89542c9ce6 nir/vars_to_ssa: Simplify node matching code
The matching code doesn't make real use of the return value. The main
function return value is ignored, and while the worker function
propagate its return value, the actual callback never returns false.

v2: Style fixes. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho
fac9dd1b93 nir/vars_to_ssa: Remove an unnecessary deref_arry_type check
Only fully-qualified direct derefs, collected in direct_deref_nodes,
are checked for aliasing, so it is already known up front that they
have only array derefs of type direct.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho
1c9bccdeb8 nir/vars_to_ssa: Rework register_variable_uses()
The return value was needed to make use of the old nir_foreach_block
helper, but not needed anymore with the macro version. Then go one
step further and move the foreach directly into the register variable
uses function.

v2: Move foreach to register_variable_uses(). (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-11 11:05:05 -07:00
Jason Ekstrand
bc2b170d68 nir: Use nir_builder in lower_io_to_temporaries
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-11 11:03:22 -07:00
Bas Nieuwenhuizen
bd95397d65 radv: Enable RB+ on Raven.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-11 18:46:55 +02:00
Tapani Pälli
9f29b1a4c8 vulkan: fix build issue on android (both anv/radv)
Fixes linking errors against:

   anv_GetPhysicalDeviceImageFormatProperties2KHR
   radv_GetPhysicalDeviceImageFormatProperties2KHR

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-11 13:55:49 +03:00
Nicolai Hähnle
41e6ffee49 radeonsi: correctly parse disassembly with labels
LLVM now emits labels as part of the disassembly string, which is very
useful but breaks the old parsing approach.

Use the semicolon to detect the boundary of instructions instead of going
by line breaks.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-11 12:44:30 +02:00
Nicolai Hähnle
0630e52c9e radeonsi: pass -O halt_waves to umr for hang debugging
This will give us meaningful wave information in the case of a hang where
shaders are still running in an infinite loop.

Note that we call umr multiple times for different sections of the ddebug
hang dump, and so the wave information will not necessarily match up
between sections.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-11 12:44:24 +02:00
Jason Ekstrand
69f447553c vulkan: Drop vk_android_native_buffer.xml
All the information in vk_android_native_buffer.xml is now in vk.xml.
The only exception is the extension type attribute which we can work
around in the generators while we wait for the XML to be fixed.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-10 19:29:49 -07:00
Jason Ekstrand
ae3a856c34 nir/lower_atomics: Rework the main walker loop a bit
This replaces some "if (...} { }" with "if (...) continue;" to reduce
nesting depth and makes nir_metadata_preserve conditional on progress
for the given impl.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-10 19:28:49 -07:00
Bas Nieuwenhuizen
ed94638156 radv: Enable RB+ where possible.
According to Marek, not enabling it on Stoney has a significant
negative performance impact. (And I guess this might impact
performance on Raven as well)

The register settings are pretty much copied from radeonsi. I did
not put this in the pipeline as that would make the pipeline more
dependent on the format which mean we would have to have more
pipelines for the meta shaders.

v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet
    does already.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-11 01:19:10 +02:00
Topi Pohjolainen
5d895a1f37 nir: Check if u_vector_init() succeeds
However, it only fails when running out of memory. Now, if we
are about to check that, we should be consistent and check
the allocation of the worklist as well.

CID: 1433512
Fixes: edb18564c7 nir: Initial implementation of a nir_instr_worklist
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
Topi Pohjolainen
98d3874754 mesa: Assert base format before truncating to unsigned short
CID: 1433709
Fixes: ca721b3d8: mesa: use GLenum16 in a few more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
Topi Pohjolainen
26f48fe010 intel/dev: Assert the number of slices is not zero
Fixes: c1900f5b intel: devinfo: add helper functions to fill...
CID: 1433511
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-11 01:49:56 +03:00
Kenneth Graunke
8960903c90 i965: Remove brw_bo_alloc_tiled_2d from intel_detect_swizzling.
I'd like to drop this pre-isl function.  This drops one of the two uses.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-10 15:31:31 -07:00
Timothy Arceri
a05faf80c3 mesa: fix glsl version mismatch in compat profile
Drivers that only support compat 3.0 were reporting GLSL 1.40
support. This fixes issues with the menu of Dawn of War II.

Fixes: a0c8b49284 "mesa: enable OpenGL 3.1 with ARB_compatibility"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
2018-04-11 08:05:19 +10:00
Samuel Pitoiset
0babc8e5d6 radv: fix picking the method for resolve subpass
The source and destination image parameters were swapped.

No CTS changes on Polaris10, but I suspect this might
fix something.

Fixes: 2a04f5481d ("radv/meta: select resolve paths")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-10 21:55:28 +02:00
Samuel Pitoiset
9f6a28eb27 radv: add shader BOs to the list at pipeline bind time
Otherwise, the shader BOs are not added to the list on SI because
prefetching isn't supported. Calling radv_cs_add_buffer() in the
prefetch codepath was a bad idea.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105952
Fixes: 4ad7595f35 ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Turo Lamminen <turo@alternativegames.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-10 21:55:28 +02:00
Marek Olšák
e29facff31 ac/surface: don't set the display flag for obviously unsupported cases (v2)
This enables the tile swizzle for some cases of the displayable micro mode,
and it also fixes an addrlib assertion failure on Vega.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-04-10 13:06:03 -04:00
Marek Olšák
19ce5048ee radeonsi: add shader binary padding for UMR 2018-04-10 13:05:20 -04:00
Marek Olšák
b64b712558 ac/surface/gfx9: request desired micro tile mode explicitly
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-10 12:44:41 -04:00
Emil Velikov
5dd02123a0 docs/release-calendar: update to include 18.1 and 18.2
Dylan has kindly stepped up to help with 18.1.0, while I've taken the
liberty to nominate Andres for 18.2.0 ;-)

As always, people are welcome to swap/adjust where needed.

v2: Add Juan for 18.0.x (Juan)

Cc: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 16:08:54 +01:00
Emil Velikov
8eceac9de7 glsl: remove unreachable assert()
Earlier commit enforced that we'll bail out if the number of terminators
is different than 2. With that in mind, the assert() will never trigger.

Fixes: 56b867395d ("glsl: fix infinite loop caused by bug in loop
unrolling pass")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 16:04:50 +01:00
Juan A. Suarez Romero
0d0ef8ae33 spirv: autotools: add vtn_gather_types_c.py in distribution tarball
Fixes: 042ee4bea2 "(spirv: Move SPIR-V building to Makefile.spirv.am and
spirv/meson.build")

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 10:37:46 +02:00
Juan A. Suarez Romero
15ed757834 radeonsi: autotools: add si_build_pm4.h in dist tarball
Fixes: 5777488406 ("radeonsi: move r600_cs.h contents into si_pipe.h,
si_build_pm4.h")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-10 10:33:28 +02:00
Bas Nieuwenhuizen
4381be4648 ac/nir: Use an array instead of hashtable for SSA defs.
Saves about 2% of compile time for F1 2017, as well as reduce code
size of an optimized libvulkan_radeon.so by about 1 KiB.

This still keeps the hashtable, as we also stored blocks in there.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-10 09:53:16 +02:00
Timothy Arceri
6066f08ee9 st/mesa: finalise tcs/tes/geom NIR before storing it to the cache
We don't create variants of the NIR so here we finalise it before
caching to avoid unnecessary processing when restoring it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 15:10:16 +10:00
Timothy Arceri
bc71e20993 st/mesa: exit st_translate_fragment_program() earlier for NIR path
This avoids a bunch of scanning that is only used by the TGSI path.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 15:10:16 +10:00
Timothy Arceri
494a5c3501 radeonsi/nir: tidy up si_nir_load_sampler_desc()
This makes it easier to follow the code, and also initialises
dynamic_index which will be useful for adding bindless textures
support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
d7cbe795ed radeonsi/nir: set uses_bindless_images for images
V2: add missing intrinsics (Spotted-by: Samuel Pitoiset)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
74b3fc2ce0 nir: dont lower bindless samplers
We neeed to skip the var if its not a uniform here as well as checking
the bindless flag since UBOs can contain bindless samplers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
bd4cc54c8b st/glsl_to_nir: set paramater value offset as driver location for packed uniforms
This allows us to simplify the code and will also be useful for supporting
bindless textures.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
222d862cd3 radeonsi/nir: don't add bindless samplers/images to declared bitmasks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Timothy Arceri
f33d9036b9 st/mesa: stop calling _mesa_init_shader_object_functions()
This sets the LinkShader function for the driver, but for the st we
set it properly with the following call to st_init_program_functions().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-10 14:43:45 +10:00
Jason Ekstrand
c3f9d5c235 anv/pipeline: Lower more constant initializers earlier
Once we've gotten rid of everything but the main entrypoint, there's no
reason why we should go ahead and lower them all.  This is what radv
does and it will make future work easier.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Jason Ekstrand
14e0a222d9 spirv: Use the LOCAL_GROUP_SIZE system value
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Jason Ekstrand
131d454c35 nir/lower_system_values: Support SYSTEM_VALUE_LOCAL_GROUP_SIZE
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-09 19:45:25 -07:00
Lionel Landwerlin
f3353e53db intel: aubinator: print out addresses of invalid instructions
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-10 00:58:38 +01:00
Bas Nieuwenhuizen
41fbcc7901 radv: Always reset draw user SGPRs after secondary command buffer.
As we sometimes reset them to -1, -1 does not mean that they are
not written by the secondary command buffer.

Fixes: ad11fc3571 "radv: don't emit unneeded vertex state."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 23:04:42 +02:00
Bas Nieuwenhuizen
74b0b869dd radv: Don't set instance count using predication.
The packet can sometimes be skipped, but we still think the change takes effect.

This just makes the packet always take effect.

Fixes: ad11fc3571 "radv: don't emit unneeded vertex state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105942
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 23:04:35 +02:00
Rob Clark
d66dc34316 mesa/st/nir: fix instruction removal
At one point this kinda worked (or at least didn't cause problems).  But
with deref-instructions it results in dangling deref instructions not
being properly removed.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Rob Clark
becf2d1fac mesa/st/nir: fix naked lowering pass call
Not using the macro means no nir_validate in debug builds, resulting in
problems showing up only after later passes.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Rob Clark
c4457113e9 nir: add comment about nir_src_copy()
So it is more clear about when to use nir_instr_rewrite_src()

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 15:36:21 -04:00
Nanley Chery
1d94aa1987 i965: Make the miptree clear color setter take a gl_color_union
We want to hide the internal details of how the miptree's clear color
is calculated.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Nanley Chery
3dbb49a978 i965/miptree: Move the clear color and value setter implementations
These will get more complex in later commits.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Nanley Chery
1ce7ae391e i965: Use the brw_context for the clear color and value setters
Do what all the other functions in the miptree API do.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-09 10:56:48 -07:00
Bas Vermeulen
c63bef15fc radeonsi: convert dispatch packet to little endian
The parameters for the compute engine are wrong when using
an E8860 on a big endian machine.
To fix this, convert the contents of struct dispatch_packet
to little endian.

This ensures that get_global_id(0) and similar functions
in the OpenCL code get the correct endian values, and
makes my simple OpenCL program work correctly.

Signed-off-by: Bas Vermeulen <bas@daedalean.ai>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2018-04-09 13:47:52 -04:00
Bas Vermeulen
be628e4749 radeonsi: correct si_vgt_param_key on big endian machines
Using mesa OpenCL failed on a big endian PowerPC machine because
si_vgt_param_key is using bitfields and a 32 bit int for an
index into an array.

Fix si_vgt_param_key to work correctly on both little endian
and big endian machines.

Signed-off-by: Bas Vermeulen <bas@daedalean.ai>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-09 13:42:30 -04:00
Marek Olšák
f33e4482b3 radeonsi: don't set RB+ registers on GFX9 chips without RB+
CLEAR_STATE initializes them properly.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-09 13:40:25 -04:00
Emil Velikov
ea2536cd26 etnaviv: meson: add etnaviv_query_pm.[ch] to the sources
Otherwise building the driver will fail with unresolved symbols.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105960
Fixes: 72d2043be0 ("etnaviv: add perfmon query implementation")
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Clayton Craft <clayton.a.craft@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-09 19:09:24 +02:00
Xiong, James
f23b45dce3 i965: return the fourcc saved in __DRIimage when possible
When creating a image from a texture, the image's dri_format is
set to the first plane's format, and used to look up for the
fourcc. e.g. for FOURCC_NV12 texture, the dri_format is set to
__DRI_IMAGE_FORMAT_R8, we end up with a wrong entry in function
intel_lookup_fourcc():
   { __DRI_IMAGE_FOURCC_R8, __DRI_IMAGE_COMPONENTS_R, 1,
     { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 }, } },
instead of the correct one:
   { __DRI_IMAGE_FOURCC_NV12, __DRI_IMAGE_COMPONENTS_Y_UV, 2,
     { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 },
       { 1, 1, 1, __DRI_IMAGE_FORMAT_GR88, 2 } } },
as a result, a wrong fourcc __DRI_IMAGE_FOURCC_R8 was returned.

To fix this bug, the image inherits the texture's planar_format that
has the original fourcc; Upon querying, if planar_format is set,
return the saved fourcc; Otherwise fall back to the old way.

v3: add a bug description and "cc mesa-stable" tag (Jason)
  remove redundant null pointer check (Tapani)
  squash 2 patches into one (James)
v2: fall back to intel_lookup_fourcc() when planar_format is NULL
  (Dongwon & Matt Roper)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Xiong, James <james.xiong@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-09 18:16:59 +03:00
Bastien Orivel
42c2f5b579 nir: Fix a typo in src/compiler/Makefile.nir.am
Since 31d91f019b, the makefile tries to
find the file SConstript.spirv instead of SConscript.spirv which breaks
the make dist command.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-09 08:32:45 -06:00
Samuel Pitoiset
04e609f1f8 radv: fix prefetching of vertex shader and VBOs on SI
Forgot one check... Too many mistakes for a simple change.

Fixes: f1d7c16e85 ("radv: fix prefetching compute shaders on CIK and older chips")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 16:14:12 +02:00
Samuel Pitoiset
56a4d03b0c radv: implement VK_AMD_shader_core_properties
Simple extension that only returns information for AMD hw.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
466aba9fa2 radv: add RADV_NUM_PHYSICAL_VGPRS constant
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
2f7bb93146 radv: add radv_get_num_physical_sgprs() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Samuel Pitoiset
b30dec738a vulkan: Update the XML and headers to 1.1.72
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 14:28:13 +02:00
Andres Gomez
a055f5108d docs: properly escape characters
Signed-off-by: Andres Gomez <agomez@igalia.com>
2018-04-09 13:47:40 +03:00
Andres Gomez
7cf3932098 mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage
Fixes: 03fd6704db ("mesa: Add support for a new override string
MESA_GLES_VERSION_OVERRIDE")

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-09 13:47:40 +03:00
Marek Olšák
806ab42c0f mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override
v2:
 - Provide a correct explanation on the envvars documentation (Ian).
 - Provide a more correct explanation on the function comments (Andres).
v3:
 - Homogenize documentation and inline comments (Emil).
 - Correct a typo (Emil).

Fixes: 2599b92eb9 ("mesa: allow forcing >=3.1 compatibility contexts
with MESA_GL_VERSION_OVERRIDE")

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-09 13:47:40 +03:00
Andres Gomez
c6067fcd07 dri_util: don't fail when not supporting ARB_compatibility with GL3.1
Currently, any driver that does not support the ARB_compatibility
extension will fail on GL3.1 context creation if the application does
not request the forward-compatiblity flag.

Restore the original check which changes mesa_api to API_OPENGL_CORE,
only when:
 - GL3.1 is requested, without the forward-compatiblity flag.
 - driver does not support ARB_compatibility - as deduced by
max_gl_compat_version.

Fixes: a0c8b49284 ("mesa: enable OpenGL 3.1 with ARB_compatibility")

v2:
 - Improve commit log (Emil).
 - Provide a correct explanation on the features documentation (Ian).

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-09 13:46:34 +03:00
Andres Gomez
044acd3569 dri_util: when overriding, always reset the core version
This way we won't fail when validating just because we may have a non
overriden core version that is lower than the requested one, even when
the compat version is high enough.

For example, running glcts from VK-GL-CTS with i965, this will
succeed:

$ MESA_GL_VERSION_OVERRIDE=4.6 ./glcts --deqp-case=KHR-GL46.info.vendor

While, this will fail:

$ MESA_GL_VERSION_OVERRIDE=4.6COMPAT ./glcts --deqp-case=KHR-GL46.info.vendor

Fixes: 464c56d3d5 ("dri_util: Use
_mesa_override_gl_version_contextless")

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-04-09 13:18:16 +03:00
Samuel Pitoiset
b0f8ad189c radv: add radv_image_is_tc_compat_htile() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:26 +02:00
Samuel Pitoiset
95d5ad80e9 radv: add radv_use_dcc_for_image() helper
And add some TODOs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:24 +02:00
Samuel Pitoiset
fab5fe4284 radv: rename radv_image_is_tc_compat_htile()
... to radv_use_tc_compat_htile_for_image(). This function
name makes more sense to me because we want to know if and
only if TC-compat HTILE should be used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:21 +02:00
Samuel Pitoiset
2692736cee radv: simplify a check in radv_initialise_color_surface()
If the image has FMASK metadata, the number of samples is > 1
because radv_image_can_enable_fmask() handles that already.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:16 +02:00
Samuel Pitoiset
ed41e776d0 radv: clean up radv_vi_dcc_enabled()
And rename to radv_dcc_enabled() to be consistent.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:14 +02:00
Samuel Pitoiset
e213f19907 radv: clean up radv_htile_enabled()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:12 +02:00
Samuel Pitoiset
0fc9113ac5 radv: add radv_image_has_{cmask,fmask,dcc,htile}() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:10 +02:00
Samuel Pitoiset
32f5174ce8 radv: add radv_get_cmask_fast_clear_value() helper
DCC for MSAA textures are currently unsupported but that will
be used later on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:08 +02:00
Samuel Pitoiset
f882c62218 radv: add radv_clear_{cmask,dcc} helpers
They will help for DCC MSAA textures and if we support mipmaps
in the future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-09 11:21:05 +02:00
Axel Davy
d899826733 st/nine: Do not use scratch for face register
Scratch registers are reused every instructions.
Since vFace is reused, a new temporary register
should be used.

Fixes: https://github.com/iXit/Mesa-3D/issues/311

Signed-off-by: Axel Davy <davyaxel0@gmail.com>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-04-08 22:49:43 +02:00
Christian Gmeiner
9e80273693 etnaviv: expose perfmon query groups
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:45 +02:00
Christian Gmeiner
c320b158f5 etnaviv: add query_group_info for perfmon counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:38 +02:00
Christian Gmeiner
5a3b744ed2 etnaviv: assign group_ids to perfmon queries
Prep work for AMD_performance_monitor support.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:23:34 +02:00
Christian Gmeiner
4020fa3e08 etnaviv: support MC performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:40 +02:00
Christian Gmeiner
3c3f936ae1 etnaviv: support TX performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:12 +02:00
Christian Gmeiner
f380ce13f0 etnaviv: support RA performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:21:04 +02:00
Christian Gmeiner
3af0e228e5 etnaviv: support SE performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:50 +02:00
Christian Gmeiner
9ae86c1306 etnaviv: support PA performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:46 +02:00
Christian Gmeiner
69bebe06e3 etnaviv: support SH performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:42 +02:00
Christian Gmeiner
1f603402f6 etnaviv: support PE performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:37 +02:00
Christian Gmeiner
d0bed0b494 etnaviv: support HI performance counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:32 +02:00
Christian Gmeiner
72d2043be0 etnaviv: add perfmon query implementation
Add needed infrastructure to use performance monitor
requests for queries.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
2018-04-08 22:20:25 +02:00
Christian Gmeiner
7e3dba301e etnaviv: sw queries: return correct number of groups
Fixes: 3d912bd742 ("etnaviv: add query_group_info for sw counters")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:13:04 +02:00
Lucas Stach
208891650b etnaviv: advertise YUV formats as external only
We only support importing YUV as OES external resources.
This will change in the future, but for now this fixes the
advertised capabilities in eglQueryDmaBufModifiersEXT.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:11:46 +02:00
Lucas Stach
dfe4a08ccd gallium/util: implement util_format_is_yuv
This adds a helper to check if a pipe format is in YUV color space.
Drivers want to know about this, as YUV mostly needs special handling.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-04-08 22:10:57 +02:00
Rhys Perry
19254a977b nvc0: finish implementation of PIPE_QUERY_SO_OVERFLOW_PREDICATE
This also removes some useless code leftover from old changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Rhys Perry
14cc8c55ea nvc0: change ACQUIRE_EQUAL to ACQUIRE_GEQUAL in nvc0_hw_query_fifo_wait
If a fence is created in between nvc0_hw_end_query and
nvc0_hw_query_fifo_wait, the sequence number in nvc0->screen->fence.bo can
be larger than hq->fence->sequence before the semaphore is created,
resulting in the semaphore never being triggered.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Rhys Perry
98d15e0550 nvc0: ensure the query's fence has been emitted in nvc0_hw_query_fifo_wait
If the fence has not been emitted, hq->fence->sequence would be zero. This
would result in the semaphore never being triggered, blocking all later
commands in the pushbuf.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
[imirkin: use nouveau_fence_emit instead]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Ilia Mirkin
90bb2d7152 st/mesa: tex offsets can't be in a const or 2d-indexed
All consts are now implicitly 2d (they set .Dimension), so trigger
asserts. Also, the texture offset can't handle any sort of 2d indexing.
While this could be tacked on, this seems unnecessary, just move it off
into a separate temp.

Fixes assertion failure in
tests/spec/arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag

Note that this was an issue even before the const-always-2d thing, since
there was no detection of when even a proper second dimension was used,
e.g. for UBO or geom/tess inputs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-07 16:45:00 -04:00
Ilia Mirkin
2a2b22e9b1 nvc0: restore image binding on RGB10A2, remove from BGR10A2
Fixes a bunch of new CTS pbo tests that use those as an output format,
which the state tracker converts into buffer image writes.

No part of the driver is ready for BGR10A2. It could probably be enabled
on Maxwell+, but seems unnecessary. This error was introduced when
flipping the displayable bit on those formats, which accidentally also
moved the image bit.

Fixes: e1a70aed10 (nv50,nvc0: mark ABGR format as displayable instead of ARGB format)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-07 16:45:00 -04:00
Rob Clark
684f7cd7e3 freedreno/ir3: use lower_global_vars_to_local in cmdline compiler
tgsi_to_nir emits things with arrays as global vars.. and nir->ir3 does
lower_locals_to_regs.  But nothing was lowering global to local, which
breaks compiling tgsi shaders

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-07 11:33:41 -04:00
Kenneth Graunke
a3782a612f i965: Use %x instead of %u in debug print.
I mistakenly printed out the address as 0x<decimal number> instead of
printing a proper hex number.  This was...surprising.
2018-04-06 22:57:48 -07:00
Dylan Baker
b5f92b6fd4 meson: fix warnings about comparing unlike types
In the old days (0.42.x), when mesa's meson system was written the
recommendation for handling conditional dependencies was to define them
as empty lists. When meson would evaluate the dependencies of a target
it would recursively flatten all of the arguments, and empty lists would
be removed. There are some problems with this, among them that lists and
dependencies have different methods (namely .found()), so the
recommendation changed to use `dependency('', required : false)` for
such cases.  This has the advantage of providing a .found() method, so
there is no need to do things like `dep_foo != [] and dep_foo.found()`,
such a dependency should never exist.

I've tested this with 0.42 (the minimum we claim to support) and 0.45.
On 0.45 this removes warnings about comparing unlike types, such as:

meson.build:1337: WARNING: Trying to compare values of different types
(DependencyHolder, list) using !=.

v2: - Use dependency('', required : false) instead of
      declare_dependency(), the later will always report that it is
      found, which is not what we want.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-04-06 15:29:53 -07:00
Ian Romanick
81ed629b38 intel/compiler: Explicitly cast register type in switch
brw_reg::type is "enum brw_reg_type type:4".  For whatever reason, GCC
is treating this as an int instead of an enum.  As a result, it doesn't
detect missing switch cases and it doesn't detect that flow can get out
of the switch.

This silences the warning:

src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-04-06 15:22:10 -07:00
Axel Davy
39240926cd st/nine: Declare lighting consts for ff shaders
The lighting constants were not declared previously,
but were accessed with indirect addressing, which is
illegal.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105442

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-04-06 23:34:31 +02:00
Caio Marcelo de Oliveira Filho
67c728f7a9 nir: rename variables in nir_lower_io_to_temporaries for clarity
In the emit_copies() function, the use of "newv" and "temp" names made
sense when only copies from temporaries to the new variables were
being done. But now there are other calls to copy with other pairings,
and "temp" doesn't always refer to a temporary created in this
pass. Use the names "dest" and "src" instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-06 11:08:08 -07:00
Samuel Pitoiset
8f9f62c2db radv: don't pass the pipeline to radv_flush_constants()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:27 +02:00
Samuel Pitoiset
2bd50cceff radv: rename radv_cmd_buffer_update_vertex_descriptors()
... to radv_flush_vertex_descriptors().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-06 19:46:23 +02:00
Samuel Pitoiset
e829a0cc1e radv: do not try to skip draw calls when VBOs upload failed
This is unnecessary because we record an error which should
be returned by vkEndCommandBuffer(), and the app shouldn't
submit a command buffer when this happens.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:21 +02:00
Samuel Pitoiset
f1d7c16e85 radv: fix prefetching compute shaders on CIK and older chips
Because the check was moved to radv_emit_prefetch_L2().

Fixes: 4ad7595f35 ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 19:46:18 +02:00
Samuel Pitoiset
7fe586f6fb radv: only enable PERFECT_ZPASS_COUNTS for precision occlusion queries
This unnecessary when the precision bit flag is not set, and this
might hurt performance. The Vulkan explains that not setting
VK_QUERY_CONTROL_PRECISE_BIT might be more efficient on some
implementations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 09:07:34 +02:00
Samuel Pitoiset
d53dff3bfc radv: enable the Polaris small primitive filter control
Enable it directly in the preamble, but do not enable line
on Polaris10/11/12 because there is a hw bug.

There is possibly an issue when MSAA is off, but this doesn't
regress any CTS and AMDVLK doesn't have a workaround as well.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-06 09:07:31 +02:00
Jason Ekstrand
c5b87c94d8 anv: Add WSI support for the I915_FORMAT_MOD_Y_TILED_CCS
v2 (Jason Ekstrand):
 - Return the correct enum values from anv_layout_to_fast_clear_type

v3 (Jason Ekstrand):
 - Always return ANV_FAST_CLEAR_NONE and leave doing the right thing for
   the patch which adds a modifier which supports fast-clears.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Daniel Stone <daniels@collabora.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-05 21:17:02 -07:00
Anuj Phogat
ff8b82666a Add more Coffee Lake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-04-05 14:50:11 -07:00
Jan Vesely
2406e8848e radeonsi: Reorder checks in si_check_render_feedback
si_get_total_colormask accesses NULL pointer on compute shaders
Fixes crashes on clover
Fixes: 0669dca9c0 ("radeonsi: skip DCC render feedback checking if color writes are disabled")
CC: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-05 17:11:18 -04:00
Kevin Rogovin
cc41603d6d intel/tools: new intel_sanitize_gpu tool
Adds a new debug tool to pad each GEM BO allocated with (weak)
pseudo-random noise values which are then checked after each
batchbuffer dispatch to the kernel. This can be quite valuable to
find diffucult to track down heisenberg style bugs.

[scott.d.phillips@intel.com: split to separate tool]

v2: (by Scott D Phillips)
    - track gem handles per fd (Kevin)
    - remove handles on GEM_CLOSE (Kevin)
    - ignore prime handles
    - meson & shell script

v3: (by Scott D Phillips)
    - don't track prime bos at all (Kevin)
    - protect the hash table with a mutex (Kevin)
    - hook fds by drm_version.name, not path (Chris Wilson)

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-04-05 13:52:49 -07:00
Jason Ekstrand
e85b95269e prog/nir: Simplify some load/store operations
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-05 13:20:39 -07:00
Marek Olšák
c7dd59b06d radeonsi: fix a crash if ps_shader.cso is NULL in si_get_total_colormask 2018-04-05 15:53:52 -04:00
Marek Olšák
be4250aa88 radeonsi: remove more R600 references
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c0dfc0c6df radeonsi: try to fix android
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f55d1f806e radeonsi: try to fix meson
This is not fully tested. Meson can't link LLVM even though automake can.

PATH=/usr/llvm/x86_64-linux-gnu/bin:$PATH meson build/ -Dgallium-va=false \
    -Dplatforms=x11,drm -Dgallium-drivers=radeonsi -Ddri-drivers= \
    -Dgallium-omx=disabled -Dgallium-xvmc=false -Dgles1=false \
    -Dtexture-float=true -Dvulkan-drivers=

src/gallium/auxiliary/libgallium.a(gallivm_lp_bld_misc.cpp.o):
(.data.rel.ro._ZTI26DelegatingJITMemoryManager[_ZTI26DelegatingJITMemoryManager]+0x10):
undefined reference to `typeinfo for llvm::RTDyldMemoryManager'

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
38faac43e3 radeonsi: don't build libradeon.la separately
for better parallelism

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f9323ddbb9 radeonsi: clean up GET_MAX_VIEWPORT_RANGE definition
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
6a93441295 radeonsi: remove r600_common_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5f77361d2e radeonsi: remove r600_pipe_common::screen
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
321bd6c280 radeonsi: move r600_buffer_common.c and r600_texture.c into radeonsi
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
d58080b318 radeonsi: move r600_gpu_load.c to si_gpu_load.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f7f4ba5306 radeonsi: move r600_query.c/h files to si_query.c/h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5777488406 radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
eced536ed6 radeonsi: rename query definitions R600_ -> SI_
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
72e9e98076 radeonsi: move and rename R600_ERR out of r600_pipe_common.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
076afb4f0e radeonsi: rename a few R600/r600_ -> SI_/si_
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5f1cddde78 radeonsi: move definitions out of r600_pipe_common.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
a67ee02388 radeonsi: move functions out of and remove r600_pipe_common.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
90d12f1d77 radeonsi: rename r600 -> si in some places
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
50c7aa6756 radeonsi: use si_context instead of pipe_context in parameters pt3
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e332ba61f4 radeonsi: use si_context instead of pipe_context in parameters pt2
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c424f86180 radeonsi: use si_context instead of pipe_context in parameters pt1
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2a62e5eec9 radeonsi: pass sctx to si_rebind_buffer and clean up
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
605ba1b9ae radeonsi: use r600_common_context less pt7
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0b2f2a6a18 radeonsi: use r600_common_context less pt6
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
4c5efc40f4 radeonsi: update copyrights
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
95bc30275b radeonsi: switch radeon_add_to_buffer_list parameter to si_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e5053060eb radeonsi: use r600_common_context less pt5
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
884fd97f6b radeonsi: use r600_common_context less pt4
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
a8291a23c5 radeonsi: use r600_common_context less pt3
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3069cb8b78 radeonsi: use r600_common_context less pt2
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
71d9028b7a radeonsi: use r600_common_context less pt1
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0606190059 radeonsi: don't use r600_common_context in si_emit_cache_flush
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3de323f9bb radeonsi: switch r600_atom::emit parameter to si_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2b70dd8c8a radeonsi: flatten / remove struct r600_ring
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
f7de8686de radeonsi: remove r600_ring::flush callback
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
4598ad6a00 radeonsi: make radeon_add_to_buffer_list_check_mem be gfx-only
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
426ef367f3 radeonsi: add_to_buffer_list functions can return void
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
c0987d8adf radeonsi: move saved_cs functions from r600_pipe_common.c to si_debug.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
37ef4765ff radeonsi: move DMA CS functions from r600_pipe_common.c to si_dma_cs.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
19f550f1d2 radeonsi: move EOP event code from r600_pipe_common.c to si_fence.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fc6a44e169 radeonsi: rename si_hw_context.c -> si_gfx_cs.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
42500d1dab radeonsi: move si_destroy_saved_cs to si_debug.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
02a61e71a2 radeonsi: rename si_begin_new_cs -> si_begin_new_gfx_cs
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fa09388704 radeonsi: rename si_need_cs_space -> si_need_gfx_cs_space
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
85e75b2da5 radeonsi: remove r600_pipe_common::blit_decompress_depth
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
e04389cc2a radeonsi: remove r600_pipe_common::decompress_dcc
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
9d7f809c03 radeonsi: remove r600_pipe_common::invalidate_buffer
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
898500c440 radeonsi: remove r600_pipe_common::rebind_buffer
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
fbf1bf9b8f radeonsi: remove r600_common_context::set_occlusion_query_state
and remove unused old_enable parameter.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5ed8b54ffe radeonsi: remove r600_pipe_common::save_qbo_state
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
72842d15ac radeonsi: remove unused query code
The get_size perf counter callback is also inlined and removed.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
3f55fe99d6 radeonsi: use num_cs_dw_queries_suspend
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
54f28359b5 radeonsi: remove r600_pipe_common::need_gfx_cs_space
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0447e8e59e radeonsi: remove r600_pipe_common::set_atom_dirty
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
5c125ab1ba radeonsi: remove r600_pipe_common::check_vm_faults
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
17e8f1608e radeonsi: call CS flush functions directly whenever possible
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
0669dca9c0 radeonsi: skip DCC render feedback checking if color writes are disabled 2018-04-05 15:34:58 -04:00
Dylan Baker
6ac87c1769 meson: fix megadriver symlinking
Which should be relative instead of absolute.

Fixes: f7f1b30f81
       ("meson: extend install_megadrivers script to handle symmlinking")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105567
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-05 10:48:38 -07:00
Dylan Baker
19dbed6477 meson: Set .so version for xa like autotools does
Fixes: 0ba909f0f1
       ("meson: build gallium xa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-04-05 10:46:14 -07:00
Rafael Antognolli
7728720f07 anv: Make blorp update the clear color.
Instead of updating the clear color in anv before a resolve, just let
blorp handle that for us during fast clears.

v5: Update comment about HiZ clear color (Jordan).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
e8cadb673d anv: Use clear address for HiZ fast clears too.
Store the default clear address for HiZ fast clears on a global bo, and
point to it when needed.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
021e1885d0 anv: Emit the fast clear color address, instead of value.
On Gen10+, instead of copying the clear color from the state buffer to
the surface state, just use the address of the state buffer in the
surface state directly. This way we can avoid the copy from state buffer
to surface state.

v4:
 - Remove use_clear_address from anv code. (Jason)
 - Use the helper to extract clear color from attachment (Jason)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
3f96b459f4 anv: Add a helper to extract clear color from the attachment.
Extract the code from color_attachment_compute_aux_usage, so we can
later reuse it to update the clear color state buffer.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
7987d041fd i965/surface_state: Emit the clear color address instead of value.
On Gen10, when emitting the surface state, use the value stored in the
clear color entry buffer by using a clear color address in the surface
state.

v4: Use the clear color offset from the clear_color_bo, when available.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
2efe8309d3 i965/blorp: Update the fast clear value buffer.
On Gen10, whenever we do a fast clear, blorp will update the clear color
state buffer for us, as long as we set the clear color address
correctly.

However, on a hiz clear, if the surface is already on the fast clear
state we skip the actual fast clear operation and, before gen10, only
updated the miptree. On gen10+ we need to update the clear value state
buffer too, since blorp will not be doing a fast clear and updating it
for us.

v4:
 - do not use clear_value_size in the for loop
 - Get the address of the clear color from the aux buffer or the
 clear_color_bo, depending on which one is available.
 - let core blorp update the clear color, but also update it when we
 skip a fast clear depth.

v5: Better subject (Jordan).
v6: Remove outdated comment (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
5449f942f2 i965: Add aux_buf variable to simplify code.
In a follow up patch, we make use of clear_color_bo, which is in
mt->mcs_buf or mt->hiz_buf. To avoid duplicating more code that does the
same thing on both aux buffers, just use aux_buf already.

v5: Add aux_buf to brw_wm_surface_state too.
v6: Drop aux_surf and use aux_buf->surf instead (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
8735c86ce0 i965/miptree: Add new clear color BO for winsys aux buffers
Add an extra BO to store clear color when we receive the aux buffer from
the window system. Since we have no control over the aux buffer size in
this case, we need the new BO to store only the clear color.

v5:
 - Better subject (Jordan).
 - Drop alignment from brw_bo_alloc().

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
ab633c2d61 i965/miptree: Add space to store the clear value in the aux surface.
Similarly to vulkan where we store the clear value in the aux surface,
we can do the same in GL.

v2: Remove unneeded extra function.
v3: Use clear_value_state_size instead of clear_value_size.
v4:
 - rename to clear_color_state_size
 - store clear_color_bo and clear_color_offset in the aux buf struct
v5: Unreference clear color bo (Jordan)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
14260e7c60 intel/blorp: Update clear color state buffer during fast clears.
We always want to update the fast clear color during a fast clear on
i965. On anv, we are doing that before a resolve, but by adding support
to blorp, we can do a similar thing and update it during a fast clear
instead.

The goal is to remove some code from anv that does such update, and
centralize everything in blorp, hopefully removing a lot of code
duplication. It also allows us to have a similar behavior on gen < 9 and
gen >= 10.

v5: s/we/we are/ (Jordan)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
92eb5bbc68 intel/blorp: Only copy clear color when doing a resolve.
We only need to copy the clear color from the state buffer to the
inlined surface state when doing a resolve.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
188a473b9a intel/blorp: Add support for fast clear address.
On gen10+, if surface->clear_color_addr is present, use it directly
intead of copying it to the surface state.

v4: Remove redundant #if clause for GEN <= 10 (Jason)
v5: Move flush after the reloc, and keep lower bits (Topi).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
b8f45cf967 intel/isl: Add support to emit clear value address.
gen10 can emit the clear color by setting it on a buffer somewhere, and
then adding only the address to the surface state.

This commit add support for that on isl_surf_fill_state, and if that is
requested, skip setting the clear value itself.

v2: Add assert to make sure we are at least on gen10.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
94675edcfd intel: Use Clear Color struct size.
The size of the clear color struct (expected by the hardware) is 8
dwords (isl_dev.ss.clear_value_state_size here). But we still need to
track the size of the clear color, used when memcopying it to/from the
state buffer. For that we keep isl_dev.ss.clear_value_size.

v4:
 - Add struct to gen11 too (Jason, Jordan)
 - Add field for Converted Clear Color to gen11 (Jason)
 - Add clear_color_state_offset to differentiate from
   clear_value_offset.
 - Fix all the places where clear_value_size was used.

v5 (Jason):
 - Split genxml changes to another commit.
 - Remove unnecessary gen checks.
 - Bring back missing offset increment to init_fast_clear_color().

v6 (Jason):
 - On init_fast_clear_color, change:
   addr.offset += 4 => sdi.Address.offset += i * 4
 - Use GEN_GEN instead of GEN_VERSIONx10.

[jordan.l.justen@intel.com: isl_device_init changes]
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
f77789a3f0 intel/genxml: Add Clear Color struct to gen10+.
v5: Split genxml changes into its own commit (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
7e616ae201 intel/genxml: Use a single field for clear color address on gen10.
genxml does not support having two address fields with different names
but same position in the state struct. Both "Clear Color Address"
and "Clear Depth Address Low" mean the same thing, only for different
surface types.

To workaround this genxml limitation, rename "Clear Color Address"
to "Clear Value Address" and use it for both color and depth. Do the
same for the high bits.

TODO: add support for multiple addresses at the same position in the
xml.

v2: Combine high and low order bits into a single address field.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
8e1f2e1d2d genxml: Preserve fields that share dword space with addresses.
Some instructions contain fields that are either an address or a value
of some type based on the content of other fields, such as clear color
values vs address. That works fine if these fields are in the less
significant dword, the lower 32 bits of the address, because they get
OR'ed with the address. But if they are in the higher 32 bits, they get
discarded.

On Gen10 we have fields that share space with the higher 16 bits of the
address too. This commit makes sure those fields don't get discarded.

v5: Remove spurious whitespace (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
f421a31637 anv/image: Do not override lower bits of dword.
The lower bits seem to have extra fields in every platform but gen8
(even though we don't use them in gen9). So just go ahead and avoid
using them for the address.

v4: Use Jason's suggestion for comment explaining the change.
v5: Fix aux_address comment in anv_private.h (Jason)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-05 07:42:45 -07:00
Samuel Pitoiset
942fdfe357 radv: implement a fast prefetch path for the vertex stage
This allows to start draws as soon as possible.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:48 +02:00
Samuel Pitoiset
4ad7595f35 radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:45 +02:00
Samuel Pitoiset
a8a696a38f radv: use a mask for VBOs and shaders prefetching
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-05 10:03:42 +02:00
Marek Olšák
8cd58df2f2 gallium/pp: fix MLAA shaders
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549
2018-04-04 20:01:43 -04:00
Marek Olšák
096942be2c gallium/pp: use user constant buffers
This fixes a radeonsi crash.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026
2018-04-04 20:01:43 -04:00
Marek Olšák
d9dc26c94e st/mesa: set stencil border color the same as intensity
This fixes some stencil border color tests on Vega and Raven chips.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-04-04 16:55:52 -04:00
Jon Turney
498d9d0f4d Fix use of alloca() without #include <c99_alloca.h>
Fix use of alloca() without #include <c99_alloca.h> in 1da345e5

vbo/vbo_context.c: In function '_vbo_draw_indirect':
vbo/vbo_context.c:284:34: error: implicit declaration of function 'alloca' [-Werror=implicit-function-declaration]
       struct _mesa_prim *space = alloca(draw_count*sizeof(struct _mesa_prim));
                                  ^~~~~~
vbo/vbo_context.c:284:34: warning: initialization makes pointer from integer without a cast [-Wint-conversion]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-04-04 14:34:07 +01:00
Samuel Pitoiset
922cd38172 radv: implement out-of-order rasterization when it's safe on VI+
Disabled by default for now, it can be enabled with
RADV_PERFTEST=outoforder.

No CTS regressions on Polaris, and all Vulkan games I tested
look good as well.

Expect small performance improvements for applications where
out-of-order rasterization can be enabled by the driver.

Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
d6709c91a6 radv: change blend_enable field to use four bits per CB
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
a8818d1af2 radv: scan which color blend attachments are enabled
With cb_target_enabled_4bit in order to have four bits per CB.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
ac456d0d1b radv: put more fields in radv_blend_state
Some will be used for further optimizations (ie. out-of-order rast).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
e4976ca33b radv: do not always disable dual quad mode when chip has RbPlus
For GFX9+ only, RadeonSI does this too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
b8c06a961c radv: don't use the SPI barrier management bug workaround
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Samuel Pitoiset
ab147cba77 radv: mask out high VM address bits in registers where needed
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-04 13:32:00 +02:00
Lionel Landwerlin
1beb80cb56 intel: compiler: silence compiler warning
../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]

Introduced by 8f83eea71e ("i965: Add negative_equals methods").

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-04-04 11:57:39 +01:00
Iago Toral Quiroga
41ac0b1443 compiler/spirv: set is_shadow for depth comparitor sampling opcodes
From the SPIR-V spec, OpTypeImage:

"Depth is whether or not this image is a depth image. (Note that
 whether or not depth comparisons are actually done is a property of
 the sampling opcode, not of this type declaration.)"

The sampling opcodes that specify depth comparisons are
OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set
is_shadow only for these (we were using the deph property of the
image until now).

v2:
 - Do the same for OpImageDrefGather.
 - Set is_shadow to false if the sampling opcode is not one of these (Jason)
 - Reuse an existing switch statement instead of adding a new one (Jason)

Fixes crashes in:
dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2018-04-04 07:57:58 +02:00
Sergii Romantsov
98b860e311 i965: Extend the negative 32-bit deltas to 64-bits
Gen8+ use 48-bit address relocations so need to extend the sign
to 64-bit return value. Without it we have higher bits zeroed
and missing the negavive values.
Haswell and older use 32-bit deltas so are unaffected by this issue.

v2:
  used int32_t fucntion parameter instead of explicit type conversion.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Andriy Khulap <andriy.khulap@globallogic.com>
Tested-by: Stuart Young <cefiar@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
2018-04-03 22:48:09 -07:00
Jason Ekstrand
800df942ea nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination
Otherwise we may end up trying to coalesce in a case such as

ssa_1 = fadd r1, r2
r3.x = fneg(r2);
r3 = vec4(ssa_1, ssa_1.y, ...)

and that would cause us to move the writes to r3 from the vec to the
fadd which would re-order them with respect to the write from the fneg.
In order to solve this, we just don't coalesce if the destination of the
vec is not SSA.  We could try to get clever and still coalesce if there
are no writes to the destination of the vec between the vec and the ALU
source.  However, since registers only come from phi webs and indirects,
the chances of having a vec with a register destination that is actually
coalescable into its source is very slim.

Shader-db results on Haswell:

    total instructions in shared programs: 13657906 -> 13659101 (<.01%)
    instructions in affected programs: 149291 -> 150486 (0.80%)
    helped: 0
    HURT: 592

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
Fixes: 2458ea95c5 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"
Reported-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-04-03 22:21:23 -07:00
Kevin Strasser
5bbde9b80f anv: Fix close(fd) before import issue in vkCreateDmaBufImageINTEL
If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel
will hit a -EBADF error. Move the close(fd) call to the end of
anv_CreateDmaBufImageINTEL().

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-03 18:33:17 -07:00
Timothy Arceri
b42633db8e glsl: always call do_lower_jumps() after loop unrolling
This fixes a bug in radeonsi where LLVM cannot handle the case where
a break exists but its not the last instruction in the block.

LLVM would fail with:
Terminator found in the middle of a basic block!
LLVM ERROR: Broken function found, compilation aborted!

Fixes: 96fe8834f5 "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively"

Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317
2018-04-04 08:40:16 +10:00
James Legg
a58fdc61e9 vulkan/wsi/wayland: fix leaks
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: Jason Ekstrand <jason@jlekstrand.net>
2018-04-03 22:09:57 +01:00
Juan A. Suarez Romero
06076ead28 docs: update calendar, add news and link release notes to 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-03 17:38:36 +00:00
Juan A. Suarez Romero
ca71b7bab8 docs: add sha256 checksums for 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ba371c7262)
2018-04-03 17:34:16 +00:00
Juan A. Suarez Romero
d89ef8ce62 docs: add release notes for 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3bf5c10c5c)
2018-04-03 17:34:16 +00:00
Jakob Bornecrantz
88e958257c st/mesa: Also use PIPE_FORMAT_R8G8B8A8_SRGB for framebuffer_sRGB.
When running virgl on a GLES host the only sRGB formats that support
rendering is RGBA and RGBX. That pipe format is in the sRGB default
lists that the state tracker uses when mapping mesa formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-04-03 17:48:52 +01:00
Lionel Landwerlin
78c18d99dc intel: gen-decoder: print all dword a field belongs to
Prior to printing a decoded field, print out all dwords that field
belongs to. In particular with address fields spanning multiple
dwords, we want to have all the dwords presented before the field is
decoded to make it easier to read.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
4d59127213 intel: genxml: decode variable length MI_LRI
MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one
command. In our drivers we only use one tuple at a time, but the
kernel might load more than one at a time.

Instead of making all the tuple part of a group, we leave out the
first tuple (the one we use in the generated packing structures).

This is particularly useful for looking at error stats generated by
the kernel.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
2841af6238 intel: gen-decoder: don't decode fields beyond a dword length
For example, a PIPE_CONTROL with DWordLength = 2 should look like
this :

0xffffe374:  0x7a000002:  PIPE_CONTROL
0xffffe374:  0x7a000002 : Dword 0
    DWord Length: 2
0xffffe378:  0x00800000 : Dword 1
    Depth Cache Flush Enable: false
    Stall At Pixel Scoreboard: false
    State Cache Invalidation Enable: false
    Constant Cache Invalidation Enable: false
    VF Cache Invalidation Enable: false
    DC Flush Enable: false
    Pipe Control Flush Enable: false
    Notify Enable: false
    Indirect State Pointers Disable: false
    Texture Cache Invalidation Enable: false
    Instruction Cache Invalidate Enable: false
    Render Target Cache Flush Enable: false
    Depth Stall Enable: false
    Post Sync Operation: 0 (No Write)
    Generic Media State Clear: false
    TLB Invalidate: false
    Global Snapshot Count Reset: false
    Command Streamer Stall Enable: false
    Store Data Index: 0
    LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
    Destination Address Type: 0 (PPGTT)
    Flush LLC: false
0xffffe37c:  0x00000000 : Dword 2
    Address: 0x00000000
0xffffe384:  0x05000000:  MI_BATCH_BUFFER_END

Prior to this change, fields beyond the length of the command would be
decoded (notice the MI_BATCH_BUFFER_END decoded as part of the
previous PIPE_CONTROL) :

0xffffe374:  0x7a000002:  PIPE_CONTROL
0xffffe374:  0x7a000002 : Dword 0
    DWord Length: 2
0xffffe378:  0x00800000 : Dword 1
    Depth Cache Flush Enable: false
    Stall At Pixel Scoreboard: false
    State Cache Invalidation Enable: false
    Constant Cache Invalidation Enable: false
    VF Cache Invalidation Enable: false
    DC Flush Enable: false
    Pipe Control Flush Enable: false
    Notify Enable: false
    Indirect State Pointers Disable: false
    Texture Cache Invalidation Enable: false
    Instruction Cache Invalidate Enable: false
    Render Target Cache Flush Enable: false
    Depth Stall Enable: false
    Post Sync Operation: 0 (No Write)
    Generic Media State Clear: false
    TLB Invalidate: false
    Global Snapshot Count Reset: false
    Command Streamer Stall Enable: false
    Store Data Index: 0
    LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
    Destination Address Type: 0 (PPGTT)
    Flush LLC: false
0xffffe37c:  0x00000000 : Dword 2
    Address: 0x00000000
0xffffe380:  0x00000000 : Dword 3
0xffffe384:  0x05000000 : Dword 4
    Immediate Data: 83886080
0xffffe384:  0x05000000:  MI_BATCH_BUFFER_END

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
81375516b2 intel: error_decode: add an option to decode all buffers
The kernel reports workaround batch buffers, but we're not presenting
them currently. Also they might not be useful for debugging purely
userspace driver issues, when problems arise because of interactions
between kernel & userspace drivers, it's nice to be able to decode
them.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Lionel Landwerlin
b3aa18dfd6 intel: genxml: add preemption control instructions
Helpful to debug kernel workaround batchbuffers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-04-03 16:55:53 +01:00
Dylan Baker
6f6e711c72 mesa: ensure that variable is initialized
This variable controls whether we link using the glsl code path or the
spirv path. It's set when we validate that all shaders are glsl or
spirv, but if there are no shaders attached to the program it will
remain unset, resulting in undefined behavior. We want to go down the
glsl path in that case, so initialize to false.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820
Fixes: 16f6634e7f
       ("mesa/program: Link SPIR-V shaders using the SPIR-V code-path")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-04-03 08:47:59 -07:00
Marek Olšák
d3e96b1063 radeonsi/gfx9: fix bad LLVM params in monolithic LS+HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-03 11:07:28 -04:00
Samuel Pitoiset
acf60abc54 radv: enable VK_EXT_shader_viewport_index_layer
The driver already supports exporting the Layer and ViewportIndex
built-ins from vertex or tessellation shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-03 14:05:46 +02:00
Rob Clark
51888bf07d nir+drivers: add helpers to get # of src/dest components
Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-04-03 06:08:56 -04:00
Rob Clark
91f9450b32 freedreno/ir3: fix fallout of unused false-depth elimination
Since we were MARK flag for both preventing loops, and tracking whether
instructions were used, we could end up in an infinite loop due to
bd2ca2bcdd.  Instead invert the logic.. mark all instructions UNUSED
up front and clear the flag as we visit them.

Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-03 06:08:56 -04:00
Timothy Arceri
7e9b7ec094 gallium/pipebuffer: fix parenthesis location
Without this the return value will never get set to -1. This
was first added in 49866c8f34 and copied in 2b396eeed9.

Fixes: 2b396eeed9 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342
2018-04-03 16:05:59 +10:00
Tapani Pälli
6b21391729 Revert "mesa: add GL_HALF_FLOAT as supported type to readpixels"
This reverts commit 41cf30b8bc.

Commit caused regressions with KHR-GLES3.packed_pixels.* tests.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eric Anholt <eric@anholt.net>
2018-04-03 08:43:30 +03:00
Mike Lothian
0bdbe4583f gallivm: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Mike Lothian
5e07881305 radeonsi: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Mike Lothian
7e144ace95 ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:27:29 -04:00
Daniel Stone
4cbecb6168 st/dri: Initialise modifier to INVALID for DRI2
When allocating a buffer for DRI2, set the modifier to INVALID to inform
the backend that we have no supplied modifiers and it should do its own
thing. The missed initialisation forced linear, even if the
implementation had made other decisions.

This resulted in VC4 DRI2 clients failing with:
  Modifier 0x0 vs. tiling (0x700000000000001) mismatch

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Andreas Müller <schnitzeltony@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 3f8513172f ("gallium/winsys/drm: introduce modifier field to winsys_handle")
2018-04-02 19:07:57 +01:00
Marek Olšák
2be6143032 radeonsi: implement GL_KHR_blend_equation_advanced
MSAA is supported using sample shading. Layered rendering and all texture
targets are also supported.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:25 -04:00
Marek Olšák
e04631b0f2 radeonsi: rename unpack_param -> si_unpack_param
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:23 -04:00
Marek Olšák
dc04e4bba2 radeonsi: move FMASK shader logic to shared code
We'll need it for FBFETCH in both TGSI and NIR paths.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:22 -04:00
Marek Olšák
eb77961292 radeonsi: add R600_DEBUG=nofmask to disable MSAA compression
For testing.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:20 -04:00
Marek Olšák
56342c97ee gallium/u_tests: test FBFETCH and shader-based blending with MSAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:18 -04:00
Marek Olšák
5d91c2ccea ac/gpu_info: print GB_ADDR_CONFIG 2018-04-02 13:10:37 -04:00
Marek Olšák
b1f33086ec ac/gpu_info: reorder the fields and print them nicely 2018-04-02 13:10:37 -04:00
Marek Olšák
a0a96819e1 ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memory 2018-04-02 13:10:37 -04:00
Marek Olšák
32b3932de1 ac/gpu_info: don't print irrelevant fields 2018-04-02 13:10:37 -04:00
Marek Olšák
f754217517 st/mesa: don't draw if the bound element array buffer is not allocated
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:10:36 -04:00
Iago Toral Quiroga
31881079af anv/cmd_buffer: honor pending clear views for depth/stencil attachments
v2: rebased on top of subpass rework.

v3: rebased

v4:
 - rebased
 - reset pending clear views in one go rather one bit at a time (Caio)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-02 09:53:24 +02:00
Iago Toral Quiroga
f60c5fc17e anv/cmd_buffer: consider multiview masks for tracking pending clear aspects
When multiview is active a subpass clear may only clear a subset of the
attachment layers. Other subpasses in the same render pass may also
clear too and we want to honor those clears as well, however, we need to
ensure that we only clear a layer once, on the first subpass that uses
a particular layer (view) of a given attachment.

This means that when we check if a subpass attachment needs to be cleared
we need to check if all the layers used by that subpass (as indicated by
its view_mask) have already been cleared in previous subpasses or not, in
which case, we must clear any pending layers used by the subpass, and only
those pending.

v2:
  - track pending clear views in the attachment state (Jason)
  - rebased on top of fast-clear rework.

v3:
  - rebased on top of subpass rework.

v4: rebased.

v5 (Caio):
 - Rebased.
 - Initialize pending clear views to only have bits set for layers
   that exist.
 - Reset pending clear views in one go rather one bit at a time.
 - Put "last subpass for this attachment" condition in a separate
   function to simplify the conditional that resets pending_clear_aspects.

Fixes:
dEQP-VK.multiview.readback_implicit_clear.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-02 09:53:15 +02:00
Timothy Arceri
c88e7fe29e radeonsi/nir: fix explicit component packing for geom/tess doubles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
dd3d3cc877 radeonsi/nir: gather buffers declared more accurately and use const fast path
For now we skip SI && HAVE_LLVM < 0x0600 for simplicity. We also skip
setting the more accurate masks for builtin uniforms for now as it
causes some piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
56017d8100 radeonsi: create load_const_buffer_desc_fast_path() helper
This will be shared by the TGSI and NIR backends. For simplicity
we leave the SI LLVM 5.0 and lower work around only in the TGSI
backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
7aad5e15f6 radeonsi/nir: set TGSI_PROPERTY_NEXT_SHADER
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Timothy Arceri
2ca5d9548f st/glsl_to_nir: gather next_stage in shader_info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-04-02 14:56:00 +10:00
Rob Clark
2f175bfe5d freedreno/a5xx: don't align height for PIPE_BUFFER
Buffers can be large, so we probably don't want to make them all 32x
bigger.  But they can't be rendered to (at least in GL) so we don't
need this workaround to prevent page faults on mem<->gmem.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-01 11:26:01 -04:00
Rob Clark
1866f76f7b freedreno/a5xx: fix page faults on last level
We could alternatively fall back to using "old style" draw's for
mem<->gmem (ie. what <= a4xx do) when height is not aligned to 32,
but that is somewhat more work (and not really something that could
be applied to stable)

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-04-01 10:50:11 -04:00
Rob Clark
afde9294b5 freedreno/ir3: fix issue w/ glamor composite shaders
Fixes an issue that became possible when we started lowering phi webs to
regs (a7ea2b4e) (although was not really seen until we also switched to
using peephole select pass (ec8bc54a) instead of lowering *all* if/else
to select).

If texture coord (or anything else that uses create_collect() to collect
scalar values in a sequence of scalar registers) was consuming a value
produced on either side of an if/else (ie. a phi lowered to nir reg,
which in ir3 is an "array" of length 1) then register allocation would
happen incorrectly and we'd end up sampling from garbage coordinates.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 16:25:13 -04:00
Rob Clark
2191a18e75 freedreno/ir3: more half-precision fixes
Some instructions require src/dst to be in full or half precision
register depending on src/dst type.  So do a better job of propagating
register type.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:16:16 -04:00
Rob Clark
e04e068f75 freedreno/ir3: add helper to create immed of specified size
We'll also need to be able to create a half-precision immediate.  So
re-work create_immed().  Prep work for following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:13:11 -04:00
Rob Clark
1f45320e51 freedreno/ir3: pass ctx instead of block to create_collect()
Prep work for following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:12:33 -04:00
Rob Clark
bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Previously false-dependencies would get flagged as used, even if the
only "use" was a false dep to (for example) prevent a load from being
scheduled after a store.

In addition to being pointless instructions, in some cases they can
cause problems.  For example, ldg (and similar instructions) depend on
an immed arg getting CP'd into the instruction, but this doesn't happen
if an instruction is otherwise unused.  Which can result in undefined
results (overwriting unintended registers).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:11:46 -04:00
Rob Clark
4f78383809 freedreno/ir3: add local_group_size
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:10:56 -04:00
Rob Clark
96e7927fb2 freedreno/ir3: clear SSA flag when assigning "ARRAY" regs too
Avoids a misleading "INVALID FLAGS" warning in debug builds.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:10:16 -04:00
Rob Clark
6514b4e3fd freedreno/ir3: print array live ranges
This is also useful to see if optmsgs are enabled.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-31 15:09:42 -04:00
Wladimir J. van der Laan
e8e3aa68d6 freedreno: a2xx: Implement DP2 instruction
Use DOT2ADDv instruction with 0.0f constant add.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
79d6b194f2 freedreno: a2xx: implement SEQ/SNE instructions
Extend translate_sge_slt to emit these, in analogous fashion
but using CNDEv.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
837fabaaa3 freedreno: a2xx: Compressed textures support
Add support for:

- PIPE_FORMAT_ETC1_RGB8
- PIPE_FORMAT_DXT1_RGB
- PIPE_FORMAT_DXT1_RGBA
- PIPE_FORMAT_DXT3_RGBA
- PIPE_FORMAT_DXT5_RGBA

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
92d529e7e4 freedreno: a2xx: Support TEXTURE_RECT
Denormalized texture coordinates are required for text rendering in
GALLIUM_HUD.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
6be017fdc4 freedreno: a2xx: Prevent crash in emit_texture if view is not set
Textures will sometimes be updated if texture view state was
un-set, without this change that causes an assertion crash or
segfault.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
fb41372761 freedreno: a2xx: Fix fd2_tex_swiz
Compose swizzles using util_format_compose_swizzles instead
of the custom code (which somehow had a bug).

This makes the GL_ALPHA internal format work.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
faed84a615 freedreno: a2xx: Change use of BLEND_ to BLEND2_
Change use of BLEND_ to BLEND2_,

    BLEND_* a3xx_rb_blend_opcode
    BLEND2_* is a2xx_rb_blend_opcode

This makes no effective difference as the used enumerant has the same
value (0), but the other enumerants do not match 1-to-1 so this will
avoid future problems.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan
cb6dd7070f freedreno: a2xx: Update rnndb header for formats enumeration
The format enumeration comes comes from the yamoto
register headers that are part of the amd-gpu kernel driver.
(see freedreno envytools commit b8fb7978e7ae106d0d11d0b238ab2ba2d4dd9d43)

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-31 06:17:59 +00:00
Mathias Fröhlich
1da345e569 vbo: Use alloca for _vbo_draw_indirect.
Avoid using malloc in the draw path of mesa.
Since the draw_count is a user api input, fall back to malloc if
the amount of consumed stack space may get too high.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:15 +02:00
Mathias Fröhlich
3f1cd957d3 vbo: Remove unused includes to vbo_private.h
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
6e9f00e3fc vbo: Move vbo_split into the tnl module.
Move the files, adapt to the naming scheme in tnl, update callers
and build system.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
245f9a3977 vbo: Readd the arrays argument to the legacy draw methods.
The legacy draw paths from back before 2012 contained a gl_vertex_array
array for the inputs to be used for draw. So all draw methods from legacy
drivers and everything that goes through tnl are originally written
for this calling convention. The same goes for tools like t_rebase or
vbo_split*, that even partly still have the original calling convention
with a currently unused such pointer.
Back in 2012 patch 50f7e75

mesa: move gl_client_array*[] from vbo_draw_func into gl_context

introduced Array._DrawArrays, which was something that was IMO aiming for
a similar direction than Array._DrawVAO introduced recently.
Now several tools like t_rebase and vbo_split*, which are mostly used by
tnl based drivers, would need to be converted to use the internal
Array._DrawVAO instead of Array._DrawArrays. The same goes for the driver
backends that use any of these tools.
Alternatively we can reintroduce the gl_vertex_array array in its call
argument list and put these tools finally into the tnl directory.
So this change reintroduces this gl_vertex_array array for the legacy
draw paths that are still required for the tools t_rebase and vbo_split*.
A followup will move vbo_split also into tnl.

Note that none of the affected drivers use the DriverFlags.NewArray
driver bit. So it should be safe to remove this also for the legacy
draw path.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:14 +02:00
Mathias Fröhlich
461698af26 vbo: Remove the now unused vbo draw path.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
784fdef4e7 tnl: Push down the gl_vertex_array inputs into tnl drivers.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
7f8db5ca47 vbo: Remove vbo_indirect_draw_func.
Remove the vbo_indirect_draw_func vbo callback and make the default
implementation use the drivers main draw callback function directly.
This will be needed with the next changes when drivers without own main
drivers DrawIndirect implementation get moved to the main drivers
Draw method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:13 +02:00
Mathias Fröhlich
4db9d83a2d i965: Push down the gl_vertex_array inputs into i965.
Let the i965 backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.
Note that brw_draw_indirect_prims calls brw_draw_prims internally
and gets its update to Array._DrawArray by this way.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:12 +02:00
Mathias Fröhlich
fca1550550 gallium: Push down the gl_vertex_array inputs into gallium.
Let the gallium backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-31 06:32:12 +02:00
Jason Ekstrand
9978f55cd1 nir/validator: Validate that all used variables exist
We were validating this for locals but nothing else.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Jason Ekstrand
2b977989f3 intel/vec4: Set channel_sizes for MOV_INDIRECT sources
Otherwise, any indirect push constant access results in an assertion
failure when we start digging through the channel_sizes array.  This
fixes dEQP-VK.pipeline.push_constant.graphics_pipeline.dynamic_index_vert
on Haswell.  It should be a harmless no-op for GL since indirect push
constants aren't used there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e69e5c7006 "i965/vec4: load dvec3/4 uniforms first in the..."
2018-03-30 17:20:27 -07:00
Jason Ekstrand
6018f5b079 nir/lower_indirect_derefs: Support interp_var_at intrinsics
This fixes the fs-interpolateAtCentroid-block-array piglit test on i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-03-30 17:20:27 -07:00
Jason Ekstrand
0517d65f96 nir/vars_to_ssa: Remove copies from the correct set
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2018-03-30 17:20:27 -07:00
Jason Ekstrand
a1452a94fc nir: Return a cursor from nir_instr_remove
Because nir_instr_remove is an inline wrapper around nir_instr_remove_v,
the compiler should be able to tell that the return value is unused and
not emit the extra code in most cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Jason Ekstrand
956f17395b nir: Add src/dest num_components helpers
We already have these for bit_size

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-30 17:20:27 -07:00
Brian Paul
bebf758c49 docs: document WGL_SWAP_INTERVAL env var
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-30 14:45:05 -06:00
Brian Paul
c8906b8459 st/wgl: check if WGL_SWAP_INTERVAL is defined in wglSwapIntervalEXT()
This allows the WGL_SWAP_INTERVAL env var to override any application
calls to wglSwapIntervalEXT().  Useful for debugging, or to set the
interval to zero to effectively disable the swap interval.

Note: we also rename the previous instance of SVGA_SWAP_INTERVAL to
WGL_SWAP_INTERVAL since this is a WGL feature and not related to the
svga driver.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-30 14:44:50 -06:00
Brian Paul
1bf201ddce glapi: define GL_API to be KEYWORD1 in glapi_dispatch.c (v2)
This fixes a Windows build warning where the prototypes for the ES
function in the header file don't match the prototypes in this file
because the GL_API and GLAPI macros are defined differently.

v2: defined GL_API to KEYWORD1 instead of GLAPI, per Mathias.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-30 14:33:33 -06:00
Brian Paul
26bc983c83 spirv: s/uint/unsigned/ to fix MSVC build
Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
f3164c2ed9 nir/spirv: s/uint32_t/SpvOp/ in various functions
The MSVC compiler warns when the function parameter types don't
exactly match with respect to enum vs. uint32_t.  Use SpvOp everywhere.

Alternately, uint32_t could be used everywhere.  There doesn't seem
to be an advantage to one over the other.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
cb619a3c9a nir/spirv: fix MSVC syntax error in vtn_handle_texture()
Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
c58c9f712d nir/spirv: move NORETURN annotation on _vtn_fail() prototype
This needs to before the function, not after, to compile with MSVC.
This works with gcc too.

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Brian Paul
84be45fc20 nir/spirv: fix MSVC warning in vtn_align_u32()
Fixes warning that "negation of an unsigned value results in an
unsigned value".

Reviewed-by: Neil Roberts <nroberts@igalia.com>
2018-03-30 14:33:33 -06:00
Neil Roberts
31d91f019b spirv: Fix building with SCons
The SCons build broke with commit ba975140d3 because a SPIR-V
function is called from Mesa main. This adds a convenience library for
SPIR-V and adds it to everything that was including nir. It also adds
both nir and spirv to drivers/x11/SConscript.

Also add nir/spirv modules to osmesa and libgl-gdi targets. (Brian Paul)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105817
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2018-03-30 14:33:03 -06:00
Brian Paul
cdc34e2cea mesa: fix MSVC bitshift overflow warnings
In the BITFIELD_MASK() macro, if b==32 the expression evaluates to
~0u, but the compiler still sees the expression (1 << 32) in the
unused part and issues a warning about integer bitshift overflow.

Fix that by using (b) % 32 to ensure the max shift is 31 bits.

This issue has been present for a while, but shows up much more
often because of the recent VBO changes.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-03-30 11:04:32 -06:00
Brian Paul
fa18a427e9 st/mesa: add missing GLSL_TYPE_[U]INT8 cases in st_glsl_type_dword_size()
Silences a compiler warning about unhandled enum switch cases.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-30 11:04:32 -06:00
Jakob Bornecrantz
e16b92ad7e vbo: MaxVertexAttribStride is not always set
This assert is hit on hardware which does not expose GL 4.4 or GLES 3.1.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
2018-03-30 17:23:08 +01:00
Daniel Stone
696762eef5 x11: Only report supported DRI3/Present versions
The version passed to QueryVersion requests is the version that the
client supports. We were just passing in whatever version of XCB was
present on the system, which may not be a version that Mesa actually
explicitly supports, e.g. it might bring unwanted semantics.

Set specific protocol versions which we support, and only pass those.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-30 16:53:51 +01:00
Samuel Pitoiset
2a329f4ada radv: set SAMPLE_RATE to the number of samples of the current fb
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-30 17:32:15 +02:00
Brian Paul
fc1d1dbe81 nir: s/uint/unsigned/ to fix MSVC/MinGW build
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-30 08:37:59 -06:00
Eduardo Lima Mitev
e7fc18097e i965: Don't call process_glsl_ir() for SPIR-V shaders
v2: Use 'spirv_data' from gl_linked_shader instead, to check if shader
   is SPIR-V. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
e7d97aa75d i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders
This is the main fork of the shader compilation code-path, where a NIR
shader is obtained by calling spirv_to_nir() or glsl_to_nir(),
depending on its nature..

v2: Use 'spirv_data' member from gl_linked_shader to know which method
   to call. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
abb6d0797c mesa/glspirv: Add a _mesa_spirv_to_nir() function
This is basically a wrapper around spirv_to_nir() that includes
arguments setup and post-conversion validation.

v2: * Rebase update (SpirVCapabilities not a pointer anymore,
    spirv_to_nir_options added, and others).
    * Code-style improvements and remove debug hunk. (Timothy Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
16f6634e7f mesa/program: Link SPIR-V shaders using the SPIR-V code-path
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
9c36e9f862 mesa/glspirv: Add _mesa_spirv_link_shaders() function
This is the equivalent to link_shaders() from
src/compiler/glsl/linker.cpp, but for SPIR-V programs. It just
creates the program and its gl_linked_shader objects, giving drivers
the opportunity to implement any linking of SPIR-V shaders they choose,
at a later stage.

v2: Bail out if we see more that one shader for the same stage, and
    add a corresponding comment. (Timothy Arceri)

v3:
  * Adds also a linker error log to the condition above, with a
    reference to the specification issue. (Timothy Arceri)
  * Squash with the patch adding the function boilerplate (Timothy
    Arceri)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev
22b6b3d0a7 mesa: Add a reference to gl_shader_spirv_data to gl_linked_shader
This is a reference to the spirv_data object stored in gl_shader, which
stores shader SPIR-V data that is needed during linking too.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Nicolai Hähnle
ba975140d3 mesa: Implement glSpecializeShaderARB
v2:
  * Use gl_spirv_validation instead of spirv_to_nir.  This method just
    validates the shader. The conversion to NIR will happen later,
    during linking. (Alejandro Piñeiro)
  * Use gl_shader_spirv_data struct to store the SPIR-V data.
    (Eduardo Lima)
  * Use the 'spirv_data' member to tell if the gl_shader is a SPIR-V
    shader, instead of a dedicated flag. (Timothy Arceri)

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
9063bf7ad8 nir/spirv: add gl_spirv_validation method
ARB_gl_spirv adds the ability to use SPIR-V binaries, and a new
method, glSpecializeShader. Here we add a new function to do the
validation for this function:

From OpenGL 4.6 spec, section 7.2.1"

   "Shader Specialization", error table:

    INVALID_VALUE is generated if <pEntryPoint> does not name a valid
    entry point for <shader>.

    INVALID_VALUE is generated if any element of <pConstantIndex>
    refers to a specialization constant that does not exist in the
    shader module contained in <shader>.""

v2: rebase update (spirv_to_nir options added, changes on the warning
    logging, and others)

v3: include passing options on common initialization, doesn't call
    setjmp on common_initialization

v4: (after Jason comments):
  * Rename common_initialization to vtn_builder_create
  * Move validation method and their helpers to own source file.
  * Create own handle_constant_decoration_cb instead of reuse existing one

v5: put vtn_build_create refactoring to their own patch (Jason)

v6: update after vtn_builder_create method renamed, add explanatory
    comment, tweak existing comment and commit message (Timothy)
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
bebe3d626e spirv: add vtn_create_builder
Refactored from spirv_to_nir, in order to be reused later.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

v2: renamed method (from vtn_builder_create), add explanatory comment
    (Timothy)
2018-03-30 09:14:56 +02:00
Alejandro Piñeiro
3761e675e2 i965: initialize SPIR-V capabilities
Needed for ARB_gl_spirv. Those are not the same that the Intel vulkan
driver. From the ARB_spirv_extensions spec:

   "3. If a new GL extension is added that includes SPIR-V support via
   a new SPIR-V extension does it's SPIR-V extension also get
   enumerated by the SPIR_V_EXTENSIONS_ARB query?.

   RESOLVED. Yes. It's good to include it for consistency. Any SPIR-V
   functionality supported beyond the SPIR-V version that is required
   for the GL API version should be enumerated."

So in addition to the core SPIR-V support, there is the possibility of
specific GL extensions enabling specific SPIR-V extensions (so
capabilities). That would mean that it is possible that OpenGL and
Vulkan not having the same capabilities supported, even for the same
driver. For this reason it is better to keep them separated.

As an example: at the time of this patch writing Intel vulkan driver
support multiview, but there isn't any OpenGL multiview GL extension
supported.

Note: we initialize SPIR-V capabilities at brwCreateContext instead of
the usual brw_initialize_context_constants because we want to do that
only if the extension is enabled.

v2:
   * Rebase update (SpirVCapabilities not a pointer anymore)
   * Fill spirv capabilities for OpenGL >= 3.3 (Ian Romanick)

v3:
   * Drop multiview support, as i965 doesn't support any multiview GL
     extension (Jason)
   * Fill spirv capabilities only if the extension is enabled (Jason)

v4: Capabilities are supported only on gen7+. Added comment and assert
    (Jason)
2018-03-30 09:14:56 +02:00
Nicolai Hähnle
ca5cc78206 mesa: add gl_constants::SpirVCapabilities
For drivers to declare which SPIR-V features they support.

v2: Don't use a pointer (Ian Romanick)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-30 09:14:56 +02:00
Ian Romanick
19e0dd1ad3 i965: Don't request GLSL IR lowering of gl_VertexID
Let the lowering in NIR handle it instead.

This hurts one shader that occurs twice in shader-db (SynMark GSCloth)
on IVB and HSW.  No other shaders or platforms were affected.

total cycles in shared programs: 253438422 -> 253438426 (0.00%)
cycles in affected programs: 412 -> 416 (0.97%)
helped: 0
HURT: 2

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
2018-03-29 14:16:07 -07:00
Ian Romanick
2765633116 i965: Silence unused parameter warning
src/mesa/drivers/dri/i965/brw_draw_upload.c: In function ‘double_types’:
src/mesa/drivers/dri/i965/brw_draw_upload.c:225:34: warning: unused parameter ‘brw’ [-Wunused-parameter]
 double_types(struct brw_context *brw,
                                  ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:16:04 -07:00
Ian Romanick
042ee4bea2 spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.build
Future changes will add generated files used only from
src/compiler/glsl.  These can't be built from Makefile.nir.am, and we
can't move all the rules from Makefile.nir.am to Makefile.spirv.am (and
it would be silly anyway).

v2: Do it for meson too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (the meson bits)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (the automake bits)
2018-03-29 14:16:01 -07:00
Ian Romanick
2c9621ee5c compiler: All leaf Makefile.am should use +=
This slightly simplifies later changes that add more Makefile.*.am
files.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:41 -07:00
Ian Romanick
4925347ec5 util: Include bitscan.h directly
Previously bitset.h would include u_math.h to get bitscan.h.  u_math.h
lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h
live in src/util.  Having the one file directly include another file
that lives in the same directory makes much more sense.

As a side-effect, several files need to directly include standard header
files that were previously indirectly included.

v2: Fix build break in src/amd/common/ac_nir_to_llvm.c.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:30 -07:00
Ian Romanick
ef7a4c9015 util: Optimize util_is_power_of_two_nonzero
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:29 -07:00
Ian Romanick
cd18aa1e50 util: Use util_is_power_of_two_nonzero in u_vector
Previously size=0, element_size=0 would have been allowed.  That
combination can only lead to despair.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:09:28 -07:00
Ian Romanick
22fbb5c594 util: Add and use util_is_power_of_two_nonzero
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2018-03-29 14:09:28 -07:00
Ian Romanick
d76c204d05 util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero
The new name make the zero-input behavior more obvious.  The next
patch adds a new function with different zero-input behavior.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:09:23 -07:00
Dylan Baker
a3a16d4aa7 meson: use dep_libdrm version for pkg-config
This corrects pkg-config to use the libdrm version (as computed by the
previous patch) instead of using a hardcoded value that may or may not
(probably not) be right.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:20:52 -07:00
Dylan Baker
c445b1d56f meson: Use the same version for all libdrm checks
Currently each driver specifies it's own version, and core libdrm
specifies a version. In the most common case this is fine, since there
will be exactly one libdrm installed on a system, but if there are more
than one it's possible that mesa will be linked against different
versions of libdrm. There is also the possibility that the current
approach makes the pkg-config files we generate incorrect, since there
could be #defines that use newer features if they're available.

This patch corrects all of that. All of the versions are still set by
driver (along with a default core version). Then all of the drivers that
are enabled have their versions compared and the highest version is
selected, then all libdrm checks are made with that version.

v2: - Reorder the list to have the name first and whether the dependency
      is needed second (Eric)

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:20:52 -07:00
Dylan Baker
acadf06f56 meson: group libdrm dependencies
The reason libdrm is after libdrm_* will be made clear in later patches.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-29 10:18:47 -07:00
Brian Paul
e520ca562a gl.h: remove stale comment, trailing whitespace
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-29 08:46:55 -06:00
Brian Paul
4ff6a7b0de glapi: add glBlendBarrier(), glPrimitiveBoundingBox() prototypes
in glapi_dispatch.c, as we have for many other GLES functions.
Fixes a cross-compile issue (missing prototype) when GLES support
is disabled.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2018-03-29 08:45:10 -06:00
Brian Paul
5cd5878a1f st/mesa: silence unhandled switch case warning
And improve the unreachable() error message.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-29 08:45:10 -06:00
Henri Verbeet
0b73c86b80 mesa: Inherit texture view multi-sample information from the original texture images.
Found running "The Witness" in Wine. Without this patch, texture views created
on multi-sample textures would have a GL_TEXTURE_SAMPLES of 0. All things
considered such views actually work surprisingly well, but when combined with
(plain) multi-sample textures in a framebuffer object, the resulting FBO is
incomplete because the sample counts don't match.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-29 14:38:25 +04:30
Samuel Pitoiset
e45fe0ed66 radv: fix scanning output_usage_mask with structs
To fix a regression in:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.struct

And the following regressions (Polaris only):
dEQP-VK.glsl.indexing.varying_array.*

Fixes: f3275ca01c ("ac/nir: only enable used channels when exporting parameters")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-29 10:22:10 +02:00
Karol Herbst
6179a87c1e nvc0/ir: fix emiting NOTs with predicates
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-03-29 03:06:36 +02:00
Aaron Watry
1dae92f150 broadcom/vc4: Fix out-of-tree build with automake.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-28 17:48:41 -07:00
Eric Anholt
81f82ecc56 broadcom/vc5: Start using nir_opt_move_load_ubo().
In the absence of a general NIR or VIR-level scheduler, this at least
avoids spilling in
GTF-GLES3.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_storage_layouts
2018-03-28 17:48:41 -07:00
Eric Anholt
1fe4c748f7 broadcom/vc5: Fix setup of integer surface clear values.
I'm disappointed that the compiler didn't warn me about use of
uninitialized uc in these paths.  Just use the incoming clear color
instead of the packing temporary if we're doing our own packing.

Fixes GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_*
2018-03-28 17:48:41 -07:00
Eric Anholt
123ee37627 broadcom/vc5: Stop trying to swizzle around RGBA4 clear color.
We always want A in the A slot in the tile buffer, and any other swapping
should happen elsewhere.

Fixes RGBA4-using cases in fbo-clear-formats and
GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_fixed.
2018-03-28 17:48:41 -07:00
Eric Anholt
2f4c4e10c2 broadcom/vc5: Work around scissor w/h==0 bug same as rasterizer discard.
The 7268 HW apparently lets some rendering through in this case.  Fixes
GTF-GLES2.gtf.GL2FixedTests.scissor.scissor
2018-03-28 17:48:41 -07:00
Eric Anholt
0349c79bdc st: Don't try to finalize the texture in st_render_texture().
We can't necessarily finalize the texture at this point if we're rendering
to a texture image whose format is different from the baselevel's format.
This was introduced as a fix for fbo-incomplete-texture-03 in
de414f4915, but the later fix for vmware on
that testcase in 95d5c48f68 made it
unnecessary.

Fixes assertion failures in util_resource_copy_region() in
KHR-GLES3.copy_tex_image_conversions.forbidden.* when trying to finalize
an R8 texture image to the RG8 texture object's pt.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-28 17:48:41 -07:00
Marek Olšák
e159d46fc7 drirc: whitelist glthread for Medieval II: TW, Carnivores: DHR, Far Cry 2 2018-03-28 20:00:48 -04:00
Daniel Schürmann
b91cd5dba4 radv: enable VK_AMD_shader_trinary_minmax extension
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:39 +02:00
Daniel Schürmann
d00fb7ce54 ac: add support for trinary_minmax instructions
v2: Add missing break (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:35 +02:00
Dave Airlie
fe5d5d19b0 spirv: add support for SPV_AMD_shader_trinary_minmax
Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:29:29 +02:00
Dave Airlie
3e830a1af2 nir: add support for min/max/median of 3 srcs
These are needed for SPV_AMD_shader_trinary_minmax,
the AMD HW supports these.

Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:28:58 +02:00
Marek Olšák
025105453a radeonsi: simplify DCC format categories
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Marek Olšák
3fea237c85 radeonsi: don't use the SPI barrier management bug workaround
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Marek Olšák
3045c5f274 radeonsi: use maximum OFFCHIP_BUFFERING on Vega12
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 18:45:52 -04:00
Bas Nieuwenhuizen
4503ff760c ac/nir: Add workaround for GFX9 buffer views.
On GFX9 whether the buffer size is interpreted as elements or bytes
depends on whether IDXEN is enabled in the instruction. If the index
is a constant zero, LLVM optimizes IDXEN to 0.

Now the size in elements is interpreted in bytes which of course
results in out of bounds accesses.

The correct fix is most likely to disable the LLVM optimization,
but we need something to work with LLVM <= 6.0.

radeonsi does the max between stride and element count on the CPU
but that results in the size intrinsics returning the wrong size
for the buffer. This would cause CTS errors for radv.

v2: Also include the store changes.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 00:03:03 +02:00
Marek Olšák
4f96747530 ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738

Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 17:23:41 -04:00
Samuel Pitoiset
1c4fdcf444 radv: enable VK_EXT_sampler_filter_minmax
Only enable for CIK+ because it's buggy on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 22:55:48 +02:00
Samuel Pitoiset
413d77e7f9 radv: add support for VK_EXT_sampler_filter_minmax
The driver only supports the required formats for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 22:55:48 +02:00
Samuel Pitoiset
99b52aa1da radv: rename VEGA10 device name
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 20:15:17 +02:00
Samuel Pitoiset
4d2c46dda3 radv: add support for Vega12
Based on RadeonSI. Untested.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-28 20:15:14 +02:00
Matt Turner
3e6326deb9 build: Fix up nir_intrinsics.Plo
nir_intrinsics.c existed as a static file until commit 76dfed8ae2 began
generating it as part of the build process. autotools is incapable of
coping, and so a build-tree from before this commit would then fail with
it:

[4]: *** No rule to make target '../../../mesa/src/compiler/nir/nir_intrinsics.c', needed by 'nir/nir_intrinsics.lo'.  Stop.

Add a few lines to configure.ac to update the broken build files.

Fixes: 76dfed8ae2 ("nir: mako all the intrinsics")
2018-03-28 11:09:23 -07:00
Dylan Baker
2cfc68d984 autotools: Include intel/dev/meson.build in tarball
Fixes: 272bef0601
       ("intel: Split gen_device_info out into libintel_dev")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-28 10:19:05 -07:00
Dylan Baker
bc2fdb9759 autotools: include meson_get_version
Otherwise meson won't read the VERSION file and won't set a version.
That means that pkg-config files will have version unset as well.

Fixes: 3e9533d9b8
       ("meson: Add script to use VERSION file for getting version")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-28 10:13:23 -07:00
Eric Engestrom
d77844a529 docs: fix 18.0 release note version
Fixes: 839fb3a696 "docs: Update 18.0.0 release notes"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-28 16:52:56 +01:00
Marek Olšák
20eb44ad65 radeonsi: add support for Vega12
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-28 11:37:43 -04:00
Marek Olšák
5425d32fcf amd/addrlib: update to the latest version for Vega12
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-28 11:37:43 -04:00
Eric Engestrom
431a1d12cc gbm: remove never-implemented function
I assume this was implemented in a previous version of that commit, but
was removed in the version that actually landed.

Fixes: 8430af5ebe "Add support for swrast to the DRM EGL platform"
Cc: Giovanni Campagna <gcampagna@src.gnome.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-28 16:25:52 +01:00
Stefan Schake
77ade10c86 android: Use new nir intrinsics python scripts
Fixes: 76dfed8ae2 ("nir: mako all the intrinsics")
Signed-off-by: Stefan Schake <stschake@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-28 14:48:47 +03:00
Eric Anholt
a691fa4a1b broadcom/vc5: Fix padding of NPOT miplevels >= 2.
The power-of-two padded size that gets minified is based on level 1's
dimensions, not level 0's, which starts to differ at a width of 9.

Fixes all failures on texelFetch fs sampler2D 1x1x1-64x64x1
2018-03-27 21:16:23 -07:00
Timothy Arceri
92fa89a08d ac/radeonsi: pass bindless bool to load_sampler_desc()
We also fix the base_index for bindless by using the driver
location.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:16 +11:00
Timothy Arceri
5411b98d52 st/glsl_to_nir: set driver location for bindless images and samplers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:15 +11:00
Timothy Arceri
f94b6b79be radeonsi/nir: set uses_bindless_samplers for samplers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 12:56:15 +11:00
Timothy Arceri
5c810a2c05 nir: add bindless to nir data
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-28 12:56:15 +11:00
Kenneth Graunke
fb18d0dbe4 i965: Drop unnecessary bo->align field.
bo->align is always 0; there's no need to waste 8 bytes storing it.
Thanks to C99 initializers zeroing fields, we can completely drop the
only read of the field altogether.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
037d738a23 i965: Drop unused alignment parameter from brw_bo_alloc().
brw_bo_alloc no longer uses this parameter, so there's no point.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
07ec3a2e0f i965: Drop alignment parameter from bo_alloc_internal().
Buffers are always page aligned on 965+ hardware; I believe this extra
parameter is a vestige from the Gen2-3 era.

All callers pass 0, and in fact we assert that the alignment is 0 unless
BO_ALLOC_BUSY is set (for some reason).  We can just drop the parameter
and set the value to 0 explicitly.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
b9a54b18f6 i965: Drop BO_ALLOC_BUSY in intel_miptree_create_for_bo().
intel_miptree_create_for_bo does not actually allocate a BO, so
specifying allocation flags accomplishes nothing and is confusing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 18:41:44 -07:00
Kenneth Graunke
2c01215c1b i965: Drop PIPE_CONTROL_NO_WRITE from various calls.
This is just zero - passing nothing already gives us a post-sync
operation of "nothing".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-27 18:41:44 -07:00
Jason Ekstrand
5f21a7afe0 nir/intrinsics: Don't report negative dest_components
I have no idea why but having dest_components == -1 was causing a memory
leak somewhere.  Without this, you can't get through a full shader-db
run without running out of memory.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-03-27 18:18:26 -07:00
Jason Ekstrand
7e38f49a8f intel/fs: Don't emit a des copy for image ops with has_dest == false
This was causing us to walk dest_components times over a thing with no
destination.  This happened to work because all of the image intrinsics
without a destination also happened to have dest_components == 0.  We
shouldn't be reading dest_components if has_dest == false.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-27 18:18:21 -07:00
Ilia Mirkin
776e6af879 nvc0/ir: fix INTERP_* with indirect inputs
There were two problems, both of which are fixed now:
 - The indirect address was not being shifted by 4
 - The indirect address was being placed as an argument in the offset case

This fixes some of the new interpolateAt* piglits which now test for
these situations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-03-27 20:41:11 -04:00
Timothy Arceri
629ee690ad nir: fix crash in loop unroll corner case
When an if nesting inside anouther if is optimised away we can
end up with a loop terminator and following block that looks like
this:

        if ssa_596 {
                block block_5:
                /* preds: block_4 */
                vec1 32 ssa_601 = load_const (0xffffffff /* -nan */)
                break
                /* succs: block_8 */
        } else {
                block block_6:
                /* preds: block_4 */
                /* succs: block_7 */
        }
        block block_7:
        /* preds: block_6 */
        vec1 32 ssa_602 = phi block_6: ssa_552
        vec1 32 ssa_603 = phi block_6: ssa_553
        vec1 32 ssa_604 = iadd ssa_551, ssa_66

The problem is the phis. Loop unrolling expects the last block in
the loop to be empty once we splice the instructions in the last
block into the continue branch. The problem is we cant move phis
so here we lower the phis to regs when preparing the loop for
unrolling. As it could be possible to have multiple additional
blocks/ifs following the terminator we just convert all phis at
the top level of the loop body for simplicity.

We also add some comments to loop_prepare_for_unroll() while we
are here.

Fixes: 51daccb289 "nir: add a loop unrolling pass"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
2018-03-28 09:59:38 +11:00
Timothy Arceri
48f6014903 st/glsl_to_nir: correctly handle arrays packed across multiple vars
Fixes piglit test:
tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:38 +11:00
Timothy Arceri
b260efbd5e radeonsi/nir: fix input processing for packed varyings
The location was only being incremented the first time we processed a
location. This meant we would incorrectly skip some elements of
an array if the first element was packed and proccessed previously
but other elements were not.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:38 +11:00
Timothy Arceri
51f175028d ac/nir_to_llvm: fix component packing for double outputs
We need to wait until after the writemask is widened before we
adjust it for component packing.

Together with the previous patch this fixes a number of
arb_enhanced_layouts component layout piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Timothy Arceri
fc51fdbcde st/glsl_to_nir: fix driver location for dual-slot packed doubles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Timothy Arceri
47eee04556 radeonsi/nir: fix scanning of multi-slot output varyings
This fixes tcs/tes varying arrays where we dont lower indirects and
therefore don't split arrays. Here we also fix useagemask for dual
slot doubles.

Fixes a number of arb_tessellation_shader piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-28 09:59:37 +11:00
Eric Anholt
9f1b4f6204 broadcom/vc5: Fix RG16I/UI texture sampling.
How many times did I look at this table without noticing the missing 'G'
in the texture column?

Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.
2018-03-27 15:49:58 -07:00
Rob Clark
16581904b0 nir: fix generated nir_intrinsics.c for MSVC
Apparently it is not happy about things like: .foo = {}

So skip over initializers for empty lists.

Fixes: 76dfed8ae2
Reported-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-27 15:01:11 -04:00
Emil Velikov
eda2f58d15 docs: update calendar 18.0.0 is out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-27 19:11:45 +01:00
Emil Velikov
02f89b62fe docs: add news item and link release notes for 18.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-27 19:08:48 +01:00
Emil Velikov
62eb721ed8 docs: add sha256 checksums for 18.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit fb64913d19)
2018-03-27 19:06:27 +01:00
Emil Velikov
839fb3a696 docs: Update 18.0.0 release notes
Note: the file was originally 17.4.0, yet git stuggles to detect the
move :-\

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit dceb1ce807)
2018-03-27 19:06:19 +01:00
Rob Clark
76dfed8ae2 nir: mako all the intrinsics
I threatened to do this a long time ago.. I probably *should* have done
it a long time ago when there where many fewer intrinsics.  But the
system of macro/#include magic for dealing with intrinsics is a bit
annoying, and python has the nice property of optional fxn params,
making it possible to define new intrinsics while ignoring parameters
that are not applicable (and naming optional params).  And not having to
specify various array lengths explicitly is nice too.

I think the end result makes it easier to add new intrinsics.

v2: couple small fixes found with a test program to compare the old and
    new tables
v3: misc comments, don't rely on capture=true for meson.build, get rid
    of system_values table to avoid return value of intrinsic() and
    *mostly* remove side-effects, add autotools build support
v4: scons build

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 08:36:37 -04:00
Rob Clark
cc3a88e81d nir: fix per_vertex_output intrinsic
This is supposed to have both BASE and COMPONENT but num_indices was
inadvertantly set to 1.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-27 08:20:40 -04:00
Rob Clark
1e0a06000b glsl_types: fix build break with intel/msvc compiler
The VECN() macro was taking advantage of a GCC specific feature that is
not available on lesser compilers, mostly for the purposes of avoiding a
macro that encoded a return statement.

But as suggested by Ian, we could just have the macro produce the entire
method body and avoid the need for this.  So let's do that instead.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740
Fixes: f407edf340
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: Ian Romanick <idr@freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-27 08:17:11 -04:00
Lin Johnson
41cf30b8bc mesa: add GL_HALF_FLOAT as supported type to readpixels
EXT_color_buffer_float spec states:

  "An INVALID_OPERATION error is generated ... if the color buffer is
   a floating-point format and type is not FLOAT, HALF FLOAT, or
   UNSIGNED_INT_10F_11F_11F_REV."

This means that GL_HALF_FLOAT type should be supported when color
buffer has floating-point format.

Fixes Android CTS test android.view.cts.PixelCopyTest.

v2: remove comments of EXT_color_buffer_half_float as
    EXT_color_buffer_float can use type GL_HALF_FLOAT

Signed-off-by: Lin Johnson <johnson.lin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-27 09:04:52 +03:00
Eric Anholt
0024b77e87 broadcom/vc5: Fix swizzling of RGB10_A2UI render targets.
This is the actual hardware layout, and we were only swizzling R/B back
around in texturing.  Fixes part of
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in
simulation.
2018-03-26 17:46:23 -07:00
Eric Anholt
c2b13627d9 broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes.
Just like TLB without a config uniform, we don't have a register index.
2018-03-26 17:46:23 -07:00
Eric Anholt
494da6c2dd broadcom/vc5: Implement workaround for GFXH-1431.
This should fix some blending errors, but doesn't impact any testcases in
the CTS.
2018-03-26 17:46:19 -07:00
Eric Anholt
1bf466270d broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well.
Once we've disabled EZ for some draws, we need to not use EZ on future
draws.  Implementing that made implementing the GT/GE direction trivial.

Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.
2018-03-26 17:46:19 -07:00
Eric Anholt
262208eb3c broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled.
On 3.x, we just don't flag the primitive as needing TF, but those
primitive bits are now allocated to the new primitive types.  Now we need
to actually update the enable flag at draw time.
2018-03-26 17:46:19 -07:00
Eric Anholt
ef2cf9cc3c broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job.
The next job from this client will turn it back on unless TF gets
disabled, but we don't want the state to leak from this client to another
(which causes GPU hangs).
2018-03-26 17:46:19 -07:00
Eric Anholt
1fa820cef8 broadcom/vc5: Move the BCL epilogue code to a per-version compile.
I need to do some new packets for transform feedback on 4.1.
2018-03-26 17:46:19 -07:00
Eric Anholt
3387864130 broadcom/vc5: Fix transform feedback in the presence of point size.
I had this note to myself, and it turns out that a lot of CTS tests use
XFB with points to get data out without using a fragment shader.  Keep
track of two sets of precomputed TF specs (point size in VPM prologue or
not), and switch between them when we enable/disable point size.
2018-03-26 17:46:19 -07:00
Eric Anholt
09ac5ade8f broadcom/vc5: Split transform feedback specs update from buffers.
The specs update will be changing based on additional state flags in the
next commit, and this unindents the buffer update code.
2018-03-26 17:46:18 -07:00
Eric Anholt
9e62aec9cd broadcom/vc5: Limit each transform feedback data spec to 16 dwords.
The length-1 field only has 4 bits, so we need to generate separate specs
when there's too much TF output per buffer.

Fixes
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type
and transform_feedback_max_interleaved.
2018-03-26 17:33:37 -07:00
Eric Anholt
0356db022d gallium/u_vbuf: Protect against overflow with large instance divisors.
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1
as a divisor, so we would overflow to count=0 and upload no data,
triggering the assert below.  We want to upload 1 element in this case,
fixing the test on VC5.

v2: Use some more obvious logic, and explain why we don't use the normal
    round_up().

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 17:33:37 -07:00
Eric Anholt
d491ad1d36 st: Allow accelerated CopyTexImage from RGBA to RGB.
There's nothing to worry about here -- the A channel just gets dropped by
the blit.  This avoids a segfault in the fallback path when copying from a
RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an
RGBA16_SINT texture (the fallback path tries to get/fetch to float
buffers, but the float pack/unpack functions are NULL for SINT/UINT).

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5.

v2: Extract the logic to a helper function and explain what's going on
    better.
v3: const-qualify args

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 17:33:37 -07:00
Marek Olšák
7d2079908d winsys/amdgpu: always allow GTT placements on APUs
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-03-26 19:23:30 -04:00
Marek Olšák
769603564e radeonsi: don't reallocate on DMABUF export if local BOs are disabled 2018-03-26 19:22:12 -04:00
Timothy Arceri
56b867395d glsl: fix infinite loop caused by bug in loop unrolling pass
Just checking for 2 jumps is not enough to be sure we can do a
complex loop unroll. We need to make sure we also have also found
2 loop terminators.

Without this we were attempting to unroll a loop where the second
jump was nested inside multiple ifs which loop analysis is unable
to detect as a terminator. We ended up splicing out the first
terminator but failed to actually unroll the loop, this resulted
in the creation of a possible infinite loop.

Fixes: 646621c66d "glsl: make loop unrolling more like the nir unrolling path"

Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
2018-03-27 09:15:02 +11:00
Vinson Lee
dc94a0506f gallium: Do not add -Wframe-address option for gcc <= 4.4.
This patch fixes these build errors with GCC 4.4.

  Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions

Fixes: 370e356eba ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-26 11:23:51 -07:00
Alyssa Rosenzweig
029f1a2d61 gallium: Correct minor typo in header comments
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-26 10:15:04 -07:00
Rafael Antognolli
27581d18bc intel/aubinator_error_decode: Decode more registers.
Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
70d7c70e8d intel/genxml: Add SAMPLER_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
227edf05f3 intel/genxml: Add ROW_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Rafael Antognolli
4c0ae36143 intel/genxml: Add SC_INSTDONE register.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-26 09:25:57 -07:00
Ian Romanick
91225cb33f i965/vec4: Fix null destination register in 3-source instructions
A recent commit (see below) triggered some cases where conditional
modifier propagation and dead code elimination would cause a MAD
instruction like the following to be generated:

    mad.l.f0  null, ...

Matt pointed out that fs_visitor::fixup_3src_null_dest() fixes cases
like this in the scalar backend.  This commit basically ports that code
to the vec4 backend.

NOTE: I have sent a couple tests to the piglit list that reproduce this
bug *without* the commit mentioned below.  This commit fixes those
tests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Fixes: ee63933a7 ("nir: Distribute binary operations with constants into bcsel")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105704
2018-03-26 08:50:44 -07:00
Ian Romanick
2c643fd978 nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditional
Now that i965 recognizes that a-b generates the same conditions as 'a <
b', there is no reason to condition this transformation on 'is not used
by conditional.'

Since this was the only user of the is_not_used_by_conditional function,
delete it.

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14400775 -> 14400595 (<.01%)
instructions in affected programs: 36712 -> 36532 (-0.49%)
helped: 182
HURT: 26
helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1
helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90%
95% mean confidence interval for instructions value: -0.97 -0.76
95% mean confidence interval for instructions %-change: -0.59% -0.43%
Instructions are helped.

total cycles in shared programs: 532929592 -> 532926345 (<.01%)
cycles in affected programs: 478660 -> 475413 (-0.68%)
helped: 187
HURT: 22
helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18
helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03%
HURT stats (abs)   min: 1 max: 214 x̄: 30.86 x̃: 11
HURT stats (rel)   min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86%
95% mean confidence interval for cycles value: -19.50 -11.57
95% mean confidence interval for cycles %-change: -1.42% -0.58%
Cycles are helped.

GM45 and Iron Lake had similar results. (Iron Lake shown)
total cycles in shared programs: 177851578 -> 177851810 (<.01%)
cycles in affected programs: 24408 -> 24640 (0.95%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44%
HURT stats (abs)   min: 24 max: 108 x̄: 60.00 x̃: 54
HURT stats (rel)   min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02%
95% mean confidence interval for cycles value: -7.75 85.08
95% mean confidence interval for cycles %-change: -0.39% 1.49%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
cd635d149b i965/vec4: Propagate conditional modifiers from compares to adds
No changes on Broadwell or later as those platforms do not use the vec4
backend.

Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11682119 -> 11681056 (<.01%)
instructions in affected programs: 150403 -> 149340 (-0.71%)
helped: 950
HURT: 0
helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1
helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71%
95% mean confidence interval for instructions value: -1.19 -1.04
95% mean confidence interval for instructions %-change: -0.84% -0.79%
Instructions are helped.

total cycles in shared programs: 257495842 -> 257495238 (<.01%)
cycles in affected programs: 270302 -> 269698 (-0.22%)
helped: 271
HURT: 13
helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2
helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28%
HURT stats (abs)   min: 2 max: 12 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -2.41 -1.84
95% mean confidence interval for cycles %-change: -0.31% -0.26%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10430493 -> 10429727 (<.01%)
instructions in affected programs: 120860 -> 120094 (-0.63%)
helped: 766
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 146138718 -> 146138446 (<.01%)
cycles in affected programs: 244114 -> 243842 (-0.11%)
helped: 132
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2
helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19%
95% mean confidence interval for cycles value: -2.12 -2.00
95% mean confidence interval for cycles %-change: -0.18% -0.15%
Cycles are helped.

GM45 and Iron Lake had identical results. (Iron Lake shown)
total instructions in shared programs: 7780251 -> 7780248 (<.01%)
instructions in affected programs: 175 -> 172 (-1.71%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49%

total cycles in shared programs: 177851584 -> 177851578 (<.01%)
cycles in affected programs: 9796 -> 9790 (-0.06%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05%

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
780f307ba8 i965/vec4: Allow cmod propagation when src0 is a uniform or shader input
No shader-db changes.  This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input.  However, this
change allows the next commit to help more shaders.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
020b0055e7 i965/fs: Propagate conditional modifiers from compares to adds
The math inside the add and the cmp in this instruction sequence is the
same.  We can utilize this to eliminate the compare.

add(8)          g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
cmp.z.f0(8)     null<1>F        g2<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };

This is reduced to:

add.z.f0(8)     g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };

This optimization pass could do even better.  The nature of converting
vectorized code from the GLSL front end to scalar code in NIR results in
sequences like:

add(8)          g7<1>F          g4<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
add(8)          g6<1>F          g3<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
add(8)          g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
cmp.z.f0(8)     null<1>F        g2<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };
cmp.z.f0(8)     null<1>F        g3<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g10<1>F         (abs)g6<8,8,1>F 3e-37F          { align1 1Q };
cmp.z.f0(8)     null<1>F        g4<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g12<1>F         (abs)g7<8,8,1>F 3e-37F          { align1 1Q };

In this sequence, only the first cmp.z is removed.  With different
scheduling, all 3 could get removed.

Skylake
total instructions in shared programs: 14407009 -> 14400173 (-0.05%)
instructions in affected programs: 1307274 -> 1300438 (-0.52%)
helped: 4880
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.45 -1.35
95% mean confidence interval for instructions %-change: -0.72% -0.69%
Instructions are helped.

total cycles in shared programs: 532943169 -> 532923528 (<.01%)
cycles in affected programs: 14065798 -> 14046157 (-0.14%)
helped: 2703
HURT: 339
helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2
helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21%
HURT stats (abs)   min: 1 max: 739 x̄: 39.86 x̃: 12
HURT stats (rel)   min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41%
95% mean confidence interval for cycles value: -8.66 -4.26
95% mean confidence interval for cycles %-change: -0.24% -0.14%
Cycles are helped.

LOST:   0
GAINED: 1

Broadwell
total instructions in shared programs: 14719636 -> 14712949 (-0.05%)
instructions in affected programs: 1288188 -> 1281501 (-0.52%)
helped: 4845
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.43 -1.33
95% mean confidence interval for instructions %-change: -0.72% -0.68%
Instructions are helped.

total cycles in shared programs: 559599253 -> 559581699 (<.01%)
cycles in affected programs: 13315565 -> 13298011 (-0.13%)
helped: 2600
HURT: 269
helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2
helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20%
HURT stats (abs)   min: 1 max: 790 x̄: 53.07 x̃: 20
HURT stats (rel)   min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75%
95% mean confidence interval for cycles value: -8.47 -3.77
95% mean confidence interval for cycles %-change: -0.27% -0.18%
Cycles are helped.

LOST:   0
GAINED: 8

Haswell
total instructions in shared programs: 12978609 -> 12973483 (-0.04%)
instructions in affected programs: 932921 -> 927795 (-0.55%)
helped: 3480
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58%
95% mean confidence interval for instructions value: -1.53 -1.42
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 410270788 -> 410250531 (<.01%)
cycles in affected programs: 10986161 -> 10965904 (-0.18%)
helped: 2087
HURT: 254
helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4
helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21%
HURT stats (abs)   min: 1 max: 519 x̄: 40.49 x̃: 16
HURT stats (rel)   min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47%
95% mean confidence interval for cycles value: -12.82 -4.49
95% mean confidence interval for cycles %-change: -0.31% -0.18%
Cycles are helped.

LOST:   0
GAINED: 5

Ivy Bridge
total instructions in shared programs: 11686082 -> 11681548 (-0.04%)
instructions in affected programs: 937696 -> 933162 (-0.48%)
helped: 3150
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49%
95% mean confidence interval for instructions value: -1.49 -1.38
95% mean confidence interval for instructions %-change: -0.71% -0.67%
Instructions are helped.

total cycles in shared programs: 257514962 -> 257492471 (<.01%)
cycles in affected programs: 11524149 -> 11501658 (-0.20%)
helped: 1970
HURT: 239
helped stats (abs) min: 1 max: 3525 x̄: 17.48 x̃: 3
helped stats (rel) min: <.01% max: 49.60% x̄: 0.46% x̃: 0.17%
HURT stats (abs)   min: 1 max: 1358 x̄: 50.00 x̃: 15
HURT stats (rel)   min: 0.02% max: 59.88% x̄: 1.84% x̃: 0.65%
95% mean confidence interval for cycles value: -17.01 -3.35
95% mean confidence interval for cycles %-change: -0.33% -0.08%
Cycles are helped.

LOST:   9
GAINED: 1

Sandy Bridge
total instructions in shared programs: 10432841 -> 10429893 (-0.03%)
instructions in affected programs: 685071 -> 682123 (-0.43%)
helped: 2453
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 1.20 x̃: 1
helped stats (rel) min: 0.02% max: 7.55% x̄: 0.64% x̃: 0.46%
95% mean confidence interval for instructions value: -1.23 -1.17
95% mean confidence interval for instructions %-change: -0.67% -0.62%
Instructions are helped.

total cycles in shared programs: 146133660 -> 146134195 (<.01%)
cycles in affected programs: 3991634 -> 3992169 (0.01%)
helped: 1237
HURT: 153
helped stats (abs) min: 1 max: 2853 x̄: 6.93 x̃: 2
helped stats (rel) min: <.01% max: 29.00% x̄: 0.24% x̃: 0.14%
HURT stats (abs)   min: 1 max: 1740 x̄: 59.56 x̃: 12
HURT stats (rel)   min: 0.03% max: 78.98% x̄: 1.96% x̃: 0.42%
95% mean confidence interval for cycles value: -5.13 5.90
95% mean confidence interval for cycles %-change: -0.17% 0.16%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 1

GM45 and Iron Lake had similar results (GM45 shown):
total instructions in shared programs: 4800332 -> 4798380 (-0.04%)
instructions in affected programs: 565995 -> 564043 (-0.34%)
helped: 1451
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 1.35 x̃: 1
helped stats (rel) min: 0.05% max: 5.26% x̄: 0.47% x̃: 0.31%
95% mean confidence interval for instructions value: -1.40 -1.29
95% mean confidence interval for instructions %-change: -0.50% -0.45%
Instructions are helped.

total cycles in shared programs: 122032318 -> 122027798 (<.01%)
cycles in affected programs: 8334868 -> 8330348 (-0.05%)
helped: 1029
HURT: 1
helped stats (abs) min: 2 max: 40 x̄: 4.43 x̃: 2
helped stats (rel) min: <.01% max: 1.83% x̄: 0.09% x̃: 0.04%
HURT stats (abs)   min: 38 max: 38 x̄: 38.00 x̃: 38
HURT stats (rel)   min: 0.25% max: 0.25% x̄: 0.25% x̃: 0.25%
95% mean confidence interval for cycles value: -4.70 -4.08
95% mean confidence interval for cycles %-change: -0.09% -0.08%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
5bbb3d60d3 i965/fs: Allow cmod propagation when src0 is a uniform or shader input
No shader-db changes.  This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input.  However, this
change allows the next commit to help about 900 more shaders.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Ian Romanick
8f83eea71e i965: Add negative_equals methods
This method is similar to the existing ::equals methods.  Instead of
testing that two src_regs are equal to each other, it tests that one is
the negation of the other.

v2: Simplify various checks based on suggestions from Matt.  Use
src_reg::type instead of fixed_hw_reg.type in a check.  Also suggested
by Matt.

v3: Rebase on 3 years.  Fix some problems with negative_equals with VF
constants.  Add fs_reg::negative_equals.

v4: Replace the existing default case with BRW_REGISTER_TYPE_UB,
BRW_REGISTER_TYPE_B, and BRW_REGISTER_TYPE_NF.  Suggested by Matt.
Expand the FINISHME comment to better explain why it isn't already
finished.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-26 08:50:43 -07:00
Gert Wollny
a21da49e5c mesa/st/tests: Use tgsi opcode enum also in the test classes
Fixes: ec478cf9c31K ("st/mesa,tgsi: use enum tgsi_opcode")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105737
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-26 09:04:53 -06:00
Eric Engestrom
1e36fe5dc4 meson: fix header check message
before: Checking if "endian.h works" compiles: YES
after:  Checking if "endian.h" compiles: YES

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-03-26 09:59:32 +01:00
Rob Clark
2f181c8c18 glsl_types: vec8/vec16 support
Not used in GL but 8 and 16 component vectors exist in OpenCL.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-25 10:42:54 -04:00
Rob Clark
f407edf340 glsl_types: refactor/prep for vec8/vec16
Refactor things so there isn't so much typing involved to add new
things.

Also drops a pointless conditional (out of bounds rows or columns
already returns error_type in all paths.. might as well drop it
rather than make the check more convoluted in the next patch by
adding the vec8/vec16 case).

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-25 10:42:54 -04:00
Jordan Justen
d60eaf7b1f anv: Set genX_table for gen11
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-23 17:23:59 -07:00
Jordan Justen
af8535d02f anv: Add gen11 to anv_genX_call
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-23 17:23:59 -07:00
Mathias Fröhlich
4a8ef1f5d4 vbo: Make sure the internal VAO's stay within limits.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:59:02 +01:00
Mathias Fröhlich
1a131aaf4b mesa: Flag early if we modify a SharedAndImmutable VAO.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:58:59 +01:00
Mathias Fröhlich
19526a57f5 mesa: When copying a VAO also copy the vertex attribute mode.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-23 19:58:54 +01:00
Emil Velikov
5a75019ad0 configure: use AC_CHECK_HEADERS to check for endian.h
The currently we use the singular CHECK_HEADER combined with explicit
append to the DEFINES variable. That is a legacy misnomer, since it
requires us to add $DEFINES to every piece that we build.

Using the plural version of the helper sets the HAVE_ macro for us, plus
ensures it's passed to the compiler - if config.h is available in there
(not in the case of mesa) otherwise on the command line.

In hindsight, we should replace all the AC_CHECK_{FUNC,HEADER} instances
with the plural version (or even the _ONCE suffixed version) and drop
the DEFINES hacks.

Fixes: cbee1bfb34 ("meson/configure: detect endian.h instead of trying
to guess when it's available")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105717
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-03-23 18:12:52 +00:00
Kenneth Graunke
90f556f0b1 android: Use local i915_drm.h rather than the system one.
Fixes: 2d26c99933 (intel: devinfo: meson: include drm uapi)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-03-23 10:05:02 -07:00
Brian Paul
e31d5bd2f9 st/mesa: s/unsigned/enum pipe_shader_type/ for st_bind_ubos()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
6a93deedf5 st/mesa: whitespace/formatting fixes in st_atom_constbuf.c
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
aad23f91ee st/mesa: s/unsigned/enum pipe_shader_type/
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
93581c2ca0 svga: simplify uses_flat_interp expression in emit_input_declarations()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
c99f46c2ac svga: replace unsigned with proper enum names
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-23 09:03:26 -06:00
Brian Paul
7181a9fa0e tgsi,softpipe: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
ec478cf9c3 st/mesa,tgsi: use enum tgsi_opcode
Need to update the tgsi code and st_glsl_to_tgsi code at the same time
to prevent compile break since C++ is much pickier about implicit
enum/unsigned casting.

Bump size of glsl_to_tgsi_instruction::op to 10 bits to be sure to
avoid MSVC signed enum overflow issue.  No change in class size.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
ccecb2bbd3 tgsi/nir: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
22a3190c85 tgsi: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
9413d1c0fe gallivm: use enum tgis_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
7df96826f8 svga: use enum tgsi_opcode
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Brian Paul
4e0f967f6d tgsi: convert opcode macros to enums
Enums are nicer in gdb.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-23 09:03:26 -06:00
Lionel Landwerlin
412fae46c0 compiler: glsl: silence valgrind warning on write cache
I don't think it actually fixes anything, but that's nice not to have valgrind warnings.
It manifests itself when running the piglit test : glsl-fs-raytrace-bug27060

==2058== Uninitialised byte(s) found during client check request
==2058==    at 0xC5BB040: blob_write_bytes (blob.c:152)
==2058==    by 0xC595359: write_variable (nir_serialize.c:144)
==2058==    by 0xC59560C: write_var_list (nir_serialize.c:192)
==2058==    by 0xC5982E4: nir_serialize (nir_serialize.c:1124)
==2058==    by 0xC0B729D: brw_program_serialize_nir (brw_program.c:835)
==2058==    by 0xC0AB2D6: brw_link_shader (brw_link.cpp:358)
==2058==    by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169)
==2058==    by 0xC36C7ED: create_new_program(gl_context*, state_key*) (ff_fragment_shader.cpp:1127)
==2058==    by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157)
==2058==    by 0xC1B50AF: update_program (state.c:134)
==2058==    by 0xC1B56DF: _mesa_update_state_locked (state.c:352)
==2058==    by 0xC1B579A: _mesa_update_state (state.c:386)
==2058==  Address 0xf1eab8a is 58 bytes inside a block of size 96 alloc'd
==2058==    at 0x4C2CB8F: malloc (vg_replace_malloc.c:299)
==2058==    by 0xC0FD306: ralloc_size (ralloc.c:121)
==2058==    by 0xC0FD5B1: ralloc_array_size (ralloc.c:208)
==2058==    by 0xC452B3B: (anonymous namespace)::nir_visitor::visit(ir_variable*) (glsl_to_nir.cpp:448)
==2058==    by 0xC45CE8B: ir_variable::accept(ir_visitor*) (ir.h:428)
==2058==    by 0xC46D0B5: visit_exec_list(exec_list*, ir_visitor*) (ir.cpp:1898)
==2058==    by 0xC451D2F: glsl_to_nir (glsl_to_nir.cpp:162)
==2058==    by 0xC0B5223: brw_create_nir (brw_program.c:79)
==2058==    by 0xC0AAB67: brw_link_shader (brw_link.cpp:257)
==2058==    by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169)
==2058==    by 0xC36C7ED: create_new_program(gl_context*, state_key*) (ff_fragment_shader.cpp:1127)
==2058==    by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-23 13:05:12 +00:00
Eric Engestrom
cbee1bfb34 meson/configure: detect endian.h instead of trying to guess when it's available
Cc: Maxin B. John <maxin.john@gmail.com>
Cc: Khem Raj <raj.khem@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Suggested-by: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: <mesa-stable@lists.freedesktop.org>
2018-03-23 11:44:21 +00:00
Juan A. Suarez Romero
ee2b943fa8 wayland-drm: do not distribute generated sources
Instead we will re-generate them again on building.

v2: get rid of BUILT_SOURCES (Daniel, Emil)
v3: keep BUILT_SOURCES for egl/Makefile.am (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-23 11:27:12 +01:00
Samuel Pitoiset
ccc64f3133 radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8
The hardware only supports 32-bit depth surfaces, but we can
enable TC-compat HTILE for 16-bit depth surfaces if no Z planes
are compressed.

The main benefit is to reduce the number of depth decompression
passes. Also, we don't need to implement DB->CB copies which is
fine.

This improves Serious Sam 2017 by +4%. Talos and F12017 are also
affected but I don't see a performance difference.

This also improves the shadowmapping Vulkan demo by 10-15%
(FPS is now similar to AMDVLK).

No CTS regressions on Polaris10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:57 +01:00
Samuel Pitoiset
5ae9772245 radv: add radv_calc_decompress_on_z_planes() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:55 +01:00
Samuel Pitoiset
9b8e75bee3 radv: add radv_image_is_tc_compat_htile() helper
Instead of that huge conditional that's going to be crazy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-23 10:05:54 +01:00
Jason Ekstrand
884d27bcf6 nir: Rename image intrinsics to image_var
Generated with

git grep -l nir_intrinsic_image | xargs \
sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g'

and some manual fixing in nir_intrinsics.h

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-23 13:48:11 +11:00
Dave Airlie
fa683385de virgl: add ARB_cull_distance support.
This just allows the properties through to the host if we have
cull dist support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-23 10:21:10 +10:00
Eric Anholt
d7a015cbc6 broadcom/vc5: Account for InstanceID/VertexID in VPM segment size.
Fixes failure in
GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size
2018-03-22 15:12:21 -07:00
Eric Anholt
b8387dbc49 broadcom/vc5: Allow FBOs with mixed color formats.
This is required by GLES3, fixing
GTF-GLES3.gtf.GL3Tests.framebuffer_srgb.framebuffer_srgb_draw
2018-03-22 15:12:21 -07:00
Eric Anholt
4f62679be5 broadcom/vc5: Add missing support for 2101010_REV vertex attributes.
Fixes
GTF-GLES3.gtf.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_invalid2,
where we hadn't thrown a GL error as needed in the extension-disabled
case.  We want to be exposing the extension anyway.
2018-03-22 15:12:21 -07:00
Eric Anholt
ba29b89dc7 broadcom/vc5: Set up a vertex position if the shader doesn't.
Our backend needs some sort of vertex position value to emit the scaled
viewport values and such.  Fixes potential segfaults in
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx
2018-03-22 15:12:21 -07:00
Lionel Landwerlin
903e9952fb i965: add performance query support on CNL
v2: Add brw_oa_cnl.xml to EXTRA_DIST (Emil)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
e7f6d1e5f8 i965: perf: add support for new equation operators
Some equations of the CNL metrics started to use operators we haven't
defined yet, just add those.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
57a11550bc i965: perf: query topology
With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam to tell userspace what subslices
are available.

We introduce a new uAPI in the kernel driver to report exactly what
part of the GPU are fused and require this to be available on Gen10+.

Prior generations can continue to rely on GETPARAM on older kernels.

This patch is quite a lot of code because we have to support lots of
different kernel versions, ranging from not providing any information
(for Haswell on 4.13 through 4.17), to being able to query through
GETPARAM (for gen8/9 on 4.13 through 4.17), to finally requiring 4.17
for Gen10+.

This change stores topology information in a unified way on
brw_context.topology from the various kernel APIs. And then generates
the appropriate values for the equations from that unified topology.

v2: Move slice/subslice masks fields to gen_device_info (Rafael)

v3: Add a gen_device_info_subslice_available() helper (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
c1900f5b0f intel: devinfo: add helper functions to fill fusing masks values
There are a couple of ways we can get the fusing information from the
kernel :

  - Through DRM_I915_GETPARAM with the SLICE_MASK/SUBSLICE_MASK
    parameters

  - Through the new DRM_IOCTL_I915_QUERY by requesting the
    DRM_I915_QUERY_TOPOLOGY_INFO

The second method is more accurate and also gives us the EUs fusing
masks. It's also a requirement for CNL as this platform has asymetric
subslices and the first method SUBSLICE_MASK value is assumed uniform
across slices.

v2: Change gen_device_info_update_from_masks() to generate topology
    and call into gen_device_info_update_from_topology (Lionel/Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
2d26c99933 intel: devinfo: meson: include drm uapi
Already available with the autotools build.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
5d3e74a5a5 drm-uapi: bump headers
Required updates from drm-next for changes in i965.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
c471716574 intel: devinfo: store slice/subslice/eu masks
We want to store values coming from the kernel but as a first step, we
can generate mask values out the numbers already stored in the
gen_device_info masks.

v2: Add a helper to set EU masks (Lionel/Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Lionel Landwerlin
7e2c6147da intel: devinfo: store number of EUs per subslice
This will be reused to store values reported by the kernel. The main
use case will be for use as the input values of the metric sets
equations for the INTEL_performance_queries extension. By storing this
information in the gen_device_info we make this non GL specific so
this can be reused by Vulkan if we ever have an equivalent extension.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 20:14:22 +00:00
Dylan Baker
8e5988eb35 Revert "meson: merge C and C++ compiler arguments check"
This reverts commit cb2ddcefa5.

This causes clang to error out building C++ code. The plan is to fix the
build to work with clang, but in the mean time we'll just revert this

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2018-03-22 11:35:08 -07:00
Lionel Landwerlin
1603ce1921 i965/perf: fix config registration when uploading to kernel
When registring configurations to the kernel for the first time, we
run into an issue where the id number is not properly set (we're using
the wrong variable). As a result when trying to use that id later on,
we get an error.

This issue manifest itself the first time you use frameretrace after
reboot, subsequent runs are fine.

Fixes: 27ee83eaf7 ("i965: perf: add support for userspace configurations")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 18:21:57 +00:00
Lepton Wu
a8b846bccd gallium/winsys/kms: Add support for multi-planes
Add a new struct kms_sw_plane which delegate a plane and use it
in place of sw_displaytarget. Multiple planes share same underlying
kms_sw_displaytarget.

v2:
 - add more check for plane size (Tomasz)
v3:
 - split from larger patch (Emil)
v4:
 - no change from v3
v5:
 - remove mapped field (Tomasz)
v6:
 - remove change-id in commit message (Tomasz)
v7:
 - add revision history in commit message (Emil)

Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2018-03-22 18:10:44 +00:00
Lepton Wu
d891f28df9 gallium/winsys/kms: Fix possible leak in map/unmap.
If user calls map twice for kms_sw_displaytarget, the first mapped
buffer could get leaked. Instead of calling mmap every time, just
reuse previous mapping. Since user could map same displaytarget with
different flags, we have to keep two different pointers, one for rw
mapping and one for ro mapping. Also introduce reference count for
mapped buffer so we can unmap them at right time.

v2:
 - avoid duplicated mapping and leaked mapping (Tomasz)
v3:
 - split from larger patch (Emil)
v4:
 - remove munmap from dt_destory (Emil)
v5:
 - introduce reference count for mapping (Tomasz)
 - add back munmap in dt_destory
v6:
 - remove change-id in commit message (Tomasz)
v7:
 - remove munmap from dt_destory again (Emil)
 - add revision history in commit message (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2018-03-22 18:10:42 +00:00
Juan A. Suarez Romero
4db269f30c broadcom/vc4: add path to nir_builder.h
As the other VC4 files do. Otherwise, it won't find nir_builder.h

v2: add path in source code rather changing autotools (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
d39e828c82 autotools: add tegra header files
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
40ecee89b7 swr/rast: autotools: add events_private.proto in dist tarball.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
0bf1274883 radv: autotools: add radv_extensions.h in the generated VULKAN list
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
13459c637a anv/radv: autotools: include vulkan_*.h headers
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero
f8b749b7c0 nir: autotools, meson: add GLSL.ext.AMD.h in the files list
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-22 18:25:39 +01:00
Matt Turner
724586a266 intel/compiler: Readd ICL to test_eu_validate.cpp
Now that the PCI IDs are upstream, this can be readded.
2018-03-22 09:56:09 -07:00
Matt Turner
65b060d9cb intel/compiler: Skip 64-bit type tests when types not available
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
ad7ed86bf7 intel: Add a Ice Lake PCI IDs
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-22 09:56:09 -07:00
Anuj Phogat
1065acfb69 intel: Disable fast color clear on icl
Disabling fast color clear makes fbo-clearmipmap test render correct
texture in base miplevel. Fast color clear is anyways disabled for
non-base miplevels.

Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Jason Ekstrand
d2eecf0b0b intel/compiler/icl: Clear "null render target" bit in extended message descriptor
Otherwise all our render target writes go no where.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
1484876ef7 intel/compiler/icl: Update the assert in brw_stage_has_packed_dispatch()
Rafael ran piglit with the test code enabled and saw no additional GPU
hangs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
f05e0d9c2a intel/common/icl: Disable hiz surface sampling
On gen11+ AUX_HIZ is not a supported value for surfaces being
sampled by the 3D sampler.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Anuj Phogat
370af9dcc0 intel/common/icl: Add L3 config
ICL uses the same L3 configs as CNL, just leaving the SLM configs out.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Matt Turner
f56693af4b intel/tools/aubinator: Drop platform list from print_help()
We all know the platform names, and I don't want to update this list
continually.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-22 09:56:09 -07:00
Derek Foreman
aa18a63512 egl/wayland: Make swrast display_sync the correct queue
commit 03dd9a88b0 introduced per surface
queues, but the display_sync for swrast_commit_backbuffer remained on
the old queue.  This is likely to break when dispatching the correct
queue at the top of function (which can't dispatch the sync callback
we're waiting for).

The easiest known reproduction case is running weston-subsurfaces under
weston --use-pixman

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-22 15:27:35 +00:00
Samuel Pitoiset
52fba3f45d radv: remove unused radv_pipeline::needs_data_cache variable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-22 14:30:37 +01:00
Eric Engestrom
cb2ddcefa5 meson: merge C and C++ compiler arguments check
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 11:59:12 +00:00
Mathias Fröhlich
880c1718b6 omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}
We're trying to be -Wundef clean so that we can turn it on (and
eventually make it an error).

Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead
of #ifdef; I could've changed these, but the point of -Wundef is to
catch typos, so we might as well make the change the right way.

Fixes: 83d4a5d5ae "st/omx/tizonia: Add H.264 decoder"
Fixes: b2f2236dc5 "st/omx/tizonia: Add H.264 encoder"
Fixes: c62cf1f165 "st/omx/tizonia/h264d: Add EGLImage support"
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 11:39:28 +00:00
Mathias Fröhlich
795b465c50 meson: simplify omx logic
and let's make sure `with_gallium_omx` is never 'auto' and can only be
one of [bellagio, tizonia, disabled].

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-22 10:08:10 +00:00
Mathias Fröhlich
862c872c48 vbo: Remove now duplicate _DrawVAO notification.
The DriverFlags.NewArray bit is already set to NewDriverState in
_mesa_set_draw_vao since we have actually just above changed the VAOs
content. So this can be removed.
The _vbo_update_inputs is called by the vbo...recalculate_inputs being
set through the same mechanism as described above.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:53 +01:00
Mathias Fröhlich
006b5e798a vbo: Remove now duplicate _vbo_update_inputs from dlist draw.
At the current state, _vbo_update_inputs is called from
the draw callback if vbo...recalculate_inputs is set.
But that is now set of the _DrawVAO or its content or the
vertex program mode is changed.
So remove _vbo_update_inputs from the direct dlist draw path.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:53 +01:00
Mathias Fröhlich
2887c98140 vbo: Remove redundant set of DriverFlags.NewArray in vbo_bind_arrays.
Now that setting vbo...recalculate_inputs also sets the
DriverFlags.NewArray bits into the NewDriverState setting that from
vbo_bind_arrays is redundant.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
9f5b6ef2ef vbo: Remove vbo...recalculate_inputs from vbo_exec_invalidate_state.
This flag is now set when the actual Array._DrawVAO changes.
So setting this flag is redundant here.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
bf328359a7 mesa: A change of gl_vertex_processing_mode needs an array update.
Since arrays also handle the mapping of current values into the
disabled array slots, we need to tell the array update code that
this mapping has changed. Also mark only dirty if it has changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
5b91786225 mesa: Set DriverFlags.NewArray together with vbo...recalculate_inputs.
Both mean something very similar and are set at the same time now.
For that vbo module to be set from core mesa, implement a public vbo
module method to set that flag. In the longer term the flag should
vanish in favor of a driver flag of the appropriate driver.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
d3c604e12e mesa: Update VAO internal state when setting the _DrawVAO.
Update the VAO internal state on Array._DrawVAO instead of
Array.VAO. Also the VAO internal state update gets triggered now
by a change of Array._DrawVAO instead of the _NEW_ARRAY state flag.
Also no driver looks at any VAO's NewArrays value from within
the Driver.UpdateState callback. So it should be safe to move
this update into the _mesa_set_draw_vao method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
c4c56ff303 vbo: Move vbo_bind_arrays into a dd_driver_functions draw callback.
Factor out that common call into the almost single place.
Remove the _mesa_set_drawing_arrays call from vbo_{exec,save}_draw code
paths as the function is now called through vbo_bind_arrays.
Prepare updating the list of struct gl_vertex_array entries via
calling _vbo_update_inputs for being pushed into those drivers that
finally work on that long list of gl_vertex_array pointers.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Mathias Fröhlich
6307d1be0a mesa: Move vbo draw functions into dd_function_table.
Move vbo draw functions into struct dd_function_table.
For now just wrap the underlying vbo functions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-22 04:58:52 +01:00
Aaron Watry
23100acc8f clover/llvm: Fix build against LLVM/Clang 4.0
The opencl 1.0 langstandard was renamed in 5.0+

v2: Move preprocessor check into compat.hpp

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-21 21:03:23 -05:00
Timothy Arceri
c135316555 ac/nir_to_llvm: add frexp support
Fixes CTS tests:
KHR-GL40.gpu_shader_fp64.builtin.frexp_double
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3
KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4

And piglit test:
tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-22 12:42:34 +11:00
Timothy Arceri
cca2141745 nir: add frexp_exp and frexp_sig opcodes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-22 12:42:34 +11:00
Caio Marcelo de Oliveira Filho
12c22b897a anv/pipeline: don't pass constant view index in multiview
If view mask has only one bit set, view index is effectively a
constant, so doesn't need to be passed to the next stages, just always
set it.

Part of this was in the original patch that added
anv_nir_lower_multiview.c but disabled.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 14:49:50 -07:00
Caio Marcelo de Oliveira Filho
5e7c1d05d4 anv/pipeline: use less instructions for multiview
The view_index is encoded in the remainder of dividing instance id by
the number of views in the view mask (n). In the general case (handled
by the else clause), there is a need to map from 0..n-1 into the
number of the view being masked. For that a map is encoded.

In the case only the first n bits in the mask are set, the mapping is
trivial, 0..n-1 already represent what view is being referred to.

That case was in the original patch that added
anv_nir_lower_multiview.c but disabled.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 14:49:50 -07:00
Eric Anholt
baeb6a4b4a broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI.
Unfortunately TGSI doesn't record the type of the FS output like GLSL
does, but VC5's TLB writes depend on the output's base type.  Just record
the type in the key at variant compile time when we've got a TGSI input
and then fix it up.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a
GPU hang that breaks most tests that come after it.
2018-03-21 14:02:34 -07:00
Neil Roberts
61603f0e42 spirv: Add a 64-bit implementation of Frexp
The implementation is inspired by
lower_instructions_visitor::dfrexp_sig_to_arith.

This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4
test using the ARB_gl_spirv branch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 20:18:44 +01:00
Rafael Antognolli
5297a17571 aubinator_error_decode: Compare only the class_name of the ring.
ring_name is "<class_name> + <instance_id>" (e.g. rcs0). So we need to
first compare the class name only, then get the instance id.

Without this, INSTDONE is not being decoded.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-21 11:35:15 -07:00
Thomas Helland
8d5cd91ca0 nir: Migrate nir_dce to instr worklist
Shader-db runtime change avarage of five runs:
   Before 125,77 seconds (+/- 0,09%)
   After  124,48 seconds (+/- 0,07%)

Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric at anholt.net>
2018-03-21 19:26:40 +01:00
Thomas Helland
edb18564c7 nir: Initial implementation of a nir_instr_worklist
Make a simple worklist by basically just wrapping u_vector.
This is intended used in nir_opt_dce to reduce the number of calls
to ralloc, as we are currenlty spamming ralloc quite bad. It should
also give better cache locality and much lower memory usage.

Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric at anholt.net>
2018-03-21 19:26:27 +01:00
Scott D Phillips
cab8df1e3e intel/tools: aubinator: Catch gen11 "enhanced execlist" submission
Different registers are used for execlist submission in gen11, so
also watch those. This code only watches element zero of the
submit queue, which is all aubdump currently writes.

Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-21 11:07:15 -07:00
Marek Olšák
a8d55374dc radeonsi: fix a snprintf warning on gcc 7.3.0 2018-03-21 13:43:09 -04:00
Marek Olšák
cf0a95afac radeonsi/gfx9: print the swizzle mode for testdma
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-21 13:40:06 -04:00
Marek Olšák
f7ffa504a0 ac/surface: compute tile swizzle for GFX9
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-21 13:40:06 -04:00
Eric Anholt
9f0c9c6d18 broadcom/vc5: Don't skip job submit just because everything is scissored.
The coordinate shaders may now have side effects in the form of transform
feedback.

Part of fixing
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_misc
2018-03-21 10:04:21 -07:00
Eric Anholt
024e814dee broadcom/vc5: Handle sparsely populated SO target array.
Fixes
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_state_variables
2018-03-21 10:04:21 -07:00
Eric Anholt
f735ac6b1c broadcom/vc5: Fix 3D miplevel limit to match other texture targets.
Fixes segfault in
GTF-GLES3.gtf.GL3Tests.texture_storage.texture_storage_texture_levels on
level 13.
2018-03-21 10:04:21 -07:00
Eric Anholt
ba87d85b04 broadcom/vc5: Clamp the instance divisor to 16 bits.
Fixes debug assert on
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor

Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-21 10:04:21 -07:00
Lionel Landwerlin
3dd92184d5 i965: fix android build
This is the equivalent of commit 5770e1d89e for
android.

v2: fix xml files path and file given to --header

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Fixes: 2d2b15fbca ("i965: fix autotools/android build")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105634
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 18:56:47 +02:00
Juan A. Suarez Romero
e5cd376c2f docs: fix typo in 17.3.6 release notes
Title is about 17.3.5, when it must be about 17.3.6.

CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-21 16:37:49 +00:00
Caio Marcelo de Oliveira Filho
8571c577aa nir/dead_cf: also remove useless ifs
Generalize the code for remove dead loops to also remove dead if
nodes. The conditions are the same in both cases, if the node (and
it's children) don't have side-effects AND the nodes after it don't
use the values produced by the node.

The only difference is when evaluating side effects: loops consider
only return jumps as a side-effect -- they can stop execution of nodes
after it; 'if' nodes outside loops should consider all kinds of
jumps (return, break, continue) since all of them can cause execution
of nodes after it to be skipped.

After this patch, empty ifs (those which both then and else blocks are
empty) will be removed by nir_opt_dead_cf.

It caused no change to shader-db, in part because the removal of empty
ifs is currently covered by nir_opt_peephole_select.

v2: Improve the identification of cases where break/continue can cause
    side-effects. (Jason)

v3: Move code comment changes to a different patch. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 09:36:09 -07:00
Caio Marcelo de Oliveira Filho
470056d37b nir/dead_cf: rephrase definition of a dead loop node
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-21 09:35:57 -07:00
Juan A. Suarez Romero
e1f8c23e18 docs: update calendar, add news and link release notes to 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-03-21 16:02:37 +00:00
Juan A. Suarez Romero
543e7c8382 docs: add sha256 checksums for 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 13dd6016d7)
2018-03-21 15:58:55 +00:00
Juan A. Suarez Romero
09448940ed docs: add release notes for 17.3.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 8a51f3857c)
2018-03-21 15:58:52 +00:00
Leo Liu
c4de2f0880 radeon/vce: move feedback command inside of destroy function
On the CI family, firmware requires the destory command have to be the
last command in the IB, moving feedback command after destroy is causing
issues on CI cards, so we have to keep the previous logic that moves
destroy back to the last command.

But as the original issue fixed previously, with the newer family like Vega10,
feedback command have to be included inside of the task info command along
with destroy command.

Fixes: 6d74cb25("radeon/vce: move destroy command before feedback command")

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-03-21 11:24:35 -04:00
Eric Engestrom
1346a36162 egl: pull update from Khronos and drop local define
Added in Khronos in 2b6bb4ee45cc46c89d4a "EGL_MESA_drm_image: add
EGL_DRM_BUFFER_USE_CURSOR_MESA to egl.xml" [1] as part of PR #36 [2].

[1] 2b6bb4ee45
[2] https://github.com/KhronosGroup/EGL-Registry/pull/36

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Eric Engestrom
f744c6c1e2 egl: align the formatting of Haiku section of eglplatform.h with Khronos'
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Eric Engestrom
ac698ae4a0 egl: add Ozone section to eglplatform.h
This pulls in commit a93f559e9c11fa53fb5f1cc255b8f75433f85d2a "Add Ozone
section to eglplatform.h" from Khronos [1] added by Brian Anderson [2]
a few months ago.

[1] a93f559e9c
[2] https://github.com/KhronosGroup/EGL-Registry/pull/26

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 14:28:05 +00:00
Aaron Watry
c95d953b18 clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version
Use get_language_version to calculate default cl standard based on
device capabilities and -cl-std specified in build options.

v5; move dev_clc_version declaration from an earlier patch
v4: Squash the __OPENCL_VERSION__ and CLC language version patches
v3: (Jan) Allow device_version up to 2.2 while device_clc_version
    only goes to 2.0
    Use get_cl_version to calculate version instead
v2: Split out from the previous patch (Pierre)

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
CC: Jan Vesely <jan.vesely@rutgers.edu>
2018-03-21 06:59:46 -05:00
Aaron Watry
29b4090d18 clover/llvm: Add get_[cl|language]_version, validation and some helpers
Used to calculate the default CLC language version based on the --cl-std in build args
and the device capabilities.

According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
 1) If you have -cl-std=CL1.1+ use the version specified
 2) If not, use the highest 1.x version that the device supports

Curiously, there is no valid value for -cl-std=CL1.0

Validates requested cl-std against device_clc_version

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>

v7: (Pierre) Split cl/clc versions into separate lists and
    make more references const.

v6: (Pierre) Add more const and fix some whitespace

v5: (Aaron) Use a collection of cl versions instead of switch cases
    Consolidates the string, numeric version, and clc langstandard::kind

v4: (Pierre) Split get_language_version addition and use into separate patches
    Squash patches that add the helpers and validate the language standard

v3: Change device_version to device_clc_version

v2: (Pierre) Move create_compiler_instance changes to correct patch
    to prevent temporary build breakage.
    Convert version_str into unsigned and use it to find language version
    Add build_error for unknown language version string
    Whitespace fixes
2018-03-21 06:59:37 -05:00
Juan A. Suarez Romero
14fffefc60 docs: add 17.3.{8,9} in the release calendar
Mesa 18.0 series has not been released yet, so let's extend 17.3 lifetime.

v2: add 17.3.9 in the calendar (Andres Gomez)

CC: Andres Gomez <agomez@igalia.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-21 11:57:44 +01:00
Eric Anholt
4d8b476fa9 intel/blorp: Fix compiler warning about num_layers.
The compiler doesn't notice that the condition for num_layers to be
undefined already defined it above (as our assert checked in a debug
build).

v2: Move the pair of assignments to one outside of the block.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 14:06:46 -07:00
Samuel Pitoiset
f0211155f1 radv: add support for VK_EXT_depth_range_unrestricted
This extension removes the restrictions on minDepth/maxDepth,
minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth.

The following CTS tests now pass:

dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth
dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth
dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth
dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted
dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 21:55:41 +01:00
Samuel Pitoiset
4e9b0b39b5 radv: only enable one channel when exporting prim id
It's a 32-bit integer like the layer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 21:54:48 +01:00
Lionel Landwerlin
5770e1d89e i965: fix out of tree autotools build
Fixes: 2d2b15fbca ("i965: fix autotools/android build")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-03-20 19:48:56 +00:00
Stéphane Marchesin
1117edc60d virgl: Implement seamless cube maps
This was previously ignored.

Along with the virglrenderer patch, this fixes ~100 dEQP tests:
dEQP-GLES3.functional.texture.filtering.cube.*

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-21 05:44:52 +10:00
Emil Velikov
c43715d30b i965: annotate brw_oa.py's --header and --code as required
As of earlier commit, the --header was made a hard requirement when
using --code.

Hence - annotate both as required and drop a few no longer needed
checks.

Fixes: 035cc7a12d ("i965: perf: reduce i965 binary size")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 17:21:49 +00:00
Lionel Landwerlin
d3e5d3955c i965: pipecontrol: add LRI write immediate flag
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 16:58:30 +00:00
Lionel Landwerlin
7f977d51b3 intel: genxml: add INSTPM/CS_DEBUG_MODE2 registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 16:58:30 +00:00
Lionel Landwerlin
2d2b15fbca i965: fix autotools/android build
Autotools/android builds generate the header & code files in 2 steps,
but the code generation requires the name of the header file to
include it.

This change generates both files in one command.

Fixes: 035cc7a12d ("i965: perf: reduce i965 binary size")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-20 16:58:29 +00:00
Daniel Stone
9f3509665d dri3: Fix typo in version check
The have-new-DRI3 codepaths would never actually properly trigger, since
there was a typo in configure.ac which broke the version check. This
went unnoticed but for an error in config.log if you looked closely
enough.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Lukas F. Hartmann <lukas@mntmn.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
Cc: Dave Airlie <airlied@redhat.com>
2018-03-20 16:38:08 +00:00
Daniel Stone
bc5e59119e meson: Don't build svga by default on ARM/AArch64
VMware has no (published) support for Arm-architecture guests.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reported-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-20 16:18:37 +00:00
Daniel Stone
d7603cb518 meson: Add default DRI drivers for ARM/AArch64
On all Arm architectures (ARMv7 and below as 'arm', ARMv8 and above as
'aarch64'), only build swrast for DRI drivers. The only classic drivers
which could be used are r200 and NV20 cards, which seems unlikely enough
that it shouldn't be the default.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Javier Jardón <jjardon@gnome.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-20 16:18:37 +00:00
Emil Velikov
28780c5028 st/mesa: add compiler/nir/ prefix for nir includes
Stay consistent with the rest of the codebase, effectively fixing the
autotools build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105621
Fixes: ffa4bbe466 ("st/nir/radeonsi: move nir_lower_uniforms_to_ubo()
to the state tracker")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-20 16:11:19 +00:00
Scott D Phillips
d849d36c6c anv: off-by-one in GetDescriptorSetLayoutSupport
Loop was accessing one more than bindingCount elements from
pBindings, accessing uninitialized memory.

Fixes: ddc4069122 ("anv: Implement VK_KHR_maintenance3")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-20 07:58:10 -07:00
Lionel Landwerlin
035cc7a12d i965: perf: reduce i965 binary size
Performance metric numbers are calculated the following way :

   - out of the 256 bytes long OA reports, we accumulate the deltas
     into an array of uint64_t

   - the equations' generated code reads the accumulated uint64_t
     deltas and normalizes them for a particular platform

Our hardware is such that a number of counters in the OA reports
always return the same values (i.e. they're not programmable), and
they return the same values even across generations, and as a result a
number of equations are identical in different metric sets across
different generations.

Up to now we've kept the generated code of the equations separated in
different files (per generation/GT), and didn't apply any
factorization of the common equations. We could have make some
improvement by reusing equations within a given metrics file, but we
can go even further and reuse across generations (i.e. all files).

This change changes the code generation to emit a single file in which
we reuse equations emitted code based on the hash of equations'
strings.

Here are the savings in a meson build :

Before(.old)/after :
   $ du -h ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old
   43M	./build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   47M	./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old

   $ size build/src/mesa/drivers/dri/libmesa_dri_drivers.so build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old
       text   data          bss	     dec            hex filename
   13054002 409424	 671856	14135282	 d7aff2	build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   14550386 409552	 671856	15631794	 ee85b2	build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old

As a side comment here is the size of the drivers if we remove all of
the metrics from the build :

   $ du -sh build/src/mesa/drivers/dri/libmesa_dri_drivers.so
   40M	build/src/mesa/drivers/dri/libmesa_dri_drivers.so

v2: Fix an issue with hashing of counter equations (Lionel)
    Build system rework (Emil)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (build system part)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 13:56:07 +00:00
Lionel Landwerlin
e9a9e85948 i965: perf: fix a counter return type on hsw
The equation code computes a float (percentage) yet the return type
was an uint64_t.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-20 11:36:13 +00:00
Tapani Pälli
604cac9f73 mesa: fix leaking ParameterValueOffset
==15115== 48 bytes in 1 blocks are definitely lost in loss record 16 of 66
==15115==    at 0x4C2EC15: realloc (vg_replace_malloc.c:785)
==15115==    by 0x8602C3E: _mesa_reserve_parameter_storage (prog_parameter.c:212)
==15115==    by 0x8602D1E: _mesa_add_parameter (prog_parameter.c:252)
==15115==    by 0x86032C4: _mesa_add_sized_state_reference (prog_parameter.c:384)
==15115==    by 0x8603324: _mesa_add_state_reference (prog_parameter.c:409)

Fixes: edded12376 "mesa: rework ParameterList to allow packing"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-20 13:25:07 +02:00
Daniel Stone
478fc2d2a1 dri3: Don't fail on version mismatch
The previous commit to make DRI3 modifier support optional, breaks with
an updated server and old client.

Make sure we never set multibuffers_available unless we also support it
locally. Make sure we don't call stubs of new-DRI3 functions (or empty
branches) which will never succeed.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 7aeef2d4ef ("dri3: allow building against older xcb (v3)")
2018-03-20 08:52:59 +00:00
Timothy Arceri
9a243eccae radv: don't lower indirects until after opts have run
Noticed while passing by. Not sure if it impacts anything, but
likely to impact GFX9 more than anything else since we lower
inputs, outputs and locals there.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-20 15:01:44 +11:00
Timothy Arceri
dfe2f19855 st/nir: fix atomic lowering for gallium drivers
i965 and gallium handle the atomic buffer index differently. It was
just by luck that the single piglit test for this was passing.

For gallium we use the atomic binding so that we match the handling
in st_bind_atomics().

On radeonsi this fixes the CTS test:
KHR-GL43.shader_storage_buffer_object.advanced-write-fragment

It also fixes tressfx hair rendering in Tomb Raider.

Reviewed-by: Marek Olšák  <marek.olsak@amd.com>
2018-03-20 14:29:53 +11:00
Timothy Arceri
632d5e97ef st/radeonsi: enable uniform packing in NIR backend
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:19:35 +11:00
Timothy Arceri
231333a20d st: add uniform packing support to lower_uniforms_to_ubo()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
9c51a7ea29 gallium: add packed uniform CAP
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
ffa4bbe466 st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker
This will only ever be used by gallium drivers so it probably doesn't
belong in the nir toolkit. Also we want to pass it some non NIR
things in the following patch.

To avoid regressions we wrap the lowering calls that have been moved
to st_glsl_to_nir with a quick hack so that they are only called for
radeonsi, we will replace the hack with a check for uniform packing
in a following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
a80cf442d9 st: add st_glsl_type_dword_size() helper
This will be used to support uniform packing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
5488166730 st/glsl_to_nir: add support for packed builtin uniforms
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
57ebab64c0 mesa: add _mesa_add_sized_state_reference() helper
This will be used for adding packed builtin uniforms.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
2377754329 mesa: add support propagate uniform support for packed uniforms
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
40711a7a60 mesa: allow for uniform packing when adding uniforms to param list
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
a2198d4fdb mesa: add packing support for setting uniform handles
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
6cfa15b803 mesa: add packing support for setting uniforms
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
4a7c5c079b mesa: create copy uniform to storage helpers
These will be used in the following patch to allow copying directly
to the param list when packing is enabled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
edded12376 mesa: rework ParameterList to allow packing
Currently everything is padded to 4 components. Making the list
more flexible will allow us to do uniform packing.

V2 (suggestions from Nicolai):
- always pass existing calls to _mesa_add_parameter() true for padd_and_align
- fix bindless param value offsets
- remove left over wip logic from pad and align code
- zero out param value padding
- whitespace fix

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Timothy Arceri
b13b9eb432 mesa: add PackedDriverUniformStorage const
Will be used to determine whether to take packing code paths or not.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Eric Anholt
00910e3057 broadcom/vc5: Don't annotate dumps with stale live intervals.
As you're debugging register allocation, you may have changed the
intervals and not recomputed yet.  Just skip the dump in that case.
2018-03-19 16:44:20 -07:00
Eric Anholt
facc3c6f58 broadcom/vc5: Add support for register spilling.
Our register spilling support is nice to have since vc4 couldn't at all,
but we're still very restricted due to needing to not spill during a TMU
operation, or during the last segment of the program (which would be nice
to spill a value of, when there's a long-lived value being passed through
with little modification from the start to the end).

We could do better by emitting unspills for the last-segment values just
before the last thrsw, since the last segment is probably not the maximum
interference area.

Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3
others.
2018-03-19 16:44:06 -07:00
Eric Anholt
271fc58ba1 broadcom/vc5: Remove redundant last_inst lookup.
The point was to get the MOV, which the MOV_dest already returned.
2018-03-19 16:42:59 -07:00
Eric Anholt
34dc64f627 broadcom/vc5: On QPU pack error, dump the instruction and return cleanly.
This is nice for debugging when you've made a bad instruction.
2018-03-19 16:42:59 -07:00
Eric Anholt
d721348dcd broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's.
This will let me do lowering late in compilation using the same
instruction builder as we use in nir_to_vir.
2018-03-19 16:42:59 -07:00
Eric Anholt
c81d681742 broadcom/vc5: Move the umul macro to a header.
Anywhere we want to multiply, we probably want this.
2018-03-19 16:42:59 -07:00
Eric Anholt
9e28c18cd1 broadcom/vc5: Correct the arg count of TIDX/EIDX. 2018-03-19 16:42:59 -07:00
Eric Anholt
55bf298333 broadcom/vc5: Re-do live variables after removing thrsws.
Otherwise our start/ends ips won't line up with the actual instructions.
2018-03-19 16:42:59 -07:00
Eric Anholt
c3a504f470 broadcom/vc5: Add a QPU helper for instructions using the TLB.
This will be used for detecting last thread segment in register spilling.
2018-03-19 16:42:59 -07:00
Eric Anholt
09c4dd1971 broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm().
These helpers will be used in register spilling to determine where to add
a last thrsw if needed, and might help refactor QPU scheduling.
2018-03-19 16:42:59 -07:00
Eric Anholt
407f21ef1b broadcom/vc5: The ldvpm signal also a case of using the VPM.
The QPU scheduling code calling this function already separately checked
this signal.
2018-03-19 16:42:59 -07:00
Eric Anholt
4760040c09 broadcom/vc5: Extract v3d_qpu_writes_tmu() helper.
This will be reused in register spilling.
2018-03-19 16:42:59 -07:00
Dave Airlie
32791a0502 radv: don't export NULL layer.
We have some cases where in subpass we want the layer but having
it be 0 and loaded in the frag shader without the vertex shader
exporting it is fine.

So don't export the layer if we don't have a value to put in it.

Fixes: d4c74aed7a (radv/multiview: mark layer_input if we have input attachments.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 21:36:48 +00:00
Marek Olšák
f674b50d0e mesa: adjust incorrect comment in texture_buffer_range 2018-03-19 16:56:17 -04:00
Ian Romanick
6aeaa7d363 nir: Don't compare b2f or b2i with zero
All of the shaders that had loops changed were in Tomb Raider.  The one
shader that lost SIMD16 is one of those.

Skylake
total instructions in shared programs: 14391653 -> 14390468 (<.01%)
instructions in affected programs: 111891 -> 110706 (-1.06%)
helped: 501
HURT: 0
helped stats (abs) min: 1 max: 155 x̄: 2.37 x̃: 1
helped stats (rel) min: 0.05% max: 21.54% x̄: 1.61% x̃: 1.01%
95% mean confidence interval for instructions value: -3.23 -1.50
95% mean confidence interval for instructions %-change: -1.77% -1.45%
Instructions are helped.

total cycles in shared programs: 532793024 -> 532776598 (<.01%)
cycles in affected programs: 987682 -> 971256 (-1.66%)
helped: 348
nnHURT: 41
helped stats (abs) min: 1 max: 3074 x̄: 54.91 x̃: 18
helped stats (rel) min: 0.05% max: 32.24% x̄: 3.36% x̃: 1.68%
HURT stats (abs)   min: 1 max: 422 x̄: 65.39 x̃: 24
HURT stats (rel)   min: 0.09% max: 39.29% x̄: 9.50% x̃: 2.02%
95% mean confidence interval for cycles value: -64.08 -20.38
95% mean confidence interval for cycles %-change: -2.78% -1.23%
Cycles are helped.

total loops in shared programs: 4854 -> 4829 (-0.52%)
loops in affected programs: 27 -> 2 (-92.59%)
helped: 18
HURT: 0

LOST:   1
GAINED: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 13:52:35 -07:00
Dave Airlie
e8d9b7ab02 radv: lower constant initializers on output variables earlier
If a shader only writes to an output via a constant initializer we
need to lower it before we call nir_remove_dead_variables so that
this pass sees the stores from the initializer and doesn't kill the
output.

Fixes test failures in new work-in-progress CTS tests:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float

This is ported from anv:
99b57daf4a anv/pipeline: lower constant initializers on output variables earlier
from Iago Toral Quiroga <itoral@igalia.com>

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:40 +00:00
Dave Airlie
032014ac01 radv/query: handle multiview timestamp queries.
For each view bit we need to emit a timestamp query.

Fixes: dEQP-VK.multiview.queries*

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:14 +00:00
Dave Airlie
32b4f3c38d radv/query: handle multiview queries properly. (v3)
For multiview we need to emit a number of sequential queries
depending on the view mask.

This avoids dEQP-VK.multiview.queries.15 waiting forever
on the CPU for query results that are never coming.

We only really want to emit one query,
and the rest should be blank (amdvlk does the same),
so we emit begin/end pairs for all the others except
the first query.

v2: fix tests
v3: split out patch.

Fixes: dEQP-VK.multiview.queries*
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:09 +00:00
Dave Airlie
4034dc5c72 radv/query: split out begin/end query emission
This just splits out the begin/end query hw emissions,
it makes it easier to add multiview support for queries.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:29:05 +00:00
Dave Airlie
d4c74aed7a radv/multiview: mark layer_input if we have input attachments.
This fixes:
dEQP-VK.multiview.input_attachments*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-19 19:26:39 +00:00
Caio Marcelo de Oliveira Filho
f6338c3b85 anv/pipeline: set active_stages early
Since the intermediate states of active_stages are not used,
i.e. active_stages is read only after all stages were set into it,
just set its value before compiling the shaders.

This will allow to conditionally run certain passes based on what
other shaders are being used, e.g. a certain pass might only be
applicable to the vertex shader if there's no geometry or tessellation
shader being used.

v2: Use vk_to_mesa_shader_stage. (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-19 18:00:49 +00:00
Caio Marcelo de Oliveira Filho
318073ce66 anv/pipeline: fail if TCS/TES compile fail
v2: Add Fixes tag. (Lionel)

Fixes: e50d4807a3 ("anv: Compile TCS/TES shaders.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-19 18:00:49 +00:00
Jordan Justen
2ed288363f main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED
This change allows the disk shader cache to work with programs loaded
with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set,
then they try to use the shader cache.

Since the program loaded by ProgramBinary is similar to loading the
shader from the disk cache, this is probably more appropriate.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
d2b74ca2b5 i965: Allow disk shader cache usage with LINKING_SUCCESS status
Currently, we only look in the disk shader cache if we see that the
shader program is in the cache during the link step.

If the shader cache entry isn't found during the program link, there
are still some (fairly unlikely) scenarios where later it might be
useful to search the cache for gen binary programs.

1. If the cache evicts the serialized glsl cache, there might still be
   valid gen program entries in the disk cache.

2. If two applications are running in parallel, then it is possible
   that one may write out the cached gen program item which the other
   application can then make use of.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
b5baaee0d6 glsl/serialize: Save shader program metadata sha1
When the shader cache is used, this can be generated. In fact, the
shader cache uses this sha1 to lookup the serialized GL shader
program.

If a GL shader program is restored with ProgramBinary, the shaders are
not available, and therefore the correct sha1 cannot be generated. If
this is restored, then we can use the shader cache to restore the
binary programs to the program that was loaded with ProgramBinary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
9b473f9e3c glsl: Remove api_enabled tracking for transform feedback
We used this to prevent usage of the disk shader cache when transform
feedback was enabled via the GL API. This is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
fc4a7aaa82 i965: Allow disk shader cache usage with transform feedback
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
6d830940f7 glsl/shader_cache: Allow shader cache usage with transform feedback
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Suggested-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jose Fonseca
e10dc12f6f scons: need to split CC or things might fail
We've seen this fail internally.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-03-19 16:41:57 +01:00
Jordan Justen
d07a49fb18 i965: Add INTEL_DEBUG stages support for disk shader cache
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-19 00:07:29 -07:00
Dave Airlie
8f052a3e25 radv: handle exporting view index to fragment shader. (v1.1)
The fragment shader was trying to read this, but nothing
was exporting it from the vertex shader. This handles
it like the prim id export.

Fixes:
dEQP-VK.multiview.secondary_cmd_buffer.*
dEQP-VK.multiview.index.fragment_shader.*

v1.1: updated to use 0x1 (Samuel)

Fixes: e3265c10c8 (radv: Implement multiview draws.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-19 01:20:00 +00:00
Axel Davy
dbc24835d7 st/nine: Fix non inversible matrix check
There was a missing absolute value when
checking if the determinant was big enough.

Fixes: https://github.com/iXit/Mesa-3D/issues/292

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:46 +01:00
Axel Davy
f61e9a958b st/nine: Fixes warning about implicit conversion
Makes the conversion explicit.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102542

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:42 +01:00
Axel Davy
71eae7940e st/nine: Fix bad tracking of vs textures for NINESBT_ALL
Stateblocks with NINESBT_ALL should track all textures.
For better performance they have a faster path which
copies all the required.

This path was only tracking ps textures.

Fixes: https://github.com/iXit/Mesa-3D/issues/303

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>
2018-03-18 22:53:36 +01:00
Axel Davy
76fa1f730b st/nine: Fix bad tracking of bound vs textures
An incorrect formula was used to compute bound_samplers_mask_vs.
Since s is above always 8 for vs and the variable is encoded on 8 bits,
it was always 0.
This resulted in commiting the samplers every call when
there was at least one texture read in the vs shader.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-03-18 22:53:32 +01:00
Grazvydas Ignotas
e1b2e5667c radv: make vk_format_description structures static
No need to bother the linker about them.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-17 18:53:21 +02:00
Grazvydas Ignotas
331141e87e radv: fix stale comment in generated vk_format_table.c
It seems to be a leftover from u_format_table.py.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-17 18:53:21 +02:00
Eric Anholt
7db1c09d12 anv: Silence warning about heap_size.
We only get VK_SUCCESS if it was initialized, but apparently my compiler
doesn't track that far.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-16 15:10:05 -07:00
Eric Anholt
d25640c3a3 i965: Silence compiler warning about promoted_constants.
We only have a cfg != NULL if we went through one of the paths that set
it, but my compiler doesn't figure that out.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6411defdcd ("intel/cs: Re-run final NIR optimizations for each SIMD size")
2018-03-16 15:09:55 -07:00
Eric Anholt
9f89452ea3 anv: Silence compiler warnings about uninitialized bind_offset.
This is a legitimate warning: if anv's blorp_alloc_binding_table() throws
an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently
continue to use this undefined value.  The rest of this code doesn't seem
very allocation-error-proof, though, either.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-16 15:09:47 -07:00
Matt Turner
f3833f1ca7 intel/compiler: Use gen_get_device_info() in test_eu_validate
Previously the unit test filled out a minimal devinfo struct. A previous
patch caused the test to begin assert failing because the devinfo was
not complete. Avoid this by using the real mechanism to create devinfo.

Note that we have to drop icl from the table, since we now rely on the
name -> PCI ID translation done by gen_device_name_to_pci_device_id(),
and ICL's PCI IDs are not upstream yet.

Fixes: f89e735719 ("intel/compiler: Check for unsupported register sizes.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-16 13:20:21 -07:00
Matt Turner
54db78b196 intel: Add cfl to gen_device_name_to_pci_device_id()
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-16 13:20:21 -07:00
Rob Clark
bc5001325b meson+dri3: allow building against older xcb (v3)
Similar to previous patch, make xcb 1.13 optional.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-16 16:18:42 -04:00
Dave Airlie
7aeef2d4ef dri3: allow building against older xcb (v3)
I'm not sure everyone wants to be updating their dri3 in a forced
march setting, this allows a nicer approach, esp when you want
to build on distro that aren't brand new.

I'm sure there are plenty of ways this patch could be cleaner,
and I've also not built it against an updated dri3.

For meson I've just left it alone, since if you are using meson
you probably don't mind xcb updates, and if you are using meson
you can fix this better than me.

v3: just don't put a version in for dri3/present without
modifiers, should allow building with 1.11 as well

(feel free to supply meson followups)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:19:45 -04:00
Marek Olšák
f099c3aef1 r600: consolidate PIPE_BIND_SHARED/SCANOUT handling
(Ported from radeonsi commit f70f6baaa3)

Allows cached BOs to be reused in more cases.

Bugzilla: https://bugs.freedesktop.org/105171
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2018-03-16 17:31:28 +01:00
Rafael Antognolli
f89e735719 intel/compiler: Check for unsupported register sizes.
Make sure we don't emit 64 bit types if the hardware doesn't support
them.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-16 09:27:16 -07:00
Jason Ekstrand
315ee5faec loader: Include include/drm-uapi in the autotools build
We're already including it in the meson build.  This fixes build issues
on systems which have a drm_fourcc.h that doesn't have modifiers.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-16 08:50:07 -07:00
Wu, Zhongmin
5fc21c6044 egl/android: Implement the eglSwapinterval for Android.
Implement the eglSwapinterval for Android platform to
enable the async mode for some GFX benchmarks such as
Daimler C217, CityBench.

Results of the dEQP-EGL.*swap_interval tests

'dEQP-EGL.functional.query_config.get_config_attrib.max_swap_interval'..
'dEQP-EGL.functional.query_config.get_config_attrib.min_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_only.max_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_only.min_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_and_sort.max_swap_interval'..
'dEQP-EGL.functional.choose_config.simple.selection_and_sort.min_swap_interval'..
'dEQP-EGL.functional.negative_api.swap_interval'..

 Test run totals:
   Passed:        7/7 (100.0%)
   Failed:        0/7 (0.0%)
   Not supported: 0/7 (0.0%)
   Warnings:      0/7 (0.0%)

Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
[Emil Velikov: polish inline comment, add dEQP stats, s/dpy/disp/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-16 13:58:56 +00:00
Emil Velikov
3a9fb4f7ad st/mesa: simplify st_init_limits() via tgsi_processor_to_shader_stage
Reuse the tgis helper and remove a bunch of duplicated code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:49:16 +00:00
Emil Velikov
f7f95310f0 tgsi: move tgsi_processor_to_shader_stage() to a header
This way we can utilise it with later patches.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-16 13:48:46 +00:00
Emil Velikov
9fa1d822bf egl/dri2: move wayland header inclusion where applicable
Instead of indirectly pulling the wayland headers everywhere, use
forward declarations and #include only as needed.

Should effectively fix build errors like the following:

make[5]: Entering directory
'/.../src/gallium/state_trackers/omx/tizonia'
   CC       h264dprc.lo
In file included from h264dprc.c:45:0:
.../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error:
wayland/wayland-egl/wayland-egl-backend.h: No such file or directory
  #include "wayland/wayland-egl/wayland-egl-backend.h"

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
2018-03-16 13:47:59 +00:00
Emil Velikov
d091c9c4cf vulkan/wsi/x11: correct DRI3 version in comment
During development the version was bumped, yet the comment did not get
an update.

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-16 13:47:52 +00:00
Emil Velikov
19ec817756 vulkan/wsi/x11: use ARRAY_SIZE where applicable
Use the handy macro instead of hard coded numbers.

Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-16 13:45:47 +00:00
Juan A. Suarez Romero
705a6446b4 mesa: RGB9_E5 invalid for CopyTexSubImage* in GLES
According to OpenGL ES 3.2, section 8.6, CopyTexSubImage* should return
an INVALID_OPERATION if the internalformat of the texture is RGB9_E5.

This fixes
dEQP-GLES31.functional.debug.negative_coverage.*.copytexsubimage2d_texture_internalformat.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-16 12:49:16 +00:00
Christian Gmeiner
5e51f72374 etnaviv: remove superfluous \n from DBG(..) callers
The DBG(..) macro appends a \n already so there is no
need to do it twice.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-03-16 11:41:27 +01:00
Samuel Pitoiset
e96a1d27dc radv: run nir_opt_move_load_ubo
Polaris10:
SGPRS: 108560 -> 107856 (-0.65 %)
VGPRS: 74576 -> 74520 (-0.08 %)
Spilled SGPRs: 7375 -> 7113 (-3.55 %)
Code Size: 4273464 -> 4274364 (0.02 %) bytes
Max Waves: 9434 -> 9446 (0.13 %)

Vega10:
Totals from affected shaders:
SGPRS: 108264 -> 107576 (-0.64 %)
VGPRS: 69068 -> 69000 (-0.10 %)
Spilled SGPRs: 7221 -> 6959 (-3.63 %)
Code Size: 3800796 -> 3801496 (0.02 %) bytes
Max Waves: 10687 -> 10709 (0.21 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-16 09:58:19 +01:00
Samuel Pitoiset
af355aaa07 nir: add nir_opt_move_load_ubo() optimization pass
This pass moves load UBO operations just before their first use,
loosely based on nir_opt_move_comparisons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-16 09:50:31 +01:00
Dave Airlie
9d0d806332 radv: drop geometry stride user sgpr.
This removes the other geometry specific user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:21 +00:00
Dave Airlie
6f051549c3 radv: get rid of geometry user sgpr for num entries.
This drops one of the geometry specific user sgprs,
we can work this out at compile time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:17 +00:00
Dave Airlie
9188bd78d7 radv: migrate lds size calculations to shader gen.
This moves the lds_size calcs into the shader so we have all
the size stuff in one file.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:12 +00:00
Dave Airlie
384aced65e radv: drop scanning the tess shader in the nir code.
This drops the now unneeded scanning and results in favour
of the ones in the info.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:08 +00:00
Dave Airlie
f50d520acf radv: use num_patches output from tcs shader.
Instead of recalculating the value, use the shader calculated value.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:05 +00:00
Dave Airlie
bf9a0ea853 radv/tess: remove last chunk of tess sgprs
This removes the last TES-specifc user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:23:01 +00:00
Dave Airlie
6db44d6a8c radv: pass num_patches to tes from tcs
TES needs num_patches to do some of the calculations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:58 +00:00
Dave Airlie
010d055aae radv: drop tess offchip layout for tcs.
This removes the last TCS specific user sgpr.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:54 +00:00
Dave Airlie
ee31cff856 radv: drop tcs_out_offsets
Move all calculations to shader generation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:47 +00:00
Dave Airlie
b0460bbf1c radv: drop tcs_out_layout
Move all calculations to shader generation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:43 +00:00
Dave Airlie
6adf99165c radv/tess: drop tcs_in_layout setting completely.
Inline all calcs at shader creation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:37 +00:00
Dave Airlie
f343d11ae7 radv: drop ls_out_layout const.
We can precalculate input_vertex_size at compile time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:32 +00:00
Dave Airlie
d89b16b7b9 radv/shader_info: start gathering tess output info (v2)
This gathers the ls outputs written by the vertex shader,
and the tcs outputs, these are needed to calculate certain
tcs parameters.

These have to be separate for combined gfx9 shaders.

This is a bit pessimistic compared to the nir pass,
as we don't work out the individual slots for tcs outputs,
but I actually thing it should be fine to just mark the whole
thing used here.

v2: move to radv, handle clip dist (Samuel),
    handle compacts and patchs properly.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:23 +00:00
Dave Airlie
2012dae19a radv: migrate unique index info shader info (v2)
This just moves this function to an inline so the shader_info
pass can use it.

v2: use inline (Samuel)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-16 05:22:19 +00:00
Samuel Pitoiset
f02f1ad13f Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"
This reverts commit f314a532fd.

This appears to introduce some blinking textures in UT2004. Not
sure exactly what's the root cause because we don't have much
information about the issue.

Anyway, this was just a micro optimization that actually breaks,
at least, one app almost one year later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105436
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-15 21:32:52 +01:00
Lionel Landwerlin
51783f3e7d anv: silence unused variable warning
Fixes: 59b0ea0c74 ("anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:56:26 +00:00
Lionel Landwerlin
b5b56f91f5 i965: silence unused function warning
[123/227] Compiling C object 'src/mesa/drivers/dri/i965/libi965_gen110@sta/genX_blorp_exec.c.o'.
../src/mesa/drivers/dri/i965/genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:56:23 +00:00
Lionel Landwerlin
0f544a3c51 anv: silence unused function warning on gen11
[84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'.
../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:55:42 +00:00
Dylan Baker
2a7027f79a meson: fix pipe-loaders after omx changes
with_gallium_omx used to be a boolean, but now it's a string. That means
it needs to be compared to 'disabled' instead of false.

CC: Rob Clark <robdclark@gmail.com>
Fixes: 34e852d5b5
       ("meson: Re-add auto option for omx")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Rob Clark <robdclark@gmail.com
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-15 10:02:32 -07:00
Dylan Baker
9bd7a6f6f0 meson: require amdgpu >= 2.4.91
the meson equivalent of f8773edb0a

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-15 10:00:02 -07:00
Marek Olšák
f8773edb0a configure.ac: require libdrm_amdgpu 2.4.91
Since 2.4.90 is problematic, just ask for the next version.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-15 12:44:40 -04:00
Marek Olšák
5d0acff39e configure.ac: blacklist libdrm 2.4.90
Cc: 18.0 17.3 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-15 12:44:37 -04:00
Samuel Pitoiset
16ecf037f9 radv: dump LLVM IR when a hang is detected
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:07 +01:00
Samuel Pitoiset
81818662a5 radv: record LLVM IR when debugging shaders
If AMD_shader_info or RADV_TRACE_FILE is used we might need to
keep trace of LLVM IR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:03 +01:00
Samuel Pitoiset
d07edf5fdf radv: add dump_shader to the NIR compiler options
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:20:00 +01:00
Samuel Pitoiset
50fcca328c radv: pass the NIR compiler options to ac_compile_llvm_module()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:58 +01:00
Samuel Pitoiset
14c27c2511 radv: print some information when RADV_TRACE_FILE is set
Just to be sure all options are enabled when trying to generate
a hang report.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:54 +01:00
Samuel Pitoiset
5be2757c35 radv: only display options that are enabled
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 17:19:52 +01:00
Eric Engestrom
6332893594 mailmap: Use Eric Engestrom's personal email address
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-15 12:03:41 +00:00
Alejandro Piñeiro
50767214a7 spirv/radv: add AMD_gcn_shader capability, remove current extensions
So now, during spirv_to_nir, it uses the capability instead of the
extension. Note that we are really doing here is treating
SPV_AMD_gcn_shader as other supported extensions. SPV_AMD_gcn_shader
is not the first SPV extension supported. For example, the capability
draw_parameters infers if the extension SPV_KHR_shader_draw_parameters
is supported or not.

This could be seen as counter-intuitive, and that it would be easier
to define which extensions are supported, and based our checks on
that, but we need to take into account that some capabilities are
optional from core, and others came from new extensions.

Also this commit would make the implementation of ARB_spirv_extensions
easier.

v2: AMD_gcn_shader capability renamed to gcn_shader (Daniel Schürmann)

Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 12:08:25 +01:00
Samuel Iglesias Gonsálvez
adf58e59d3 spirv: update arguments for vtn_nir_alu_op_for_spirv_opcode()
We don't need anymore the source and destination's data type, just
their bitsize.

v2:
- Use glsl_get_bit_size () instead (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-15 08:56:15 +01:00
Samuel Iglesias Gonsálvez
ce2fd87056 spirv: fix the translation of SPIR-V conversion opcodes to NIR
There are some SPIRV opcodes (like UConvert and SConvert) have some
expectations of the output that doesn't depend on the operands
data type. Generalize the solution of all of them.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-15 08:51:01 +01:00
Mathias Fröhlich
98f35ad63c vbo: Correctly handle source arrays in vbo_split_copy.
The original approach did optimize away a bit too many fields.
Restablish the pointer into the original array and correctly feed that
one.

Reviewed-by: Brian Paul <brianp@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105471
Fixes: 64d2a20480
    mesa: Make gl_vertex_array contain pointers to first order VAO members.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-15 06:11:57 +01:00
Apple SWE
361f79c97f sched.h needs to be imported on Darwin/OSX targets.
sched_yield is used but the include reference on Darwin is missing. This patch
conditionally guards on Darwin/OSX to import sched.h first.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-14 22:08:34 -07:00
Apple SWE
67f27b1e18 Add processor topology calculation implementation for Darwin/OSX targets.
The implementation for bootstrapping SWR on Darwin targets is based on the Linux version.
Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the
physical identifiers, processor identifiers, core counts and thread-processor affinities.

With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and
llvmpipe.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-14 22:08:34 -07:00
Dave Airlie
4b15b5e803 virgl: resize resource bo allocation if we need to.
This fixes an illegal command buffer on the host seen with
piglit arb_internalformat_query2-max-dimensions

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-15 12:26:39 +10:00
Mario Kleiner
c1e47a3c1f nv50,nvc0: Support BGRX1010102 and RGBX1010102 for sampling.
Add them as usable for textures, so they can be used by
Wayland drm in 10 bpc mode and for X11 compositing under
GLX and EGL. We need these formats to be supported at
least for sampling, otherwise GLX_texture_from_pixmap
and the equivalent EGL image extension won't work with
X11 drawables of depth 30 and just display an all black
window.

Do not expose these formats as renderable, and thereby
not as a fbconfig/EGLConfig/Visual, as NVidia hw does
not support 10 bpc unorm formats without alpha channel.

Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing,
and under Wayland+Weston drm backend with a Tesla and
Pascal gpu.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-03-14 21:41:27 -04:00
Thomas Helland
03e37ec6d7 util: Use set_foreach instead of rolling our own
This follows the same pattern as in the hash_table.

Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>
2018-03-14 20:03:57 +01:00
Thomas Helland
5f129c05e6 glsl: Use hash table cloning in copy propagation
Walking the whole hash table, inserting entries by hashing them first
is just a really bad idea. We can simply memcpy the whole thing.

V2: Remove leftover creation of acp in two places

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 19:52:02 +01:00
Thomas Helland
6baaf4291b util: Implement a hash table cloning function
V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 19:52:01 +01:00
Guillaume Charifi
388ed47081 st/mesa: Factorize duplicate code in st_BlitFramebuffer()
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-14 14:46:51 -04:00
Dylan Baker
7dd261ac50 autotools: add -I/src/egl to tizonia
This fixes the following build breakage:

make[5]: Entering directory
'/mnt/sdc1/Gits/mesa/src/gallium/state_trackers/omx/tizonia'
   CC       h264dprc.lo
In file included from h264dprc.c:45:0:
../../../../../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error:
wayland/wayland-egl/wayland-egl-backend.h: No such file or directory
  #include "wayland/wayland-egl/wayland-egl-backend.h"
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

meson got the same fix in 7598dedfde.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 11:23:19 -07:00
Dylan Baker
848f2b6e31 Revert "Add processor topology calculation implementation for Darwin/OSX targets."
This reverts commit de0d10db93.

This breaks the build on at least Linux, probably other non-apple
platforms.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-14 09:30:17 -07:00
Dylan Baker
0f30c80932 Revert "sched.h needs to be imported on Darwin/OSX targets."
This reverts commit 9dc5063262.

This breaks the build on at least Linux, probably other non-apple
platforms.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-14 09:28:58 -07:00
Karol Herbst
b617bfcccf compiler: int8/uint8 support
OpenCL kernels also have int8/uint8.

v2: remove changes in nir_search as Jason posted a patch for that

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-03-14 10:08:42 -04:00
Alex Smith
fcf267ba08 radv: Fix CmdCopyImage between uncompressed and compressed images
From the spec:

    "When copying between compressed and uncompressed formats the
     extent members represent the texel dimensions of the source
     image and not the destination."

However, as per 7b890a36, we must still use the destination image type
when clamping the extent so that we copy the correct number of layers
for 2D to 3D copies.

Fixes: 7b890a36 "radv: Fix vkCmdCopyImage for 2d slices into 3d Images"
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-14 09:59:21 +00:00
Samuel Pitoiset
38f34117dd radv: fix vkGetDeviceQueue2() when create flags don't match
This fixes CTS:
dEQP-VK.api.device_init.create_device_queue2_unmatched_flags

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@gmail.com>
2018-03-14 09:53:42 +01:00
Neil Roberts
25a966a23d spirv: Handle doubles when multiplying a mat by a scalar
The code to handle mat multiplication by a scalar tries to pick either
imul or fmul depending on whether the matrix is float or integer.
However it was doing this by checking whether the base type is float.
This was making it choose the int path for doubles (and presumably
float16s).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:43:33 +01:00
Iago Toral Quiroga
1a0aba7216 anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands
af5f2322d0 addressed this for extension commands, but the spec mandates
this behavior also for core API commands. From the Vulkan spec,
Table 2. vkGetDeviceProcAddr behavior:

device     pname                            return
----------------------------------------------------------
(..)
device     core device-level command        fp
(...)

See that it specifically states "device-level".

Since the vk.xml file doesn't state if core commands are instance or
device level, we identify device level commands as the ones that take a
VkDevice, VkQueue or VkCommandBuffer as their first parameter.

Fixes test failures in new work-in-progress CTS tests.

Also see the public issue:
https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323

v2:
  - Include reference to github issue (Emil)
  - Rebased on top of Vulkan 1.1 changes.

v3:
  - Remove the not in the condition and switch the then/else cases (Jason)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:09:15 +01:00
Iago Toral Quiroga
a631575ff4 anv/entrypoints: dispatches to VkQueue are device-level
v2:
  - Add trampoline functions (Jason)
  - Add an assertion for unhandled trampoline cases

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:09:15 +01:00
Dave Airlie
3b0f2081b5 radv: drop assert on bindingDescriptorCount > 0
The spec is pretty clear that this can be 0, and that it operates
as a reserved binding.

Fixes:
dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 16:54:52 +10:00
Apple SWE
9dc5063262 sched.h needs to be imported on Darwin/OSX targets.
sched_yield is used but the include reference on Darwin is missing. This patch
conditionally guards on Darwin/OSX to import sched.h first.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-13 22:50:56 -07:00
Apple SWE
de0d10db93 Add processor topology calculation implementation for Darwin/OSX targets.
The implementation for bootstrapping SWR on Darwin targets is based on the Linux version.
Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the
physical identifiers, processor identifiers, core counts and thread-processor affinities.

With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and
llvmpipe.

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2018-03-13 22:50:27 -07:00
Roland Scheidegger
274f8bf05e r600: fix abs for op3 sources
If a src was referencing the same temp as the dst, the per-component
copy code didn't work.
e.g.
  cndge r0.xy, r0.xx, |r2|, r3
got expanded into
  mov  r12.x, |r2|
  cndge r0.x, r0.x, r12, r3
  mov  r12.y, |r2|
  cndge r0.y, r0.x, r12, r3
hence for the second cndge r0.x was mistakenly the previous cndge result.
Fix this by doing all the movs first, so there's no bogus alu.last in between.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905

Tested-by: <iive@yahoo.com>
Reviewed-by: Dave Airlie <airlied@gmail.com>
2018-03-14 04:54:45 +01:00
Dave Airlie
27a5e5366e radv: mark all tess output for an indirect access.
If a shader does a tcs store with an indirect access, we
were only marking the first spot as used. For indirect access
we always now mark all slots used by the variable.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Dave Airlie
4f0c89d66c ac/nir: pass the nir variable through tcs loading.
I was going to have to add another parameter to this monster,
so we should just pass the nir_variable in, I can't find any
reason this would be a bad idea.

This needed for the next fix.

Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Dave Airlie
f9de2d409b radv: get correct offset into LDS for indexed vars.
This seems more correct to me, since if we have an array
of floats they'll be vec4 aligned, and if we do af[2],
we want the const index to increase by 2 slots in the non
compact case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-14 11:18:54 +10:00
Rob Clark
4e4428482e nir: lower_load_const_to_scalar fix for 8/16b types
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-13 20:17:04 -04:00
Dylan Baker
2aad12b2af Update the documentation for meson
Meson is pretty well tested and works in most configurations now, so we
can remove the warning about it being unsuited for actual use.

It's also worth documenting that meson 0.42.0 or greater is required.

v2: - Minor rewording of supported platforms as suggested by Emil
    - Add two missing tags as reported by xmllint --html

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2018-03-13 14:54:47 -07:00
Jason Ekstrand
85000b812d ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 13:25:27 -07:00
Jason Ekstrand
3d1d7e8561 nir/subgroups: Add lowering for vote_ieq/vote_feq to a ballot
This is based heavily on 97f10934ed, "ac/nir: Add vote_ieq/vote_feq
lowering pass." from Bas Nieuwenhuizen.  This version is a bit more
general since it's in common code.  It also properly handles NaN due to
not flipping the comparison for floats.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 13:25:15 -07:00
Dylan Baker
8247a30838 meson: don't use compiler.has_header
Meson's compiler.has_header is completely useless, it only checks that a
header exists, not whether it's usable. This creates problems if a
header contains a conditional #error declaration, like so:

> #if __x86_64__
> # error "Doesn't work with x86_64!"
> #endif

Compiler.has_header will return true in this case, even when compiling
for x86_64. This is useless.

Instead, we'll do a compile check so that any #error declarations will
be treated as errors, and compilation will work.

Fixes compilation on x32 architecture.

Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746
meson bug: https://github.com/mesonbuild/meson/issues/2246
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-13 11:41:10 -07:00
Jason Ekstrand
8379bff6c4 i965: Emit texture cache invalidates around blorp_copy
This is a terrible hack but it fixes CTS regressions.  It's still
incredibly unclear exactly what is going wrong in the hardware to cause
this to be an issue so this isn't a good fix by any means.  However, it
does fix tests so there is that.

Fixes: fb0e9b5197 "i965: Track the depth and render caches separately"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103746
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-13 11:24:40 -07:00
Eric Anholt
a326eedc75 brodacom/vc4: Fix simulator since the perfmon change.
It would be nice to support perfmon with simulator, and might be a useful
tool for regression testing performance (since the simulator would be
deterministic).
2018-03-13 10:32:58 -07:00
Eric Anholt
191bc7ce61 spirv: Silence compiler warning about undefined srcs[0]
v2: Use assume() at the srcs[] definition instead.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-13 10:32:55 -07:00
Samuel Pitoiset
7c83430672 ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:28 +01:00
Samuel Pitoiset
b128fd773f ac/nir: remove some unnecessary includes and declarations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:27 +01:00
Samuel Pitoiset
cd4e823341 ac/nir: drop radv prefix from radv_lower_gather4_integer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:25 +01:00
Samuel Pitoiset
fbe694562b ac/nir: move ac_nir_compiler_options and friends to radv folder
Also replace ac_ by radv_.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:23 +01:00
Samuel Pitoiset
237229430f ac: move ac_shader_info to radv folder
This is RADV specific code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:21 +01:00
Samuel Pitoiset
2cfba40eea ac/nir: move ac_shader_variant_info and friends to radv folder
Also replace ac_ by radv_.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 16:54:16 +01:00
Samuel Pitoiset
b2653007b9 ac/nir: move all RADV related code to radv_nir_to_llvm.c
Now the "ac/nir" prefix will really be the shared code between
RadeonSI and RADV, that might avoid confusions in the future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
8e15824b9d ac/nir: make emit_barrier() non-static
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
4e3117b718 ac/nir: move radeon_llvm_reg_index_soa() to ac_nir_to_llvm.h
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
3a30b89353 ac/nir: make handle_shader_output_decl() non-static
Required in order to move all RADV specific code outside of ac/nir.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
3fe47b1290 ac/nir: change prototype of handle_shader_output_decl()
This allows to remove the ac_nir_context dependency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
61a91ca3f5 ac/nir: move unpack_param() to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
28bb6873ec ac/nir: move trim_vector to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
895632baef ac/nir: move cast_ptr() to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
bf6368297b ac/nir: move ac_build_alloca() to ac_llvm_build.c
As well as si_build_alloca_undef() and drop the si prefix.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Timothy Arceri
370e356eba gallium: silence __builtin_frame_address nonzero argument is unsafe warning
Calling __builtin_frame_address with a nonzero argument is unsafe
but is sometimes done for debugging purposes. Since this code is
part of some debug util code I'm assuming that is the case here
and using GCC pragma to silence the warning.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-13 09:38:10 +11:00
Dylan Baker
b7c6870f87 meson: Add moduledir to d3d.pc
This is required to build wine with the nine patchset

Fixes: 6b4c7047d5
       ("meson: build gallium nine state_tracker")
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-12 13:52:38 -07:00
Mathias Fröhlich
a2f08dd574 gallium: Use struct gl_array_attributes* as st_pipe_vertex_format argument.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-12 18:24:31 +01:00
Ian Romanick
def0030e64 mesa: Don't write to user buffer in glGetTexParameterIuiv on error
With some sets of optimization flags, GCC will generate warnings like
this:

src/mesa/main/texparam.c:2327:27: warning: ‘*((void *)&ip+12)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             params[3] = ip[3];
                         ~~^~~
src/mesa/main/texparam.c:2320:16: note: ‘*((void *)&ip+12)’ was declared here
          GLint ip[4];
                ^~

ip is not initialized in cases where a GL error is generated.  In these
cases, we should *not* write to the user's buffer, so this is actually a
bug.  I wrote a new piglit test gl-3.0-texparameteri to show this bug.

I suspect that Coverity also detected this, but the scan site is
currently down.

Fixes: c2c507786 "main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv."
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-12 10:13:30 -07:00
Roman Gilg
f94597f554 gallium: work around libtool relink issue for libdrm
This is similar to commit 90633079. libtool links first to system directories
instead of custom locations of libdrm on relinking. Since a more recent libdrm
version than the one provided by the system is often needed when compiling
mesa, make sure this works by putting libdrm in front.

See also: https://bugs.freedesktop.org/show_bug.cgi?id=100259

Signed-off-by: Roman Gilg <subdiff@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-12 14:49:07 +00:00
Emil Velikov
678ba53240 vulkan: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
8151f5cad9 wayland-drm: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
1178e0cf49 egl: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
08189731a4 docs: document removal of GLX_SGIX_swap_{barrier,group} stubs
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
5ef608fab7 glx: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
2c765b0d9a gallium/x11: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
afab516f5f x11: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
742b8e3301 glx: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:52 +00:00
Emil Velikov
447731348e gallium/x11: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:51 +00:00
Emil Velikov
1d2d519d78 x11: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
 - no actual users (on my Arch setup)
 - the Nvidia driver does not implement the extension

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-03-12 14:48:51 +00:00
Emil Velikov
f197f02e50 configure: remove unused AM_CONDITIONAL
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-12 14:48:51 +00:00
Bas Nieuwenhuizen
997306c031 radv: Increase the number of dynamic uniform buffers.
The vulkan API is not ideal as it does not allow us have a
shared limit.

Feral needs 15+6 for one of their games, and I'm not a fan
of overcommitting the limits, so increase the number of
dynamic uniform buffers to 16.

CC: <mesa-stable@lists.freedesktop.org>
CC: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-12 09:46:22 +01:00
Dave Airlie
e76cf1ff12 u_vbuf/translate: pass max_index into the set_buffer.
This fixes a memory trashing crash (not the test) seen with
dEQP-GLES3.stress.draw.unaligned_data.random.203
on virgl.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-12 11:57:13 +10:00
Dave Airlie
5d4fbc2b54 r600: implement callstack workaround for evergreen.
This is ported from the sb backend, there are some issues with
evergreen stacks on the boundary between entries and ALU_PUSH_BEFORE
instructions.

Whenever we are going to use a push before, we check the stack
usage and if we have to use the workaround, then we switch to
a separate push.

I noticed this problem dealing with some of the soft fp64 shaders,
in nosb mode, they are quite stack happy.

This fixes all the glitches and inconsistencies I've seen with them

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Elie Tournier <elie.tournier@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-12 11:11:44 +10:00
Marek Olšák
163a29099a gallium/util: add helper util_wait_for_idle
This is an old patch that I had.
2018-03-11 13:14:27 -04:00
Roland Scheidegger
0f0a6fa21d u_blit: (trivial) u_blit.h needs to include p_defines.h
(For the pipe_tex_filter enum)

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-10 20:09:04 +01:00
Christian Gmeiner
c9b153fea7 travis: bump libxcb version to 1.13
Fixes following dependency problem:
  Native dependency xcb-dri3 found: NO found '1.11' but need: '>= 1.13'

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Fixes: c80c08e226 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
2018-03-10 16:55:36 +01:00
Mathias Fröhlich
64d2a20480 mesa: Make gl_vertex_array contain pointers to first order VAO members.
Instead of keeping a copy of the vertex array content in
struct gl_vertex_array only keep pointers to the first order
information originaly in the VAO.
For that represent the current values by struct gl_array_attributes
and struct gl_vertex_buffer_binding.

v2: Change comments.
    Remove gl... prefix from variables except in the i965 directory where
    it was like that before. Reindent because of that.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-10 07:33:51 +01:00
Roland Scheidegger
d62f0df354 draw: fix alpha value for very short aa lines
The logic would not work correctly for line lengths smaller than 1.0,
even a degenerated line with length 0 would still produce a fragment
with anyhwere between alpha 0.0 and 0.5.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-10 02:11:50 +01:00
Jordan Justen
24b415270f intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-09 16:15:58 -08:00
Jordan Justen
06e3bd02c0 i965: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
2018-03-09 16:15:34 -08:00
Marek Olšák
db495b8962 st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER
Tested by our OpenCL team.

Fixes: 9c499e6759 "st/mesa: don't invoke st_finalize_texture & st_convert_sampler for TBOs"

Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-09 16:33:31 -05:00
Marek Olšák
2bdb54bce7 radeonsi: add a workaround for GFX9 hang with init_config alignment
Fixes: 75c5d25f0f "radeonsi: align command buffer starting address to fix some Raven hangs"
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-03-09 16:28:29 -05:00
Marek Olšák
e99212e970 ac/gpu_info: print ib_start_alignment, add assertion 2018-03-09 16:28:29 -05:00
Greg V
e30a165be2 meson: Use system_has_kms_drm in default driver selection
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-09 10:02:44 -08:00
Eric Anholt
c57d5ea3bb broadcom/vc4: Add an accelerated path to turn raster R8/RG88 into tiled.
Drawing a 1080p YV12 video stream generated by MMAL goes from 10.5 FPS to
36.
2018-03-09 09:59:54 -08:00
Eric Anholt
cf170616da gallium: Add a util_blitter path for using a custom VS and FS.
Like the r600 paths to use other custom states, we pass in a couple of
parameters to customize the innards of the blitter.  It's up to the caller
to wrap other state necessary for its shaders (for example, constant
buffers for the uniforms the shader uses).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-09 09:59:54 -08:00
Eric Anholt
46a32e3d2e broadcom/vc4: Allow binding non-zero constant buffers.
We're going to use UBO loads for implementing YUV linear-to-T-format
blits.
2018-03-09 09:59:54 -08:00
Eric Anholt
2725ab2b12 broadcom: Remove our defines of DRM_FORMAT_MOD_INVALID.
The imported drm_fourcc.h handles it now.
2018-03-09 09:59:54 -08:00
Eric Anholt
a3a4c23dec broadcom: Suppress compiler warnings about enum pipe_tex_filter. 2018-03-09 09:59:54 -08:00
Louis-Francis Ratté-Boulianne
3160cb86aa egl/x11: Re-allocate buffers if format is suboptimal
If PresentCompleteNotify event says the pixmap was presented
with mode PresentCompleteModeSuboptimalCopy, it means the pixmap
could possibly have been flipped instead if allocated with a
different format/modifier.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne
069fdd5f9f egl/x11: Support DRI3 v1.1
Add support for DRI3 v1.1, which allows pixmaps to be backed by
multi-planar buffers, or those with format modifiers. This is both
for allocating render buffers, as well as EGLImage imports from a
native pixmap (EGL_NATIVE_PIXMAP_KHR).

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne
61309c2a72 vulkan/wsi/x11: Return VK_SUBOPTIMAL_KHR for X11
When it is detected that a window could have been flipped
but has been copied because of suboptimal format/modifier.
The Vulkan client should then re-create the swapchain.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-09 17:47:13 +00:00
Daniel Stone
c80c08e226 vulkan/wsi/x11: Add support for DRI3 v1.2
Adds support for multiple planes and buffer modifiers.

v4: Rename "has_dri3_v1_1" to "has_dri3_modifiers"
v12: Multi-planar/modifier support is now DRI3 v1.2; also update release
     versions
2018-03-09 17:47:13 +00:00
Dylan Baker
7258be91c5 autotools: include all meson.build files
Otherwise SWR cannot be built with meson from an autotools generated
tarball, such as the 18.0.0-rc4 tarball.

Fixes: 16bf813830 ("meson/swr: re-shuffle generated files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-09 08:15:04 -08:00
Michel Dänzer
2a4596a2f0 st/mesa: gl_program::info.system_values_read is a 64-bit-field
We were dropping the upper 32 bits, which caused assertion failures in
some compute shader piglit tests with radeonsi since the commit below.

Fixes: 752e969703 ("compiler: Add two new system values for subgroups")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-09 16:52:11 +01:00
George Kyriazis
379e00dc27 swr/rast: Refactor memory gather operations
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:42 -06:00
George Kyriazis
3f7ce10b3e swr/rast: Add KNOB_DISABLE_SPLIT_DRAW
This is useful for archrast data collection. This greatly speeds up the
post processing script since there is significantly less events generated.

Finally, this is a simpler option to communicate to users than having
them directly adjust MAX_PRIMS_PER_DRAW and MAX_TESS_PRIMS_PER_DRAW.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:30 -06:00
George Kyriazis
e0a4a25829 swr/rast: Add VPOPCNT
Supports popcnt on vector masks (e.g. <8 x i1>)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:23 -06:00
George Kyriazis
b56afe1a4f swr/rast: Add tracking for stream out topology
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:14 -06:00
George Kyriazis
2f6ae8cfcd swr/rast: Add split draw and other state information to DrawInfoEvent.
Removed specific split draw events.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:36:07 -06:00
George Kyriazis
714093203e swr/rast: Refactor api and worker event handlers.
In the API event handler we want to share information between the core
layer and the API. Specifically, around associating various ids with
different kinds of events. For example, associate render pass id with
draw ids, or command buffer ids with draw ids.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:59 -06:00
George Kyriazis
cfdd35beaf swr/rast: Add support for generalized late and early z/stencil stats
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:52 -06:00
George Kyriazis
9e25f298eb swr/rast: Rasterized Subspans stats support
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:35:47 -06:00
George Kyriazis
d78b28fc33 swr/rast: Added comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-03-09 09:34:55 -06:00
Eric Engestrom
e903a7b0bb vulkan/wsi: clean up cleanup path
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-09 13:25:44 +00:00
Bas Nieuwenhuizen
a793e7899f radv: Fix the autotools build take 2.
Forgot to remove a word....

Fixes: 04ffabf17a "radv: Fix autotools build."
2018-03-09 14:10:24 +01:00
Lucas Stach
1f55d06783 etnaviv: allow mixing different bit depths for color and depth surfaces
Vivante hardware supports this just fine. There is no reason why this shouldn't
be advertised as a valid combination.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-03-09 12:06:07 +01:00
Thierry Reding
6d4d46bca9 autotools: Add tegra to AM_DISTCHECK_CONFIGURE_FLAGS
This allows the driver to be built on a make distcheck and makes sure
that it properly builds when a distribution tarball is made.

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:22 +01:00
Thierry Reding
1755f608f5 tegra: Initial support
Tegra K1 and later use a GPU that can be driven by the Nouveau driver.
But the GPU is a pure render node and has no display engine, hence the
scanout needs to happen on the Tegra display hardware. The GPU and the
display engine each have a separate DRM device node exposed by the
kernel.

To make the setup appear as a single device, this driver instantiates
a Nouveau screen with each instance of a Tegra screen and forwards GPU
requests to the Nouveau screen. For purposes of scanout it will import
buffers created on the GPU into the display driver. Handles that
userspace requests are those of the display driver so that they can be
used to create framebuffers.

This has been tested with some GBM test programs, as well as kmscube and
weston. All of those run without modifications, but I'm sure there is a
lot that can be improved.

Some fixes contributed by Hector Martin <marcan@marcan.st>.

Changes in v2:
- duplicate file descriptor in winsys to avoid potential issues
- require nouveau when building the tegra driver
- check for nouveau driver name on render node
- remove unneeded dependency on libdrm_tegra
- remove zombie references to libudev
- add missing headers to C_SOURCES variable
- drop unneeded tegra/ prefix for includes
- open device files with O_CLOEXEC
- update copyrights

Changes in v3:
- properly unwrap resources in ->resource_copy_region()
- support vertex buffers passed by user pointer
- allocate custom stream and const uploader
- silence error message on pre-Tegra124
- support X without explicit PRIME

Changes in v4:
- ship Meson build files in distribution tarball
- drop duplicate driver_tegra dependency

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:22 +01:00
Thierry Reding
2052dbdae3 nouveau: Add framebuffer modifier support
This adds support for framebuffer modifiers to Nouveau. This will be
used by the Tegra driver to share metadata about the format of buffers
(such as the tiling mode or compression).

Changes in v2:
- remove unused parameters to nouveau_buffer_create()
- move format modifier query code to nvc0 backend
- restrict format modifiers to 2D textures
- implement ->query_dmabuf_modifiers()

Changes in v4:
- add UAPI include path on meson builds

Changes in v5:
- remove unnecessary includes

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:48:08 +01:00
Thierry Reding
b964cab80a nouveau/nvc0: Extract common tile mode macro
Add a new macro that can be used to extract the tiling mode from a
tile_mode value. This is will be used to determine the number of GOBs
used in block linear mode.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:47:54 +01:00
Thierry Reding
75bf489628 drm/tegra: Sanitize format modifiers
The existing format modifier definitions were merged prematurely, and
recent work has unveiled that the definitions are suboptimal in several
ways:

  - The format specifiers, except for one, are not Tegra specific, but
    the names don't reflect that.
  - The number space is split into two, reserving 32 bits for some
    "parameter" which most of the modifiers are not going to have.
  - Symbolic names for the modifiers are not using the standard
    DRM_FORMAT_MOD_* prefix, which makes them awkward to use.
  - The vendor prefix NV is somewhat ambiguous.

Fortunately, nobody's started using these modifiers, so we can still fix
the above issues. Do so by using the standard prefix. Also, remove TEGRA
from the name of those modifiers that exist on NVIDIA GPUs as well. In
case of the block linear modifiers, make the "parameter" smaller (4
bits, though only 6 values are valid) and don't let that leak into any
of the other modifiers.

Finally, also use the more canonical NVIDIA instead of the ambiguous NV
prefix.

This is based on commit 268892cb63a822315921a8dab48ac3e4abf7dd03 from
Linux v4.16-rc1.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:44:35 +01:00
Thierry Reding
ffc85cfac0 drm/fourcc: Fix fourcc_mod_code() definition
Avoid a compiler warnings when the val parameter is an expression.

This is based on commit 5843f4e02fbe86a59981e35adc6cabebee46fdc0 from
Linux v4.16-rc1.

Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-09 11:44:35 +01:00
Bas Nieuwenhuizen
04ffabf17a radv: Fix autotools build.
Forgot it again ....

Fixes: b6347807a9 "radv: Generate icd files."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-09 09:36:19 +01:00
Samuel Pitoiset
365850fd68 ac/nir: set number of channels for packed mrt exports
Bit 0 enables VSRC0 (R in low bits, G high) and bit 2 enables
VSRC1 (B in low bits, A high).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-09 09:28:20 +01:00
Bas Nieuwenhuizen
68201ab2da radv: Update version to 1.1.70.
Turns out they did not reset the patch number on release.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-09 07:53:39 +01:00
Bas Nieuwenhuizen
b6347807a9 radv: Generate icd files.
If the api version is too low, the loader clamps the application
requested version to the advertized version, which messes with
which extensions are enabled.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-09 07:53:39 +01:00
Ian Romanick
6878c9aabc nir: Don't i2b a value that is already Boolean
A bunch of shaders have sequences like:

    i2b(u2i(floatBitsToUint(intBitsToFloat(x == y ? -1 : 0))))

Other optimizations (and NIR's typeless nature) reduce this to

    i2b(x == y)

which is silly.

Skylake
total instructions in shared programs: 14498698 -> 14497948 (<.01%)
instructions in affected programs: 74480 -> 73730 (-1.01%)
helped: 277
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.71 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.45% x̃: 0.68%
95% mean confidence interval for instructions value: -3.35 -2.06
95% mean confidence interval for instructions %-change: -1.74% -1.16%
Instructions are helped.

total cycles in shared programs: 532015500 -> 531999238 (<.01%)
cycles in affected programs: 5943878 -> 5927616 (-0.27%)
helped: 251
HURT: 74
helped stats (abs) min: 1 max: 13149 x̄: 127.89 x̃: 14
helped stats (rel) min: 0.01% max: 17.31% x̄: 1.55% x̃: 0.53%
HURT stats (abs)   min: 1 max: 4550 x̄: 214.04 x̃: 15
HURT stats (rel)   min: <.01% max: 44.43% x̄: 2.81% x̃: 0.33%
95% mean confidence interval for cycles value: -158.51 58.43
95% mean confidence interval for cycles %-change: -1.07% -0.04%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4753 -> 4735 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

Haswell and Broadwell had simliar results. (Broadwell shown)
total instructions in shared programs: 14791877 -> 14791127 (<.01%)
instructions in affected programs: 77326 -> 76576 (-0.97%)
helped: 278
HURT: 1
helped stats (abs) min: 1 max: 32 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.42% x̃: 0.68%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49%
95% mean confidence interval for instructions value: -3.33 -2.05
95% mean confidence interval for instructions %-change: -1.70% -1.13%
Instructions are helped.

total cycles in shared programs: 558250067 -> 558252872 (<.01%)
cycles in affected programs: 5806328 -> 5809133 (0.05%)
helped: 235
HURT: 83
helped stats (abs) min: 1 max: 10630 x̄: 81.73 x̃: 16
helped stats (rel) min: 0.03% max: 18.58% x̄: 1.60% x̃: 0.51%
HURT stats (abs)   min: 1 max: 10590 x̄: 265.19 x̃: 20
HURT stats (rel)   min: <.01% max: 15.28% x̄: 1.89% x̃: 0.54%
95% mean confidence interval for cycles value: -89.87 107.51
95% mean confidence interval for cycles %-change: -1.06% -0.32%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4735 -> 4717 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

total fills in shared programs: 83111 -> 83110 (<.01%)
fills in affected programs: 28 -> 27 (-3.57%)
helped: 1
HURT: 0

Ivy Bridge
total instructions in shared programs: 11774173 -> 11773436 (<.01%)
instructions in affected programs: 70819 -> 70082 (-1.04%)
helped: 267
HURT: 0
helped stats (abs) min: 1 max: 48 x̄: 2.76 x̃: 2
helped stats (rel) min: 0.21% max: 19.51% x̄: 1.57% x̃: 0.63%
95% mean confidence interval for instructions value: -3.51 -2.01
95% mean confidence interval for instructions %-change: -1.94% -1.21%
Instructions are helped.

total cycles in shared programs: 257153833 -> 257148932 (<.01%)
cycles in affected programs: 585341 -> 580440 (-0.84%)
helped: 167
HURT: 100
helped stats (abs) min: 1 max: 1327 x̄: 44.89 x̃: 16
helped stats (rel) min: 0.04% max: 26.54% x̄: 2.41% x̃: 0.88%
HURT stats (abs)   min: 1 max: 200 x̄: 25.95 x̃: 16
HURT stats (rel)   min: 0.04% max: 9.81% x̄: 1.34% x̃: 0.65%
95% mean confidence interval for cycles value: -33.25 -3.46
95% mean confidence interval for cycles %-change: -1.47% -0.54%
Cycles are helped.

total loops in shared programs: 3416 -> 3398 (-0.53%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

LOST:   2
GAINED: 0

Sandy Bridge
total instructions in shared programs: 10499306 -> 10499094 (<.01%)
instructions in affected programs: 6051 -> 5839 (-3.50%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 4.93 x̃: 2
helped stats (rel) min: 0.39% max: 12.90% x̄: 4.29% x̃: 2.45%
95% mean confidence interval for instructions value: -7.66 -2.20
95% mean confidence interval for instructions %-change: -5.47% -3.12%
Instructions are helped.

total cycles in shared programs: 145862568 -> 145861370 (<.01%)
cycles in affected programs: 61733 -> 60535 (-1.94%)
helped: 36
HURT: 2
helped stats (abs) min: 16 max: 66 x̄: 36.61 x̃: 35
helped stats (rel) min: 0.45% max: 17.31% x̄: 4.92% x̃: 2.81%
HURT stats (abs)   min: 18 max: 102 x̄: 60.00 x̃: 60
HURT stats (rel)   min: 1.10% max: 1.85% x̄: 1.48% x̃: 1.48%
95% mean confidence interval for cycles value: -41.28 -21.77
95% mean confidence interval for cycles %-change: -6.16% -3.00%
Cycles are helped.

total loops in shared programs: 1803 -> 1785 (-1.00%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

LOST:   4
GAINED: 0

No changes on Iron Lake of GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-08 15:26:26 -08:00
Ian Romanick
1583f49eaa i965/vec4: Allow CSE on subset VF constant loads
v2: Rewrite the code that generates the VF mask.  Suggested by Ken.

No changes on other platforms.

Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 13059891 -> 13059884 (<.01%)
instructions in affected programs: 431 -> 424 (-1.62%)
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -3.39% -0.71%
Instructions are helped.

total cycles in shared programs: 409260032 -> 409260018 (<.01%)
cycles in affected programs: 4228 -> 4214 (-0.33%)
helped: 7
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28%
95% mean confidence interval for cycles value: -2.00 -2.00
95% mean confidence interval for cycles %-change: -1.15% 0.07%

Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-08 15:26:26 -08:00
Ian Romanick
360899d457 i965/vec4: Relax writemask condition in CSE
If the previously seen instruction generates more fields than the new
instruction, still allow CSE to happen.  This doesn't do much, but it
also enables a couple more shaders in the next patch.  It helped quite a
bit in another change series that I have (at least for now) abandoned.

v2: Add some extra comentary about the parameters to instructions_match.
Suggested by Ken.

No changes on Skylake, Broadwell, Iron Lake or GM45.

Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11780295 -> 11780294 (<.01%)
instructions in affected programs: 302 -> 301 (-0.33%)
helped: 1
HURT: 0

total cycles in shared programs: 257308315 -> 257308313 (<.01%)
cycles in affected programs: 2074 -> 2072 (-0.10%)
helped: 1
HURT: 0

Sandy Bridge
total instructions in shared programs: 10506687 -> 10506686 (<.01%)
instructions in affected programs: 335 -> 334 (-0.30%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-08 15:26:26 -08:00
Ian Romanick
52c7df1643 i965/fs: Merge CMP and SEL into CSEL on Gen8+
v2: Fix several problems handling inverted predicates.  Add a much
bigger comment around the BRW_CONDITIONAL_NZ case.

v3: Allow uniforms and shader inputs as sources for the original SEL and
CMP instructions.  This enables a LOT more shaders to receive CSEL
merging (5816 vs 8564 on SKL).

v4: Report progress.

Broadwell and Skylake had similar results. (Broadwell shown)
helped: 8527
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70%
95% mean confidence interval for instructions value: -2.51 -2.36
95% mean confidence interval for instructions %-change: -1.15% -1.10%
Instructions are helped.

total cycles in shared programs: 559442317 -> 558288357 (-0.21%)
cycles in affected programs: 372699860 -> 371545900 (-0.31%)
helped: 6748
HURT: 1450
helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12
helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70%
HURT stats (abs)   min: 1 max: 2538 x̄: 53.08 x̃: 14
HURT stats (rel)   min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90%
95% mean confidence interval for cycles value: -179.01 -102.51
95% mean confidence interval for cycles %-change: -2.37% -2.08%
Cycles are helped.

LOST:   0
GAINED: 6

No changes on earlier platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-08 15:26:26 -08:00
Kenneth Graunke
70de61594d i965/fs: Add infrastructure for generating CSEL instructions.
v2 (idr): Don't allow CSEL with a non-float src2.

v3 (idr): Add CSEL to fs_inst::flags_written.  Suggested by Matt.

v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt).  Don't
reset the access mode afterwards (suggested by Samuel and Matt).  Add
support for CSEL not modifying the flags to more places (requested by
Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-03-08 15:26:26 -08:00
Ian Romanick
54e8d2268d nir: Narrow some dot product operations
On vector platforms, this helps elide some constant loads.

v2: Reorder the transformations.

No changes on Broadwell or Skylake.

Haswell
total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
instructions in affected programs: 1277532 -> 1243902 (-2.63%)
helped: 13216
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.57 -2.49
95% mean confidence interval for instructions %-change: -3.65% -3.54%
Instructions are helped.

total cycles in shared programs: 409580819 -> 409268463 (-0.08%)
cycles in affected programs: 71730652 -> 71418296 (-0.44%)
helped: 9898
HURT: 2352
helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16
helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50%
HURT stats (abs)   min: 2 max: 276 x̄: 23.25 x̃: 6
HURT stats (rel)   min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97%
95% mean confidence interval for cycles value: -33.19 -17.80
95% mean confidence interval for cycles %-change: -4.50% -4.26%
Cycles are helped.

total fills in shared programs: 82059 -> 82052 (<.01%)
fills in affected programs: 21 -> 14 (-33.33%)
helped: 7
HURT: 0

Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown)
total instructions in shared programs: 11811851 -> 11780605 (-0.26%)
instructions in affected programs: 1155007 -> 1123761 (-2.71%)
helped: 12304
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.56 -2.48
95% mean confidence interval for instructions %-change: -3.71% -3.59%
Instructions are helped.

total cycles in shared programs: 257618409 -> 257316805 (-0.12%)
cycles in affected programs: 71999580 -> 71697976 (-0.42%)
helped: 9155
HURT: 2380
helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16
helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62%
HURT stats (abs)   min: 2 max: 290 x̄: 21.14 x̃: 4
HURT stats (rel)   min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33%
95% mean confidence interval for cycles value: -34.32 -17.97
95% mean confidence interval for cycles %-change: -4.55% -4.29%
Cycles are helped.

GM45 and Iron Lake had nearly identical results (Iron Lake shown)
total instructions in shared programs: 7886750 -> 7879944 (-0.09%)
instructions in affected programs: 373781 -> 366975 (-1.82%)
helped: 3715
HURT: 47
helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1
helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06%
HURT stats (abs)   min: 1 max: 6 x̄: 2.55 x̃: 2
HURT stats (rel)   min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35%
95% mean confidence interval for instructions value: -1.85 -1.77
95% mean confidence interval for instructions %-change: -2.91% -2.73%
Instructions are helped.

total cycles in shared programs: 178114636 -> 178095452 (-0.01%)
cycles in affected programs: 7227666 -> 7208482 (-0.27%)
helped: 3349
HURT: 301
helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4
helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63%
HURT stats (abs)   min: 2 max: 42 x̄: 9.13 x̃: 10
HURT stats (rel)   min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50%
95% mean confidence interval for cycles value: -5.52 -4.99
95% mean confidence interval for cycles %-change: -0.81% -0.73%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-03-08 15:26:26 -08:00
Lionel Landwerlin
d10a39ebe0 i965: perf: consolidate unmapping oa perf bo outside accumulation
Do this in one place outside the only caller of the accumulation
function.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:29 +00:00
Lionel Landwerlin
fb921a2870 i965: perf: count number of accumlated reports
This will be reused later.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:26 +00:00
Lionel Landwerlin
e4387faafb i965: perf: reuse timescale base function from query
We already have the same function in brw_queryobj.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:23 +00:00
Lionel Landwerlin
b71da26496 i965: perf: store sysfs device entry into context
We want to reuse it later on.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:21 +00:00
Lionel Landwerlin
5742b17da1 i965: perf: store the hw_id of the context in the query
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:18 +00:00
Lionel Landwerlin
80cd669a32 i965: perf: default case for unknown query types
Just some extra safety before further changes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-08 23:05:00 +00:00
Marek Olšák
9b7db12815 radeonsi: remove chip_class parameter from si_lower_nir
We can get it from si_screen.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
78ef16e2f9 winsys/amdgpu: query GDS info
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
a4a113b5bc winsys/amdgpu: pad compute IBs
v2: pad with PKT2 NOPs on SI

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
35cd86d4e9 radeonsi: expand constbuf 0 address correctly to fix Vega10 hangs
This is only required with the latest libdrm.

This fixes 32-bit support with high addresses.
(and possibly 64-bit support too because the high bits need to be masked out)

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Marek Olšák
75c5d25f0f radeonsi: align command buffer starting address to fix some Raven hangs
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-08 14:58:16 -05:00
Christian Gmeiner
5b68a7297d etnaviv: add get_driver_query_group_info(..)
This enables AMD_performance_monitor extension.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-03-08 20:44:04 +01:00
Christian Gmeiner
3d912bd742 etnaviv: add query_group_info for sw counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-03-08 20:43:55 +01:00
Dylan Baker
1e9d779331 meson: Fix building gallium media libs without egl
v2: - rebase on omx fix

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2018-03-08 10:14:02 -08:00
Dylan Baker
f74cf04d3e meson: Allow building dri based EGL without GLX
It should be possible to build EGL without GLX, but the meson build
currently doesn't allow that because it too tightly couples glx and dri.
This patch eases dri and glx apart, so that EGL without GLX can be
built.

CC: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-03-08 09:12:24 -08:00
Thierry Reding
d41ee9ba5d glx/apple: Ship meson build file in tarball
The meson build file for Apple GLX is not listed in the EXTRA_DIST make
variable and therefore isn't shipped as part of the release tarball, so
meson builds from the tarball will fail.

Add the file to EXTRA_DIST to ensure it is included in the tarball.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-08 12:11:32 +01:00
Samuel Pitoiset
4e3c1ace65 ac/nir: do not emit unnecessary null exports in fragment shaders
Null exports should only be needed when no other exports are
emitted. This removes a bunch of 'exp null off, off, off, off done vm'.

Affected games are Dota 2 and Wolfenstein 2, not sure if that
really helps, but code size is decreasing there.

Polaris10:
Totals from affected shaders:
SGPRS: 8216 -> 8216 (0.00 %)
VGPRS: 7072 -> 7072 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 454968 -> 453896 (-0.24 %) bytes
Max Waves: 772 -> 772 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-08 11:56:05 +01:00
Eric Engestrom
19dd7f007e drirc: whitespace fix
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-08 09:53:34 +00:00
Thomas Hellstrom
93e58d5e17 drirc: Disable the GLX_SGI_video_sync extension for gnome-shell on vmware
With this extension enabled and a server GLX implementation that actually
honors it, Window movement lags considerably on gnome-shell/vmware, so
disable it by default.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Thomas Hellstrom
4ca9ad2bb2 gallium/st_dri: Honor the glx_disable_sgi_video_sync config option
This option is disabled by default. Primarily intended for drivers on
virtual hardware.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Thomas Hellstrom
f4070956d4 glx/dri: Add a driconf option to disable GLX_SGI_video_sync
Drivers on virtual hardware don't want to expose this extension to
GLX compositors, similarly to GLX_OML_sync_control, since that significantly
increases latency.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
2018-03-08 07:26:29 +01:00
Timothy Arceri
0c90264da4 ac/radeonsi: add emit_kill to the abi
This should fix a regression with Rocket League grass rendering
on the NIR backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717
2018-03-08 11:28:37 +11:00
Timothy Arceri
50cc97d98a radeonsi: add si_llvm_emit_kill() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 11:28:37 +11:00
Timothy Arceri
f4b877631e spirv: fix autotools builds
Fixes: 68a6a3b51a "spirv: handle AMD_gcn_shader extended instructions"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-08 10:45:56 +11:00
Timothy Arceri
99cdc019bf ac: make use of if/loop build helpers
These helpers insert the basic block in the same order as they
appear in NIR making it easier to follow LLVM IR dumps. The helpers
also insert more useful labels onto the blocks.

TGSI use the line number of the corresponding opcode in the TGSI
dump as the label id, here we use the corresponding block index
from NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Timothy Arceri
6e1a142863 radeonsi: make use of if/loop build helpers in ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Timothy Arceri
42627dabb4 ac: add if/loop build helpers
These have been ported over from radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Daniel Schürmann
ffbf75cde4 radv: enable AMD_gcn_shader extension
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
18c7f1e041 ac: implement AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
68a6a3b51a spirv: handle AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
a1a2a8dfda nir: add AMD_gcn_shader extended instructions
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
39437025de spirv: import AMD extensions header from glslang
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Dylan Baker
cba104ebe3 meson: Fix indent in omx meson.build
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:54 -08:00
Dylan Baker
6f628951af meson: Use include directory variables instead of traversing
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Dylan Baker
34e852d5b5 meson: Re-add auto option for omx
This re-adds the auto option for omx, without it we default to tizonia
and the build fails almost immediately, this is especially obnoxious
those building a driver that doesn't support the OMX state tracker to
begin with.

v2: - Only define OMX_FOO for auto cases if the dependencies are found.
      This fixes building tizonia with auto (Julien, Eric)

CC: Gurkirpal Singh <gurkirpal204@gmail.com>
Fixes: bb5e27fab6
       ("st/omx/bellagio: Rename st and target directories")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com> (v1)
2018-03-07 13:30:53 -08:00
Dylan Baker
7598dedfde meson: fix tizonia compilation
It needs to have src/egl in it's includes as well.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Dylan Baker
2d3004ef1c meson: combine state trackers and target if blocks
This is needed later since tizonia requires dri

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-03-07 13:30:53 -08:00
Marek Olšák
55376cb31e st/mesa: expose 0 shader binary formats for compat profiles for Qt
Bugzilla: https://bugreports.qt.io/browse/QTBUG-66420
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105065
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2018-03-07 15:36:31 -05:00
Roland Scheidegger
8ba3750d3d draw: fix line stippling with aa lines
In contrast to non-aa, where stippling is based on either dx or dy
(depending on if it's a x or y major line), stippling is based on
actual distance with smooth lines, so adjust for this.

(It looks like there's some minor artifacts with mesa demos
line-sample and stippling, it looks like the line endpoints
aren't quite right with aa + stippling - maybe due to the
integer math in the stipple stage, but I can't quite pinpoint it.)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-07 21:29:00 +01:00
Roland Scheidegger
dbb2cf388b draw: simplify (and correct) aaline fallback (v2)
The motivation actually was to get rid of the additional tex
instruction, since that requires the draw fallback code to intercept
all sampler / view calls (even if the fallback is never hit).
Basically, the idea is to use coverage of the pixel to calculate
the alpha value, and coverage is simply based on the distance
to the center of the line (in both line direction, which is useful
for wide lines, as well as perpendicular to the line).
This is much closer to what hw supporting this natively actually does.
It also fixes an issue with line width not quite being correct, as
well as endpoints getting stretched too far (in line direction) with
wide lines, which is apparent with mesa demo line-sample.
(For llvmpipe, it would probably make sense to do something like this
directly when drawing lines, since rendering two tris is twice as
expensive as a line, but it would need some changes with state
management.)
Since we're no longer relying on mipmapping to get the alpha value,
we also don't need to draw 3 rects (6 tris), one is sufficient.

There's still issues (as before):
- quite sure it's not correct without half_pixel_center, but can't test
this with GL.
- aaline + line stipple is incorrect (evident with line-sample demo).
Looking at the spec the stipple pattern should actually be based on
distance (not just dx or dy for x/y major lines as without aa).
- outputs (other than pos + the one used for line aa) should be
reinterpolated since we actually increase line length by half a pixel
(but there's no tests which would care).

v2: simplify the math (should be equivalent), don't need immediate
v3: use float versions of atan2,cos,sin, minor cleanups

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-03-07 21:28:31 +01:00
Bas Nieuwenhuizen
034cce96b4 radv: Don't emit a warning on VI-GFX9.
We are conformant:

https://www.khronos.org/conformance/adopters/conformant-products#submission_308

v2: Actually not emit it on gfx9.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
04d65d2b76 radv: Enable vulkan 1.1.0 for configurations that can support it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
0168eaaa42 radv: Disable sampler ycbcr conversion.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
cce62f4065 radv: Expose that we don't support any VK_KHR_16_bit_storage parts.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
b99b9cc864 radv: Implement vkEnumerateInstanceVersion.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
5240fddb9d radv: Add trivial device group implementation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
84e877aa77 radv: Implement vkCmdDispatchBase.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
de5e25898c radv: Implement VkGetDeviceQueue2.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
b137e25277 radv: Support VkPhysicalDeviceProtectedMemoryFeatures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
4bcf4d1678 radv: Support VkPhysicalDeviceShaderDrawParameterFeatures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
41d958d073 radv: Implement VK_KHR_maintenance3.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
8f9af587a2 radv: Add minimal subgroup support.
Deliberately not implementing workgroup scopes as that is not needed
for core vulkan.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen
89651fba9b radv: Change client version check.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen
5b3979704d radv: Update MAX_API_VERSION to 1.1.0
v2: Don't bump supported version.
v3: Update json files.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen
97f10934ed ac/nir: Add vote_ieq/vote_feq lowering pass.
The old vote_eq implementation supported only booleans, but now
we have to support arbitrary values, so use the read_first_invocation
intrinsic + ballot.

I took this as an opportunity to figure out how easy it was to do this
in nir instead of in the nir_to_llvm pass, and it actually turned out
pretty okay IMO. Only creating the pass is some extra code.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 21:18:32 +01:00
Jason Ekstrand
c217607b65 anv: Support version overrides
While always sketchy to do, this is useful for debugging.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
a1ee51309e vulkan/util: Add a helper to get a version override
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d6b65222df anv: Enable Vulkan 1.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
03c07ac548 anv: Add support for SPIR-V 1.3 subgroup operations
This requires us to bump the subgroup size to 32 for all shader stages
because Vulkan requires that to be a physical device query.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8b4a5e641b intel/fs: Add support for subgroup quad operations
NIR has code to lower these away for us but we can do significantly
better in many cases with register regioning and SIMD4x2.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
2292b20b29 intel/fs: Implement reduce and scan opeprations
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
4150920b95 intel/fs: Add a helper for emitting scan operations
This commit adds a helper to the builder for emitting "scan" operations.
Given a binary operation #, a scan takes the vector [a0, a1, ..., aN]
and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each
channel contains the combination of all previous channels.  The sequence
of instructions to perform the scan is fairly optimal; a 16-wide scan on
a 32-bit type is only 6 instructions.  The subgroup scan and reduction
operations will be implemented in terms of this.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
b0858c1cc6 intel/fs: Add a couple of simple helper opcodes
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
57bff0a546 spirv: Add support for subgroup arithmetic
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
789221dcfa nir: Add a helper for getting binop identities
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
82d493a939 nir: Add subgroup arithmetic reduction intrinsics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
b3a5b0f3fc spirv: Add subgroup quad support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
493a165544 nir: Add quad operations and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
90c9f29518 i965/fs: Add support for nir_intrinsic_shuffle
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8256ee3fa3 spirv: Add subgroup shuffle support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
149b92ccf2 nir: Add subgroup shuffle intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
7cfece820d i965/fs: Support nir_intrinsic_vote_feq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
0e893356fe nir/lower_subgroups: Add scalarizing for vote_eq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d792f3d4cd spirv: Add subgroup vote support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
44681e4795 nir: Generalize nir_intrinsic_vote_eq
The SPIR-V extension wants us to be able to do an AllEqual on any vector
or scalar type.  This has two implications:

 1) We need to be able to handle vectors so we switch the vote_eq
    intrinsics to be vectorized intrinsics.

 2) We need to handle floats which have different behavior with respect
    to +-0, NaN, etc. than the integer variant so we need two variants.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
9812fce60b spirv: Add subgroup ballot support
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
974daec495 i965/fs: Implement basic SPIR-V subgroup intrinsics
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
adc077797a spirv: Add initial subgroup support
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
5162a1d884 nir: Add new SPIR-V ballot intrinsics and lowering
Someone can make the lowering optional later if they want something
different for their hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
752e969703 compiler: Add two new system values for subgroups
This will be required for SPIR-V subgroup support

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
34c60ea02b nir: Add new SPIR-V ballot ALU intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
cc587ee9a7 spirv: Handle the new OpModuleProcessed instruction
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
59b0ea0c74 anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER
From the Vulkan 1.1 spec:

    "Vulkan 1.0 implementations were required to return
    VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0.
    Implementations that support Vulkan 1.1 or later must not return
    VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion."

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
cbab2d1da5 anv: Implement vkEnumerateInstanceVersion
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Iago Toral Quiroga
605fd7c0da anv/device: fail to initialize device if we have queues with unsupported flags
This is not strictly necessary since users should not be requesting any
flags that are not valid for the list of enabled features requested and
we already fail if they attempt to use an unsupported feature, however
it is an easy to implement sanity check that would help developes realize
that they are doing things wrong, so we might as well do it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-07 12:13:47 -08:00
Iago Toral Quiroga
b262f17b15 anv/device: GetDeviceQueue2 should only return queues with matching flags
From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure:

   "The queue returned by vkGetDeviceQueue2 must have the same flags value
    from this structure as that used at device creation time in a
    VkDeviceQueueCreateInfo instance. If no matching flags were specified
    at device creation time then pQueue will return VK_NULL_HANDLE."

For us this means no flags at all since we don't support any.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
9c8b40001d anv: Support querying for protected memory
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
773a51e772 anv: Implement GetDeviceQueue2
This belongs to the protected memory feature but there's nothing about
it that's specific to protected memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
68df93ecbc anv: Trivially implement VK_KHR_device_group
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
dfe18be09e anv: Implement vkCmdDispatchBase
This is part of the device groups extension/feature but it's a decent
chunk of work in its own right so it's worth breaking into its own
patch.  The mechanism we use is fairly straightforward: we just push the
base work group id into the shader and add it to the work group id we
get from dispatch.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
ff9db1a4cc nir/spirv: Add support for device groups
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
ddc4069122 anv: Implement VK_KHR_maintenance3
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
1deb7967c8 anv: Support VkPhysicalDeviceShaderDrawParameterFeatures
This advertises the VK_KHR_shader_draw_parameters functionality as a
"core optimal feature" in Vulkan 1.1.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
06719f9d4b anv/entrypoints: Drop support for protect attributes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
bd1279bd9f Get rid of a bunch of KHR suffixes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
af461986db anv: Add version 1.1.0 but leave it disabled
This requires us to rename any Vulkan API entrypoints which became core
in 1.1 to no longer have the KHR suffix.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
0128187335 spirv: Update the SPIR-V headers and json to 1.3.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
205c271562 vulkan: Update the XML and headers to 1.1.70
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
7fb86fb511 vulkan/enum_to_str: Add support for aliases and new Vulkan versions
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
539a0aec45 vulkan/enum_to_str: Add a add_value_from_xml helper to VkEnum
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
eb23ca069f anv/entrypoints: Generate #ifdef guards from platform attributes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
05fc377f2e anv/extensions: Add support for multiple API versions
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8efa173ed2 anv/entrypoints_gen: Add support for aliases in the XML
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
39d9fcea13 anv/entrypoints: Allow an entrypoint to require multiple extensions
In this case, we say an entrypoint is supported if ANY of the extensions
is supported.  This is because, in the XML, entrypoints don't require
extensions so much as extensions require entrypoints.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8e8f167c72 anv/entrypoints: Add an is_device_entrypoint helper
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
54b3493fc0 anv/entrypoints_gen: Allow the string map to grow
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d91da06df5 anv/entrypoints_gen: A bit of refactoring
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
a4ca4c99ba anv/entrypoints: Generalize the string map a bit
The original string map assumed that the mapping from strings to
entrypoints was a bijection.  This will not be true the moment we
add entrypoint aliasing.  This reworks things to be an arbitrary map
from strings to non-negative signed integers.  The old one also had a
potential bug if we ever had a hash collision because it didn't do the
strcmp inside the lookup loop.  While we're at it, we break things out
into a helpful class.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
3960d0e332 vulkan: Rename multiview from KHX to KHR
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
68af9f04a4 spirv: Rework barriers
Our previous handling of barriers always used the big hammer and didn't
correctly emit memory barriers when specified along with a control
barrier.  This commit completely reworks the way we emit barriers to
make things both more precise and more correct.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
de518f38e5 spirv: Add a vtn_constant_value helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Marek Olšák
9779f34326 radeonsi: remove si_llvm_add_attribute 2018-03-07 13:55:49 -05:00
Marek Olšák
2c3f3651c4 radeonsi: fix passing address32_hi to LLVM for high values
The old function treats high values as negative, which LLVM interprets as 0.
2018-03-07 13:55:49 -05:00
Marek Olšák
b3b6b00ac8 radeonsi: assume has_virtual_memory == true 2018-03-07 13:55:48 -05:00
Marek Olšák
53db2790c0 radeonsi: add/update assertions for 32-bit address space 2018-03-07 13:55:47 -05:00
Marek Olšák
16856a1ee8 radeonsi: prevent a negative buffer offset in si_upload_descriptors 2018-03-07 13:55:42 -05:00
Marek Olšák
9b55498059 radeonsi: properly extract a buffer address from a descriptor 2018-03-07 13:55:40 -05:00
Marek Olšák
2a47660754 radeonsi: fix vertex buffer address computation with full 64-bit addresses 2018-03-07 13:55:38 -05:00
Marek Olšák
2e30268877 radeonsi: mask out high VM address bits in registers where needed 2018-03-07 13:55:35 -05:00
Bas Nieuwenhuizen
94c9096c83 radv: Add entrypoints generation with the new vk.xml
A lot of it is based on intel again.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-07 15:50:19 +01:00
Simon Hausmann
fb5825e7ce glsl: Fix memory leak with known glsl_type instances
When looking up known glsl_type instances in the various hash tables, we
end up leaking the key instances used for the lookup, as the glsl_type
constructor allocates memory on the global mem_ctx. This patch changes
glsl_type to manage its own memory, which fixes the leak and also allows
getting rid of the global mem_ctx and its mutex.

v2: remove lambda usage (Tapani)
    (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Simon Hausmann <simon.hausmann@qt.io>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-07 14:33:34 +02:00
Caio Marcelo de Oliveira Filho
c17808562e spirv: Add SpvCapabilityShaderViewportIndexLayerEXT
This capability allows gl_ViewportIndex and gl_Layer to also be used
as outputs in Vertex and Tesselation shaders.

v2: Make conditional to the capability, add gl_Layer, add tesselation
    shaders. (Iago)

v3: Don't export to tesselation control shader.

v4: Add Reviewd-by tag.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 07:04:20 +01:00
Mauro Rossi
487f8d48c9 android: anv: add libmesa_intel_dev static dependency
Fixes the following building errors:

external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override'
external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name'
external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 272bef0601 "intel: Split gen_device_info out into libintel_dev"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-07 07:55:34 +02:00
Timothy Arceri
1fdb21541e Revert "nir: bump loop unroll limit to 96."
This reverts commit 2d36efdb7f.

This raised limit turns out to harmful for more complex shaders,
it causes excessive spilling in some Bioshock Infinite shaders.

The fps for the ssao demo on radv remains unchanged when reverting
this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-07 15:10:05 +11:00
Dave Airlie
fb077b0728 ac/nir: don't put lod into args if it's zero.
If it's zero but put it in args we still end up consuming a
register for it.

This fixes some spilling in the NIR paths in Dirt Rally that
isn't seen with TGSI.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-07 03:34:59 +00:00
Christian Gmeiner
38e91e2b81 freedreno: bump required libdrm version
Fixes: 26a9321d0a "freedreno: add global_bindings state"

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-06 21:52:59 +01:00
Ian Romanick
e3ea166a2c nir: Simplify some comparisons like a+b < a
All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14514555 -> 14514547 (<.01%)
instructions in affected programs: 1972 -> 1964 (-0.41%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.41% -0.40%
Instructions are helped.

total cycles in shared programs: 533141444 -> 533136780 (<.01%)
cycles in affected programs: 164728 -> 160064 (-2.83%)
helped: 181
HURT: 3
helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30
helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80%
HURT stats (abs)   min: 4 max: 54 x̄: 24.00 x̃: 14
HURT stats (rel)   min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68%
95% mean confidence interval for cycles value: -27.12 -23.58
95% mean confidence interval for cycles %-change: -3.54% -3.16%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533667 -> 10533539 (<.01%)
instructions in affected programs: 10148 -> 10020 (-1.26%)
helped: 124
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1
helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04%
95% mean confidence interval for instructions value: -1.06 -1.00
95% mean confidence interval for instructions %-change: -2.46% -1.95%
Instructions are helped.

total cycles in shared programs: 146136887 -> 146132122 (<.01%)
cycles in affected programs: 206382 -> 201617 (-2.31%)
helped: 171
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30
helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67%
95% mean confidence interval for cycles value: -29.19 -26.54
95% mean confidence interval for cycles %-change: -3.20% -2.76%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886515 -> 7886507 (<.01%)
instructions in affected programs: 3016 -> 3008 (-0.27%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.27% -0.26%
Instructions are helped.

total cycles in shared programs: 178100396 -> 178100388 (<.01%)
cycles in affected programs: 156128 -> 156120 (<.01%)
helped: 4
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -3.68 1.68
95% mean confidence interval for cycles %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857872 -> 4857868 (<.01%)
instructions in affected programs: 1544 -> 1540 (-0.26%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.28% -0.24%
Instructions are helped.

total cycles in shared programs: 122167654 -> 122167662 (<.01%)
cycles in affected programs: 96248 -> 96256 (<.01%)
helped: 0
HURT: 4
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: <.01% 0.02%
Cycles are HURT.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:30 -08:00
Ian Romanick
d1ed4ffe0b nir: Use De Morgan's Law on logic compounded comparisons
The replacement of the comparison operators must happen during this
step.  If it does not, the next pass of nir_opt_algebraic will reapply
De Morgan's Law in the "opposite direction" before performing dead code
elimination.  The resulting infinite loop will eventually get OOM
killed.

Haswell, Broadwell, and Skylake had similar results. (Broadwell shown)
total instructions in shared programs: 14808185 -> 14808036 (<.01%)
instructions in affected programs: 13758 -> 13609 (-1.08%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3
helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01%
95% mean confidence interval for instructions value: -4.67 -2.97
95% mean confidence interval for instructions %-change: -1.09% -0.88%
Instructions are helped.

total cycles in shared programs: 559438333 -> 559435832 (<.01%)
cycles in affected programs: 199160 -> 196659 (-1.26%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51
helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40%
HURT stats (abs)   min: 2 max: 40 x̄: 27.33 x̃: 40
HURT stats (rel)   min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74%
95% mean confidence interval for cycles value: -71.47 -39.69
95% mean confidence interval for cycles %-change: -1.64% -0.93%
Cycles are helped.

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11811776 -> 11811553 (<.01%)
instructions in affected programs: 15201 -> 14978 (-1.47%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6
helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26%
95% mean confidence interval for instructions value: -7.21 -4.23
95% mean confidence interval for instructions %-change: -1.48% -1.12%
Instructions are helped.

total cycles in shared programs: 257617270 -> 257614589 (<.01%)
cycles in affected programs: 212107 -> 209426 (-1.26%)
helped: 45
HURT: 0
helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54
helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32%
95% mean confidence interval for cycles value: -74.02 -45.14
95% mean confidence interval for cycles %-change: -1.59% -1.01%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886648 -> 7886515 (<.01%)
instructions in affected programs: 14106 -> 13973 (-0.94%)
helped: 29
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4
helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81%
95% mean confidence interval for instructions value: -5.65 -3.52
95% mean confidence interval for instructions %-change: -1.03% -0.76%
Instructions are helped.

total cycles in shared programs: 178100812 -> 178100396 (<.01%)
cycles in affected programs: 67970 -> 67554 (-0.61%)
helped: 29
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54%
95% mean confidence interval for cycles value: -18.30 -10.39
95% mean confidence interval for cycles %-change: -0.71% -0.45%
Cycles are helped.

GM45
total instructions in shared programs: 4857939 -> 4857872 (<.01%)
instructions in affected programs: 7426 -> 7359 (-0.90%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4
helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77%
95% mean confidence interval for instructions value: -6.06 -2.87
95% mean confidence interval for instructions %-change: -1.06% -0.67%
Instructions are helped.

total cycles in shared programs: 122167930 -> 122167654 (<.01%)
cycles in affected programs: 43118 -> 42842 (-0.64%)
helped: 15
HURT: 0
helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54%
95% mean confidence interval for cycles value: -25.03 -11.77
95% mean confidence interval for cycles %-change: -0.82% -0.41%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
52607658ff nir: Replace fmin(b2f(a), b) with a bcsel
All of the affected shaders are HDR mappers from Serious Sam 3.

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516285 -> 14516273 (<.01%)
instructions in affected programs: 348 -> 336 (-3.45%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -5.55% -3.06%
Instructions are helped.

total cycles in shared programs: 533163876 -> 533163808 (<.01%)
cycles in affected programs: 1144 -> 1076 (-5.94%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94%
95% mean confidence interval for cycles value: -18.84 -15.16
95% mean confidence interval for cycles %-change: -6.20% -5.68%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533321 -> 10533309 (<.01%)
instructions in affected programs: 372 -> 360 (-3.23%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -4.96% -2.86%
Instructions are helped.

total cycles in shared programs: 146136632 -> 146136428 (<.01%)
cycles in affected programs: 11668 -> 11464 (-1.75%)
helped: 12
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29%
95% mean confidence interval for cycles value: -17.66 -16.34
95% mean confidence interval for cycles %-change: -2.82% -1.58%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886301 -> 7886277 (<.01%)
instructions in affected programs: 576 -> 552 (-4.17%)
helped: 12
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.30% -3.72%
Instructions are helped.

total cycles in shared programs: 178113176 -> 178113176 (0.00%)
cycles in affected programs: 2116 -> 2116 (0.00%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for cycles value: -3.25 3.25
95% mean confidence interval for cycles %-change: -0.93% 0.94%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857756 -> 4857744 (<.01%)
instructions in affected programs: 294 -> 282 (-4.08%)
helped: 6
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.71% -3.09%
Instructions are helped.

total cycles in shared programs: 122178730 -> 122178722 (<.01%)
cycles in affected programs: 700 -> 692 (-1.14%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
b974dfee11 nir: Pull b2f out of bcsel
All platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516592 -> 14516586 (<.01%)
instructions in affected programs: 500 -> 494 (-1.20%)
helped: 2
HURT: 0

total cycles in shared programs: 533167044 -> 533166998 (<.01%)
cycles in affected programs: 6988 -> 6942 (-0.66%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
f50400cc80 nir: Replace an odd comparison involving fmin of -b2f
I noticed the fge version while looking at a shader for an unrelated
reason.  The feq version prevents a regression in a later change that
performs strength reduction of some compares.

Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514808 -> 14514796 (<.01%)
instructions in affected programs: 750 -> 738 (-1.60%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.43% -0.36%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533144939 -> 533144853 (<.01%)
cycles in affected programs: 8911 -> 8825 (-0.97%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19
helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31%
95% mean confidence interval for cycles value: -32.94 -10.06
95% mean confidence interval for cycles %-change: -2.30% -0.26%
Cycles are helped.

Haswell
total instructions in shared programs: 13093785 -> 13093775 (<.01%)
instructions in affected programs: 924 -> 914 (-1.08%)
helped: 4
HURT: 2
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.53 1.20
95% mean confidence interval for instructions %-change: -2.02% 0.97%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 409580553 -> 409580118 (<.01%)
cycles in affected programs: 10909 -> 10474 (-3.99%)
helped: 5
HURT: 1
helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18
helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78%
HURT stats (abs)   min: 13 max: 13 x̄: 13.00 x̃: 13
HURT stats (rel)   min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39%
95% mean confidence interval for cycles value: -180.68 35.68
95% mean confidence interval for cycles %-change: -19.55% 3.79%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 11811851 -> 11811840 (<.01%)
instructions in affected programs: 1032 -> 1021 (-1.07%)
helped: 5
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1
helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.17 0.51
95% mean confidence interval for instructions %-change: -1.86% 0.36%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 257618403 -> 257618168 (<.01%)
cycles in affected programs: 10784 -> 10549 (-2.18%)
helped: 4
HURT: 2
helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17
helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72%
HURT stats (abs)   min: 9 max: 14 x̄: 11.50 x̃: 11
HURT stats (rel)   min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33%
95% mean confidence interval for cycles value: -133.11 54.78
95% mean confidence interval for cycles %-change: -14.79% 5.59%
Inconclusive result (value mean confidence interval includes 0).

GM45, Iron Lake, and Sandy Bridge had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10533871 -> 10533859 (<.01%)
instructions in affected programs: 865 -> 853 (-1.39%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.16% -0.29%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 146139904 -> 146139852 (<.01%)
cycles in affected programs: 15213 -> 15161 (-0.34%)
helped: 4
HURT: 0
helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15
helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29%
95% mean confidence interval for cycles value: -23.79 -2.21
95% mean confidence interval for cycles %-change: -0.88% 0.09%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
380136e998 nir: Mark bcsel-to-fmin (or fmax) transformations as inexact
These transformations are inexact because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 11:17:14 -08:00
Ian Romanick
4addd34b04 nir: Recognize some more open-coded fmin / fmax
This transformation is inexact because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.

v2: Reorder operands and mark as inexact.  The latter suggested by
Jason.

shader-db results:

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514817 -> 14514808 (<.01%)
instructions in affected programs: 229 -> 220 (-3.93%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%

total cycles in shared programs: 533145211 -> 533144939 (<.01%)
cycles in affected programs: 37268 -> 36996 (-0.73%)
helped: 8
HURT: 0
helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total cycles in shared programs: 257618409 -> 257618403 (<.01%)
cycles in affected programs: 12582 -> 12576 (-0.05%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 11:17:14 -08:00
Gurkirpal Singh
c62cf1f165 st/omx/tizonia/h264d: Add EGLImage support
Example Gstreamer pipeline :
MESA_ENABLE_OMX_EGLIMAGE=1 GST_GL_API=gles2 GST_GL_PLATFORM=egl gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! glimagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 17:21:11 +00:00
Gurkirpal Singh
b2f2236dc5 st/omx/tizonia: Add H.264 encoder
v2: Refactor out screen functions to st/omx

Example Gstreamer pipeline :
gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videoconvert ! omxh264enc ! h264parse ! avdec_h264 ! videoconvert ! ximagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 17:20:08 +00:00
Gurkirpal Singh
83d4a5d5ae st/omx/tizonia: Add H.264 decoder
v2: Refactor out screen functions to st/omx

Example Gstreamer pipeline :
gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! videoconvert ! ximagesink

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
430ccdbcb9 st/omx/tizonia: Add entrypoint
Adds base files for adding components

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
e2afa154e9 st/omx/tizonia: Add --enable-omx-tizonia flag and build files
Allow only bellagio or tizonia to be used at the same time.
Detect tizonia package config file
Generate libomx_mesa.so and install it to libtizcore.pc::pluginsdir
Only compile empty source (target.c) for now.

GSoC Project link: https://summerofcode.withgoogle.com/projects/#4737166321123328

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 14:29:42 +00:00
Gurkirpal Singh
bb5e27fab6 st/omx/bellagio: Rename st and target directories
v2: Refactor out screen functions to st/omx

Allows to keep all the code under st/omx (st/omx/tizonia and
st/omx/bellagio).
Reverts targets/omx_bellagio to omx as additions to existing files
is enough to compile for both bellagio and tizonia.

* autotools changes:
  --enable-omx -> --enable-omx-bellagio

* meson changes:
  -Dgallium-omx=false -> -Dgallium-omx=disabled
  -Dgallium-omx=true  -> -Dgallium-omx=bellagio

Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Julien Isorce <julien.isorce@gmail.com>
2018-03-06 13:07:03 +00:00
Samuel Pitoiset
e96e6f60f7 radv: report the scratch private memory size with shader stats
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:42 +01:00
Samuel Pitoiset
7f6b91c9c3 ac/nir: count the scratch private memory size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:40 +01:00
Samuel Pitoiset
3b8e7459f2 ac: add ac_count_scratch_private_memory()
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:38 +01:00
Samuel Pitoiset
f3275ca01c ac/nir: only enable used channels when exporting parameters
This allows us to generate, for example,
"exp param0 v0, off, off, off" if only the first channel is needed.

Not sure if this improves performance but it's worth trying.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:35 +01:00
Samuel Pitoiset
675dde13b2 ac: update enabled channels mask when optimizing PARAM exports
When the mask is not 0xf we need to update the number of
enabled channels, otherwise the hardware won't emit the
components that are combined.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:52 +01:00
Samuel Pitoiset
c24abae9dc ac/nir: pass the number of enabled channels to si_llvm_init_export_args()
Currently, it's always 0xf but an upcoming patch will reduce the
number of channels for parameters export.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:50 +01:00
Samuel Pitoiset
5cd34f03c0 ac/shader: scan output usage mask for VS and TES
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:47 +01:00
Clayton Craft
d1fa30e0f8 intel: Add missing includes for building on Android
This adds a missing library to the i965/Android.mk file, and updates
intel/Android.mk to include the new library. Without this, mesa does not
build on Android.

Fixes: 272bef0601 "intel: Split gen_device_info out into
libintel_dev"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-06 00:14:22 -08:00
Tapani Pälli
237c9caa78 vulkan: do not expose surface/swapchain extensions on Android
On Android surface/swapchain extensions are implemented by the loader. Patch
modifies both anv and radv extension scripts disabling currently exposed
ones. See also earlier commit 9f763c1f9b.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-06 08:02:59 +02:00
Tapani Pälli
85518657a9 anv: Don't expose VK_KHX_multiview on android.
Just like commit 2ffe395 does for radv.

Fixes following dEQP test on i965:
   dEQP-VK.api.info.android.no_unknown_extensions

v2: make it !ANDROID since this extension is not about
    surfaces/swapchain

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-06 08:01:20 +02:00
Roland Scheidegger
cf4a92fda2 gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128
Some state trackers require 128.
(There are no plans to increase PIPE_MAX_SAMPLERS too, since with gl
state tracker it's unlikely more than 32 will be needed, if you need
more use bindless.)
2018-03-06 05:18:17 +01:00
Roland Scheidegger
06e724c7b4 tgsi/scan: use wrap-around shift behavior explicitly for file_mask
The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
noone uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
Albeit it feels a bit hacky (but in any case, explicit behavior there
is better than undefined behavior).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-06 05:18:17 +01:00
Aaron Watry
95ae6c0355 clover: Allow overriding platform/device version numbers
Useful for testing API, builtin library, and device completeness of
not-yet-supported versions.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>

v4: Remove redundant std::string wrapper around debug_get_option calls
v3: mark CL version overrides as static and const
v2: Make version_string in platform const in case
2018-03-05 20:09:46 -06:00
Aaron Watry
106020712f clover/llvm: Pass device down to compile
We'll need to be able to detect device version to define the appropriate
__OPENCL_VERSION__ header.

v2: Rebase after removing the previous patch (Pierre)
  - Removed "clover: Add device_clc_version to llvm::create_compiler_instance"

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-05 20:09:46 -06:00
Aaron Watry
fc629e3594 clover: Pass device to llvm::create_compiler_instance
We'll be using dev.device_clc_version to select the default language version
soon along with the existing ir_target field.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

v4: Pass the device down instead of device_clc_version as a separate field
v3: Revise to acknowledge that we now have the device in compile/link_program
    instead of the string values.
v2: (Pierre) Move changes to create_compiler_instance invocation to correct
    patch to prevent temporary build breakage.
    (Jan) Use device_clc_version instead of device_version for compile/link
2018-03-05 20:09:46 -06:00
Aaron Watry
dd81ca3883 clover/llvm: Use device in llvm compilation instead of copying fields
Copying the individual fields from the device when compiling/linking
will lead to an unnecessarily large number of fields getting passed
around.

v3: Rebase on current master
v2: Use device in function args before making additional changes in
    following patches

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-05 20:09:46 -06:00
Timothy Arceri
71b3d681d8 radeonsi/nir: fix handling of doubles for gs inputs
Fixes piglit test:
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Timothy Arceri
20bd0f6a2b ac: pass the unmodified number of components to load gs inputs
Currently both users of this would overflow an array when the
input was a dual slot double as they expected the number of
components to be a max of 4.

Since we pass the type we can just let the functions handle
doubles in a way they choose.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Timothy Arceri
2a68c6c6c8 radeonsi: move si_nir_load_input_gs() to si_shader.c
All the tess shader and tgsi equivalents are here and it allows
use to use llvm_type_is_64bit() in the following patch without
exposing it externally.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Boris Brezillon
9ea90ffb98 broadcom/vc4: Add support for HW perfmon
The V3D engine provides several perf counters.
Implement ->get_driver_query_[group_]info() so that these counters are
exposed through the GL_AMD_performance_monitor extension.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-05 15:54:04 -08:00
Boris Brezillon
5924379a58 drm-uapi: Update vc4 header with perfmon related definitions
v2: Update to the final version with the documentation.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2018-03-05 15:53:48 -08:00
Roland Scheidegger
434523cf2a r600: fix color export mask
The r600 code (not the eg one) forgot to copy the ps_color_export_mask
in commit 5b14e06d8b when updating the
pixel state, leading to misrenderings (probably with MRT).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262

Tested-by: LoneVVolf <lonewolf@xs4all.nl>
Tested-by: Pavel Vinogradov <public@sourcemage.org>
2018-03-05 20:15:05 +01:00
Andres Gomez
72552012c7 travis: keep meson version below 0.45.0
Recently Meson upgraded to 0.45.0 and it needs python 3.5+, which is
not available in Trusty.

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-03-05 21:12:37 +02:00
Kenneth Graunke
0472aa3efe intel: Drop SURFACE_FORMAT enum from genxml.
We want people to be using ISL_FORMAT_*, rather than the genxml format
enumerations. This patch drops 10 separate copies, and drops a bunch
of ugly casting.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Minor changes for rebase]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:08 -08:00
Jordan Justen
755e7e6c20 intel/common: Use isl for decoder surface formats
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:04 -08:00
Jordan Justen
bd3392423d intel/isl: Add isl_format_is_valid
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:01 -08:00
Jordan Justen
272bef0601 intel: Split gen_device_info out into libintel_dev
Split out the device info so isl doesn't depend on intel/common. Now
it will depend on the new intel/dev device info lib.

This will allow the decoder in intel/common to use isl, allowing us to
apply Ken's patch that removes the genxml duplication of surface
formats.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:47:37 -08:00
Gert Wollny
9a0d7bb48c gallium/aux/hud: Avoid possible buffer overflow
Limit the length of acceptable cpu names for use in hud_get_num_cpufreq
in order to avoid a buffer overflow later in add_object when this name
is copied into cpufreq_info::name.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-03-05 11:38:28 -05:00
Eric Engestrom
b98c905a46 gbm: give a name to rgba fields
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-03-05 15:14:36 +00:00
Andres Gomez
40abffb295 egl: remove duplicated initialization
Found by inspection.

The line removed is a duplicate of the line literally just above the
the 3 lines context usually printed in a commit log.

v2: enhance the commit log (Emil).

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-05 15:55:53 +02:00
Rob Clark
5a5a43078c freedreno/ir3: start dealing with half-precision
Some instructions, assume src and/or dst is half-precision based on a
type field (ie. f32/s32/u32 are full precision but others are half
precision).  So add some code to sanity check the src/dst registers to
catch mixups.

Also propagate half-precision flag for SSA sources.  The instruction
consuming a SSA value needs to be of the same type as the one producing
it.

This is probably not complete half-precision support, but a useful first
step.  We do still need to add support for nir alu instructions for
converting between half/full precision.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
175d1b4372 freedreno/ir3: fix fixing-up register footprint
It isn't just vertex shaders that need to fixup reg footprint for inputs
populated before shader starts.

This problem showed up with compute shaders.  If you have (for example)
a localregid sysval, but only the .x component is used, the hw still
writes the .yz components, which could overflow into other threads
causing corruption.  Showed up in cl cts 'basic/test_basic intmath_int'.
But in theory the same problem could crop up elsewhere.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9a62536108 freedreno: surfaces can be PIPE_BUFFER
At least for clover.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
d7af35a7f3 freedreno/a5xx: handle compute resources
Not *entirely* sure why this is a different BIND bit, but it is.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
82c71b09d5 freedreno/ir3: ignore return jump
I think this should also always only occur at the end of a BB (by
definition), and the BB successor should be the end block.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
c9b1cc33df freedreno: add some more compute caps
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9630f4df3b freedreno/a5xx: don't expose 64b pointers yet
Temporary hack, but since we can't do 64b math yet in ir3, pretend that
we don't support 64b pointers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
54988f1e6b freedreno: steal handy macro for compute caps from nouveau
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
26a9321d0a freedreno: add global_bindings state
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
8c42f63151 freedreno/ir3: small cleanup
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
76687b0c0a freedreno: add pctx->memory_barrier()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Rob Clark
9e4f5966e8 freedreno/ir3: cmdline compiler updates for spv shaders
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-05 08:05:33 -05:00
Samuel Pitoiset
322a51b549 ac: add ac_build_fsign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:36 +01:00
Samuel Pitoiset
e8bdde2289 ac: add ac_build_isign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:32 +01:00
Samuel Pitoiset
459e33900f ac: add ac_build_fract()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:30 +01:00
gurchetansingh@chromium.org
fe0647df5a virgl: add offset alignment values to to v2 caps struct
glBindBufferRange(..) in vrend_draw_bind_ubo is failing with
more than one uniform block. This is due to improper alignment
of the start of the second block. Let's query the proper
alignment from the driver and pass it back to Mesa.

Let's query for the texture alignment too, even though the Virgl
renderer doesn't call glTexBufferRange yet.

The default values are the widest workable range possible (for example,
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256).

Fixes:
	dEQP-GLES3.functional.ubo.* on Nvidia

Example test:
	dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex

Note: This is based on "virgl: reduce some default capset limits.",
which hasn't landed in Mesa yet but should relatively soon.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:39 +10:00
Dave Airlie
9283cf2ad1 virgl: reduce some default capset limits.
Since v2 might take a while to rollout, we should reduce
these inside some gathered minimums and then v2 can increase
them using host values.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:38 +10:00
Dave Airlie
cd32258ec1 virgl: handle getting new capsets.
This checks the kernel api is new enough and asks for the
larger caps size since the kernel won't mess it up now.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-05 13:29:38 +10:00
Timothy Arceri
70190a6567 radeonsi/nir: call ac_lower_indirect_derefs()
Fixes piglit tests:
tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test
tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Timothy Arceri
561503e3bd radeonsi: add chip class to compiler_ctx_state
This will be used in the following patch.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Timothy Arceri
0f2c7341e8 ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c
Until llvm handles indirects better we will need to use these
workarounds in the radeonsi backend also.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-05 14:09:23 +11:00
Bas Nieuwenhuizen
eea20d59ab radv: Fix copying from 3D images starting at non-zero depth.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-05 01:04:54 +01:00
Vinson Lee
bb742b6ebf swr/rast: Fix macOS macro.
Fixes: a25093de71 ("swr/rast: Implement JIT shader caching to disk")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2018-03-04 13:23:57 -08:00
Mathias Fröhlich
411aa8c322 vbo: Try to reuse the same VAO more often for successive dlists.
The change tries to catch more opportunities to reuse the same set
of VAO's when building up display lists. Instead of checking the
offset with respect to the beginning of the vertex buffer object
the change tries to apply this same optimization with respect to the
previous display list node.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-03 05:56:35 +01:00
Ian Romanick
a9eb455e29 mesa: Silence unused parameter warnings from TEXSTORE_PARAMS
Reduces my build from 1717 warnings to 1547 warnings by silencing 170
instances of things like

In file included from ../../SOURCE/master/src/mesa/main/texcompress_bptc.h:30:0,
                 from ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:31:
../../SOURCE/master/src/mesa/main/texcompress_bptc.c: In function ‘_mesa_texstore_bptc_rgba_unorm’:
../../SOURCE/master/src/mesa/main/texstore.h:60:14: warning: unused parameter ‘dstFormat’ [-Wunused-parameter]
  mesa_format dstFormat, \
              ^
../../SOURCE/master/src/mesa/main/texcompress_bptc.c:1276:32: note: in expansion of macro ‘TEXSTORE_PARAMS’
 _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS)
                                ^~~~~~~~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
1049b57bf2 i965: Silence unused parameter warnings in genX_state_upload
Reduces my build from 1772 warnings to 1717 warnings by silencing 55
instances of things like

../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_vertex_buffer_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:313:41: warning: unused parameter ‘end_offset’ [-Wunused-parameter]
                                unsigned end_offset,
                                         ^~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_sampler_state_pointers_xs’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4689:58: warning: unused parameter ‘brw’ [-Wunused-parameter]
 genX(emit_sampler_state_pointers_xs)(struct brw_context *brw,
                                                          ^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4690:62: warning: unused parameter ‘stage_state’ [-Wunused-parameter]
                                      struct brw_stage_state *stage_state)
                                                              ^~~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_upload_default_color’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4730:40: warning: unused parameter ‘format’ [-Wunused-parameter]
                            mesa_format format, GLenum base_format,
                                        ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘translate_wrap_mode’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4906:41: warning: unused parameter ‘brw’ [-Wunused-parameter]
 translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest)
                                         ^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_update_sampler_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4972:37: warning: unused parameter ‘batch_offset_for_sampler_state’ [-Wunused-parameter]
                            uint32_t batch_offset_for_sampler_state)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
50bf186829 isl: Silence unused parameter warnings in __gen_combine_address implementations
Reduces my build from 1808 warnings to 1772 warnings by silencing 36
instances of things like

../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’:
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter]
 __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
                             ^~~~
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter]
 __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
                                         ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
492a472b28 genxml: Silence unused parameter warnings in generated pack code
Reduces my build from 1960 warnings to 1808 warnings by silencing 152
instances of things like

In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0,
                 from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36:
src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’:
src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_uint(uint64_t v, uint32_t start, uint32_t end)
                                                 ^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’:
src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter]
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
                                   ^~~~~
src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
                                                   ^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’:
src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits)
                                                ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
f726695cce i965: Silence unused parameter warnings in blorp
Reduces my build from 2023 warnings to 1960 warnings by silencing 63
instances of things like

In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter]
                        const struct blorp_params *params)
                                                   ^~~~~~
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter]
                          const struct blorp_params *params)
                                                     ^~~~~~
In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter]
                     const struct blorp_params *params)
                                                ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                       ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                                    ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                                                  ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
3a944316c4 nir: Silence unused parameter warnings in generated nir_constant_expressions code
Reduces my build from 2075 warnings to 2023 warnings by silencing 52
instances of things like

src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’:
src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter]
 evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size,
                                                             ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
ab8f2e30b8 i965: Silence unused parameter warnings in generated OA code
Reduces my build from 6301 warnings to 2075 warnings by silencing 4226
instances of things like

src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c: In function ‘hsw__render_basic__gpu_core_clocks__read’:
src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c:41:62: warning: unused parameter ‘brw’ [-Wunused-parameter]
 hsw__render_basic__gpu_core_clocks__read(struct brw_context *brw,
                                                              ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
a55dae6ea2 i965: Silence warnings about mixing enum and non-enum in conditional
Reduces my build from 6451 warnings to 6301 warnings by silencing 150
instances of

../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info*, const brw_inst*)’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
    unsigned file = __builtin_strcmp("dst", #reg) == 0 ?                       \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
                    BRW_GENERAL_REGISTER_FILE :                                \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                    brw_inst_##reg##_reg_file(devinfo, inst);                  \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’
 REG_TYPE(src1)
 ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
feefb7810e intel/compiler: Silence unused parameter warnings in release builds
Reduces my build from 7005 warnings to 6451 warnings by silencing 554
instances of

In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0:
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo,
                                                         ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo,
                                                         ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_set_3src_a1_src2_imm(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn)
                                               ^~~~~~~
In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0,
                 from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29:
../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’:
../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’:
../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter]
         unsigned _reg_file,
                  ^~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Ian Romanick
c8a03ab453 i965: Silence unused parameter warnings
Reduces my build from 7119 warnings to 7005 warnings by silencing 114
instances of

In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_context.h:46:0,
                 from ../../SOURCE/master/src/mesa/drivers/dri/i965/intel_pixel_read.c:38:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h: In function ‘brw_bo_unmap’:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h:258:47: warning: unused parameter ‘bo’ [-Wunused-parameter]
 static inline int brw_bo_unmap(struct brw_bo *bo) { return 0; }
                                               ^~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-02 16:10:44 -08:00
Kenneth Graunke
9fa95359df intel: Drop program size pointer from vec4/fs assembly getters.
These days, we're just passing a pointer to a prog_data field, which
we already have access to.  We can just use it directly.

(In the past, it was a pointer to a separate value.)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-02 14:20:22 -08:00
Kenneth Graunke
b04cf529f2 i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT.
This should have no practical impact.  For the default uploader, we
don't really care, but for others, we may want to append more data
as the GPU is reading existing data, which means we need async and
persistent flags.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-02 14:19:33 -08:00
Kenneth Graunke
eb99bf8abe i965: Generalize intel_upload.c to support multiple uploaders.
I'd like to reuse the upload logic for a new program cache, but the
buffers will need to have a different lifetime than the default
uploader, and also some address space restrictions.  So, we can't
use a single uploader for both situations - we'll need two of them.

This creates a public 'uploader' structure, and adjusts the interface
to take an uploader rather than always using brw->upload.  It should
have no functional change at the moment.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-02 14:19:33 -08:00
Anuj Phogat
56dc9f9f49 intel/compiler: Memory fence commit must always be enabled for gen10+
Commit bit in the message descriptor (Bit 13) must be always set
to true in CNL+ for memory fence messages. It also fixes a piglit
GPU hang on cnl+ in simulation environment.
Piglit test: arb_shader_image_load_store-shader-mem-barrier
See HSD ES # 1404612949

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-02 11:45:21 -08:00
Francisco Jerez
4b4838b1ae Revert "i965/fs: Predicate byte scattered writes if needed"
This reverts commit a4031bdfa9.  It's
redundant with the sample mask predication done at this point by the
common logical send lowering infrastructure, and rather buggy because
it wasn't applying the correct sample mask in shaders using discard,
since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS
doesn't reflect samples discarded by the shader, so it could have led
to data corruption in fragment shader invocations that execute discard
based on a non-dynamically uniform condition.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
c063e88909 intel/fs: Handle surface opcode sample masks via predication.
The main motivation is to enable HDC surface opcodes on ICL which no
longer allows the sample mask to be provided in a message header, but
this is enabled all the way back to IVB when possible because it
decreases the instruction count of some shaders using HDC messages
significantly, e.g. one of the SynMark2 CSDof compute shaders
decreases instruction count by about 40% due to the removal of header
setup boilerplate which in turn makes a number of send message
payloads more easily CSE-able.  Shader-db results on SKL:

 total instructions in shared programs: 15325319 -> 15314384 (-0.07%)
 instructions in affected programs: 311532 -> 300597 (-3.51%)
 helped: 491
 HURT: 1

Shader-db results on BDW where the optimization needs to be disabled
in some cases due to hardware restrictions:

 total instructions in shared programs: 15604794 -> 15598028 (-0.04%)
 instructions in affected programs: 220863 -> 214097 (-3.06%)
 helped: 351
 HURT: 0

The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL
laptop with this change.  According to Eero this improves performance
of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2
desktop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
2018-03-02 11:28:56 -08:00
Francisco Jerez
e7c9adca57 intel/eu: Plumb header present bit to codegen helpers for HDC messages.
This makes sure that the header-present bit of the message descriptor
is in sync with the IR instruction fields, which gives the optimizer
more control to avoid the overhead of setting up a message header when
it's possible to do so.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
6edb332b44 intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.
This shouldn't cause any functional change at this point, it changes
SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at
the IR level instead of the hard-coded f1.0, now that it can be
represented in backend_instruction::flag_subreg.  This will be
necessary for scheduling to behave correctly once more things start
making use of f1.0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
cc0fc8b8ac intel/ir: Allow representing additional flag subregisters in the IR.
This allows representing conditional mods and predicates on f1.0-f1.1
at the IR level by adding an extra bit to the flag_subreg
backend_instruction field.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Francisco Jerez
9ec3362e0b intel/l3: Don't allocate SLM partition on ICL+.
SLM has a chunk of special-purpose memory separate from L3 on ICL+, we
shouldn't allocate a partition for it on L3 anymore.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-02 11:28:56 -08:00
Charmaine Lee
af8877af3b svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs
Since geometry shader also consumes prescale constants, the
geometry shader constant buffer will need to be updated when prescale
factor is changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
dc79b88402 svga: fix blending regression
The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend
state when it's disabled for a colorbuffer") subtly changed the
details of gallium's per-RT blend state.

In particular, when pipe_rt_blend_state[i].blend_enabled is true,
we have to get the src/dst blend terms from pipe_rt_blend_state[i],
not [0] as before.

We now have to scan the blend targets to find the first one that's
enabled (if any).  We have to use the index of that target for getting
the src/dst blend terms.  And note that we have to set identical blend
terms for all targets.

This fixes the Piglit fbo-drawbuffers2-blend test.  VMware bug 2063493.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
b871a77316 svga: check svga_have_vgpu10() in svga_delete_blend_state()
We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not
enabled (bs->id==0 by default), resulting in lots of device errors.

Reviewed-by: Neha Bhende<bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
72df3a7a39 svga: if svga_update_state() fails, skip the draw call
If svga_update_state() fails, we flush the command buffer and retry.
If it fails again, it likely means we were unable to translate a shader
for some reason (uses too many resources, for example).  In that case,
let's just skip the draw call.  The alternative, just disabling the
shader stage in question, would certainly lead to bad rendering anyway,
and probably device errors.

Fixes failed assertion running Piglit glsl-1.50/execution/
variable-indexing/gs-output-array-vec4-index-wr.shader_test since it
uses too many GS output registers (though the test still fails).
VMware bug 2063492.

v2: also call pipe_debug_message() so apps or apitrace can be notified
when this issue occurs.
v3: use svga_update_state_retry().

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
0a7deaa0d6 svga: let svga_update_state_retry() return a bool
This will allow minor simplifications elsewhere.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-03-02 12:23:50 -07:00
Brian Paul
35c5cf8959 svga: s/unsigned/boolean/ for a few local vars
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-03-02 12:23:50 -07:00
Dylan Baker
e23192022a meson: install vulkan_intel.h header
Fixes: d1992255bb
       ("meson: Add build Intel "anv" vulkan driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-02 11:11:20 -08:00
Boyuan Zhang
1ad89fa138 st/omx_bellagio: add picture profile and entry point
Profile and entry point were missing in the picture structure.
Therefore, add them back.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-03-02 12:04:36 -05:00
Boyuan Zhang
6a62e455f2 radeonsi: fix radeon create encoder return
Previous patch missed a "return" when trying to modify the create encoder
function, which made the whole logic fail. Therefore, add the return back.

Fixes: b38b208ff8 "radeonsi:create uvd hevc enc entry"

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-02 12:04:36 -05:00
Thierry Reding
f9bc48d41d loader: Add support for platform and host1x busses
ARM SoCs usually have their DRM/KMS devices on the platform bus, so add
support for this bus in order to allow use of the DRI_PRIME environment
variable with those devices.

While at it, also support the host1x bus, which is effectively the same
but uses an additional layer in the bus hierarchy.

Note that it isn't enough to support the bus that has the rendering GPU
because the loader code will also try to construct an ID path tag for a
scanout-only device if it is the default that is being opened.

The ID path tag for a device can be obtained by running udevadm info on
the device node, as shown in this example on NVIDIA Tegra:

	$ udevadm info /dev/dri/card0 | grep ID_PATH_TAG
	E: ID_PATH_TAG=platform-50000000_host1x

The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is
constructed, can be found in the sysfs "uevent" attribute for the card0
device's parent:

	$ grep OF_FULLNAME /sys/devices/platform/50000000.host1x/drm/uevent
	OF_FULLNAME=/host1x@50000000

Similarily, /dev/dri/card1 corresponds to the GPU:

	$ udevadm info /dev/dri/card1 | grep ID_PATH_TAG
	E: ID_PATH_TAG=platform-57000000_gpu

and:

	$ grep OF_FULLNAME /sys/devices/platform/57000000.gpu/uevent
	OF_FULLNAME=/gpu@57000000

Changes in v2:
- avoid confusing pre-increment in strdup()
- add examples of tags to commit message

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-02 14:40:29 +01:00
Thierry Reding
498faea103 disk cache: Link with -latomic if necessary
The disk cache implementation uses 64-bit atomic operations. For some
architectures, such as 32-bit ARM, GCC will not be able to translate
these operations into atomic, lock-free instructions and will instead
rely on the external atomics library to provide these operations.

Check at configuration time whether or not linking against libatomic
is necessary and if so, create a dependency that can be used while
linking the mesautil library.

This is the meson equivalent of 2ef7f23820 ("configure: check if
-latomic is needed for __atomic_*").

For some background information on this, see:

	https://gcc.gnu.org/wiki/Atomic/GCCMM

Changes in v2:
- clarify meaning of lock-free in commit message
- fix build if -latomic is not necessary

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-02 11:31:59 +01:00
Samuel Pitoiset
c133a3411b radv: do not set pending_reset_query in BeginCommandBuffer()
This is just useless for two reasons:
1) flush_bits is not set accordingly, so nothing will be flushed
   in BeginQuery().
2) we always flush caches in EndCommandBuffer(), so if a reset
   is done in a previous command buffer we are safe.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-02 09:44:12 +01:00
Dave Airlie
bf2af063c3 r600/cayman: fix fragcood loading recip generation.
This fixes some hangs seen where the recip_ieee opcodes would
end up split across the wrong slots.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-02 00:33:18 +00:00
Kenneth Graunke
cee9f38903 i965: Allow 48-bit addressing on Gen8+.
This allows most GPU objects to use the full 48-bit address space
offered by Gen8+ platforms, rather than being stuck with 32-bit.
This expands the available GPU memory from 4G to 256TB or so.

A few objects - instruction, scratch, and vertex buffers - need to
remain pinned in the low 4GB of the address space for various reasons.
We default everything to 48-bit but disable it in those cases.

Thanks to Jason Ekstrand for blazing this trail in anv first and
finding the nasty undocumented hardware issues.  This patch simply
rips off all of his findings.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-01 15:46:11 -08:00
Kenneth Graunke
6712611735 i965: Shorten the name of the workaround BO.
This makes the name shorter in debug printouts.  If "workaround_bo"
is good enough for the code, it's probably good enough for debugging.
2018-03-01 15:46:11 -08:00
Kenneth Graunke
b04c5cece7 i965: Add debugging code to dump the validation list.
When anything goes wrong with this code, dumping the validation list
is a useful way to figure out what's happening.
2018-03-01 15:46:11 -08:00
Jason Ekstrand
ff4726077d intel/fs: Set up sampler message headers in the visitor on gen7+
This gives the scheduler visibility into the headers which should
improve scheduling.  More importantly, however, it lets the scheduler
know that the header gets written.  As-is, the scheduler thinks that a
texture instruction only reads it's payload and is unaware that it may
write to the first register so it may reorder it with respect to a read
from that register.  This is causing issues in a couple of Dota 2 vertex
shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-03-01 15:11:01 -08:00
Timothy Arceri
f5305c1b44 ac: fix nir_intrinsic_shared_atomic_comp_swap handling
Following on from 49879f3778 this makes sure we use the correct
src index.

Fixes cts test:
KHR-GL46.compute_shader.atomic-case3

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-02 09:11:20 +11:00
Timothy Arceri
13cdf4e590 st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs
We only need to check for previously processed location on user
defined varyings as they are the only ones that support component
packing. Therefore a single instance of processed_locs can be
shared by regular varyings and patches.

For simplicity we make processed_locs an array in order to handle
dual source bleanding.

Fixes the follow piglit test on radeonsi:
tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-02 09:11:20 +11:00
Jason Ekstrand
89f78cf333 anv: Enable MSAA fast-clears
This speeds up the Sascha Willems multisampling demo by around 25% when
using 8x or 16x MSAA.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
00da139477 anv/cmd_buffer: Add support for MCS fast-clears and resolves
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
1805c483b1 anv/cmd_buffer: Add helpers for computing resolve predicates
We'll want to re-use the complex resolve predicate computations for MCS
resolves so it's nice to have them as helper functions.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
a0a319f16e anv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage
This doesn't actually do anything because att_state->fast_clear is
determined based on the return value of anv_layout_to_fast_clear_type
which currently returns NONE for multisampled images.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
d0f701d2f1 anv/blorp: Pass the clear address to blorp for subpass MSAA resolves
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
f4f95496cb anv/blorp: Allow indirect clear colors on blorp sources on gen7
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
d85f05bd6f anv/blorp: Add partial clear support to anv_image_mcs_op
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
c34feaea52 intel/blorp: Add indirect clear color support to mcs_partial_resolve
This is a bit complicated because we have to get the indirect clear
color in there somehow.  In order to not do any more work in the shader
than needed, we set it up as it's own vertex binding which points
directly at the clear color address specified by the client.

Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
ca7ab1a6a5 intel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE
There are enough #ifs in there that it's kind-of pointless to duplicate
it for each buffer.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Andriy Khulap
7859701920 i965: Fix RELOC_WRITE typo in brw_store_data_imm64()
Fixes: 6c530ad116
("i965: Reduce passing 2x32b of reloc_domains to 2 bits")

Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-01 11:20:04 -08:00
Jonathan Gray
034bbaa6c0 gallium/util: use sockets on PIPE_OS_UNIX in u_network
Instead of listing all the UNIX PIPE_OS platforms just use
PIPE_OS_UNIX.  Makes BSD sockets available on PIPE_OS_BSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-01 18:44:39 +00:00
Jonathan Gray
7bea40e566 util: use clock_gettime() on PIPE_OS_BSD
OpenBSD, FreeBSD, NetBSD and DragonFlyBSD all have clock_gettime()
so use it when PIPE_OS_BSD is defined.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-03-01 18:44:38 +00:00
Jose Maria Casanova Crespo
4420d8866c nir/search: Include 8 and 16-bit support in construct_value
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-01 09:16:03 -08:00
Jason Ekstrand
99ee40fb54 nir/search: Support 8 and 16-bit constants in match_value
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2018-03-01 09:15:01 -08:00
Andres Gomez
b5b912dfee travis: make Meson find the proper llvm-config
Travis CI has moved to LLVM 5.0, and meson is detecting automatically
the available version in /usr/local/bin based on the PATH env variable
order preference.

As for 0.44.x, Meson cannot receive the path to the llvm-config binary
as a configuration parameter. See
https://github.com/mesonbuild/meson/issues/2887 and
7c8b6ee3fa

We want to use the custom (APT) installed version. Therefore, let's
make Meson find our wanted version sooner than the one at
/usr/local/bin

Once this is corrected, we would still need a patch similar to:
https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html

v2: Create the link only to the specificly wanted LLVM version (Gert).

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Gert Wollny <gw.fossdev@gmail.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-01 12:21:30 +02:00
Andres Gomez
98f7650add meson: fix LLVM version detection when <= 3.4
3 digits versions in LLVM only started from 3.4.1 on.

Hence, even if you can perfectly build with an old LLVM (< 3.4.1) in
the system while not needing LLVM at all (auto), when passing through
the LLVM version detection code, meson will fail when accessing
"_llvm_version[2]" due to:

"Index 2 out of bounds of array of size 2."

v2: Properly compare LLVM version and set patch version to 0
    if < 3.4.1 (Eric).

v3: Improve the commit log explanation (Eric).

Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-03-01 12:16:23 +02:00
Iago Toral Quiroga
bc73016703 i965/sbe: fix number of inputs for active components
In 16631ca30e we fixed gen9 active components to account for padded
inputs in the URB, which we can have with SSO programs. To do that,
instead of going through the bitfield of inputs (which doesn't include
padding information), we compute the number of inputs from the size
of the URB entry.

Unfortunately, there are some special inputs that are not stored in
the URB and that we also need to account for. These special inputs
are identified and handled during calculate_attr_overrides().

Instead of keeping track of the exact number of inputs, we just
program active components for all possible inputs like we do in
anvil.

This fixes a regression in a WebGL program that uses Point Sprite
functionality (specifically, VARYING_SLOT_PNTC).

v2:
 - Add 'Fixes' tag (Mark Janes)
 - make no_vue_inputs int instead of uint32_t, and add const qualifier
   to num_inputs variable (Ian)

v3:
 - Do not try to count inputs correctly, just program all input
   slots like we do in anvil (Ken)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105224
Fixes: 16631ca30e (i965/sbe: fix active components for SSO programs with over 16 inputs)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-01 10:55:12 +01:00
Samuel Pitoiset
c27f5419f6 radv: only emit cache flushes when the pool size is large enough
This is an optimization which reduces the number of flushes for
small pool buffers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:53:40 +01:00
Samuel Pitoiset
2fe07933bd radv: keep track of the query pool size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:53:39 +01:00
Samuel Pitoiset
c956d0f406 radv: make sure to emit cache flushes before starting a query
If the query pool has been previously resetted using the compute
shader path.

Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-01 09:14:49 +01:00
Alejandro Piñeiro
e72fb4e611 nir/serialize: handle var->name being NULL
var->name could be NULL under ARB_gl_spirv for example. And in any
case, the code is already handing var name being NULL when reading a
variable, so it is consistent to do it writing a variable too.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-01 08:23:33 +01:00
Jose Maria Casanova Crespo
ba642ee3ee anv: Enable VK_KHR_16bit_storage for PushConstant
Enables storagePushConstant16 features of VK_KHR_16bit_storage for Gen8+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
02266f9ba1 spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned
The introduction of 16-bit types with VK_KHR_16bit_storages implies that
push constant offsets could be multiple of 2-bytes. Some assertions are
updated so offsets should be just multiple of size of the base type but
in some cases we can not assume it as doubles aren't aligned to 8 bytes
in some cases.

For 16-bit types, the push constant offset takes into account the
internal offset in the 32-bit uniform bucket adding 2-bytes when we access
not 32-bit aligned elements. In all 32-bit aligned cases it just becomes 0.

v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
23ffb7c2d1 spirv: Calculate properly 16-bit vector sizes
Range in 16-bit push constants load was being calculated
wrongly using 4-bytes per element instead of 2-bytes as it
should be.

v2: Use glsl_get_bit_size instead of if statement
    (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
994d210429 anv: Enable VK_KHR_16bit_storage for SSBO and UBO
Enables storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccesss
features of VK_KHR_16bit_storage for Gen8+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
69be3a82ca i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout
Restrict the use of untyped_surface_write with 16-bit pairs in
ssbo to the cases where we can guarantee that offset is multiple
of 4.

Taking into account that VK_KHR_relaxed_block_layout is available
in ANV we can only guarantee that when we have a constant offset
that is multiple of 4. For non constant offsets we will always use
byte_scattered_write.

v2: (Jason Ekstrand)
    - Assert offset_reg to be multiple of 4 if it is immediate.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
8dd8be0323 i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layout
16-bit load_ubo/ssbo operations that call do_untyped_read_vector don't
guarantee that offsets are multiple of 4-bytes as required by untyped_read
message. This happens for example in the case of f16mat3x3 when then
VK_KHR_relaxed_block_layout is enabled.

Vectors reads when we have non-constant offsets are implemented with
multiple byte_scattered_read messages that not require 32-bit aligned offsets.

Now for all constant offsets we can use the untyped_read_surface message.
In the case of constant offsets not aligned to 32-bits, we calculate a
start offset 32-bit aligned and use the shuffle_32bit_load_result_to_16bit_data
function and the first_component parameter to skip the copy of the unneeded
component.

v2: (Jason Ekstrand)
    Use untyped_read_surface messages always we have constant offsets.

v3: (Jason Ekstrand)
    Simplify loop for reads with non constant offsets.
    Use end - start to calculate the number of 32-bit components to read with
    constant offsets.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
2dd94f462b i965/fs: shuffle_32bit_load_result_to_16bit_data now skips components
This helper used to load 16bit components from 32-bits read now allows
skipping components with the new parameter first_component. The semantics
now skip components until we reach the first_component, and then reads the
number of components passed to the function.

All previous uses of the helper are updated to use 0 as first_component.
This will allow read 16-bit components when the first one is not aligned
32-bit. Enabling more usages of untyped_reads with 16-bit types.

v2: (Jason Ektrand)
    Change parameters order to first_component, num_components

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo
67d7dd594e isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit
The surfaces that backup the GPU buffers have a boundary check that
considers that access to partial dwords are considered out-of-bounds.
For example, buffers with 1,3 16-bit elements has size 2 or 6 and the
last two bytes would always be read as 0 or its writting ignored.

The introduction of 16-bit types implies that we need to align the size
to 4-bytew multiples so that partial dwords could be read/written.
Adding an inconditional +2 size to buffers not being multiple of 2
solves this issue for the general cases of UBO or SSBO.

But, when unsized arrays of 16-bit elements are used it is not possible
to know if the size was padded or not. To solve this issue the
implementation calculates the needed size of the buffer surfaces,
as suggested by Jason:

surface_size = isl_align(buffer_size, 4) +
               (isl_align(buffer_size, 4) - buffer_size)

So when we calculate backwards the buffer_size in the backend we
update the resinfo return value with:

buffer_size = (surface_size & ~3) - (surface_size & 3)

It is also exposed this buffer requirements when robust buffer access
is enabled so these buffer sizes recommend being multiple of 4.

v2: (Jason Ekstrand)
    Move padding logic fron anv to isl_surface_state.
    Move calculus of original size from spirv to driver backend.
v3: (Jason Ekstrand)
    Rename some variables and use a similar expresion when calculating.
    padding than when obtaining the original buffer size.
    Avoid use of unnecesary component call at brw_fs_nir.
v4: (Jason Ekstrand)
    Complete comment with buffer size calculus explanation in brw_fs_nir.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 21:37:40 -08:00
Mathias Fröhlich
4c232dc721 vbo: Remove vbo_save_vertex_list::vertex_size.
Like before use local variables from compile_vertex_list instead.
Remove vertex_size from struct vbo_save_vertex_list.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
478a9bc7bb vbo: Remove vbo_save_vertex_list::buffer_offset.
The buffer_offset is used in aligned_vertex_buffer_offset.
But now that most of these decisions are done in compile_vertex_list
we can work on local variables instead of struct members in the
display list code. Clean that up and remove buffer_offset.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
bfa8d8e5bf vbo: Remove vbo_save_vertex_list::start_vertex.
Replace last use on replay with _vbo_save_get_{min,max}_index. Appart from
that it is not used anymore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
6dd3e98c21 vbo: Remove vbo_save_vertex_list::attrsz.
Is not used anymore on replay, move the last use in display list
compilation to the original array in the display list compiler.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
95b4be4f29 vbo: Remove vbo_save_vertex_list::attrtype.
Is not used anymore on replay, move the last use in display list
compilation to the original array in the display list compiler.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
77df52cc4f vbo: Remove vbo_save_vertex_list::enabled.
Is not used anymore on replay.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
19a0f27a49 vbo: Remove reference to the vertex_store from the dlist node.
Since we now store a set of VAOs in the display list, use these object
to get the reference to the VBO in several places.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
6e410270ee vbo: Implement current values update in terms of the VAO.
Use the information already present in the VAO to update the current values
after display list replay. Set GL_OUT_OF_MEMORY on allocation failure
for the current value update storage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
08aa0d9bf4 vbo: Implement vbo_loopback_vertex_list in terms of the VAO.
Use the information already present in the VAO to replay a display list
node using immediate mode draw commands. Use a hand full of helper methods
that will be useful for the next patches also.

v2: Insert asserts, constify local variables.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
f7178d677c vbo: Use a local variable for the dlist offsets.
The master value is now stored inside the VAO already present in
struct vbo_save_vertex_list. Remove the unneeded copy from dlist storage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
1cc3516a11 vbo: Remove unused vbo_save_context::wrap_count.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Mathias Fröhlich
07915020f0 vbo: Remove unused vbo_save_vertex_list::dangling_attr_ref.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2018-03-01 04:06:23 +01:00
Jason Ekstrand
6d3edbea16 anv: Always set has_context_priority
We don't zalloc the physical device so we need to unconditionally set
everything.  Crucible helpfully initializes all allocations to 139 so it
was getting true regardless of whether or not the kernel actually
supports context priorities.

Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 17:31:20 -08:00
Mark Janes
0fc009b8c7 Revert "i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+"
This reverts commit a2c1e48f15.

On BDWGT3e and KBLGT3e systems, this commit regressed the following
tests:

  piglit.spec.ext_framebuffer_multisample.accuracy 2 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 4 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 6 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 8 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy all_samples stencil_resolve small depthstencil
2018-02-28 17:26:08 -08:00
Dave Airlie
6c1b5a40fd radeonsi/nir: increase values to 8 for gs fetch.
This stops a crash when running (still fails):
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:35:09 +10:00
Bas Nieuwenhuizen
f9898b211e radv: Use the syncobj wait ioctl to wait on fences if possible.
Handles the !waitAll and signal after the start of the wait cases correctly.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
34bd5e2e2e radv: Implement more efficient !waitAll fence waiting.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
6968d782d3 radv: Implement waiting on non-submitted fences.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen
2a404c6f92 radv: Implement WaitForFences with !waitAll.
Nothing to do except using a busy wait loop. At least for old kernels.

A better implementation for newer kernels to come later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-01 01:07:18 +01:00
Dave Airlie
49879f3778 ac/nir: fix shared atomic operations.
The nir->llvm conversion was using the wrong srcs.

Fixes:
tests/spec/arb_compute_shader/execution/shared-atomics.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:06:06 +10:00
Dave Airlie
69495b30a3 ac/nir: don't apply slice rounding on txf_ms
This matches the tgsi code.

Fixes arb_texture_multisample texelFetch piglit tests.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:04:34 +10:00
Timothy Arceri
f383fec903 radeonsi: set some context vars for nir path
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-01 10:51:56 +11:00
Timothy Arceri
7e46214f87 gallium: remove llvm from ir struct
This was added in 425dc4c4b3 but never used. Also since
100796c15c native has superseded llvm.

Acked-by: Dave Airlie <airlied@redhat.com>
2018-03-01 10:51:56 +11:00
Kenneth Graunke
e51b0664e0 i965: Don't emit MOVs with undefined registers for Gen4 point clipping.
Gen4 point clipping calls brw_clip_tri_alloc_regs with nr_verts == 0,
which means that c->reg.vertex[] isn't initialized.  It then emits MOVs
to stomp components of those uninitialized registers to 0.

This started causing assertions after Matt's recent series, when those
uninitialized registers started getting BRW_REGISTER_TYPE_NF, which
definitely doesn't exist on Gen4-5.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-28 15:03:51 -08:00
Eric Anholt
e4e79a02da broadcom/vc5: Fix regression in the page-cache slice size alignment.
We need to align the size of the slice, not the offset of the next slice.
Fixes KHR-GLES3.texture_repeat_mode.rgba32ui_11x131_2_clamp_to_edge.

Fixes: b4b4ada761 ("broadcom/vc5: Fix layout of 3D textures.")
2018-02-28 13:59:50 -08:00
Jason Ekstrand
a2c1e48f15 i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 13:31:42 -08:00
Jason Ekstrand
67da59e320 i965: Be more clever about setting up our viewport clip
Before, we were trusting in the hardware to take the intersection
of the viewport clip with the drawing rectangle.  Unfortunately,
3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly
does a full pipeline stall.  If we're a bit more careful with our
viewport clipping, we can just re-emit it once at context creation
time.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 13:31:42 -08:00
Matt Turner
debaa822ef intel/compiler: Re-add .vs_inputs_dual_locations = true
Looks like a rebase mistake.

Fixes: 89fe5190a2 ("intel/compiler: Lower flrp32 on Gen11+")
2018-02-28 13:25:21 -08:00
Dave Airlie
7cb9353de3 r600/shader: when using images always load thread id gpr at start (v2)
The delayed loading code was fail if we had control flow.

This fixes:
tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test

v2: don't use temp_reg before setting temp_reg up.

Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 20:16:19 +00:00
Dave Airlie
8369fdee8b r600: fix whitespace in recent 1d texture commit.
trivial fix.
2018-02-28 20:16:19 +00:00
Matt Turner
6f00bf519d intel/compiler: Add ICL to test_eu_validate.cpp
With the Align16 tests now disabled, we can run the rest of the tests in
ICL mode (and see them pass!)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
ff4b41dd1d intel/compiler: Disable Align16 tests on Gen11+
Align16 is no more.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
c31d77ac22 intel/compiler: Add instruction compaction support on Gen11
Gen11 only differs from SKL+ in that it uses a new datatype index table.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
d5bf093cf9 intel/compiler: Mark line, pln, and lrp as removed on Gen11+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
89fe5190a2 intel/compiler: Lower flrp32 on Gen11+
The LRP instruction is no more.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
2134ea3800 intel/compiler/fs: Implement ddy without using align16 for Gen11+
Align16 is no more. We previously generated an align16 ADD instruction
to calculate DDY:

   add(16) g25<1>F  -g23<4>.xyxyF   g23<4>.zwzwF   { align16 1H };

Without align16, we now implement it as:

   add(4) g25<1>F   -g23<0,2,1>F    g23.2<0,2,1>F  { align1 1N };
   add(4) g25.4<1>F -g23.4<0,2,1>F  g23.6<0,2,1>F  { align1 1N };
   add(4) g26<1>F   -g24<0,2,1>F    g24.2<0,2,1>F  { align1 1N };
   add(4) g26.4<1>F -g24.4<0,2,1>F  g24.6<0,2,1>F  { align1 1N };

where only the first two instructions are needed in SIMD8 mode.

Note: an earlier version of the patch implemented this in two
instructions in SIMD16:

   add(8) g25<2>F   -g23<4,2,0>F    g23.2<4,2,0>F  { align1 1N };
   add(8) g25.1<2>F -g23.1<4,2,0>F  g23.3<4,2,0>F  { align1 1N };

but I realized that the channel enable bits will not be correct. If we
knew we were under uniform control flow, we could emit only those two
instructions however.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
62cfd4c656 intel/compiler/fs: Simplify ddx/ddy code generation
The brw_reg() constructor just obfuscates things here, in my opinion.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
bed0267ff6 intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcode
In a future patch, generate_ddy will want to inspect inst->exec_size.
Change generate_ddx as well for consistency.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
3a584a15c0 intel/compiler/fs: Don't generate integer DWord multiply on Gen11
Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer
multiplies.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
432674ce93 intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+
The PLN instruction is no more. Its functionality is now implemented
using two MAD instructions with the new native-float type. Instead of

   pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F

we now have

   mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F
   mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F

... and in the case of SIMD8 only the first pair of MAD instructions is
used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
b5d8781e19 intel/compiler/fs: Return multiple_instructions_emitted from generate_linterp
If multiple instructions are emitted, special handling of things like
conditional mod and NoDDClr/NoDDChk need to be performed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
b1afdf9fc1 intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pair
This isn't technically broken, but the next patch will make this
function report whether it generated multiple instructions, and that
information will be used to disable the application of conditional mod
by the generic code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
2cff324210 intel/compiler: Add Gen11+ native float type
This new type exposes the additional precision offered by the
accumulator register and will be used in the next patch to implement the
functionality of the PLN instruction using a pair of MAD instructions.

One weird thing to note: align1 ternary instructions may only have an
accumulator in the dst or src1 normally, but when src0's type is :NF
the accumulator is read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
58611ff913 intel/compiler: Add Gen11 register types
The hardware register types' encodings have changed on Gen11. Good thing
we have that superfluous looking brw_reg_type abstraction lying around!

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:15:47 -08:00
Matt Turner
bb428454a9 intel: Disable 64-bit extensions on platforms without 64-bit types
Gen11 does not support DF, Q, UQ types in hardware. As a result, we have
to disable some GL extensions until they can be reimplemented.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-02-28 11:15:47 -08:00
Anuj Phogat
5e42103f3b intel: Add icl pci id for INTEL_DEVID_OVERRIDE
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
2018-02-28 11:15:47 -08:00
Matt Turner
35bfe20995 i965: Warn about preliminary support for Gen11
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-28 11:14:03 -08:00
Anuj Phogat
5ac804bd9a intel: Add a preliminary device for Ice Lake
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@intel.com>
2018-02-28 11:14:03 -08:00
Tapani Pälli
0c983b9094 anv: remove anv_gem_set_context_priority helper
anv_gem_set_context_param is to be used directly instead!

Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 19:50:54 +02:00
George Kyriazis
a01d5e3712 swr/rast: revert clip distance precision
Fixes piglit tests that broke with 8a64593bde

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:50 -06:00
George Kyriazis
7e813f6214 swr/rast: Faster frustum prim culling
Fix clipper validMask setting. We don't need to run frustum rejected
primitives through the clipper.  Perform frustum culling with only
frustum clip codes. Guardband clip codes cannot be used because they
overlap frustum codes.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:46 -06:00
George Kyriazis
1c73f42e6e swr/rast: Consolidate TRANSLATE_ADDRESS
Translate is now part of an overloaded LOAD call which required a change to
the code gen to skip the load functions in order to handle them manually
to make them virtual.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:41 -06:00
George Kyriazis
e2a4fd0761 swr/rast: Code generation cleanup
Generate more compact code from gen_llvm.hpp.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:37 -06:00
George Kyriazis
190ead3d79 swr/rast: Remove draw type from event definitions
- Have the draw type sent to DrawInfoEvent in handlers created in
  archrast.cpp.  The draw type no longer needs to be sent during during
  AR_API_EVENT() call in api.cpp.

- Remove draw type from event defintions in events_private.proto, no
  longer needed

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:32 -06:00
George Kyriazis
90e3e23f63 swr/rast: whitespace change
Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:28 -06:00
George Kyriazis
539de78633 swr/rast: Fix index buffer overfetch issue for non-indexed draws
Populate pLastIndex, even for the non-indexed case.  An zero pLastIndex
can cause the index offsets inside the fetcher to have non-sensical values
that can be either very large positive or very large negative numbers.

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-28 11:42:19 -06:00
Roland Scheidegger
26103487b5 softpipe: don't iterate through PIPE_MAX_SHADER_SAMPLER_VIEWS
We were setting view to NULL if the iteration was larger than i.
But in fact if the view is NULL the code did nothing anyway...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-28 18:22:28 +01:00
Roland Scheidegger
b923f21eaa cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroy
There's no point, we know the highest non-null one.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-28 18:22:28 +01:00
Roland Scheidegger
89ae5def8c draw: don't needlessly iterate through all sampler view slots
We already stored the highest (potentially) used number.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-28 18:22:28 +01:00
Tapani Pälli
6d8ab53303 anv: implement VK_EXT_global_priority extension
v2: add ANV_CONTEXT_REALTIME_PRIORITY (Chris)
    use unreachable with unknown priority (Samuel)

v3: add stubs in gem_stubs.c (Emil)
    use priority defines from gen_defines.h

v4: cleanup, add anv_gem_set_context_param (Jason)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (v2)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v3)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 14:36:57 +02:00
Tapani Pälli
5960023cf4 i965: use context priority definitions from gen_defines.h
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-28 14:36:57 +02:00
Tapani Pälli
4449a1f80d intel: add new common header gen_defines.h
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-28 14:36:57 +02:00
Christian König
33633690aa winsys/amdgpu: request high addresses
We now have hopefully fixed all bugs regarding high addresses on Vega10 and
Raven. Start to use the high range to make room for SVM in the low
range.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 13:30:32 +01:00
Samuel Pitoiset
639c4f2b54 ac/shader: move scanning some info about input PS declarations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-28 10:14:26 +01:00
Samuel Iglesias Gonsálvez
e207b2e2c8 glsl/linker: fix bug when checking precision qualifier
According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders"
section, the precision qualifier should match for uniform variables.
This also applies to previous GLSL ES 3.x specs.

This 'if' checks the condition for uniform variables, while for UBOs
it is checked in link_interface_blocks.cpp.

Fixes: b50b82b8a5
("glsl/es31: precision qualifier doesn't need to match in shader interface block members")

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-28 07:04:13 +01:00
Samuel Iglesias Gonsálvez
c757c9dc03 anv: set maxResourceSize to the respective value for each generation
v2:
- Add the proper values to gen9+ (Jason)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-28 06:54:48 +01:00
Dave Airlie
a5853a3333 r600: partly revert disabling tiling for 1d texture.
Previously we had a check for 1d of narrow 2D textures, however
narrow 2d textures caused gpu hangs, but it was correct for 1d
textures.

This fixes a bunch of 1D image piglits for me.

Fixes: 7b8e1c089d (r600/texture: drop lowering 1d/2d images to linear.)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 04:59:37 +00:00
Timothy Arceri
0c1f37cc2d nir: fix interger divide by zero crash during constant folding
From the GLSL 4.60 spec Section 5.9 (Expressions):

   "Dividing by zero does not cause an exception but does result in
    an unspecified value."

Fixes: 89285e4d47 "nir: add new constant folding infrastructure"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271
2018-02-28 15:55:39 +11:00
Ilia Mirkin
086c88551d st/mesa: ensure that images don't try to reference non-existent levels
Ideally the st_finalize_texture call would take care of that, but it
doesn't seem to with KHR-GL45.shader_image_size.advanced-nonMS-*. This
assertion makes sure that no such values are passed to the driver.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-27 22:38:33 -05:00
Dave Airlie
c7b25005a1 ac/radv: move load base vertex abi setup to vertex shader.
This was segfaulting:
dEQP-VK.memory.pipeline_barrier.host_write_index_buffer.1024

Fixes: 8de6f79707 (ac/radeonsi: add load_base_vertex() to the abi)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:58:12 +10:00
Dave Airlie
3401b028df ac/shader: fix vertex input with components.
This fixes:
dEQP-VK.glsl.440.linkage.varying.component.*

Fixes: 1c57a6da5e (ac/shader: scan vertex inputs usage mask)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:04:46 +10:00
Dave Airlie
6bafd4f4dd radv: remove device pointer from buffer.
This is never used.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-28 09:03:26 +10:00
Timothy Arceri
a050ea60ee nir: add lower_ldexp to nir compiler options
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
08fa84bb9a ac: implement nir_op_ldexp
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
9790921ff5 ac: fix nir_op_fdd{x,y} handling
radeonsi, i965 and anv all treat fdd{x,y} opcodes the same as
fdd{x,y}_coarse by default. The SPIR-V spec lets the implementation
decide how it should be handled and radv was previously going
for the higher quality option. Here we change the shared amd
code to match how nir_op_fdd{x,y} is expected to be handled
by the other NIR drivers.

Fixes piglit test:
./bin/arb_shader_texture_lod-texgrad -auto

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
8de6f79707 ac/radeonsi: add load_base_vertex() to the abi
Fixes the following piglit tests:

./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo
./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
7f91473414 radeonsi: create get_base_vertex() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
ae47af50d6 radeonsi/nir: disable vertex_id_zero_based lowering
The lowering is incompatible with how the radeonsi backend works.

Fixes piglit test:
./bin/arb_shader_draw_parameters-basevertex vertexid-zerobased -auto

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
5504bebfc4 ac: add support for handling nir_intrinsic_load_vertex_id
This will be used by radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Timothy Arceri
3a0b4187dd ac: fix f2b and i2b for doubles
Without this llvm was asserting in debug builds.

V2: use LLVMConstNull()

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-28 09:23:49 +11:00
Francisco Jerez
cb309d27c5 intel/ir: Fix invalid type aliasing with undefined behavior in test_eu_compact.
test_fuzz_compact_instruction() was attempting to modify the uint64_t
data array of a brw_inst through a pointer to uint32_t, which has
undefined behavior.  This was causing the test_eu_compact unit test to
fail mysteriously for me on GCC 7 with some additional
harmless-looking changes I had applied to my tree, which happened to
affect the order instructions are emitted by GCC causing the bit
twiddling to be done after the clear_pad_bits() call which is supposed
to overwrite the same data through a pointer of different type,
leading to data corruption.  A similar failure has been reported by
Vinson Lee on the master branch built with GCC 8.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105052
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-27 11:42:39 -08:00
Francisco Jerez
69b4a9d21d util/bitset: Make C++ wrapper trivially constructible.
In order to fix a build failure on compilers not implementing
unrestricted unions, which is a C++11 feature.

v2: Provide signed integer comparison and assignment operators instead
    of BITSET_WORD ones to avoid spurious ambiguity warnings on
    comparisons with a signed integer literal.

Fixes: ba79a90fb5 "glsl: Switch ast_type_qualifier to a 128-bit bitset."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105238
Tested-by: Roland Scheidegger <sroland@vmware.com>
Tested-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-27 11:38:18 -08:00
Jordan Justen
9f223d860b intel/tools: Use gen_device_name_to_pci_device_id in aubinator
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
8ff89250ff intel/common: Add gen_device_name_to_pci_device_id
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
c2134f94c8 intel/vulkan: Support INTEL_DEVID_OVERRIDE environment variable
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
843f6d187a i965: Use gen_get_pci_device_id_override
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
e560bb9dc2 intel/common: Add gen_get_pci_device_id_override
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Jordan Justen
6b274d5cc6 intel/vulkan: Support INTEL_NO_HW environment variable
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-02-27 11:15:10 -08:00
Harish Krupo
b9af043716 android: fix source files path for libmesa_anv_gen11
Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-27 14:16:08 +02:00
Eric Engestrom
248c593132 meson: avoid changing types for the dri3 option
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-27 11:21:20 +00:00
Eric Engestrom
76e8d61999 meson: simplify the gbm option code, and avoid changing types
v2: drop gallium comment (Dylan)

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-27 11:21:20 +00:00
Samuel Pitoiset
a549da877b ac/nir: clean up a hack about rounding 2nd coord component
It's basically just the opposite, and it only makes sense to
round the layer for 2D texture arrays.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-27 10:09:27 +01:00
Ilia Mirkin
e683a797c6 nvc0: collapse output slots to have adjacent registers
The hardware skips over unallocated slots, so we have to make sure those
registers are packed together.

Fixes KHR-GL45.enhanced_layouts.fragment_data_location_api

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-27 00:10:39 -05:00
Dave Airlie
250468f6b7 radv: expose async compute on SI
It looks like we had all the pieces in place for this,
just never tested it and turned it on.

I don't see any CTS regressions and the computeshader
demo runs.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-27 00:54:59 +00:00
Dave Airlie
1fc19a0f27 radv: merge tess rings into a single bo
Inspired by a passing commit to radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-27 00:54:59 +00:00
Emil Velikov
784d81e97e docs: update calendar, add news and link release notes to 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-27 00:32:14 +00:00
Emil Velikov
d9391014de docs: add sha256 checksums for 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b00880973e)
2018-02-27 00:29:44 +00:00
Emil Velikov
676c58fbdb docs: add release notes for 17.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b3e5a3f35b)
2018-02-27 00:29:43 +00:00
Dylan Baker
b9636fe38a meson: fix building without GL
libgl will be undefined _glx, so move that check inside the
`if with_glx != 'disabled'` block.

v2: - Simplify commit message (Eric, Emil)

Fixes: 5c460337fd ("meson: Fix GL and EGL pkg-config files with glvnd")
Reported-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
CC: Daniel Stone <daniels@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Untested-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-26 09:32:14 -08:00
Lionel Landwerlin
fca9f5b585 intel: aubinator_error_decode: fix segfault on missing register
Some register might be missing in our genxmls. Don't try to decode
them.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-26 16:54:48 +00:00
Eric Engestrom
11d45304fd *-symbol-check: use correct nm path when cross-compiling
Inspired-by: a similar patch for libdrm by Heiko Becker
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-26 13:50:59 +00:00
Karol Herbst
ef308d4007 nvir/gm107: consider FILE_FLAGS dependencies in SchedDataCalculatorGM107
currently while insterting barriers, writes and reads to FILE_FLAGS aren't
considered. This can lead to WaR hazards in some situations.

With the previous commit fixes shaders with intstructions like this:
  mad u32 $r2 $r4 $r11 $r2
  mad u32 { $r5 $c0 } $r4 $r10 $r6
  mad (SUBOP:1) u32 $r3 $r4 $r10 $r2 $c0

Affects OpenCL CTS tests on Maxwell+:
basic/test_basic intmath_long
basic/test_basic intmath_long2
basic/test_basic intmath_long4

v2: only put barriers on instructions which actually read flags

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-02-26 14:41:58 +01:00
Karol Herbst
2f07f823c9 nvir/gm107: iterate over all defs in SchedDataCalculatorGM107::findFirstUse
In the sched data calculator we have to track first use of defs by iterating
over all defs of an instruction, not just the first one.

v2: fix minGRP and maxGRP values

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-02-26 14:41:58 +01:00
Samuel Pitoiset
e05507a427 ac/nir: use ordered float comparisons except for not equal
Original patch from Timothy Arceri, I have just fixed the
not equal case locally.

This fixes one important rendering issue in Wolfenstein 2
(the cutscene transition issue).

RadeonSI uses the same ordered comparisons, so I guess that
what we should do as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-02-26 13:59:04 +01:00
Mauro Rossi
6451b0703f android: vulkan/util: add dependency on libnativewindow for O and later
Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building error:

In file included from out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util/vk_enum_to_str.c:26:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-02-26 14:50:24 +02:00
Mauro Rossi
d448954228 android: anv: add dependency on libnativewindow for O and later
Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow")

Fixes the following building errors:

In file included from external/mesa/src/intel/vulkan/gen7_cmd_buffer.c:30:
In file included from external/mesa/src/intel/vulkan/anv_private.h:72:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal
error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.
...
In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
In file included from external/mesa/src/intel/vulkan/anv_private.h:72:
external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal
error: 'system/window.h' file not found
         ^~~~~~~~~~~~~~~~~
1 error generated.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2018-02-26 14:49:06 +02:00
Mauro Rossi
9a508b719b android: anv/extensions: fix generated sources build
Building rules are aligned to automake ones

The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py
Generation rules for anv_extensions.c requires --out-c option
Generation rules for anv_extensions.h were missing
Necessary include paths are added to avoid following build errors:

cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c':
No such file or directory

In file included from external/mesa/src/intel/vulkan/anv_gem.c:32:
external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30:
external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: dd088d4bec ("anv/extensions: Generate a header file with extension tables")
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-26 14:37:33 +02:00
Marek Olšák
8799eaed99 radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointers
The effect of the last 13 commits on user SGPR counts:

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:19 +01:00
Marek Olšák
3fa7a59d69 radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input
so that it can be removed and replaced with inline VBO descriptors,
and the pointer can be packed in unused bits of VBO descriptors.
This also removes the pointer from merged TES-GS where it's useless.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:08 +01:00
Marek Olšák
c78640ce31 radeonsi: set correct num_input_sgprs for VS prolog in merged shaders
We need to take num_input_sgprs from VS, not the second shader.
No apps suffered from this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:05 +01:00
Marek Olšák
f852b24ce0 radeonsi: allow fewer input SGPRs in 2nd shader of merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:03 +01:00
Marek Olšák
8d6e6b1d7c radeonsi: don't use struct si_descriptors for vertex buffer descriptors
VBO descriptor code will change a lot one day.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-26 12:01:00 +01:00
Daniel Stone
61d6ff3ba3 build: Move wayland-scanner check into platform
Also only check for wayland-scanner if building for the Wayland
platform.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:19 +00:00
Daniel Stone
d33cd875e8 build: Move wayland-protocols check into platform
In line with wayland-client and wayland-server, move the check for
wayland-protocols into the wayland platform branch.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:16 +00:00
Daniel Stone
d8f19d9aa0 vulkan/wsi/wayland: Move Wayland protocol from BUILT_SOURCES
autotools wants to have the BUILT_SOURCES ready as soon as it enters the
directory, even if they are not used. This meant the build failed if
wayland-protocols was not available on the system, even if it was not
enabled.

As BUILT_SOURCES cannot be used in a conditional (cf. 166852ee95), do
the same thing as EGL and manually encode the dependencies in the
Makefile.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Cc: Emil Velikov <emil.velikov@collabora.co.uk>
Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211
2018-02-26 10:43:12 +00:00
Dave Airlie
0cc5be7741 r600: fix tgsi clock last setting
On cayman this was hitting an assert later, which probably wasn't
see on non-cayman due to having the t slot.

Fixes: 9041730d1 (r600: add support for ARB_shader_clock.)
2018-02-26 11:05:45 +10:00
Dave Airlie
4d72a1efea r600: add time lo/hi debugging output.
This just adds the these to the debug prints.
2018-02-26 11:05:26 +10:00
Timothy Arceri
22430224fe radeonsi/nir: enable lowering of fpow
Lowering fpow in NIR rather than LLVM can be beneficial.

Polaris results:

Totals from affected shaders:
SGPRS: 124928 -> 124896 (-0.03 %)
VGPRS: 68616 -> 68332 (-0.41 %)
Spilled SGPRs: 394 -> 413 (4.82 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3668912 -> 3658368 (-0.29 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 18575 -> 18593 (0.10 %)
Wait states: 0 -> 0 (0.00 %)

Fixes: d6b7539206 "ac/nir: remove emission of nir_op_fpow"

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
9873bd9dcd ac: make use of ac_get_llvm_num_components() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
1a757c9c97 gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info
Seems to have not been used since 16be87c904

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
9f7c940840 radeonsi/nir: fix loading of doubles for tess varyings
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Timothy Arceri
81f9d03807 radeonsi/nir: fix lds store in tcs outputs handling
We were ignoring the channel offset.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-26 11:43:47 +11:00
Gert Wollny
c7cadcbda4 r600: Take ALU_EXTENDED into account when evaluating jump offsets
ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU
clause within a IF-JUMP or ELSE branch is ALU_EXTENDED the target jump
offset needs to be adjusted accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-26 10:29:48 +10:00
Francisco Jerez
51562ea7a0 mesa: Expose EXT_shader_framebuffer_fetch(_non_coherent) on desktop and embedded GL.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
c6c64d4d6a glsl: Silence warnings when reading from a framebuffer fetch output.
Framebuffer fetch outputs are implicitly initialized upon entry to the
fragment shader.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
537bb1da98 glsl: Specify framebuffer fetch coherency mode in lower_blend_equation_advanced().
This requires passing an extra argument to the lowering pass because
the KHR_blend_equation_advanced specification doesn't seem to define
any mechanism for the implementation to determine at compile-time
whether coherent blending can ever be used (not even an "#extension
KHR_blend_equation_advanced_coherent" directive seems to be required
in the shader source AFAICT).

In the long run we'll probably want to do state-dependent recompiles
based on the value of ctx->Color.BlendCoherent, but right now there
would be no benefit from that because the only driver that supports
coherent framebuffer fetch is i965 on SKL+ hardware, which are unable
to support the non-coherent path for the moment because of texture
layout issues, so framebuffer fetch coherency is always enabled for
them.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
ef9e3f63ca glsl: Add support for the framebuffer fetch layout(noncoherent) qualifier.
This allows the application to request framebuffer fetch coherency
with per-fragment output granularity.  Coherent framebuffer fetch
outputs (which is the default if no qualifier is present for
compatibility with older versions of the EXT_shader_framebuffer_fetch
extension) will have ir_variable_data::memory_coherent set to true.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
0aeec504b4 glsl: Allow layout token for EXT_shader_framebuffer_fetch_non_coherent.
EXT_shader_framebuffer_fetch_non_coherent requires layout qualifiers
even on GL(ES) 2.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
1bc01db95f glsl: Initialize ir_variable_data::fb_fetch_output earlier for GL(ES) 2.
At the same point where it is initialized on GL(ES) 3.0+ so we can
implement some common layout qualifier handling in a future commit.
Until now the fb_fetch_output flag would be inherited from the
original implicit gl_LastFragData declaration at a later point in the
AST to GLSL IR translation.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
6ebefb0fd5 glsl: Replace MESA_shader_framebuffer_fetch extension flags with EXT ones.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
ba79a90fb5 glsl: Switch ast_type_qualifier to a 128-bit bitset.
This should end the drought of bits in the ast_type_qualifier object.
The bitset_t type works pretty much as a drop-in replacement for the
current uint64_t bitset.

The only catch is that the bitset_t type as defined in the previous
commit doesn't have a trivial constructor (because it has a
user-defined constructor), so it cannot be used as union member
without providing a user-defined constructor for the union (which
causes it in turn to be non-trivially constructible).  This annoyance
could be easily addressed in C++11 by declaring the default
constructor of bitset_t to be the implicitly defined one -- IMO one
more reason to drop support for GCC 4.2-4.3.

The other minor change was required because glsl_parser_extras.cpp was
hard-coding the type of bitset temporaries as uint64_t, which (unlike
would have been the case if the uint64_t had been replaced with
e.g. an __int128) would otherwise have caused a build failure, because
the boolean conversion operator of bitset_t is marked explicit (if
C++11 is available), so the bitset won't be silently truncated down to
1 bit in order to use it to initialize the uint64_t temporaries
(yikes).

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
bdbc2ffa42 util/bitset: Add C++ wrapper for static-size bitsets.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
8d1f1ce412 util: Add EXPLICIT_CONVERSION macro.
This can be used to specify that a C++ conversion operator is not
meant to be used for implicit conversions, which can lead to
unintended loss of information in some cases.  Implemented as a macro
in order to keep old GCC versions happy.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
378e918e28 mesa: Implement glFramebufferFetchBarrierEXT entry point.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
e4124f9bc1 glapi: Update XML for last revision of EXT_shader_framebuffer_fetch.
Desktop GL is now supported, and there is an additional entry-point
for EXT_shader_framebuffer_fetch_non_coherent.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
6a8ec78c2a mesa: Rename MESA_shader_framebuffer_fetch gl_extensions bits to EXT.
The changes I had originally planned for the MESA_shader_framebuffer_fetch
extension have been merged into the EXT spec, there's no point in keeping
MESA_shader_framebuffer_fetch extension enables.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
d0bef79f12 mesa: Rename dd_function_table::BlendBarrier to match latest EXT spec.
This GL entry point was renamed to glFramebufferFetchBarrier() in the
EXT extension on request from Khronos members.  Update the Mesa
codebase to match the latest spec.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Francisco Jerez
27c829da28 i965: Fix KHR_blend_equation_advanced with some render targets.
This reverts two bogus and seemingly useless changes from the commits
referenced below, which broke KHR_blend_equation_advanced (and
EXT_shader_framebuffer_fetch_non_coherent which wasn't exposed yet)
for any kind of render target surface that would cause the
get_isl_surf() call in brw_emit_surface_state() to do anything useful
(notice how the result of get_isl_surf() is completely ignored by the
caller right now), as was the case while using those extensions with
1D array or 3D framebuffers in particular.

Fixes: f5859b45b1 "i965/miptree: Switch remaining surfaces to isl"
Fixes: bf24c3539e "i965/miptree: Clean-up unused"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Marek Olšák
fb410ae392 radeonsi: remove si_descriptors parameter from emit_shader_pointer functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
63ea0a00a3 radeonsi: preload the tess offchip ring in TES
so that it's not done multiple times in branches

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
2d03c4cac8 radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs
TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address
aligned to 512KB. Hey, it's a 13-bit pointer!

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
190e064e63 radeonsi: move 2nd-shader descriptor pointers into s[0:1]
If 32-bit pointers are supported, both pointers can be moved into s[0:1]
and then ESGS has exactly the same user data SGPR declarations as VS.

If 32-bit pointers are not supported, only one pointer can be moved into
s[0:1]. In that case, the 2nd pointer is moved before TCS constants,
so that the location is the same in HS and GS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00
Marek Olšák
1d1df76d2b radeonsi: change si_descriptors::shader_userdata_offset type to short
We will want to use SH registers outside of user data SGPRs, like the GFX9
special SGPRs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
fca7dee9c6 radeonsi: put both tessellation rings into 1 buffer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
d2963d8b5f radeonsi: move tessellation ring info into si_screen
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Marek Olšák
41895c26d3 radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits
For a later patch.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:28 +01:00
Karol Herbst
f0b39779a0 nvir: dont optimize mad with subops to shladd
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-24 18:48:13 +01:00
James Legg
afd8fd0656 radv: Really use correct HTILE expanded words.
When transitioning to an htile compressed depth format, Set the full
depth range, so later rasterization can pass HiZ. Previously, for depth
only formats, the depth range was set to 0 to 0. This caused unwanted
HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer
(VK_FORMAT_D32_SFLOAT was not affected somehow).

These values are derived from PAL [0], since I can't find the
specification describing the htile values.

[0] 5cba4ecbda/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp (L1500)

CC: Dave Airlie <airlied@redhat.com>
CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Fixes: 5158603182 "radv: Use correct HTILE expanded words."
2018-02-24 02:16:22 +01:00
Mauro Rossi
8eed942136 radv/extensions: fix c_vk_version for patch == None
Similar to cb0d1ba156 ("anv/extensions: Fix VkVersion::c_vk_version for patch == None")
fixes the following building errors:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48:
error: use of undeclared identifier 'None'; did you mean 'long'?
      return instance && VK_MAKE_VERSION(1, 0, None) <= core_version;
                                               ^~~~
                                               long
external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION'
    (((major) << 22) | ((minor) << 12) | (patch))
                                          ^
...
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

Fixes: e72ad05c1d ("radv: Return NULL for entrypoints when not supported.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-24 00:31:31 +01:00
Eric Anholt
b4b4ada761 broadcom/vc5: Fix layout of 3D textures.
Cube maps are entire miptrees repeated, while 3D textures have each level
have all of its layers next to each other.  Fixes tex3d and
tex-miplevel-selection GL2:texture() 3D.
2018-02-23 15:07:26 -08:00
Eric Anholt
97dc077303 broadcom/vc5: Ignore unused usage flags in is_format_supported.
Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to
match.  Just drop the whole retval == usage thing and return early when we
hit a known unsupported case.

Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
2018-02-23 15:07:18 -08:00
Eric Anholt
880573e737 gbm: Fix the alpha masks in the GBM format table.
Once GBM started looking at the values of the alpha masks, ARGB/ABGR
wouldn't match any more because we had both A and R in the low bits.

Fixes: 2ed344645d ("gbm/dri: Add RGBA masks to GBM format table")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-23 15:03:36 -08:00
Mathias Fröhlich
b54bf0e3e3 mesa: Update vertex processing mode on _mesa_UseProgram.
The change is a bug fix for 92d76a169:
  mesa: Provide an alternative to get_vp_mode()
that actually got exposed through 4562a7b0:
  vbo: Make use of _DrawVAO from the dlist code.

Fixes: KHR-GLES31.core.shader_image_load_store.advanced-sso-simple
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105229
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 21:08:35 +01:00
Marek Olšák
d169438d8e mesa: rename has_core_gs -> has_gs in get_programiv
This is also true for GLES.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:23 +01:00
Marek Olšák
1881f41b6c mesa: replace some API_OPENGL_CORE checks with _mesa_is_desktop_gl
This is more accurate with respect to the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:22 +01:00
Marek Olšák
1defc973db mesa: add some of missing compatibility support for ARB_bindless_texture
The extension is exposed in the compatibility profile.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:20 +01:00
Marek Olšák
b8e2e9e1a1 mesa: expose ARB_enhanced_layouts in the compatibility profile
GLSL 1.40 is required.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:19 +01:00
Marek Olšák
a0c8b49284 mesa: enable OpenGL 3.1 with ARB_compatibility
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:17 +01:00
Marek Olšák
605a7f6db5 mesa: implement ARB_compatibility
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:15 +01:00
Emil Velikov
14a2c87c41 swr: remove dead LLVM code paths
LLVM requirement was bumped to 4.0.0 with earlier commit.
Hence any code tailored for older versions is now unreachable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-02-23 19:17:31 +00:00
Eric Anholt
5980a41c0f broadcom/vc4: Remove the retval==usage check in is_format_supported().
This got us into trouble recently, so just remove it entirely.
2018-02-23 08:42:13 -08:00
Eric Anholt
bc3d16e633 broadcom/vc4: Add support for YUV textures using unaccelerated blits.
Previously we would assertion fail about having no hardware format.  This
is enough to get kmscube -M nv12-2img working.
2018-02-23 08:42:13 -08:00
Eric Anholt
c824a045ea broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows.
When we set up the shadow resource we were copying the original resource
as the template, including its prsc->next field.  When we shadowed the
first YUV plane's resource for linear-to-tiled conversion, we would end up
unbalancing the refcount on the shadow resource's destruction.
2018-02-23 08:42:13 -08:00
Eric Anholt
6deb158ec1 broadcom/vc4: Add pipe_reference debugging for vc4_bos.
Trying to track down the YUV EGLImage use-after-free, it helps to see what
the mystery objects are that are being refcounted.
2018-02-23 08:42:13 -08:00
Eric Anholt
34ea1aca92 broadcom/vc4: Remove dead vc4_bo_set_reference().
It would be broken if NULL was passed to it anyway, since it wouldn't
participate in screen->bo_handles management.
2018-02-23 08:42:13 -08:00
Eric Anholt
a49738290c broadcom/vc4: Use pipe_resource_reference in sampler views.
Improves u_debug_refcount output.
2018-02-23 08:42:13 -08:00
Eric Anholt
0c1dd9dee0 broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride.
This is part of supporting YUV textures -- MMAL will be handing us a
single GEM BO with the planes at offsets within it, and MMAL-decided
stride.
2018-02-23 08:42:13 -08:00
Eric Anholt
978b884afc broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported().
We were failing the retval == usage check at the end.

Fixes: f7604d8af5 ("st/dri: only expose config formats that are display targets")
2018-02-23 08:42:13 -08:00
Lucas Stach
8df11f3fad etnaviv: fix in-place resolve tile count
TS tiles map to a fixed amount of bytes in the color/depth surface,
so the blocksize of the format needs to be taken into account when
calculating the number of tiles to fill.

The simplest fix is to just use the layer stride, which is the surface
size in bytes.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 15:34:39 +01:00
Lucas Stach
add23b59c9 etnaviv: switch magic single buffer state to "3"
Some of the 16bit formats misrender with missing tiles with the current
"2" state. As all the previously working formats also work with the "3"
state, just always use that one.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 15:34:39 +01:00
Lucas Stach
8befc11186 etnaviv: add debug switch to disable single buffer feature
This feature has caused some trouble already. Add a debug switch to
allow users to quickly check if a specific issue is caused by this
feature.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-02-23 15:34:31 +01:00
Dylan Baker
5c460337fd meson: Fix GL and EGL pkg-config files with glvnd
Currently meson will generate a pkg-config that links to EGL_mesa (or
GLX_mesa), but this isn't correct, it should always link to EGL or GL.
Probably the "right" solution is to have glvnd itself provide the pkg
config files for GL and EGL, but that also means that glvnd needs to
provide many of the header files, which makes it a more involved job.

Fixes: a47c525f32 ("meson: build glx")
Fixes: 035ec7a2bb ("meson: Add support for EGL glvnd")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-23 13:30:28 +00:00
Frank Binns
6160bf97db egl/dri2: fix segfault when display initialisation fails
dri2_display_destroy() is called when platform specific display
initialisation fails. However, this would typically lead to a
segfault due to the dri2_egl_display vbtl not having been set up.

Fixes: 2db9548296 ("loader_dri3/glx/egl: Optionally use a blit
context for blitting operations")
Signed-off-by: Frank Binns <francisbinns@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-23 11:13:22 +00:00
Juan A. Suarez Romero
e1623b303c mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
RGB9_E5 should be accepted by RenderbufferStorage if the
EXT_texture_shared_exponent is exposed. It is left to the
implementations to return GL_FRAMEBUFFER_UNSUPPORTED_EXT
when checking the framebuffer completeness if they do not
support rendering in this format.

Discussed in:
https://github.com/KhronosGroup/OpenGL-API/issues/32

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5

v2: Added more info to the commit message (Antia)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
2018-02-23 10:12:06 +01:00
Christian Gmeiner
e72062b66d etnaviv: npot_tex_any_wrap needs one bit only
Reduces size of struct etna_specs from 100 to 94 bytes.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-23 09:38:16 +01:00
Mathias Fröhlich
4562a7b0e8 vbo: Make use of _DrawVAO from the dlist code.
Finally use an internal VAO to execute display list draws. Avoid
duplicate state validation for display list draws. Remove client arrays
previously used exclusively for display lists.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:14 +01:00
Mathias Fröhlich
2f35140846 mesa: Use atomics for shared VAO reference counts.
VAOs will be used in the next change as immutable object across multiple
contexts. Only reference counting may write concurrently on the VAO. So,
make the reference count thread safe for those and only those VAO objects.

v3: Use bool/true/false for gl_vertex_array_object::SharedAndImmutable.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:11 +01:00
Mathias Fröhlich
8a3a4b6fae vbo: Make use of _DrawVAO from immediate mode draw
Finally use an internal VAO to execute immediate mode draws. Avoid
duplicate state validation for immediate mode draws. Remove client arrays
previously used exclusively for immediate mode draws.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:07 +01:00
Mathias Fröhlich
c757e416ce vbo: Implement tool functions for vbo specific VAO setup.
Correct VBO_MATERIAL_SHIFT value.
The functions will be used next in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:04 +01:00
Mathias Fröhlich
ef8028017d mesa: Add flush_vertices to _mesa_bind_vertex_buffer.
We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:34:01 +01:00
Mathias Fröhlich
354b76ad20 mesa: Make _mesa_vertex_attrib_binding public.
Change vertex_attrib_binding() to _mesa_vertex_attrib_binding(), add a
flush_vertices argument, and make it publicly available.
The function will be needed later in the series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:58 +01:00
Mathias Fröhlich
4331969ac4 mesa: Add flush_vertices to _mesa_{enable,disable}_vertex_array_attrib.
We will need the flush_vertices argument later in this series.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:55 +01:00
Mathias Fröhlich
195bb990ed vbo: Use _DrawVAO for array type draw commands.
Switch over to use the _DrawVAO for all the array type draws.
The _DrawVAO needs to be set before we enter _mesa_update_state, so move
setting the draw method in front of the first call to _mesa_update_state
which is in turn called from the *validate*Draw* calls. Using the
gl_vertex_array_object::_Enabled bitmask, gl_vertex_program_state::_VPMode
and gl_vertex_array_object::_AttributeMapMode we can already set
varying_vp_inputs before we call _mesa_update_state the first time.
Thus remove duplicate state validation.

v2: Update comments.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:50 +01:00
Mathias Fröhlich
6002ab564b vbo: Implement method to track the inputs array.
Provided the _DrawVAO and the derived state that is maintained if we have
the _DrawVAO set, implement a method to incrementally update the array of
gl_vertex_array input pointers.

v2: Add some more comments.
    Rename _vbo_array_init to _vbo_init_inputs.
    Rename vbo_context::arrays to vbo_context::draw_arrays.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:46 +01:00
Mathias Fröhlich
08c7474189 mesa: Introduce a yet unused _DrawVAO.
During the patch series this VAO gets populated with either the currently
bound VAO or an internal VAO that will be used for immediate mode and
dlist rendering.

v2: More comments about the _DrawVAO, filter and enabled mask.
    Rename _DrawVAOEnabled to _DrawVAOEnabledAttribs.
v3: Fix and move comment.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:43 +01:00
Mathias Fröhlich
ce3d2421a0 vbo: Remove get_vp_mode() and enum vp_mode.
Is now unused.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:40 +01:00
Mathias Fröhlich
60c3ca1b23 vbo: Use _VPMode instead of get_vp_mode().
At those places where we used get_vp_mode() use
gl_vertex_program_state::_VPMode instead.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:36 +01:00
Mathias Fröhlich
92d76a1691 mesa: Provide an alternative to get_vp_mode()
To get equivalent information than get_vp_mode(), track the vertex
processing mode in a per context variable at
gl_vertex_program_state::_VPMode.
This aims to replace get_vp_mode() as seen in the vbo module.
But instead of the get_vp_mode() implementation which only gives correct
answers past calling _mesa_update_state() this context variable is
immediately tracked when the vertex processing state is modified. The
correctness of this value is asserted on state validation.

With this in place we should be able to untangle the dependency with
varying_vp_inputs and state invalidation.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 05:33:30 +01:00
Ilia Mirkin
d73f1f2ad8 nv50,nvc0: fix integer MS resolves using 2d engine
We don't want filtering for integer textures, same as depth/stencil.

Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Ilia Mirkin
33ce3569c5 nvc0: fix writing query results into buffer
We need to mark the range as valid, and validate the resource using a
helper to ensure that the buffer status is marked properly.

Fixes some CTS pipeline stats query tests, and
KHR-GL45.direct_state_access.queries_functional

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Ilia Mirkin
f6e4f95668 nv50,nvc0: fix clear buffer acceleration
Two things were off:
 - valid range was not updated, which could affect waiting for future
   maps
 - fencing was done manually instead of using the *_resource_validate
   helper, which resulted in a missed dirty buffer flag being set

Fixes: KHR-GL45.direct_state_access.buffers_clear
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <kherbst@redhat.com>
2018-02-22 20:47:48 -05:00
Lionel Landwerlin
bd9672695b i965: perf: ensure reading config IDs from sysfs isn't interrupted
Fixes: 458468c136 "i965: Expose OA counters via INTEL_performance_query"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-23 01:44:07 +00:00
Bas Nieuwenhuizen
032870beda radv: Fix autotools build.
Somewhere along the way the Makefile changes got lost ...

Fixes: 4db78f3a6b "radv: Put supported extensions in a struct."
Acked-by: Dave Airlie <airlied@redhat.com>
2018-02-23 01:54:12 +01:00
Bas Nieuwenhuizen
e72ad05c1d radv: Return NULL for entrypoints when not supported.
This implements strict checking for the entrypoint ProcAddr
functions.

 - InstanceProcAddr with instance = NULL, only returns the 3 allowed
   entrypoints.
 - DeviceProcAddr does not return any instance entrypoints.
 - InstanceProcAddr does not return non-supported or disabled
   instance entrypoints.
 - DeviceProcAddr does not return non-supported or disabled device
   entrypoints.
 - InstanceProcAddr still returns non-supported device entrypoints.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
414f5e0e14 radv: Reword radv_entrypoints_gen.py
With a big inspiration from anv as always ...

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
076f7cfc6b radv: Track enabled extensions.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen
4db78f3a6b radv: Put supported extensions in a struct.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-23 00:39:02 +01:00
Jose Fonseca
1f5618e81c appveyor: Build with MSVC 2015.
The MSVC version we (at VMware) primarily care about from now on is
2015.

See https://ci.appveyor.com/project/jrfonseca/mesa/build/46

We can drop support for building with 2013 in a future commit.  I'm not
aware of significant changes in C99/C11 support from MSVC 2013 to 2015,
but there's no point in continuing supporting old MSVC versions when
nobody cares.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-22 21:10:20 +00:00
Samuel Pitoiset
d6b7539206 ac/nir: remove emission of nir_op_fpow
fpow is now lowered at NIR level.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:44:46 +01:00
Samuel Pitoiset
7aa008d1d7 radv: enable lowering of fpow to fexp2 and flog2
There is no fpow in hardware, so it's always lowered somewhere,
but it appears that lowering at NIR level is better. Figured while
comparing compute shaders between RadeonSI and RADV.

Polaris10:
Totals from affected shaders:
SGPRS: 18936 -> 18904 (-0.17 %)
VGPRS: 12240 -> 12220 (-0.16 %)
Spilled SGPRs: 2809 -> 2809 (0.00 %)
Code Size: 718116 -> 719848 (0.24 %) bytes
Max Waves: 1409 -> 1410 (0.07 %)

Vega10:
Totals from affected shaders:
SGPRS: 18392 -> 18392 (0.00 %)
VGPRS: 12008 -> 11920 (-0.73 %)
Spilled SGPRs: 3001 -> 2981 (-0.67 %)
Code Size: 777444 -> 778788 (0.17 %) bytes
Max Waves: 1503 -> 1504 (0.07 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:47 +01:00
Samuel Pitoiset
63fb30c674 nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a)
Similar for the 4 case.

Suggested by Bas.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:45 +01:00
Samuel Pitoiset
b18997876f nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b))
Otherwise the code size increases because the original fexp2()
instructions can't be deleted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:43 +01:00
Samuel Pitoiset
a01e9996b5 ac/nir: set GLC=1 for load/store of coherent/volatile images
This disables persistence accross wavefronts.

F1 2017 and Wolfenstein 2 appear to use some coherent images
but this patch doesn't seem to change anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:39:55 +01:00
Samuel Pitoiset
3c40be126f spirv: apply memory qualifiers to images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:39:53 +01:00
Chuck Atkins
540e49e105 glx: Properly handle cases where screen creation fails
This fixes a segfault exposed by a29d63ecf7 which occurs when swr is
used on an unsupported architecture.

v2: re-work to place logic in xmesa_init_display

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-22 10:20:32 -05:00
Iago Toral Quiroga
7668b594e6 anv/blorp: multisample resolve all attachment layers
We were only resolving the first.

v2:
  - Do not require that the number of layers on dst and src are an
    exact match, it is okay if the dst has more layers so long as
    it has at least the same that we are going to resolve.
  - Do not always resolve array_len layers, we should resolve
    only from base_array_layer to array_len.

v3:
  - v2 was assuming that array_len represented the total number of
    layers in the image, but it represents the number of layers
    starting at the base array ayer.

v4:
 - The number of layers to resolve should be taken from the
   framebuffer (Nanley).

Fixes new CTS tests for multisampled layered rendering:
dEQP-VK.renderpass.multisample_resolve.layers_*

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-22 08:23:39 +01:00
Jason Ekstrand
2dce4ac6ac intel/isl: Improve the documentation on get_default_aux_state
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
24952160fd i965: Use finish_external instead of make_shareable in setTexBuffer2
The setTexBuffer2 hook from GLX is used to implement glxBindTexImageEXT
which has tighter restrictions than just "it's shared".  In particular,
it says that any rendering to the image while it is bound causes the
contents to become undefined.

The GLX_EXT_texture_from_pixmap extension provides us with an acquire
and release in the form of glXBindTexImageEXT and glXReleaseTexImageEXT.
The extension spec says,

    "Rendering to the drawable while it is bound to a texture will leave
    the contents of the texture in an undefined state.  However, no
    synchronization between rendering and texturing is done by GLX.  It
    is the application's responsibility to implement any synchronization
    required."

From the EGL 1.4 spec for eglBindTexImage:

    "After eglBindTexImage is called, the specified surface is no longer
    available for reading or writing.  Any read operation, such as
    glReadPixels or eglCopyBuffers, which reads values from any of the
    surface’s color buffers or ancillary buffers will produce
    indeterminate results.  In addition, draw operations that are done
    to the surface before its color buffer is released from the texture
    produce indeterminate results

In other words, between the bind and release calls, we effectively own
those pixels and can assume, so long as we don't crash, that no one else
is reading from/writing to the surface.  The GLX and EGL implementations
call the setTexBuffer2 and releaseTexBuffer function pointers that the
driver can hook.

In theory, this means that, between BindTexImage and ReleaseTexImage, we
own the pixels and it should be safe to track aux usage so we
can avoid redundant resolves so long as we start off with the right
assumption at the start of the bind/release pair.

In practice, however, X11 has slightly different expectations.  It's
expected that the server may be drawing to the image at the same time as
the compositor is texturing from it.  In that case, the worst expected
outcome should be tearing or partial rendering and not random corruption
like we see when rendering races with scanout with CCS.  Fortunately,
the GEM rules about texture/render dependencies save us here.  If X11
submits work to write to a pixmap after the compositor has submitted
work to texture from it, GEM inserts a dependency between the compositor
and X11.  If X11 is using a high-priority context, this will cause the
compositor to get a temporarily boosted priority while the batch from
X11 is waiting on it.  This means that we will never have an actual race
between X11 and the compositor so no corruption can happen.

Unfortunately, however, this means that X11 will likely be rendering to it
between the compositor's BindTexImage and ReleaseTexImage calls.  If we
want to avoid strange issues, we need to be a bit careful about
resolves because we can't really transition it away from the "default"
aux usage.  The only case where this would practically be a problem is
with image_load_store where we have to do a full resolve in order to use
the image via the data port.  Even there it would only be a problem if
batches were split such that X11's rendering happens between the resolve
and the use of it as a storage image.  However, the chances of this
happening are very slim so we just emit a warning and hope for the best.

This commit adds a new helper intel_miptree_finish_external which resets
all aux state to whatever ISL says is the right worst-case "default" for
the given modifier.  It feels a little awkward to call it "finish"
because it's actually an acquire from the perspective of the driver, but
it matches the semantics of the other prepare/finish functions.  This
new helper gets called in intelSetTexBuffer2 instead of make_shareable.
We also add an intelReleaseTexBuffer (we passed NULL to releaseTexBuffer
before) and call intel_miptree_prepare_external in it.  This probably
does nothing most of the time but it means that the prepare/finish calls
are properly matched.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
00926a2730 i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2
The old code made a new miptree that referenced the same BO as the
renderbuffer and just trusted in the memory aliasing to work.  There are
only two ways in which the new miptree is liable to differ from the one
in the renderbuffer and neither of them matter:

 1) It may have a different target.  The only targets that we can ever
    see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
    and the difference between the two doesn't matter as far as the
    miptree is concerned; genX(update_sampler_state) only looks at the
    gl_texture_object and not the miptree when determining whether or
    not to use normalized coordinates.

 2) It may have a very slightly different format.  Again, this doesn't
    matter because we've supported texture views for quite some time so
    we always look at the gl_texture_object format instead of the
    miptree format for hardware setup anyway.

On the other hand, because we were recreating the miptree, we were using
intel_miptree_create_for_bo which doesn't understand modifiers.  We
really want this function to work without doing a resolve so long as you
have modifiers so we need to fix that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
41d45eb21e i965/tex_image: Pull the tex format from the renderbuffer in intelSetTexBuffer2
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
344b57b10b i965/miptree: Loosen the format check in miptree_match_image
This function is used to determine when we need to re-allocate a
miptree.  Since we do nothing different in miptree allocation for
sRGB vs. linear, loosening this should be safe and may lead to less
copying and reallocating in some odd cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Jason Ekstrand
5b1b710e6f i965/state: Ignore intel_obj->_Format for depth/stencil and ETC2
We're about to start letting the intel_obj->_Format be the "real"
texture format.  For depth/stencil textures, this may be a combined
depth stencil format.  For ETC2 on gen7 and earlier, this will be the
actual ETC2 format.  This makes a bit more GL sense but means we have to
be careful in state upload.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2018-02-21 18:18:16 -08:00
Kenneth Graunke
183ce5e629 glsl: Parse 'layout' as a token with advanced blending or bindless
Both KHR_blend_equation_advanced and ARB_bindless_texture provide
layout qualifiers, and are exposed in compatibility contexts.  We
need to parse the layout qualifier as a token in order for those
to work, but forgot to extend this check.

ARB_shader_image_load_store would need a similar treatment, but we
don't expose that in legacy OpenGL contexts.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-02-21 17:50:57 -08:00
Daniel Stone
c7e22483fe vulkan/wsi/x11: Consistently update and return swapchain status
Use a helper function for updating the swapchain status. This will be
used later to handle VK_SUBOPTIMAL_KHR, where we need to make a
non-error status stick to the swapchain until recreation.  Instead of
direct comparisons to VK_SUCCESS to check for error, test for negative
numbers meaning an error status, and positive numbers indicating
non-error statuses.

v2 (Jason Ekstrand):
 - Use a pattern of "return x11_swapchain_result(chain, VK_WHATEVER)"
 - Handle wsi_queue_pull returning VK_TIMEOUT
 - Call x11_swapchain_result in x11_present_to_x11

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
6937c61324 vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails
This most likely means we lost our connection to the X server so
OUT_OF_DATE is reasonable.  This was also the one case where we pushed a
UINT32_MAX into the queue without setting an error condition.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Daniel Stone
bfa22266cd vulkan/wsi/wayland: Add support for zwp_dmabuf
zwp_linux_dmabuf_v1 lets us use multi-planar images and buffer
modifiers.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
c757fd2852 anv/image: Add support for modifiers for WSI
This adds support for the modifiers portion of the WSI "extension".

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
adca1e4a92 anv/image: Separate modifiers from legacy scanout
For a bit there, we had a bug in i965 where it ignored the tiling of the
modifier and used the one from the BO instead.  At one point, we though
this was best fixed by setting a tiling from Vulkan.  However, we've
decided that i965 was just doing the wrong thing and have fixed it as of
5048572352.

The old assumptions also affected the solution we used for legacy
scanout in Vulkan.  Instead of treating it specially, we just treated it
like a modifier like we do in GL.  This commit goes back to making it
it's own thing so that it's clear in the driver when we're using
modifiers and when we're using legacy paths.

v2 (Jason Ekstrand):
 - Rename legacy_scanout to needs_set_tiling

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Jason Ekstrand
f5433e4d6c vulkan/wsi: Add modifiers support to wsi_create_native_image
This involves extending our fake extension a bit to allow for additional
querying and passing of modifier information.  The added bits are
intended to look a lot like the draft of VK_EXT_image_drm_format_modifier.
Once the extension gets finalized, we'll simply transition all of the
structs used in wsi_common to the real extension structs.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-21 22:37:10 +00:00
Daniel Stone
55b27e1e5f vulkan/wsi: Add drm_modifier member to wsi_image
Not yet used anywhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Daniel Stone
61c3feb38d vulkan/wsi: Add multiple planes to wsi_image
Not currently used.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-21 22:37:10 +00:00
Timothy Arceri
cdeac00267 nir: remove old assert
This was originally intended to make sure the remap location
was not -1. However the code has changed alot since then,
the location is now never set to -1 and we also handle
components meaning this old assert has been doing comparisions
with the pointer to the array of component data.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
2018-02-22 09:31:00 +11:00
Timothy Arceri
86098696fc radeonsi/nir: collect more accurate output_usagemask
Fixes assert in the glsl-1.50-gs-max-output-components piglit test.

Note that the double handling will only work for doubles that
don't take up multiple slots i.e. double and dvec2. However
dual slot double handling is an existing bug which is made no
worse by this patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
79dc94828a radeonsi/nir: disable GLSL IR loop unrolling
Delaying unrolling and allowing NIR to do it instead has been shown
to result in better code in drivers such as i965. shader-db results
appear to show the same is true for radeonsi.

The other advantage is that using NIR unrolling improves compile
times significantly.

Totals from affected shaders:
SGPRS: 9624 -> 10016 (4.07 %)
VGPRS: 6800 -> 6464 (-4.94 %)
Spilled SGPRs: 0 -> 2 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 359176 -> 332264 (-7.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1355 -> 1432 (5.68 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
e6269ffc2e radeonsi/nir: fix tess varying loads for doubles
Fixes the following piglit tests:

tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test
tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Timothy Arceri
6d338d757f ac/radeonsi: pass type to load_tess_varyings()
We need this to be able to load 64bit varyings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-22 09:31:00 +11:00
Daniel Stone
eef890b7b1 x11/dri3: Store raw present completion mode
The DRI3 drawable info struct currently stores a boolean for whether the
last completed operation was a flip or not. As we need to track the full
completion mode for handling suboptimal returns, change the 'flipping'
field to the raw present completion mode from the server.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-21 21:57:38 +00:00
Daniel Stone
a6f1952814 x11/dri3: Don't open-code ARRAY_SIZE
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-21 21:57:38 +00:00
Jason Ekstrand
52056206e1 anv: Don't assert that stencil HiZ clears are single-slice
It's true for depth HiZ clears because we only have HiZ on single-slice
images right now.  However, for stencil-only clears there is no such
restriction.

Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-21 13:54:11 -08:00
Jason Ekstrand
7dd0f73fe1 anv: Only copy clear dwords if we're rendering to the first slice
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-02-21 12:47:17 -08:00
Marek Olšák
b494ed168c radeonsi: don't flush when si_eliminate_fast_color_clear is no-op 2018-02-21 20:03:11 +01:00
Marek Olšák
5f55f4c59f radeonsi: make texture_discard_cmask/eliminate functions non-static 2018-02-21 20:03:11 +01:00
James Zhu
81dd4a7637 radeonsi: enable uvd encode for HEVC main
Enable UVD encode for HEVC main profile

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
b38b208ff8 radeonsi:create uvd hevc enc entry
Add UVD hevc encode pipe video codec creation entry

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
e7d51e27ed radeon/uvd:add uvd hevc enc functions
Implement UVD hevc encode functions

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
2b86f5fa0b radeon/uvd:add uvd hevc enc hw ib implementation
Implement required IBs for UVD HEVC encode.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
461508c15c radeon/uvd:add uvd hevc enc hw interface header
Add hevc encode hardware interface for UVD

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
c6acae22c8 winsys/amdgpu:add uvd hevc enc support in amdgpu cs
Support UVD HEVC encode in amdgpu cs

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2018-02-21 13:53:38 -05:00
James Zhu
f0ad908e79 amd/common:add uvd hevc enc support check in hw query
Based on amdgpu hardware query information to check if UVD hevc enc support

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-21 13:53:38 -05:00
Karol Herbst
7319311a50 nvir/nvc0: fix legalizing of ld unlock c0[0x10000]
We have to increase the file index also for 0x10000 not just for values
greater than 0x10000.

Fixes: 37b67db6ae
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-21 11:12:45 +01:00
Samuel Pitoiset
a6accad68f ac/nir: add glsl_is_array_image() helper
For consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-21 09:41:51 +01:00
Samuel Pitoiset
ff83dfb364 ac/nir: set the DA field when performing atomics on 3D images
This doesn't fix anything known but it should definitely be set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-21 09:41:49 +01:00
Eric Anholt
afa7b2f199 i965: Fix compiler warning about write being undefined.
This looks like it should be protected by the assume() about
nr_color_regions, but my compiler warns anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
4636ce362d glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.
Fixes: d32956935e ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
7075c084fc loader: Fix compiler warnings about truncating the PCI ID path.
My build was producing:

../src/loader/loader.c:121:67: warning: ‘%1u’ directive output may be truncated writing between 1 and 3 bytes into a region of size 2 [-Wformat-truncation=]

and we can avoid this careful calculation by just using asprintf (as we do
elsewhere in the file).

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Eric Anholt
1b313eedb5 glsl: Silence warnings in the uniform initializer test about 16-bit types
They should probably get unit tests implemented, but this cleans up a
bunch of warnings in my build for now.

Fixes: 59f458cd87 ("glsl: Add 16-bit types")
Cc: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Jordan Justen
96fe36f7ac i965: Enable disk shader cache by default
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-20 18:49:43 -08:00
Dave Airlie
baa0feb73d radv: don't send num_tcs_input_cp to sgprs.
We never use it in the shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:36 +00:00
Dave Airlie
952222ddd4 radv/tess: don't need to look in constant for vertices_per_patch
This just avoids passing this value via user sgprs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:28 +00:00
Dave Airlie
77fd1b9187 ac/radv: cleanup some tcs output values access
Just consolidates some code to make it easier to change.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:23 +00:00
Dave Airlie
0e6f0d400b ac/radv: remove total_vertices variable
This just removes an unneeded variable.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:19 +00:00
Dave Airlie
e9b9fb3616 ac/radv: don't mark tess inner as used if we don't use it.
This just avoids marking it as a used output if we don't
actually use it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-21 00:01:15 +00:00
Dave Airlie
d5b2d7ed67 ac/nir: to integer the args to bcsel.
dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw
was hitting an llvm assert due to one value being an int and the
other a float.

This just casts both values to integer and fixes the test.

Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-20 23:15:18 +00:00
Jason Ekstrand
c66fb12117 anv/blorp: Use layout_to_aux_usage when a layout is provided
Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me
something reasonable" we now use anv_layout_to_aux_usage whenever a
layout is available.  If a layout is available, we ignore the aux_usage
parameter.  For the cases where we have an explicit aux usage such as
clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:17 -08:00
Jason Ekstrand
0fa040e6f5 anv/cmd_buffer: Delete some assert-only variables
Checking the sample count is almost as good as aux usage in this case.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:16 -08:00
Jason Ekstrand
e10a62662b anv/cmd_buffer: Use layout_to_* helpers in compute_aux_usage
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:14 -08:00
Jason Ekstrand
7ea8131aa0 anv/cmd_buffer: Simplify transition_depth_buffer
If we don't have HiZ, then anv_layout_to_aux_usage will return NONE for
both layouts.  If the two layouts are the same, they will get the aux
usage.  In either case, the code below will give us ISL_AUX_OP_NONE and
we'll return without doing anything.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-20 13:57:09 -08:00
Jason Ekstrand
87e86ee2e6 anv/cmd_buffer: Do subpass image transitions in begin/end_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
7d5f6b6088 anv/cmd_buffer: Mark depth/stencil surfaces written in begin_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
8a3f086a42 anv/cmd_buffer: Sync clear values in begin_subpass
This is quite a bit cleaner because we now sync the clear values at the
same time as we do the fast clear.  For loading the clear values into
the surface state, we now do it once when we handle the LOAD_OP_LOAD
instead of every subpass.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
a4136b8c1a anv/pass: Store usage in each subpass attachment
This requires us to ditch the VkAttachmentReference struct in favor of
an anv-specific struct.  However, we can now easily identify from just
the subpass attachment what kind of an attachment it is.  This will make
iteration over anv_subpass::attachments a little easier in some case.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
bd356e1bcf anv/cmd_buffer: Add a concept of pending load aspects
These are the same as pending clear aspects only for the "load"
operation.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
e526d49edd anv/cmd_buffer: Iterate all subpass attachments when clearing
This unifies things a bit because we now handle depth and stencil at the
same time.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:25 -08:00
Jason Ekstrand
2cc3445eb2 anv/cmd_buffer: Decide whether or not to HiZ clear up-front
This moves the decision out of begin_subpass and into BeginRenderPass
like the decision for color clears.  We use a similar name for the
function for depth/stencil as for color even though no aux usage is
really getting computed.

v2 (Jason Ekstrand):
 - Don't always disable HiZ clears by accident
 - Use the initial layout to decide whether to do fast clears

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
6fc8555610 anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
7991838973 intel/blorp: Add a blorp_hiz_clear_depth_stencil helper
This is similar to blorp_gen8_hiz_clear_attachments except that it takes
actual images instead of trusting in the already set depth state.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
1900dd76d0 anv/cmd_buffer: Move the color portion of clear_subpass into begin_subpass
This doesn't really change much now but it will give us more/better
control over clears in the future.  The one interesting functional
change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and
friends for each clear.  However, this only happens at begin_subpass
time so it shouldn't be substantially more expensive.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
6fb9d6c6f5 anv/cmd_buffer: Pass a subpass id into begin_subpass
This is a bit less awkward than passing in the subpass because it means
we don't have to extract the subpass id from the subpass.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
01223b8199 anv/cmd_buffer: Add begin/end_subpass helpers
Having begin/end_subpass is a bit nicer than the begin/next/end hooks
that Vulkan gives us.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
b5bd3fb4e4 anv/cmd_buffer: Apply subpass flushes before set_subpass
This seems slightly more correct because it means that the flushes
happen before any clears or resolves implied by the subpass transition.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
869448a8ab anv: Use framebuffer layers for implicit subpass transitions
Fixes: de3be61801 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
85d0bec961 anv: Be more careful about fast-clear colors
Previously, we just used all the channels regardless of the format.
This is less than ideal because some channels may have undefined values
and this should be ok from the client's perspective.  Even though the
driver should do the correct thing regardless of what is in the
undefined value, it makes things less deterministic.  In particular, the
driver may choose to fast-clear or not based on undefined values.  This
level of nondeterminism is bad.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
4796025ba5 intel/isl: Add an isl_color_value_is_zero helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:24 -08:00
Jason Ekstrand
116e818ef1 anv/gpu_memcpy: CS Stall before a MI memcpy on gen7
This fixes a pile of hangs caused by the recent shuffling of resolves
and transitions.  The particularly problematic case is when you have at
least three attachments with load ops of CLEAR, LOAD, CLEAR.  In this
case, we execute the first CLEAR followed by a MI memcpy to copy the
clear values over for the LOAD followed by a second CLEAR.  The MI
commands cause the first CLEAR to hang which causes us to get stuck on
the 3DSTATE_MULTISAMPLE in the second CLEAR.

We also add guards for BLORP to fix the same issue.  These shouldn't
actually do anything right now because the only use of indirect clears
in BLORP today is for resolves which are already guarded by a render
cache flush and CS stall.  However, this will guard us against potential
issues in the future.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:19 -08:00
Guillaume Charifi
a572ec2efe st/mesa: Factorize duplicate code for atomic buffer binding
Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 20:54:49 +01:00
Guillaume Charifi
56bfcd50f7 st/mesa: Factorize duplicate code in st_update_framebuffer_state()
Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 20:54:49 +01:00
Rob Clark
4c4e6232ee freedreno/ir3: fix use_count refcnt'ing issue
Was hitting an assert with vs-varying-array-mat4-index-col-row-wr.shader_test

When eliminating a copy, we were dropping the use_count of the mov that
is skipped, but not increasing the use_count of it's src instruction.

Fixes: 76440fcca9 freedreno/ir3: clean up dangling false-dep's
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-20 13:43:42 -05:00
Eric Engestrom
ac731531a1 docs: fix patent url
Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 15:14:34 +00:00
Brian Paul
e7d1a93723 svga: replaced 'unsigned' with proper enum types in shader code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-20 08:11:06 -07:00
Jonathan Gray
9401d90a53 configure.ac: pthread-stubs not present on OpenBSD
pthread-stubs is no longer required on OpenBSD and has been removed.
libpthread parts involved moved to libc.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 15:08:47 +00:00
Andres Gomez
36ac485bd1 swr: bump minimum supported LLVM version to 4.0
Since radv and radeonsi removed support for LLVM 3.9 the distcheck
target got broken because SWR distribution needed 3.9.x.

After checking with George Kyriazis, SWR is OK with moving to LLVM 4.0
and above, which will solve this problem.

Fixes: 3bf1e036e8 ("amd: remove support for LLVM 3.9")
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2018-02-20 17:03:06 +02:00
Andres Gomez
b39f6d5fc7 travis: radeonsi and radv need LLVM 4.0
Fixes: 3bf1e036e8 ("amd: remove support for LLVM 3.9")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-20 16:58:30 +02:00
Samuel Pitoiset
1ac741d690 ac/nir: move ac_declare_lds_as_pointer() outside of the switch
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-20 10:44:59 +01:00
Samuel Pitoiset
b5d111ae76 radv: allow to force family using RADV_FORCE_FAMILY
Useful for pipeline-db.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-20 10:44:47 +01:00
Thomas Hellstrom
f386776ea5 loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback
Removing this callback caused rendering corruption in some multi-screen cases,
so it is reinstated but without the drawable argument which was never used
by implementations and was confusing since the drawable could have been
created with another screen.

Cc: "17.3 18.0" mesa-stable@lists.freedesktop.org
Fixes: 5198e48a0d (loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105013
Reported-by: Daniel van Vugt <daniel.van.vugt@canonical.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-20 10:36:53 +01:00
Thomas Hellstrom
80c31f7837 svga: Fix a leftover debug hack
Fix what appears to be a leftover debug hack.
The hack would force the driver to take a different blit path; possibly,
although unverified, reverting to software blits.

Tested using piglit tests/quick. No related regressions.

Cc: "17.2 17.3 18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 9d81ab7376 (svga: Relax the format checks for copy_region_vgpu10 somewhat)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104625
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-20 10:12:19 +01:00
Iago Toral Quiroga
af5f2322d0 anv/entrypoints: make vkGetDeviceProcAddr return NULL for instance commands
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-20 08:12:32 +01:00
Ilia Mirkin
e1a70aed10 nv50,nvc0: mark ABGR format as displayable instead of ARGB format
This matches the hardware's capabilities.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-19 22:33:58 -05:00
Ilia Mirkin
f7604d8af5 st/dri: only expose config formats that are display targets
In the case of NVIDIA hardware, ABGR is displayable but ARGB is not.
Only advertise the one set in the visuals list.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-02-19 22:33:58 -05:00
Ilia Mirkin
ebdc4c31e2 mesa: add xbgr support adjacent to xrgb
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Daniel Stone <daniels@collabora.com>
2018-02-19 22:33:58 -05:00
Timothy Arceri
d88a2906f8 st/shader_cache: copy nir pointer to gl_program after deserializing
This fixes a crash when running the arb_get_program_binary-api-errors
piglit test twice.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Timothy Arceri
691c320de0 radeonsi: add nir shader cache support
In future we might want to try avoid calling nir_serialize() but
this works for now.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Timothy Arceri
2b431808ab radeonsi: rename variables tgsi_binary -> ir_binary
This better represents that the ir could be either tgsi or nir.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-20 13:15:02 +11:00
Emil Velikov
1270990438 docs: update calendar, add news and link release notes to 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-19 22:10:18 +00:00
Emil Velikov
be5a996039 docs: add sha256 checksums for 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 164a993112)
2018-02-19 22:08:14 +00:00
Emil Velikov
ca614d40cd docs: add release notes for 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2529d77179)
2018-02-19 22:08:12 +00:00
Marek Olšák
f78fe98fff radeonsi: fix regression from 32-bit pointers on CI
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-02-19 17:56:23 +01:00
Samuel Pitoiset
549c7f3724 radv: compact varyings after removing unused ones
It makes no sense to compact before, and the description of
nir_compact_varyings() confirms that.

Polaris10:
Totals from affected shaders:
SGPRS: 108528 -> 108128 (-0.37 %)
VGPRS: 74548 -> 74500 (-0.06 %)
Spilled SGPRs: 844 -> 814 (-3.55 %)
Code Size: 3007328 -> 2992932 (-0.48 %) bytes
Max Waves: 16019 -> 16009 (-0.06 %)

Vega10:
Totals from affected shaders:
SGPRS: 106088 -> 106232 (0.14 %)
VGPRS: 74652 -> 74700 (0.06 %)
Spilled SGPRs: 692 -> 658 (-4.91 %)
Code Size: 2967708 -> 2953028 (-0.49 %) bytes
Max Waves: 18178 -> 18162 (-0.09 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-19 12:19:17 +01:00
Timothy Arceri
51e745cf77 radeonsi/nir: fix gl_FragCoord for pixel_center_integer
Fixes piglit test glsl-arb-fragment-coord-conventions

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-19 08:47:48 +11:00
Timothy Arceri
347038baa9 glsl/nir: add pixel_center_integer to shader info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-19 08:47:48 +11:00
Ilia Mirkin
fe76fc11b1 gm107/ir: avoid using kepler instruction capabilities
Split up the op properties table into generation-specific bits, and only
use the kepler ones on kepler. Fixes some CTS images tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-02-17 23:41:21 -05:00
Ilia Mirkin
f08fd676bf nvc0: add support for bindless on maxwell+
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 23:41:21 -05:00
Ilia Mirkin
0255550eb1 gm107/ir: change how SUQ works in preparation for bindless
All this information can be retrieved from the TIC directly. Avoid
having to dip into the constbuf information about the image.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 23:41:21 -05:00
Kenneth Graunke
fa8a764b62 i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address.  This makes it unusable for pushing UBOs.

There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer.  Setting that gives us full
flexibility.  This lets us push up to 4 UBO ranges.

We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance.  We also need a brand new kernel that supports
context isolation - on older kernels, newly created contexts inherit
register state from whatever happened to be running.  So, setting this
would have catastrophic impact on other drivers such as libva, Beignet,
or older Mesa.

See commit 8ec5a4e4a4 where we did this
once before, but had to revert it in commit 013d331220.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-17 11:26:31 -08:00
Kenneth Graunke
a63c74be85 i965: Stop restoring the default L3 configuration on Kernel 4.16+.
Kernel 4.16 has proper context isolation, which means we can change
the L3 configuration without worrying about that leaking to other
newly created contexts, breaking the assumptions of other userspace.

So, disable our workaround to reprogram it back to the default.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-17 11:26:18 -08:00
Mikko Perttunen
5a1606c51f nvc0: Use GP100_COMPUTE_CLASS on GP10B
GP10B requires the use of GP100_COMPUTE_CLASS instead of
GP104_COMPUTE_CLASS as is used for other non-GP100 chips.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-17 14:16:10 -05:00
Daniel Stone
9d21dbeb88 i965: Fix aux-surface size check
The previous commit reworked the checks intel_from_planar() to check the
right individual cases for regular/planar/aux buffers, and do size
checks in all cases.

Unfortunately, the aux size check was broken, and required the aux
surface to be allocated with the correct aux stride, but full image
height (!).

As the ISL aux surface is not recorded in the DRIimage, we cannot easily
access it to check. Instead, store the aux size from when we do have the
ISL surface to hand, and check against that later when we go to access
the aux surface.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: c2c4e5bae3 ("i965: Fix bugs in intel_from_planar")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-17 10:22:35 +00:00
Marek Olšák
931ec80eeb radeonsi: implement 32-bit pointers in user data SGPRs (v2)
User SGPRs changes:
    VS:     14 ->  9
    TCS:    14 -> 10
    TES:    10 ->  6
    GS:      8 ->  4
    GSCOPY:  2 ->  1
    PS:      9 ->  5
    Merged VS-TCS: 24 -> 16
    Merged VS-GS:  18 -> 11
    Merged TES-GS: 18 -> 11

SGPRS: 2170102 -> 2158430 (-0.54 %)
VGPRS: 1645656 -> 1641516 (-0.25 %)
Spilled SGPRs: 9078 -> 8810 (-2.95 %)
Spilled VGPRs: 130 -> 114 (-12.31 %)
Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
Code Size: 52094872 -> 52692540 (1.15 %) bytes
Max Waves: 371848 -> 372723 (0.24 %)

v2: - the shader cache needs to take address32_hi into account
    - set amdgpu-32bit-address-high-bits

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2018-02-17 04:52:17 +01:00
Marek Olšák
5722cd4084 radeonsi: disallow constant buffers with a 64-bit address in slot 0
State trackers must use a user buffer or const_uploader,
or set pipe_resource::flags same as const_uploader->flags.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
d790b6cece radeonsi: move const_uploader allocations to 32-bit address space
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
50581549b7 winsys/radeon: implement and enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
1104d1e9d3 winsys/radeon: add struct radeon_vm_heap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
48ecacfefa winsys/amdgpu: enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
c2da45be86 gallium/radeon: add 32-bit address space heaps
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-17 04:52:17 +01:00
Marek Olšák
0977b7f7b3 ac: query high bits of 32-bit address space 2018-02-17 04:51:58 +01:00
Marek Olšák
16be55da94 gallium: use PIPE_CAP_CONSTBUF0_FLAGS 2018-02-17 04:20:55 +01:00
Marek Olšák
8e7222f4e5 gallium: allow drivers to impose BO flags restrictions on constant buffer 0
Required by radeonsi for optimal behavior.
2018-02-17 04:20:55 +01:00
Alexander von Gluck IV
834d221512 meson: Add Haiku platform support v4
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-16 16:56:34 -06:00
Anuj Phogat
7b283544dc anv/icl: Add render target flush after uploading binding table
The PIPE_CONTROL command description says:

"Whenever a Binding Table Index (BTI) used by a Render Taget Message
points to a different RENDER_SURFACE_STATE, SW must issue a Render
Target Cache Flush by enabling this bit. When render target flush
is set due to new association of BTI, PS Scoreboard Stall bit must
be set in this packet."

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
136f583a24 anv/icl: Enable float blend optimization
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
cd7102972f anv/icl: Use gen11 functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
9673c21d4f anv/icl: Build anv libs for gen11
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
1f108b436b anv/icl: Generate gen11 entry point functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
a86c0a08df anv/icl: Don't use DISPATCH_MODE_SIMD4X2
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
cd5fc634a8 anv/icl: Don't use SingleVertexDispatch
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
6e3940b3cf anv/icl: Don't set ResetGatewayTimer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
41a4c2c8e8 anv/icl: Add #define genX
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Anuj Phogat
413d475b44 anv/icl: Add gen11 mocs defines
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Kenneth Graunke
1d6cf433d2 i965: Implement GenerateMipmap directly, rather than using Meta.
Meta is awful and we'd like to stop using it.  Implementing this using
BLORP allows us to stop trashing a bunch of GL state every time.

This follows the structure of st_generate_mipmap().
compute_num_levels is lifted directly from there.

Improves performance in Gl41HdrBloom by about 11.794% +/- 1.01919% (n=3)
on Kabylake GT2 at 1280x720 (the difference seems much smaller at higher
resolutions).

v2 (idr): Don't try depth or depth-stencil blorp blits on Gen4 or Gen5
because it's not implemented yet.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-02-16 10:48:10 -08:00
Kenneth Graunke
9bcd31ea90 mesa: Move compute_num_levels from st_gen_mipmap.c to mipmap.c.
I want to use compute_num_levels inside i965.  Rather than duplicating
it, move it from mesa/st to core Mesa, and make it non-static.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-16 10:48:10 -08:00
Dylan Baker
03ab40b1f7 meson: freedreno depends on nir
This fixes a race condition in building targets that link in freedreno.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120
Fixes: 0bbecc5a85 ("meson: define driver dependencies")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Mark Janes <mark.a.janes@intel.com>
2018-02-16 10:10:18 -08:00
George Kyriazis
f1fbeb1a53 swr/rast: blend_epi32() should return Integer, not Float
fix gcc8 compiler error for KNL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105029
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
7dd793d10c swr/rast: Normalize path for debug metadata
in template gen_llvm.hpp

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
f979d0bc2f swr/rast: Consolidate archrast Draw events
Consolidate archrst draw events into single draw event with an attribute
that represents the type of draw

- Add handlers for new private proto versions of DrawInstancedEvent,
  DrawIndexedInstancedEvent, DrawInstancedSplitEvent, and
  DrawIndexedInstancedSplitEvent
- Convert the draw events to generic DrawInfoEvents
- parse_proto_event_fields() replaces 'AR_DRAW_TYPE' as a field type with
  'uint32_t'. This draw type is actually an enum, but can be represented
  as an unsigned integer.
- is_draw_or_dispatch() recognizes DrawInfoEvent as a draw event

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
45df1a6520 swr/rast: Add semantics for translating address
Added support for another full translation path in fetch jitter.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:02 -06:00
George Kyriazis
c09483cf0a swr/rast: Convert C Sampler intrinsics
Convert portions of the C sampler to the rasty SIMD lib.

Also fix SRL call with a non-immediate.  Don't count on the compiler
automagically converting an srli call to srl if the shift count isn't
an immediate.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
37ebf86add swr/rast: Make SIMDLib templated types easier to use
"typename SIMD_T::TypeName" --> "TypeName<SIMD_T>"

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
74e8bb4a22 swr/rast: Be more explicit when fetching next component
Use a new function to denote that we want to get offset to next component
and hide the fact that GEP is used underneath.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
da77eb55d5 swr/rast: Fix bug related to passing AR handle
We were passing a garbage handle. Let's not do that.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
48d62409f8 swr/rast: Fix primitive replication issue in tesselation PA.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
e12db47a7d swr/rast: Use llvm intrinsic masked gather
Use llvm intrinsic masked.gather instead of manual unroll for the cases
where we have vector of pointers.  Improves llvm IR debug experience by
reducing a ton of IR to a single intrinsic call. Also seems to reduce
overall stack use considerably.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:01 -06:00
George Kyriazis
9cc9688e49 swr/rast: Misc cleanup
Together with correct detection of clipDistance NaNs when no cullDistance is set

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
036c8b6247 swr/rast: Renamed variable in vertexbufferstate
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
b25efa36e6 swr/rast: Fix GATHERPS to avoid assertions.
With the pBase type change, LLVM was asserting because of wrong types.
Cast appropriately.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
8a64593bde swr/rast: More precise user clip distance interpolation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
3e560b7c85 swr/rast: Cull prims when all verts have negative clip distances
Performance optimization, and fixes some clipping issues.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
cb4b604ebd swr/rast: whitespace and comment cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:54:00 -06:00
George Kyriazis
5df4d98780 swr/rast: Fix invalid number of attributes
Fix invalid number of attributes passed into tesselation PA.
Needs to take into account any offsets from the shader.
Innocuous issue, but removes an assert firing in debug.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
2053472723 swr/rast: Add clipper stats.
Clipper event is now:

event ClipperEvent
{
    uint32_t drawId;
    uint32_t trivialRejectCount;
    uint32_t trivialAcceptCount;
    uint32_t mustClipCount;
};

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
0420b2be89 swr/rast: Separate event types to public and private
Split into two proto files and modify appropriate build rules for
configure / scons / meson builds.

There are private internal events (proxy) that communicate information
from rasterizer to ArchRast. ArchRast can use these events to calculate
a final answer and then emit other public events which will be saved to
file. Users will use the public proto file and not the private one.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
e48dd2489c swr/rast: Clean up event types and remove BE events
Begin/End events not needed anymore.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
7070027d7b swr/rast: Removed unused variable
Gets rid of zillions of unused variable warnings, made worse by templates.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
e3f92bb7af swr/rast: Separate RDTSC code from archrast
Renamed rdstc defines more appropriately

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:59 -06:00
George Kyriazis
8bce71622e swr/rast: Cleanup of mpPrivateContext in Builder
Provide access functions for mpPrivateContext in Builder.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
5697dc3e23 swr/rast: Remove some JIT debug code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
2407b8c9b4 swr/rast: Don't include private context in gather args
Move mpPrivateContext to compensate

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:58 -06:00
George Kyriazis
a4c23fc25b swr/rast: Cleanup knob definitions
Rename some of the categories and move some options around.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:53:42 -06:00
George Kyriazis
ec34ed73d6 swr/rast: Add missing parameter to a few gather functions
We now pass pDrawContext as a default parameter

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-02-16 10:39:42 -06:00
Philipp Zabel
bfe4e24a42 etnaviv: add useful information to BO import errors
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-02-16 17:05:43 +01:00
Daniel Stone
ff5432dc50 egl/wayland: Always use in-tree wayland-egl-backend.h
A recent patchset to Wayland[0] migrated Mesa's libwayland-egl backend
into Wayland itself, so implementations could provide backends. Mesa
still uses its own, and the two have already diverged[1].

The include from egl_dri2.h could pick up either the installed Wayland
wayland-egl-backend.h (with a 'driver_private' member), or the Mesa
internal wayland-egl-backend.h (with a 'private' member), failing the
build in the first instance.

Add an explicit directory prefix to the include, so we always get our
in-tree version.

[0]: https://patchwork.freedesktop.org/series/31663/
[1]: https://cgit.freedesktop.org/wayland/wayland/commit/?id=9fa60983b579

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105103
Fixes: 198af27c67 ("wayland-egl: rename wayland-egl-{priv,backend}.h")
2018-02-16 14:04:19 +00:00
Daniel Stone
f766e1afa5 meson: Move Wayland dmabuf to wayland-drm
As the comment notes: linux-dmabuf has nothing to do with wayland-drm,
but we need a single place to build these files we can use from both EGL
and Vulkan, which is guaranteed to be included before both EGL and
Vulkan WSI.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-02-16 14:04:19 +00:00
Eric Engestrom
65dda6c9ec egl/wayland: check for invalid format index
v2: just tell the compiler to assume the format will always be found, as
it comes from the table itself to begin with. (DanielS)

CID: 1429516
Fixes: d32b23f383 "egl/wayland: Add bpp to visual map"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-16 13:14:29 +00:00
Eric Engestrom
a176b053b6 glsl: fix sizeof(pointer) bug
Doesn't really change anything to the test though ¯\_(ツ)_/¯

CID: 1429511
Fixes: e8495646af "glsl/tests: changes to test_disk_cache_create test"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-16 12:04:29 +00:00
Timothy Arceri
2f5d3df9fc radeonsi/nir: set TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL correctly
We set this for post_depth_coverage in addition to early_fragment_tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-16 15:53:13 +11:00
Dave Airlie
60c14a0db2 virgl: remap query types to hw support.
The gallium query types changed, so we need to remap from the
gallium ones to the virgl ones.

Fixes:
dEQP-GLES3.functional.transform_feedback.basic_types*

"This also fixes:

dEQP-GLES3.functional.transform_feedback.array.separate*
dEQP-GLES3.functional.transform_feedback.array_element*
dEQP-GLES3.functional.transform_feedback.interpolation.*

Gallium's p_defines.h and virglrenderer's p_defines.h have diverged
quite a bit, so not including
PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE there makes sense for now."
 - Gurchetan Singh

Fixes: 3f6b3d9db (gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-16 12:42:06 +10:00
Anuj Phogat
8a05b06146 i965/icl: Add render target flush after uploading binding table
From PIPE_CONTROL command description in gfxspecs:

"Whenever a Binding Table Index (BTI) used by a Render Taget Message
 points to a different RENDER_SURFACE_STATE, SW must issue a Render
 Target Cache Flush by enabling this bit. When render target flush
 is set due to new association of BTI, PS Scoreboard Stall bit must
 be set in this packet."

V2: Move the PIPE_CONTROL to update_renderbuffer_surfaces() in
    brw_wm_surface_state.c (Ken).

Fixes a fulsim error and a GPU hang described in below JIRA.
JIRA: MD5-322
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
3f8289164f i965/icl: Enable float blend optimization and Wa3DStateMode
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
ba3cbee6c5 intel/common/icl: Add has_sample_with_hiz flag in gen_device_info
Sampling from hiz is enabled in i965 for GEN9+ but this feature has
been removed from gen11. So, this new flag will be useful to turn
the feature on/off for different gen h/w. It will be used later
in a patch adding device info for gen11.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
9c144dc81e i965/icl: Add assertions to check dispatch mode is SIMD8
SIMD4x2 dispatch mode has been removed in GEN11. We're not using
it anyways in Mesa. Adding few asserts to make it explicit.

Use GEN_GEN macro in place of devinfo->gen (Ken)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
02e91b6d62 i965/icl: Update switch statements
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
27d0034938 i965/icl: Update the assert in brw_memory_barrier()
Nothing is changed here from gen10 to gen11. So, just update
the assert.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
d6b26649a6 i965/icl: Define and use icl mocs settings
Gen11 MOCS settings are duplicate of Gen10 MOCS settings.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
e9ad5c9a5d i965/icl: Update the comment for maximum number of threads per PSD
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
93f601d7ed i965/icl: Build and use gen11 functions for genxml state-upload and blorp
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:56 -08:00
Anuj Phogat
85f319155f i965/icl: Don't set ResetGatewayTimer
This field is removed in gen11+

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
772a75be46 intel/icl: Do StateCacheInvalidation for indirect clear color
StateCacheInvalidation is required on all gen7+ platforms. We
don't need to update this check for every new gen h/w unless
this requirement is changed. So, dropping the check for latest
gen h/w.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
bff24e2173 intel/isl/icl: Build and use gen11 surface state emit functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:55 -08:00
Anuj Phogat
0427bd4954 intel/isl/icl: Add the maximum surface size limit
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
c68ede0be7 intel/genxml/icl: Update genx_bits header
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
165a68b05a intel/genxml/icl: Generate packing headers
Move build system changes in to one patch (Ken, Emil)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:55 -08:00
Anuj Phogat
7ed27d8cbf intel/genxml/icl: Add gen11.xml
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 16:14:55 -08:00
Kenneth Graunke
4dee8f0548 i965: Drop EXEC_OBJECT_CAPTURE defines.
These only existed to avoid making people update libdrm for new uABI
headers.  A while ago we imported those headers into the Mesa repo,
so the dependency is gone and these are no longer useful.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-15 15:35:52 -08:00
Jan Vesely
78673b614b clover: Fix build after llvm r325155 and r325160
r325155 ("Pass a reference to a module to the bitcode writer.")
and
r325160 ("Pass module reference to CloneModule")

change function interface from pointer to reference.

v2: Fix indentation (tab instead of spaces)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-02-15 18:18:53 -05:00
Bas Nieuwenhuizen
05d84ed68a radv: Always lower indirect derefs after nir_lower_global_vars_to_local.
Otherwise new local variables can cause hangs on vega.

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105098
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-15 23:45:59 +01:00
Dylan Baker
2ab1ce30c4 meson: fix xvmc target linkage
This needs to link the state tracker with --whole-archive to expose the
right symbols.

v4: - Always add libswdri and libswkmsdri to the link_with list

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:38:43 -08:00
Dylan Baker
0b73c329bc meson: Fix xa target linkage
This needs to use --whole-archive (link_whole in meson) to properly
expose symbols.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 0ba909f0f1 ("meson: build gallium xa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:31 -08:00
Dylan Baker
91a59b6287 meson: Fix omx-bellagio target linkage
This needs to use --whole-archive (link_whole in meson) to properly
expose symbols.

v4: - Always add libswdri and libswkmsdri to link_with

Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:26 -08:00
Dylan Baker
2e4be28fb2 meson: fix va target linkage
The state tracker needs to be linked with whole-archive (like
autotools). As a result there are symbols from libswdri and libswkmsdri
that are needed, so link those as well.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:16 -08:00
Dylan Baker
90d361753c meson: fix vdpau target linkage
The VDPAU state tracker needs to be linked with whole-archive (autotools
does this). Because we are linking the whole archive we alos need to
link with libswdri and libswkmsdri if those have been enabled.

v4: - Always add libswdri and libswkmsdri to link_with list

Fixes: 68076b8747 ("meson: build gallium vdpau state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:09 -08:00
Dylan Baker
3403055768 meson: Actually link xvmc target with libxvmc
Unlike vdpau this is required.

Fixes: 22a817af8a ("meson: build gallium xvmc state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:36:04 -08:00
Dylan Baker
7708103857 meson: actually link with libomxil-bellagio
This state tracker actually needs to link, unlike vdpau.

Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:57 -08:00
Dylan Baker
7023b373ec meson: link dri3 xcb libs into vlwinsys instead of into each target
This makes the dependencies easier to manage, since each media target
doesn't need to worry about linking to half a dozen libraries.

Fixes: b1b65397d0 ("meson: Build gallium auxiliary")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:51 -08:00
Dylan Baker
424e654cb0 meson: use va-api version reported by pkg-config
Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:47 -08:00
Dylan Baker
8eb608df61 meson: add libswdri and libswkmsdri to dri link_with
Fixes: b154b44ae3 ("meson: build radeonsi gallium driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:42 -08:00
Dylan Baker
be879f9f29 meson: add libswdri and libswkmsdri to d3dadaptor link_with
v5: - Fix libswdi -> libswdri typo

Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:36 -08:00
Dylan Baker
d672084ba2 meson: define empty variables for libswdri and libswkmsdri
This allows these variables to unconditionally included in `link_with`
lists, even if they're not used. This allows deleting duplicated logic
in nearly every gallium target implemented in meson today. This also
removes the now useless `build_by_default` flag from swdri and swkmsdri.

v4: - add this patch

Fixes: 66c94b9313
       ("meson: build gallium winsys for dri, null, and wrapper")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 10:35:23 -08:00
Dylan Baker
7d0e342af2 meson: add convenience variable for anv_extensions.py depdendency
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:46:07 -08:00
Dylan Baker
0e617c04f1 meson: use depend_files for adding extra file dependencies
cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: dd088d4bec ("anv/extensions: Generate a header file with extension tables")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:46:04 -08:00
Dylan Baker
b03969a5ad meson: use depend_files to track extra file dependencies
cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: f939940809 ("anv: Split anv_extensions.py into two files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:45:56 -08:00
Dylan Baker
384bff13e0 Revert "anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py"
This reverts commit 10d1b0be8e.

This is unnecessary, the depend_files argument is for adding
dependencies on files that are not part of the input, which is already
done.

cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 10d1b0be8e
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:45:40 -08:00
Brian Paul
64a1223a80 svga: replace gotos with else clauses
Simple clean-up.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:49:06 -07:00
Brian Paul
fa901768a4 svga: s/unsigned/enum pipe_shader_type/
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-15 09:05:09 -07:00
Brian Paul
8b54299c34 svga: move duplicated code for setting fillmode/flatshade state
Move the calls to svga_hwtnl_set_fillmode() and svga_hwtnl_set_flatshade()
out of the two retry_draw_*() functions to the svga_draw_vbo() function.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:09 -07:00
Brian Paul
072df89a79 svga: move svga_update_state() call in draw code
This fixes a few Piglit transform feedback regressions caused by
commit 7a1401938b.

In that change I moved the moved svga_update_state() into the loops,
after the calls to svga_hwtnl_set_flatshade().  But
svga_hwtnl_set_flatshade() actually depends on some derived shader
state.  This patch moves the svga_update_state() call into
svga_draw_vbo() so it's not duplicated in two places.

Fixes: 7a1401938b ("svga: clean up retry_draw_range_elements(),
retry_draw_arrays()")

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:08 -07:00
Brian Paul
6f0aec5671 svga: call tgsi_scan_shader() for dummy shaders
If we fail to compile the normal VS or FS we fall back to a simple/
dummy shader.  We need to rescan the the shader to update the shader
info.  Otherwise, this can lead to further translations failures
because the shader info doesn't match the actual shader.

Found by adding some extra debug assertions in the state-update code
while debugging something else.

v2: also update shader generic_inputs/outputs, etc. per Charmaine

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-02-15 09:05:01 -07:00
Samuel Pitoiset
579b33c1fd ac/nir: do not reserve user SGPRs for unused descriptor sets
In theory this might lead to corruption if we bind a descriptor
set which is unused, because LLVM is smart and it can re-use
unused user SGPRs. In practice, this doesn't seem to fix
anything.

As a side effect, this will reduce the number of emitted
SH_REG packets.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
Samuel Pitoiset
309854148c ac/shader: fix gathering of desc_set_used_mask
This was quite wrong.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
Samuel Pitoiset
61a4fc3ecc ac/shader: be a little smarter when scanning vertex buffers
Although meta shaders don't use any vertex buffers, there is no
behaviour change but I think it's better to do this. Though,
this saves two user SGPRs for push constants inlining or
something else.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-15 14:53:30 +01:00
Louis-Francis Ratté-Boulianne
a34715ad9c dri: fromPlanar() can return NULL as a valid result
It was assumed that fromPlanar() could return NULL to mean
that the planar image is the same as the parent DRI image.
That assumption wasn't made everywhere though.

Let's fix things and make sure that all callers understand
a NULL result

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-15 11:58:17 +00:00
Emil Velikov
f0654dfa65 docs: correct link to the 17.3.3 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 11:33:27 +00:00
Emil Velikov
dd4734d5c1 docs: update calendar, add news and link release notes to 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 11:33:04 +00:00
Emil Velikov
eadde35f83 docs: add sha256 checksums for 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 26c84b8af9)
2018-02-15 11:28:19 +00:00
Emil Velikov
6f4a6e2310 docs: add release notes for 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2f9820c553)
2018-02-15 11:28:18 +00:00
Karol Herbst
7bc15090fc nvc0: disable MS Images for sample_count == 1 on Maxwell
fixes KHR-GL45.multi_bind.dispatch_bind_textures on Maxwell

Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-15 11:14:46 +01:00
Gurchetan Singh
c6694793e1 mesa: don't clamp just based on ARB_viewport_array extension
The ARB_viewport_array spec says:

"Dependencies
    OpenGL 1.0 is required.

    OpenGL 3.2 or the EXT_geometry_shader4 or ARB_geometry_shader4 extensions
    are required.

    This extension is written against the OpenGL 3.2 (Compatibility)
    Specification."

As such, we should ignore it for GLES2 contexts.

Fixes:
dEQP-GLES2.functional.state_query.integers.viewport_getinteger
dEQP-GLES2.functional.state_query.integers.viewport_getfloat

on llvmpipe and virgl.

v2: Use _mesa_has_* (Ilia)

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-15 01:58:50 +01:00
Dylan Baker
5317211fa0 meson: use a custom target instead of a generator for i965 oa
Generators really are never the thing you want. The problem in this case
is that a generator must create a file that contains any file that the
generated target depends on. Since brw_oa.py doesn't generate such a
file the generated sources are not regenerated even if the xml files
they should depend on changes.

While we could change brw_oa.py to write such a file, that's silly, it
depends on itself and the xml file. So we'll just use a custom target
instead, which will have the correct dependency behavior and doesn't
really add that much code.

Fixes: 3218056e0e ("meson: Build i965 and dri stack")
CC: Ian Romanick <idr@freedesktop.org>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-14 16:45:40 -08:00
Anuj Phogat
0cd37f9178 isl: Don't use surface format R32_FLOAT for typed atomic integer operations
From Skylake PRM Surface Formats section:

   "The surface format for the typed atomic integer operations must
    be R32_UINT or R32_SINT."

Fixes an error and a piglit GPU hang in simulation environment.
Piglit test: gl45-imageAtomicExchange-float.shader_test

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.co
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
2018-02-14 16:30:05 -08:00
Timothy Arceri
7be5f30bb1 radeonsi/nir: fix si_nir_load_tcs_varyings() for outputs
We were incorrectly using the input info for outputs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
9740c8a8aa ac: implement nir_intrinsic_image_samples
Fixes cts test:
KHR-GL45.shader_texture_image_samples_tests.image_functional_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
c6b70a0eae st: add NIR GL_ARB_get_program_binary support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
928be4e97e st/shader_cache: add st_{de}serialise_nir_program() helpers
These will be used for NIR GL_ARB_get_program_binary support.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
3ad52501dc ac/nir_to_llvm: fix image size for arrays of arrays
Fixes cts test:
KHR-GL44.shader_image_size.advanced-changeSize

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Timothy Arceri
6acab18828 radeonsi/nir: fix shader ballot return value bitsize
Fixes cts test:
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-15 09:02:41 +11:00
Jason Ekstrand
8534af44e4 intel/aubinator: Correctly decode INTERFACE_DESCRIPTOR_DATA
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 13:17:26 -08:00
Jason Ekstrand
5c9d47d9c6 i965: Add gl_state_index casts for PATCH_VERTICES_IN
This fixes the build in clang

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105088
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 13:16:47 -08:00
Scott D Phillips
3b4f432d9b i965/miptree: Initialize mcs with a linear map
When initializing mcs, map with MAP_RAW and fill in the linear
map. Removes a place where gtt mapping is used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 12:38:34 -08:00
Scott D Phillips
d13ab69a78 i965/tiled_memcpy: change linear pointer from (0, 0) to (xt1, yt1)
In all current uses, the linear surface is only allocated starting
at (xt1, yt1) anyway, so this improves the calling ergonomics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 12:38:34 -08:00
Scott D Phillips
ecaad89525 i965/tiled_memcpy: linear_to_ytiled a cache line at a time
TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0
Thus a cache line in the tiled surface is composed of a 2d area of
16x4 bytes of the linear surface.

Add a special case where the area being copied is 4-line aligned
and a multiple of 4-lines so that entire cache lines will be
written at a time.

On Apollolake, this increases tiling throughput to wc maps by
84.0103% +/- 0.862818%

v2: Split [y0, y1) and [y2, y3) loops apart for clarity (Jason Ekstrand)
v3: Don't reset src var (Jason), Ensure y0 <= y1 <= y2 <= y3

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-14 12:38:34 -08:00
Rafael Antognolli
eb2e17e2d1 docs: Add Cannonlake support to 18.0 release notes.
17.4 is actually 18.0.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:11:05 -08:00
Rafael Antognolli
fcae3d1a9a anv/gen10: Remove warning message.
Gen10 seems pretty stable so far, remove "alpha support" message.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:11:01 -08:00
Rafael Antognolli
bf1577fe09 i965/gen10: Remove warning message.
Gen10 seems pretty stable so far, so there's no reason to keep this
message.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:09:41 -08:00
Louis-Francis Ratté-Boulianne
aad14cf15a egl/x11: Fix leak in dri3_create_image_khr_pixmap
bp_reply wasn't properly free'd

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-14 11:52:06 +00:00
Iago Toral Quiroga
cb9dbd6dec i965/compiler: clean up nir_intrinsic_load_input for vertex shaders
This code to re-set the type of the source and destination is not
necessary since we never manipulate the types. Looks like a
left over from a time where we had to retype to float temporarily
to handle 64-bit inputs.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-14 12:00:14 +01:00
Iago Toral Quiroga
4917d38321 intel/compiler: fix first_component for 64-bit types on vertex inputs
Divide it by two as we do for other stages. This is because the
component layout qualifier is always in 32-bit units.

Fixes issues in a new CTS test (still WIP):
KHR-GL45.enhanced_layouts.varying_double_components

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-14 12:00:14 +01:00
Samuel Pitoiset
ad4b58ea70 ac/nir: rename nir_to_llvm_context to radv_shader_context
There is still more to do in that area, but it's a good start.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:16 +01:00
Samuel Pitoiset
141db61509 ac: remove nir_to_llvm_context from ac_nir_translate()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:14 +01:00
Samuel Pitoiset
a541117ff4 ac/nir: remove nir_to_llvm_context::nir link
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:12 +01:00
Samuel Pitoiset
e9f0205ca2 ac: move the outputs array to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:10 +01:00
Samuel Pitoiset
07e4268f36 ac/shader: scan force_persample
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-14 11:53:08 +01:00
Dave Airlie
b9d2ff05a6 r600: fix regression in gl_FragColor drawing
This fixes a regression in the broadcast color to all color bufs case.

Fixes: 6c691081a (r600: fixup sparse color exports.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 14:02:41 +10:00
Dave Airlie
9c9a9bee44 r600: fix array spill if temp[0] is before all arrays
I found a shader with
DCL TEMP[0], LOCAL
DCL TEMP[1..256], ARRAY(1), LOCAL
DCL TEMP[257..512], ARRAY(2), LOCAL
DCL TEMP[513..768], ARRAY(3), LOCAL
DCL TEMP[769], LOCAL

This would remap badly, as it would add up all the spilled sizes
and subtract it from the temp for 0. If the current temp is less
than the array start break out.

Fixes: 1d871aa6 (r600g: Implement spilling of temp arrays (v2))
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:37:59 +10:00
Dave Airlie
8f2656c75b virgl: add ARB_sample_shading support.
This enable ARB_sample_shading if the renderer supports it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:06:07 +10:00
Dave Airlie
9b95b70719 virgl: add ARB_draw_indirect support.
This relies on the renderer code landing first.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 13:06:07 +10:00
Roland Scheidegger
f6718baabc tgsi: Recognize RET in main for tgsi_transform
Shaders coming from dx10 state trackers have a RET before the END.
And the epilog needs to be placed before the RET (otherwise it will
get ignored).
Hence figure out if a RET is in main, in this case we'll place
the epilog there rather than before the END.
(At a closer look, there actually seem to be problems with control
flow in general with output redirection, that would need another
look. It's enough however to fix draw's aa line emulation in some
internal bug - lines tend to be drawn with trivial shaders, moving
either a constant color or a vertex color directly to the output).

v2: add assert so buggy handling of RET in main is detected

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-14 02:06:54 +01:00
Bas Nieuwenhuizen
7461bd5b8f ac: Use the renumbered const address space for LLVM 7.
The LLVM AMDGPU backend decided to renumber the constant address
space ....

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-14 01:05:03 +01:00
Dave Airlie
9ddacd9af4 gallium: drop all the guard band float caps.
Nobody queries these and nobody sets them to anything useful,
the docs say TODO.

Drop them until a use appears.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-14 08:50:08 +10:00
Vadym Shovkoplias
a553c54abf mesa: add glsl version query (v4)
Add support for GL_NUM_SHADING_LANGUAGE_VERSIONS
and glGetStringi for GL_SHADING_LANGUAGE_VERSION

v2:
  - Combine similar functionality into
    _mesa_get_shading_language_version() function.
  - Change GLSL version return mechanism.
v3:
  - Add return of empty string for GLSL ver 1.10.
  - Move _mesa_get_shading_language_version() function
    to src/mesa/main/version.c.
v4:
  - Add OpenGL version check.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104915
Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 13:24:31 -07:00
Brian Paul
b08d718703 mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()
The EXTRA_VERSION_40 predicate is tested as part of
extra_gl40_ARB_sample_shading but there was no switch case for it.

Fixes: 77b440e42d ("mesa: Add new functions and enums required
by GL_ARB_sample_shading")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-13 10:35:55 -07:00
Mark Janes
e5809788d6 mesa: fix compile failure
Missing header triggered a failure in i965 CI buildtest project.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
Fixes: e149a0253c
2018-02-13 00:22:05 -08:00
Mark Janes
d9de7aaca3 Partially revert "mesa: use GLenum16 in a few more places"
This reverts part of commit ca721b3d89.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
2018-02-13 00:22:05 -08:00
Mark Janes
3e5758a70a Revert "mesa: reduce the size of gl_texture_image"
This reverts commit f4ea2b2a9e.

Several members reduced in size by the offending commit are not large
enough to store the data needed by the i965 driver.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
2018-02-13 00:22:05 -08:00
Dave Airlie
db5f422169 i965: fix tessellation regressions with gl_state_index16
Looks like one conversion was missed.

Fixes: e149a0253 (mesa,glsl,nir: reduce gl_state_index size to 2 bytes)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-02-12 23:05:16 -08:00
Stéphane Marchesin
5e4a2b394e virgl: Support v2 caps struct (v2)
This struct allows us to report:
- accurate max point size/line width.
- accurate texel and texture gather offsets
- vertex/geometry limits.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-13 14:23:54 +10:00
Timothy Arceri
10457712ed ac/nir: add nir_intrinsic_{load,store}_shared support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
c787cbfa33 ac/nir_to_llvm: add support for nir_intrinsic_shared_atomic_*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
b6cf898ec2 radeonsi: make si_declare_compute_memory() more generic and call for nir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-13 14:43:05 +11:00
Timothy Arceri
94fa090fad st/glsl: set req_local_mem earlier for compute shaders
Without this change it will never be set for backends using nir.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-13 14:43:05 +11:00
Marek Olšák
6b1e26e181 mesa: move STATE_LENGTH to shader_enums.h and use it everywhere
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
f4ea2b2a9e mesa: reduce the size of gl_texture_image
80 -> 40 bytes.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
4794fbc86e mesa: reduce the size of gl_program_parameter
40 -> 24 bytes, which includes the gl_state_index16 change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
e149a0253c mesa,glsl,nir: reduce gl_state_index size to 2 bytes
Let's use the new gl_state_index16 type everywhere and remove
the typecasts.

This helps reduce the size of gl_program_parameter.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
a7882013d3 mesa: reduce the size of gl_viewport_attrib
All drivers convert these to float, so there is no reason to use double.
The piglit test that expects double precision from glGet will be adjusted
not to require it (there is a piglit patch).

gl_context::ViewportArray: 512 -> 384 bytes

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
d7550d783a mesa: reduce the size of gl_texture_object
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
65ed98839b mesa: reduce the size of gl_program
gl_program: 1456 -> 976 bytes

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
78f1decc95 mesa: reduce the size of gl_image_unit (v2)
gl_context::ImageUnits: 6144 -> 4608 bytes

v2: use ASSERT_BITFIELD_SIZE

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
ca5c5d96d8 mesa: further reduce the size of ctx->Texture
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
78043a75f6 mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8
GL allows doing glTexEnv on 192 texture units, while in reality,
only MaxTextureCoordUnits units are used by fixed-func shaders.

There is a piglit patch that adjusts piglits/texunits to check only
MaxTextureCoordUnits units.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
07c10cc59c mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
79aca14f5f mesa: inline init_texture_unit
because this is going to be changed

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák
ca721b3d89 mesa: use GLenum16 in a few more places
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Jason Ekstrand
4c77e21c81 anv: Move setting current_pipeline to cmd_state_init
We were setting current_pipeline to UINT32_MAX and then calling
cmd_cmd_state_reset which memsets the entire state struct to 0 which
implicitly resets current_pipeline to 3D.  I have no idea how this
hasn't caused everything to explode.

Fixes: cd3feea745 "anv/cmd_buffer: Rework anv_cmd_state_reset"
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-12 15:18:23 -08:00
Jason Ekstrand
f37bd726c7 anv: Don't resolve or ambiguate non-existent layers
The previous code was trying to avoid non-existent layers by taking a
MAX with anv_image_aux_layers.  Unfortunately, it wasn't taking into
account that layer_count starts at base_layer which may not be zero.
Instead, we need to subtract base_layer from anv_image_aux_layers with
a guard against roll-over.

Fixes: de3be61801 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-12 15:14:57 -08:00
Daniel Stone
c2c4e5bae3 i965: Fix bugs in intel_from_planar
This commit fixes two bugs in intel_from_planar.  First, if the planar
format was non-NULL but only had a single plane, we were falling through
to the planar case.  If we had a CCS modifier and plane == 1, we would
return NULL instead of the CCS plane.  Second, if we did end up in the
planar_format == NULL case and the modifier was DRM_FORMAT_MOD_INVALID,
we would end up segfaulting in isl_drm_modifier_has_aux.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 8f6e54c929
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-12 15:14:45 -08:00
Eric Anholt
1aed66dc1e radv: Fix compiler warning about uninitialized 'set'
The compiler doesn't figure out that we only get result == VK_SUCCESS if
set got initialized.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 20:48:47 +00:00
Eric Anholt
21670f8208 glsl/tests: Fix strict aliasing warning about int64/double.
Fixes: 4bf9862747 ("glsl/tests: Add UINT64 and INT64 types")
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2018-02-12 20:48:43 +00:00
Eric Anholt
091bff8317 ac/nir: Fix compiler warning about uninitialized dw_addr.
Even switching the def's condition to be the same chip revision check as
the use, the compiler doesn't figure it out.  Just NULL-init it.

Fixes: ec53e52742 ("ac/nir: Add ES output to LDS for GFX9.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 20:48:29 +00:00
Eric Anholt
7a83be4b28 gallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax.
My gcc doesn't figure out that dims >= 1 (seems reasonable), and doesn't
notice that ddmax is used from the same no_rho_opt as its initialization.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-12 20:48:18 +00:00
Kenneth Graunke
bd87bd178c anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf.
The kernel used to have execbuf parameters to program the INSTPM bit
for whether 3DSTATE_CONSTANT_* should be relative to dynamic state
base address or an absolute address.  However, they never worked in
the presence of hardware contexts, so I deleted them a while back.

It doesn't make sense to set this flag, as it doesn't exist anymore.
It also never did anything anyway - the flag is zero, so |'ing it in
did nothing.  The default is relative anyway.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-12 07:00:41 -08:00
Eric Engestrom
111d4bf1d0 r200: remove left over dead code
0aaa27f291 removed the references to this array without
removing the array itself

Cc: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 0aaa27f291 "mesa: Pass the translated color logic op dd_function_table::LogicOpcode"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-02-12 11:19:44 +00:00
Samuel Pitoiset
f4e85ba93f ac/nir: remove backlink to nir_to_llvm_context
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:39 +01:00
Samuel Pitoiset
be5f6eb13e ac/nir: remove nir_to_llvm_context::module
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:36 +01:00
Samuel Pitoiset
90a815ddeb ac/nir: remove nir_to_llvm_context::builder
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:34 +01:00
Samuel Pitoiset
759acfa180 ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:31 +01:00
Samuel Pitoiset
e7373a6498 ac/nir: drop nir_to_llvm_context from visit_var_atomic()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:29 +01:00
Samuel Pitoiset
485346b05a ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:27 +01:00
Samuel Pitoiset
cd6dfacda9 ac/nir: drop nir_to_llvm_context from visit_load_push_constant()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:25 +01:00
Samuel Pitoiset
5c9e398c83 ac/nir: drop nir_to_llvm_context from cast_ptr()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:23 +01:00
Samuel Pitoiset
5ef5944848 ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:21 +01:00
Samuel Pitoiset
da8b0b8264 ac/nir: drop nir_to_llvm_context from emit_f2f16()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:19 +01:00
Samuel Pitoiset
e32f374944 ac: remove unused parameters in abi::load_tess_coord()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:17 +01:00
Samuel Pitoiset
1e69db003d ac/nir: remove useless bitcast in load_tess_coord()
nir_intrinsic_load_tess_coord always returns a v3i32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:15 +01:00
Samuel Pitoiset
ed179fbdf3 ac: add load_resource() to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:13 +01:00
Samuel Pitoiset
ecf229706f ac: add load_sample_mask_in() to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:11 +01:00
Samuel Pitoiset
0f48eeea05 ac: move view_index to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:09 +01:00
Samuel Pitoiset
0efbede949 ac: move push_constants to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:07 +01:00
Samuel Pitoiset
460d3ce726 ac: move tg_size to the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:04 +01:00
Samuel Pitoiset
054c92190c ac/nir: remove unused nir_to_llvm_context:{defs,phis}
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-12 11:54:02 +01:00
Eric Anholt
0b97eb02b0 egl/gbm: Fix compiler warning about visual matching.
The compiler doesn't know that num_visuals > 0.

Fixes: 37a8d907cc ("egl/gbm: Ensure EGLConfigs match GBM surface format")
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-12 09:16:44 +00:00
Rob Clark
831fb29252 freedreno: small fix for flushing dependent batches
Flush a resource's previous write_batch synchronously.  Because a
resource's associated batches are not updated until after the flush
thread submits rendering to the kernel, this was causing a bit of
confusion in the following loop.  This fixes a bug that appeared with
recent stk.

Perhaps we need to re-work things a bit to clear out dependent patches
in the ctx's thread and use a fence to deal with the period between
when a flush is queued and when it is submitted to the kernel.  But
this will do until time permits a larger refactor.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
c57ed8e01c freedreno/ir3: intra-block scheduling
Because of loops, we can't schedule all of a block's predecessors first.
Instead just assume that the result consumed in a block was written far
enough away in all paths into a block.  And do an intra-block scheduling
pass to figure out if there are any cases where we need to insert extra
nop's.  This works out better than always assuming the worst case (ie.
that a value live into a block was written in the last instruction in
the predecessor block).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
2a2099a875 freedreno/ir3: "boost" the depth of if/else condition
Account for the move to predicate register, to try to avoid needing to
insert extra NOPs later.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ffb00f6841 freedreno/ir3: account for arrays in delayslot calc
Normally false-deps are not something to consider, since they mostly
exist for delay-slot related reasons:

 * barriers
 * ordering writes after read
 * SSBO/image access ordering

The exception is a false-dependency on an array store.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
f54d2b4f10 freedreno/ir3: more clever legalize algorithm
Previously we didn't handle flow control in legalize, and instead just
set (ss)(sy) on the first instruction in every block.  Which isn't very
clever.

Instead, consider output state of all predecessor blocks, so we only
set a sync bit if needed for any possible path leading into a block.
Because of loops, we can't require that all successor blocks are
legalized before a given block, so instead run in a loop until results
converge.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
015afb6a38 freedreno/ir3: track block predecessors
Useful in the following patches.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
76440fcca9 freedreno/ir3: clean up dangling false-dep's
Maybe there is a better way for this..  where it comes useful is "array"
loads, which end up as a false-dep for a later array store.

If all the uses of an array load are CP'd into their consumer, it still
leaves the dangling array load, leading to funny things like:

  mov.u32u32 r5.y, r0.y
  mov.u32u32 r5.y, r0.z

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
aea223741f freedreno/ir3: handle IMMED for mad 2nd src special case
Consider also immediates for swapping the first two srcs, because they
can be lowered to constant.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
242a8a1957 freedreno/ir3: remove ir3 phi instruction
Now that we convert phi webs to ssa, we can drop all this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a7b569d60c freedreno/ir3: remove lower_if_else pass
Now that it is unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
268ab05484 freedreno/ir3: add experimental GCM pass
Generally seems to do worse on instruction count and register usage,
according to shader-db.  But shader-db also doesn't do a very good job
of weighting loop bodies, so that might not be totally valid.

So add an env variable to enable GCM pass for easier experimentation.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
4c15c53d91 freedreno/ir3: change opt passes
There are more useful nir passes added since initial conversion to nir.
But ir3 was never updated to use them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ec8bc54ad2 freedreno/ir3: use peephole select pass
Agressively lowering all if/else to selects in some extreme cases
results in much higher register pressure.  Using peephole select instead
with a modest threshold speeds up alu2 4x!

16 seems like a good limit, low enough to help alu2 but not too low that
it penalizes everything else.  With a bit better scheduling of the
instruction that moves a value into a predicate register, we might be
able to lower this limit a bit more in the future, but since we need 6
cycles from the move to predicate register to predicated branch, that
puts some sort of lower bound on how far we can lower this threshold.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a7ea2b4eba freedreno/ir3: lower phi webs to regs
nir's from_ssa pass is much better at avoiding inserting extra moves
than our logic is.  And lowering phi webs to regs just treats anything
involved in a phi web as an array of length=1.  Which with previous
array related fixes in RA/etc ends up working out quite well.  This cuts
down on extra instructions and also helps with register pressure.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
0a6ddf964f freedreno/ir3: separate arrays from groups
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
55f14a1ac4 freedreno/ir3: make block/instruction serialno per-shader
Makes it easier to compare values seen in-game (where there are many
shaders) to cmdline standalone compiler.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
5a7de94392 freedreno/ir3: add spirv support to cmdline compiler
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
942341bcd0 freedreno/ir3: don't lower fsat
Instead, if possible fold (sat) flag into src, otherwise use:

  (sat)max.f rD, rS, rS

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
b2fc94f074 freedreno/ir3: add encoding/decoding for (sat) bit
Seems to be there since a3xx, but we always lowered fsat.  But we can
shave some instructions, especially in shaders that use lots of
clamp(foo, 0.0, 1.0) by not lowering fsat.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
1b658533e1 freedreno/ir3: extend liverange of arrays
Use livein state of other blocks to extend liverange of arrays when they
are still needed by successor blocks.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
ac459a6f7f freedreno/ir3: avoid extra mov's for "arrays"
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
2bc3fb6992 freedreno/ir3: a couple more array fixes
(Plus a couple TODOs)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
8ea1ef4191 freedreno/ir3: keep array stores
Since these are not in SSA form, add to block's keeps so it doesn't
appear unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
c60f150d56 freedreno/ir3: propagate barrier information
When eliminating movs, the instruction that is now directly using the
src of the mov has the same scheduling order constraints as the original
mov instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
98702c1010 freedreno/ir3: remove pointless statement
Function ends after this if/else ladder, so it was pointless.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
930ca0e038 freedreno/ir3: some more debug prints
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a84e324847 freedreno/ir3: fix printing of relative branch offsets
The number of bits depends on generation.  But printing negative values
with a5xx encoding (largest size) but compiling for a3xx or a4xx, would
result in negative values printed as large positive values.

I guess in practice huge negative branch offsets aren't likely (and if
that is the case, the shader is probably too big to grok by reading the
assembly).  So just print using smallest bitfield size.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
a5c28fe07b freedreno/ir3: be more clever with if/else jumps
Try to clean up things like:

  br !p0.x #2
  br p0.x #something

to eliminate the first branch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
44dd7dcd2f freedreno/ir3: avoid some spurious sync bits
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
069c0ac625 freedreno/ir3: print # of sync bits for shaderdb
When trying to optimize to reduce stalls, it is nice to see this info.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark
7d45e2e39f freedreno: add debug trace for flush
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Grazvydas Ignotas
9b9a89cd79 intel/compiler: fix 64bit value prints on 32bit
Fix the following:
warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but
argument 3 has type ‘uint64_t {aka long long unsigned int}.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-10 17:59:02 +02:00
Timothy Arceri
ff0e3fa1fe st/glsl_to_nir: remove unused options variable 2018-02-10 11:06:55 +11:00
Timothy Arceri
8f378c116e st/radeonsi: enable disk cache for nir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
bc9d9f9b86 st: add nir shader disk cache support
v2: include compute shader support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
97efdc0d57 st/glsl_to_tgsi: move nir detection earlier
We move the nir check before the shader cache call so that we can
call a nir based caching function in a following patch.

Also with this change we simply check if vertex shaders support
NIR rather than looping over the stages as mixing of shader types
is not supported anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
b5e23887fe radeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR
Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead.

This change indirectly enables NIR support for compute shaders
on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
73f1d6f0c1 r600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR
We now use PIPE_SHADER_CAP_SUPPORTED_IRS to check for native support
in clover.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
51f484bb44 clover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR
PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR
for compute shaders, so we let clover pick the one it wants to use.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
3af4f34e61 r600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-10 10:59:10 +11:00
Timothy Arceri
ce836487b8 radeonsi/nir: add depth layout to scan pass
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Timothy Arceri
6a8efbe652 radeonsi/nir: add FRAG_RESULT_COLOR to scan pass
Fixes a number of draw buffers piglit tests.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Timothy Arceri
ef8082baf8 ac: convert nir_op_f2f32 src to a float
Fixes the following piglit test:

./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo

Where we would end up with the nir such as:

	vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10
	vec1 32 ssa_12 = f2f32 ssa_2

And our pack_64_2x32_split nir to llvm code always produces
a 64bit integer as output.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Timothy Arceri
1b1e5f8edf ac: fix some 64bit unpack asserts
Previously the asserts did not take swizzles into account.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-10 10:46:28 +11:00
Mark Janes
9a05c66feb Revert "i965: prevent potentially null pointer access"
This reverts commit 712332ed54, which
caused over 90k failures in Mesa i965 CI.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-09 09:46:07 -08:00
Daniel Stone
37a8d907cc egl/gbm: Ensure EGLConfigs match GBM surface format
When we create an EGL window surface on a GBM surface, ensure that the
EGLConfig is compatible with the GBM format, notwithstanding XRGB/ARGB
interchange.

For example, rendering with an XRGB8888 EGLConfig on to an ARGB8888
gbm_surface (and vice-versa) are acceptable, but rendering with an
XRGB2101010 EGLConfig on to an XRGB8888 gbm_surface will now be
rejected.

This was previously allowed through; when 10bpc formats were enabled,
clients which picked a completely random EGL config and hoped/assumed
they were XRGB8888 would break.

If you have bisected a failure to start a GBM/KMS client to this commit,
please look at its EGLConfig selection (e.g. through eglChooseConfigs),
and add an EGL_NATIVE_VISUAL_ID == gbm_surface format match to the
attribs for config selection.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
8174e5b49e egl/gbm: Remove duplicate format table
Now that we have mask/channel information in gbm_dri's format conversion
table, we can remove the copy in EGL.

As this table contains more formats (notably including R8 and RG8, which
can be used for BO but not surface allocation), we now compare the masks
of all channels when trying to find a suitable config. Without doing
this, an XRGB8888 EGLConfig would match on an R8 format.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
314714ac53 gbm/dri: Expose visuals table through gbm_dri_device
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
2ed344645d gbm/dri: Add RGBA masks to GBM format table
Eventually, we can replace the visuals list inside GBM EGL driver with
this one.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
4732094cff egl/wayland: Use an array for modifiers
Each Wayland EGLDisplay currently contains a struct with one vector of
modifiers per format, hardcoded in the header. To allow easier support
for more formats, turn this into an array of u_vectors which is opaque
outside of platform_wayland.c.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
5bc49d4cbf egl/wayland: Remove has_format enum
Instead of the has_format enum, use an index into the visual array. This
makes adding new formats less typing.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
d32b23f383 egl/wayland: Add bpp to visual map
Both the DRI2 GetBuffersWithFormat interface, and SHM buffer allocation,
had their own format -> bpp lookup tables. Replace these with a lookup
into the visual map.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
4de98a9c07 egl/wayland: Use visual map for DRIImage<->FourCC map
When trying to translate between DRIImage format enums and FourCC codes,
use our visual map rather than an open-coded subset.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
68a80c11bd egl/wayland: Use visual map for format advertisement
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
3323ce72ff egl/wayland: Use visual map for buffer_from_image
When creating a wl_buffer on an upstream Wayland display from an
existing EGLImage, use the dri2_wl_visual map rather than another
hardcoded list of formats.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:16 +00:00
Daniel Stone
a9cc4edb60 egl/wayland: Use visual map for config->format lookup
Having hoisted the format -> config map into common code, we now use it
for config -> format lookups.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
1dc013f1ee egl/wayland: Add format enums to visual map
Extend the visual map from only containing names and bitmasks, to also
carrying the three format enums we need. These are the DRIImage format
tokens for internal allocation, FourCC codes for wl_drm and dmabuf
protocol, and wl_shm codes for swrast drivers.

We will later use these formats to eliminate a bunch of open-coded
conversions.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
66912641df egl/wayland: Use proper enum type in visual definition
No semantic change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
845c2f6156 egl/wayland: Widen channel masks to bpp
Widen the channel masks given in the visual table to the full width of
the pixel format, i.e. as many leading zeros as required.

No functional change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
19cbca38e4 egl/wayland: Hoist format <-> EGLConfig definition up
Pull the mapping between Wayland formats and EGLConfigs up to the top
level, so we can reuse it elsewhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:15 +00:00
Daniel Stone
4fbd2d50b1 egl/wayland: Fix ARGB/XRGB transposition in config map
When 0b2b719121 moved from an if tree to a struct to map between
wl_drm formats and EGLConfigs, it transposed the mapping between XRGB
and ARGB. Luckily, everyone exposes both formats, so this is harmless.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 0b2b719121 ("egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-02-09 16:17:06 +00:00
Marek Olšák
76085f2048 st/mesa: generate blend state according to the number of enabled color buffers
Non-MRT cases always translate blend state for 1 color buffer only.
MRT cases only check and translate blend state for enabled color buffers.

This also avoids an assertion failure in translate_blend for:
  dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Marek Olšák
c446dd7927 st/mesa: don't translate blend state when color writes are disabled
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Marek Olšák
3d06c8afb5 st/mesa: don't translate blend state when it's disabled for a colorbuffer
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-09 15:52:22 +01:00
Lionel Landwerlin
712332ed54 i965: prevent potentially null pointer access
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
CID: 1418110
2018-02-09 14:02:59 +00:00
Mark Thompson
5db29d62ce st/va: Make the vendor string more descriptive
Include the Mesa version and detail about the platform.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-02-09 13:37:43 +01:00
Mark Thompson
768f1487b0 st/va: Enable vaExportSurfaceHandle()
It is present from libva 2.1 (VAAPI 1.1.0 or higher).

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2018-02-09 13:37:36 +01:00
Tapani Pälli
41c5bf3836 disk cache: move path creation back to constructor
This patch moves disk cache path and index creation back to the
constructor which matches previous behavior. We still allow create
to succeed without path so that cache can be used with callback
functionality.

Fixes: c95d3ed091 "disk cache: create cache even if path creation fails"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 11:33:25 +02:00
Samuel Pitoiset
3a2bb4db23 ac/nir: compute correct number of user SGPRs on GFX9
For merged shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 10:16:04 +01:00
Michel Dänzer
171076f082 st/mesa: Initialize tex_target in compile_tgsi_instruction
Initialize to TGSI_TEXTURE_BUFFER (== 0), same as was done before the
variable type was changed to enum tgsi_texture_type.

Fixes a bunch of piglit failures with radeonsi, e.g.:

gles-3.0-transform-feedback-uniform-buffer-object: ../../../../src/gallium/auxiliary/tgsi/tgsi_util.c:502: tgsi_util_get_texture_coord_dim: Assertion `!"unknown texture target"' failed.

Corresponding compiler warning:

  CXX      state_tracker/st_glsl_to_tgsi.lo
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp: In function ‘pipe_error st_translate_program(gl_context*, uint, ureg_program*, glsl_to_tgsi_visitor*, const gl_program*, GLuint, const ubyte*, const ubyte*, const ubyte*, const ubyte*, const ubyte*, GLuint, const ubyte*, const ubyte*, const ubyte*)’:
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5992:23: warning: ‘tex_target’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       ureg_memory_insn(ureg, inst->op, dst, num_dst, src, num_src,
       ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                        inst->buffer_access,
                        ~~~~~~~~~~~~~~~~~~~~
                        tex_target, inst->image_format);
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5866:27: note: ‘tex_target’ was declared here
    enum tgsi_texture_type tex_target;
                           ^~~~~~~~~~

Fixes: 9f9ce1625f ("st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:26:40 +01:00
Alejandro Piñeiro
f32b01ca43 glsl/linker: remove ubo explicit binding handling
This is already handled at link_uniform_blocks, specifically at
process_block_array_leaf.

Additionally, this code was not handling correctly arrays of
arrays. When creating the name of the block to set the binding, it
only took into account the first level, so any attempt to set a
explicit binding on a array of array ubo would trigger an assertion.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 08:32:42 +01:00
Mathias Fröhlich
77cb2fc0bd mesa: Only update enabled VAO gl_vertex_array entries.
Instead of updating all modified gl_vertex_array_object::_VertexArray
entries just update those that are modified and enabled.
Also release buffer object from the _VertexArray that belong
to disabled attributes.

v2: Also set Ptr and Size to zero.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-09 04:26:23 +01:00
Mathias Fröhlich
437cae411e gallium: Mute arrays for several meta like callbacks.
Set the _DrawArray pointer to NULL when calling into the Drivers
Bitmap/CopyPixels/DrawAtlasBitmaps/DrawPixels/DrawTex hooks.
This fixes an assert that gets uncovered when the following
patch gets applied.

v2: Mute from within the state tracker instead of generic mesa.
v3: Avoid evaluating _DrawArrays from within st_validate_state.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 04:26:13 +01:00
Mathias Fröhlich
2f9eb0aad5 mesa: Fix VAO buffer object tracking.
When changing the attribute binding in the VAO we also need to
account for getting rid of non vbo bits from VertexAttribBufferMask.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-09 04:21:36 +01:00
Timothy Arceri
d8bca3809d radeonsi/nir: gather some missing fs info
Fixes some early-z arb_shader_image_load_store piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 12:51:27 +11:00
Timothy Arceri
c77078c942 ac: pass struct ac_llvm_context to emit_membar()
Fixes segfault in piglit test:

./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 12:51:27 +11:00
Marek Olšák
12fd567c78 radeonsi: copy the NIR enablement debug bit to the shader cache flags
When NIR is enabled, TGSI must not be used. When NIR is disabled, TGSI

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 02:01:45 +01:00
Jason Ekstrand
8f20cf166e intel/blorp: Use isl_aux_op instead of blorp_hiz_op
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1e941a0528 intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1810f965c8 anv: Allow fast-clearing the first slice of a multi-slice image
Now that we're tracking aux properly per-slice, we can enable this for
applications which actually care.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
de3be61801 anv/cmd_buffer: Rework aux tracking
This commit completely reworks aux tracking.  This includes a number of
somewhat distinct changes:

 1) Since we are no longer fast-clearing multiple slices, we only need
    to track one fast clear color and one fast clear type.

 2) We store two bits for fast clear instead of one to let us
    distinguish between zero and non-zero fast clear colors.  This is
    needed so that we can do full resolves when transitioning to
    PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear
    values in all sorts of places we wouldn't normally.

 3) We now track compression state as a boolean separate from fast clear
    type and this is tracked on a per-slice granularity.

The previous scheme had some issues when it came to individual slices of
a multi-LOD images.  In particular, we only tracked "needs resolve"
per-LOD but you could do a vkCmdPipelineBarrier that would only resolve
a portion of the image and would set "needs resolve" to false anyway.
Also, any transition from an undefined layout would reset the clear
color for the entire LOD regardless of whether or not there was some
clear color on some other slice.

As far as full/partial resolves go, he assumptions of the previous
scheme held because the one case where we do need a full resolve when
CCS_E is enabled is for window-system images.  Since we only ever
allowed X-tiled window-system images, CCS was entirely disabled on gen9+
and we never got CCS_E.  With the advent of Y-tiled window-system
buffers, we now need to properly support doing a full resolve of images
marked CCS_E.

v2 (Jason Ekstrand):
 - Fix an bug in the compressed flag offset calculation
 - Treat 3D images as multi-slice for the purposes of resolve tracking

v3 (Jason Ekstrand):
 - Set the compressed flag whenever we fast-clear
 - Simplify the resolve predicate computation logic

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
2cbfcb205e anv/cmd_buffer: Move the mi_alu helper higher up
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
2e69045c4d anv/image: Simplify some verbose commennts
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
f0523f70ef anv: Use blorp_ccs_ambiguate instead of fast-clears
Even though the blorp pass looks a bit on the sketchy side, the end
result in the Vulkan driver is very nice.  Instead of having this weird
case where you do a fast clear and then maybe have to resolve, we just
do the ambiguate and are done with it.  The ambiguate does exactly what
we want of setting all the CCS values to 0 which puts it into the
pass-through state.

This should also improve performance a bit in certain cases.  For
instance, if we did a transition from UNDEFINED to GENERAL for a surface
that doesn't have CCS enabled all the time, we would end up doing a
fast-clear and then a full resolve which ends up touching every byte in
the main surface as well as the CCS.  With the ambiguate pass, that
transition only touches the CCS.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
84fd2ebfbc anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
3ef8c4b2f5 anv/cmd_buffer: Pull the undefined layout condition into the if
Now that this isn't a multi-case if and it's just the one case, it's a
bit clearer if the condition is just part of the if instead of being
pulled out into a boolean variable.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
857b5b5a7f intel/blorp: Add a CCS ambiguation pass
This pass performs an "ambiguate" operation on a CCS-compressed surface
by manually writing zeros into the CCS.  On gen8+, ISL gives us a fairly
detailed notion of how the CCS is laid out so this is fairly simple to
do.  On gen7, the CCS tiling is quite crazy but that isn't an issue
because we can only do CCS on single-slice images so we can just blast
over the entire CCS buffer if we want to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
13b621d6fd anv: Only fast clear single-slice images
The current strategy we use for managing resolves has an issues where we
track clear colors and the need for resolves per-LOD but we still allow
resolves of only a subset of the slices in any given LOD and doing so
sets the "needs resolve" flag for that LOD to false while leaving the
remaining layers unresolved.  This patch is only the first step and does
not, by itself fix anything.  However, it's fairly self-contained and
splitting it out means any performance regressions should bisect to this
nice obvious commit rather than to the giant "rework aux tracking"
commit.

Nanley and I did some testing and none of the applications we tested
even tried to fast-clear anything other than the first slice of an
image.  The test was done by adding a printf right before we call
blorp_fast_clear if we were every going to touch any slice other than
the first with a fast-clear.  Due to the way the original code was
structured, this would not have included applications which only cleared
a subset of layers.  The applications tested were:

 * All Sascha Willems demos
 * Aztec Ruins
 * Dota 2
 * The Talos Principle
 * Mad Max
 * Warhammer 40,000: Dawn of War III
 * Serious Sam Fusion 2017: BFE

While not the full list of shipping applications, it's a pretty good
spread and covers most of the engines we've seen running on our driver.
If this is ever shown to be a performance problem in the future, we can
reconsider our strategy.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
571ed588ac anv/cmd_buffer: Add a mark_image_written helper
Currently, this helper does nothing but we call it every place where an
image is written through the render pipeline.  This will allow us to
properly mark the aux state so that we can handle resolves correctly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
9876d6f0ef anv/blorp: Add src/dst_level helper variables in CmdCopyImage
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
c180c2c868 anv/cmd_buffer: Add an anv_genX_call macro
This is copied and pasted from the similar macro we added to ISL.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
ab7543b13d anv/cmd_buffer: Generalize transition_color_buffer
This moves it to being based on layout_to_aux_usage instead of being
hard-coded based on bits of a priori knowledge of how transitions
interact with layouts.  This conceptually simplifies things because
we're now using layout_to_aux_usage and layout_supports_fast_clear to
make resolve decisions so changes to those functions will do what one
expects.

There is a potential bug with window system integration on gen9+ where
we wouldn't do a resolve when transitioning to the PRESENT_SRC layout
because we just assume that everything that handles CCS_E can handle it
all the time.  When handing a CCS_E image off to the window system, we
may need to do a full resolve if the window system does not support the
CCS_E modifier.  The only reason why this hasn't been a problem yet is
because we don't support modifiers in Vulkan WSI and so we always get X
tiling which implies no CCS on gen9+.  This patch doesn't actually fix
that bug yet but it takes us the first step in that direction by making
us actually pick the correct resolve op.  In order to handle all of the
cases, we need more detailed aux tracking.

v2 (Jason Ekstrand):
 - Make a few more things const
 - Use the anv_fast_clear_support enum

v3 (Jason Ekstrand):
 - Move an assert and add a better comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
151771b390 anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
bea7373c92 anv/image: Support color aspects in layout_to_aux_usage
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
b09464db42 anv/image: Add a helper for determining when fast clears are supported
v2 (Jason Ekstrand):
 - Return an enum instead of a boolean

v3 (Jason Ekstrand):
 - Return ANV_FAST_CLEAR_NONE instead of false (Topi)
 - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE
 - Add documentation for the enum values

v4 (Jason Ekstrand):
 - Remove a dead comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1f7eee6bc1 anv/image: Update a comment
This got lost in all of the aspect vs. plane rebasing of YCBCR.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
5c38ab8f07 anv/blorp: Rework HiZ ops to look like MCS and CCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1d473e26f2 anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image
If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old
behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what
we want for blits/copies.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
42f1668a54 anv/blorp: Rework image clear/resolve helpers
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op whatever that may be on CCS or MCS.  This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
482c24783e intel/isl: Codify AUX operations in an enum
Right now, we have different entrypoints and enums in blorp for these
different operations.  This provides us a central enum which we can
begin to transition to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Gert Wollny
c36172e387 r600/sb: Check whether optimizations would result in reladdr conflict
v2: * Check whether the node src and dst registers are NULL before using
      them.
    * fix a type in the commit message.

Two cases are handled with this patch:

1. If copy propagation tries to eliminated a move from a relative
   array access then it could optimize

     MOV R1, ARRAY[RELADDR_1]
     MOV R2, ARRAY[RELADDR_2]
     OP2 R3, R1 R2

   into

     OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2]

   which is forbidden, because there is only one address register available.

2. When MULADD(x,a,MUL(x,c)) is handled

      MUL TMP, R1, ARRAY[RELADDR_1]
      MULLADD R3, R1, ARRAY[RELADDR_2], TMP

   by folding this into

      ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1]
      MUL R3, R1, TMP

   which is also forbidden.

Test for these cases and reject the optimization if a forbidden combination
of relative access would be created.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 10:00:38 +10:00
Glenn Kennard
1d871aa626 r600g: Implement spilling of temp arrays (v2)
Pessimistically spills arrays if GPR limit is exceeded.

v2: fix r600 support [airlied]

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:26 +10:00
Dave Airlie
22fc5eff80 r600/sb: handle scratch mem reads on r600
On r600 we use the scratch mem with read/read_ind, in that case
sb should track the rw_gpr as a dst instead of a src.

This stops the whole shader being optimised out.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:21 +10:00
Glenn Kennard
cd34deb585 r600g/sb: Add dependency tracking for scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:19 +10:00
Glenn Kennard
a100d906b2 r600g/sb: Support scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:16 +10:00
Glenn Kennard
6b4303f358 r600g: Implement scratch buffer state management (v2)
v2: add Glenn's fixes

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:12 +10:00
Glenn Kennard
9d31596d7a r600g: Add pending output function
Spills have to happen after the VLIW bundle currently
processed, so defer emitting the spill op.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:53:08 +10:00
Glenn Kennard
9c48a139b0 r600g: Support emitting scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:52:48 +10:00
Dave Airlie
2a891ed190 r600: fix texture gather swizzling.
This fixes:
KHR-GL45.texture_gather.swizzle
on cayman and redwood.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-09 09:32:20 +10:00
Timothy Arceri
12a2350e6d ac: add 64bit support to ac_find_lsb()
v2: use LLVMBuildTrunc()

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Timothy Arceri
a9f6b392c7 ac: move get_elem_bits() to ac_llvm_build.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Timothy Arceri
19f9839f0b ac: add 64bit bitCount support
v2: use LLVMBuildTrunc()

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Samuel Pitoiset
bb750d265c ac/nir: clean up handle_fs_outputs_post()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:33 +01:00
Samuel Pitoiset
528bc14fa5 ac/nir: add radv_load_output() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:30 +01:00
Samuel Pitoiset
834d9845ca ac/shader: scan info about output PS declarations
NIR->LLVM should only be a translation pass, and all scan stuff
should be done before.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:27 +01:00
Samuel Pitoiset
a8e04e91de ac/nir: add radv_export_param() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:26 +01:00
Samuel Pitoiset
e3cfd6b805 ac/nir: remove set but unused export_mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:24 +01:00
Samuel Pitoiset
724136d590 ac/nir: remove dead code in handle_vs_outputs_post()
The memcpy can't be reached because the condition is always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:22 +01:00
Samuel Pitoiset
c63d8d0284 ac/nir: remove useless check in si_llvm_init_export_args()
values can't be NULL because we use ac_build_export_null() now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:14:20 +01:00
Samuel Pitoiset
26ab5a4269 ac/nir: use ac_build_export_null()
The number of enabled channels should be 0 when exporting null.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 22:11:44 +01:00
Samuel Pitoiset
bd9f7b7635 ac: add ac_build_export_null() helper
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 22:11:42 +01:00
Scott D Phillips
1f4d2433e7 meson: Add build option for tools
Add a build option to control building some of the misc tools we
have. Also set the executables to install, presumably you want
that if you're asking for the build.

v2: set 'install:' to the with_tools value, not true (Jordan)
    handle 'all' in a the comma list (Dylan)
    Add freedreno's tools (Dylan)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-08 11:24:42 -08:00
Anuj Phogat
464d057c86 intel: Add Coffee Lake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-02-08 10:26:34 -08:00
Brian Paul
11e92889aa gallium/util: silence clang warning in blitter code
Silence "warning: comparison of constant 4294967295 with expression
of type 'ubyte'".

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-08 10:27:31 -07:00
Brian Paul
4b0a45da25 tgsi: s/unsigned/enum tgsi_semantic/ in ureg_DECL_output()
So the function matches the prototype.  Found with clang.
v2: fix copy&paste error

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-08 10:27:19 -07:00
Brian Paul
d95c2d86cc tgsi: use TGSI_INTERPOLATE_x arguments instead of zeros in ureg code
TGSI_INTERPOLATE_CONSTANT and TGSI_INTERPOLATE_LOC_CENTER have the
value zero so there's no change in behavior.  It seems funny to
declare these fs input registers with constant interpolation.  But
it looks like ureg_DECL_input_layout() is not called anywhere and
ureg_DECL_input() is only called from
util_make_geometry_passthrough_shader().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
26948ba761 gallium/util: s/uint/enum tgsi_semantic/ in simple shader code
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
0f40f4ffda tgsi: s/unsigned/enum pipe_shader_type/ in ureg code
And add a default switch case to silence a compiler warning.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
c0dc337ecd gallium/util: s/uint/enum tgsi_semantic/ in u_blitter.c
And put static qualifier on const arrays.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
e55de6e20c st/mesa: s/unsigned/enum tgsi_semantic/ st_cb_drawpixels.c
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
b9ff185e41 vbo: add a comment on vbo_draw_transform_feedback()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
93b3d38176 gallium/util: trivial whitespace/formatting fixes in u_blit.c
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
5396f8546a vbo: improve comments on vbo_draw_func()
And rename a parameter name.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
b03ade55b9 cso: add a couple sanity check assertions in cso_draw_vbo()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Brian Paul
5cf342704d st/mesa: rename some vars related to indirect draw count
'indirect_params' was a bit vague.  Use the names that we use in
gallium's pipe_draw_indirect_info.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-08 09:49:03 -07:00
Marek Olšák
d9e6e0bbe3 st/mesa: remove out_num_textures from update_textures
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-08 16:14:11 +01:00
Marek Olšák
08496c5d52 st/mesa: don't store non-fragment sampler states and views in st_context
those are unused.

st_context: 10120 -> 3704 bytes

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-08 16:14:11 +01:00
Lionel Landwerlin
e843667733 i965: perf: cleanup detection of kernel support for loadable configs
The initial revision of the patch adding loadable configs was testing
the feature's availability by adding a new config successfully and
then removing it.

A second version tested the availability just by exercising the
removal. But some unused code remained.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-08 10:52:14 +00:00
Lionel Landwerlin
bd6c0cab60 i965: perf: use drmIoctl() instead of ioctl()
ioctl() might be interrupted, use drmIoctl() instead as it'll retry
automatically.

Fixes: 27ee83eaf7 "i965: perf: add support for userspace configurations"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2018-02-08 10:51:40 +00:00
Lionel Landwerlin
0f952b778f i965: perf: add debug messages for loaded configs
This helps figuring out potential problems when metrics don't show up
on frameretrace for example.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-08 10:51:01 +00:00
Dave Airlie
3f7a7bd897 r600: implement tg4 integer workaround. (v2)
This ports the texture gather integer workaround from radeonsi.

This fixes:
KHR-GL45.texture_gather.plain-gather-uint/int*

v2: add rect support, fix 2d array shadow
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (on irc)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 16:21:40 +10:00
Glenn Kennard
77b1b33724 r600: clean up initial shader register setup
This is taken from Glenn Kennards scratch series, but separated
out as a cleanup by me.

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 16:21:35 +10:00
Roland Scheidegger
b936f4d1ca r600: partly fix sampleMaskIn value
The hw gives us coverage for pixel, not for individual fragment shader
invocations, in case execution isn't per pixel (eg, unlike cm, actually
cannot do "real" minSampleShading, it's either per-pixel or per-fragment,
but it doesn't really make a difference here).
Also, with msaa disabled, the hw still gives us a mask corresponding to
the number of samples, where GL requires this to be 1.
Fix this up by masking the sampleMaskIn bits with the bit corresponding to
the sampleID, if we know this shader is always executed at per-sample
granularity. (In case of a per-sample frequency shader and msaa disabled,
the sampleID will always be 0, so this works just fine there.)
Fixing this for the minSampleShading case will need a shader key (radeonsi
uses the prolog part for) (for eg, could get away with a single bit, cm
would need more bits depending on sample/invocation ratio, or read the
bits from a uniform), unless we'd want to always use a sample mask uniform
(which is probably not a good idea, as it would make the ordinary common
msaa case slower for no good reason).
This fixes some parts of piglit arb_sample_shading-samplemask (with fixed
test), in particular those which use a sampleID, still failing others
as expected.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
07d724326a r600: clean up fragment shader input scan code
For some reason, we were iterating through the code twice (first just for
instructions needing barycentrics, then for instructions and input dcls).
Move things around slightly so this is no longer necessary.
There also was a unnedeed enabling of the fixed_pt_position_gpr - this is only
needed if the per-sample interpolation comes from an input, not from an
instruction (just move the assert where it belongs) (since the sample id to
sample from comes from a tgsi src in this case, and isn't sampleID).
Otherwise there should be no functional change.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
6fd3c39590 mesa: (trivial) remove unused ignore_sample_qualifier_parameter
This parameter for _mesa_get_min_incations_per_fragment() was once used
by the intel driver, but it's long gone.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@vmware.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
becc7faae2 r600/cm: (trivial) code cleanup for emitting msaa state
No functional change (compile tested only).

Reviewed-by: Dave Airlie <airlied@redhate.com>
2018-02-08 04:07:52 +01:00
Brian Paul
b99cb13002 tgsi: use tgsi_semantic enum type in ureg code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:43:01 -07:00
Brian Paul
174f3a4ab7 st/mesa: use tgsi_semantic enum type
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:43:01 -07:00
Brian Paul
0f7be4fc16 tgsi: use TGSI enum types in ureg code
v2: fix enum tgsi_interpolate_mode/loc typo.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:42:39 -07:00
Brian Paul
9f9ce1625f st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Brian Paul
6321b1bd40 gallium/util: replace uint with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Brian Paul
15874338ff gallium/util: replace unsigned with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Fredrik Höglund
5a38d8f103 radv: implement VK_EXT_external_memory_host
Ported from the radeonsi GL_AMD_pinned_memory implementation.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 00:46:07 +01:00
Dave Airlie
5dd385f378 r600: fix rendering regression on r6/7 gpus
Fixes: 2d5b5d267e (r600: work out target mask at framebuffer bind.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104989

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 09:37:09 +10:00
Grazvydas Ignotas
f91aa68ac6 radeonsi: avoid int-to-pointer-cast warnings on 32bit
I hope the actual dropping of MSB is ok, but that's what's already
happened before this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 01:13:58 +02:00
Grazvydas Ignotas
13ada91740 gallium/hud: update some query functions
It seems these were missed when struct pipe_context * argument was
added to hud_graph::query_new_value.

Fixes: 3132afdf4c "gallium/hud: pass pipe_context explicitly to most functions"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 01:12:07 +02:00
Roland Scheidegger
09f49b9e50 Revert "gallium: build ddebug, noop, rbug, trace as part of auxiliary"
This reverts commit 6f82b8d8d0.

This broke scons build, and reportedly clover with autotools/meson too.
2018-02-07 23:47:39 +01:00
Marek Olšák
6f82b8d8d0 gallium: build ddebug, noop, rbug, trace as part of auxiliary
Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU.
(gallium build time is reduced by 15% when building only radeonsi)

Non-recursive makefiles are great!
2018-02-07 22:08:34 +01:00
Roland Scheidegger
def09f8db0 u_blit: (trivial) fix bogus argument order for set_fragment_shader
Amazingly this still worked sometimes, albeit I'm not even sure why...
This fixes d7bec6f7a6.
2018-02-07 22:03:18 +01:00
Andres Rodriguez
83990dd529 mesa: fix incorrect type when allocating arrays
The array members are have type 'struct gl_buffer_object *'

Found by coverity.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-07 14:50:21 -05:00
Roland Scheidegger
d7bec6f7a6 u_blit,u_simple_shaders: add shader to convert from xrbias format
We need this to handle some oddball dx10 format
(DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM). What you can do with this
format is very limited, hence we don't want to add it as a gallium
format (we could not express the properties of this format as
ordinary format properties neither, so like all special formats
it would need specific code for handling it in any case).
While here, also nuke the array for different shaders for different
writemasks, as it was not actually used (always full masks are
passed in for generating shaders).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-07 17:09:37 +01:00
Roland Scheidegger
afd1e9be17 u_simple_shaders: fix mask handling in util_make_fragment_tex_shader_writemask
The writemask handling was busted, since writing defaults to output
meant they got overwritten by the tex sampling anyway. Albeit the
affected components were undefined, so maybe with some luck it
still would have worked with some drivers - if not could as well
kill it... (This would have affected u_blitter but not u_blit since
the latter always used xyzw mask.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-07 17:08:24 +01:00
Bas Nieuwenhuizen
5d754872b5 autotools: Only build libmesa-st-tests-common.a for tests.
We don't need the library if we don't build tests, and building
it adds a dependency on gtest which adds a dependency on cxxabi.h.

Fixes: 6569b33b6e "mesa/st/tests: unify MockCodeLine* classes"
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-02-07 14:04:04 +01:00
Tapani Pälli
9d322fde97 i965: add __DRI2_BLOB support and set cache functions
v2: adjust to change that moved cache from ctx to screen

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
ae00ef2702 disk cache: add callback functionality
v2: add disk_cache_has_key, disk_cache_put_key support
    using blob cache (Nicolai, Jordan)

v3: rename set_cb as put_cb to match existing naming (Timothy)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
6a651b6b77 disk cache: initialize cache path and index only when used
This patch makes disk_cache initialize path and index lazily so
that we can utilize disk_cache without a path using callback
functionality introduced by next patch.

v2: unmap mmap and destroy queue only if index_mmap exists

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
e8495646af glsl/tests: changes to test_disk_cache_create test
Next patch will allow disk_cache instance to be created without
path set for it, modify some test cases that assume disk_cache
creation to fail with invalid path. Creation should succeed but
simple put/get test fail.

v2: leave tests as is but check that both cache struct exists
    and try simple put/get that should fail with invalid path set
    (Emil)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
83c81b6cce glsl/tests: move utility functions in cache_test
Patch moves functions higher so that we can utilize them from
test_disk_cache_create which is modified by next patch.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
6f5b57093b egl: add support for EGL_ANDROID_blob_cache
v2: cleanup, move callbacks to _egl_display struct (Emil Velikov)
    adapt to earlier ctx->screen changes

v3: remove useless checking, add _eglSetFuncName (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
cf4569da6b dri: add interface for EGL_ANDROID_blob_cache extension
v2: move from __DRIcontext to __DRIscreen (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Samuel Pitoiset
757d36ee70 ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Ported from RadeonSI.

Only one F1 2017 shader is affected, code size decreased
from 532 to 488 on both Polaris10 and Vega10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:13 +01:00
Samuel Pitoiset
2f54d7382d ac/nir: avoid loading unused VS input components
Polaris10:
Totals from affected shaders:
SGPRS: 122840 -> 120984 (-1.51 %)
VGPRS: 78812 -> 78440 (-0.47 %)
Spilled SGPRs: 177 -> 129 (-27.12 %)
Code Size: 2950028 -> 2941276 (-0.30 %) bytes
Max Waves: 17899 -> 17976 (0.43 %)

Vega10:
Totals from affected shaders:
SGPRS: 117144 -> 115776 (-1.17 %)
VGPRS: 77580 -> 77532 (-0.06 %)
Spilled SGPRs: 0 -> 152 (0.00 %)
Code Size: 3352656 -> 3347860 (-0.14 %) bytes
Max Waves: 19756 -> 19866 (0.56 %)

This increases SGPRs spilling a bit with Talos, but I have
some other ideas that might reduce it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:09 +01:00
Samuel Pitoiset
1c57a6da5e ac/shader: scan vertex inputs usage mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:07 +01:00
Iago Toral Quiroga
f474b19875 i965: allocate a SGVS element when VertexID or InstanceID are read
Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS
to put these beyond the last vertex element it seems that we still
need to allocate the SVGS element, otherwise we have observed cases
where we end up reading garbage. Specifically, the CTS test mentioned
below was flaky with a fail rate of ~1% on some gen9+ platforms caused
by reading garbage for the gl_InstanceID value. The flakyness goes
away as soon as we start allocating the SVGS element.

v2:
  - Do this for gen8+, not just gen9+, and pull the boolean
    outside the #if block (Jason)

Fixes flaky test:
KHR-GL45.vertex_attrib_64bit.limits_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-07 11:11:16 +01:00
Dylan Baker
c74719cf4a glapi: fix check_table test for non-shared glapi with meson
v2: - Add glapitable_h generated source to requirements

Fixes: 3218056e0e ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-02-06 15:00:17 -08:00
Dylan Baker
002fbde71e glapi: Don't search through subdirs from glapitable.h
Because meson won't put it in that folder.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
aac3d01178 state_tracker: Don't build st-renumerate-test without shared glapi
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
0316aa432d glapi: remove APPLE extensions from test
Fixes: 7009955281 ("mesa: Remove GL_APPLE_vertex_array_object stubs")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
a4f1fc5dd1 glapi/check_table: Remove 'extern "C"' block
Using 'extern "C"' around includes is always incorrect, as the header may
contain C++ symbols (as it does in this case), which means it cannot use
C linkage. In this case the header has a template in it, which obviously
cannot be linked with C linkage rules.

Fixes: a29ad2b421 ("mesa/tests: Add tests for the generated dispatch table")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
105178db8f meson: fix test source name for static glapi
fixes: 43a6e84927 ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
9be7487f30 glapi: don't walk backwards for includes
Instead just set the proper -I flags and include it from a more standard
path. In this case we'll add -Isrc/mesa (which is common), and #include
main/foo.h.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Brian Paul
e7a4536e64 mesa: rename gl_vertex_array_object::_VertexAttrib -> _VertexArray
Since the type is gl_vertex_array.  Update comment to explain that
these arrays are only used by the VBO module.

Also rename some local variables in _mesa_update_vao_derived_arrays().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:36:47 -07:00
Brian Paul
d9ab39ea65 mesa: minor whitespace fixes, line wrapping in texcompress.c
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:23:26 -07:00
Brian Paul
b38196b452 mesa: simplify _mesa_get_compressed_formats()
Instead of testing for formats==NULL everywhere, just point formats at
a dummy array which will be discarded.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:23:26 -07:00
Vlad Golovkin
d919ff0f27 util: remove redundant check for the __clang__ macro
Clang defines __GNUC__ macro, so one doesn't need to check __clang__
macro in this particular case.

v2: added comment as per Brian Paul's suggestion

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 15:23:26 -07:00
Brian Paul
77bc74e674 st/mesa: use st_access_flags_to_transfer_flags() helper in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Brian Paul
1852a2e1a2 st/mesa: refactor st_bufferobj_map_range()
Use a new helper function, st_access_flags_to_transfer_flags(), to
convert the GL_MAP_x flags to PIPE_TRANSFER_x flags.

We'll be able to use this function in a couple other places.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Brian Paul
8a32dd2ec9 st/mesa: refactor bufferobj_data()
Split out some of the code into three new helper functions:
buffer_target_to_bind_flags(), storage_flags_to_buffer_flags(),
buffer_usage() to make the code more managable.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Samuel Pitoiset
3488a3f033 radv: run nir_opt_shrink_load
LLVM can't shrink loads.

Polaris10:
Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Code Size: 1355504 -> 1355172 (-0.02 %) bytes
Max Waves: 11710 -> 11670 (-0.34 %)

Vega10:
Totals from affected shaders:
SGPRS: 51448 -> 50371 (-2.09 %)
VGPRS: 39140 -> 39048 (-0.24 %)
Spilled SGPRs: 16 -> 16 (0.00 %)
Code Size: 1307188 -> 1304296 (-0.22 %) bytes
Max Waves: 11312 -> 11292 (-0.18 %)

This reduces SGPRs spilling in MadMax, and it also reduces
number of SGPRs in DOW3 and F12017. The number of waves slightly
decreases in F1 but I don't see any performance changes after
benchmarking it. Talos and Serious Sam are not affected because
they don't use any push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:44 +01:00
Samuel Pitoiset
e68562b94b nir: add nir_opt_shrink_load pass
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:39 +01:00
Timothy Arceri
e2ea9e1191 radeonsi/nir: add nir support for compiling compute shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
9c52902c76 ac/radeonsi: add num_work_groups to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
f12e2f9c12 ac: implement nir_intrinsic_shader_clock
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
b7b89bbddb ac/radeonsi: create ac_build_shader_clock() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
d116af383f ac/radeonsi: add load_local_group_size() to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
f6932d1ef3 radeonsi: add get_block_size() helper
This will be reused by the nir backend in a later patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
e3ebffdbb0 ac: don't call emit_outputs() for compute
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
c8066cdfa7 ac/radeonsi: add local_invocation_ids to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
fa5239c153 ac/radeonsi: add workgroup_ids to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
64c10c9737 radeonsi/nir: gather some compute info in si_nir_scan_shader()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Timothy Arceri
1142b1d3e1 radeonsi/nir: always set input_usage_mask as using all components
This fixes a regression for now, in the future we should gather
the used components properly.

V2: just set for VS and correctly handle doubles

Fixes: be973ed21f "radeonsi: load the right number of components for VS inputs and TBOs"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:38:52 +11:00
Timothy Arceri
ffeebcfa7e i965: remove unused brw_nir_lower_cs_shared()
This has been unused since 8761a04d0d.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-02-07 08:38:01 +11:00
Bas Nieuwenhuizen
a3e42e7a69 vulkan/wsi: Fix OOM behavior with prime images.
Fixes: d50937f137 "vulkan/wsi: Implement prime in a completely generic way"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-06 21:52:39 +01:00
Bas Nieuwenhuizen
c7d640fbbf ac/nir: fix GS load input type.
Fixes: df1d5174fc "ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-06 21:52:38 +01:00
Mathias Fröhlich
e8a9473d32 mesa: Factor out _mesa_disable_vertex_array_attrib.
And use it in the enable code path.
Move _mesa_update_attribute_map_mode into its only remaining file.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
Mathias Fröhlich
236657842b vbo: Move vbo_rebase into its only caller module tnl.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
Mathias Fröhlich
2313c33e95 mesa: Use atomics for buffer objects reference counts.
The mutex is currently used for reference counting and updating
the minmax index cache.
The change uses atomics directly for reference counting and
the mutex for the minmax cache.
This is safe since the reference count is not modified beside
in _mesa_reference_buffer_object where atomics aim to be used.
While using the minmax cache, the calling code holds a reference
to the buffer object. Thus unreferencing or even referencing the
buffer object does not need to be serialized with accessing
the minmax cache.
The change reduces the time _mesa_reference_buffer_object_ takes
by about a factor of two when looking at perf results for some
of my favorite use cases.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 21:20:14 +01:00
Dave Airlie
6c691081a1 r600: fixup sparse color exports.
If we have gaps in the shader mask we have to have 0x1 in them
according to a comment in radeonsi, and this is required to fix
the test at least on cayman.

We also need to record the highest one written to write to the
ps exports reg.

This fixes:
KHR-GL45.enhanced_layouts.fragment_data_location_api

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:59 +10:00
Dave Airlie
2d5b5d267e r600: work out target mask at framebuffer bind.
If we only get 1,2,3,6 framebuffers we want a sparse target mask.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:55 +10:00
Dave Airlie
5b14e06d8b r600: work out shader export mask at shader build time (v1.1)
Since enhanced layouts allows setting specific MRT outputs, we
can get sparse outputs, so we have to calculate the shader
mask earlier.

v1.1: update checks for state update (Roland)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:16:27 +10:00
Dave Airlie
f292eceae1 r600: fix xfb stream check.
This fixes:
KHR-GL45.enhanced_layouts.xfb_vertex_streams

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
680cb9898a r600/compute: add render cond support.
Set render cond and emit atom.

Fixes:
KHR-GL45.compute_shader.conditional-dispatching

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
5fd7b282b3 r600: fix not-very indirect compute
We need to get the grid sizes earlier to fill in to the const
buffer.

Fixes:
KHR-GL45.compute_shader.built-in-variables
and
KHR-GL45.compute_shader.dispatch-indirect

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
00a112641b r600: overhaul buffer resource query.
This cleans up and fixes the previous fix even more.

Buffers from textures start at max const,
buffers from buffers/images come in from the 168 offset.

This fixes a bunch of:
KHR-GL45.shader_storage_buffer_object*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
736b150768 r600/eg: fix buffer sizing.
For buffers we want the size in bytes,
For images we want it in elements.

This fixes:
KHR-GL45.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
c9c4f0b722 r600/images: set offset for compute shaders with number of declared samplers
for frag shaders we get a value in the key, I expect I need
to make compute work better

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
ab5cee4c24 r600/compute: only mark buffer/image state dirty for fragment shaders
The compute emission path always emits this currently, and emitting
it on the fragment path breaks the blitter.

This fixes gpu hangs in KHR-GL45.compute_shader.resource-texture

Reviewed-by: Roland Scheidegger <sorland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:12 +10:00
Dave Airlie
4e3b43f180 r600/atomic: fix ATOMCAS instruction.
This has 4 srcs.

This fixes:
KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
8bdad9fa1f r600/sb/cayman: fix indirect ubo access on cayman
With sb enabled on cayman, this was overwriting the proper
cf index value with random ones if the dst gpr was 2 or 3,
only save the value for a MOVA instruction.

Fixes:
KHR-GL45.gpu_shader5.uniform_blocks_array_indexing
(on cayman with sb)

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
012100b809 r600/eg: use texture target to pick array size not view target (v2)
This fixes a few CTS cases in :
KHR-GL45.texture_view.view_sampling

some multisample cases are still broken, but not sure this is
the same problem.

v2: fix more cases

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-07 06:08:11 +10:00
Dave Airlie
e7e81f362d radv: don't support tc-compat on multisample d32s8 at all.
RX550 fails
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2

So increase the range of the workaround.

Fixes: f4c534ef6 (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1))

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-06 19:56:00 +00:00
Michal Navratil
4081e08896 winsys/amdgpu: allow non page-aligned size bo creation from pointer
Fix INVALID_OPERATION caused by BufferData with target
EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD when the buffer size is
not page aligned.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-06 18:51:12 +01:00
Jon Turney
9440599c8e meson: ensure xmlpool/options.h is generated for libgallium
In file included from ../src/gallium/targets/dri/target.c:1:
In file included from ../src/gallium/auxiliary/target-helpers/drm_helper.h:8:
../src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found

See also 26bde1e3.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:56:12 +00:00
Andres Gomez
1ec88755c2 vbo: provide 64bits support to print_draw_arrays
Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:29 +02:00
Andres Gomez
0057ae4038 vbo: take into account the size when printing VAO elements
When using print_draw_arrays for debugging, we were printing an "n"
amount of vertex but that meant not to print all the size in the "n"
vertex, depending on the stride used.

Now we print the whole size in the "n" vertex.

Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:23 +02:00
Andres Gomez
c9325b4fa9 vbo: print first element of the VAO when the binding stride is 0
Cc: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:30:12 +02:00
Iago Toral Quiroga
a5053ba27e anv/device: initialize the list of enabled extensions properly
The loop goes through the list of enabled extensions marking them as
enabled in the list, but this relies on every other extension being
initialized to false by default.

This bug would make us, for example, advertise certain device extension
entry points as available even when the corresponding extensions had
not been enabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: abc62282b5 "anv: Add a per-device table of enabled extensions"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-02-06 07:51:00 +01:00
Iago Toral Quiroga
ef439a4fdc spirv: split constant initializers on in/out structs
The SPIR-V parser splits in/out struct variables and creates
a separate variable for each first-level member of the struct.
When the struct variable has an initializer this means that we also
need to split the initializer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-06 07:50:18 +01:00
Iago Toral Quiroga
1d20001d97 i965/nir: do int64 lowering before optimization
Otherwise loop unrolling will fail to see the actual cost of
the unrolling operations when the loop body contains 64-bit integer
instructions, and very specially when the divmod64 lowering applies,
since its lowering is quite expensive.

Without this change, some in-development CTS tests for int64
get stuck forever trying to register allocate a shader with
over 50K SSA values. The large number of SSA values is the result
of NIR first unrolling multiple seemingly simple loops that involve
int64 instructions, only to then lower these instructions to produce
a massive pile of code (due to the divmod64 lowering in the unrolled
instructions).

With this change, loop unrolling will see the loops with the int64
code already lowered and will realize that it is too expensive to
unroll.

v2: Run nir_algebraic first so we can hopefully get rid of some of
    the int64 instructions before we even attempt to lower them.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-06 07:49:27 +01:00
Ilia Mirkin
02a6d901ee mesa: add OES_EGL_image_external_essl3 support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-06 07:28:11 +02:00
Vinson Lee
fe32f796f2 r600/fp64: Fix build.
CC       r600_shader.lo
r600_shader.c: In function ‘egcm_int_to_double’:
r600_shader.c:4543:12: error: ‘ctx’ is a pointer; did you mean to use ‘->’?
     if (ctx.bc->chip_class == CAYMAN)
            ^
            ->

Fixes: 35b4301577 ("r600/fp64: fix integer->double conversion")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 15:32:20 -08:00
Dave Airlie
35b4301577 r600/fp64: fix integer->double conversion
Doing a straight uint/int->fp32->fp64 conversion causes
some precision issues, Roland suggested splitting the
integer into two portions and doing two separate
int->fp32->fp64 conversions then adding the results.

This passes the tests in CTS and piglit.

[airlied: fix cypress conversion opcodes]
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-06 08:21:48 +10:00
Samuel Pitoiset
0170ae1e23 ac/nir: remove emission of nir_op_fdiv
RadeonSI and RADV lower fdiv.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-05 23:09:34 +01:00
Jon Turney
b5af199f92 travis: add macOS meson build
v2: Simplify set of options now we have better defaults

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:42:01 +00:00
Jon Turney
80bc41b2ec meson: osx ld doesn't support --build-id
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 19:40:43 +00:00
Jon Turney
ea8730024f meson: build src/glx/apple
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:40:43 +00:00
Dylan Baker
569628dd24 meson: set apple glx defines
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-05 19:40:43 +00:00
Jon Turney
4772909447 meson: better defaults for osx, windows and cygwin
set suitable defaults for 'dri-drivers', 'gallium-drivers', 'vulkan-drivers'
and 'platforms' options for osx, windows and cygwin, adding cygwin where
appropriate.

v2: error() for unknown OS

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-05 19:34:37 +00:00
Matt Turner
e2b31e9acf i965: Move mistakenly placed line
Ken called this out in review, but it seems I forgot to make the change.
I noticed that the control flow annotations in the fragment shader
disassembly of tests/shaders/glsl-fs-loop-continue.shader_test were not
correct, and moving this line to the correct place fixes it.
2018-02-05 09:50:56 -08:00
Juan A. Suarez Romero
4195eed961 glsl/linker: check same name is not used in block and outside
According with OpenGL GLSL 3.20 spec, section 4.3.9:

  "It is a link-time error if any particular shader interface
   contains:
     - two different blocks, each having no instance name, and each
       having a member of the same name, or
     - a variable outside a block, and a block with no instance name,
       where the variable has the same name as a member in the block."

This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the
same name in unnamed block and outside") that covered this case, but
did not take in account that precision qualifiers are ignored when
comparing blocks with no instance name.

With this commit, the original tests
KHR-GL*.shaders.uniform_block.common.name_matching keep fixed, and also
dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision
regression is fixed, which was broken by previous commit.

v2: use helper varibles (Matteo Bruni)

Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777
CC: Mark Janes <mark.a.janes@intel.com>
CC: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-02-05 18:10:43 +01:00
Juan A. Suarez Romero
3d14e72057 mesa: enable ASTC format for CompressedTexSubImage3D
If extensions GL_KHR_texture_compression_astc_hdr or
GL_KHR_texture_compression_astc_sliced_3d are implemented then ASTC
format are supported in CompressedTex*Îmage3D.

Fixes KHR-GLES2.texture_3d.* with this format.

CC: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-02-05 17:00:19 +01:00
Stephan Gerhold
02e2009b92 util/build-id: Fix address comparison for binaries with LOAD vaddr > 0
build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD
segment has a virtual address other than 0x0.

For most shared libraries, the first LOAD segment has vaddr=0x0:

    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    LOAD           0x000000 0x00000000 0x00000000 0x2d2e26 0x2d2e26 R E 0x1000
    LOAD           0x2d2e54 0x002d3e54 0x002d3e54 0x2e248 0x2f148 RW  0x1000

However, compiling the Intel Vulkan driver as 32-bit binary on Android produces
the following ELF header with vaddr=0x8000 instead:

    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    PHDR           0x000034 0x00008034 0x00008034 0x00100 0x00100 R   0x4
    LOAD           0x000000 0x00008000 0x00008000 0x224a04 0x224a04 R E 0x1000
    LOAD           0x225710 0x0022e710 0x0022e710 0x25988 0x27364 RW  0x1000

build_id_find_nhdr_callback() compares the address of dli_fbase from dladdr()
and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a
different memory address, e.g.:

    dli_fbase=0xd8395000 (offset 0x8000)
    dlpi_addr=0xd838d000

At least on glibc and bionic (Android) dli_fbase refers to the address where
the shared object is mapped into the process space, whereas dlpi_addr is just
the base address for the vaddrs declared in the ELF header.

To compare them correctly, we need to calculate the start of the mapping
by adding the vaddr of the first LOAD segment to the base address.

Note: musl users will need the following patch.
https://git.musl-libc.org/cgit/musl/commit/?id=b3ae7beabb9f0c219bb8a8b63567a01c6530c1ac

Cc: Chad Versace <chadversary@chromium.org>
Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642
Fixes: 5c98d38 "util: Query build-id by symbol address, not library name"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-05 14:26:33 +00:00
Boyuan Zhang
d645b0850a radeonsi: enable vcn encode for HEVC main
Enable vcn encode for HEVC main profile on Raven.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
5534a2791f st/va: implement HEVC encode functions
Implement HEVC encode functions based on VAAPI HEVC encode interface.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
9ac50a2e0c st/va: add HEVC encode functions
Add a separate file for HEVC encode functions.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
66087d8a2d st/va: enable dual instances encode only for H264
Logics that related to dual instances encode should only be done for
H264, not other codecs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
a9c0861c6c st/va: add entrypoint check for HEVC
Add entrypoint check for HEVC to differentiate decode and encode jobs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
ecc3944344 st/va: add HEVC picture desc
Add HEVC picture desc, and add codec check when creating and destroying
context.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
9393b53c29 st/va: move H264 enc functions into separate file
Move all H264 encode related functions into separate file. Similar to
VAAPI decode side, there will be separate file for each codec on encode
side as well.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
b391d34916 radeon/vcn: add header implementations for HEVC
Implement encoding of sps, pps, vps, aud, and slice headers for HEVC
based on HEVC specs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
fdc952b320 radeon/vcn: add ib implementations for HEVC
Implement required ibs for vcn HEVC encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
5ab73edddb radeon/vcn: support picture parameters for HEVC
Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that
it can be used for different codecs. Add functions to handle picture
parameters that will be used for HEVC encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
db67d04df3 radeon/vcn: add vcn encode interface for HEVC
Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic
to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Boyuan Zhang
f410936439 vl: add parameters for HEVC encode
Add HEVC encode interface

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2018-02-05 09:16:18 -05:00
Eric Anholt
aa2f609f70 broadcom/vc5: Ignore samplers for finding uniform offsets.
Fixes:
KHR-GLES3.shaders.struct.uniform.sampler_array_fragment
KHR-GLES3.shaders.struct.uniform.sampler_array_vertex
KHR-GLES3.shaders.struct.uniform.sampler_nested_fragment
KHR-GLES3.shaders.struct.uniform.sampler_nested_vertex
2018-02-05 13:56:02 +00:00
Eric Anholt
63a8a0f3c0 broadcom/vc5: Fix non-mipfiltered sampling.
We need to clamp the LOD to 0 if mip filtering is disabled.  This is part
of fixing KHR-GLES3.shaders.struct.uniform.sampler_array_fragment.
2018-02-05 13:53:38 +00:00
Eric Anholt
e29988c908 broadcom/vc5: Fix "hardwrae" typo in a field name in XML. 2018-02-05 13:53:38 +00:00
Samuel Pitoiset
a1d568c830 ac/nir: fix a crash in load_gs_input() on pre-GFX9 chips
Fixes: df1d5174fc ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-05 11:05:52 +01:00
Eric Anholt
8bb000f460 broadcom/vc5: Try to merge more than 2 QPU instructions together.
Obviously it would be good to have an ADD and a MUL and a signal together,
but we can even potentially have multiple signals merged, as well.

total instructions in shared programs: 100423 -> 97874 (-2.54%)
instructions in affected programs:     78812 -> 76263 (-3.23%)
2018-02-05 09:29:37 +00:00
Eric Anholt
dc78643ace broadcom/vc5: Remove no-op MOVs after register allocation.
We emit some MOVs to track lifetimes of payload registers, but we don't
need there to be actual MOV instructions for them.

total instructions in shared programs: 101045 -> 100423 (-0.62%)
instructions in affected programs:     37083 -> 36461 (-1.68%)
2018-02-05 09:29:37 +00:00
Eric Anholt
f3978a7380 broadcom/vc5: Add missing shader-db instruction counting.
I must have misplaced it in the instruction packing rework.
2018-02-05 09:29:37 +00:00
Dave Airlie
7801425028 r600: fix resq for buffer images.
If this is an image buffer, we need to calculate the correct resource
id.

Fixes:
KHR-GL45.shader_image_size.*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-05 05:15:41 +10:00
Dave Airlie
6c1432f0be r600/eg: fix cube map array buffer images.
This fixes a crash in:
KHR-GL45.texture_cube_map_array.texture_size_compute_sh.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-05 05:14:56 +10:00
Marek Olšák
af3685d149 mesa: change ctx->Color.ColorMask into a 32-bit bitmask
4 bits per draw buffer, 8 draw buffers in total --> 32 bits.

This is easier to work with.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-04 01:50:10 +01:00
Jordan Justen
83e60ce927 i965: Create new program cache bo when clearing the program cache
When the disk shader cache CI testing was enabled, we started noticing
occasional failures on deqp test runs. (Mainly SNB, rarely HSW)

Before this change, when we cleared the (in memory) program cache we
reused the same bo. Since the disk shader cache quickly restores
programs, it appears that this would lead to overwrites of the older
program binaries in the in memory program cache that apparently were
still executing in some cases. If these programs were still executing,
this could cause a GPU hang.

This issue is probably not disk shader cache specific, but may have
been hidden due to the compiler taking time to recompile programs
after the cache was cleared.

v2:
 * Don't add `copy` param to brw_cache_new_bo (Ken)
 * Call from brw_program_cache_check_size (Ken)

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-03 12:16:58 -08:00
Jason Ekstrand
589e9db23f aubinator: Multiply count by 4 to compute buffer sizes
The count field is in terms of dwords and not bytes.  In
7d4007d58a, I fixed one instance
of this but missed another.
2018-02-02 22:30:56 -08:00
Eric Anholt
2e746bc63d broadcom/vc5: Enable UIF XOR on textures.
This should increase performance by reducing SDRAM bank conflicts when
crossing between UIF columns (particularly on power-of-two height
textures).

The uif_xor_disable setup is dropped, since we need to allow XOR on lower
miplevels even when level 0 is XOR.  The level 0 force UIF and level 0 XOR
flags should handle setting XOR properly on imported buffers.
2018-02-02 16:50:02 -08:00
Eric Anholt
6a862b0de7 broadcom/vc5: Fix alignment of miplevel 1 with UIF.
The alignment here means that we can't get back the padded height from the
size/stride any more, so it's now a field in the slice as well.

Fixes piglit fbo-generatemipmap-formats RGBA16 NPOT.
2018-02-02 16:27:49 -08:00
Eric Anholt
5c57e0a549 broadcom/vc5: Switch our RGBA4 support to the new gallium format.
Fixes fbo-generatemipmap-formats, fbo-alphatest-formats, etc. tests for
GL_RGBA4, GL_RGB4, GL_RGBA2, etc.
2018-02-02 16:27:49 -08:00
Eric Anholt
2a97f1d3ef gallium: Add a new A4B4G4R4 pipe format for Broadcom.
The VC5 HW puts A in the low bits and R in the high bits.  We can't just
swizzle in the shaders because the blending HW can't pick what channel A
is in, so make a new format to match it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-02 16:27:49 -08:00
Eric Anholt
1429cd74c2 mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.
swapBytes operates on bytes, not 4-bit channels, so you can't just take
non-swapBytes cases and flip the REV flag.

Avoids piglit texture-packed-formats regressions when enabling the
ABGR4444 format.

Fixes: c5a5c9a7db ("mesa/formats: add new mesa formats and their pack/unpack functions.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-02 16:27:49 -08:00
George Kyriazis
bbef9474fa meson/swr: Updated copyright dates
cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-02 17:43:07 -06:00
George Kyriazis
16bf813830 meson/swr: re-shuffle generated files
Move generated files from codegen/meson.build to other directories, in order
to satisfy generated include file dependencies

Add correct file lists for architecture-specific libraries.

cc: mesa-stable@lists.freedesktop.org
cc: dylan@pnwbakers.com

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-02 17:43:00 -06:00
Marek Olšák
3bf1e036e8 amd: remove support for LLVM 3.9
Only these are supported:
- LLVM 4.0
- LLVM 5.0
- LLVM 6.0
- master (7.0)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 23:47:40 +01:00
Dylan Baker
c75a4e5b46 meson: Check for actual LLVM required versions
Currently we always check for 3.9.0, which is pretty safe since
everything except radv work with >= 3.9 and 3.9 is pretty old at this
point. However, radv actually requires 4.0, and there is a patch for
radeonsi to do the same.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 13:22:58 -08:00
Dylan Baker
d7235ef83b meson: Don't confuse the install and search paths for dri drivers
Currently there is not a separate option for setting the search path of
DRI drivers in meson, like there is in scons and autotools. This is an
oversight and needs to be fixed. This adds an extra option
`dri-search-path`, which will default to the value of
`dri-drivers-path`, like autotools does.

v2: - Split input list before joining.
v3: - use : instead of ; as the delimiter. The autotools help string
      incorrectly says ; but the code uses :
v4: - Take list in pre : delimited form (Ilia)
    - Ensure that the dri-search-path is absolute when using
      dri_drivers_path

Fixes: db9788420d ("meson: Add support for configuring dri drivers directory.")
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)
2018-02-02 11:01:42 -08:00
Marek Olšák
847d0a393d radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-02 16:46:22 +01:00
Jon Turney
b3a1d9588e travis: add osx autotools build
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
4701379d96 travis: pip -> pip2
On travis, for OSX, python2 from homebrew is pre-installed. per [1]:

 python points to the macOS system Python (with no manual PATH modification)
 python2 points to Homebrew’s Python 2.7.x (if installed)
 python3 points to Homebrew’s Python 3.x (if installed)
 pip doesn't exist
 pip2 points to Homebrew’s Python 2.7.x’s pip (if installed)
 pip3 points to Homebrew’s Python 3.x’s pip (if installed)

We will end up using 'python2' for building mesa.

Just use 'pip2' instead of 'pip', as that seems to work for all platforms on
travis.

[1] https://docs.brew.sh/Homebrew-and-Python.html

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
7d1ec6d6a9 travis: conditionalize building of prerequisites on if OS=linux
Use a '|' YAML literal block to avoid the convoluted syntax needed to put
the entire conditional on a single line.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Jon Turney
63041ba613 glx/test: fix building for osx
An additional stub for applegl_create_context() is needed
Cannot test indirect API as it's not built on osx, currently

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-02 15:28:52 +00:00
Andres Gomez
4761a8fea6 i965: check if upload is 0 explicitely, when downsizing a format
downsize_format_if_needed takes an integer as number of uploads
parameter. Hence, let's do an integer comparation instead of a boolean
check, since that is confusing.

Since we are at it, fix a couple of wrongly tabbed indents.

Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-02-02 16:32:30 +02:00
Marek Olšák
51d36f5e02 mesa: don't flag _NEW_COLOR for KHR adv.blend if prog constant doesn't change
This only affects drivers that set DriverFlags.NewBlend.

v2: - fix typo advanded -> advanced
    - return "enum gl_advanced_blend_mode" from
      _mesa_get_advanced_blend_sh_constant
    - don't call FLUSH_VERTICES twice

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-02 15:06:47 +01:00
Samuel Pitoiset
df1d5174fc ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load
The old one generates useless instructions in there, found while
comparing geometry shaders between RadeonSI and RADV.

This improves all Vulkan demos that use geometry shaders, +4%
for deferredshadows, +9% for viewportarray, +7% for
geometryshader on Polaris10.

This seems to also improve DOW3 a little bit (+1%).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by:  Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 12:32:21 +01:00
Dave Airlie
f9c121c420 r600/eg: add crap indirect compute support.
I think the cp packets can be made work, but I think it might
need a kernel change, so for now just do the worst thing.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-02 16:50:18 +10:00
Jason Ekstrand
2f7205be47 i965: Call prepare_external after implicit window-system MSAA resolves
This fixes some rendering corruption in a couple of Android apps that
use window-system MSAA.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-01 21:45:25 -08:00
Roland Scheidegger
c2f0e08857 r600: don't do stack workarounds for hemlock
By the looks of it it seems hemlock is treated separately to cypress, but
certainly it won't need the stack workarounds cedar/redwood (and
seemingly every other eg chip except cypress/juniper) need.
(Discovered by accident.)

Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-02 01:46:43 +01:00
Dave Airlie
8fa5aade43 r600: initial attempt at gl_HelperInvocation (v3)
This passes the CTS and piglit tests.

This also disable sb for helper invocations until it doesn't
mess up the VPM flags.

Thanks to Ilia and Glenn for advice, and Roland for working
out the working evergreen path.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-02 09:46:05 +10:00
Bas Nieuwenhuizen
2ffe395cba radv: Don't expose VK_KHX_multiview on android.
deqp does not allow any KHX extensions, and since deqp is included
in android-cts, android does not allow any khx extensions.

So disable VK_KHX_multiview on android.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.0 <mesa-stable@lists.freedesktop.org>
2018-02-01 23:32:48 +01:00
Mathias Fröhlich
5b3d58520f vbo: Simplify input array distribution for dlist type draws.
Using the newly introduced VAO array maps, we can
simplify vbo_bind_vertex_list.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:08 +01:00
Mathias Fröhlich
fb10a7b7b0 vbo: Simplify input array distribution for imm type draws.
Using the newly introduced VAO array maps, we can
simplify vbo_exec_bind_arrays.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:08 +01:00
Mathias Fröhlich
44b1454b96 vbo: Simplify input array distribution for array type draws.
Using the newly introduced VAO state variable, we can
simplify recalculate_input_bindings.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:07 +01:00
Mathias Fröhlich
3d4fb879dd vbo: Use static const VERT_ATTRIB->VBO_ATTRIB maps.
Instead of each context having its own map instance for
this purpose, use a global static const map.

v2: s,unsigned char,GLubyte,g
    s,_VP_MODE_MAX,VP_MODE_MAX,g
    Change comment style.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:07 +01:00
Mathias Fröhlich
b4fd63015a mesa: Track position/generic0 aliasing in the VAO.
Since the first material attribute no longer aliases with
the generic0 attribute, only aliasing between generic0 and
position is left and entirely dependent on the enabled
state of the VAO. So introduce a gl_attribute_map_mode
in the VAO that is used to track how the position
and the generic 0 attribute alias.
Provide a static const array that can be used to
map from vertex program input indices to VERT_ATTRIB_*
indices. The outer dimension of the array is meant to
be indexed directly by the new VAO member variable.
Also provide methods on the VAO to convert bitmasks of
VERT_BIT's from the VAO numbering to the vertex processing
inputs numbering.

v2: s,unsigned char,GLubyte,g
    s,_ATTRIBUTE_MAP_MODE_MAX,ATTRIBUTE_MAP_MODE_MAX,g
    Change comment style, add comments.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
186f03cfb0 mesa: Put materials at the end of the generic block.
The materials are now moved to the end of the
generic attributes block to the range 4-15.

Before, the way the position and generic 0 attribute
is handled was dependent on the presence and kind of
the currently attached vertex program. With this
change the way the position attribute and the generic 0
attribute is treated only depends on the enabled
flag of those two arrays.
This will later help to untangle the update dependencies
between enabled arrays and shader inputs.

v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
38b41fd718 mesa: Use defines for the aliased material array attributes.
Instead of just assuming that the material attributes
just overlap with the generic attributes 0-12, give
them symbolic defines so that we can easier move them
to an other range.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich
f37e29ac22 vbo: Correctly handle attribute offsets in dlist draw.
When executing a display list draw, for the offset
list to be correct, the offset computation needs to
accumulate all attribute size values in order.
Specifically, if we are shuffling around the position
and generic0 attributes, we may violate the order or
if we do not walk the generic vbo attributes we may
skip some of the attributes.
Even if this is an unlikely usecase we can fix this use
case by precomputing the offsets on the full attribute list
and store the full offset list in the display list node.

v2: Formatting fix
v3: Rebase

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:05 +01:00
Brian Paul
7a044ef68b gallivm/llvmpipe: add const qualifiers on sampler variables
Once a lp_build_sampler_soa or lp_build_sampler_aos object is created,
it should never be modified.  Found by inspection.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-01 14:19:58 -07:00
Brian Paul
1bdbeae17c vbo: change an argument in vbo_draw_indirect_prims()
In vbo_draw_indirect_prims() pass the 'indirect_data' argument to
vbo->draw_prims().  All the callers are passing ctx->DrawIndirectBuffer
so this should be no functional change.  Add a (temporary) assertion to
be sure.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
1b7ad3ae97 vbo: add comments on the VBO draw function typedefs
And rename indirect_params -> indirect_draw_count_buffer and
indirect_params_offset -> indirect_draw_count_offset to be more
specific.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
c7bf05c833 vbo: s/drawcount/drawcount_offset
This parameter (from the glMultiDrawArraysIndirectCountARB function)
is poorly named.  It's an offset into the buffer which contains the
number of primitives to draw.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
b0a2f38db9 vbo: use vbo local var for draw call in vbo_save_playback_vertex_list()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-01 12:17:59 -07:00
Brian Paul
84c3641864 svga: remove unneeded #includes in svga_pipe_draw.c
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
fa98730bf3 svga: whitespace/formatting fixes in svga_pipe_draw.c
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
7a1401938b svga: clean up retry_draw_range_elements(), retry_draw_arrays()
Get rid of a bunch of goto spaghetti.  Remove unneeded do_retry parameter.
No Piglit changes.  Also tested w/ Google Earth and other apps.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Brian Paul
c744289552 svga: remove unused min/max_index params to draw_vgpu10()
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-02-01 12:17:59 -07:00
Eric Anholt
06858c7348 broadcom/vc5: Fix image_h setup for both loads and stores.
The image_h for the tiling algorithm needs to be the padded-to-a-uifblock
height of the level, not the unpadded height or the height of level 0.
Fixes some cases of KHR-GLES3.texture_repeat_mode.* and
depthstencil-render-miplevels.
2018-02-01 11:02:29 -08:00
Eric Anholt
5329f35ea1 broadcom/vc5: Add appropriate height padding for bank conflicts.
I thought I didn't need this because I was doing level-0-always-UIF and
that the pad there would propagate down, but it turns out that for level 1
the padding ends up being chosen by the HW.  This brings us closer to
being able to turn on UIF XOR for increased performance, as well.
2018-02-01 11:02:29 -08:00
Eric Anholt
dea902c933 broadcom/vc5: Simplify separate stencil surface setup.
If we just make another gallium surface for the separate stencil, it's a
lot easier to keep track of which set of fields we're using in RCL setup.

This also incidentally fixes a little bug in setting up the surface's
padded height for separate stencil when the UIF-ness changes at different
levels of Z versus stencil.
2018-02-01 11:02:29 -08:00
Eric Anholt
7239b3edbe broadcom/vc5: Rename the UIFCFG register in the UAPI.
This matches the naming of the other hub regs we get, and I don't know for
sure if UIFCFG will be the same register between the hub and the cores on
all versions.
2018-02-01 11:02:29 -08:00
Eric Anholt
353b42ccc7 broadcom/vc5: Fix a segfault on mix of booleans.
We don't have a src1 to look up if the compare instruction is "i2b".
2018-02-01 11:02:29 -08:00
Eric Anholt
eb765394c2 broadcom/vc5: Skip over missing color buffers for a couple of checks.
Fixes crashes in piglit alpha-to-coverage-no-draw-buffer-zero 2
2018-02-01 11:02:29 -08:00
Eric Anholt
aec066c7aa broadcom/vc5: Add the missing PIPE_CAP_FENCE_SIGNAL. 2018-02-01 11:02:29 -08:00
Baldur Karlsson
030821a873 mesa: fix query of GL_TEXTURE_COMPRESSION_HINT_ARB
Fixes: f96a69f916 ("mesa: replace GLenum with GLenum16 in common
structures (v4)")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104908
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 11:58:02 -07:00
Lucas Stach
0c71a19fe4 renderonly: fix dumb BO allocation for non 32bpp formats
Take into account the resource format, instead of applying a hardcoded
32bpp. This not only over-allocates 16bpp formats, but also results in
a wrong stride being filled into the handle.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-02-01 19:36:17 +01:00
Kenneth Graunke
85ec7abc3f intel/decoder: Fix control / evaluation label mixup.
Trivial.  DS is TES, HS is TCS.
2018-02-01 09:44:15 -08:00
Kenneth Graunke
c3cd2aac27 i965: Bump official kernel requirement to Linux v3.9.
In commit 3f353342a6 (present in 17.3.0)
we started unconditionally using I915_EXEC_NO_RELOC, which was
introduced in Linux v3.9.  ChromeOS kernel 3.8 has backported this,
so it should work too.

Running on older kernels would likely result in every single batch
being rejected by the kernel, which is pretty catastrophic.  Yet, it
appears that nobody noticed.  So, let's just bump the official
requirement and move forward ever so slowly.

Fixes: 3f353342a6 ("i965: Use I915_EXEC_NO_RELOC")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 07:58:58 -08:00
Marc Dietrich
4c5f0b4fd4 meson: don't install windows headers on non-windows platforms
Only dive into the windows subdir if windows platform is selected.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Fixes: 5ef75cb02b "meson: build src/glx/windows"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-01 15:33:02 +00:00
Marek Olšák
71c6f64e54 radeonsi: use ac_build_buffer_load_format for image buffer loads
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
b0a6053a99 ac/nir: use ac_build_buffer_load_format for image buffer loads
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
bac9fa9f17 ac: add glc parameter to ac_build_buffer_load_format
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
be973ed21f radeonsi: load the right number of components for VS inputs and TBOs
The supported counts are 1, 2, 4. (3=4)

The following snippet loads float, vec2, vec3, and vec4:

Before:
    buffer_load_format_x v9, v4, s[0:3], 0 idxen          ; E0002000 80000904
    buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen  ; E00C2000 80020005
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen   ; E00C2000 80010507

After:
    buffer_load_format_x v10, v4, s[0:3], 0 idxen         ; E0002000 80000A04
    buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen    ; E0042000 80020805
    buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006
    s_waitcnt vmcnt(0)                                    ; BF8C0F70
    buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen   ; E00C2000 80010307

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
472361dd7e radeonsi: remove unused si_shader_context members
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Jon Turney
d3540b405b glx/apple: locate dispatch table functions to wrap by name
Avoid reaching into the dispatch table internals (and thus having to deal
with the complexities of remap etc.) by identifying functions to wrap by
name.

See:
https://lists.freedesktop.org/archives/mesa-dev/2015-June/086721.html et seq.
https://bugs.freedesktop.org/show_bug.cgi?id=90311

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:14:08 +00:00
Jon Turney
b37b7b42dc glx/apple: include util/debug.h for env_var_as_boolean prototype
mesa/src/glx/glxcmds.c:1295:21: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
mesa/src/glx/apple/apple_visual.c:85:28: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:14:02 +00:00
Jon Turney
f8ed9f24d5 osx: ld doesn't support --build-id
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:13:56 +00:00
Jon Turney
7ad7a07c88 configure: Default to gbm=no on osx
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:13:00 +00:00
Andres Rodriguez
bbd00844a2 mesa: remove usage of alloca in externalobjects.c v4
Don't want an overly large numBufferBarriers/numTextureBarriers to blow
up the stack.

v2: handle malloc errors
v3: fix patch
v4: initialize texObjs/bufObjs

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
2018-02-01 09:48:04 -05:00
Samuel Pitoiset
2ef5ce1198 radv: do not insert shaders in cache when it's disabled
When the application doesn't provide its own pipeline cache,
the driver uses a in-memory cache but it shouldn't insert any
entries when the cache is explicitely disabled by the user.

Found while running my experimental pipeline-db tool with a
ton of shaders, the memory footprint was just huge, and sometimes
the process was even killed...

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:40:11 +01:00
Samuel Pitoiset
4922e7f25c radv: use separate bindings for graphics and compute descriptors
The Vulkan spec says:

   "pipelineBindPoint is a VkPipelineBindPoint indicating whether
    the descriptors will be used by graphics pipelines or compute
    pipelines. There is a separate set of bind points for each of
    graphics and compute, so binding one does not disturb the other."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104732
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:37:09 +01:00
Samuel Pitoiset
cf224014dd radv: store the bind point when creating descriptors with templates
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:37:07 +01:00
Dave Airlie
7ea15a36fb r600/eg: make sure we allow vpm bit on other CF ops.
the vpm bit wasn't being applied to the push/pop instructions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 13:41:32 +10:00
Timothy Arceri
4d982ae2c7 gallium/st/clover: remove unused PIPE_SHADER_IR_LLVM
This has been unused since 100796c15c.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-02-01 13:56:34 +11:00
Dave Airlie
0491d5425f r600/sb: just add some missing debug bits
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 12:06:40 +10:00
Dave Airlie
df155a73f4 r600: fix buffer resinfo opcode translation.
The vtx operations never got translated, so things worked by
0 being equal to 0, translate them so we can use the proper buffer
resinfo code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 11:59:55 +10:00
Timothy Arceri
679e4e7a46 st/glsl_to_nir: add more nir opts to st_nir_opts()
All of the current gallium nir driver use these optimisations but
they do so in their backends. Having these called in the backend
only can cause a number of problems:

- Shader compile times are greater because the opts need to do
  significant passes over all shader variants.
- The shader cache is partially defeated due to the significant
  optimisation passes over variants.
- We might miss out on nir linking optimisation opportunities.

Adding these passes to st_nir_opts() alleviates these problems.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-01 09:42:57 +11:00
Andres Gomez
5a7aba2e0a i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the <size> passed
to the VertexAttribL* commands was <= 2.

In 61a8a55f55 ("i965/gen8: Fix vertex attrib upload for dvec3/4
shader inputs"), for gen8+ we needed to determine if the attrib was
dual slot to emit 128 or 256-bit, independently of the VAO size.

Similarly, for gen < 8 we also need to determine whether the attrib is
dual slot to force the emission of 256-bits through 2 uploads.

Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
second upload to fill these unspecified components with zeros, as we
also do for gen8+.

Fixes the following test on Haswell:
KHR-GL46.vertex_attrib_binding.basic-inputL-case1

v2: Added more inline comments to explain why we are using
    ISL_FORMAT_R32_FLOAT and its consequences, as requested by
    Alejandro and Antía.

Fixes: 75968a668e ("i965/gen7: expose OpenGL 4.2 on Haswell when
supported")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Antia Puentes <apuentes@igalia.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-31 22:50:06 +02:00
Kenneth Graunke
ab1f2e6bc4 i965: Make texture validation code use texture objects, not units.
This requires moving the _MaxLevel handling up to the callers.  Another
user of intel_finalize_mipmap_tree will be added later that depends on
_MaxLevel not being modified.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-31 11:33:52 -08:00
Kenneth Graunke
0a2e878c69 i965: Pass tObj into intel_update_max_level instead of intel_obj.
We want both anyway, but this will simplify things a tiny bit in an
upcoming patch.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-31 11:33:52 -08:00
Kenneth Graunke
876f1537e9 i965: Delete more misleading comments.
brw_bo_wait_rendering used to take a brw_context pointer for perf_debug
messages about stalls.  Chris eliminated that in 833108ac14.
This message about passing NULL to avoid those warnings is no longer
relevant, and just adds confusion.  So, drop it.
2018-01-31 11:33:52 -08:00
Andres Rodriguez
8996610acb docs/features: mark EXT_semaphore(_fd) as DONE v2
Support for these extensions is available in radeonsi.

v2: also updated relnotes

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
2018-01-31 12:31:40 -05:00
Brian Paul
d32c22a13f st/mesa: whitespace, formatting fixes in st_glsl_to_tgsi.cpp
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
3b3d8275d8 st/mesa: s/int/GLenum/ in st_glsl_to_tgsi.cpp
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
1882ec4ff7 svga: use opcode local var to simplify some code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
338c35c427 svga: s/unsigned/VGPU10_OPCODE_TYPE/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Samuel Pitoiset
a097a6f519 radv: do not dump meta shader stats
That's quite useless and that pollutes the output.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-31 14:10:26 +01:00
Samuel Pitoiset
26cc3e74b9 ac/nir: fix emission of ffract for 64-bit
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-31 14:10:24 +01:00
Eric Engestrom
2f0db33527 meson: dedup gallium-xa logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
fa5d616bf9 meson: dedup gallium-va logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
86168ed31c meson: dedup gallium-omx logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
724916c8a8 meson: dedup gallium-xvmc logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
992af0a4b8 meson: dedup gallium-vdpau logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Antia Puentes
0da434fb47 Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format"
This reverts commit 513c2263cb.

_mesa_base_fbo_format_ is used to validate the internalformat
passed to RenderbufferStorage, which in the OpenGL 4.6 is said:

"An INVALID_ENUM error is generated if internalformat is not one of the
color-renderable, depth-renderable, or stencil-renderable formats defined
in section 9.4."

RGB9_E5 format is not renderable, as stated in the same specification
(Bug 9338).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794

Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-01-31 12:06:00 +01:00
Michel Dänzer
1cf1bf32ef winsys/radeon: Compute is_displayable in surf_drm_to_winsys
It was always 0, breaking (at least) DRI3 with Xwayland.

Bugzilla: https://bugs.freedesktop.org/104306
Fixes: 5f2073be32 ("ac/surface: add ac_surface::is_displayable")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:53:58 +01:00
Matthew Nicholls
ef272b161e radv: remove predication on cache flushes
This can lead to a situation where cache flushes could get conditionally
disabled while still clearing the flush_bits, and thus flushes due to
application pipeline barriers may never get executed.

Fixes: a6c2001ace (radv: add support for cmd predication.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 13:37:18 +10:00
Brian Paul
1ea9efd2f8 mesa: fix broken glGet*(GL_POLYGON_MODE) query
This reverts part of the patch which introduced the GLenum16 change.
Fixes a conform regression found by Roland.

Fixes: f96a69f916 ("mesa: replace GLenum with GLenum16 in
common structures (v4)")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-30 20:32:37 -07:00
Dave Airlie
49c61d8b84 virgl: also remove dimension on indirect.
This fixes some dEQP tests that generated bad shaders.

Fixes: b6f6ead19 (virgl: drop const dimensions on first block.)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 12:24:11 +10:00
Marek Olšák
fdf01d0244 radeonsi: remove DBG_PRECOMPILE
it's useless and shader-db stats only report the main shader part.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
148b48646b radeonsi: print shader-db stats for main parts, not final binaries
This is needed to get shader-db stats for LS,HS,ES,GS stages on gfx9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
c02c9ee550 radeonsi: move max_simd_waves computation into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
a7311cd7ee mesa: fix glGet MAX_VERTEX_ATTRIB queries
Broken by f96a69f916

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-31 03:21:20 +01:00
Jason Ekstrand
97938dac36 anv/cmd_buffer: Re-emit the pipeline at every subpass
If we ever hit this edge-case, it can theoretically cause problem for
CNL because we could end up changing render targets without re-emitting
3DSTATE_MULTISAMPLE which is part of the pipeline.  Just get rid of the
edge case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-30 17:16:33 -08:00
Ian Romanick
ee63933a73 nir: Distribute binary operations with constants into bcsel
This was specifically designed to simplify 1+mix(0, a-1, condition) to
mix(1, a, condition) by pushing the 1+ inside.

Skylake, Broadwell, and Haswell had similar results.  Skylake shown.
total instructions in shared programs: 14521753 -> 14521716 (<.01%)
instructions in affected programs: 10619 -> 10582 (-0.35%)
helped: 51
HURT: 14
helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1
helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95%
HURT stats (abs)   min: 1 max: 11 x̄: 2.57 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32%
95% mean confidence interval for instructions value: -1.31 0.17
95% mean confidence interval for instructions %-change: -0.80% -0.27%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533000205 -> 533003533 (<.01%)
cycles in affected programs: 110610 -> 113938 (3.01%)
helped: 43
HURT: 28
helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16
helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67%
HURT stats (abs)   min: 2 max: 3066 x̄: 160.50 x̃: 14
HURT stats (rel)   min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62%
95% mean confidence interval for cycles value: -43.81 137.56
95% mean confidence interval for cycles %-change: -1.47% 3.60%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 10018840 -> 10018713 (<.01%)
instructions in affected programs: 9431 -> 9304 (-1.35%)
helped: 51
HURT: 3
helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1
helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81%
HURT stats (abs)   min: 1 max: 12 x̄: 4.67 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22%
95% mean confidence interval for instructions value: -5.36 0.66
95% mean confidence interval for instructions %-change: -1.66% -0.46%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 87571944 -> 87572785 (<.01%)
cycles in affected programs: 117234 -> 118075 (0.72%)
helped: 42
HURT: 23
helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30
helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74%
HURT stats (abs)   min: 1 max: 2341 x̄: 131.35 x̃: 10
HURT stats (rel)   min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61%
95% mean confidence interval for cycles value: -61.05 86.93
95% mean confidence interval for cycles %-change: -3.47% -0.33%
Inconclusive result (value mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10542933 -> 10542844 (<.01%)
instructions in affected programs: 11487 -> 11398 (-0.77%)
helped: 52
HURT: 3
helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72%
HURT stats (abs)   min: 1 max: 11 x̄: 4.33 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22%
95% mean confidence interval for instructions value: -3.17 -0.07
95% mean confidence interval for instructions %-change: -1.13% -0.52%
Instructions are helped.

total cycles in shared programs: 146098397 -> 146097094 (<.01%)
cycles in affected programs: 128140 -> 126837 (-1.02%)
helped: 47
HURT: 8
helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18
helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95%
HURT stats (abs)   min: 1 max: 16 x̄: 8.75 x̃: 9
HURT stats (rel)   min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34%
95% mean confidence interval for cycles value: -37.49 -9.90
95% mean confidence interval for cycles %-change: -1.22% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886711 -> 7886509 (<.01%)
instructions in affected programs: 10425 -> 10223 (-1.94%)
helped: 50
HURT: 2
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89%
95% mean confidence interval for instructions value: -8.05 0.28
95% mean confidence interval for instructions %-change: -1.83% -0.26%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 178115324 -> 178114612 (<.01%)
cycles in affected programs: 765726 -> 765014 (-0.09%)
helped: 39
HURT: 1
helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8
helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -32.07 -3.53
95% mean confidence interval for cycles %-change: -0.86% 0.10%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857762 -> 4857661 (<.01%)
instructions in affected programs: 5523 -> 5422 (-1.83%)
helped: 25
HURT: 1
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86%
95% mean confidence interval for instructions value: -9.99 2.22
95% mean confidence interval for instructions %-change: -2.01% 0.08%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 122179674 -> 122179194 (<.01%)
cycles in affected programs: 530162 -> 529682 (-0.09%)
helped: 22
HURT: 1
helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7
helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -46.56 4.82
95% mean confidence interval for cycles %-change: -1.20% 0.36%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:15 -08:00
Ian Romanick
03fb13f646 nir: Rearrange logic op-compounded integer compares
Skylake and Broadwell had similar results.  Skylake shown.
total instructions in shared programs: 14521769 -> 14521753 (<.01%)
instructions in affected programs: 8782 -> 8766 (-0.18%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.23% -0.16%
Instructions are helped.

total cycles in shared programs: 533000376 -> 533000205 (<.01%)
cycles in affected programs: 447035 -> 446864 (-0.04%)
helped: 9
HURT: 9
helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40
helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09%
HURT stats (abs)   min: 1 max: 52 x̄: 16.78 x̃: 10
HURT stats (rel)   min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12%
95% mean confidence interval for cycles value: -25.07 6.07
95% mean confidence interval for cycles %-change: -0.08% 0.27%
Inconclusive result (value mean confidence interval includes 0).

No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
053be9f020 nir: Rearrange and-compounded float compares
If both comparisons are used as sources for instructions other than the
iand, this transformation is detrimental.  If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.

It is interesting to me that on Iron Lake only 81 shaders have
instruction counts changed, but 726 shaders have cycle counts changed.

shader-db results:

Skylake
total instructions in shared programs: 14525728 -> 14521017 (-0.03%)
instructions in affected programs: 1164726 -> 1160015 (-0.40%)
helped: 1692
HURT: 5
helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2
helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs)   min: 1 max: 12 x̄: 3.20 x̃: 1
HURT stats (rel)   min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86%
95% mean confidence interval for instructions value: -3.52 -2.03
95% mean confidence interval for instructions %-change: -0.86% -0.74%
Instructions are helped.

total cycles in shared programs: 533115449 -> 532991404 (-0.02%)
cycles in affected programs: 119401803 -> 119277758 (-0.10%)
helped: 1145
HURT: 467
helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18
helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42%
HURT stats (abs)   min: 1 max: 1590 x̄: 92.15 x̃: 15
HURT stats (rel)   min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39%
95% mean confidence interval for cycles value: -122.16 -31.74
95% mean confidence interval for cycles %-change: -0.94% -0.57%
Cycles are helped.

total spills in shared programs: 9597 -> 9534 (-0.66%)
spills in affected programs: 403 -> 340 (-15.63%)
helped: 1
HURT: 1

total fills in shared programs: 13904 -> 13790 (-0.82%)
fills in affected programs: 1627 -> 1513 (-7.01%)
helped: 2
HURT: 1

LOST:   0
GAINED: 2

Broadwell
total instructions in shared programs: 14816966 -> 14812590 (-0.03%)
instructions in affected programs: 1499885 -> 1495509 (-0.29%)
helped: 1672
HURT: 15
helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs)   min: 1 max: 21 x̄: 9.20 x̃: 8
HURT stats (rel)   min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53%
95% mean confidence interval for instructions value: -3.14 -2.05
95% mean confidence interval for instructions %-change: -0.85% -0.73%
Instructions are helped.

total cycles in shared programs: 559353622 -> 559345595 (<.01%)
cycles in affected programs: 139893703 -> 139885676 (<.01%)
helped: 921
HURT: 697
helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18
helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87%
HURT stats (abs)   min: 1 max: 2370 x̄: 178.03 x̃: 38
HURT stats (rel)   min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14%
95% mean confidence interval for cycles value: -59.64 49.72
95% mean confidence interval for cycles %-change: -1.02% -0.66%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 78902 -> 78861 (-0.05%)
spills in affected programs: 2418 -> 2377 (-1.70%)
helped: 1
HURT: 11

total fills in shared programs: 83782 -> 83678 (-0.12%)
fills in affected programs: 3515 -> 3411 (-2.96%)
helped: 2
HURT: 11

LOST:   0
GAINED: 5

Haswell and Ivy Bridge had similar results. Haswell shown.
total instructions in shared programs: 9033898 -> 9032010 (-0.02%)
instructions in affected programs: 308064 -> 306176 (-0.61%)
helped: 921
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1
helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for instructions value: -2.21 -1.87
95% mean confidence interval for instructions %-change: -0.88% -0.68%
Instructions are helped.

total cycles in shared programs: 84628949 -> 84620520 (<.01%)
cycles in affected programs: 2164913 -> 2156484 (-0.39%)
helped: 518
HURT: 359
helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20
helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01%
HURT stats (abs)   min: 1 max: 586 x̄: 36.43 x̃: 8
HURT stats (rel)   min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40%
95% mean confidence interval for cycles value: -15.17 -4.05
95% mean confidence interval for cycles %-change: -0.77% -0.32%
Cycles are helped.

LOST:   0
GAINED: 4

Sandy Bridge
total instructions in shared programs: 10544860 -> 10542933 (-0.02%)
instructions in affected programs: 360019 -> 358092 (-0.54%)
helped: 931
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33%
95% mean confidence interval for instructions value: -2.23 -1.89
95% mean confidence interval for instructions %-change: -0.76% -0.58%
Instructions are helped.

total cycles in shared programs: 146106820 -> 146098397 (<.01%)
cycles in affected programs: 3435047 -> 3426624 (-0.25%)
helped: 572
HURT: 329
helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15
helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33%
HURT stats (abs)   min: 1 max: 1714 x̄: 30.93 x̃: 6
HURT stats (rel)   min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19%
95% mean confidence interval for cycles value: -16.85 -1.85
95% mean confidence interval for cycles %-change: -0.39% -0.01%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 7886925 -> 7886711 (<.01%)
instructions in affected programs: 25763 -> 25549 (-0.83%)
helped: 75
HURT: 6
helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1
helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53%
HURT stats (abs)   min: 1 max: 16 x̄: 6.00 x̃: 1
HURT stats (rel)   min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86%
95% mean confidence interval for instructions value: -3.69 -1.60
95% mean confidence interval for instructions %-change: -2.54% -0.57%
Instructions are helped.

total cycles in shared programs: 178116888 -> 178115324 (<.01%)
cycles in affected programs: 5858790 -> 5857226 (-0.03%)
helped: 484
HURT: 242
helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs)   min: 2 max: 76 x̄: 4.07 x̃: 2
HURT stats (rel)   min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03%
95% mean confidence interval for cycles value: -2.76 -1.55
95% mean confidence interval for cycles %-change: -0.12% 0.01%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857870 -> 4857762 (<.01%)
instructions in affected programs: 13994 -> 13886 (-0.77%)
helped: 39
HURT: 5
helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2
helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48%
HURT stats (abs)   min: 1 max: 16 x̄: 4.00 x̃: 1
HURT stats (rel)   min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86%
95% mean confidence interval for instructions value: -3.86 -1.05
95% mean confidence interval for instructions %-change: -2.61% 0.04%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 122180744 -> 122179674 (<.01%)
cycles in affected programs: 3686646 -> 3685576 (-0.03%)
helped: 273
HURT: 141
helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs)   min: 2 max: 76 x̄: 3.66 x̃: 2
HURT stats (rel)   min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02%
95% mean confidence interval for cycles value: -3.42 -1.75
95% mean confidence interval for cycles %-change: -0.15% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
821e7a4d32 nir: Separate a weird compare with zero to two compares with zero
min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0).

No shader-db changes, but it does prevent 6 to 12 instruction
regressions in the next patch on all measured Intel platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
68420d8322 nir: Simplify min and max of b2f
v2: Rebase on almost 2 years.  Require that one of the arguments to fmin
or fmax be used only once.  This prevents some regressions.

shader-db results:

Skylake and Broadwell had similar results.  Skylake shown.
total instructions in shared programs: 14526021 -> 14525913 (<.01%)
instructions in affected programs: 4613 -> 4505 (-2.34%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4
helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42%

total cycles in shared programs: 533118710 -> 533118403 (<.01%)
cycles in affected programs: 34334 -> 34027 (-0.89%)
helped: 24
HURT: 0
helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14
helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03%

No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
d8d18516b0 nir: Undo possible damage caused by rearranging or-compounded float compares
shader-db results:

Skylake and Broadwell had similar results (Skylake shown)
total instructions in shared programs: 14525898 -> 14525836 (<.01%)
instructions in affected programs: 1964 -> 1902 (-3.16%)
helped: 14
HURT: 0
helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1
helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86%
95% mean confidence interval for instructions value: -9.46 0.60
95% mean confidence interval for instructions %-change: -3.97% -0.24%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533119892 -> 533115756 (<.01%)
cycles in affected programs: 96061 -> 91925 (-4.31%)
helped: 13
HURT: 1
helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300
helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42%
HURT stats (abs)   min: 8 max: 8 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46%
95% mean confidence interval for cycles value: -379.43 -211.43
95% mean confidence interval for cycles %-change: -4.84% -3.01%
Cycles are helped.

Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown).
total instructions in shared programs: 9033948 -> 9033898 (<.01%)
instructions in affected programs: 535 -> 485 (-9.35%)
helped: 2
HURT: 0

total cycles in shared programs: 84631402 -> 84628949 (<.01%)
cycles in affected programs: 63197 -> 60744 (-3.88%)
helped: 13
HURT: 2
helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140
helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01%
HURT stats (abs)   min: 4 max: 8 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31%
95% mean confidence interval for cycles value: -253.40 -73.67
95% mean confidence interval for cycles %-change: -4.24% -2.25%
Cycles are helped.

No changes on GM45 or Iron Lake.

v2: Add a couple more tautological compares.  Suggested by Elie.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
3941cba0f7 nir: Be more conservative about rearranging or-compounded compares
If both comparisons are used as sources for instructions other than the
ior, this transformation is detrimental.  If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.

shader-db results:

Skylake
total instructions in shared programs: 14526147 -> 14525898 (<.01%)
instructions in affected programs: 70239 -> 69990 (-0.35%)
helped: 102
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20%
95% mean confidence interval for instructions value: -2.86 -2.02
95% mean confidence interval for instructions %-change: -0.46% -0.31%
Instructions are helped.

total cycles in shared programs: 533120531 -> 533119892 (<.01%)
cycles in affected programs: 994875 -> 994236 (-0.06%)
helped: 76
HURT: 26
helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13
helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18%
HURT stats (abs)   min: 1 max: 167 x̄: 54.62 x̃: 26
HURT stats (rel)   min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39%
95% mean confidence interval for cycles value: -19.44 6.91
95% mean confidence interval for cycles %-change: -0.30% 0.15%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14816005 -> 14815787 (<.01%)
instructions in affected programs: 64658 -> 64440 (-0.34%)
helped: 97
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1
helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20%
95% mean confidence interval for instructions value: -2.62 -1.87
95% mean confidence interval for instructions %-change: -0.45% -0.30%
Instructions are helped.

total cycles in shared programs: 559340386 -> 559339907 (<.01%)
cycles in affected programs: 1090491 -> 1090012 (-0.04%)
helped: 66
HURT: 28
helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16
helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27%
HURT stats (abs)   min: 2 max: 226 x̄: 39.07 x̃: 11
HURT stats (rel)   min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20%
95% mean confidence interval for cycles value: -15.94 5.75
95% mean confidence interval for cycles %-change: -0.35% 0.07%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 1

Haswell
total instructions in shared programs: 9034106 -> 9033948 (<.01%)
instructions in affected programs: 24096 -> 23938 (-0.66%)
helped: 38
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4
helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64%
95% mean confidence interval for instructions value: -4.71 -3.60
95% mean confidence interval for instructions %-change: -0.84% -0.58%
Instructions are helped.

total cycles in shared programs: 84631628 -> 84631402 (<.01%)
cycles in affected programs: 148674 -> 148448 (-0.15%)
helped: 14
HURT: 14
helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12
helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21%
HURT stats (abs)   min: 1 max: 10 x̄: 6.00 x̃: 5
HURT stats (rel)   min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11%
95% mean confidence interval for cycles value: -19.42 3.28
95% mean confidence interval for cycles %-change: -0.59% 0.05%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 10015456 -> 10015293 (<.01%)
instructions in affected programs: 27701 -> 27538 (-0.59%)
helped: 38
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4
helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52%
95% mean confidence interval for instructions value: -4.87 -3.71
95% mean confidence interval for instructions %-change: -0.82% -0.51%
Instructions are helped.

total cycles in shared programs: 87524771 -> 87524569 (<.01%)
cycles in affected programs: 112324 -> 112122 (-0.18%)
helped: 6
HURT: 12
helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20
helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26%
HURT stats (abs)   min: 1 max: 16 x̄: 5.50 x̃: 5
HURT stats (rel)   min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: -29.14 6.69
95% mean confidence interval for cycles %-change: -0.93% 0.08%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 2

Sandy Bridge
total instructions in shared programs: 10545655 -> 10545465 (<.01%)
instructions in affected programs: 37198 -> 37008 (-0.51%)
helped: 42
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4
helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49%
95% mean confidence interval for instructions value: -5.14 -3.91
95% mean confidence interval for instructions %-change: -0.68% -0.47%
Instructions are helped.

total cycles in shared programs: 146113059 -> 146112427 (<.01%)
cycles in affected programs: 423514 -> 422882 (-0.15%)
helped: 32
HURT: 10
helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12
helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11%
HURT stats (abs)   min: 12 max: 19 x̄: 14.70 x̃: 14
HURT stats (rel)   min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14%
95% mean confidence interval for cycles value: -26.03 -4.07
95% mean confidence interval for cycles %-change: -0.43% -0.05%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886959 -> 7886925 (<.01%)
instructions in affected programs: 1340 -> 1306 (-2.54%)
helped: 4
HURT: 0
helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8
helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43%
95% mean confidence interval for instructions value: -20.44 3.44
95% mean confidence interval for instructions %-change: -5.78% 0.89%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 178116996 -> 178116888 (<.01%)
cycles in affected programs: 6262 -> 6154 (-1.72%)
helped: 2
HURT: 2
helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61
helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62%
HURT stats (abs)   min: 6 max: 8 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51%
95% mean confidence interval for cycles value: -93.27 39.27
95% mean confidence interval for cycles %-change: -5.38% 2.27%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857887 -> 4857870 (<.01%)
instructions in affected programs: 674 -> 657 (-2.52%)
helped: 2
HURT: 0

total cycles in shared programs: 122180816 -> 122180744 (<.01%)
cycles in affected programs: 3764 -> 3692 (-1.91%)
helped: 1
HURT: 1
helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78
helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34%

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
cfc0d34802 nir: See through an fneg to apply existing optimizations
Doing the same for the existing feq and fne transformations didn't help
anything in shader-db.

shader-db results:

Broadwell and Skylake (Skylake shown)
total instructions in shared programs: 14529463 -> 14526147 (-0.02%)
instructions in affected programs: 402420 -> 399104 (-0.82%)
helped: 2136
HURT: 131
helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1
helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12%
HURT stats (abs)   min: 1 max: 2 x̄: 1.01 x̃: 1
HURT stats (rel)   min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57%
95% mean confidence interval for instructions value: -1.51 -1.41
95% mean confidence interval for instructions %-change: -3.06% -2.78%
Instructions are helped.

total cycles in shared programs: 533146915 -> 533120531 (<.01%)
cycles in affected programs: 10356261 -> 10329877 (-0.25%)
helped: 1933
HURT: 844
helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16
helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88%
HURT stats (abs)   min: 1 max: 423 x̄: 36.17 x̃: 12
HURT stats (rel)   min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59%
95% mean confidence interval for cycles value: -11.78 -7.22
95% mean confidence interval for cycles %-change: -1.98% -1.65%
Cycles are helped.

Haswell
total instructions in shared programs: 9037416 -> 9034106 (-0.04%)
instructions in affected programs: 389831 -> 386521 (-0.85%)
helped: 2184
HURT: 120
helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57%
95% mean confidence interval for instructions value: -1.49 -1.39
95% mean confidence interval for instructions %-change: -2.68% -2.41%
Instructions are helped.

total cycles in shared programs: 84636243 -> 84631628 (<.01%)
cycles in affected programs: 4745058 -> 4740443 (-0.10%)
helped: 1904
HURT: 960
helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18
helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38%
HURT stats (abs)   min: 1 max: 1080 x̄: 55.11 x̃: 14
HURT stats (rel)   min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81%
95% mean confidence interval for cycles value: -4.51 1.29
95% mean confidence interval for cycles %-change: -1.64% -1.25%
Inconclusive result (value mean confidence interval includes 0).

LOST:   1
GAINED: 0

Sandy Bridge and Ivy Bridge (Ivy Bridge shown)
total instructions in shared programs: 10018873 -> 10015456 (-0.03%)
instructions in affected programs: 512820 -> 509403 (-0.67%)
helped: 2268
HURT: 162
helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88%
HURT stats (abs)   min: 1 max: 4 x̄: 1.59 x̃: 1
HURT stats (rel)   min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50%
95% mean confidence interval for instructions value: -1.46 -1.35
95% mean confidence interval for instructions %-change: -2.38% -2.12%
Instructions are helped.

total cycles in shared programs: 87538223 -> 87524771 (-0.02%)
cycles in affected programs: 5435520 -> 5422068 (-0.25%)
helped: 1916
HURT: 946
helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18
helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97%
HURT stats (abs)   min: 1 max: 633 x̄: 45.41 x̃: 11
HURT stats (rel)   min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62%
95% mean confidence interval for cycles value: -7.34 -2.06
95% mean confidence interval for cycles %-change: -1.62% -1.26%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 7888446 -> 7886959 (-0.02%)
instructions in affected programs: 331581 -> 330094 (-0.45%)
helped: 1160
HURT: 97
helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1
helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25%
95% mean confidence interval for instructions value: -1.25 -1.12
95% mean confidence interval for instructions %-change: -0.91% -0.75%
Instructions are helped.

total cycles in shared programs: 178130766 -> 178116996 (<.01%)
cycles in affected programs: 12534564 -> 12520794 (-0.11%)
helped: 1856
HURT: 187
helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4
helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11%
HURT stats (abs)   min: 2 max: 26 x̄: 3.55 x̃: 2
HURT stats (rel)   min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02%
95% mean confidence interval for cycles value: -7.41 -6.07
95% mean confidence interval for cycles %-change: -0.28% -0.22%
Cycles are helped.

GM45
total instructions in shared programs: 4858912 -> 4857887 (-0.02%)
instructions in affected programs: 237565 -> 236540 (-0.43%)
helped: 867
HURT: 57
helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1
helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: -1.18 -1.04
95% mean confidence interval for instructions %-change: -0.88% -0.71%
Instructions are helped.

total cycles in shared programs: 122189118 -> 122180816 (<.01%)
cycles in affected programs: 8776418 -> 8768116 (-0.09%)
helped: 1213
HURT: 166
helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4
helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11%
HURT stats (abs)   min: 2 max: 26 x̄: 3.35 x̃: 2
HURT stats (rel)   min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02%
95% mean confidence interval for cycles value: -6.78 -5.26
95% mean confidence interval for cycles %-change: -0.24% -0.18%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Timothy Arceri
283e25102b st/glsl_to_nir: disable io lowering and array splitting of fs inputs
We need this to be able to support the interpolateAt builtins in a
sane way. It also leads to the generation of more optimal code.

The lowering and splitting is made conditional on lower_all_io_to_temps
because vc4 and freedreno both expect these passes to be enabled and
niether support glsl 400 so don't need to deal with the interpolateAt
builtins.

We leave the other stages for now as to avoid regressions. Ideally we
could remove the stage checks and just set the nir options correctly
for each stage. However all gallium drivers currently just use return
the same nir compiler options for all stages, and it's probably more
trouble than its worth to change this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
9a2e085680 nir: add lower_all_io_to_temps flag
This will be used for freedreno and vc4 which require all inputs
and outputs to be copied to temps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
3218756262 nir/st_glsl_to_nir: add param to disable splitting of inputs
We need this because we will always copy fs outputs to temps and
split the arrays, but do not want to do either of these with fs
inputs as it is unnessisary and makes handling interpolateAt
builtins difficult.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
93e213f91f st/glsl_to_nir: copy nir compiler options to context
Various nir passes may expect this to be here as does the nir
serialisation pass.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri
dd6d6c63a7 radeonsi/nir: add input support for arrays that have not been copied to temps and split
We need this to be able to support the interpolateAt builtins in a
sane way. It also leads to the generation of more optimal code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
d185190222 ac/radeonsi: add lookup_interp_param and load_sample_position to the abi
This will enable the interpolateAt builtins to work on the radeonsi
nir backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
97058168a4 radeonsi/nir: add prim_mask to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
3ff012f142 radeonsi/nir: adjust load_sample_position() to be shared between backends
With this interface change it can be shared between the tgsi and
nir backends.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
3a47b138e3 radeonsi/nir: add si_nir_lookup_interp_param() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
b8808848ce ac/nir_to_llvm: move some interp defines to the header
These will be used in the following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
fea6da9aaa radeonsi/nir: move the interpolation qualifier scanning
We need to collect this when scanning over the instruction rather
than when scanning over the inputs otherwise we might get confliting
values for inputs that are use by the interpolateAt* builtins.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Timothy Arceri
580f1aa247 radeonsi/nir: add interpolate at intrinsics to scan_instruction()
V2: use the uses_*_opcode_interp_* flags

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:07 +11:00
Bas Nieuwenhuizen
882eff4d20 radv: Merge raster state with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:02:05 +01:00
Bas Nieuwenhuizen
69364f1c34 radv: Move gs state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:02:01 +01:00
Bas Nieuwenhuizen
e4e060d135 radv: Split out cliprect rule generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:56 +01:00
Bas Nieuwenhuizen
acbaef3005 radv: Merge VGT_GS_MODE computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:52 +01:00
Bas Nieuwenhuizen
4ae6a8b0cd radv: Split out processing the vertex input state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:41 +01:00
Bas Nieuwenhuizen
9062b1c241 radv: Move tessellation state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:38 +01:00
Bas Nieuwenhuizen
4aa1cb4e90 radv: Move blend state out of pipeline.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:34 +01:00
Bas Nieuwenhuizen
0f72f0eacb radv: Split out generating VGT_SHADER_STAGES_EN.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:30 +01:00
Bas Nieuwenhuizen
694c34314b radv: Split out the ia_multi_vgt_param precomputation.
Also moved everything in a struct and then return the struct from
the helper function, so it is clear in the caller what part of the
pipeline gets modified.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:26 +01:00
Bas Nieuwenhuizen
0bea0851aa radv: Split out db_shader_control computation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:18 +01:00
Bas Nieuwenhuizen
5dce47ae6d radv: Compute shader_z_format when emitting it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:13 +01:00
Bas Nieuwenhuizen
df2e7ab0db radv: Merge depth stencil state with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:06 +01:00
Bas Nieuwenhuizen
d5a0af84ec radv: Merge ps_input_cntl computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:01:01 +01:00
Bas Nieuwenhuizen
e2bf18030d radv: Merge vtx_reuse_depth computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:55 +01:00
Bas Nieuwenhuizen
c80747b32c radv: Merge vs state computation with PM4 generation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:50 +01:00
Bas Nieuwenhuizen
c4191cf944 radv: Merge binning state generation with pm4 emission.
We don't need the pipeline state struct anymore.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:45 +01:00
Bas Nieuwenhuizen
6f1a3f081e radv: Constify some pipeline helpers.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:40 +01:00
Bas Nieuwenhuizen
f0c9ef410a radv: Add PM4 pregeneration for compute pipelines.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:34 +01:00
Bas Nieuwenhuizen
beeab44190 radv: Record a PM4 sequence for graphics pipeline switches.
This gives about 2% performance improvement on dota2 for me.

This is mostly a mechanical copy and replacement, but at bind time
we still do:

1) Some stuff that is only based on num_samples changes.
2) Some command buffer state setting.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:22 +01:00
Bas Nieuwenhuizen
7c366bc152 radv: Determine unneeded dynamic states.
Which avoids setting or emitting them.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-30 22:00:17 +01:00
Andres Rodriguez
0a89784bcc mesa: check for invalid index on UUID glGet queries
This fixes the piglit test:
spec/ext_semaphore/api-errors/usigned-byte-i-v-bad-value

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
566ed727a4 mesa: fix glGet for ext_external_objects parameters
This allows the client to actually query the enums specified in the
ext_external_objects spec.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
0ebd3cc863 mesa: fix error codes for importing memory/semaphore FDs
This fixes the following piglit tests:
spec/ext_semaphore_fd/api-errors/import-semaphore-fd-bad-enum
spec/ext_memory_object_fd/api-errors/import-memory-fd-bad-enum

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
50b06cbc10 radeonsi: fix fence_server_sync() holding up extra work v2
When calling si_fence_server_sync(), the wait operation is associated
with the next kernel submission. Therefore, any unflushed work
submitted previous to fence_server_sync() will also be affected by
the wait.

To avoid adding the dependency to the unflushed work, we flush before
emitting the fence dependency.

v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
e0f16ee666 radeonsi: implement semaphore_server_signal v2
Syncobj based waits or signals only happen at submission boundaries. In
order to guarantee that the requested signal event will occur when the
state tracker requested it, we must issue a flush.

v2: s/fence/semaphore for pipe objects

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
5b07b06d6b radeonsi: add support for importing PIPE_FD_TYPE_SYNCOBJ semaphores
Hook up importing semaphores of type PIPE_FD_TYPE_SYNCOBJ

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
cc9762d74d winsys/amdgpu: add support for syncobj signaling v3
Add the ability to signal a syncobj when a cs completes execution.

v2: corresponding changes for gallium fence->semaphore rename
v3: s/semaphore/fence for pipe objects

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
29b9bd0539 mesa/st: add support for semaphore object signal/wait v4
Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject

v2:
  - corresponding changes for gallium fence->semaphore rename
  - flushing moved to mesa/main

v3: s/semaphore/fence for pipe objects
v4: add bitmap flushing

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
89b52891fd mesa: add support for semaphore object signal/wait v3
Memory synchronization is left for a future patch.

v2: flush vertices/bitmaps moved to mesa/main
v3: removed spaces before/after braces

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
260f7fcc46 mesa: add semaphore parameter stub v2
EXT_semaphore and EXT_semaphore_fd define no pnames. Therefore there
isn't much to do besides determining the correct error code.

v2: removed useless return

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
382067f065 mesa/st: add support for semaphore object create/import/delete v3
Add basic semaphore object operations.

v2: s/semaphore/fence for pipe objects
v3: added missing license headers

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
67d5d08682 mesa: add support for semaphore object creation/import/delete v3
Used by EXT_semmaphore and EXT_semaphore_fd

v2: Removed unnecessary dummy callback initialization
v3: Fixed attempting to free the DummySemaphoreObject

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
8e635f7d65 mesa/st: introduce EXT_semaphore and EXT_semaphore_fd v2
Guarded by PIPE_CAP_SEMAPHORE_SIGNAL

v2: corresponding changes for PIPE_CAP_SEMAPHORE_SIGNAL rename

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
fde1afc495 u_threaded_context: add support for fence_server_signal v2
v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
d34c2cf3e6 gallium: add fence_server_signal() v2
Calling this function will emit a fence signal operation into the
GPU's command stream.

v2: documentation typos

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
458f89be78 gallium: introduce PIPE_FD_TYPE_SYNCOBJ
Denotes that a fd is backed by a synobj. For example, radv shared
semaphores.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
2ab405d254 gallium: introduce PIPE_CAP_FENCE_SIGNAL v2
Protects semaphore signaling functionality required by GL_EXT_semaphore.

v2: s/semaphore/fence

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Andres Rodriguez
585daa2378 gallium: add type parameter to create_fence_fd
An fd can potentially have different types of objects backing it.
Specifying the type helps us make sure we treat the FD correctly.

This is in preparation to allow importing syncobj fence FDs in addition
to native sync FDs.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 15:13:49 -05:00
Dave Airlie
16dd0eb517 ac/llvm: bump the number of results to 8.
This function can get access for a 64-bit dvec4, which means we
have to load 8 components.

This fixes:
R600_DEBUG=nir ./bin/shader_runner generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-abs-dvec4.shader_test -auto

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 05:37:16 +10:00
Dave Airlie
8d633f067b r600/sb: insert the else clause when we might depart from a loop
If there is a break inside the else clause and this means we
are breaking from a loop, the loop finalise will want to insert
the LOOP_BREAK/CONTINUE instruction, however if we don't emit
the else there is no where for these to end up, so they will end
up in the wrong place.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101442
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 04:47:29 +10:00
Brian Paul
1a9aa69ae8 mesa: remove invalid assertion in _mesa_enable_vertex_array_attrib()
The meta module passes some 0-based attrib values.  Should fix Piglit
regressions reported by Mark Janes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104863
Fixes: 4ab7e03e1f ("mesa: add an assertion in
_mesa_enable_vertex_array_attrib()")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-30 11:02:43 -07:00
Brian Paul
efa0993eaf mesa: use gl_vert_attrib enum type in more places
Slightly better readbility.

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 11:02:43 -07:00
Brian Paul
f892e332a8 mesa: rename some 'client' array functions
A long time ago gl_vertex_array was gl_client_array.  Update some function
names to be consistent.

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
d2d9d090e5 mesa: s/src/attribs/ in _mesa_update_client_array()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
e863541e43 mesa: check/assert array index in _mesa_bind_vertex_buffer()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
fcee2cc711 mesa: trivial comment typo fix in arrayobj.c
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
4ab7e03e1f mesa: add an assertion in _mesa_enable_vertex_array_attrib()
Some of the enable/disable vertex array functions take a zero-based
generic index, while others take a VERT_ATTRIB_GENERIC0-based value.
Add an assertion to clarify that in one place.

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Brian Paul
7f12791cc6 mesa: rename some vars in client_state()
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
06621e8a0d mesa: Care for differences in fog mode only if fog is consumed.
In creating fixed function vertex shader hash keys do only
care for producing the varying output if fog is enabled and the
varing is consumed in the fragment stage.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
6395a0ecf2 mesa: Reduce ffvertex_prog state_key to 36 bytes.
Using lower alignment restrictions for the state key fields finally
yields to a smaller hashing state key.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
b4216b588e mesa: Remove unused ffvertex_prog texunit_really_enabled.
Remove set but not read field from the state key used for hashing
fixed function vertex shaders.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
1169791c18 mesa: Remove unused bit in ffvertex_prog state_key.
Remove set but not read field from the state key used for hashing
fixed function vertex shaders.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
6726d16098 mesa: texgen_enabled is only 1 bit.
For the state key for hashing fixed function vertex shaders, the
texgen_enabled field requires only a single bit.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
d6b0ad51ec mesa: Encode fog modes in a 2 bit field.
For the state key for hashing fixed function
vertex shaders, encode the different fog modes, including
if fog is generally enabled or not, into a 2 bit field.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:59 -07:00
Mathias Fröhlich
63e845d3cc mesa: Move seperate_specular into the lighting section.
For the state key for hashing fixed function
vertex shaders, the information is only evaluated
if lighting is generally switched on.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Mathias Fröhlich
11e665d434 mesa: Get the point size array state from varying_vp_inputs.
For the state key for hashing fixed function
vertex shaders, The varying_vp_inputs bitmask already
contains the point size array enabled information.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Mathias Fröhlich
bc5c54cadf mesa: Remove unused gl_fog_attrib::_Scale.
The patch removes a variable that is only written to.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-30 09:07:58 -07:00
Iago Toral Quiroga
99b57daf4a anv/pipeline: lower constant initializers on output variables earlier
If a shader only writes to an output via a constant initializer we
need to lower it before we call nir_remove_dead_variables so that
this pass sees the stores from the initializer and doesn't kill the
output.

Fixes test failures in new work-in-progress CTS tests:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_vert
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_frag

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-30 08:10:29 +01:00
Tapani Pälli
6316c2ecbd i965: move disk cache from brw_context to intel_screen
Now every context refers to same disk_cache instance in screen.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-30 08:42:51 +02:00
Elie Tournier
6f8518e068 mesa: Correctly print glTexImage dimensions
texture_format_error_check_gles() displays error like "glTexImage%dD".
This patch just replace the %d by the correct dimension.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-30 07:48:56 +02:00
Brian Paul
d5f42f96e1 mesa: shrink size of gl_array_attributes (v2)
Inspired by Marek's earlier patch, but even smaller.  Sort fields from
largest to smallest.  Use bitfields for more fields (sometimes with an
extra bit for MSVC).  Reduce Stride field to GLshort.

Note that some fields cannot be bitfields because they're accessed via
pointers (such as for glEnableClientState(GL_VERTEX_ARRAY) to set the
Enabled field).

Reduces size from 48 to 24 bytes.
Also reduces size of gl_vertex_array_object from 3632 to 2864 bytes.

And add some assertions in init_array().

v2: use s/GLuint/unsigned/, improve commit comments.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:16:50 -07:00
Brian Paul
79cafa0df3 mesa: shrink gl_vertex_array
Inspired by Marek's earlier patch, but goes a little further.
Sort fields from largest to smallest.  Use bitfields.

Reduced from 48 bytes to 32.  Also reduces size of gl_vertex_array_object
from 4144 to 3632

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:15:52 -07:00
Marek Olšák
f96a69f916 mesa: replace GLenum with GLenum16 in common structures (v4)
v2: - fix glGet*
    - also use GLenum16 for DrawBuffers
v3: - rebase to top of tree (BrianP) and incorporate Ian's suggestions
v4: - fix a GLenum16 bug in VBO/save code, add some STATIC_ASSERT()s

gl_context = 152432 -> 136840 bytes
vbo_context = 22096 -> 20608 bytes

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-29 21:15:52 -07:00
Brian Paul
94843e6056 mesa: fix incorrect size/error test in _mesa_GetUnsignedBytevEXT()
get_value_size() returns -1 for an error.  The similar check in
_mesa_GetUnsignedBytei_vEXT() is correct.

Found by chance.  There are apparently no Piglit tests which exercise
glGetUnsignedBytei_vEXT() or glGetUnsignedBytevEXT().

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-29 21:15:52 -07:00
Neha Bhende
e4ca1d6456 svga: Check rasterization state object before checking poly_stipple_enable
Sometimes rasterization state object could be empty. This is causing
segfault on hw8,9,10 for some traces.

This patch fixes enemy_territory_quake_wars_high,
enemy_territory_quake_wars_low, etqw-demo, lightsmark2008, quake1
glretrace crashes on hw 8,9,10.

Tested with mtt-glretrace and mtt-piglit.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Neha Bhende
d4a5e14fae svga: Adjust alpha for S3TC_DXT1_EXT RGB formats
According to spec, S3TC_DXT1_EXT RGB formats are supposed to be
opaque. Correspoding svga formats are not handling it so explicitly
setting it to 1.0.
This fixes piglit test spec@ext_texture_compression_s3tc@s3tc-targeted
Note: This test is testcase for freedesktop bug 100925

Tested with mtt-piglit and mtt-glretrace on 8,9,10,11 and 15

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Gert Wollny
6a7d1ca2c4 mesa/st/glsl_to_tgsi: Mark first write as unconditional when appropriate
In the register lifetime estimation if the first write is unconditional or
conditional but not within a loop then this is an unconditional dominant
write in the sense of register life time estimation.
Add a test case and record the write accordingly.

Fixes: 807e2539e5 ("mesa/st/glsl_to_tgsi: Add
tracking of ifelse writes in register merging")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104803
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-29 21:04:49 -07:00
Roland Scheidegger
3c7aa242f5 mesa: skip validation of legality of size/type queries for format queries
The size/type query is always legal (if we made it that far).
Removing this causes a difference for GL_TEXTURE_BUFFER - the reason is that
these parameters are valid only with GetTexLevelParameter() if gl 3.1 is
supported, but not if only ARB_texture_buffer_object is supported.
However, while the spec says that these queries return "the same information
as querying GetTexLevelParameter" I believe we're not expected to return just
zeros here. By definition, these pnames are always valid (unlike for the
GetTexLevelParameter() function which would return an error without GL 3.1).
The spec is a bit inconsistent there and open to interpretation - while
mentioning the "same information as querying GetTexLevelParameter" is
returned, it also mentions that 0 is returned for size/type if the
target/format is not supported - implying correct results to be returned
if it is supported, regardless that GetTexLevelParameter would return
an error. (Also, the bit about this returning the same as
GetTexLevelParameter also includes querying stencil type, which isn't
even possible with GetTexLevelParameter.)

This breaks some piglit arb_internalformat_query2 tests (which I believe to
be wrong).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com
2018-01-30 01:28:47 +01:00
Roland Scheidegger
21fe02d1d3 mesa: restrict formats being supported by target type for formatquery
The code just considered all formats as being supported if they were either
a valid fbo or texture format.
This was quite awkward since then the query would return "supported" for
e.g. GL_RGB9E5 or compressed formats and target RENDERBUFFER (albeit the driver
could still refuse it in theory). However, when then querying for instance the
internalformat sizes, it would just return 0 (due to the checks being more
strict there).
It was also a problem for texture buffer targets, which have a more restricted
list of formats which are allowed (and again, it would return supported but
then querying sizes would return 0).
So only take validation of formats into account which make sense for a given
target.
Can also toss out some special checks for rgb9e5 later, since we'd never get
there if it wasn't supported in the first place.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-30 01:28:47 +01:00
Roland Scheidegger
272e7e1bd5 mesa: (trivial) add TODO comment for default results for internal queries 2018-01-30 01:28:47 +01:00
Roland Scheidegger
09dc4f9012 mesa: remove misleading gles checks for formatquery
Testing for gles there is just confusing - this is about target being
supported, if it was valid at all was already determined earlier
(in _legal_parameters). It didn't make sense at all in any case, since
it would only have said false there for gles for 2d but not 2d arrays etc.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-30 01:28:47 +01:00
Rafael Antognolli
e7ecc5e160 i965: Emit PIPE_CONTROL with ISP bit on older platforms.
Emit it on all platforms since gen7.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-29 14:52:07 -08:00
Rafael Antognolli
fa21ddf7b1 anv/cmd_buffer: Emit PIPE_CONTROL with ISP bit on older platforms.
Emit it on all platforms since gen7.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-29 14:52:07 -08:00
Timothy Arceri
2b4afaef1c st/glsl_to_nir: remove dead io after conversion to nir
This fixes an assert in nir_lower_var_copies() for some bioshock
shaders where an unused clipdistance array has no size.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:14:36 +11:00
Timothy Arceri
327c1a7fb3 radeonsi/nir: add support vs double inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
44067d6f0d radeonsi: pass input_idx to declare_nir_input_vs()
This make it consistent with declare_nir_input_fs() and will allow
us to support doubles.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
cf75ee3ab1 radeonsi: add bitcast_inputs() helper
Will be used in a following patch to help support doubles.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
96cfd4bd7e radeonsi/nir: fix num_inputs for doubles in vs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
09cd484d61 nir: partially revert c2acf97fcc
c2acf97fcc changed the use of double_inputs_read to be
inconsitent with its previous meaning. Here we re-enable the
gather info code that was removed as the modified code from
c2acf97fcc now uses the double_inputs member rather than
double_inputs_read.

This change allows us to use double_inputs_read with gallium
drivers without impacting double_inputs which is used by i965.

We also make use of the compiler option vs_inputs_dual_locations
to allow for the difference in behaviour between drivers that handle
vs inputs as taking up two locations for doubles, versus those that
treat them as taking a single location.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
5b8de4bdff nir: add vs_inputs_dual_locations compiler option
Allows nir drivers to either use a single or dual locations for
vs double inputs.

i965 uses dual locations for both OpenGL and Vulkan drivers, for
now gallium OpenGL drivers only use a single location.

The following patch will also make use of this option when
calling nir_shader_gather_info().

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri
f63e05ae9e compiler: tidy up double_inputs_read uses
First we move double_inputs_read into a vs struct in the union,
double_inputs_read is only used for vs inputs so this will
save space and also allows us to add a new double_inputs field.

We add the new field because c2acf97fcc changed the behaviour
of double_inputs_read, and while it's no longer used to track
actual reads in i965 we do still want to track this for gallium
drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Dave Airlie
f6cc15dccd radv/gfx9: fix block compression texture views. (v2)
This ports a fix from amdvlk, to fix the sizing for mip levels
when block compressed images are viewed using uncompressed views.

My original fix didn't power the clamping, but it looks like
the clamping is required to stop the sizing going too large.

Fixes:
dEQP-VK.image.texel_view_compatible.graphic.extended*bc*
Doesn't crash DOW3 anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-30 07:39:13 +10:00
Bas Nieuwenhuizen
0347a83bbf radv: Signal fence correctly after sparse binding.
It did not signal syncobjs in the fence, and also signalled too early
if there was work on the queue already, as we have to wait till that
work is done.

Fixes: d27aaae4d2 "radv: Add external fence support."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-29 17:22:58 +01:00
Brian Paul
0d044f7d61 mesa/vbo: replace vbo_draw_method() with _mesa_set_drawing_arrays()
The arrays specified by ctx->Array._DrawArrays are used for all
vertex drawing via vbo_context::draw_prims().  Different arrays are
used for immediate mode, vertex arrays, display lists, etc.  Changing
from one to another requires updating derived/driver array state.

Before, we indirectly specifid the arrays with the gl_draw_method values.
Now we just directly specify the arrays instead.  This is simpler and
will allow a subsequent display list optimization.

In the future, it might make sense to get rid of ctx->Array._DrawArrays
entirely and just pass the arrays as another parameter to
vbo_context::draw_prims().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
d9894ede02 vbo: s/[0]/[VERT_ATTRIB_POS]/ in recalculate_input_bindings()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
48a6ab472a vbo: add new VBO_ATTRIBS_ masks to vbo_attrib.h
These will be used in a later patch.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
41cd3ee5a2 vbo: s/VBO_ATTRIB_INDEX/VBO_ATTRIB_COLOR_INDEX/
To match the VERT_ATTRIB_COLOR_INDEX name.
Give a name to the previously anonymous enum of VBO_ATTRIB_x values.
Update the comment on the enum.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
425da3bbfc vbo: minor clean-ups in vbo_exec.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
d631ea3a23 vbo: s/_API_NOOP_H/VBO_NOOP_H/ in vbo_noop.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
094a80db4c vbo: whitespace/formatting fixes in vbo_exec.h
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
b080fc6199 vbo: move, rename vp_mode enums, get_program_mode() function
Instead of NONE/ARB use FF/SHADER.  Move the enum declaration to
vbo_private.h where it's used.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Brian Paul
35e0ff5bd5 vbo: s/cl/array/ in vbo_context.c
I think 'cl' used to mean client array.

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-01-29 08:35:14 -07:00
Tapani Pälli
d0343bef66 nir: mark unused space in packed_tex_data
This change cleans following scary warnings in valgrind output
when disk cache is being written:

   ==6532== Uninitialised byte(s) found during client check request
   ==6532==    at 0x14423FAD: blob_write_bytes (blob.c:152)
   ==6532==    by 0x144240FB: blob_write_uint32 (blob.c:194)
   ==6532==    by 0x144001A5: write_tex (nir_serialize.c:613)

and later (loads of):

   ==6532== Use of uninitialised value of size 8
   ==6532==    at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11)
   ==6532==    by 0x13F65014: util_hash_crc32 (crc32.c:127)
   ==6532==    by 0x13F5DABA: cache_put (disk_cache.c:947)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 08:11:22 +02:00
Tapani Pälli
b99c88037b i965: fix disk_cache leak when destroying context
==2780== 1,024 bytes in 1 blocks are possibly lost in loss record 180 of 205
   ==2780==    at 0x4C31A1E: calloc (vg_replace_malloc.c:711)
   ==2780==    by 0x13F6467E: util_queue_init (u_queue.c:309)
   ==2780==    by 0x13F5C9F6: disk_cache_create (disk_cache.c:369)
   ==2780==    by 0x13F05406: brw_disk_cache_init (brw_disk_cache.c:428)
   ==2780==    by 0x13F01E78: brwCreateContext (brw_context.c:1068)

Fixes: 1a61a8b9a7 ("i965: Initialize disk shader cache if MESA_GLSL_CACHE_DISABLE is false")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-29 08:11:14 +02:00
Tapani Pälli
28db950b51 i965: fix prog_data leak in brw_disk_cache
==25481== 576 bytes in 1 blocks are definitely lost in loss record 179 of 208
   ==25481==    at 0x4C2FB6B: malloc (vg_replace_malloc.c:299)
   ==25481==    by 0x1404E2CC: ralloc_size (ralloc.c:121)
   ==25481==    by 0x14119F82: read_and_upload (brw_disk_cache.c:176)
   ==25481==    by 0x1411A5C9: brw_disk_cache_upload_program (brw_disk_cache.c:271)
   ==25481==    by 0x1412FCA4: brw_upload_wm_prog (brw_wm.c:597)

Fixes: 516d50db31 ("i965: add initial implementation of on disk shader cache")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 08:11:03 +02:00
Timothy Arceri
9afc38c799 ac: fix indentation
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Timothy Arceri
03086f86ae ac: remove unused nir2llvmtype()
The last use of this was removed in the previous patch.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Timothy Arceri
fa29a9625e ac: fix gs load inputs type
This fixes the scenario where the input is a struct. With this
the Unreal engines Elemental demo now works on radeonsi.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-29 11:14:23 +11:00
Kai Wasserbäch
0aba967328 ac/nir: call glsl_get_sampler_dim() only once where possible
Changes since v1:
  * Rebased on top of e68150de26 and
    82adf53308.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-29 10:47:31 +11:00
Dave Airlie
2af66ba7e7 docs/features: add r600 ARB_query_buffer_object support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:34 +10:00
Dave Airlie
1c9ea24a19 r600: add ARB_query_buffer_object support
This uses a different shader than radeonsi, as we can't address non-256
aligned ssbos, which the radeonsi code does. This passes some extra
offsets into the shader.

It also contains a set of u64 instruction implementation that may
or may not be complete (at least the u64div is definitely not something
that works outside this use-case). If r600 grows 64-bit integers,
it will use the GLSL lowering for divmod.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:28 +10:00
Dave Airlie
a7ec366e50 r600/shader: refactor mul hi/lo instruction emission
This just makes it a bit simpler for cayman vs eg

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:42:17 +10:00
Dave Airlie
e0e23ea69c r600/eg: construct proper rat mask for image/buffers.
If the images/buffer bindings had a gap, this produced the wrong values,
this should fix that to generate the correct rat mask for mixes of
images/buffers/cbs.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-29 05:41:58 +10:00
Jon Turney
4a0bab1d7f meson: libdrm shouldn't appear in Requires.private: if it wasn't found
Otherwise, using pkg-config to retrieve flags will fail, e.g.

$ pkg-config gl --cflags
Package libdrm was not found in the pkg-config search path.
Perhaps you should add the directory containing `libdrm.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libdrm', required by 'gl', not found

Fixes: 3218056e0e ("meson: Build i965 and dri stack")

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
2018-01-27 18:13:18 +00:00
Eric Anholt
e5a81ac704 broadcom/vc5: Don't forget to get the BO offset when opening a dmabuf.
Fixes black display in DRI due to storing to 0x00000000.
2018-01-27 19:40:14 +11:00
Eric Anholt
314e9ee6c4 broadcom/vc5: Enable the driver on V3D 4.2.
The changes in 4.2 haven't impacted any of our CL or state struct entries
that I can see, so I haven't enabled custom compile for doing 4.2 instead
of 4.1.
2018-01-27 19:39:56 +11:00
Eric Anholt
71c7e9bea1 broadcom/vc5: Enable CLIF dumping of V3D 4.2. 2018-01-27 19:04:21 +11:00
Eric Anholt
91f899cbc1 broadcom/vc5: Update the compiler for V3D 4.2. 2018-01-27 19:04:21 +11:00
Eric Anholt
f2e41daac5 broadcom/vc5: Update QPU instruction pack/unpack for v4.2.
After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because
it's used for other barriers in compute.
2018-01-27 19:03:55 +11:00
Eric Anholt
96d3e8f134 broadcom/vc5: Add XML for V3D 4.2. 2018-01-27 18:57:58 +11:00
Eric Anholt
b026063b16 broadcom/vc5: Fix a race between XML codegen build and CLIF build. 2018-01-27 18:57:58 +11:00
Eric Anholt
de60ea4432 Android: Attempt to fix broadcom build after vc5 changes. 2018-01-27 18:03:58 +11:00
Marek Olšák
b633999a4e ac: rename and move si_const_array into common code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
e17eb8800f ac: move address space definitions to common code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
0d62370bbb ac: don't use byval LLVM qualifier in shaders
shader-db doesn't show any regression and 32-bit pointers with byval
are declared as VGPRs for some reason.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
0e40c6a7b7 gallium/radeon: set number of pb_cache buckets = number of heaps
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
175549e0e9 pb_cache: let drivers choose the number of buckets
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
ecfd521502 pb_cache: call os_time_get outside of the loop
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
e553cb5a68 gallium/radeon: simplify radeon_flags_from_heap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Timothy Arceri
041b18cf23 st/shader_cache: restore num_tgsi_tokens when loading from cache
Without this we will fail to correctly serialise programs when
using glGetProgramBinary() if the program was retrieved from
the disk cache rather than freshly compiled.

Fixes: c69b0dd681 "st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program"

Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104762
2018-01-27 10:06:16 +11:00
Marek Olšák
17423c993d winsys/amdgpu: fix assertion failure with UVD and VCE rings
Cc: 18.0 <mesa-stable@lists.freedesktop.org>
2018-01-26 23:12:11 +01:00
Brian Paul
ac0e9e343c mesa: remove MESA_FUNCTION
Just use __func__ in the two macros where it was used.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
bacf72a18d mesa: change gl_link_status enums to uppercase
follow the convention of other enums.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
aff5d9c256 mesa: change gl_compile_status enums to uppercase
To follow the convention of other enums.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul
d9832f1fc4 mesa: minor comment reformatting, whitespace fixes in mtypes.h
Trivial.
2018-01-26 13:52:42 -07:00
Rafael Antognolli
131e871385 i965/gen10: Use CS Stall instead of WriteImmediate.
Fixes: ca19ee33d7
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 12:02:34 -08:00
Rafael Antognolli
20578f81a6 anv/gen10: Emit CS stall and mark push constants dirty.
I got reviews and fixed the patches locally, but ended up merging the
ones that I sent originally to the list. This patch fixes those
mistakes.

Fixes: 78c125af39
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 11:59:17 -08:00
Rafael Antognolli
bcfd78e448 i965/gen10: Re-enable push constants.
The GPU hang caused by push constants is apparently fixed, so let's
enable them again.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:44 -08:00
Rafael Antognolli
78c125af39 anv/gen10: Ignore push constant packets during context restore.
Similar to the GL driver, ignore 3DSTATE_CONSTANT_* packets when doing a
context restore.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:40 -08:00
Rafael Antognolli
ca19ee33d7 i965/gen10: Ignore push constant packets during context restore.
These packets were causing GPU hangs when the context was restored,
possibly because they were pointing to BO's that were already
unreferenced. So we tell the hardware to ignore such packets after the
batch buffer ends, since we know those BO's are not around anymore.

This change fixes GPU hangs on CNL. The (partial) solution to this
problem so far was to entirely disable push constants on this platform.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 10:07:35 -08:00
Brian Paul
acaec6cdd9 mesa: silence MinGW 'may be unused uninitialized' warning in get.c
The warning happens on line 2114 for the memcpy(data, p, size) call.
I'm not sure why that generates the warning but not the earlier use
of p in the code.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 10:44:05 -07:00
Eleni Maria Stea
8096b558a7 mesa: Fix function pointers initialization in status tracker
We assigned the function that gets the device uuid to the GetDriverUuid
function pointer and the function that gets the driver uuid to the
GetDeviceUuid function pointer inside the state tracker. Exchanged the
pointers.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-26 08:17:55 -07:00
Iago Toral Quiroga
d3ce493b34 anv/pipeline: remove the pipeline layout field from anv_pipeline
It no longer has any users.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:47 +01:00
Iago Toral Quiroga
75a4802060 anv/cmd_buffer: add the pipeline layout to the pipeline state
We need to access the pipeline layout to compute correct dynamic
offsets for dyamic UBO/SSBO descriptors when we emit draw commands.
Instead of taking it from the pipeline object, store the layout
in the command buffer pipeline state.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:47 +01:00
Iago Toral Quiroga
e1a49f974b anv/pipeline: don't take the layout from the pipeline to compile shaders
The Vulkan spec states that VkPipelineLayout objects must not be
destroyed while any command buffer that uses them is in the recording
state, but it permits them to be destroyed otherwise. This means that
applications are allowed to free pipeline layouts after command recording
is finished even if there are pipeline objects that still exist and were
created with these layouts.

There are two solutions to this, one is to use reference counting on
pipeline layout objects. The other is to avoid holding references to
pipeline layouts where they are not really needed.

This patch takes a step towards the second option by making the
pipeline shader compile code take pipeline layout from the
VkGraphicsPipelineCreateInfo provided rather than the pipeline
object.

A follow-up patch will remove any remaining uses of the layout field
so we can remove it from the pipeline object and avoid the need
for reference counting.

v2: Use ANV_FROM_HANDLE, remove unnecessary braces (Jason)

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:46 +01:00
Iago Toral Quiroga
14f6275c92 anv/descriptor_set: add reference counting for descriptor set layouts
The spec states that descriptor set layouts can be destroyed almost
at any time:

   "VkDescriptorSetLayout objects may be accessed by commands that
    operate on descriptor sets allocated using that layout, and those
    descriptor sets must not be updated with vkUpdateDescriptorSets
    after the descriptor set layout has been destroyed. Otherwise,
    descriptor set layouts can be destroyed any time they are not in
    use by an API command."

v2: allocate off the device allocator with DEVICE scope (Jason)

Fixes the following work-in-progress CTS tests:
dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.graphics
dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.compute

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 14:06:46 +01:00
Samuel Pitoiset
e28233a527 ac/nir: set amdgpu.uniform and invariant.load for SSBOs
For descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
49b0a140a7 ac/nir: set amdgpu.uniform and invariant.load for UBOs
UBOs are constants buffers.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 41c36c45 ("amd/common: use ac_build_buffer_load() for emitting UBO loads")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
b453f38a47 ac/nir: set the noalias attribute on input pointers
This attribute is similar to the definition of restrict in
C99 and it might help LLVM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:28 +01:00
Samuel Pitoiset
310d17fcf1 ac: only load used channels when sampling buffer views
This allows to reduce the number of dwords that are loaded
with buffer_load_format_xyzw. For example, when the only used
channel is 1, the driver will emit buffer_load_format_x instead.

Shader stats for DOW3 (with some local hacky scripts for SPIRV):

143 shaders in 143 tests
Totals:
SGPRS: 5344 -> 5352 (0.15 %)
VGPRS: 3476 -> 3452 (-0.69 %)
Spilled SGPRs: 30 -> 29 (-3.33 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 269860 -> 269808 (-0.02 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1267 -> 1272 (0.39 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
51e14bc3c0 ac: pass the number of channels to ac_build_buffer_load_format()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
d7c93b558a ac: add ac_build_buffer_load_common() helper
For both versions of llvm.amdgcn.buffer.load.{format}.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
6d07e443ba radv: fix RADV_DEBUG=syncshaders on GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
5391de1262 radv: fix a GPU hang with RADV_DEBUG=syncshaders
The GPU hangs when the driver forces a PS_PARTIAL_FLUSH after
a dispatch call (and vice versa for graphics). Something has
changed in the kernel driver because it used to work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
b358e0e67f ac/shader: scan if fragment shaders write memory
It's better to do that in ac_shader_info.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
b9e2f78d6e ac/nir: only canonicalize 32-bit float min/max outputs on pre-GFX9
According to LLVM, only pre-GFX9 targets do not flush denorms
for fmin/fmax.

All dEQP-VK.glsl.builtin.precision.* still pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-26 12:14:27 +01:00
Jason Ekstrand
c8949e2498 anv/pipeline: Don't look at blend state unless we have an attachment
Without this, we may end up dereferencing blend before we check for
binding->index != UINT32_MAX.  However, Vulkan allows the blend state to
be NULL so long as you don't have any color attachments.  This fixes a
segfault when running The Talos Principal.

Fixes: 12f4e00b69
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-26 01:44:45 -08:00
Maxin B. John
8116b9170b anv_icd.py: improve reproducible builds
Sort the output to ensure build reproducibility

Signed-off-by: Maxin B. John <maxin.john@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 0ab04ba979 ("anv: Use python to generate ICD json files")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-26 01:37:45 -08:00
Ian Romanick
c7deeb71a8 nouveau: Remove no-op nvgl_logicop_func function
The values that this function returned were always the values passed
in.  The only thing that happened was either an assertion or undefined
results when an unknown value was passed in.  This doesn't seem that
useful.  Most of nouveau_gldefs.h could be removed in this manner.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-01-26 11:21:46 +08:00
Ian Romanick
f5b9c2a6e3 i915: Silence unused parameter warnings
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_alloc_window_storage’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:290:48: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 intel_alloc_window_storage(struct gl_context * ctx, struct gl_renderbuffer *rb,
                                                ^~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_nop_alloc_storage’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:303:74: warning: unused parameter ‘rb’ [-Wunused-parameter]
 intel_nop_alloc_storage(struct gl_context * ctx, struct gl_renderbuffer *rb,
                                                                          ^~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:32: warning: unused parameter ‘internalFormat’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                ^~~~~~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:55: warning: unused parameter ‘width’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                                       ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:69: warning: unused parameter ‘height’ [-Wunused-parameter]
                         GLenum internalFormat, GLuint width, GLuint height)
                                                                     ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_bind_framebuffer’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:47: warning: unused parameter ‘fb’ [-Wunused-parameter]
                        struct gl_framebuffer *fb, struct gl_framebuffer *fbread)
                                               ^~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:74: warning: unused parameter ‘fbread’ [-Wunused-parameter]
                        struct gl_framebuffer *fb, struct gl_framebuffer *fbread)
                                                                          ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_renderbuffer_update_wrapper’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:422:57: warning: unused parameter ‘intel’ [-Wunused-parameter]
 intel_renderbuffer_update_wrapper(struct intel_context *intel,
                                                         ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_blit_framebuffer_with_blitter’:
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:644:61: warning: unused parameter ‘filter’ [-Wunused-parameter]
                                     GLbitfield mask, GLenum filter)
                                                             ^~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
39f875a6b7 i915: Make intelEmitCopyBlit static
And rename to emit_copy_blit.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
9eed6bea6b i965: Make intelEmitCopyBlit static
And rename to emit_copy_blit.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
4e9e964de6 i915: Use enum color_logic_ops for blits
v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
21be331401 i965: Use enum color_logic_ops for blits
v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-01-26 11:21:46 +08:00
Ian Romanick
0aaa27f291 mesa: Pass the translated color logic op dd_function_table::LogicOpcode
And delete the resulting dead code.  This has only been compile-tested.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
cf0b26ec12 st/mesa: Use the translated color logic op from the context
And delete the resulting dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
0c69db895f i965: Use the translated color logic op from the context
And delete the resulting dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-01-26 11:21:46 +08:00
Ian Romanick
9c1f010f34 mesa: Also track a remapped version of the color logic op
With the exception of NVIDIA hardware, these are is the values that all
hardware and Gallium want.  The remapping is currently implemented in at
least 6 places.  This starts the process of consolidating to a single
place.

v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr
color_logic_ops src/) suggested by Brian.  Added some comments about the
selection of bit patterns for gl_logicop_mode and the GLenums.
Suggested by Nicolai.  Folded the GLenum_to_color_logicop macro into its
only users.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 11:21:46 +08:00
Bas Nieuwenhuizen
5a3404d443 radeonsi: Export signalled sync file instead of -1.
-1 is considered an error for EGL_ANDROID_native_fence_sync, so
we need to actually create a sync file.

Fixes: f536f45250 "radeonsi: implement sync_file import/export"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-26 01:26:53 +01:00
Jason Ekstrand
db682b8f0e i965/fs: Reset the register file to VGRF in lower_integer_multiplication
18fde36ced changed the way temporary
registers were allocated in lower_integer_multiplication so that we
allocate regs_written(inst) space and keep the stride of the original
destination register.  This was to ensure that any MUL which originally
followed the CHV/BXT integer multiply regioning restrictions would
continue to follow those restrictions even after lowering.  This works
fine except that I forgot to reset the register file to VGRF so, even
though they were assigned a number from alloc.allocate(), they had the
wrong register file.  This caused some GLES 3.0 CTS tests to start
failing on Sandy Bridge due to attempted reads from the MRF:

    ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64

This commit remedies this problem by, instead of copying inst->dst and
overwriting nr, just make a new register and set the region to match
inst->dst.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626
Fixes: 18fde36ced
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-01-25 13:58:55 -08:00
Jason Ekstrand
af9d4ce480 vulkan: Update the XML and headers to 1.0.68
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2018-01-25 13:30:05 -08:00
Dave Airlie
f4c534ef68 radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)
This seems to be broken, at least the cts tests fail.

This fixes:
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_4
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_8

2 samples seems to pass fine, amdvlk doesn't appear to enable TC for
possibly some other reasons here.

This is most likely a hack.

v1.1: add a bit of explaination text. (Samuel)
Fixes: ad3d98da9 (radv: enable tc compatible htile for d32s8 also.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-26 06:55:09 +10:00
Chuck Atkins
6ac5e851f1 configure.ac: add missing llvm dependencies to .pc files
v2: Only add as dependencies for gallium-osmesa and gallium-xlib

CC: <mesa-stable@lists.freedesktop.org>
Signed-of-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 14:54:08 -05:00
George Kyriazis
5d8f270d10 swr/rast: Optimize DumpToFile output size
Modify DumpToFile to only dump the function, not the entire module.
Reduces file sizes and speeds up the dumping.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
dfe4dd48ec swr/rast: Updated copyright dates
on knob-related files.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
36dbbf11a0 swr/rast: Move memory-related JIT functions
Move them to their own file (builder_mem.{h|cpp}).  Add builder_mem.cpp
to the build system.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
94922dbe4b swr/rast: Add extra (optional) parameter in GATHERPS
Now also takes in an additional parameter (draw context) for future
expansion.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
0b46c7b3b0 swr/rast: Better ExecCmd (i.e. system()) implmentation
Hides console window creation during JIT linker execution in apps that
don't have a console.  Remove hooking of CreateProcessInternalA - the
MSFT implementation just turns around and calls CreateProcessInternalW
which, we do hook.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
George Kyriazis
2d16b61bff swr/rast: Support USE_SIMD16_FRONTEND=0 for EarlyRast
Early Rasterization did not initially work with USE_SIMD16_FRONTEND=0.
Fix it so it works there, too.  Please note that the default setting
is USE_SIMD16_FRONTEND=1.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 13:26:49 -06:00
Brian Paul
123798eb44 mesa: whitespace fixes in attrib.c
Trivial.
2018-01-25 12:17:26 -07:00
Brian Paul
0e7aaaf5a5 mesa: whitespace fixes in varray.h
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
ba01589c0c mesa: include mtypes.h in varray.h
We actually use some of the types from mtypes.h so include it directly
instead of relying on indirectly including it via bufferobj.h

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
e4504be6fc mesa: s/gl_vertex_attrib_array/gl_array_attributes/ in comments
The structure type was renamed some time ago, but some comments
were not updated.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
6c724fb7c1 mesa: simplify _mesa_delete_list() a bit, add some assertions
All but two cases of the switch did the same n += InstSize[n[0].opcode]
instruction.  Just move it after the switch.

Add some sanity check assertions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
c860171c63 st/mesa: expand glDrawPixels cache to handle multiple images
The newest version of WSI Fusion makes several glDrawPixels calls
per frame.  By caching more than one image, we get better performance
when panning/zooming the map.

v2: move pixel unpack param checking out of cache search loop, per Roland
v3: also move unpack->BufferObj check out of loop, per Roland.
2018-01-25 12:17:26 -07:00
Brian Paul
5092610f29 st/mesa: add some debug code in st_choose_format()
To aid in debugging gallium surface format selection issues.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-25 12:17:26 -07:00
Brian Paul
94610758a3 svga: s/Bool/SVGA3dBool/ in SVGA3dDevCapResult
And fix whitespace.  To sync up with in-house code.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-25 11:56:33 -07:00
Emil Velikov
6aeef54644 configure.ac: correct driglx-direct help text
The default was toggled a while back, but the text wasn't updated.

Fixes: bd526ec9e1 ("configure: Always default to
--enable-driglx-direct")
Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-01-25 17:44:35 +00:00
Emil Velikov
7b744a494d swrast: remove non-applicable GLX_SWAP_COPY_OML comment
Noticed while skimming for GLX_ instances in the dri codebase.
Comment is completely off and was in such a state since day 1.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:57 +00:00
Emil Velikov
3e3956d6ae mapi: remove duplicate GL typedefs
Remove the instances already available in gl.h or glext.h.
Sadly GLclampx is only available in GLES(1) so we need to keep that one.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:50 +00:00
Emil Velikov
647f40298a mapi: remove non applicable HAVE_DIX_CONFIG_H hunk
Seeming artefact from when the xserver build was diving directly into
mesa's tree.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:48 +00:00
Emil Velikov
48e7bc6833 mapi: autotools: remove unused MAPI_FILES file list
The sole user was OpenVG, which was removed couple of years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-25 17:42:46 +00:00
Emil Velikov
785d9a4ed8 automake: st/mesa/tests: add st_tests_common.h to the tarball
Fixes: 6569b33b6e ("mesa/st/tests: unify MockCodeLine* classes")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
0beaf7ad3e automake: mesa: include vbo_private.h in the tarball
Fixes: a7cfec3be0 ("vbo: move VBO-private types, prototypes, etc. into
new vbo_private.h header")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
ac4437b20b automake: small cleanup after the meson.build inclusion
Namely extend the EXTRA_DIST list, instead of re-assigning it and bring
back a file dropped by mistake.

Fixes: 436ed65d38 ("autotools: include meson build files in tarball")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
50265cd9ee automake: anv: ship anv_extensions_gen.py in the tarball
Fixes: dd088d4bec ("anv/extensions: Generate a header file with
extension tables")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:29 +00:00
Emil Velikov
265d36c890 automake: vc5: remove non-applicable v3dx_simulator.h
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 17:06:28 +00:00
Roland Scheidegger
4fe662c58f gallivm: fix crash with seamless cube filtering with different min/mag filter
We are not allowed to modify the incoming coords values, or things may
crash (as we may be inside a llvm conditional and the values may be used
in another branch).
I recently broke this when fixing an issue with NaNs and seamless cube
map filtering, and it causes crashes when doing cubemap filtering
if the min and mag filters are different.
Add const to the pointers passed in to prevent this mishap in the future.

Fixes: a485ad0bcd ("gallivm: fix an issue with NaNs with seamless cube filtering")

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-01-25 18:03:38 +01:00
Eric Engestrom
57223fb07a egl: keep extension list sorted, per comment at the top
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2018-01-25 16:38:11 +00:00
George Kyriazis
0e879aad2f swr/rast: support llvm 3.9 type declarations
LLVM 3.9 was not taken into account in initial check-in.

Fixes: 01ab218bbc ("swr/rast: Initial work for debugging support.")
cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104749
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2018-01-25 08:22:52 -06:00
Samuel Pitoiset
e1331c9d61 ac/nir: add break statements in needs_view_index_sgpr()
Previous code is correct but as the first case statement uses
a break, keep it consistent.

CID: 1428579
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-25 13:59:52 +01:00
Eric Engestrom
0663ae0aa1 loader: let compiler figure out the length of the string
Basically, turn comment into code

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 11:40:25 +00:00
Eric Engestrom
57b0ccd178 meson: simplify dri3 logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-25 10:10:04 +00:00
Juan A. Suarez Romero
513c2263cb mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-25 09:54:31 +01:00
Jason Ekstrand
df13588d21 i965: Stop disabling aux during texture preparation
Previously, we were handling self-dependencies by marking the render
buffer and then passing disable_aux=true to prepare_texture so that it
would do a resolve.  This works but ends us up doing to much resolving
in some cases.  Specifically, if we're doing something such as mipmap
generation, this would cause us to resolve all levels of the texture if
even one of them is overlapping.

Instead, this commit makes us wait until we process the framebuffer to
do these resolves and we only resolve the slices needed for rendering.
Doing this resolve puts them into the pass-through state so, even if we
do texture using CCS_E, the CCS data will effectively be ignored and the
real surface contents read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
20f70ae385 i965/draw: Set NEW_AUX_STATE when draw aux changes
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
e52a9f18d6 i965: Replace draw_aux_buffer_disabled with draw_aux_usage
Instead of keeping an array of booleans, we now hang onto an array of
isl_aux_usage enums.  This means that the thing we are passing from
brw_draw.c to surface state setup is the thing that surface state setup
actually needs instead of an input to compute what it needs.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
468ea3cc45 i965/surface_state: Drop brw_aux_surface_disabled
The only purpose of this function is to disable aux on texture surfaces
when the corresponding renderbuffer has aux disabled.  However, the act
of disabling aux on the renderbuffer will cause it to be resolved and
intel_miptree_texture_aux_usage will already check the resolved status
of a texture and return ISL_AUX_USAGE_NONE for it.  Even if we used CCS
for it, that wouldn't really be a problem because the CCS will be in the
pass-through state and so it would effectively be ignored.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
d38ec24f53 i965/miptree: Add an aux_disabled parameter to render_aux_usage
Only one of the callers of intel_miptree_render_aux_usage actually took
brw->draw_aux_buffer_disabled into account.  This was causing us to
ignore draw_aux_buffer_disabled for the intel_miptree_prepare_render.
This isn't a problem because the draw_aux_buffer_disabled entry was set
during texture preparation and we already did the resolve at that time.
However, this also meant that the aux_usage we were passing to
brw_cache_flush_for_render and brw_render_cache_add_bo was wrong so our
automatic cache flushing around aux_usage changes wasn't happening.
This was causing GPU hangs in Oxenfree.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104711
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
dfe0217905 i965/miptree: Take an aux_usage in prepare/finish_render
Both callers of intel_miptree_prepare/finish_render have to call
intel_miptree_render_aux_usage anyway for other reasons.  They may as
well pass the result in instead of us calling it again.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-24 19:05:36 -08:00
Jason Ekstrand
7d4007d58a aubinator: Multiply count by 4 to compute buffer sizes
The count field is in terms of dwords and not bytes.
2018-01-24 19:05:36 -08:00
Timothy Arceri
e776791432 st/glsl_to_nir: remove reallocation of sampler/image location
As far as I can tell this always just reassigns the same value.

Also as we don't curretly store UniformHash in the shader cache
removing this will help with adding a shader cache to gallium
nir drivers.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-01-25 13:27:22 +11:00
Jordan Justen
62b68d05e7 docs: add 18.1.0-devel release notes template
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-24 17:10:58 -08:00
Jordan Justen
65c18b02fc mesa: bump version to 18.1.0-devel
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-24 17:10:58 -08:00
Greg V
8fae5eddd9 meson: handle LLVM 'x.x.xgit-revision' versions
When LLVM is built inside of a git repo (even way below, e.g. /usr/ports/.git
exists, and LLVM is built in /usr/ports/devel/llvm50/work), its version
becomes something like 5.0.0git-f8ab206b2176.

New meson versions already handle this, but we support older versions too.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-24 15:25:54 -08:00
Greg V
53f9131205 meson: fix getting cflags from pkg-config
get_pkgconfig_variable('cflags') always returns an empty list, it's a
function for getting *custom* variables.

Meson does not yet support asking for cflags, so explicitly invoke
pkg-config for now.

Fixes: 68076b8747 ("meson: build gallium vdpau state tracker")
Fixes: a817af8a89eb ("meson: build gallium xvmc state tracker")
Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-24 15:25:54 -08:00
Greg V
c38c60a63c meson: fix BSD build
CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-24 15:25:54 -08:00
Greg V
7c8cfe2d59 meson: fix missing dependencies
Fixes: 66f97f6640 ("meson: build radeonsi")
Reviewed-by: Emil Velikov <emil.velikov@colalbora.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-24 15:25:54 -08:00
Grazvydas Ignotas
0cc7370733 anv: correct a duplicate check in an assert
Looks like checking both sources was intended, instead of the first one
twice. Found with Coccinelle, coccinellery/xand/xand.cocci semantic patch.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-25 01:10:45 +02:00
Marc Dietrich
a2a1b0e75e meson: fix HAVE_LLVM version define in meson build
LLVM patch level is not included in HAVE_LLVM.

Fixes: e6418ab156 ("meson: build "radv" vulkan driver for radeon hardware")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
2018-01-24 14:04:20 -08:00
Dylan Baker
5781c3d1db meson: correctly set SYSCONFDIR for loading dirrc
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-24 13:10:32 -08:00
Dave Airlie
d2414e64e4 radv: add multisample Z optimisation from amdvlk
This was just found while reading for other stuff,
src/core/hw/gfxip/gfx6/gfx6DepthStencilView.cpp.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-25 06:48:11 +10:00
Dave Airlie
298554541d radv: move spi_baryc_cntl to pipeline
We need to enable the pos float location 2 mode anytime we have
persample not just when forced by the frag shader.

This fixes:
dEQP-VK.pipeline.multisample.min_sample_shading*

Fixes: 58c97a079 (radv: enable location at sample when persample is forced.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-25 06:47:28 +10:00
Marek Olšák
125c0529f3 gallium/u_tests: add texture_barrier and FBFETCH tests
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-24 21:08:45 +01:00
Marek Olšák
022c5b22fe radeonsi: don't ignore pitch for imported textures
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-24 21:08:45 +01:00
Scott D Phillips
0b8d38bd48 meson: Fix define for USE_SSE41
Before we were adding -DHAVE_SSE41 which isn't what the code is
looking for, so some uses of the sse4.1 code were always being
skipped.

v2: Don't add any compile check for the quite old -msse4.1 option (Dylan)

Fixes: 84486f6462 ("meson: Enable SSE4.1 optimizations")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-24 11:32:34 -08:00
Gert Wollny
8172b9ff48 mesa/st/glsl_to_tgsi: remove now unneeded assert.
With the implementation of the tracking of the registers used in reladdr
asserting that a driver calling merge_register() uses the address register
is no longer needed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:34:05 -07:00
Gert Wollny
f2040fbe48 mesa/st/tests: Add tests for lifetime tracking with indirect addressing
Add a code line type that accepts one layer of indirect addressing and
add tests to check that temporary register access used for indirect
addressing is accounted for in the lifetime estimation.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:34:00 -07:00
Gert Wollny
51c0cee267 mesa/st/glsl_to_tgsi: Add tracking of indirect addressing registers
So far indirect addressing was not tracked to estimate the temporary
life time, and it was not needed, because code to load the address
registers was always emitted eliminating the reladdr* handles in the
past glsl-to.tgsi stages. Now, with Mareks patch allowing any 1D register
to be used for addressing on some hardware this changed, and
the tracking becomes necessary.

Because the registers have no direct indication on whether the reladdr* was
already loaded into an address register, the temporaries in reladdr* are
always tracked as reads. This may result in a slight over-estimation of the
lifetime in the cases when the load to the address register was emitted.

v2: no changes
v3: Use debug_log variable instead of directly writing to std::err in debugging
    output.
v6: fix indention and typos

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
517e34c62f mesa/st/tests: Add tests for improved tracking of temporaries
Additional tests are added that check the tracking of access to temporaries
in if-else branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
807e2539e5 mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register merging
Improve the life-time evaluation of temporary registers by also tracking
writes in both if and else branches and in up to 32 nested scopes.
As a result the estimated required register life-times can be further
reduced enabling more registers to be merged.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
8dda01ef5a mesa/st/tests: cleanup whitespace usage and correct some comments
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
6569b33b6e mesa/st/tests: unify MockCodeLine* classes
* Merge the classes MockCodeLine and MockCodelineWithSwizzle into
   one, and  refactor tests accordingly.
 * Change memory allocations to use ralloc* interface.

 v2:
 * move the test classes into a conveniance library
 * rename the Mock* classes to Fake* since they are not really
   Mocks
 * Base assertion of correct number of src and dst registers in tests
   on what the operatand actually expects
 * Fix number of destinations in one test

 v6:
 * fix local includes using "..." insteadof <...>

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
ad1990629e mesa/st/tests: Fix zero-byte allocation leaks
Don't allocate a zero-sized array, when no texture offsets are given.

v5: correct spaces and empty lines

Reviewed-by: Brian Paul <brianp@vmware.com>(v4)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
ee48e3acb8 mesa/st/glsl_to_tgsi: Add some operators for glsl_to_tgsi related classes
Add the equal operator and the "<<" stream write operator for the
st_*_reg classes and the "<<" operator to the instruction class, and
make use of these operators in the debugging output.

v5: Fix empty lines

Reviewed-by: Brian Paul <brianp@vmware.com> (v4)
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Gert Wollny
6a3421078a mesa/program: Add missing file types to printout
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
2018-01-24 10:23:00 -07:00
Brian Paul
365a48abdd vbo: fix incorrect min/max_index values in display list draw call
This fixes another regression from commit 8e4efdc895 ("vbo: optimize
some display list drawing").  The problem was the min_index, max_index
values passed to the vbo drawing function were not computed to compensate
for the biased prim::start values.

https://bugs.freedesktop.org/show_bug.cgi?id=104746
https://bugs.freedesktop.org/show_bug.cgi?id=104742
https://bugs.freedesktop.org/show_bug.cgi?id=104690
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Fixes: 8e4efdc895 ("vbo: optimize some display list drawing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2018-01-24 10:12:49 -07:00
Brian Paul
2123bd2805 vbo: whitespace/formatting fixes in vbo_split_inplace.c
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
6b0109cf39 vbo: whitespace/formatting fixes in vbo.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
b9280031a8 vbo/i965: move vbo_all_varyings_in_vbos() to brw_draw.c
It's only used in brw_draw_prims().

s/GLboolean/bool/, etc.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a83f7e119c vbo: remove unused vbo_any_varyings_in_vbos() function
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
718f4251c5 vbo: remove unneeded #includes
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
f4376a0c2b vbo: remove vbo_context.h and change includes to use vbo.h instead
Now vbo.h is the public interface to the VBO module.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
aafb56a148 vbo: move remaining items from vbo_context.h to vbo.h
Non-VBO sources files sometimes included vbo.h while others included
vbo_context.h.  We're moving all public types, functions to the former.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a7cfec3be0 vbo: move VBO-private types, prototypes, etc. into new vbo_private.h header
Things which should not be used outside the VBO module.
More public/private clean-ups coming.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
d40fa42292 mesa: use new _vbo_install_exec_vtxfmt() function
Instead of reaching into the vbo_context object in vtxfmt.c

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
04a17ec327 nouveau: remove vbo_context() call
_vbo_DestroyContext() can be safely called even if there's no VBO
module.  Removes a dependency on the vbo_context() function.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
7b0ae96711 i965: use vbo_set_[indirect]_draw_func()
Instead of poking into the vbo_context object.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
3bbf8d9042 vbo: move vbo_sizeof_ib_type() into vbo_exec_array.c
It's only used in this one file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
a152cb7492 mesa: move vbo_count_tessellated_primitives() to api_validate.c
It's only used in this file and has nothing VBO-specific about it.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
5d3e10fd27 mesa: update comment on gl_display_list
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
cffa82327d mesa: whitespace clean-ups in mtypes.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
b3a1aa94d9 mesa: remove unused MAT_INDEX_AMBIENT/DIFFUSE/SPECULAR contants
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
67dc551ba9 vbo: move DLIST_DANGLING_REFS from mtypes.h to vbo_save_api.c
It's only used in this file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
cb7ef0df00 vbo: replace assert(0) with unreachable()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
8b3cb7c651 vbo: fix, add comment in vbo_save.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Brian Paul
67ebde19d4 vbo: whitespace, formatting fixes in vbo_split.[ch]
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-24 10:12:49 -07:00
Topi Pohjolainen
ec4bb693a0 i965: Don't try to disable render aux buffers for compute
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104546
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-24 10:54:08 +02:00
Jason Ekstrand
4064fe59e7 anv/cmd_buffer: Move gen7 index buffer state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:46 -08:00
Jason Ekstrand
38ec78049f anv/cmd_buffer: Move num_workgroups to compute state
While we're here, make it an anv_address.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:44 -08:00
Jason Ekstrand
95ff232294 anv/cmd_buffer: Move dynamic state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:43 -08:00
Jason Ekstrand
24caee8975 anv/cmd_buffer: Use a temporary variable for dynamic state
We were already doing this for some packets to keep the lines shorter.
We may as well just do it for all of them.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:40 -08:00
Jason Ekstrand
8bd5ec5b86 anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state
Vertex buffers are entirely a graphics pipeline thing.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:39 -08:00
Jason Ekstrand
e85aaec148 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:36 -08:00
Jason Ekstrand
97f96610c8 anv: Separate compute and graphics descriptor sets
The Vulkan spec says:

    "pipelineBindPoint is a VkPipelineBindPoint indicating whether the
    descriptors will be used by graphics pipelines or compute pipelines.
    There is a separate set of bind points for each of graphics and
    compute, so binding one does not disturb the other."

Up until now, we've been ignoring the pipeline bind point and had just
one bind point for everything.  This commit separates things out into
separate bind points.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102897
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:33 -08:00
Jason Ekstrand
31b2144c83 anv/cmd_buffer: Use anv_descriptor_for_binding for samplers
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:31 -08:00
Jason Ekstrand
b9e1ca16f8 anv/cmd_buffer: Add a helper for binding descriptor sets
This lets us unify some code between push descriptors and regular
descriptors.  It doesn't do much for us yet but it will.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:30 -08:00
Jason Ekstrand
90cceaa9dd anv/cmd_buffer: Refactor ensure_push_descriptor_set
It's now a function which returns the push descriptor set.  Since we set
the error on the command buffer, returning the error is a little
redundant.  Returning the descriptor set (or NULL on error) is more
convenient.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:28 -08:00
Jason Ekstrand
d5592e2fda anv: Remove semicolons from vk_error[f] definitions
With the semicolons, they can't be used in a function argument without
throwing syntax errors.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:27 -08:00
Jason Ekstrand
9af5379228 anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute
Initially, these just contain the pipeline in a base struct.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:25 -08:00
Jason Ekstrand
ddc2d28548 anv/cmd_buffer: Use some pre-existing pipeline temporaries
There are several places where we'd already saved the pipeline off to a
temporary variable but, due to an artifact of history, weren't actually
using that temporary everywhere.  No functional change.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:24 -08:00
Jason Ekstrand
cd3feea745 anv/cmd_buffer: Rework anv_cmd_state_reset
This splits anv_cmd_state_reset into separate init and finish functions.
This lets us share init code with cmd_buffer_create.  This potentially
fixes subtle bugs where we may have missed some bit of state that needs
to get initialized on command buffer creation.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:22 -08:00
Jason Ekstrand
d6c9a89d13 anv/cmd_buffer: Get rid of the meta query workaround
Meta has been gone for a long time.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:20 -08:00
Jason Ekstrand
bc0a21e348 anv/cmd_state: Drop the scratch_size field
This is a legacy left-over from the mechanism we used to use to handle
scratch.  The new (and better) mechanism doesn't use this.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:19 -08:00
Jason Ekstrand
4b69ba3817 anv/pipeline: Don't assert on more than 32 samplers
This prevents an assert when running one unreleased Vulkan game.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:08 -08:00
Dave Airlie
766589d89a radv: fix sample_mask_in loading. (v3.1)
This is ported from radeonsi and fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_*

v2: don't call this path for radeonsi, it does it in the epilog.
use the radeonsi code path.
v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel)
v3.1: set ps_iter_samples default to 1 (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: bdcbe7c76 (radv: add sample mask input support)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 14:25:11 +10:00
Dave Airlie
c727ea9370 radv: don't use hw resolves for r16g16 norm formats.
radeonsi has a workaround for this, but it uses a R16A16 format,
which vulkan doesn't have, we could probably come up with a work
around but for now just avoid hw resolves.

Fixes:
dEQP-VK.renderpass.suballocation.multisample.r16g16_*norm*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 09:01:12 +10:00
Dave Airlie
4df414bbd2 radv: don't use hw resolve for integer image formats
From reading AMDVLK it currently never uses hw resolve paths.

This patch takes from radeonsi which doesn't use hw resolve
for integer formats, and does the same for radv.

This fixes:
dEQP-VK.renderpass.suballocation.multisample*uint tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 08:53:18 +10:00
Dave Airlie
316d762186 radv: add fs_key meta format support to resolve passes.
Some of the hw resolve passes need the SPI color format setup
correctly.

This fixes lots of 16-bit and 32-bit format tests in
dEQP-VK.renderpass.suballocation.multisample*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-24 08:50:51 +10:00
Grazvydas Ignotas
224fd17e1e winsys/svga: check correct member after create
.mob_fenced was already checked, probably a copy-paste bug.
Found by Coccinelle.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-23 11:04:07 -07:00
Grazvydas Ignotas
08085df313 svga: fix context alloc error handling
'cleanup' path is dereferencing 'svga' a lot, 'done' is a better choice.
Found by Coccinelle.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-23 11:04:07 -07:00
Christoph Haag
4b4d929c27 meson: remove lib prefix from libd3dadapter9.so
Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-23 09:30:30 -08:00
Emil Velikov
3b6d232a5c docs: update calendar 18.0.0-rc1 is out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 17:02:17 +00:00
Eric Engestrom
eee8dd7c33 radeon: remove left over dead code
Fixes: 4e0d99a635 "r100: Use shared debug code"
Cc: Pauli Nieminen <suokkos@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-01-23 15:39:57 +00:00
Eric Engestrom
10f5e0dce2 docs: ask for backport nominations to cc: the author
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-23 15:39:57 +00:00
Marc Dietrich
911ca587f8 meson: fix some defines misspelled errors in meson.build
Defines
- HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
- HAVE_FUNC_ATTRIBUTE_VISIBILITY
were misspelled.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-01-23 15:39:57 +00:00
1879 changed files with 151662 additions and 78253 deletions

View File

@@ -148,6 +148,8 @@ Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
Eric Engestrom <eric@engestrom.ch> <eric.engestrom@imgtec.com>
Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
Fabian Bieler <der.fabe@gmx.net> <fabianbieler@fastmail.fm>

View File

@@ -17,12 +17,12 @@ env:
- DRI2PROTO_VERSION=dri2proto-2.8
- LIBPCIACCESS_VERSION=libpciaccess-0.13.4
- LIBDRM_VERSION=libdrm-2.4.74
- XCBPROTO_VERSION=xcb-proto-1.11
- LIBXCB_VERSION=libxcb-1.11
- XCBPROTO_VERSION=xcb-proto-1.13
- LIBXCB_VERSION=libxcb-1.13
- LIBXSHMFENCE_VERSION=libxshmfence-1.2
- LIBVDPAU_VERSION=libvdpau-1.1
- LIBVA_VERSION=libva-1.6.2
- LIBWAYLAND_VERSION=wayland-1.11.1
- LIBVA_VERSION=libva-1.7.0
- LIBWAYLAND_VERSION=wayland-1.15.0
- WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
- PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig
- LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
@@ -33,18 +33,18 @@ matrix:
- env:
- LABEL="meson Vulkan"
- BUILD=meson
- MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
- LLVM_VERSION=4.0
- MESON_OPTIONS="-Ddri-drivers=[] -Dgallium-drivers=[]"
- LLVM_VERSION=5.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
- llvm-toolchain-trusty-5.0
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
- llvm-5.0-dev
# Common
- xz-utils
- libexpat1-dev
@@ -53,7 +53,7 @@ matrix:
- env:
- LABEL="meson loaders/classic DRI"
- BUILD=meson
- MESON_OPTIONS="-Dvulkan-drivers= -Dgallium-drivers="
- MESON_OPTIONS="-Dvulkan-drivers=[] -Dgallium-drivers=[]"
addons:
apt:
packages:
@@ -92,12 +92,10 @@ matrix:
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC="gcc-4.8"
- OVERRIDE_CXX="g++-4.8"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
@@ -107,13 +105,41 @@ matrix:
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
- llvm-toolchain-trusty-4.0
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
- llvm-4.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium Drivers RadeonSI"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=5.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-5.0
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-5.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
@@ -133,7 +159,7 @@ matrix:
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
- GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv,imx"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
@@ -168,7 +194,7 @@ matrix:
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- GALLIUM_DRIVERS="r600"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
@@ -205,7 +231,7 @@ matrix:
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- GALLIUM_DRIVERS="r600"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
@@ -264,6 +290,39 @@ matrix:
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
# NOTE: Analogous to SWR above, building Clover is quite slow.
- LABEL="make Gallium ST Clover LLVM-6.0"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=6.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-6.0
# llvm-6 depends on gcc-4.9 which is not in main repo
- ubuntu-toolchain-r-test
packages:
- libclc-dev
# From sources above
- llvm-6.0-dev
- clang-6.0
- libclang-6.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium ST Other"
- BUILD=make
@@ -305,10 +364,8 @@ matrix:
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel check"
- LLVM_VERSION=3.9
- LLVM_VERSION=5.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl --with-platforms=x11,wayland"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --enable-dri3 --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
@@ -318,13 +375,12 @@ matrix:
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
- llvm-toolchain-trusty-5.0
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
- llvm-5.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
@@ -376,7 +432,7 @@ matrix:
- BUILD=scons
- SCONSFLAGS="-j4"
- SCONS_TARGET="swr=1"
- LLVM_VERSION=3.9
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# Keep it symmetrical to the make build. There's no actual SWR, yet.
- SCONS_CHECK_COMMAND="true"
@@ -385,13 +441,13 @@ matrix:
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
- llvm-toolchain-trusty-4.0
packages:
- scons
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
- llvm-4.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
@@ -405,6 +461,11 @@ matrix:
- MAKE_CHECK_COMMAND="make check"
- DRI_LOADERS="--with-platforms=x11 --disable-egl"
os: osx
- env:
- LABEL="macOS meson"
- BUILD=meson
- MESON_OPTIONS="-Degl=false"
os: osx
before_install:
- |
@@ -530,7 +591,9 @@ script:
export CFLAGS="$CFLAGS -isystem`pwd`";
./autogen.sh --enable-debug
mkdir build &&
cd build &&
../autogen.sh --enable-debug
$LIBUNWIND_FLAGS
$DRI_LOADERS
--with-dri-drivers=$DRI_DRIVERS

View File

@@ -70,8 +70,10 @@ LOCAL_CFLAGS += \
-DHAVE_DLADDR \
-DHAVE_DL_ITERATE_PHDR \
-DHAVE_LINUX_FUTEX_H \
-DHAVE_ENDIAN_H \
-DHAVE_ZLIB \
-DMAJOR_IN_SYSMACROS \
-DVK_USE_PLATFORM_ANDROID_KHR \
-fvisibility=hidden \
-Wno-sign-compare

View File

@@ -45,7 +45,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-libunwind \
--with-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv,imx \
--with-vulkan-drivers=intel,radeon
ACLOCAL_AMFLAGS = -I m4
@@ -64,7 +64,8 @@ EXTRA_DIST = \
meson_options.txt \
bin/meson.build \
include/meson.build \
bin/install_megadrivers.py
bin/install_megadrivers.py \
bin/meson_get_version.py
noinst_HEADERS = \
include/c99_alloca.h \
@@ -75,12 +76,15 @@ noinst_HEADERS = \
include/drm-uapi/drm_fourcc.h \
include/drm-uapi/drm_mode.h \
include/drm-uapi/i915_drm.h \
include/drm-uapi/tegra_drm.h \
include/drm-uapi/v3d_drm.h \
include/drm-uapi/vc4_drm.h \
include/D3D9 \
include/GL/wglext.h \
include/HaikuGL \
include/no_extern_c.h \
include/pci_ids
include/pci_ids \
include/vulkan
# We list some directories in EXTRA_DIST, but don't actually want to include
# the .gitignore files in the tarball.

10
PRESUBMIT.cfg Normal file
View File

@@ -0,0 +1,10 @@
# This sample config file disables all of the ChromiumOS source style checks.
# Comment out the disable-flags for any checks you want to leave enabled.
[Hook Overrides]
stray_whitespace_check: false
long_line_check: false
cros_license_check: false
tab_check: false
bug_field_check: false
test_field_check: false

79
README.rst Normal file
View File

@@ -0,0 +1,79 @@
`Mesa <https://mesa3d.org>`_ - The 3D Graphics Library
======================================================
Source
------
This repository lives at https://gitlab.freedesktop.org/mesa/mesa.
Other repositories are likely forks, and code found there is not supported.
Build status
------------
Travis:
.. image:: https://travis-ci.org/mesa3d/mesa.svg?branch=master
:target: https://travis-ci.org/mesa3d/mesa
Appveyor:
.. image:: https://img.shields.io/appveyor/ci/mesa3d/mesa.svg
:target: https://ci.appveyor.com/project/mesa3d/mesa
Coverity:
.. image:: https://scan.coverity.com/projects/139/badge.svg?flat=1
:target: https://scan.coverity.com/projects/mesa
Build & install
---------------
You can find more information in our documentation (`docs/install.html
<https://mesa3d.org/install.html>`_), but the recommended way is to use
Meson (`docs/meson.html <https://mesa3d.org/meson.html>`_):
.. code-block:: sh
$ mkdir build
$ cd build
$ meson ..
$ sudo ninja install
Support
-------
Many Mesa devs hang on IRC; if you're not sure which channel is
appropriate, you should ask your question on `Freenode's #dri-devel
<irc://chat.freenode.net#dri-devel>`_, someone will redirect you if
necessary.
Remember that not everyone is in the same timezone as you, so it might
take a while before someone qualified sees your question.
To figure out who you're talking to, or which nick to ping for your
question, check out `Who's Who on IRC
<https://dri.freedesktop.org/wiki/WhosWho/>`_.
The next best option is to ask your question in an email to the
mailing lists: `mesa-dev\@lists.freedesktop.org
<https://lists.freedesktop.org/mailman/listinfo/mesa-dev>`_
Bug reports
-----------
If you think something isn't working properly, please file a bug report
(`docs/bugs.html <https://mesa3d.org/bugs.html>`_).
Contributing
------------
Contributions are welcome, and step-by-step instructions can be found in our
documentation (`docs/submittingpatches.html
<https://mesa3d.org/submittingpatches.html>`_).
Note that Mesa uses email mailing-lists for patches submission, review and
discussions.

View File

@@ -116,6 +116,7 @@ MESON BUILD
R: Dylan Baker <dylan@pnwbakers.com>
R: Eric Engestrom <eric@engestrom.ch>
F: */meson.build
F: meson.build
F: meson_options.txt
ANDROID EGL SUPPORT

View File

@@ -1 +1 @@
18.0.0-rc5
18.2.0-devel

View File

@@ -35,13 +35,13 @@ clone_depth: 100
cache:
- win_flex_bison-2.5.9.zip
- llvm-3.3.1-msvc2013-mtd.7z
- llvm-5.0.1-msvc2015-mtd.7z
os: Visual Studio 2013
os: Visual Studio 2015
environment:
WINFLEXBISON_ARCHIVE: win_flex_bison-2.5.9.zip
LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z
LLVM_ARCHIVE: llvm-5.0.1-msvc2015-mtd.7z
install:
# Check pip
@@ -69,10 +69,10 @@ install:
- set LLVM=%CD%\llvm
build_script:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.0 llvm=1
after_build:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 check
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.0 llvm=1 check
# It's possible to setup notification here, as described in

View File

@@ -1,6 +0,0 @@
# fixes: The following commits were applied without the "cherry-picked from" tag
50265cd9ee4caffee853700bdcd75b92eedc0e7b automake: anv: ship anv_extensions_gen.py in the tarball
ac4437b20b87c7285b89466f05b51518ae616873 automake: small cleanup after the meson.build inclusion
# stable: The KHX extension is disabled all together in the stable branches.
2ffe395cba0f7b3c1f1c41062f4376eae3a188b5 radv: Don't expose VK_KHX_multiview on android.

View File

@@ -23,7 +23,7 @@ echo "<ul>"
echo ""
# extract fdo urls from commit log
git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\
git log --pretty=medium $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\
while read url
do
id=$(echo $url | cut -d'=' -f2)

View File

@@ -16,7 +16,7 @@ latest_branchpoint=`git merge-base origin/master HEAD`
git log --reverse --pretty=%H $latest_branchpoint > already_landed
# ... and the ones cherry-picked.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
@@ -38,7 +38,7 @@ do
# Place every "fixes:" tag on its own line and join with the next word
# on its line or a later one.
fixes=`git show -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`
fixes=`git show --pretty=medium -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`
# For each one try to extract the tag
fixes_count=`echo "$fixes" | wc -l`

View File

@@ -12,7 +12,7 @@
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

View File

@@ -1,6 +1,6 @@
#!/usr/bin/env python
# encoding=utf-8
# Copyright © 2017 Intel Corporation
# Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -35,26 +35,30 @@ def main():
parser.add_argument('drivers', nargs='+')
args = parser.parse_args()
to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), args.libdir)
if os.path.isabs(args.libdir):
to = os.path.join(os.environ.get('DESTDIR', '/'), args.libdir[1:])
else:
to = os.path.join(os.environ['MESON_INSTALL_DESTDIR_PREFIX'], args.libdir)
master = os.path.join(to, os.path.basename(args.megadriver))
if not os.path.exists(to):
os.makedirs(to)
shutil.copy(args.megadriver, master)
for each in args.drivers:
driver = os.path.join(to, each)
for driver in args.drivers:
abs_driver = os.path.join(to, driver)
if os.path.exists(driver):
os.unlink(driver)
print('installing {} to {}'.format(args.megadriver, driver))
os.link(master, driver)
if os.path.exists(abs_driver):
os.unlink(abs_driver)
print('installing {} to {}'.format(args.megadriver, abs_driver))
os.link(master, abs_driver)
try:
ret = os.getcwd()
os.chdir(to)
name, ext = os.path.splitext(each)
name, ext = os.path.splitext(driver)
while ext != '.so':
if os.path.exists(name):
os.unlink(name)

View File

@@ -74,24 +74,29 @@ AC_SUBST([OPENCL_VERSION])
# in the first entry.
LIBDRM_REQUIRED=2.4.75
LIBDRM_RADEON_REQUIRED=2.4.71
LIBDRM_AMDGPU_REQUIRED=2.4.89
LIBDRM_AMDGPU_REQUIRED=2.4.91
LIBDRM_INTEL_REQUIRED=2.4.75
LIBDRM_NVVIEUX_REQUIRED=2.4.66
LIBDRM_NOUVEAU_REQUIRED=2.4.66
LIBDRM_FREEDRENO_REQUIRED=2.4.89
LIBDRM_ETNAVIV_REQUIRED=2.4.82
LIBDRM_FREEDRENO_REQUIRED=2.4.92
LIBDRM_ETNAVIV_REQUIRED=2.4.89
LIBDRM_VC4_REQUIRED=2.4.89
dnl Versions for external dependencies
DRI2PROTO_REQUIRED=2.8
GLPROTO_REQUIRED=1.4.14
LIBOMXIL_BELLAGIO_REQUIRED=0.0
LIBVA_REQUIRED=0.38.0
LIBOMXIL_TIZONIA_REQUIRED=0.10.0
LIBVA_REQUIRED=0.39.0
VDPAU_REQUIRED=1.1
WAYLAND_REQUIRED=1.11
WAYLAND_EGL_BACKEND_REQUIRED=3
WAYLAND_PROTOCOLS_REQUIRED=1.8
XCB_REQUIRED=1.9.3
XCBDRI2_REQUIRED=1.8
XCBDRI3_MODIFIERS_REQUIRED=1.13
XCBGLX_REQUIRED=1.8.1
XCBPRESENT_MODIFIERS_REQUIRED=1.13
XDAMAGE_REQUIRED=1.1
XSHMFENCE_REQUIRED=1.1
XVMC_REQUIRED=1.0.6
@@ -103,9 +108,9 @@ dnl LLVM versions
LLVM_REQUIRED_GALLIUM=3.3.0
LLVM_REQUIRED_OPENCL=3.9.0
LLVM_REQUIRED_R600=3.9.0
LLVM_REQUIRED_RADEONSI=3.9.0
LLVM_REQUIRED_RADV=3.9.0
LLVM_REQUIRED_SWR=3.9.0
LLVM_REQUIRED_RADEONSI=5.0.0
LLVM_REQUIRED_RADV=5.0.0
LLVM_REQUIRED_SWR=4.0.0
dnl Check for progs
AC_PROG_CPP
@@ -116,6 +121,8 @@ dnl other CC/CXX flags related help
AC_ARG_VAR([CXX11_CXXFLAGS], [Compiler flag to enable C++11 support (only needed if not
enabled by default and different from -std=c++11)])
AM_PROG_CC_C_O
AC_PROG_GREP
AC_PROG_NM
AM_PROG_AS
AX_CHECK_GNU_MAKE
AC_CHECK_PROGS([PYTHON2], [python2.7 python2 python])
@@ -429,28 +436,41 @@ fi
AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
dnl Check for new-style atomic builtins
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
dnl Check for new-style atomic builtins. We first check without linking to
dnl -latomic.
AC_MSG_CHECKING(whether __atomic_load_n is supported)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
int main() {
int n;
return __atomic_load_n(&n, __ATOMIC_ACQUIRE);
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
dnl On some platforms, new-style atomics need a helper library
AC_MSG_CHECKING(whether -latomic is needed)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
uint64_t v;
int main() {
return (int)__atomic_load_n(&v, __ATOMIC_ACQUIRE);
}]])], GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=no, GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=yes)
AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC)
if test "x$GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC" = xyes; then
LIBATOMIC_LIBS="-latomic"
fi
struct {
uint64_t *v;
} x;
return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE) &
(int)__atomic_add_fetch(x.v, (uint64_t)1, __ATOMIC_ACQ_REL);
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=yes, GCC_ATOMIC_BUILTINS_SUPPORTED=no)
dnl If that didn't work, we try linking with -latomic, which is needed on some
dnl platforms.
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" != xyes; then
save_LDFLAGS=$LDFLAGS
LDFLAGS="$LDFLAGS -latomic"
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
int main() {
struct {
uint64_t *v;
} x;
return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE) &
(int)__atomic_add_fetch(x.v, (uint64_t)1, __ATOMIC_ACQ_REL);
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=yes LIBATOMIC_LIBS="-latomic",
GCC_ATOMIC_BUILTINS_SUPPORTED=no)
LDFLAGS=$save_LDFLAGS
fi
AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_SUPPORTED)
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = xyes; then
DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
fi
AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
AC_SUBST([LIBATOMIC_LIBS])
dnl Check if host supports 64-bit atomics
@@ -743,21 +763,6 @@ esac
AC_SUBST([LIB_EXT])
dnl
dnl potentially-infringing-but-nobody-knows-for-sure stuff
dnl
AC_ARG_ENABLE([texture-float],
[AS_HELP_STRING([--enable-texture-float],
[enable floating-point textures and renderbuffers @<:@default=disabled@:>@])],
[enable_texture_float="$enableval"],
[enable_texture_float=no]
)
if test "x$enable_texture_float" = xyes; then
AC_MSG_WARN([Floating-point textures enabled.])
AC_MSG_WARN([Please consult docs/patents.txt with your lawyer before building Mesa.])
DEFINES="$DEFINES -DTEXTURE_FLOAT_ENABLED"
fi
dnl
dnl Arch/platform-specific settings
dnl
@@ -862,6 +867,7 @@ fi
AC_HEADER_MAJOR
AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
AC_CHECK_HEADERS([endian.h])
AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
@@ -1311,14 +1317,19 @@ AC_ARG_ENABLE([vdpau],
[enable_vdpau=auto])
AC_ARG_ENABLE([omx],
[AS_HELP_STRING([--enable-omx],
[DEPRECATED: Use --enable-omx-bellagio instead @<:@default=auto@:>@])],
[AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio instead.])],
[DEPRECATED: Use --enable-omx-bellagio or --enable-omx-tizonia instead @<:@default=auto@:>@])],
[AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio or --enable-omx-tizonia instead.])],
[])
AC_ARG_ENABLE([omx-bellagio],
[AS_HELP_STRING([--enable-omx-bellagio],
[enable OpenMAX Bellagio library @<:@default=disabled@:>@])],
[enable_omx_bellagio="$enableval"],
[enable_omx_bellagio=no])
AC_ARG_ENABLE([omx-tizonia],
[AS_HELP_STRING([--enable-omx-tizonia],
[enable OpenMAX Tizonia library @<:@default=disabled@:>@])],
[enable_omx_tizonia="$enableval"],
[enable_omx_tizonia=no])
AC_ARG_ENABLE([va],
[AS_HELP_STRING([--enable-va],
[enable va library @<:@default=auto@:>@])],
@@ -1350,7 +1361,7 @@ GALLIUM_DRIVERS_DEFAULT="r300,r600,svga,swrast"
AC_ARG_WITH([gallium-drivers],
[AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@],
[comma delimited Gallium drivers list, e.g.
"i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,vc4,vc5,virgl,etnaviv,imx"
"i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,tegra,v3d,vc4,virgl,etnaviv,imx"
@<:@default=r300,r600,svga,swrast@:>@])],
[with_gallium_drivers="$withval"],
[with_gallium_drivers="$GALLIUM_DRIVERS_DEFAULT"])
@@ -1370,11 +1381,17 @@ if test "x$enable_opengl" = xno -a \
"x$enable_xvmc" = xno -a \
"x$enable_vdpau" = xno -a \
"x$enable_omx_bellagio" = xno -a \
"x$enable_omx_tizonia" = xno -a \
"x$enable_va" = xno -a \
"x$enable_opencl" = xno; then
AC_MSG_ERROR([at least one API should be enabled])
fi
if test "x$enable_omx_bellagio" = xyes -a \
"x$enable_omx_tizonia" = xyes; then
AC_MSG_ERROR([Can't enable both bellagio and tizonia at same time])
fi
# Building OpenGL ES1 and/or ES2 without OpenGL is not supported on mesa 9.0.x
if test "x$enable_opengl" = xno -a \
"x$enable_gles1" = xyes; then
@@ -1770,19 +1787,6 @@ if test "x$with_platforms" = xauto; then
with_platforms=$with_egl_platforms
fi
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
WAYLAND_SCANNER='')
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
fi
PKG_CHECK_EXISTS([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED], [have_wayland_protocols=yes], [have_wayland_protocols=no])
if test "x$have_wayland_protocols" = xyes; then
ac_wayland_protocols_pkgdatadir=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols`
fi
AC_SUBST(WAYLAND_PROTOCOLS_DATADIR, $ac_wayland_protocols_pkgdatadir)
# Do per platform setups and checks
platforms=`IFS=', '; echo $with_platforms`
for plat in $platforms; do
@@ -1791,13 +1795,22 @@ for plat in $platforms; do
PKG_CHECK_MODULES([WAYLAND_CLIENT], [wayland-client >= $WAYLAND_REQUIRED])
PKG_CHECK_MODULES([WAYLAND_SERVER], [wayland-server >= $WAYLAND_REQUIRED])
PKG_CHECK_MODULES([WAYLAND_PROTOCOLS], [wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED])
if test "x$enable_egl" = xyes; then
PKG_CHECK_MODULES([WAYLAND_EGL], [wayland-egl-backend >= $WAYLAND_EGL_BACKEND_REQUIRED])
fi
WAYLAND_PROTOCOLS_DATADIR=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols`
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
WAYLAND_SCANNER='')
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
fi
if test "x$WAYLAND_SCANNER" = "x:"; then
AC_MSG_ERROR([wayland-scanner is needed to compile the wayland platform])
fi
if test "x$have_wayland_protocols" = xno; then
AC_MSG_ERROR([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED is needed to compile the wayland platform])
fi
DEFINES="$DEFINES -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED"
;;
@@ -1818,6 +1831,9 @@ for plat in $platforms; do
android)
PKG_CHECK_MODULES([ANDROID], [cutils hardware sync])
if test -n "$with_gallium_drivers"; then
PKG_CHECK_MODULES([BACKTRACE], [backtrace])
fi
DEFINES="$DEFINES -DHAVE_ANDROID_PLATFORM"
;;
@@ -1832,6 +1848,7 @@ for plat in $platforms; do
;;
esac
done
AC_SUBST([WAYLAND_PROTOCOLS_DATADIR])
if test "x$enable_glx" != xno; then
if ! echo "$platforms" | grep -q 'x11'; then
@@ -1844,6 +1861,12 @@ if test x"$enable_dri3" = xyes; then
dri3_modules="x11-xcb xcb >= $XCB_REQUIRED xcb-dri3 xcb-xfixes xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules])
dri3_modifier_modules="xcb-dri3 >= $XCBDRI3_MODIFIERS_REQUIRED xcb-present >= $XCBPRESENT_MODIFIERS_REQUIRED"
PKG_CHECK_MODULES([XCB_DRI3_MODIFIERS], [$dri3_modifier_modules], [have_dri3_modifiers=yes], [have_dri3_modifiers=no])
if test "x$have_dri3_modifiers" == xyes; then
DEFINES="$DEFINES -DHAVE_DRI3_MODIFIERS"
fi
fi
AM_CONDITIONAL(HAVE_PLATFORM_X11, echo "$platforms" | grep -q 'x11')
@@ -2070,6 +2093,9 @@ if test -n "$with_vulkan_drivers"; then
PKG_CHECK_MODULES([AMDGPU], [libdrm >= $LIBDRM_AMDGPU_REQUIRED libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
radeon_llvm_check $LLVM_REQUIRED_RADV "radv"
require_x11_dri3 "radv"
if test "x$acv_mako_found" = xno; then
AC_MSG_ERROR([Python mako module v$PYTHON_MAKO_REQUIRED or higher not found])
fi
HAVE_RADEON_VULKAN=yes
;;
*)
@@ -2215,6 +2241,10 @@ if test -n "$with_gallium_drivers" -a "x$with_gallium_drivers" != xswrast; then
PKG_CHECK_EXISTS([libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED], [enable_omx_bellagio=yes], [enable_omx_bellagio=no])
fi
if test "x$enable_omx_tizonia" = xauto -a "x$have_omx_platform" = xyes; then
PKG_CHECK_EXISTS([libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED], [enable_omx_tizonia=yes], [enable_omx_tizonia=no])
fi
if test "x$enable_va" = xauto -a "x$have_va_platform" = xyes; then
PKG_CHECK_EXISTS([libva >= $LIBVA_REQUIRED], [enable_va=yes], [enable_va=no])
fi
@@ -2224,6 +2254,7 @@ if test "x$enable_dri" = xyes -o \
"x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \
"x$enable_omx_bellagio" = xyes -o \
"x$enable_omx_tizonia" = xyes -o \
"x$enable_va" = xyes; then
need_gallium_vl=yes
fi
@@ -2232,6 +2263,7 @@ AM_CONDITIONAL(NEED_GALLIUM_VL, test "x$need_gallium_vl" = xyes)
if test "x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \
"x$enable_omx_bellagio" = xyes -o \
"x$enable_omx_tizonia" = xyes -o \
"x$enable_va" = xyes; then
if echo $platforms | grep -q "x11"; then
PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
@@ -2265,9 +2297,27 @@ if test "x$enable_omx_bellagio" = xyes; then
fi
PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED])
gallium_st="$gallium_st omx_bellagio"
AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 1, [Use Bellagio for OMX IL])
else
AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 0)
fi
AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes)
if test "x$enable_omx_tizonia" = xyes; then
if test "x$have_omx_platform" != xyes; then
AC_MSG_ERROR([OMX requires at least one of the x11 or drm platforms])
fi
PKG_CHECK_MODULES([OMX_TIZONIA],
[libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED
tizilheaders >= $LIBOMXIL_TIZONIA_REQUIRED
libtizplatform >= $LIBOMXIL_TIZONIA_REQUIRED])
gallium_st="$gallium_st omx_tizonia"
AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 1, [Use Tizoina for OMX IL])
else
AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 0)
fi
AM_CONDITIONAL(HAVE_ST_OMX_TIZONIA, test "x$enable_omx_tizonia" = xyes)
if test "x$enable_va" = xyes; then
if test "x$have_va_platform" != xyes; then
AC_MSG_ERROR([VA requires at least one of the x11 drm or wayland platforms])
@@ -2441,6 +2491,15 @@ AC_ARG_WITH([omx-bellagio-libdir],
$PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libomxil-bellagio`])
AC_SUBST([OMX_BELLAGIO_LIB_INSTALL_DIR])
dnl Directory for OMX_TIZONIA libs
AC_ARG_WITH([omx-tizonia-libdir],
[AS_HELP_STRING([--with-omx-tizonia-libdir=DIR],
[directory for the OMX_TIZONIA libraries])],
[OMX_TIZONIA_LIB_INSTALL_DIR="$withval"],
[OMX_TIZONIA_LIB_INSTALL_DIR=`$PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libtizcore`])
AC_SUBST([OMX_TIZONIA_LIB_INSTALL_DIR])
dnl Directory for VA libs
AC_ARG_WITH([va-libdir],
@@ -2571,14 +2630,6 @@ if test -n "$with_gallium_drivers"; then
HAVE_GALLIUM_RADEONSI=yes
PKG_CHECK_MODULES([RADEON], [libdrm >= $LIBDRM_RADEON_REQUIRED libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
PKG_CHECK_MODULES([AMDGPU], [libdrm >= $LIBDRM_AMDGPU_REQUIRED libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
# Blacklist libdrm_amdgpu 2.4.90 because it causes a crash in older
# radeonsi with pretty much any app.
libdrm_version=`pkg-config libdrm_amdgpu --modversion`
if test "x$libdrm_version" = x2.4.90; then
AC_MSG_ERROR([radeonsi can't use libdrm 2.4.90 due to a compatibility issue. Use a newer or older version.])
fi
require_libdrm "radeonsi"
radeon_llvm_check $LLVM_REQUIRED_RADEONSI "radeonsi"
if test "x$enable_egl" = xyes; then
@@ -2603,6 +2654,10 @@ if test -n "$with_gallium_drivers"; then
ximx)
HAVE_GALLIUM_IMX=yes
;;
xtegra)
HAVE_GALLIUM_TEGRA=yes
require_libdrm "tegra"
;;
xswrast)
HAVE_GALLIUM_SOFTPIPE=yes
if test "x$enable_llvm" = xyes; then
@@ -2670,20 +2725,20 @@ if test -n "$with_gallium_drivers"; then
;;
xvc4)
HAVE_GALLIUM_VC4=yes
require_libdrm "vc4"
PKG_CHECK_MODULES([VC4], [libdrm >= $LIBDRM_VC4_REQUIRED])
PKG_CHECK_MODULES([SIMPENROSE], [simpenrose],
[USE_VC4_SIMULATOR=yes;
DEFINES="$DEFINES -DUSE_VC4_SIMULATOR"],
[USE_VC4_SIMULATOR=no])
;;
xvc5)
HAVE_GALLIUM_VC5=yes
xv3d)
HAVE_GALLIUM_V3D=yes
PKG_CHECK_MODULES([VC5_SIMULATOR], [v3dv3],
[USE_VC5_SIMULATOR=yes;
DEFINES="$DEFINES -DUSE_VC5_SIMULATOR"],
[AC_MSG_ERROR([vc5 requires the simulator])])
PKG_CHECK_MODULES([V3D_SIMULATOR], [v3dv3],
[USE_V3D_SIMULATOR=yes;
DEFINES="$DEFINES -DUSE_V3D_SIMULATOR"],
[USE_V3D_SIMULATOR=no])
;;
xpl111)
HAVE_GALLIUM_PL111=yes
@@ -2703,8 +2758,8 @@ if test -n "$with_gallium_drivers"; then
fi
# XXX: Keep in sync with LLVM_REQUIRED_SWR
AM_CONDITIONAL(SWR_INVALID_LLVM_VERSION, test "x$LLVM_VERSION" != x3.9.0 -a \
"x$LLVM_VERSION" != x3.9.1)
AM_CONDITIONAL(SWR_INVALID_LLVM_VERSION, test "x$LLVM_VERSION" != x4.0.0 -a \
"x$LLVM_VERSION" != x4.0.1)
if test "x$enable_llvm" = "xyes" -a "$with_gallium_drivers"; then
llvm_require_version $LLVM_REQUIRED_GALLIUM "gallium"
@@ -2727,6 +2782,9 @@ if test "x$HAVE_GALLIUM_VC4" != xyes -a "x$HAVE_GALLIUM_PL111" = xyes ; then
AC_MSG_ERROR([Building with pl111 requires vc4])
fi
if test "x$HAVE_GALLIUM_NOUVEAU" != xyes -a "x$HAVE_GALLIUM_TEGRA" = xyes; then
AC_MSG_ERROR([Building with tegra requires nouveau])
fi
detect_old_buggy_llvm() {
dnl llvm-config may not give the right answer when llvm is a built as a
@@ -2821,19 +2879,19 @@ AM_CONDITIONAL(HAVE_GALLIUM_PL111, test "x$HAVE_GALLIUM_PL111" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_R300, test "x$HAVE_GALLIUM_R300" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_R600, test "x$HAVE_GALLIUM_R600" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_RADEONSI, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_RADEON_COMMON, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_ETNAVIV, test "x$HAVE_GALLIUM_ETNAVIV" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_IMX, test "x$HAVE_GALLIUM_IMX" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_TEGRA, test "x$HAVE_GALLIUM_TEGRA" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SWR, test "x$HAVE_GALLIUM_SWR" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SWRAST, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes -o \
"x$HAVE_GALLIUM_LLVMPIPE" = xyes -o \
"x$HAVE_GALLIUM_SWR" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_V3D, test "x$HAVE_GALLIUM_V3D" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_VC4, test "x$HAVE_GALLIUM_VC4" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_VC5, test "x$HAVE_GALLIUM_VC5" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_VIRGL, test "x$HAVE_GALLIUM_VIRGL" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_STATIC_TARGETS, test "x$enable_shared_pipe_drivers" = xno)
@@ -2861,7 +2919,7 @@ AM_CONDITIONAL(HAVE_AMD_DRIVERS, test "x$HAVE_GALLIUM_RADEONSI" = xyes -o \
"x$HAVE_RADEON_VULKAN" = xyes)
AM_CONDITIONAL(HAVE_BROADCOM_DRIVERS, test "x$HAVE_GALLIUM_VC4" = xyes -o \
"x$HAVE_GALLIUM_VC5" = xyes)
"x$HAVE_GALLIUM_V3D" = xyes)
AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
"x$HAVE_I965_DRI" = xyes)
@@ -2872,8 +2930,8 @@ AM_CONDITIONAL(NEED_RADEON_DRM_WINSYS, test "x$HAVE_GALLIUM_R300" = xyes -o \
AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$enable_glx" = xgallium-xlib)
AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_LLVM, test "x$enable_llvm" = xyes)
AM_CONDITIONAL(USE_V3D_SIMULATOR, test x$USE_V3D_SIMULATOR = xyes)
AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes)
AM_CONDITIONAL(USE_VC5_SIMULATOR, test x$USE_VC5_SIMULATOR = xyes)
AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
@@ -2909,7 +2967,7 @@ AC_SUBST([XVMC_MAJOR], 1)
AC_SUBST([XVMC_MINOR], 0)
AC_SUBST([XA_MAJOR], 2)
AC_SUBST([XA_MINOR], 3)
AC_SUBST([XA_MINOR], 4)
AC_SUBST([XA_PATCH], 0)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_PATCH")
@@ -2958,37 +3016,33 @@ AC_CONFIG_FILES([Makefile
src/egl/Makefile
src/egl/main/egl.pc
src/egl/wayland/wayland-drm/Makefile
src/egl/wayland/wayland-egl/Makefile
src/egl/wayland/wayland-egl/wayland-egl.pc
src/gallium/Makefile
src/gallium/auxiliary/Makefile
src/gallium/auxiliary/pipe-loader/Makefile
src/gallium/drivers/freedreno/Makefile
src/gallium/drivers/ddebug/Makefile
src/gallium/drivers/i915/Makefile
src/gallium/drivers/llvmpipe/Makefile
src/gallium/drivers/noop/Makefile
src/gallium/drivers/nouveau/Makefile
src/gallium/drivers/pl111/Makefile
src/gallium/drivers/r300/Makefile
src/gallium/drivers/r600/Makefile
src/gallium/drivers/radeon/Makefile
src/gallium/drivers/radeonsi/Makefile
src/gallium/drivers/rbug/Makefile
src/gallium/drivers/softpipe/Makefile
src/gallium/drivers/svga/Makefile
src/gallium/drivers/swr/Makefile
src/gallium/drivers/trace/Makefile
src/gallium/drivers/tegra/Makefile
src/gallium/drivers/etnaviv/Makefile
src/gallium/drivers/imx/Makefile
src/gallium/drivers/v3d/Makefile
src/gallium/drivers/vc4/Makefile
src/gallium/drivers/vc5/Makefile
src/gallium/drivers/virgl/Makefile
src/gallium/state_trackers/clover/Makefile
src/gallium/state_trackers/dri/Makefile
src/gallium/state_trackers/glx/xlib/Makefile
src/gallium/state_trackers/nine/Makefile
src/gallium/state_trackers/omx_bellagio/Makefile
src/gallium/state_trackers/omx/Makefile
src/gallium/state_trackers/omx/bellagio/Makefile
src/gallium/state_trackers/omx/tizonia/Makefile
src/gallium/state_trackers/osmesa/Makefile
src/gallium/state_trackers/va/Makefile
src/gallium/state_trackers/vdpau/Makefile
@@ -2999,7 +3053,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/targets/d3dadapter9/d3d.pc
src/gallium/targets/dri/Makefile
src/gallium/targets/libgl-xlib/Makefile
src/gallium/targets/omx-bellagio/Makefile
src/gallium/targets/omx/Makefile
src/gallium/targets/opencl/Makefile
src/gallium/targets/opencl/mesa.icd
src/gallium/targets/osmesa/Makefile
@@ -3026,8 +3080,9 @@ AC_CONFIG_FILES([Makefile
src/gallium/winsys/sw/null/Makefile
src/gallium/winsys/sw/wrapper/Makefile
src/gallium/winsys/sw/xlib/Makefile
src/gallium/winsys/tegra/drm/Makefile
src/gallium/winsys/v3d/drm/Makefile
src/gallium/winsys/vc4/drm/Makefile
src/gallium/winsys/vc5/drm/Makefile
src/gallium/winsys/virgl/drm/Makefile
src/gallium/winsys/virgl/vtest/Makefile
src/gbm/Makefile
@@ -3063,6 +3118,7 @@ AC_CONFIG_FILES([Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/util/tests/string_buffer/Makefile
src/util/tests/vma/Makefile
src/util/xmlpool/Makefile
src/vulkan/Makefile])
@@ -3075,6 +3131,9 @@ $SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_bl
rm -f src/compiler/spirv/spirv_info.lo
echo "# dummy" > src/compiler/spirv/.deps/spirv_info.Plo
rm -f src/compiler/nir/.deps/nir_intrinsics.Plo
echo "# dummy" > src/compiler/nir/.deps/nir_intrinsics.Plo
dnl
dnl Output some configuration info for the user
dnl

View File

@@ -83,7 +83,7 @@ We try to quote the OpenGL specification where prudent:
* "An INVALID_OPERATION error is generated for any of the following
* conditions:
*
* * <length> is zero."
* * &lt;length&gt; is zero."
*
* Additionally, page 94 of the PDF of the OpenGL 4.5 core spec
* (30.10.2014) also says this, so it's no longer allowed for desktop GL,
@@ -94,7 +94,7 @@ Function comment example:
<pre>
/**
* Create and initialize a new buffer object. Called via the
* ctx->Driver.CreateObject() driver callback function.
* ctx-&gt;Driver.CreateObject() driver callback function.
* \param name integer name of the object
* \param type one of GL_FOO, GL_BAR, etc.
* \return pointer to new object or NULL if error

View File

@@ -168,6 +168,7 @@ the X server directly using (XCB-)DRI2 protocol.</p>
<p>This driver can share DRI drivers with <code>libGL</code>.</p>
</dd>
</dl>
<h2>Packaging</h2>

View File

@@ -88,22 +88,40 @@ This is a work-around for that.
<li>MESA_GL_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) and possibly the GL API type.
<ul>
<li> The format should be MAJOR.MINOR[FC]
<li> FC is an optional suffix that indicates a forward compatible context.
This is only valid for versions &gt;= 3.0.
<li> GL versions &lt; 3.0 are set to a compatibility (non-Core) profile
<li> GL versions = 3.0, see below
<li> GL versions &gt; 3.0 are set to a Core profile
<li> Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC
<ul>
<li> 2.1 - select a compatibility (non-Core) profile with GL version 2.1
<li> 3.0 - select a compatibility (non-Core) profile with GL version 3.0
<li> 3.0FC - select a Core+Forward Compatible profile with GL version 3.0
<li> 3.1 - select a Core profile with GL version 3.1
<li> 3.1FC - select a Core+Forward Compatible profile with GL version 3.1
</ul>
<li> Mesa may not really implement all the features of the given version.
(for developers only)
<li>The format should be MAJOR.MINOR[FC|COMPAT]
<li>FC is an optional suffix that indicates a forward compatible
context. This is only valid for versions &gt;= 3.0.
<li>COMPAT is an optional suffix that indicates a compatibility
context or GL_ARB_compatibility support. This is only valid for
versions &gt;= 3.1.
<li>GL versions &lt;= 3.0 are set to a compatibility (non-Core)
profile
<li>GL versions = 3.1, depending on the driver, it may or may not
have the ARB_compatibility extension enabled.
<li>GL versions &gt;= 3.2 are set to a Core profile
<li>Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC, 3.1COMPAT, X.Y, X.YFC,
X.YCOMPAT.
<ul>
<li>2.1 - select a compatibility (non-Core) profile with GL
version 2.1.
<li>3.0 - select a compatibility (non-Core) profile with GL
version 3.0.
<li>3.0FC - select a Core+Forward Compatible profile with GL
version 3.0.
<li>3.1 - select GL version 3.1 with GL_ARB_compatibility enabled
per the driver default.
<li>3.1FC - select GL version 3.1 with forward compatibility and
GL_ARB_compatibility disabled.
<li>3.1COMPAT - select GL version 3.1 with GL_ARB_compatibility
enabled.
<li>X.Y - override GL version to X.Y without changing the profile.
<li>X.YFC - select a Core+Forward Compatible profile with GL
version X.Y.
<li>X.YCOMPAT - select a Compatibility profile with GL version
X.Y.
</ul>
<li>Mesa may not really implement all the features of the given
version. (for developers only)
</ul>
<li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) for OpenGL ES.
@@ -135,6 +153,16 @@ home directory.
<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.
<li>MESA_SHADER_CAPTURE_PATH - see <a href="shading.html#capture">Capturing Shaders</a></li>
<li>MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH - see <a href="shading.html#replacement">Experimenting with Shader Replacements</a></li>
<li>MESA_VK_VERSION_OVERRIDE - changes the Vulkan physical device version
as returned in VkPhysicalDeviceProperties::apiVersion.
<ul>
<li>The format should be MAJOR.MINOR[.PATCH]</li>
<li>This will not let you force a version higher than the driver's
instance versionas advertised by vkEnumerateInstanceVersion</li>
<li>This can be very useful for debugging but some features may not be
implemented correctly. (For developers only)</li>
</ul>
</li>
</ul>
@@ -241,7 +269,7 @@ Mesa EGL supports different sets of environment variables. See the
Especially useful to toggle hud at specific points of application and
disable for unencumbered viewing the rest of the time. For example, set
GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).
Use kill -10 <pid> to toggle the hud as desired.
Use kill -10 &lt;pid&gt; to toggle the hud as desired.
<li>GALLIUM_HUD_DUMP_DIR - specifies a directory for writing the displayed
hud values into files.
<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=true for
@@ -313,6 +341,12 @@ such as the OpenGL program's name and command line arguments.
<li>See the driver code for other, lesser-used variables.
</ul>
<h3>WGL environment variables</h3>
<ul>
<li>WGL_SWAP_INTERVAL - to set a swap interval, equivalent to calling
wglSwapIntervalEXT() in an application. If this environment variable
is set, application calls to wglSwapIntervalEXT() will have no effect.
</ul>
<h3>VA-API state tracker environment variables</h3>
<ul>

BIN
docs/favicon.ico Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 13 KiB

BIN
docs/favicon.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.9 KiB

View File

@@ -24,16 +24,19 @@ not started
# OpenGL Core and Compatibility context support
OpenGL 3.1 and later versions are only supported with the Core profile.
There are no plans to support GL_ARB_compatibility. The last supported OpenGL
version with all deprecated features is 3.0. Some of the later GL features
are exposed in the 3.0 context as extensions.
Some drivers do not support the Compatibility profile or the
ARB_compatibility extensions. If an application does not request a
specific version without the forward-compatiblity flag, such drivers
will be limited to OpenGL 3.0. If an application requests OpenGL 3.1,
it will get a context that may or may not have the ARB_compatibility
extension enabled. Some of the later GL features are exposed in the 3.0
context as extensions.
Feature Status
------------------------------------------------------- ------------------------
GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
glBindFragDataLocation, glGetFragDataLocation DONE
GL_NV_conditional_render (Conditional rendering) DONE ()
@@ -65,7 +68,7 @@ GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
(*) freedreno, llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
Forward compatible context support/deprecations DONE ()
GL_ARB_draw_instanced (Instanced drawing) DONE ()
@@ -78,7 +81,7 @@ GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
GL_EXT_texture_snorm (Signed normalized textures) DONE ()
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
Core/compatibility profiles DONE
Geometry shaders DONE ()
@@ -93,7 +96,7 @@ GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GLX_ARB_create_context_profile DONE
GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, virgl
GL_ARB_blend_func_extended DONE (freedreno/a3xx, swr)
GL_ARB_explicit_attrib_location DONE (all drivers that support GLSL)
@@ -107,7 +110,7 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GL_ARB_vertex_type_2_10_10_10_rev DONE (freedreno, swr)
GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
GL_ARB_draw_buffers_blend DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
@@ -136,7 +139,7 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_transform_feedback3 DONE (i965/gen7+, llvmpipe, softpipe, swr)
GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
GL_ARB_ES2_compatibility DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_get_program_binary DONE (0 or 1 binary formats)
@@ -148,17 +151,17 @@ GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_texture_compression_bptc DONE (freedreno, i965)
GL_ARB_texture_compression_bptc DONE (freedreno, i965, virgl)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_transform_feedback_instanced DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_base_instance DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_shader_image_load_store DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_internalformat_query DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_internalformat_query DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_map_buffer_alignment DONE (all drivers)
@@ -171,29 +174,29 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
GL_ARB_copy_image DONE (i965, nv50, softpipe, llvmpipe)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, llvmpipe, softpipe, virgl)
GL_ARB_framebuffer_no_attachments DONE (freedreno, i965, softpipe)
GL_ARB_internalformat_query2 DONE (all drivers)
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (freedreno, i965, llvmpipe, softpipe, swr)
GL_ARB_multi_draw_indirect DONE (freedreno, i965, llvmpipe, softpipe, swr, virgl)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_robust_buffer_access_behavior DONE (i965)
GL_ARB_shader_image_size DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_shader_storage_buffer_object DONE (freedreno/a5xx, i965, softpipe)
GL_ARB_stencil_texturing DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (freedreno, nv50, i965, llvmpipe)
GL_ARB_stencil_texturing DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_texture_buffer_range DONE (freedreno, nv50, i965, llvmpipe, virgl)
GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (freedreno, i965, nv50, r600, llvmpipe, swr)
GL_ARB_clear_texture DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_enhanced_layouts DONE (i965, nv50, r600, llvmpipe, softpipe)
GL_ARB_buffer_storage DONE (freedreno, i965, nv50, llvmpipe, swr)
GL_ARB_clear_texture DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_enhanced_layouts DONE (i965, nv50, llvmpipe, softpipe)
- compile-time constant expressions DONE
- explicit byte offsets for blocks DONE
- forced alignment within blocks DONE
@@ -202,17 +205,17 @@ GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
- input/output block locations DONE
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object DONE (i965/hsw+)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_stencil8 DONE (freedreno, i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_texture_stencil8 DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
GL_ARB_ES3_1_compatibility DONE (i965/hsw+, r600)
GL_ARB_clip_control DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_cull_distance DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_derivative_control DONE (i965, nv50, r600)
GL_ARB_conditional_render_inverted DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
GL_ARB_cull_distance DONE (i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
GL_ARB_derivative_control DONE (i965, nv50, r600, virgl)
GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, r600)
@@ -226,13 +229,13 @@ GL 4.6, GLSL 4.60
GL_ARB_gl_spirv in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_indirect_parameters DONE (i965/gen7+, nvc0, radeonsi)
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_polygon_offset_clamp DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr)
GL_ARB_polygon_offset_clamp DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr, virgl)
GL_ARB_shader_atomic_counter_ops DONE (freedreno/a5xx, i965/gen7+, nvc0, r600, radeonsi, softpipe)
GL_ARB_shader_draw_parameters DONE (i965, nvc0, radeonsi)
GL_ARB_shader_group_vote DONE (i965, nvc0, radeonsi)
GL_ARB_spirv_extensions in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_texture_filter_anisotropic DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, radeonsi, llvmpipe, softpipe)
GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, nvc0, radeonsi, llvmpipe, softpipe, virgl)
GL_KHR_no_error DONE (all drivers)
(*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting
@@ -242,7 +245,7 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_compute_shader DONE (freedreno/a5xx, i965/gen7+, softpipe)
GL_ARB_draw_indirect DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr, virgl)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments DONE (freedreno, i965/gen7+, softpipe)
GL_ARB_program_interface_query DONE (all drivers)
@@ -252,12 +255,12 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
GL_ARB_shader_storage_buffer_object DONE (freedreno/a5xx, i965/gen7+, softpipe)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (freedreno, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965/gen7+, nv50, llvmpipe, softpipe)
GL_ARB_stencil_texturing DONE (freedreno, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_texture_multisample (Multisample textures) DONE (i965/gen7+, nv50, llvmpipe, softpipe, virgl)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (freedreno, i965/gen7+,)
GS5 Packing/bitfield/conversion functions DONE (i965/gen6+)
GS5 Enhanced textureGather DONE (freedreno, i965/gen7+,virgl)
GS5 Packing/bitfield/conversion functions DONE (i965/gen6+,virgl)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
Additional functionality not covered above:
@@ -269,7 +272,7 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
GL_EXT_color_buffer_float DONE (all drivers)
GL_KHR_blend_equation_advanced DONE (i965, nvc0)
GL_KHR_blend_equation_advanced DONE (i965, nvc0, radeonsi)
GL_KHR_debug DONE (all drivers)
GL_KHR_robustness DONE (i965, nvc0, radeonsi)
GL_KHR_texture_compression_astc_ldr DONE (freedreno, i965/gen9+)
@@ -297,16 +300,16 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_ARB_cl_event not started
GL_ARB_compute_variable_group_size DONE (nvc0, radeonsi)
GL_ARB_ES3_2_compatibility DONE (i965/gen8+)
GL_ARB_fragment_shader_interlock not started
GL_ARB_fragment_shader_interlock DONE (i965)
GL_ARB_gpu_shader_int64 DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
GL_ARB_parallel_shader_compile not started, but Chia-I Wu did some related work in 2014
GL_ARB_post_depth_coverage DONE (i965)
GL_ARB_post_depth_coverage DONE (i965, nvc0)
GL_ARB_robustness_isolation not started
GL_ARB_sample_locations not started
GL_ARB_seamless_cubemap_per_texture DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
GL_ARB_sample_locations DONE (nvc0)
GL_ARB_seamless_cubemap_per_texture DONE (i965, nvc0, radeonsi, r600, softpipe, swr, virgl)
GL_ARB_shader_ballot DONE (i965/gen8+, nvc0, radeonsi)
GL_ARB_shader_clock DONE (i965/gen7+, nv50, nvc0, r600, radeonsi)
GL_ARB_shader_stencil_export DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr)
GL_ARB_shader_stencil_export DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr, virgl)
GL_ARB_shader_viewport_layer_array DONE (i965/gen6+, nvc0, radeonsi)
GL_ARB_sparse_buffer DONE (radeonsi/CIK+)
GL_ARB_sparse_texture not started
@@ -316,8 +319,8 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_EXT_memory_object DONE (radeonsi)
GL_EXT_memory_object_fd DONE (radeonsi)
GL_EXT_memory_object_win32 not started
GL_EXT_semaphore not started
GL_EXT_semaphore_fd not started
GL_EXT_semaphore DONE (radeonsi)
GL_EXT_semaphore_fd DONE (radeonsi)
GL_EXT_semaphore_win32 not started
GL_KHR_blend_equation_advanced_coherent DONE (i965/gen9+)
GL_KHR_texture_compression_astc_hdr DONE (i965/bxt)

View File

@@ -16,6 +16,100 @@
<h1>News</h1>
<h2>June 3, 2018</h2>
<p>
<a href="relnotes/18.0.5.html">Mesa 18.0.5</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 18.0.5 will be the final release in the
18.0 series. Users of 18.0 are encouraged to migrate to the 18.1
series in order to obtain future fixes.
</p>
<h2>June 1, 2018</h2>
<p>
<a href="relnotes/18.1.1.html">Mesa 18.1.1</a> is released.
This is a bug-fix release.
</p>
<h2>May 18, 2018</h2>
<p>
<a href="relnotes/18.1.0.html">Mesa 18.1.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>May 17, 2018</h2>
<p>
<a href="relnotes/18.0.4.html">Mesa 18.0.4</a> is released.
This is a bug-fix release.
</p>
<h2>May 7, 2018</h2>
<p>
<a href="relnotes/18.0.3.html">Mesa 18.0.3</a> is released.
This is a bug-fix release.
</p>
<h2>April 28, 2018</h2>
<p>
<a href="relnotes/18.0.2.html">Mesa 18.0.2</a> is released.
This is a bug-fix release.
</p>
<h2>April 18, 2018</h2>
<p>
<a href="relnotes/18.0.1.html">Mesa 18.0.1</a> is released.
This is a bug-fix release.
</p>
<h2>April 18, 2018</h2>
<p>
<a href="relnotes/17.3.9.html">Mesa 17.3.9</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 17.3.9 will be the final release in the
17.3 series. Users of 17.3 are encouraged to migrate to the 18.0
series in order to obtain future fixes.
</p>
<h2>April 03, 2018</h2>
<p>
<a href="relnotes/17.3.8.html">Mesa 17.3.8</a> is released.
This is a bug-fix release.
</p>
<h2>March 27, 2018</h2>
<p>
<a href="relnotes/18.0.0.html">Mesa 18.0.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>March 21, 2018</h2>
<p>
<a href="relnotes/17.3.7.html">Mesa 17.3.7</a> is released.
This is a bug-fix release.
</p>
<h2>February 26, 2018</h2>
<p>
<a href="relnotes/17.3.6.html">Mesa 17.3.6</a> is released.
This is a bug-fix release.
</p>
<h2>February 19, 2018</h2>
<p>
<a href="relnotes/17.3.5.html">Mesa 17.3.5</a> is released.
This is a bug-fix release.
</p>
<h2>February 15, 2018</h2>
<p>
<a href="relnotes/17.3.4.html">Mesa 17.3.4</a> is released.
This is a bug-fix release.
</p>
<h2>January 18, 2018</h2>
<p>
<a href="relnotes/17.3.3.html">Mesa 17.3.3</a> is released.

View File

@@ -18,16 +18,22 @@
<h2 id="basic">1. Basic Usage</h2>
<p><strong>The Meson build system for Mesa is still under active development,
and should not be used in production environments.</strong></p>
<p><strong>The Meson build system is generally considered stable and ready
for production</strong></p>
<p>The meson build is currently only tested on linux, and is known to not work
on macOS, Windows, and haiku. This will be fixed.</p>
<p>The meson build is tested on on Linux, macOS, Cygwin and Haiku, it should
work on FreeBSD, DragonflyBSD, NetBSD, and OpenBSD.</p>
<p><strong>Mesa requires Meson >= 0.44.1 to build.</strong>
Some older versions of meson do not check that they are too old and will error
out in odd ways.
</p>
<p>
The meson program is used to configure the source directory and generates
either a ninja build file or Visual Studio® build files. The latter must
be enabled via the --backend switch, as ninja is the default backend on all
be enabled via the <code>--backend</code> switch, as ninja is the default backend on all
operating systems. Meson only supports out-of-tree builds, and must be passed a
directory to put built and generated sources into. We'll call that directory
"build" for examples.
@@ -43,7 +49,7 @@ along with a build directory to view the selected options for. This will show
your meson global arguments and project arguments, along with their defaults
and your local settings.
Moes does not currently support listing options before configure a build
Meson does not currently support listing options before configure a build
directory, but this feature is being discussed upstream.
</p>
@@ -54,13 +60,21 @@ directory, but this feature is being discussed upstream.
<p>
With additional arguments <code>meson configure</code> is used to change
options on already configured build directory. All options passed to this
command are in the form -D "command"="value".
command are in the form <code>-D "command"="value"</code>.
</p>
<pre>
meson configure build/ -Dprefix=/tmp/install -Dglx=true
</pre>
<p>
Note that options taking lists (such as <code>platforms</code>) are
<a href="http://mesonbuild.com/Build-options.html#using-build-options">a bit
more complicated</a>, but the simplest form compatible with Mesa options
is to use a comma to separate values (<code>-D platforms=drm,wayland</code>)
and brackets to represent an empty list (<code>-D platforms=[]</code>).
</p>
<p>
Once you've run the initial <code>meson</code> command successfully you can use
your configured backend to build the project. With ninja, the -C option can be
@@ -76,13 +90,14 @@ Without arguments, it will produce libGL.so and/or several other libraries
depending on the options you have chosen. Later, if you want to rebuild for a
different configuration, you should run <code>ninja clean</code> before
changing the configuration, or create a new out of tree build directory for
each configuration you want to build.
http://mesonbuild.com/Using-multiple-build-directories.html
each configuration you want to build
<a href="http://mesonbuild.com/Using-multiple-build-directories.html">as
recommended in the documentation</a>
</p>
<dl>
<dt><code>Environment Variables</code></dt>
<dd><p>Meson supports the standard CC and CXX envrionment variables for
<dd><p>Meson supports the standard CC and CXX environment variables for
changing the default compiler, and CFLAGS, CXXFLAGS, and LDFLAGS for setting
options to the compiler and linker.
@@ -93,9 +108,9 @@ the popular compilers, a complete list is available
These arguments are consumed and stored by meson when it is initialized or
re-initialized. Therefore passing them to meson configure will not do anything,
and passing them to ninja will only do something if ninja decides to
re-initialze meson, for example, if a meson.build file has been changed.
re-initialize meson, for example, if a meson.build file has been changed.
Changing these variables will not cause all targets to be rebuilt, so running
ninja clean is recomended when changing CFLAGS or CXXFLAGS. meson will never
ninja clean is recommended when changing CFLAGS or CXXFLAGS. Meson will never
change compiler in a configured build directory.
</p>
@@ -107,27 +122,27 @@ change compiler in a configured build directory.
CFLAGS=-Wno-typedef-redefinition ninja -C build-clang
</pre>
<p>Meson also honors DESTDIR for installs</p>
<p>Meson also honors <code>DESTDIR</code> for installs</p>
</dd>
<dt><code>LLVM</code></dt>
<dd><p>Meson includes upstream logic to wrap llvm-config using it's standard
dependncy interface. It will search $PATH (or %PATH% on windows) for
dependency interface. It will search <code>$PATH</code> (or <code>%PATH%</code> on windows) for
llvm-config, so using an LLVM from a non-standard path is as easy as
<code>PATH=/path/with/llvm-config:$PATH meson build</code>.
</p></dd>
</dl>
<dl>
<dt><code>PKG_CONFIG_PATH</code></dt>
<dd><p>The
<code>pkg-config</code> utility is a hard requirement for configuring and
building Mesa on Linux and *BSD. It is used to search for external libraries
on the system. This environment variable is used to control the search
path for <code>pkg-config</code>. For instance, setting
<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for
package metadata in <code>/usr/X11R6</code> before the standard
directories.</p>
building Mesa on Unix-like systems. It is used to search for external libraries
on the system. This environment variable is used to control the search path for
<code>pkg-config</code>. For instance, setting
<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for package
metadata in <code>/usr/X11R6</code> before the standard directories.</p>
</dd>
</dl>
@@ -136,7 +151,7 @@ One of the oddities of meson is that some options are different when passed to
the <code>meson</code> than to <code>meson configure</code>. These options are
passed as --option=foo to <code>meson</code>, but -Doption=foo to <code>meson
configure</code>. Mesa defined options are always passed as -Doption=foo.
<p>
</p>
<p>For those coming from autotools be aware of the following:</p>
@@ -145,24 +160,28 @@ configure</code>. Mesa defined options are always passed as -Doption=foo.
<dd><p>This option will set the compiler debug/optimisation levels to aid
debugging the Mesa libraries.</p>
<p>Note that in meson this defaults to "debugoptimized", and not setting it to
"release" will yield non-optimal performance and binary size. Not using "debug"
may interfer with debbugging as some code and validation will be optimized
away.
<p>Note that in meson this defaults to <code>debugoptimized</code>, and
not setting it to <code>release</code> will yield non-optimal
performance and binary size. Not using <code>debug</code> may interfere
with debugging as some code and validation will be optimized away.
</p>
<p> For those wishing to pass their own -O option, use the "plain" buildtype,
which cuases meson to inject no additional compiler arguments, only those in
the C/CXXFLAGS and those that mesa itself defines.</p>
<p> For those wishing to pass their own optimization flags, use the <code>plain</code>
buildtype, which causes meson to inject no additional compiler arguments, only
those in the C/CXXFLAGS and those that mesa itself defines.</p>
</dd>
</dl>
<dl>
<dt><code>-Db_ndebug</code></dt>
<dd><p>This option controls assertions in meson projects. When set to false
<dd><p>This option controls assertions in meson projects. When set to <code>false</code>
(the default) assertions are enabled, when set to true they are disabled. This
is unrelated to the <code>buildtype</code>; setting the latter to
<code>release</code> will not turn off assertions.
</p>
</dd>
</dl>
</div>
</body>
</html>

View File

@@ -1,31 +0,0 @@
ARB_texture_float:
Silicon Graphics, Inc. owns US Patent #6,650,327, issued November 18,
2003 [1].
SGI believes this patent contains necessary IP for graphics systems
implementing floating point rasterization and floating point
framebuffer capabilities described in ARB_texture_float extension, and
will discuss licensing on RAND terms, on an individual basis with
companies wishing to use this IP in the context of conformant OpenGL
implementations [2].
The source code to implement ARB_texture_float extension is included
and can be toggled on at compile time, for those who purchased a
license from SGI, or are in a country where the patent does not apply,
etc.
The software is provided "as is", without warranty of any kind, express
or implied, including but not limited to the warranties of
merchantability, fitness for a particular purpose and noninfringement.
In no event shall the authors or copyright holders be liable for any
claim, damages or other liability, whether in an action of contract,
tort or otherwise, arising from, out of or in connection with the
software or the use or other dealings in the software.
You should contact a lawyer or SGI's legal department if you want to
enable this extension.
[1] https://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
[2] https://www.opengl.org/registry/specs/ARB/texture_float.txt

View File

@@ -24,10 +24,12 @@ Some Linux distributions closely follow the latest Mesa releases. On others one
has to use unofficial channels.
<br>
There are some general directions:
<ul>
<li>Debian/Ubuntu based distros - PPA: xorg-edgers, oibaf and padoka</li>
<li>Fedora - Corp: erp and che</li>
<li>OpenSuse/SLES - OBS: X11:XOrg and pontostroy:X11</li>
<li>Gentoo/Archlinux - officially provided/supported</li>
</ul>
</p>
</div>

View File

@@ -39,67 +39,60 @@ if you'd like to nominate a patch in the next stable release.
<th>Notes</th>
</tr>
<tr>
<td rowspan="3">17.3</td>
<td>2018-01-26</td>
<td>17.3.4</td>
<td>Emil Velikov</td>
<td>2018-06-29</td>
<td>18.1.3</td>
<td>Dylan Baker</td>
<td></td>
</tr>
<tr>
<td>2018-02-09</td>
<td>17.3.5</td>
<td>Juan A. Suarez Romero</td>
<td>2018-07-13</td>
<td>18.1.4</td>
<td>Dylan Baker</td>
<td></td>
</tr>
<tr>
<td>2018-02-23</td>
<td>17.3.6</td>
<td>Juan A. Suarez Romero</td>
<td>Final planned release for the 17.3 series</td>
</tr>
<tr>
<td rowspan="7">18.0</td>
<td>2018-01-19</td>
<td>18.0.0-rc1</td>
<td>Emil Velikov</td>
<td>2018-07-27</td>
<td>18.1.5</td>
<td>Dylan Baker</td>
<td></td>
</tr>
<tr>
<td>2018-01-26</td>
<td>18.0.0-rc2</td>
<td>Emil Velikov</td>
<td>2018-08-10</td>
<td>18.1.6</td>
<td>Dylan Baker</td>
<td></td>
</tr>
<tr>
<td>2018-02-02</td>
<td>18.0.0-rc3</td>
<td>Emil Velikov</td>
<td></td>
<td>2018-08-24</td>
<td>18.1.7</td>
<td>Dylan Baker</td>
<td>Last planned 18.1.x release</td>
</tr>
<tr>
<td>2018-02-09</td>
<td>18.0.0-rc4</td>
<td>Emil Velikov</td>
<td>May be promoted to 18.0.0 final</td>
</tr>
<tr>
<td>2018-02-23</td>
<td>18.0.1</td>
<td rowspan="4">18.2</td>
<td>2018-07-20</td>
<td>18.2.0rc1</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2018-03-09</td>
<td>18.0.2</td>
<td>2018-07-27</td>
<td>18.2.0rc2</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2018-03-23</td>
<td>18.0.3</td>
<td>2018-08-03</td>
<td>18.2.0rc3</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2018-08-10</td>
<td>18.2.0rc4</td>
<td>Andres Gomez</td>
<td>Last planned RC/Final release</td>
</tr>
</table>
</div>

View File

@@ -21,7 +21,22 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/17.3.2.html">17.3.3 release notes</a>
<li><a href="relnotes/18.1.2.html">18.1.2 release notes</a>
<li><a href="relnotes/18.0.5.html">18.0.5 release notes</a>
<li><a href="relnotes/18.1.1.html">18.1.1 release notes</a>
<li><a href="relnotes/18.1.0.html">18.1.0 release notes</a>
<li><a href="relnotes/18.0.4.html">18.0.4 release notes</a>
<li><a href="relnotes/18.0.3.html">18.0.3 release notes</a>
<li><a href="relnotes/18.0.2.html">18.0.2 release notes</a>
<li><a href="relnotes/18.0.1.html">18.0.1 release notes</a>
<li><a href="relnotes/17.3.9.html">17.3.9 release notes</a>
<li><a href="relnotes/17.3.8.html">17.3.8 release notes</a>
<li><a href="relnotes/18.0.0.html">18.0.0 release notes</a>
<li><a href="relnotes/17.3.7.html">17.3.7 release notes</a>
<li><a href="relnotes/17.3.6.html">17.3.6 release notes</a>
<li><a href="relnotes/17.3.5.html">17.3.5 release notes</a>
<li><a href="relnotes/17.3.4.html">17.3.4 release notes</a>
<li><a href="relnotes/17.3.3.html">17.3.3 release notes</a>
<li><a href="relnotes/17.3.2.html">17.3.2 release notes</a>
<li><a href="relnotes/17.2.8.html">17.2.8 release notes</a>
<li><a href="relnotes/17.3.1.html">17.3.1 release notes</a>

275
docs/relnotes/17.3.4.html Normal file
View File

@@ -0,0 +1,275 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.4 Release Notes / January 15, 2018</h1>
<p>
Mesa 17.3.4 is a bug fix release which fixes bugs found since the 17.3.3 release.
</p>
<p>
Mesa 17.3.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
2d3a4c3cbc995b3e192361dce710d8c749e046e7575aa1b7d8fc9e6b4df28f84 mesa-17.3.4.tar.gz
71f995e233bc5df1a0dd46c980d1720106e7f82f02d61c1ca50854b5e02590d0 mesa-17.3.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104745">Bug 104745</a> - HEVC VDPAU decoding broken on RX 460 with UVD Firmware v1.130</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104818">Bug 104818</a> - mesa fails to build on ia64</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen&lt;8</li>
</ul>
<p>Bas Nieuwenhuizen (10):</p>
<ul>
<li>radv: Fix ordering issue in meta memory allocation failure path.</li>
<li>radv: Fix memory allocation failure path in compute resolve init.</li>
<li>radv: Fix freeing meta state if the device pipeline cache fails to allocate.</li>
<li>radv: Fix fragment resolve init memory allocation failure paths.</li>
<li>radv: Fix bufimage failure deallocation.</li>
<li>radv: Init variant entry with memset.</li>
<li>radv: Don't allow 3d or 1d depth/stencil textures.</li>
<li>ac/nir: Use instance_rate_inputs per attribute, not per variable.</li>
<li>ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.</li>
<li>ac/nir: Fix vector extraction if source vector has &gt;4 elements.</li>
</ul>
<p>Boyuan Zhang (2):</p>
<ul>
<li>radeon/vcn: add and manage render picture list</li>
<li>radeon/uvd: add and manage render picture list</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>configure.ac: add missing llvm dependencies to .pc files</li>
</ul>
<p>Dave Airlie (10):</p>
<ul>
<li>r600/sb: fix a bug emitting ar load from a constant.</li>
<li>ac/nir: account for view index in the user sgpr allocation.</li>
<li>radv: add fs_key meta format support to resolve passes.</li>
<li>radv: don't use hw resolve for integer image formats</li>
<li>radv: don't use hw resolves for r16g16 norm formats.</li>
<li>radv: move spi_baryc_cntl to pipeline</li>
<li>r600/sb: insert the else clause when we might depart from a loop</li>
<li>radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)</li>
<li>radv/gfx9: fix block compression texture views. (v2)</li>
<li>virgl: also remove dimension on indirect.</li>
</ul>
<p>Eleni Maria Stea (1):</p>
<ul>
<li>mesa: Fix function pointers initialization in status tracker</li>
</ul>
<p>Emil Velikov (18):</p>
<ul>
<li>cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext</li>
<li>cherry-ignore: swr: refactor swr_create_screen to allow for proper cleanup on error</li>
<li>cherry-ignore: anv: add explicit 18.0 only nominations</li>
<li>cherry-ignore: radv: fix sample_mask_in loading. (v3.1)</li>
<li>cherry-ignore: meson: multiple fixes</li>
<li>cherry-ignore: swr/rast: support llvm 3.9 type declarations</li>
<li>Revert "cherry-ignore: intel/fs: Use the original destination region for int MUL lowering"</li>
<li>cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs</li>
<li>cherry-ignore: add gen10 fixes</li>
<li>cherry-ignore: add r600/amdgpu 18.0 nominations</li>
<li>cherry-ignore: add i965 shader cache fixes</li>
<li>cherry-ignore: nir: mark unused space in packed_tex_data</li>
<li>radv: Stop advertising VK_KHX_multiview</li>
<li>cherry-ignore: radv: Don't expose VK_KHX_multiview on android.</li>
<li>configure.ac: correct driglx-direct help text</li>
<li>cherry-ignore: add meson fix</li>
<li>cherry-ignore: add a few more meson fixes</li>
<li>Update version to 17.3.4</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>radeon: remove left over dead code</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/shader: Initialize max_driver_temp_used correctly for the first time</li>
</ul>
<p>Grazvydas Ignotas (2):</p>
<ul>
<li>st/va: release held locks in error paths</li>
<li>st/vdpau: release held lock in error path</li>
</ul>
<p>Igor Gnatenko (1):</p>
<ul>
<li>link mesautil with pthreads</li>
</ul>
<p>Indrajit Das (4):</p>
<ul>
<li>st/omx_bellagio: Update default intra matrix per MPEG2 spec</li>
<li>radeon/uvd: update quantiser matrices only when requested</li>
<li>radeon/vcn: update quantiser matrices only when requested</li>
<li>st/va: clear pointers for mpeg2 quantiser matrices</li>
</ul>
<p>Jason Ekstrand (19):</p>
<ul>
<li>i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer</li>
<li>i965: Add more precise cache tracking helpers</li>
<li>i965/blorp: Add more destination flushing</li>
<li>i965: Track the depth and render caches separately</li>
<li>i965: Track format and aux usage in the render cache</li>
<li>Re-enable regular fast-clears (CCS_D) on gen9+</li>
<li>i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage</li>
<li>i965/miptree: Add an explicit tiling parameter to create_for_bo</li>
<li>i965/miptree: Use the tiling from the modifier instead of the BO</li>
<li>i965/bufmgr: Add a create_from_prime_tiled function</li>
<li>i965: Set tiling on BOs imported with modifiers</li>
<li>i965/miptree: Take an aux_usage in prepare/finish_render</li>
<li>i965/miptree: Add an aux_disabled parameter to render_aux_usage</li>
<li>i965/surface_state: Drop brw_aux_surface_disabled</li>
<li>intel/fs: Use the original destination region for int MUL lowering</li>
<li>anv/pipeline: Don't look at blend state unless we have an attachment</li>
<li>anv/cmd_buffer: Re-emit the pipeline at every subpass</li>
<li>anv: Stop advertising VK_KHX_multiview</li>
<li>i965: Call prepare_external after implicit window-system MSAA resolves</li>
</ul>
<p>Jon Turney (3):</p>
<ul>
<li>configure: Default to gbm=no on osx</li>
<li>glx/apple: include util/debug.h for env_var_as_boolean prototype</li>
<li>glx/apple: locate dispatch table functions to wrap by name</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>svga: Prevent use after free.</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.3</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Bind null render targets for shadow sampling + color.</li>
<li>i965: Bump official kernel requirement to Linux v3.9.</li>
</ul>
<p>Lucas Stach (2):</p>
<ul>
<li>etnaviv: dirty TS state when framebuffer has changed</li>
<li>renderonly: fix dumb BO allocation for non 32bpp formats</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: don't ignore pitch for imported textures</li>
</ul>
<p>Matthew Nicholls (2):</p>
<ul>
<li>radv: restore previous stencil reference after depth-stencil clear</li>
<li>radv: remove predication on cache flushes</li>
</ul>
<p>Maxin B. John (1):</p>
<ul>
<li>anv_icd.py: improve reproducible builds</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>winsys/radeon: Compute is_displayable in surf_drm_to_winsys</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>r600: don't do stack workarounds for hemlock</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: create pipeline layout objects for all meta operations</li>
</ul>
<p>Samuel Thibault (1):</p>
<ul>
<li>glx: fix non-dri build</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>ac: fix buffer overflow bug in 64bit SSBO loads</li>
<li>ac: fix visit_ssa_undef() for doubles</li>
</ul>
</div>
</body>
</html>

66
docs/relnotes/17.3.5.html Normal file
View File

@@ -0,0 +1,66 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.5 Release Notes / February 19, 2018</h1>
<p>
Mesa 17.3.5 is a bug fix release which fixes bugs found since the 17.3.4 release.
</p>
<p>
Mesa 17.3.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
bc1ee20366aae2affc37c89228f871f438136f70252005e9f842169bde976788 mesa-17.3.5.tar.gz
eb9228fc8aaa71e0205c1481c5b157752ebaec9b646b030d27478e25a6d7936a mesa-17.3.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.4</li>
<li>Update version to 17.3.5</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>ac/nir: Fix conflict resolution typo in handle_vs_input_decl</li>
</ul>
</div>
</body>
</html>

85
docs/relnotes/17.3.6.html Normal file
View File

@@ -0,0 +1,85 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.6 Release Notes / February 27, 2018</h1>
<p>
Mesa 17.3.6 is a bug fix release which fixes bugs found since the 17.3.5 release.
</p>
<p>
Mesa 17.3.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
d5e10ea3f0d11b06d2b0b235bba372a04278c39bc0e712090bda1f61842db188 mesa-17.3.6.tar.gz
e5915680d44ac9d05defdec529db7459ac9edd441c9845266eff2e2d3e57fbf8 mesa-17.3.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.5</li>
<li>Update version to 17.3.6</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>i965/draw: Do resolves properly for textures used by TXF</li>
<li>i965: Replace draw_aux_buffer_disabled with draw_aux_usage</li>
<li>i965/draw: Set NEW_AUX_STATE when draw aux changes</li>
<li>i965: Stop disabling aux during texture preparation</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Don't disable CCS for RT dependencies when dispatching compute.</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>i965: Don't try to disable render aux buffers for compute</li>
</ul>
</div>
</body>
</html>

312
docs/relnotes/17.3.7.html Normal file
View File

@@ -0,0 +1,312 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.7 Release Notes / March 21, 2018</h1>
<p>
Mesa 17.3.7 is a bug fix release which fixes bugs found since the 17.3.7 release.
</p>
<p>
Mesa 17.3.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f08de6d0ccb3dbca04b44790d85c3ff9e7b1cc4189d1b7c7167e5ba7d98736c0 mesa-17.3.7.tar.gz
0595904a8fba65a8fe853a84ad3c940205503b94af41e8ceed245fada777ac1e mesa-17.3.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Fix CmdCopyImage between uncompressed and compressed images</li>
</ul>
<p>Andriy Khulap (1):</p>
<ul>
<li>i965: Fix RELOC_WRITE typo in brw_store_data_imm64()</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>isl: Don't use surface format R32_FLOAT for typed atomic integer operations</li>
</ul>
<p>Bas Nieuwenhuizen (6):</p>
<ul>
<li>radv: Always lower indirect derefs after nir_lower_global_vars_to_local.</li>
<li>radeonsi: Export signalled sync file instead of -1.</li>
<li>radv: Implement WaitForFences with !waitAll.</li>
<li>radv: Implement waiting on non-submitted fences.</li>
<li>radv: Fix copying from 3D images starting at non-zero depth.</li>
<li>radv: Increase the number of dynamic uniform buffers.</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>glx: Properly handle cases where screen creation fails</li>
</ul>
<p>Daniel Stone (3):</p>
<ul>
<li>i965: Fix bugs in intel_from_planar</li>
<li>egl/wayland: Fix ARGB/XRGB transposition in config map</li>
<li>egl/wayland: Always use in-tree wayland-egl-backend.h</li>
</ul>
<p>Dave Airlie (9):</p>
<ul>
<li>r600: fix cubemap arrays</li>
<li>r600/sb/cayman: fix indirect ubo access on cayman</li>
<li>r600: fix xfb stream check.</li>
<li>ac/nir: to integer the args to bcsel.</li>
<li>r600/cayman: fix fragcood loading recip generation.</li>
<li>radv: don't support tc-compat on multisample d32s8 at all.</li>
<li>virgl: remap query types to hw support.</li>
<li>ac/nir: don't apply slice rounding on txf_ms</li>
<li>r600: implement callstack workaround for evergreen.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>glapi/check_table: Remove 'extern "C"' block</li>
<li>glapi: remove APPLE extensions from test</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.6</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.</li>
<li>ac/nir: Fix compiler warning about uninitialized dw_addr.</li>
<li>glsl/tests: Fix strict aliasing warning about int64/double.</li>
<li>glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>i965: Fix KHR_blend_equation_advanced with some render targets.</li>
</ul>
<p>Frank Binns (1):</p>
<ul>
<li>egl/dri2: fix segfault when display initialisation fails</li>
</ul>
<p>George Kyriazis (1):</p>
<ul>
<li>swr/rast: blend_epi32() should return Integer, not Float</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>
</ul>
<p>Gurchetan Singh (1):</p>
<ul>
<li>mesa: don't clamp just based on ARB_viewport_array extension</li>
</ul>
<p>Iago Toral Quiroga (2):</p>
<ul>
<li>i965/sbe: fix number of inputs for active components</li>
<li>i965/vec4: use a temp register to compute offsets for pull loads</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>radv: Really use correct HTILE expanded words.</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>intel/isl: Add an isl_color_value_is_zero helper</li>
<li>vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails</li>
<li>intel/fs: Set up sampler message headers in the visitor on gen7+</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>configure.ac: pthread-stubs not present on OpenBSD</li>
</ul>
<p>Jordan Justen (3):</p>
<ul>
<li>i965: Create new program cache bo when clearing the program cache</li>
<li>program: Don't reset SamplersValidated when restoring from shader cache</li>
<li>intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview</li>
</ul>
<p>Juan A. Suarez Romero (14):</p>
<ul>
<li>cherry-ignore: Explicit 18.0 only nominations</li>
<li>cherry-ignore: r600/compute: only mark buffer/image state dirty for fragment shaders</li>
<li>cherry-ignore: anv: Move setting current_pipeline to cmd_state_init</li>
<li>cherry-ignore: anv: Be more careful about fast-clear colors</li>
<li>cherry-ignore: Add patches that has a specific version for 17.3</li>
<li>cherry-ignore: r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>
<li>cherry-ignore: intel/compiler: Memory fence commit must always be enabled for gen10+</li>
<li>cherry-ignore: i965: Avoid problems from referencing orphaned BOs after growing.</li>
<li>cherry-ignore: include all Meson related fixes</li>
<li>cherry-ignore: ac/shader: fix vertex input with components.</li>
<li>cherry-ignore: i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.</li>
<li>cherry-ignore: anv/image: Separate modifiers from legacy scanout</li>
<li>cherry-ignore: glsl: Fix memory leak with known glsl_type instances</li>
<li>Update version to 17.3.7</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nvir/nvc0: fix legalizing of ld unlock c0[0x10000]</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Emit CS stall before MEDIA_VFE_STATE.</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: perf: ensure reading config IDs from sysfs isn't interrupted</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: align command buffer starting address to fix some Raven hangs</li>
<li>configure.ac: blacklist libdrm 2.4.90</li>
</ul>
<p>Michal Navratil (1):</p>
<ul>
<li>winsys/amdgpu: allow non page-aligned size bo creation from pointer</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>glsl/linker: fix bug when checking precision qualifier</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>ac/nir: use ordered float comparisons except for not equal</li>
<li>Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"</li>
</ul>
<p>Stephan Gerhold (1):</p>
<ul>
<li>util/build-id: Fix address comparison for binaries with LOAD vaddr &gt; 0</li>
</ul>
<p>Thomas Hellstrom (2):</p>
<ul>
<li>svga: Fix a leftover debug hack</li>
<li>loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: fix MemoryBuffer build break for llvm-6</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>nir: fix interger divide by zero crash during constant folding</li>
</ul>
<p>Tobias Droste (1):</p>
<ul>
<li>gallivm: Use new LLVM fast-math-flags API</li>
</ul>
<p>Vadym Shovkoplias (1):</p>
<ul>
<li>mesa: add glsl version query (v4)</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>swr/rast: Fix macOS macro.</li>
</ul>
</div>
</body>
</html>

147
docs/relnotes/17.3.8.html Normal file
View File

@@ -0,0 +1,147 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.8 Release Notes / April 03, 2018</h1>
<p>
Mesa 17.3.8 is a bug fix release which fixes bugs found since the 17.3.7 release.
</p>
<p>
Mesa 17.3.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
175d2ca9be2af3a8db6cd603986096d75da70f59699528d7b6675d542a305e23 mesa-17.3.8.tar.gz
8f9d9bf281c48e4a8f5228816577263b4c655248dc7666e75034ab422951a6b1 mesa-17.3.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (3):</p>
<ul>
<li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>
<li>st/nine: Fixes warning about implicit conversion</li>
<li>st/nine: Fix non inversible matrix check</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv/pipeline: fail if TCS/TES compile fail</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: get correct offset into LDS for indexed vars.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Make swrast display_sync the correct queue</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>meson/configure: detect endian.h instead of trying to guess when it's available</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>mesa: Don't write to user buffer in glGetTexParameterIuiv on error</li>
<li>i965/vec4: Fix null destination register in 3-source instructions</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Emit texture cache invalidates around blorp_copy</li>
</ul>
<p>Jordan Justen (2):</p>
<ul>
<li>i965: Calculate thread_count in brw_alloc_stage_scratch</li>
<li>i965: Hard code CS scratch_ids_per_subslice for Cherryview</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.7</li>
<li>cherry-ignore: ac/nir: pass the nir variable through tcs loading.</li>
<li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>
<li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>
<li>cherry-ignore: docs: fix 18.0 release note version</li>
<li>Update version to 17.3.8</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move feedback command inside of destroy function</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>nir: fix per_vertex_output intrinsic</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>
<li>nir: fix crash in loop unroll corner case</li>
</ul>
</div>
</body>
</html>

162
docs/relnotes/17.3.9.html Normal file
View File

@@ -0,0 +1,162 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.9 Release Notes / April 18, 2018</h1>
<p>
Mesa 17.3.9 is a bug fix release which fixes bugs found since the 17.3.8 release.
</p>
<p>
Mesa 17.3.9 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
4d625f65a1ff4cd8cfeb39e38f047507c6dea047502a0d53113c96f54588f340 mesa-17.3.9.tar.gz
c5beb5fc05f0e0c294fefe1a393ee118cb67e27a4dca417d77c297f7d4b6e479 mesa-17.3.9.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>dri_util: when overriding, always reset the core version</li>
<li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>
</ul>
<p>Axel Davy (2):</p>
<ul>
<li>st/nine: Declare lighting consts for ff shaders</li>
<li>st/nine: Do not use scratch for face register</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>ac/nir: Add workaround for GFX9 buffer views.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Initialise modifier to INVALID for DRI2</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>glsl: remove unreachable assert()</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>gbm: remove never-implemented function</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>mesa: Inherit texture view multi-sample information from the original texture images.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/vars_to_ssa: Remove copies from the correct set</li>
<li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>
<li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>
<li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.8</li>
<li>cherry-ignore: Explicit 18.0 only nominations</li>
<li>Update version to 17.3.9</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix number of planes for depth &amp; stencil</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix picking the method for resolve subpass</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965: Extend the negative 32-bit deltas to 64-bits</li>
</ul>
<p>Timothy Arceri (6):</p>
<ul>
<li>gallium/pipebuffer: fix parenthesis location</li>
<li>glsl: always call do_lower_jumps() after loop unrolling</li>
<li>ac: add if/loop build helpers</li>
<li>radeonsi: make use of if/loop build helpers in ac</li>
<li>ac: make use of if/loop build helpers</li>
<li>mesa: free debug messages when destroying the debug state</li>
</ul>
<p>Xiong, James (1):</p>
<ul>
<li>i965: return the fourcc saved in __DRIimage when possible</li>
</ul>
</div>
</body>
</html>

View File

@@ -1,76 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.4.0 Release Notes / TBD</h1>
<p>
Mesa 17.4.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 17.4.1.
</p>
<p>
Mesa 17.4.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false"</li>
<li>GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on r600/evergreen+</li>
<li>GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on r600/evergreen+</li>
<li>GL_ARB_shader_storage_buffer_object on r600/evergreen+<li>
<li>GL_ARB_compute_shader on r600/evergreen+<li>
<li>GL_ARB_cull_distance on r600/evergreen+</li>
<li>GL_ARB_enhanced_layouts on r600/evergreen+</li>
<li>GL_ARB_bindless_texture on nvc0/kepler</li>
<li>OpenGL 4.3 on r600/evergreen with hw fp64 support</li>
<li>Support 1 binary format for GL_ARB_get_program_binary on i965.
(For the 18.0 release, 0 formats continue to be supported in
compatibility profiles.)</li>
<li>Cannonlake support on i965 and anv</li>
</ul>
<h2>Bug fixes</h2>
<ul>
TBD
</ul>
<h2>Changes</h2>
<ul>
<li>Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL</li>
</ul>
</div>
</body>
</html>

321
docs/relnotes/18.0.0.html Normal file
View File

@@ -0,0 +1,321 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.0 Release Notes / March 27 2018</h1>
<p>
Mesa 18.0.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 18.0.1.
</p>
<p>
Mesa 18.0.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
93c2d3504b2871ac2146603fb1270f341d36a39695e2950a469c5eac74f98457 mesa-18.0.0.tar.gz
694e5c3d37717d23258c1f88bc134223c5d1aac70518d2f9134d6df3ee791eea mesa-18.0.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false"</li>
<li>GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on r600/evergreen+</li>
<li>GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on r600/evergreen+</li>
<li>GL_ARB_shader_storage_buffer_object on r600/evergreen+</li>
<li>GL_ARB_compute_shader on r600/evergreen+</li>
<li>GL_ARB_cull_distance on r600/evergreen+</li>
<li>GL_ARB_enhanced_layouts on r600/evergreen+</li>
<li>GL_ARB_bindless_texture on nvc0/kepler</li>
<li>OpenGL 4.3 on r600/evergreen with hw fp64 support</li>
<li>Support 1 binary format for GL_ARB_get_program_binary on i965.
(For the 18.0 release, 0 formats continue to be supported in
compatibility profiles.)</li>
<li>Cannonlake support on i965 and anv</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85564">Bug 85564</a> - Dead Island rendering issues</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97852">Bug 97852</a> - Unreal Engine corrupted preview viewport</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101560">Bug 101560</a> - SPIR-V OpSwitch with int64 not supported even though shaderInt64 is true</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101691">Bug 101691</a> - gfx corruption on windowed 3d-apps running on dGPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102264">Bug 102264</a> - Missing MESA_FORMAT_{B8G8R8A8,B8G8R8X8}_SRGB formats</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102354">Bug 102354</a> - Mesa 17.2 no longer can give SRGB-capable framebuffer on i965, even though Mesa 17.1.x does.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102358">Bug 102358</a> - WarThunder freezes at start, with activated vsync (vblank_mode=2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU hang in Valve games based on Source 1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102503">Bug 102503</a> - Report SRGB framebuffer to SuperTuxKart to workaround SuperTuxKart crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102665">Bug 102665</a> - test_glsl_to_tgsi_lifetime.cpp:53:67: error: &gt;&gt; should be &gt; &gt; within a nested template argument list</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102897">Bug 102897</a> - Separate bind points are not implemented correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102955">Bug 102955</a> - HyperZ related rendering issue in ARK: Survival Evolved</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085">Bug 103085</a> - [ivb byt hsw] piglit.spec.arb_indirect_parameters.tf-count-arrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103101">Bug 103101</a> - [SKL][bisected] DiRT Rally GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103128">Bug 103128</a> - [softpipe] piglit fs-ldexp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103283">Bug 103283</a> - drm_get_device_name_for_fd is broken on FreeBSD</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103496">Bug 103496</a> - svga_screen.c:26:46: error: git_sha1.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103513">Bug 103513</a> - [build failure] radv_shader.c:683:2: error: format not a string literal and no format arguments [-Werror=format-security]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103537">Bug 103537</a> - i965: Shadow of Mordor broken since commit 379b24a40d3d34ffdaaeb1b328f50e28ecb01468 on Haswell</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103653">Bug 103653</a> - Unreal segfault since gallium/u_threaded: avoid syncs for get_query_result</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103658">Bug 103658</a> - addrlib/gfx9/gfx9addrlib.cpp:727:50: error: expected expression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103674">Bug 103674</a> - u_queue.c:173:7: error: implicit declaration of function 'timespec_get' is invalid in C99</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103759">Bug 103759</a> - plasma desktop corrupted rendering</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103784">Bug 103784</a> - [bisected] Egl changes breaks all of EGL</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103801">Bug 103801</a> - [i965] &gt;Observer_ issue</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103808">Bug 103808</a> - [radeonsi, bisected] World of Warcraft scribbling all over screen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103902">Bug 103902</a> - Portal 2 game hangs at startup with latest mesa dev</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103904">Bug 103904</a> - Source engine-based games won't hang at start without R600_DEBUG=vs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of memfd_create follows non-static declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103942">Bug 103942</a> - KHR-GL46.enhanced_layouts.varying* regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103955">Bug 103955</a> - Using array in structure results in wrong GLSL compilation output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104005">Bug 104005</a> - [sklgt4e] GPU hangs in Car_Chase</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104141">Bug 104141</a> - include/c11/threads_posix.h:96: undefined reference to `pthread_once'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104183">Bug 104183</a> - mesa-17.3.0/src/broadcom/qpu/qpu_pack.c:171]: (error) Invalid memcmp() argument</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104199">Bug 104199</a> - [i965 bisected] BIO and EM Vision in &gt;Observer_ is broken since commit af2c320190f3c73180f1610c8df955a7fa2a4d09</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104213">Bug 104213</a> - NULL pointer access crashes on compiling Vulkan compute shaders after &quot;anv: Add support for the variablePointers feature&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104214">Bug 104214</a> - Dota crashes when switching from game to desktop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104226">Bug 104226</a> - [bisected] Anvil accesses uninitialized memory while compiling shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104231">Bug 104231</a> - DispatchSanity_test.GL30 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104246">Bug 104246</a> - Talos Principle Vulkan version crash: spirv_to_nir() returns NULL entry_point</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104271">Bug 104271</a> - i965: Timeout in dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.5</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104288">Bug 104288</a> - Steamroll needs allow_glsl_cross_stage_interpolation_mismatch=true</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104331">Bug 104331</a> - [r600g] Ogre demo &quot;TutorialUAV01&quot; crash at r600_decompress_color_images</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104338">Bug 104338</a> - NULL pointer access crash on Sacha Willems' Vulkan raytracing demo after &quot;spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104359">Bug 104359</a> - Mesa freezes in &quot;vtn_cfg_walk_blocks&quot; with Sacha Willems' hdr, parallaxmapping and specializationconstants Vulkan demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104424">Bug 104424</a> - DOOM 2016 broken by spirv OpStore validation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104490">Bug 104490</a> - [radeonsi/290x] Dota2 fails to start (can't create opengl context)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104492">Bug 104492</a> - Compute Shader: Wrong alignment when assigning struct value to structured SSBO</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104551">Bug 104551</a> - Check if Mako templates for Python are installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104677">Bug 104677</a> - radv_generate_graphics_pipeline_key reads input rate from incorrect binding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104690">Bug 104690</a> - [G33] regression: piglit.spec.!opengl 1_4.draw-batch and gl-1_4-dlist-multidrawarrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104742">Bug 104742</a> - [swrast] piglit gl-1.4-dlist-multidrawarrays regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104746">Bug 104746</a> - [swrast] piglit attribs regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104749">Bug 104749</a> - rasterizer/jitter/JitManager.cpp:252:91: error: no matching function for call to llvm::DIBuilder::createBasicType(const char [8], int, llvm::dwarf::TypeKind)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL</li>
</ul>
</div>
</body>
</html>

225
docs/relnotes/18.0.1.html Normal file
View File

@@ -0,0 +1,225 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.1 Release Notes / April 18, 2018</h1>
<p>
Mesa 18.0.1 is a bug fix release which fixes bugs found since the 18.0.0 release.
</p>
<p>
Mesa 18.0.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
0c93ba892c0610f5dd87f2e2673b9445187995c395b3ddb33fd4260bfb291e89 mesa-18.0.1.tar.gz
b2d2f5b5dbaab13e15cb0dcb5ec81887467f55ebc9625945b303a3647cd87954 mesa-18.0.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (2):</p>
<ul>
<li>dri_util: when overriding, always reset the core version</li>
<li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>
</ul>
<p>Axel Davy (5):</p>
<ul>
<li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>
<li>st/nine: Fixes warning about implicit conversion</li>
<li>st/nine: Fix non inversible matrix check</li>
<li>st/nine: Declare lighting consts for ff shaders</li>
<li>st/nine: Do not use scratch for face register</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>ac/nir: Add workaround for GFX9 buffer views.</li>
<li>radv: Don't set instance count using predication.</li>
<li>radv: Always reset draw user SGPRs after secondary command buffer.</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv/pipeline: fail if TCS/TES compile fail</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Initialise modifier to INVALID for DRI2</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Make swrast display_sync the correct queue</li>
</ul>
<p>Dylan Baker (4):</p>
<ul>
<li>meson: don't use compiler.has_header</li>
<li>autotools: include meson_get_version</li>
<li>meson: Set .so version for xa like autotools does</li>
<li>meson: fix megadriver symlinking</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.0</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>meson/configure: detect endian.h instead of trying to guess when it's available</li>
<li>docs: fix 18.0 release note version</li>
<li>gbm: remove never-implemented function</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>mesa: Inherit texture view multi-sample information from the original texture images.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>i965/vec4: Fix null destination register in 3-source instructions</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/vars_to_ssa: Remove copies from the correct set</li>
<li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>
<li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>
<li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>
</ul>
<p>Juan A. Suarez Romero (5):</p>
<ul>
<li>cherry-ignore anv: Be more careful about fast-clear colors</li>
<li>cherry-ignore: ac/shader: fix vertex input with components.</li>
<li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>
<li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>
<li>Update version to 18.0.1</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: move feedback command inside of destroy function</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965/perf: fix config registration when uploading to kernel</li>
</ul>
<p>Marc Dietrich (1):</p>
<ul>
<li>meson: fix HAVE_LLVM version define in meson build</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>
</ul>
<p>Mark Thompson (1):</p>
<ul>
<li>st/va: Enable vaExportSurfaceHandle()</li>
</ul>
<p>Rob Clark (3):</p>
<ul>
<li>nir: fix per_vertex_output intrinsic</li>
<li>freedreno/a5xx: fix page faults on last level</li>
<li>freedreno/a5xx: don't align height for PIPE_BUFFER</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: fix picking the method for resolve subpass</li>
<li>radv: fix radv_layout_dcc_compressed() when image doesn't have DCC</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965: Extend the negative 32-bit deltas to 64-bits</li>
</ul>
<p>Timothy Arceri (7):</p>
<ul>
<li>ac: add if/loop build helpers</li>
<li>radeonsi: make use of if/loop build helpers in ac</li>
<li>ac: make use of if/loop build helpers</li>
<li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>
<li>nir: fix crash in loop unroll corner case</li>
<li>gallium/pipebuffer: fix parenthesis location</li>
<li>glsl: always call do_lower_jumps() after loop unrolling</li>
</ul>
<p>Xiong, James (1):</p>
<ul>
<li>i965: return the fourcc saved in __DRIimage when possible</li>
</ul>
</div>
</body>
</html>

144
docs/relnotes/18.0.2.html Normal file
View File

@@ -0,0 +1,144 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.2 Release Notes / April 28, 2018</h1>
<p>
Mesa 18.0.2 is a bug fix release which fixes bugs found since the 18.0.1 release.
</p>
<p>
Mesa 18.0.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
SHA256: ffd8dfe3337b474a3baa085f0e7ef1a32c7cdc3bed1ad810b2633919a9324840 mesa-18.0.2.tar.gz
SHA256: 98fa159768482dc568b9f8bf0f36c7acb823fa47428ffd650b40784f16b9e7b3 mesa-18.0.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.</li>
<li>radv: Mark GTT memory as device local for APUs.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>bin/install_megadrivers: fix DESTDIR and -D*-path</li>
<li>meson: don't build classic mesa tests without dri_drivers</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>intel/compiler: Add scheduler deps for instructions that implicitly read g0</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*</li>
</ul>
<p>Johan Klokkhammer Helsing (1):</p>
<ul>
<li>st/dri: Fix dangling pointer to a destroyed dri_drawable</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.1</li>
<li>travis: radv needs LLVM 4.0</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>Update version to 18.0.2</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Fix shadow batches to be the same size as the real BO.</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix number of planes for depth &amp; stencil</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: fix texture_format_needs_swiz</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi/gfx9: fix a hang with an empty first IB</li>
<li>glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract</li>
<li>Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: fix scissor computation when using half-pixel viewport offset</li>
<li>radv/winsys: allow to submit up to 4 IBs for chips without chaining</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: free debug messages when destroying the debug state</li>
</ul>
</div>
</body>
</html>

107
docs/relnotes/18.0.3.html Normal file
View File

@@ -0,0 +1,107 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.3 Release Notes / May 7, 2018</h1>
<p>
Mesa 18.0.3 is a bug fix release which fixes bugs found since the 18.0.2 release.
</p>
<p>
Mesa 18.0.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
58cc5c5b1ab2a44e6e47f18ef6c29836ad06f95450adce635ce3c317507a171b mesa-18.0.3.tar.gz
099d9667327a76a61741a533f95067d76ea71a656e66b91507b3c0caf1d49e30 mesa-18.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv/winsys: fix leaking resources from bo's imported by fd</li>
</ul>
<p>Boyuan Zhang (1):</p>
<ul>
<li>radeon/vcn: fix mpeg4 msg buffer settings</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>gallium/util: Fix incorrect refcounting of separate stencil.</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv/allocator: Don't shrink either end of the block pool</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.2</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>Update version to 18.0.3</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/enc: fix blit setup for YUV LoadImage</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>util/u_queue: fix a deadlock in util_queue_finish</li>
<li>radeonsi/gfx9: workaround for INTERP with indirect indexing</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>i965/tex_image: Avoid the ASTC LDR workaround on gen9lp</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: compute the number of subpass attachments correctly</li>
</ul>
</div>
</body>
</html>

157
docs/relnotes/18.0.4.html Normal file
View File

@@ -0,0 +1,157 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.4 Release Notes / May 17, 2018</h1>
<p>
Mesa 18.0.4 is a bug fix release which fixes bugs found since the 18.0.3 release.
</p>
<p>
Mesa 18.0.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
d1dc3469faccdd73439479426952d71a9e8f684e8d03b6687063c12b13430801 mesa-18.0.4.tar.gz
1f3bcfe7cef0a5c20dae2b41df5d7e0a985e06be0183fa4d43b6068fcba2920f mesa-18.0.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100430">Bug 100430</a> - [radv] graphical glitches on dolphin emulator</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106480">Bug 106480</a> - A2B10G10R10_SNORM vertex attribute doesn't work.</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Translate logic ops.</li>
<li>radv: Fix up 2_10_10_10 alpha sign.</li>
<li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>r600: fix constant buffer bounds.</li>
<li>radv: resolve all layers in compute resolve path.</li>
<li>radv: use compute path for multi-layer images.</li>
</ul>
<p>Deepak Rawat (1):</p>
<ul>
<li>egl/x11: Send invalidate to driver on copy_region path in swap_buffer</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)</li>
</ul>
<p>Jan Vesely (8):</p>
<ul>
<li>clover: Add explicit virtual destructor to argument class</li>
<li>eg/compute: Drop reference on code_bo in destructor.</li>
<li>r600: Cleanup constant buffers on context destruction</li>
<li>eg/compute: Drop reference to kernel_param bo in destructor</li>
<li>pipe-loader: Free driver_name in error path</li>
<li>gallium/auxiliary: Add helper function to count the number of entries in hash table</li>
<li>winsys/radeon: Destroy fd_hash table when the last winsys is removed.</li>
<li>winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL</li>
</ul>
<p>Jose Maria Casanova Crespo (2):</p>
<ul>
<li>intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate</li>
<li>intel/compiler: fix brw_imm_w for negative 16-bit integers</li>
</ul>
<p>Juan A. Suarez Romero (7):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.3</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>cherry-ignore: glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug</li>
<li>cherry-ignore: mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs</li>
<li>cherry-ignore: mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT</li>
<li>cherry-ignore: radv/resolve: do fmask decompress on all layers.</li>
<li>Update version to 18.0.4</li>
</ul>
<p>Kai Wasserbäch (1):</p>
<ul>
<li>opencl: autotools: Fix linking order for OpenCL target</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Don't leak blorp on Gen4-5.</li>
</ul>
<p>Lionel Landwerlin (2):</p>
<ul>
<li>i965: require pixel scoreboard stall prior to ISP disable</li>
<li>anv: emit pixel scoreboard stall before ISP disable</li>
</ul>
<p>Matthew Nicholls (1):</p>
<ul>
<li>radv: fix multisample image copies</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>spirv: Apply OriginUpperLeft to FragCoord</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>mesa: fix error handling in get_framebuffer_parameteriv</li>
</ul>
<p>Ross Burton (1):</p>
<ul>
<li>src/intel/Makefile.vulkan.am: add missing MKDIR_GEN</li>
</ul>
</div>
</body>
</html>

162
docs/relnotes/18.0.5.html Normal file
View File

@@ -0,0 +1,162 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.5 Release Notes / June 3, 2018</h1>
<p>
Mesa 18.0.5 is a bug fix release which fixes bugs found since the 18.0.4 release.
</p>
<p>
Mesa 18.0.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
ea3e00329cea899b1e32db812fd2f426832be37e4baa2e2fd9288a3480f30531 mesa-18.0.5.tar.gz
5187bba8d72aea78f2062d134ec6079a508e8216062dce9ec9048b5eb2c4fc6b mesa-18.0.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78097">Bug 78097</a> - glUniform1ui and friends not supported by display lists</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102390">Bug 102390</a> - centroid interpolation causes broken attribute values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105351">Bug 105351</a> - [Gen6+] piglit's arb_shader_image_load_store-host-mem-barrier fails with a glGetTexSubImage fallback path</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106090">Bug 106090</a> - Compiling compute shader crashes RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106315">Bug 106315</a> - The witness + dxvk suffers flickering garbage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106465">Bug 106465</a> - No test for Image Load/Store on format-incompatible texture buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106479">Bug 106479</a> - NDEBUG not defined for libamdgpu_addrlib</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106481">Bug 106481</a> - No test for Image Load/Store on texture buffer sized greater than MAX_TEXTURE_BUFFER_SIZE_ARB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106504">Bug 106504</a> - vulkan SPIR-V parsing failed at ../src/compiler/spirv/vtn_cfg.c:381</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106587">Bug 106587</a> - Dota2 is very dark when using vulkan render on a Intel &lt;&lt; AMD prime setup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106629">Bug 106629</a> - [SNB,IVB,HSW,BDW] dEQP-EGL.functional.image.create.gles2_cubemap_negative_z_rgb_read_pixels</li>
</ul>
<h2>Changes</h2>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965/glk: Add l3 banks count for 2x6 configuration</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>amd/addrlib: Use defines in autotools build.</li>
<li>radv: Fix SRGB compute copies.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>tgsi/scan: add hw atomic to the list of memory accessing files</li>
</ul>
<p>Francisco Jerez (4):</p>
<ul>
<li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>
<li>i965: Move buffer texture size calculation into a common helper function.</li>
<li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>
<li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>eg/compute: Use reference counting to handle compute memory pool.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>
<li>intel/blorp: Support blits and clears on surfaces with offsets</li>
</ul>
<p>Jose Dapena Paz (1):</p>
<ul>
<li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>
</ul>
<p>Juan A. Suarez Romero (8):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.4</li>
<li>cherry-ignore: i965/miptree: Fix handling of uninitialized MCS buffers</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>cherry-ignore: mesa/st: handle vert_attrib_mask in nir case too</li>
<li>cherry-ignore: Tegra is not supported</li>
<li>cherry-ignore: st/mesa: fix assertion failures with GL_UNSIGNED_INT64_ARB (v2)</li>
<li>cherry-ignore: nv30: ensure that displayable formats are marked accordingly</li>
<li>Update version to 18.0.5</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>
<li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>
<li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>dri3: Stricter SBC wraparound handling</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>i965/miptree: Zero-initialize CCS_D buffers</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>spirv: fix visiting inner loops with same break/continue block</li>
<li>radv: fix centroid interpolation</li>
</ul>
<p>Stuart Young (1):</p>
<ul>
<li>etnaviv: Fix missing rnndb file in tarballs</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: add glUniform*ui{v} support to display lists</li>
</ul>
</div>
</body>
</html>

268
docs/relnotes/18.1.0.html Normal file
View File

@@ -0,0 +1,268 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.0 Release Notes / May 18 2018</h1>
<p>
Mesa 18.1.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 18.1.1.
</p>
<p>
Mesa 18.1.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
b1c1dbb42597190503d3abc518b12de880623f097c6cb6c293ecf69ae87e6fbf mesa-18.1.0.tar.gz
c855c5b67ef993b7621f76d8b120769ec0415f1c3616eaff44ef7f7f300aceba mesa-18.1.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>OpenGL 3.1 with ARB_compatibility on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe, svga</li>
<li>GL_ARB_bindless_texture on nvc0/maxwell+</li>
<li>GL_ARB_transform_feedback_overflow_query on nvc0</li>
<li>GL_EXT_semaphore on radeonsi</li>
<li>GL_EXT_semaphore_fd on radeonsi</li>
<li>GL_EXT_shader_framebuffer_fetch on i965 on desktop GL (GLES was already supported)</li>
<li>GL_EXT_shader_framebuffer_fetch_non_coherent on i965</li>
<li>GL_KHR_blend_equation_advanced on radeonsi</li>
<li>Disk shader cache support for i965 enabled by default</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99549">Bug 99549</a> - pp: Failed to translate a shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100259">Bug 100259</a> - [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102905">Bug 102905</a> - [R600] Miscompilation of TGSI to VLIW causes artifacts in Gallium Nine with Crysis2 bump mapping</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104335">Bug 104335</a> - [OpenGL CTS][SKL,KBL] KHR-GL45.vertex_attrib_64bit.limits_test occasionally fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104717">Bug 104717</a> - Rocket League: grass rendering broken with nir</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104732">Bug 104732</a> - [radv] Binding descriptor sets disturbs other pipeline bindings</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104794">Bug 104794</a> - piglit.spec.arb_internalformat_query2.samples and num_sample_counts pname checks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104803">Bug 104803</a> - SIGSEGV in state_tracker/st_glsl_to_tgsi_temprename.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104863">Bug 104863</a> - 186 assertions in piglit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104908">Bug 104908</a> - Texture Compression Hint not converted to enum16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104989">Bug 104989</a> - [r600] [bisected] OpenGL applications can't render anything at all</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105026">Bug 105026</a> - glxgears asserts with pp_jimenezmlaa=1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert _mm512_mask_blend_epi32((__mmask16)(ImmT), a, b) from __m512i {aka __vector(8) long long int} to SIMDImpl::SIMD512Impl::Float</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105052">Bug 105052</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105067">Bug 105067</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105088">Bug 105088</a> - brw_nir_uniforms.cpp:256:10: error: non-constant-expression cannot be narrowed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105161">Bug 105161</a> - KHR_blend_equation_advanced doesn't work in GLSL 1.10-1.40 shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105183">Bug 105183</a> - Weird assertion in NIR linker</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105211">Bug 105211</a> - build failure after zwp_dmabuf commit if wayland-protocols is not installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105229">Bug 105229</a> - [KBL SKL BDW HSW] [Regression] KHR-GLES31.core.shader_image_load_store.advanced-sso-simple failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105238">Bug 105238</a> - ast.h:648:16: error: union member 'i' has a non-trivial constructor</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105262">Bug 105262</a> - [R600] [BISECTED] ttf fonts are invisible in many programs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105274">Bug 105274</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105444">Bug 105444</a> - Enable GL disk shader cache when transform feedback is enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105471">Bug 105471</a> - [g33] [bisected] dEQP-GLES2.functional.shaders failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105497">Bug 105497</a> - shader-db crashes on 72 core system after ast_type_qualifier bitset change</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105529">Bug 105529</a> - u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105621">Bug 105621</a> - Build failure on GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105634">Bug 105634</a> - Android build test fails when building brw_oa_metrics.c</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105737">Bug 105737</a> - st_tests_common.cpp:140:42: error: no matching function for call to 'tgsi_get_opcode_info'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105738">Bug 105738</a> - commit f7ffa504a065dc2631fd38cc5fe885b277f4e7e7 causes artifacting in radv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105740">Bug 105740</a> - glsl_types.cpp(524): error: a dynamically-initialized local static variable is not allowed inside of a statement expression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105807">Bug 105807</a> - [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105817">Bug 105817</a> - scons build broken by glSpecializeShaderARB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105820">Bug 105820</a> - [m32] piglit regressions relinking program without shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105952">Bug 105952</a> - radv causes GPU hang on SI</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105960">Bug 105960</a> - [bisected] meson build test fails with: undefined reference to `etna_pm_create_query'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106131">Bug 106131</a> - meson/ninja build missing file gtest.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106133">Bug 106133</a> - make check &quot;OSError: [Errno 24] Too many open files&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106174">Bug 106174</a> - vulkan dota2 broken (segfaulting), found bug commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106180">Bug 106180</a> - [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106450">Bug 106450</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106462">Bug 106462</a> - piglit.spec.arb_vertex_array_bgra.get regression</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Remove incomplete GLX_SGIX_swap_barrier stubs from the Xlib libGL</li>
<li>Remove incomplete GLX_SGIX_swap_group stubs from the Xlib libGL</li>
</ul>
</div>
</body>
</html>

168
docs/relnotes/18.1.1.html Normal file
View File

@@ -0,0 +1,168 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.1 Release Notes / June 1 2018</h1>
<p>
Mesa 18.1.1 is a bug fix release which fixes bugs found since the 18.1.0 release.
</p>
<p>
Mesa 18.1.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
366a35f7530a016f2a8284fb0ee5759eeb216b4d6fa47f0e96b89ad2e43faf96 mesa-18.1.1.tar.gz
d3312a2ede5aac14a47476b208b8e3a401367838330197c4588ab8ad420d7781 mesa-18.1.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>None<p>
<h2>Changes</h2>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965/glk: Add l3 banks count for 2x6 configuration</li>
</ul>
<p>Bas Nieuwenhuizen (7):</p>
<ul>
<li>radv: Fix multiview queries.</li>
<li>radv: Translate logic ops.</li>
<li>radv: Fix up 2_10_10_10 alpha sign.</li>
<li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>
<li>amd/addrlib: Use defines in autotools build.</li>
<li>radv: Fix SRGB compute copies.</li>
<li>radv: Only expose subgroup shuffles on VI+.</li>
</ul>
<p>Christoph Haag (1):</p>
<ul>
<li>radv: fix VK_EXT_descriptor_indexing</li>
</ul>
<p>Dave Airlie (5):</p>
<ul>
<li>radv/resolve: do fmask decompress on all layers.</li>
<li>radv: resolve all layers in compute resolve path.</li>
<li>radv: use compute path for multi-layer images.</li>
<li>virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.</li>
<li>tgsi/scan: add hw atomic to the list of memory accessing files</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>docs: Add sha sums for release</li>
<li>VERSION: bump to 18.1.1 for next release</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>vulkan: don't free uninitialised memory</li>
</ul>
<p>Francisco Jerez (4):</p>
<ul>
<li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>
<li>i965: Move buffer texture size calculation into a common helper function.</li>
<li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>
<li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nv30: ensure that displayable formats are marked accordingly</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>eg/compute: Use reference counting to handle compute memory pool.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>
<li>intel/blorp: Support blits and clears on surfaces with offsets</li>
</ul>
<p>Jose Dapena Paz (1):</p>
<ul>
<li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>
</ul>
<p>Kai Wasserbäch (1):</p>
<ul>
<li>opencl: autotools: Fix linking order for OpenCL target</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>
<li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>
<li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>dri3: Stricter SBC wraparound handling</li>
</ul>
<p>Nanley Chery (4):</p>
<ul>
<li>i965: Add and use a getter for the miptree aux buffer</li>
<li>i965: Add and use a single miptree aux_buf field</li>
<li>i965/miptree: Fix handling of uninitialized MCS buffers</li>
<li>i965/miptree: Zero-initialize CCS_D buffers</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>spirv: fix visiting inner loops with same break/continue block</li>
<li>radv: fix centroid interpolation</li>
</ul>
<p>Stuart Young (1):</p>
<ul>
<li>etnaviv: Fix missing rnndb file in tarballs</li>
</ul>
<p>Thierry Reding (3):</p>
<ul>
<li>tegra: Treat resources with modifiers as scanout</li>
<li>tegra: Fix scanout resources without modifiers</li>
<li>tegra: Remove usage of non-stable UAPI</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: add glUniform*ui{v} support to display lists</li>
</ul>
</div>
</body>
</html>

170
docs/relnotes/18.1.2.html Normal file
View File

@@ -0,0 +1,170 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.1.2 Release Notes / June 15 2018</h1>
<p>
Mesa 18.1.2 is a bug fix release which fixes bugs found since the 18.1.1 release.
</p>
<p>
Mesa 18.1.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
a644df23937f4078a2bd9a54349f6315c1955f5e3a4ac272832da51dea4d3c11 mesa-18.1.1.tar.gz
070bf0648ba5b242d7303ceed32aed80842f4c0ba16e5acc1a650a46eadfb1f9 mesa-18.1.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>None<p>
<h2>Changes</h2>
<p>Alex Smith (4):</p>
<ul>
<li>radv: Consolidate GFX9 merged shader lookup logic</li>
<li>radv: Handle GFX9 merged shaders in radv_flush_constants()</li>
<li>radeonsi: Fix crash on shaders using MSAA image load/store</li>
<li>radv: Set active_stages the same whether or not shaders were cached</li>
</ul>
<p>Andrew Galante (2):</p>
<ul>
<li>meson: Test for __atomic_add_fetch in atomic checks</li>
<li>configure.ac: Test for __atomic_add_fetch in atomic checks</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.</li>
</ul>
<p>Cameron Kumar (1):</p>
<ul>
<li>vulkan/wsi: Destroy swapchain images after terminating FIFO queues</li>
</ul>
<p>Dylan Baker (6):</p>
<ul>
<li>docs/relnotes: Add sha256 sums for mesa 18.1.1</li>
<li>cherry-ignore: add commits not to pull</li>
<li>cherry-ignore: Add patches from Jason that he rebased on 18.1</li>
<li>meson: work around gentoo applying -m32 to host compiler in cross builds</li>
<li>cherry-ignore: Add another patch</li>
<li>version: bump version for 18.1.2 release</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>autotools: add missing android file to package</li>
<li>configure: radv depends on mako</li>
<li>i965: fix resource leak</li>
</ul>
<p>Jason Ekstrand (10):</p>
<ul>
<li>intel/eu: Add some brw_get_default_ helpers</li>
<li>intel/eu: Copy fields manually in brw_next_insn</li>
<li>intel/eu: Set flag [sub]register number differently for 3src</li>
<li>intel/blorp: Don't vertex fetch directly from clear values</li>
<li>intel/isl: Add bounds-checking assertions in isl_format_get_layout</li>
<li>intel/isl: Add bounds-checking assertions for the format_info table</li>
<li>i965/screen: Refactor query_dma_buf_formats</li>
<li>i965/screen: Use RGBA non-sRGB formats for images</li>
<li>anv: Set fence/semaphore types to NONE in impl_cleanup</li>
<li>i965/screen: Return false for unsupported formats in query_modifiers</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>mesa/program_binary: add implicit UseProgram after successful ProgramBinary</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>glsl: Add ir_binop_vector_extract in NIR</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Fix batch-last mode to properly swap BOs.</li>
<li>anv: Disable __gen_validate_value if NDEBUG is set.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>r300g/swtcl: make pipe_context uploaders use malloc'd memory as before</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>meson: Fix -latomic check</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>glx: Fix number of property values to read in glXImportContextEXT</li>
</ul>
<p>Nicolas Boichat (1):</p>
<ul>
<li>configure.ac/meson.build: Fix -latomic test</li>
</ul>
<p>Philip Rebohle (1):</p>
<ul>
<li>radv: Use correct color format for fast clears</li>
</ul>
<p>Samuel Pitoiset (3):</p>
<ul>
<li>radv: fix a GPU hang when MRTs are sparse</li>
<li>radv: fix missing ZRANGE_PRECISION(1) for GFX9+</li>
<li>radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold</li>
</ul>
<p>Scott D Phillips (1):</p>
<ul>
<li>intel/tools: add intel_sanitize_gpu to EXTRA_DIST</li>
</ul>
<p>Thomas Petazzoni (1):</p>
<ul>
<li>configure.ac: rework -latomic check</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>ac: fix possible truncation of intrinsic name</li>
<li>radeonsi: fix possible truncation on renderer string</li>
</ul>
</div>
</body>
</html>

72
docs/relnotes/18.2.0.html Normal file
View File

@@ -0,0 +1,72 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.2.0 Release Notes / TBD</h1>
<p>
Mesa 18.2.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 18.2.1.
</p>
<p>
Mesa 18.2.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
libwayland-egl is now distributed by Wayland (since 1.15,
<a href="https://lists.freedesktop.org/archives/wayland-devel/2018-April/037767.html">see announcement</a>),
and has been removed from Mesa in this release. Make sure you're using
an up-to-date version of Wayland to keep the functionality.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ARB_fragment_shader_interlock on i965</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>GL_ARB_sample_locations and GL_NV_sample_locations on nvc0 (GM200+)</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Removed GL_EXT_polygon_offset applications should use glPolygonOffset instead.</li>
<li>Removed libwayland-egl, now part of Wayland</li>
</ul>
</div>
</body>
</html>

View File

@@ -122,9 +122,9 @@ Please use common sense and do <strong>not</strong> blindly add everyone.
<pre>
$ scripts/get_reviewer.pl --help # to get the help screen
$ scripts/get_reviewer.pl -f src/egl/drivers/dri2/platform_android.c
Rob Herring <robh@kernel.org> (reviewer:ANDROID EGL SUPPORT,added_lines:188/700=27%,removed_lines:58/283=20%)
Tomasz Figa <tfiga@chromium.org> (reviewer:ANDROID EGL SUPPORT,authored:12/41=29%,added_lines:308/700=44%,removed_lines:115/283=41%)
Emil Velikov <emil.l.velikov@gmail.com> (authored:13/41=32%,removed_lines:76/283=27%)
Rob Herring &lt;robh@kernel.org&gt; (reviewer:ANDROID EGL SUPPORT,added_lines:188/700=27%,removed_lines:58/283=20%)
Tomasz Figa &lt;tfiga@chromium.org&gt; (reviewer:ANDROID EGL SUPPORT,authored:12/41=29%,added_lines:308/700=44%,removed_lines:115/283=41%)
Emil Velikov &lt;emil.l.velikov@gmail.com&gt; (authored:13/41=32%,removed_lines:76/283=27%)
</pre>
</ul>
@@ -246,6 +246,10 @@ release.
Note: resending patch identical to one on mesa-dev@ or one that differs only
by the extra mesa-stable@ tag is <strong>not</strong> recommended.
</p>
<p>
If you are not the author of the original patch, please Cc: them in your
nomination request.
</p>
<h3 id="thetag">The stable tag</h3>

View File

@@ -31,7 +31,7 @@
<dd>is a very useful tool for tracking down
memory-related problems in your code.</dd>
<dt><a href="https://scan.coverity.com/projects/mesa">Coverity</a><dt>
<dt><a href="https://scan.coverity.com/projects/mesa">Coverity</a></dt>
<dd>provides static code analysis of Mesa. If you create an account
you can see the results and try to fix outstanding issues.</dd>
</dl>

View File

@@ -18,8 +18,8 @@
<p>
This page lists known issues with
<a href="https://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a>
and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html" target="_main">SPEC Viewperf 12</a>
<a href="https://www.spec.org/gwpg/gpc.static/vp11info.html">SPEC Viewperf 11</a>
and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html">SPEC Viewperf 12</a>
when running on Mesa-based drivers.
</p>
@@ -66,13 +66,10 @@ either in Viewperf or the Mesa driver.
<p>
These tests use features of the
<a href="https://www.opengl.org/registry/specs/NV/fragment_program2.txt"
target="_main">
GL_NV_fragment_program2</a> and
<a href="https://www.opengl.org/registry/specs/NV/vertex_program3.txt"
target="_main">
GL_NV_vertex_program3</a> extensions without checking if the driver supports
them.
<a href="https://www.opengl.org/registry/specs/NV/fragment_program2.txt">GL_NV_fragment_program2</a>
and
<a href="https://www.opengl.org/registry/specs/NV/vertex_program3.txt">GL_NV_vertex_program3</a>
extensions without checking if the driver supports them.
</p>
<p>
When Mesa tries to compile the vertex/fragment programs it generates errors
@@ -86,8 +83,8 @@ Subsequent drawing calls become no-ops and the rendering is incorrect.
<p>
These tests depend on the
<a href="https://www.opengl.org/registry/specs/NV/primitive_restart.txt"
target="_main">GL_NV_primitive_restart</a> extension.
<a href="https://www.opengl.org/registry/specs/NV/primitive_restart.txt">GL_NV_primitive_restart</a>
extension.
</p>
<p>
@@ -124,7 +121,7 @@ never specified.
<p>
A trace captured with
<a href="https://github.com/apitrace/apitrace" target="_main">API trace</a>
<a href="https://github.com/apitrace/apitrace">API trace</a>
shows this sequences of calls like this:
<pre>

View File

@@ -933,6 +933,7 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePixmapSurfaceHI (EGLDisplay dpy, EGLConfi
#define EGL_DRM_BUFFER_STRIDE_MESA 0x31D4
#define EGL_DRM_BUFFER_USE_SCANOUT_MESA 0x00000001
#define EGL_DRM_BUFFER_USE_SHARE_MESA 0x00000002
#define EGL_DRM_BUFFER_USE_CURSOR_MESA 0x00000004
typedef EGLImageKHR (EGLAPIENTRYP PFNEGLCREATEDRMIMAGEMESAPROC) (EGLDisplay dpy, const EGLint *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLEXPORTDRMIMAGEMESAPROC) (EGLDisplay dpy, EGLImageKHR image, EGLint *name, EGLint *handle, EGLint *stride);
#ifdef EGL_EGLEXT_PROTOTYPES

View File

@@ -34,13 +34,6 @@ extern "C" {
#include <EGL/eglplatform.h>
#ifdef EGL_MESA_drm_image
/* Mesa's extension to EGL_MESA_drm_image... */
#ifndef EGL_DRM_BUFFER_USE_CURSOR_MESA
#define EGL_DRM_BUFFER_USE_CURSOR_MESA 0x0004
#endif
#endif
#ifndef EGL_WL_bind_wayland_display
#define EGL_WL_bind_wayland_display 1

View File

@@ -104,6 +104,12 @@ typedef struct ANativeWindow* EGLNativeWindowType;
typedef struct egl_native_pixmap_t* EGLNativePixmapType;
typedef void* EGLNativeDisplayType;
#elif defined(USE_OZONE)
typedef intptr_t EGLNativeDisplayType;
typedef intptr_t EGLNativeWindowType;
typedef intptr_t EGLNativePixmapType;
#elif defined(__unix__) || defined(__APPLE__)
#if defined(MESA_EGL_NO_X11_HEADERS)
@@ -124,11 +130,13 @@ typedef Window EGLNativeWindowType;
#endif /* MESA_EGL_NO_X11_HEADERS */
#elif __HAIKU__
#elif defined(__HAIKU__)
#include <kernel/image.h>
typedef void *EGLNativeDisplayType;
typedef khronos_uintptr_t EGLNativePixmapType;
typedef khronos_uintptr_t EGLNativeWindowType;
typedef void *EGLNativeDisplayType;
typedef khronos_uintptr_t EGLNativePixmapType;
typedef khronos_uintptr_t EGLNativeWindowType;
#else
#error "Platform not recognized"

View File

@@ -47,9 +47,9 @@
# define GLAPI __declspec(dllimport)
# else /* for use with static link lib build of Win32 edition only */
# define GLAPI extern
# endif /* _STATIC_MESA support */
# endif
# if defined(__MINGW32__) && defined(GL_NO_STDCALL) || defined(UNDER_CE) /* The generated DLLs by MingW with STDCALL are not compatible with the ones done by Microsoft's compilers */
# define GLAPIENTRY
# define GLAPIENTRY
# else
# define GLAPIENTRY __stdcall
# endif

View File

@@ -48,6 +48,7 @@ typedef unsigned int drm_drawable_t;
typedef struct drm_clip_rect drm_clip_rect_t;
#endif
#include <stdbool.h>
#include <stdint.h>
/**
@@ -82,7 +83,7 @@ typedef struct __DRI2flushExtensionRec __DRI2flushExtension;
typedef struct __DRI2throttleExtensionRec __DRI2throttleExtension;
typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension;
typedef struct __DRI2interopExtensionRec __DRI2interopExtension;
typedef struct __DRI2blobExtensionRec __DRI2blobExtension;
typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension;
typedef struct __DRIimageDriverExtensionRec __DRIimageDriverExtension;
@@ -336,6 +337,30 @@ struct __DRI2throttleExtensionRec {
enum __DRI2throttleReason reason);
};
/**
* Extension for EGL_ANDROID_blob_cache
*/
#define __DRI2_BLOB "DRI2_Blob"
#define __DRI2_BLOB_VERSION 1
typedef void
(*__DRIblobCacheSet) (const void *key, signed long keySize,
const void *value, signed long valueSize);
typedef signed long
(*__DRIblobCacheGet) (const void *key, signed long keySize,
void *value, signed long valueSize);
struct __DRI2blobExtensionRec {
__DRIextension base;
/**
* Set cache functions for setting and getting cache entries.
*/
void (*set_cache_funcs) (__DRIscreen *screen,
__DRIblobCacheSet set, __DRIblobCacheGet get);
};
/**
* Extension for fences / synchronization objects.
@@ -565,7 +590,7 @@ struct __DRIdamageExtensionRec {
* SWRast Loader extension.
*/
#define __DRI_SWRAST_LOADER "DRI_SWRastLoader"
#define __DRI_SWRAST_LOADER_VERSION 3
#define __DRI_SWRAST_LOADER_VERSION 4
struct __DRIswrastLoaderExtensionRec {
__DRIextension base;
@@ -607,6 +632,24 @@ struct __DRIswrastLoaderExtensionRec {
void (*getImage2)(__DRIdrawable *readable,
int x, int y, int width, int height, int stride,
char *data, void *loaderPrivate);
/**
* Put shm image to drawable
*
* \since 4
*/
void (*putImageShm)(__DRIdrawable *drawable, int op,
int x, int y, int width, int height, int stride,
int shmid, char *shmaddr, unsigned offset,
void *loaderPrivate);
/**
* Get shm image from readable
*
* \since 4
*/
void (*getImageShm)(__DRIdrawable *readable,
int x, int y, int width, int height,
int shmid, void *loaderPrivate);
};
/**
@@ -704,7 +747,8 @@ struct __DRIuseInvalidateExtensionRec {
#define __DRI_ATTRIB_BIND_TO_TEXTURE_TARGETS 46
#define __DRI_ATTRIB_YINVERTED 47
#define __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE 48
#define __DRI_ATTRIB_MAX (__DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE + 1)
#define __DRI_ATTRIB_MUTABLE_RENDER_BUFFER 49 /* EGL_MUTABLE_RENDER_BUFFER_BIT_KHR */
#define __DRI_ATTRIB_MAX 50
/* __DRI_ATTRIB_RENDER_TYPE */
#define __DRI_ATTRIB_RGBA_BIT 0x01
@@ -1227,6 +1271,9 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FORMAT_R16 0x100d
#define __DRI_IMAGE_FORMAT_GR1616 0x100e
#define __DRI_IMAGE_FORMAT_YUYV 0x100f
#define __DRI_IMAGE_FORMAT_XBGR2101010 0x1010
#define __DRI_IMAGE_FORMAT_ABGR2101010 0x1011
#define __DRI_IMAGE_FORMAT_SABGR8 0x1012
#define __DRI_IMAGE_USE_SHARE 0x0001
#define __DRI_IMAGE_USE_SCANOUT 0x0002
@@ -1263,6 +1310,7 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FOURCC_ABGR8888 0x34324241
#define __DRI_IMAGE_FOURCC_XBGR8888 0x34324258
#define __DRI_IMAGE_FOURCC_SARGB8888 0x83324258
#define __DRI_IMAGE_FOURCC_SABGR8888 0x84324258
#define __DRI_IMAGE_FOURCC_ARGB2101010 0x30335241
#define __DRI_IMAGE_FOURCC_XRGB2101010 0x30335258
#define __DRI_IMAGE_FOURCC_ABGR2101010 0x30334241
@@ -1842,9 +1890,57 @@ struct __DRI2rendererQueryExtensionRec {
* Image Loader extension. Drivers use this to allocate color buffers
*/
/**
* See __DRIimageLoaderExtensionRec::getBuffers::buffer_mask.
*/
enum __DRIimageBufferMask {
__DRI_IMAGE_BUFFER_BACK = (1 << 0),
__DRI_IMAGE_BUFFER_FRONT = (1 << 1)
__DRI_IMAGE_BUFFER_FRONT = (1 << 1),
/**
* A buffer shared between application and compositor. The buffer may be
* simultaneously accessed by each.
*
* A shared buffer is equivalent to an EGLSurface whose EGLConfig contains
* EGL_MUTABLE_RENDER_BUFFER_BIT_KHR and whose active EGL_RENDER_BUFFER (as
* opposed to any pending, requested change to EGL_RENDER_BUFFER) is
* EGL_SINGLE_BUFFER.
*
* If buffer_mask contains __DRI_IMAGE_BUFFER_SHARED, then must contains no
* other bits. As a corollary, a __DRIdrawable that has a "shared" buffer
* has no front nor back buffer.
*
* The loader returns __DRI_IMAGE_BUFFER_SHARED in buffer_mask if and only
* if:
* - The loader supports __DRI_MUTABLE_RENDER_BUFFER_LOADER.
* - The driver supports __DRI_MUTABLE_RENDER_BUFFER_DRIVER.
* - The EGLConfig of the drawable EGLSurface contains
* EGL_MUTABLE_RENDER_BUFFER_BIT_KHR.
* - The EGLContext's EGL_RENDER_BUFFER is EGL_SINGLE_BUFFER.
* Equivalently, the EGLSurface's active EGL_RENDER_BUFFER (as
* opposed to any pending, requested change to EGL_RENDER_BUFFER) is
* EGL_SINGLE_BUFFER. (See the EGL 1.5 and
* EGL_KHR_mutable_render_buffer spec for details about "pending" vs
* "active" EGL_RENDER_BUFFER state).
*
* A shared buffer is similar to a front buffer in that all rendering to the
* buffer should appear promptly on the screen. It is different from
* a front buffer in that its behavior is independent from the
* GL_DRAW_BUFFER state. Specifically, if GL_DRAW_FRAMEBUFFER is 0 and the
* __DRIdrawable's buffer_mask is __DRI_IMAGE_BUFFER_SHARED, then all
* rendering should appear promptly on the screen if GL_DRAW_BUFFER is not
* GL_NONE.
*
* The difference between a shared buffer and a front buffer is motivated
* by the constraints of Android and OpenGL ES. OpenGL ES does not support
* front-buffer rendering. Android's SurfaceFlinger protocol provides the
* EGL driver only a back buffer and no front buffer. The shared buffer
* mode introduced by EGL_KHR_mutable_render_buffer is a backdoor though
* EGL that allows Android OpenGL ES applications to render to what is
* effectively the front buffer, a backdoor that required no change to the
* OpenGL ES API and little change to the SurfaceFlinger API.
*/
__DRI_IMAGE_BUFFER_SHARED = (1 << 2),
};
struct __DRIimageList {
@@ -1869,7 +1965,8 @@ struct __DRIimageLoaderExtensionRec {
* \param stamp Address of variable to be updated when
* getBuffers must be called again
* \param loaderPrivate The loaderPrivate for driDrawable
* \param buffer_mask Set of buffers to allocate
* \param buffer_mask Set of buffers to allocate. A bitmask of
* __DRIimageBufferMask.
* \param buffers Returned buffers
*/
int (*getBuffers)(__DRIdrawable *driDrawable,
@@ -1983,4 +2080,85 @@ struct __DRIbackgroundCallableExtensionRec {
GLboolean (*isThreadSafe)(void *loaderPrivate);
};
/**
* The driver portion of EGL_KHR_mutable_render_buffer.
*
* If the driver creates a __DRIconfig with
* __DRI_ATTRIB_MUTABLE_RENDER_BUFFER, then it must support this extension.
*
* To support this extension:
*
* - The driver should create at least one __DRIconfig with
* __DRI_ATTRIB_MUTABLE_RENDER_BUFFER. This is strongly recommended but
* not required.
*
* - The driver must be able to handle __DRI_IMAGE_BUFFER_SHARED if
* returned by __DRIimageLoaderExtension:getBuffers().
*
* - When rendering to __DRI_IMAGE_BUFFER_SHARED, it must call
* __DRImutableRenderBufferLoaderExtension::displaySharedBuffer() in
* response to glFlush and glFinish. (This requirement is not documented
* in EGL_KHR_mutable_render_buffer, but is a de-facto requirement in the
* Android ecosystem. Android applications expect that glFlush will
* immediately display the buffer when in shared buffer mode, and Android
* drivers comply with this expectation). It :may: call
* displaySharedBuffer() more often than required.
*
* - When rendering to __DRI_IMAGE_BUFFER_SHARED, it must ensure that the
* buffer is always in a format compatible for display because the
* display engine (usually SurfaceFlinger or hwcomposer) may display the
* image at any time, even concurrently with 3D rendering. For example,
* display hardware and the GL hardware may be able to access the buffer
* simultaneously. In particular, if the buffer is compressed then take
* care that SurfaceFlinger and hwcomposer can consume the compression
* format.
*
* \see __DRI_IMAGE_BUFFER_SHARED
* \see __DRI_ATTRIB_MUTABLE_RENDER_BUFFER
* \see __DRI_MUTABLE_RENDER_BUFFER_LOADER
*/
#define __DRI_MUTABLE_RENDER_BUFFER_DRIVER "DRI_MutableRenderBufferDriver"
#define __DRI_MUTABLE_RENDER_BUFFER_DRIVER_VERSION 1
typedef struct __DRImutableRenderBufferDriverExtensionRec __DRImutableRenderBufferDriverExtension;
struct __DRImutableRenderBufferDriverExtensionRec {
__DRIextension base;
};
/**
* The loader portion of EGL_KHR_mutable_render_buffer.
*
* Requires loader extension DRI_IMAGE_LOADER, through which the loader sends
* __DRI_IMAGE_BUFFER_SHARED to the driver.
*
* \see __DRI_MUTABLE_RENDER_BUFFER_DRIVER
*/
#define __DRI_MUTABLE_RENDER_BUFFER_LOADER "DRI_MutableRenderBufferLoader"
#define __DRI_MUTABLE_RENDER_BUFFER_LOADER_VERSION 1
typedef struct __DRImutableRenderBufferLoaderExtensionRec __DRImutableRenderBufferLoaderExtension;
struct __DRImutableRenderBufferLoaderExtensionRec {
__DRIextension base;
/**
* Inform the display engine (that is, SurfaceFlinger and/or hwcomposer)
* that the __DRIdrawable has new content.
*
* The display engine may ignore this call, for example, if it continually
* refreshes and displays the buffer on every frame, as in
* EGL_ANDROID_front_buffer_auto_refresh. On the other extreme, the display
* engine may refresh and display the buffer only in frames in which the
* driver calls this.
*
* If the fence_fd is not -1, then the display engine will display the
* buffer only after the fence signals.
*
* The drawable's current __DRIimageBufferMask, as returned by
* __DRIimageLoaderExtension::getBuffers(), must be
* __DRI_IMAGE_BUFFER_SHARED.
*/
void (*displaySharedBuffer)(__DRIdrawable *drawable, int fence_fd,
void *loaderPrivate);
};
#endif

View File

@@ -13,9 +13,9 @@ $ make headers_install INSTALL_HDR_PATH=/path/to/install
The last update was done at the following kernel commit :
commit ca797d29cd63e7b71b4eea29aff3b1cefd1ecb59
Merge: 2c1c55cb75a9 010d118c2061
commit 78230c46ec0a91dd4256c9e54934b3c7095a7ee3
Merge: b65bd4031156 037f03155b7d
Author: Dave Airlie <airlied@redhat.com>
Date: Mon Dec 4 09:40:35 2017 +1000
Date: Wed Mar 21 14:07:03 2018 +1000
Merge tag 'drm-intel-next-2017-11-17-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Merge tag 'omapdrm-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into drm-next

View File

@@ -178,7 +178,7 @@ extern "C" {
#define DRM_FORMAT_MOD_VENDOR_NONE 0
#define DRM_FORMAT_MOD_VENDOR_INTEL 0x01
#define DRM_FORMAT_MOD_VENDOR_AMD 0x02
#define DRM_FORMAT_MOD_VENDOR_NV 0x03
#define DRM_FORMAT_MOD_VENDOR_NVIDIA 0x03
#define DRM_FORMAT_MOD_VENDOR_SAMSUNG 0x04
#define DRM_FORMAT_MOD_VENDOR_QCOM 0x05
#define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06
@@ -188,7 +188,7 @@ extern "C" {
#define DRM_FORMAT_RESERVED ((1ULL << 56) - 1)
#define fourcc_mod_code(vendor, val) \
((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | (val & 0x00ffffffffffffffULL))
((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | ((val) & 0x00ffffffffffffffULL))
/*
* Format Modifier tokens:
@@ -338,29 +338,17 @@ extern "C" {
*/
#define DRM_FORMAT_MOD_VIVANTE_SPLIT_SUPER_TILED fourcc_mod_code(VIVANTE, 4)
/* NVIDIA Tegra frame buffer modifiers */
/*
* Some modifiers take parameters, for example the number of vertical GOBs in
* a block. Reserve the lower 32 bits for parameters
*/
#define __fourcc_mod_tegra_mode_shift 32
#define fourcc_mod_tegra_code(val, params) \
fourcc_mod_code(NV, ((((__u64)val) << __fourcc_mod_tegra_mode_shift) | params))
#define fourcc_mod_tegra_mod(m) \
(m & ~((1ULL << __fourcc_mod_tegra_mode_shift) - 1))
#define fourcc_mod_tegra_param(m) \
(m & ((1ULL << __fourcc_mod_tegra_mode_shift) - 1))
/* NVIDIA frame buffer modifiers */
/*
* Tegra Tiled Layout, used by Tegra 2, 3 and 4.
*
* Pixels are arranged in simple tiles of 16 x 16 bytes.
*/
#define NV_FORMAT_MOD_TEGRA_TILED fourcc_mod_tegra_code(1, 0)
#define DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED fourcc_mod_code(NVIDIA, 1)
/*
* Tegra 16Bx2 Block Linear layout, used by TK1/TX1
* 16Bx2 Block Linear layout, used by desktop GPUs, and Tegra K1 and later
*
* Pixels are arranged in 64x8 Groups Of Bytes (GOBs). GOBs are then stacked
* vertically by a power of 2 (1 to 32 GOBs) to form a block.
@@ -380,7 +368,21 @@ extern "C" {
* Chapter 20 "Pixel Memory Formats" of the Tegra X1 TRM describes this format
* in full detail.
*/
#define NV_FORMAT_MOD_TEGRA_16BX2_BLOCK(v) fourcc_mod_tegra_code(2, v)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(v) \
fourcc_mod_code(NVIDIA, 0x10 | ((v) & 0xf))
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB \
fourcc_mod_code(NVIDIA, 0x10)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB \
fourcc_mod_code(NVIDIA, 0x11)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB \
fourcc_mod_code(NVIDIA, 0x12)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB \
fourcc_mod_code(NVIDIA, 0x13)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB \
fourcc_mod_code(NVIDIA, 0x14)
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB \
fourcc_mod_code(NVIDIA, 0x15)
/*
* Broadcom VC4 "T" format

View File

@@ -38,14 +38,18 @@ extern "C" {
#define DRM_DISPLAY_MODE_LEN 32
#define DRM_PROP_NAME_LEN 32
#define DRM_MODE_TYPE_BUILTIN (1<<0)
#define DRM_MODE_TYPE_CLOCK_C ((1<<1) | DRM_MODE_TYPE_BUILTIN)
#define DRM_MODE_TYPE_CRTC_C ((1<<2) | DRM_MODE_TYPE_BUILTIN)
#define DRM_MODE_TYPE_BUILTIN (1<<0) /* deprecated */
#define DRM_MODE_TYPE_CLOCK_C ((1<<1) | DRM_MODE_TYPE_BUILTIN) /* deprecated */
#define DRM_MODE_TYPE_CRTC_C ((1<<2) | DRM_MODE_TYPE_BUILTIN) /* deprecated */
#define DRM_MODE_TYPE_PREFERRED (1<<3)
#define DRM_MODE_TYPE_DEFAULT (1<<4)
#define DRM_MODE_TYPE_DEFAULT (1<<4) /* deprecated */
#define DRM_MODE_TYPE_USERDEF (1<<5)
#define DRM_MODE_TYPE_DRIVER (1<<6)
#define DRM_MODE_TYPE_ALL (DRM_MODE_TYPE_PREFERRED | \
DRM_MODE_TYPE_USERDEF | \
DRM_MODE_TYPE_DRIVER)
/* Video mode flags */
/* bit compatible with the xrandr RR_ definitions (bits 0-13)
*
@@ -66,8 +70,8 @@ extern "C" {
#define DRM_MODE_FLAG_PCSYNC (1<<7)
#define DRM_MODE_FLAG_NCSYNC (1<<8)
#define DRM_MODE_FLAG_HSKEW (1<<9) /* hskew provided */
#define DRM_MODE_FLAG_BCAST (1<<10)
#define DRM_MODE_FLAG_PIXMUX (1<<11)
#define DRM_MODE_FLAG_BCAST (1<<10) /* deprecated */
#define DRM_MODE_FLAG_PIXMUX (1<<11) /* deprecated */
#define DRM_MODE_FLAG_DBLCLK (1<<12)
#define DRM_MODE_FLAG_CLKDIV2 (1<<13)
/*
@@ -99,6 +103,20 @@ extern "C" {
#define DRM_MODE_FLAG_PIC_AR_16_9 \
(DRM_MODE_PICTURE_ASPECT_16_9<<19)
#define DRM_MODE_FLAG_ALL (DRM_MODE_FLAG_PHSYNC | \
DRM_MODE_FLAG_NHSYNC | \
DRM_MODE_FLAG_PVSYNC | \
DRM_MODE_FLAG_NVSYNC | \
DRM_MODE_FLAG_INTERLACE | \
DRM_MODE_FLAG_DBLSCAN | \
DRM_MODE_FLAG_CSYNC | \
DRM_MODE_FLAG_PCSYNC | \
DRM_MODE_FLAG_NCSYNC | \
DRM_MODE_FLAG_HSKEW | \
DRM_MODE_FLAG_DBLCLK | \
DRM_MODE_FLAG_CLKDIV2 | \
DRM_MODE_FLAG_3D_MASK)
/* DPMS flags */
/* bit compatible with the xorg definitions. */
#define DRM_MODE_DPMS_ON 0
@@ -173,6 +191,10 @@ extern "C" {
DRM_MODE_REFLECT_X | \
DRM_MODE_REFLECT_Y)
/* Content Protection Flags */
#define DRM_MODE_CONTENT_PROTECTION_UNDESIRED 0
#define DRM_MODE_CONTENT_PROTECTION_DESIRED 1
#define DRM_MODE_CONTENT_PROTECTION_ENABLED 2
struct drm_mode_modeinfo {
__u32 clock;
@@ -341,7 +363,7 @@ struct drm_mode_get_connector {
__u32 pad;
};
#define DRM_MODE_PROP_PENDING (1<<0)
#define DRM_MODE_PROP_PENDING (1<<0) /* deprecated, do not use */
#define DRM_MODE_PROP_RANGE (1<<1)
#define DRM_MODE_PROP_IMMUTABLE (1<<2)
#define DRM_MODE_PROP_ENUM (1<<3) /* enumerated type with text strings */
@@ -576,8 +598,11 @@ struct drm_mode_crtc_lut {
};
struct drm_color_ctm {
/* Conversion matrix in S31.32 format. */
__s64 matrix[9];
/*
* Conversion matrix in S31.32 sign-magnitude
* (not two's complement!) format.
*/
__u64 matrix[9];
};
struct drm_color_lut {

View File

@@ -102,6 +102,46 @@ enum drm_i915_gem_engine_class {
I915_ENGINE_CLASS_INVALID = -1
};
/**
* DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
*
*/
enum drm_i915_pmu_engine_sample {
I915_SAMPLE_BUSY = 0,
I915_SAMPLE_WAIT = 1,
I915_SAMPLE_SEMA = 2
};
#define I915_PMU_SAMPLE_BITS (4)
#define I915_PMU_SAMPLE_MASK (0xf)
#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
#define I915_PMU_CLASS_SHIFT \
(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
#define __I915_PMU_ENGINE(class, instance, sample) \
((class) << I915_PMU_CLASS_SHIFT | \
(instance) << I915_PMU_SAMPLE_BITS | \
(sample))
#define I915_PMU_ENGINE_BUSY(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
#define I915_PMU_ENGINE_WAIT(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
#define I915_PMU_ENGINE_SEMA(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
#define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0)
#define I915_PMU_REQUESTED_FREQUENCY __I915_PMU_OTHER(1)
#define I915_PMU_INTERRUPTS __I915_PMU_OTHER(2)
#define I915_PMU_RC6_RESIDENCY __I915_PMU_OTHER(3)
#define I915_PMU_LAST I915_PMU_RC6_RESIDENCY
/* Each region is a minimum of 16k, and there are at most 255 of them.
*/
#define I915_NR_TEX_REGIONS 255 /* table size 2k - maximum due to use
@@ -278,6 +318,7 @@ typedef struct _drm_i915_sarea {
#define DRM_I915_PERF_OPEN 0x36
#define DRM_I915_PERF_ADD_CONFIG 0x37
#define DRM_I915_PERF_REMOVE_CONFIG 0x38
#define DRM_I915_QUERY 0x39
#define DRM_IOCTL_I915_INIT DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
#define DRM_IOCTL_I915_FLUSH DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -335,6 +376,7 @@ typedef struct _drm_i915_sarea {
#define DRM_IOCTL_I915_PERF_OPEN DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
#define DRM_IOCTL_I915_PERF_ADD_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
#define DRM_IOCTL_I915_PERF_REMOVE_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
#define DRM_IOCTL_I915_QUERY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
/* Allow drivers to submit batchbuffers directly to hardware, relying
* on the security mechanisms provided by hardware.
@@ -1318,7 +1360,9 @@ struct drm_intel_overlay_attrs {
* active on a given plane.
*/
#define I915_SET_COLORKEY_NONE (1<<0) /* disable color key matching */
#define I915_SET_COLORKEY_NONE (1<<0) /* Deprecated. Instead set
* flags==0 to disable colorkeying.
*/
#define I915_SET_COLORKEY_DESTINATION (1<<1)
#define I915_SET_COLORKEY_SOURCE (1<<2)
struct drm_intel_sprite_colorkey {
@@ -1564,15 +1608,115 @@ struct drm_i915_perf_oa_config {
__u32 n_flex_regs;
/*
* These fields are pointers to tuples of u32 values (register
* address, value). For example the expected length of the buffer
* pointed by mux_regs_ptr is (2 * sizeof(u32) * n_mux_regs).
* These fields are pointers to tuples of u32 values (register address,
* value). For example the expected length of the buffer pointed by
* mux_regs_ptr is (2 * sizeof(u32) * n_mux_regs).
*/
__u64 mux_regs_ptr;
__u64 boolean_regs_ptr;
__u64 flex_regs_ptr;
};
struct drm_i915_query_item {
__u64 query_id;
#define DRM_I915_QUERY_TOPOLOGY_INFO 1
/*
* When set to zero by userspace, this is filled with the size of the
* data to be written at the data_ptr pointer. The kernel sets this
* value to a negative value to signal an error on a particular query
* item.
*/
__s32 length;
/*
* Unused for now. Must be cleared to zero.
*/
__u32 flags;
/*
* Data will be written at the location pointed by data_ptr when the
* value of length matches the length of the data to be written by the
* kernel.
*/
__u64 data_ptr;
};
struct drm_i915_query {
__u32 num_items;
/*
* Unused for now. Must be cleared to zero.
*/
__u32 flags;
/*
* This points to an array of num_items drm_i915_query_item structures.
*/
__u64 items_ptr;
};
/*
* Data written by the kernel with query DRM_I915_QUERY_TOPOLOGY_INFO :
*
* data: contains the 3 pieces of information :
*
* - the slice mask with one bit per slice telling whether a slice is
* available. The availability of slice X can be queried with the following
* formula :
*
* (data[X / 8] >> (X % 8)) & 1
*
* - the subslice mask for each slice with one bit per subslice telling
* whether a subslice is available. The availability of subslice Y in slice
* X can be queried with the following formula :
*
* (data[subslice_offset +
* X * subslice_stride +
* Y / 8] >> (Y % 8)) & 1
*
* - the EU mask for each subslice in each slice with one bit per EU telling
* whether an EU is available. The availability of EU Z in subslice Y in
* slice X can be queried with the following formula :
*
* (data[eu_offset +
* (X * max_subslices + Y) * eu_stride +
* Z / 8] >> (Z % 8)) & 1
*/
struct drm_i915_query_topology_info {
/*
* Unused for now. Must be cleared to zero.
*/
__u16 flags;
__u16 max_slices;
__u16 max_subslices;
__u16 max_eus_per_subslice;
/*
* Offset in data[] at which the subslice masks are stored.
*/
__u16 subslice_offset;
/*
* Stride at which each of the subslice masks for each slice are
* stored.
*/
__u16 subslice_stride;
/*
* Offset in data[] at which the EU masks are stored.
*/
__u16 eu_offset;
/*
* Stride at which each of the EU masks for each subslice are stored.
*/
__u16 eu_stride;
__u8 data[];
};
#if defined(__cplusplus)
}
#endif

View File

@@ -0,0 +1,209 @@
/*
* Copyright (c) 2012-2013, NVIDIA CORPORATION. All rights reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _TEGRA_DRM_H_
#define _TEGRA_DRM_H_
#include "drm.h"
#if defined(__cplusplus)
extern "C" {
#endif
#define DRM_TEGRA_GEM_CREATE_TILED (1 << 0)
#define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1)
struct drm_tegra_gem_create {
__u64 size;
__u32 flags;
__u32 handle;
};
struct drm_tegra_gem_mmap {
__u32 handle;
__u32 pad;
__u64 offset;
};
struct drm_tegra_syncpt_read {
__u32 id;
__u32 value;
};
struct drm_tegra_syncpt_incr {
__u32 id;
__u32 pad;
};
struct drm_tegra_syncpt_wait {
__u32 id;
__u32 thresh;
__u32 timeout;
__u32 value;
};
#define DRM_TEGRA_NO_TIMEOUT (0xffffffff)
struct drm_tegra_open_channel {
__u32 client;
__u32 pad;
__u64 context;
};
struct drm_tegra_close_channel {
__u64 context;
};
struct drm_tegra_get_syncpt {
__u64 context;
__u32 index;
__u32 id;
};
struct drm_tegra_get_syncpt_base {
__u64 context;
__u32 syncpt;
__u32 id;
};
struct drm_tegra_syncpt {
__u32 id;
__u32 incrs;
};
struct drm_tegra_cmdbuf {
__u32 handle;
__u32 offset;
__u32 words;
__u32 pad;
};
struct drm_tegra_reloc {
struct {
__u32 handle;
__u32 offset;
} cmdbuf;
struct {
__u32 handle;
__u32 offset;
} target;
__u32 shift;
__u32 pad;
};
struct drm_tegra_waitchk {
__u32 handle;
__u32 offset;
__u32 syncpt;
__u32 thresh;
};
struct drm_tegra_submit {
__u64 context;
__u32 num_syncpts;
__u32 num_cmdbufs;
__u32 num_relocs;
__u32 num_waitchks;
__u32 waitchk_mask;
__u32 timeout;
__u64 syncpts;
__u64 cmdbufs;
__u64 relocs;
__u64 waitchks;
__u32 fence; /* Return value */
__u32 reserved[5]; /* future expansion */
};
#define DRM_TEGRA_GEM_TILING_MODE_PITCH 0
#define DRM_TEGRA_GEM_TILING_MODE_TILED 1
#define DRM_TEGRA_GEM_TILING_MODE_BLOCK 2
struct drm_tegra_gem_set_tiling {
/* input */
__u32 handle;
__u32 mode;
__u32 value;
__u32 pad;
};
struct drm_tegra_gem_get_tiling {
/* input */
__u32 handle;
/* output */
__u32 mode;
__u32 value;
__u32 pad;
};
#define DRM_TEGRA_GEM_BOTTOM_UP (1 << 0)
#define DRM_TEGRA_GEM_FLAGS (DRM_TEGRA_GEM_BOTTOM_UP)
struct drm_tegra_gem_set_flags {
/* input */
__u32 handle;
/* output */
__u32 flags;
};
struct drm_tegra_gem_get_flags {
/* input */
__u32 handle;
/* output */
__u32 flags;
};
#define DRM_TEGRA_GEM_CREATE 0x00
#define DRM_TEGRA_GEM_MMAP 0x01
#define DRM_TEGRA_SYNCPT_READ 0x02
#define DRM_TEGRA_SYNCPT_INCR 0x03
#define DRM_TEGRA_SYNCPT_WAIT 0x04
#define DRM_TEGRA_OPEN_CHANNEL 0x05
#define DRM_TEGRA_CLOSE_CHANNEL 0x06
#define DRM_TEGRA_GET_SYNCPT 0x07
#define DRM_TEGRA_SUBMIT 0x08
#define DRM_TEGRA_GET_SYNCPT_BASE 0x09
#define DRM_TEGRA_GEM_SET_TILING 0x0a
#define DRM_TEGRA_GEM_GET_TILING 0x0b
#define DRM_TEGRA_GEM_SET_FLAGS 0x0c
#define DRM_TEGRA_GEM_GET_FLAGS 0x0d
#define DRM_IOCTL_TEGRA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_CREATE, struct drm_tegra_gem_create)
#define DRM_IOCTL_TEGRA_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_MMAP, struct drm_tegra_gem_mmap)
#define DRM_IOCTL_TEGRA_SYNCPT_READ DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_READ, struct drm_tegra_syncpt_read)
#define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr)
#define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait)
#define DRM_IOCTL_TEGRA_OPEN_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_OPEN_CHANNEL, struct drm_tegra_open_channel)
#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel)
#define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt)
#define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit)
#define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base)
#define DRM_IOCTL_TEGRA_GEM_SET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_SET_TILING, struct drm_tegra_gem_set_tiling)
#define DRM_IOCTL_TEGRA_GEM_GET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_GET_TILING, struct drm_tegra_gem_get_tiling)
#define DRM_IOCTL_TEGRA_GEM_SET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_SET_FLAGS, struct drm_tegra_gem_set_flags)
#define DRM_IOCTL_TEGRA_GEM_GET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_GET_FLAGS, struct drm_tegra_gem_get_flags)
#if defined(__cplusplus)
}
#endif
#endif

View File

@@ -1,5 +1,5 @@
/*
* Copyright © 2014-2017 Broadcom
* Copyright © 2014-2018 Broadcom
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
@@ -21,8 +21,8 @@
* IN THE SOFTWARE.
*/
#ifndef _VC5_DRM_H_
#define _VC5_DRM_H_
#ifndef _V3D_DRM_H_
#define _V3D_DRM_H_
#include "drm.h"
@@ -30,30 +30,28 @@
extern "C" {
#endif
#define DRM_VC5_SUBMIT_CL 0x00
#define DRM_VC5_WAIT_SEQNO 0x01
#define DRM_VC5_WAIT_BO 0x02
#define DRM_VC5_CREATE_BO 0x03
#define DRM_VC5_MMAP_BO 0x04
#define DRM_VC5_GET_PARAM 0x05
#define DRM_VC5_GET_BO_OFFSET 0x06
#define DRM_V3D_SUBMIT_CL 0x00
#define DRM_V3D_WAIT_BO 0x01
#define DRM_V3D_CREATE_BO 0x02
#define DRM_V3D_MMAP_BO 0x03
#define DRM_V3D_GET_PARAM 0x04
#define DRM_V3D_GET_BO_OFFSET 0x05
#define DRM_IOCTL_VC5_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_SUBMIT_CL, struct drm_vc5_submit_cl)
#define DRM_IOCTL_VC5_WAIT_SEQNO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_WAIT_SEQNO, struct drm_vc5_wait_seqno)
#define DRM_IOCTL_VC5_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_WAIT_BO, struct drm_vc5_wait_bo)
#define DRM_IOCTL_VC5_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_CREATE_BO, struct drm_vc5_create_bo)
#define DRM_IOCTL_VC5_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_MMAP_BO, struct drm_vc5_mmap_bo)
#define DRM_IOCTL_VC5_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_GET_PARAM, struct drm_vc5_get_param)
#define DRM_IOCTL_VC5_GET_BO_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_GET_BO_OFFSET, struct drm_vc5_get_bo_offset)
#define DRM_IOCTL_V3D_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)
#define DRM_IOCTL_V3D_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)
#define DRM_IOCTL_V3D_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_CREATE_BO, struct drm_v3d_create_bo)
#define DRM_IOCTL_V3D_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_MMAP_BO, struct drm_v3d_mmap_bo)
#define DRM_IOCTL_V3D_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
#define DRM_IOCTL_V3D_GET_BO_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)
/**
* struct drm_vc5_submit_cl - ioctl argument for submitting commands to the 3D
* struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
* engine.
*
* This asks the kernel to have the GPU execute an optional binner
* command list, and a render command list.
*/
struct drm_vc5_submit_cl {
struct drm_v3d_submit_cl {
/* Pointer to the binner command list.
*
* This is the first set of commands executed, which runs the
@@ -77,6 +75,13 @@ struct drm_vc5_submit_cl {
/** End address of the RCL (first byte after the RCL) */
__u32 rcl_end;
/** An optional sync object to wait on before starting the BCL. */
__u32 in_sync_bcl;
/** An optional sync object to wait on before starting the RCL. */
__u32 in_sync_rcl;
/** An optional sync object to place the completion fence in. */
__u32 out_sync;
/* Offset of the tile alloc memory
*
* This is optional on V3D 3.3 (where the CL can set the value) but
@@ -84,62 +89,44 @@ struct drm_vc5_submit_cl {
*/
__u32 qma;
/** Size of the tile alloc memory. */
/** Size of the tile alloc memory. */
__u32 qms;
/** Offset of the tile state data array. */
/** Offset of the tile state data array. */
__u32 qts;
/* Pointer to a u32 array of the BOs that are referenced by the job.
*/
__u64 bo_handles;
/* Pointer to an array of chunks of extra submit CL information. (the
* chunk struct is not yet defined)
*/
__u64 chunks;
/* Number of BO handles passed in (size is that times 4). */
__u32 bo_handle_count;
__u32 chunk_count;
__u64 flags;
/* Pad, must be zero-filled. */
__u32 pad;
};
/**
* struct drm_vc5_wait_seqno - ioctl argument for waiting for
* DRM_VC5_SUBMIT_CL completion using its returned seqno.
*
* timeout_ns is the timeout in nanoseconds, where "0" means "don't
* block, just return the status."
*/
struct drm_vc5_wait_seqno {
__u64 seqno;
__u64 timeout_ns;
};
/**
* struct drm_vc5_wait_bo - ioctl argument for waiting for
* completion of the last DRM_VC5_SUBMIT_CL on a BO.
* struct drm_v3d_wait_bo - ioctl argument for waiting for
* completion of the last DRM_V3D_SUBMIT_CL on a BO.
*
* This is useful for cases where multiple processes might be
* rendering to a BO and you want to wait for all rendering to be
* completed.
*/
struct drm_vc5_wait_bo {
struct drm_v3d_wait_bo {
__u32 handle;
__u32 pad;
__u64 timeout_ns;
};
/**
* struct drm_vc5_create_bo - ioctl argument for creating VC5 BOs.
* struct drm_v3d_create_bo - ioctl argument for creating V3D BOs.
*
* There are currently no values for the flags argument, but it may be
* used in a future extension.
*/
struct drm_vc5_create_bo {
struct drm_v3d_create_bo {
__u32 size;
__u32 flags;
/** Returned GEM handle for the BO. */
@@ -148,12 +135,15 @@ struct drm_vc5_create_bo {
* Returned offset for the BO in the V3D address space. This offset
* is private to the DRM fd and is valid for the lifetime of the GEM
* handle.
*
* This offset value will always be nonzero, since various HW
* units treat 0 specially.
*/
__u32 offset;
};
/**
* struct drm_vc5_mmap_bo - ioctl argument for mapping VC5 BOs.
* struct drm_v3d_mmap_bo - ioctl argument for mapping V3D BOs.
*
* This doesn't actually perform an mmap. Instead, it returns the
* offset you need to use in an mmap on the DRM device node. This
@@ -163,7 +153,7 @@ struct drm_vc5_create_bo {
* There are currently no values for the flags argument, but it may be
* used in a future extension.
*/
struct drm_vc5_mmap_bo {
struct drm_v3d_mmap_bo {
/** Handle for the object being mapped. */
__u32 handle;
__u32 flags;
@@ -171,17 +161,17 @@ struct drm_vc5_mmap_bo {
__u64 offset;
};
enum drm_vc5_param {
DRM_VC5_PARAM_V3D_UIFCFG,
DRM_VC5_PARAM_V3D_HUB_IDENT1,
DRM_VC5_PARAM_V3D_HUB_IDENT2,
DRM_VC5_PARAM_V3D_HUB_IDENT3,
DRM_VC5_PARAM_V3D_CORE0_IDENT0,
DRM_VC5_PARAM_V3D_CORE0_IDENT1,
DRM_VC5_PARAM_V3D_CORE0_IDENT2,
enum drm_v3d_param {
DRM_V3D_PARAM_V3D_UIFCFG,
DRM_V3D_PARAM_V3D_HUB_IDENT1,
DRM_V3D_PARAM_V3D_HUB_IDENT2,
DRM_V3D_PARAM_V3D_HUB_IDENT3,
DRM_V3D_PARAM_V3D_CORE0_IDENT0,
DRM_V3D_PARAM_V3D_CORE0_IDENT1,
DRM_V3D_PARAM_V3D_CORE0_IDENT2,
};
struct drm_vc5_get_param {
struct drm_v3d_get_param {
__u32 param;
__u32 pad;
__u64 value;
@@ -189,10 +179,10 @@ struct drm_vc5_get_param {
/**
* Returns the offset for the BO in the V3D address space for this DRM fd.
* This is the same value returned by drm_vc5_create_bo, if that was called
* This is the same value returned by drm_v3d_create_bo, if that was called
* from this DRM fd.
*/
struct drm_vc5_get_bo_offset {
struct drm_v3d_get_bo_offset {
__u32 handle;
__u32 offset;
};
@@ -201,4 +191,4 @@ struct drm_vc5_get_bo_offset {
}
#endif
#endif /* _VC5_DRM_H_ */
#endif /* _V3D_DRM_H_ */

View File

@@ -42,6 +42,9 @@ extern "C" {
#define DRM_VC4_GET_TILING 0x09
#define DRM_VC4_LABEL_BO 0x0a
#define DRM_VC4_GEM_MADVISE 0x0b
#define DRM_VC4_PERFMON_CREATE 0x0c
#define DRM_VC4_PERFMON_DESTROY 0x0d
#define DRM_VC4_PERFMON_GET_VALUES 0x0e
#define DRM_IOCTL_VC4_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_SUBMIT_CL, struct drm_vc4_submit_cl)
#define DRM_IOCTL_VC4_WAIT_SEQNO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_WAIT_SEQNO, struct drm_vc4_wait_seqno)
@@ -55,6 +58,9 @@ extern "C" {
#define DRM_IOCTL_VC4_GET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GET_TILING, struct drm_vc4_get_tiling)
#define DRM_IOCTL_VC4_LABEL_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_LABEL_BO, struct drm_vc4_label_bo)
#define DRM_IOCTL_VC4_GEM_MADVISE DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GEM_MADVISE, struct drm_vc4_gem_madvise)
#define DRM_IOCTL_VC4_PERFMON_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_CREATE, struct drm_vc4_perfmon_create)
#define DRM_IOCTL_VC4_PERFMON_DESTROY DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_DESTROY, struct drm_vc4_perfmon_destroy)
#define DRM_IOCTL_VC4_PERFMON_GET_VALUES DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_GET_VALUES, struct drm_vc4_perfmon_get_values)
struct drm_vc4_submit_rcl_surface {
__u32 hindex; /* Handle index, or ~0 if not present. */
@@ -173,6 +179,22 @@ struct drm_vc4_submit_cl {
* wait ioctl).
*/
__u64 seqno;
/* ID of the perfmon to attach to this job. 0 means no perfmon. */
__u32 perfmonid;
/* Syncobj handle to wait on. If set, processing of this render job
* will not start until the syncobj is signaled. 0 means ignore.
*/
__u32 in_sync;
/* Syncobj handle to export fence to. If set, the fence in the syncobj
* will be replaced with a fence that signals upon completion of this
* render job. 0 means ignore.
*/
__u32 out_sync;
__u32 pad2;
};
/**
@@ -308,6 +330,7 @@ struct drm_vc4_get_hang_state {
#define DRM_VC4_PARAM_SUPPORTS_THREADED_FS 5
#define DRM_VC4_PARAM_SUPPORTS_FIXED_RCL_ORDER 6
#define DRM_VC4_PARAM_SUPPORTS_MADVISE 7
#define DRM_VC4_PARAM_SUPPORTS_PERFMON 8
struct drm_vc4_get_param {
__u32 param;
@@ -352,6 +375,66 @@ struct drm_vc4_gem_madvise {
__u32 pad;
};
enum {
VC4_PERFCNT_FEP_VALID_PRIMS_NO_RENDER,
VC4_PERFCNT_FEP_VALID_PRIMS_RENDER,
VC4_PERFCNT_FEP_CLIPPED_QUADS,
VC4_PERFCNT_FEP_VALID_QUADS,
VC4_PERFCNT_TLB_QUADS_NOT_PASSING_STENCIL,
VC4_PERFCNT_TLB_QUADS_NOT_PASSING_Z_AND_STENCIL,
VC4_PERFCNT_TLB_QUADS_PASSING_Z_AND_STENCIL,
VC4_PERFCNT_TLB_QUADS_ZERO_COVERAGE,
VC4_PERFCNT_TLB_QUADS_NON_ZERO_COVERAGE,
VC4_PERFCNT_TLB_QUADS_WRITTEN_TO_COLOR_BUF,
VC4_PERFCNT_PLB_PRIMS_OUTSIDE_VIEWPORT,
VC4_PERFCNT_PLB_PRIMS_NEED_CLIPPING,
VC4_PERFCNT_PSE_PRIMS_REVERSED,
VC4_PERFCNT_QPU_TOTAL_IDLE_CYCLES,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_VERTEX_COORD_SHADING,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_FRAGMENT_SHADING,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_EXEC_VALID_INST,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_TMUS,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_SCOREBOARD,
VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_VARYINGS,
VC4_PERFCNT_QPU_TOTAL_INST_CACHE_HIT,
VC4_PERFCNT_QPU_TOTAL_INST_CACHE_MISS,
VC4_PERFCNT_QPU_TOTAL_UNIFORM_CACHE_HIT,
VC4_PERFCNT_QPU_TOTAL_UNIFORM_CACHE_MISS,
VC4_PERFCNT_TMU_TOTAL_TEXT_QUADS_PROCESSED,
VC4_PERFCNT_TMU_TOTAL_TEXT_CACHE_MISS,
VC4_PERFCNT_VPM_TOTAL_CLK_CYCLES_VDW_STALLED,
VC4_PERFCNT_VPM_TOTAL_CLK_CYCLES_VCD_STALLED,
VC4_PERFCNT_L2C_TOTAL_L2_CACHE_HIT,
VC4_PERFCNT_L2C_TOTAL_L2_CACHE_MISS,
VC4_PERFCNT_NUM_EVENTS,
};
#define DRM_VC4_MAX_PERF_COUNTERS 16
struct drm_vc4_perfmon_create {
__u32 id;
__u32 ncounters;
__u8 events[DRM_VC4_MAX_PERF_COUNTERS];
};
struct drm_vc4_perfmon_destroy {
__u32 id;
};
/*
* Returns the values of the performance counters tracked by this
* perfmon (as an array of ncounters u64 values).
*
* No implicit synchronization is performed, so the user has to
* guarantee that any jobs using this perfmon have already been
* completed (probably by blocking on the seqno returned by the
* last exec that used the perfmon).
*/
struct drm_vc4_perfmon_get_values {
__u32 id;
__u64 values_ptr;
};
#if defined(__cplusplus)
}
#endif

View File

@@ -22,6 +22,7 @@ inc_drm_uapi = include_directories('drm-uapi')
inc_vulkan = include_directories('vulkan')
inc_d3d9 = include_directories('D3D9')
inc_gl_internal = include_directories('GL/internal')
inc_haikugl = include_directories('HaikuGL')
if with_gles1
install_headers(
@@ -80,6 +81,13 @@ if with_gallium_st_nine
)
endif
if with_platform_haiku
install_headers(
'HaikuGL/GLRenderer.h', 'HaikuGL/GLView.h', 'HaikuGL/OpenGLKit.h',
subdir : 'opengl',
)
endif
# Only install the headers if we are building a stand alone implementation and
# not an ICD enabled implementation
if with_gallium_opencl and not with_opencl_icd

View File

@@ -156,6 +156,7 @@ CHIPSET(0x5912, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")
CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Kaby Lake GT2)")
CHIPSET(0x591A, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
CHIPSET(0x591B, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")
CHIPSET(0x591C, kbl_gt2, "Intel(R) Kaby Lake GT2")
CHIPSET(0x591D, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kaby Lake GT2)")
CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
@@ -165,16 +166,16 @@ CHIPSET(0x5927, kbl_gt3, "Intel(R) Iris Plus Graphics 650 (Kaby Lake GT3e)")
CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x3184, glk, "Intel(R) UHD Graphics 605 (Geminilake)")
CHIPSET(0x3185, glk_2x6, "Intel(R) UHD Graphics 600 (Geminilake 2x6)")
CHIPSET(0x3E90, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3E93, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3E90, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")
CHIPSET(0x3E93, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")
CHIPSET(0x3E99, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")
CHIPSET(0x3E91, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E92, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E91, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
CHIPSET(0x3E92, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
CHIPSET(0x3E96, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E9A, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E9B, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3E9B, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")
CHIPSET(0x3E94, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")
@@ -196,3 +197,11 @@ CHIPSET(0x5A50, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x5A51, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x5A52, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x5A54, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
CHIPSET(0x8A50, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A51, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A52, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
CHIPSET(0x8A5D, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
CHIPSET(0x8A71, icl_1x8, "Intel(R) HD Graphics (Ice Lake 1x8 GT0.5)")

View File

@@ -216,6 +216,9 @@ CHIPSET(0x6995, POLARIS12)
CHIPSET(0x6997, POLARIS12)
CHIPSET(0x699F, POLARIS12)
CHIPSET(0x694C, VEGAM)
CHIPSET(0x694E, VEGAM)
CHIPSET(0x6860, VEGA10)
CHIPSET(0x6861, VEGA10)
CHIPSET(0x6862, VEGA10)
@@ -226,4 +229,10 @@ CHIPSET(0x6868, VEGA10)
CHIPSET(0x687F, VEGA10)
CHIPSET(0x686C, VEGA10)
CHIPSET(0x69A0, VEGA12)
CHIPSET(0x69A1, VEGA12)
CHIPSET(0x69A2, VEGA12)
CHIPSET(0x69A3, VEGA12)
CHIPSET(0x69AF, VEGA12)
CHIPSET(0x15DD, RAVEN)

View File

@@ -24,13 +24,34 @@
#define VKICD_H
#include "vulkan.h"
#include <stdbool.h>
/*
* Loader-ICD version negotiation API
*/
#define CURRENT_LOADER_ICD_INTERFACE_VERSION 3
// Loader-ICD version negotiation API. Versions add the following features:
// Version 0 - Initial. Doesn't support vk_icdGetInstanceProcAddr
// or vk_icdNegotiateLoaderICDInterfaceVersion.
// Version 1 - Add support for vk_icdGetInstanceProcAddr.
// Version 2 - Add Loader/ICD Interface version negotiation
// via vk_icdNegotiateLoaderICDInterfaceVersion.
// Version 3 - Add ICD creation/destruction of KHR_surface objects.
// Version 4 - Add unknown physical device extension qyering via
// vk_icdGetPhysicalDeviceProcAddr.
// Version 5 - Tells ICDs that the loader is now paying attention to the
// application version of Vulkan passed into the ApplicationInfo
// structure during vkCreateInstance. This will tell the ICD
// that if the loader is older, it should automatically fail a
// call for any API version > 1.0. Otherwise, the loader will
// manually determine if it can support the expected version.
#define CURRENT_LOADER_ICD_INTERFACE_VERSION 5
#define MIN_SUPPORTED_LOADER_ICD_INTERFACE_VERSION 0
typedef VkResult (VKAPI_PTR *PFN_vkNegotiateLoaderICDInterfaceVersion)(uint32_t *pVersion);
#define MIN_PHYS_DEV_EXTENSION_ICD_INTERFACE_VERSION 4
typedef VkResult(VKAPI_PTR *PFN_vkNegotiateLoaderICDInterfaceVersion)(uint32_t *pVersion);
// This is defined in vk_layer.h which will be found by the loader, but if an ICD is building against this
// file directly, it won't be found.
#ifndef PFN_GetPhysicalDeviceProcAddr
typedef PFN_vkVoidFunction(VKAPI_PTR *PFN_GetPhysicalDeviceProcAddr)(VkInstance instance, const char *pName);
#endif
/*
* The ICD must reserve space for a pointer for the loader's dispatch
* table, at the start of <each object>.
@@ -64,6 +85,9 @@ typedef enum {
VK_ICD_WSI_PLATFORM_WIN32,
VK_ICD_WSI_PLATFORM_XCB,
VK_ICD_WSI_PLATFORM_XLIB,
VK_ICD_WSI_PLATFORM_ANDROID,
VK_ICD_WSI_PLATFORM_MACOS,
VK_ICD_WSI_PLATFORM_IOS,
VK_ICD_WSI_PLATFORM_DISPLAY
} VkIcdWsiPlatform;
@@ -77,7 +101,7 @@ typedef struct {
MirConnection *connection;
MirSurface *mirSurface;
} VkIcdSurfaceMir;
#endif // VK_USE_PLATFORM_MIR_KHR
#endif // VK_USE_PLATFORM_MIR_KHR
#ifdef VK_USE_PLATFORM_WAYLAND_KHR
typedef struct {
@@ -85,7 +109,7 @@ typedef struct {
struct wl_display *display;
struct wl_surface *surface;
} VkIcdSurfaceWayland;
#endif // VK_USE_PLATFORM_WAYLAND_KHR
#endif // VK_USE_PLATFORM_WAYLAND_KHR
#ifdef VK_USE_PLATFORM_WIN32_KHR
typedef struct {
@@ -93,7 +117,7 @@ typedef struct {
HINSTANCE hinstance;
HWND hwnd;
} VkIcdSurfaceWin32;
#endif // VK_USE_PLATFORM_WIN32_KHR
#endif // VK_USE_PLATFORM_WIN32_KHR
#ifdef VK_USE_PLATFORM_XCB_KHR
typedef struct {
@@ -101,7 +125,7 @@ typedef struct {
xcb_connection_t *connection;
xcb_window_t window;
} VkIcdSurfaceXcb;
#endif // VK_USE_PLATFORM_XCB_KHR
#endif // VK_USE_PLATFORM_XCB_KHR
#ifdef VK_USE_PLATFORM_XLIB_KHR
typedef struct {
@@ -109,13 +133,28 @@ typedef struct {
Display *dpy;
Window window;
} VkIcdSurfaceXlib;
#endif // VK_USE_PLATFORM_XLIB_KHR
#endif // VK_USE_PLATFORM_XLIB_KHR
#ifdef VK_USE_PLATFORM_ANDROID_KHR
typedef struct {
ANativeWindow* window;
VkIcdSurfaceBase base;
struct ANativeWindow *window;
} VkIcdSurfaceAndroid;
#endif //VK_USE_PLATFORM_ANDROID_KHR
#endif // VK_USE_PLATFORM_ANDROID_KHR
#ifdef VK_USE_PLATFORM_MACOS_MVK
typedef struct {
VkIcdSurfaceBase base;
const void *pView;
} VkIcdSurfaceMacOS;
#endif // VK_USE_PLATFORM_MACOS_MVK
#ifdef VK_USE_PLATFORM_IOS_MVK
typedef struct {
VkIcdSurfaceBase base;
const void *pView;
} VkIcdSurfaceIOS;
#endif // VK_USE_PLATFORM_IOS_MVK
typedef struct {
VkIcdSurfaceBase base;
@@ -128,4 +167,4 @@ typedef struct {
VkExtent2D imageExtent;
} VkIcdSurfaceDisplay;
#endif // VKICD_H
#endif // VKICD_H

View File

@@ -89,32 +89,4 @@ extern "C"
} // extern "C"
#endif // __cplusplus
// Platform-specific headers required by platform window system extensions.
// These are enabled prior to #including "vulkan.h". The same enable then
// controls inclusion of the extension interfaces in vulkan.h.
#ifdef VK_USE_PLATFORM_ANDROID_KHR
#include <android/native_window.h>
#endif
#ifdef VK_USE_PLATFORM_MIR_KHR
#include <mir_toolkit/client_types.h>
#endif
#ifdef VK_USE_PLATFORM_WAYLAND_KHR
#include <wayland-client.h>
#endif
#ifdef VK_USE_PLATFORM_WIN32_KHR
#include <windows.h>
#endif
#ifdef VK_USE_PLATFORM_XLIB_KHR
#include <X11/Xlib.h>
#endif
#ifdef VK_USE_PLATFORM_XCB_KHR
#include <xcb/xcb.h>
#endif
#endif

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,126 @@
#ifndef VULKAN_ANDROID_H_
#define VULKAN_ANDROID_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_android_surface 1
struct ANativeWindow;
#define VK_KHR_ANDROID_SURFACE_SPEC_VERSION 6
#define VK_KHR_ANDROID_SURFACE_EXTENSION_NAME "VK_KHR_android_surface"
typedef VkFlags VkAndroidSurfaceCreateFlagsKHR;
typedef struct VkAndroidSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkAndroidSurfaceCreateFlagsKHR flags;
struct ANativeWindow* window;
} VkAndroidSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateAndroidSurfaceKHR)(VkInstance instance, const VkAndroidSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateAndroidSurfaceKHR(
VkInstance instance,
const VkAndroidSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#define VK_ANDROID_external_memory_android_hardware_buffer 1
struct AHardwareBuffer;
#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_SPEC_VERSION 3
#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_EXTENSION_NAME "VK_ANDROID_external_memory_android_hardware_buffer"
typedef struct VkAndroidHardwareBufferUsageANDROID {
VkStructureType sType;
void* pNext;
uint64_t androidHardwareBufferUsage;
} VkAndroidHardwareBufferUsageANDROID;
typedef struct VkAndroidHardwareBufferPropertiesANDROID {
VkStructureType sType;
void* pNext;
VkDeviceSize allocationSize;
uint32_t memoryTypeBits;
} VkAndroidHardwareBufferPropertiesANDROID;
typedef struct VkAndroidHardwareBufferFormatPropertiesANDROID {
VkStructureType sType;
void* pNext;
VkFormat format;
uint64_t externalFormat;
VkFormatFeatureFlags formatFeatures;
VkComponentMapping samplerYcbcrConversionComponents;
VkSamplerYcbcrModelConversion suggestedYcbcrModel;
VkSamplerYcbcrRange suggestedYcbcrRange;
VkChromaLocation suggestedXChromaOffset;
VkChromaLocation suggestedYChromaOffset;
} VkAndroidHardwareBufferFormatPropertiesANDROID;
typedef struct VkImportAndroidHardwareBufferInfoANDROID {
VkStructureType sType;
const void* pNext;
struct AHardwareBuffer* buffer;
} VkImportAndroidHardwareBufferInfoANDROID;
typedef struct VkMemoryGetAndroidHardwareBufferInfoANDROID {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
} VkMemoryGetAndroidHardwareBufferInfoANDROID;
typedef struct VkExternalFormatANDROID {
VkStructureType sType;
void* pNext;
uint64_t externalFormat;
} VkExternalFormatANDROID;
typedef VkResult (VKAPI_PTR *PFN_vkGetAndroidHardwareBufferPropertiesANDROID)(VkDevice device, const struct AHardwareBuffer* buffer, VkAndroidHardwareBufferPropertiesANDROID* pProperties);
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryAndroidHardwareBufferANDROID)(VkDevice device, const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo, struct AHardwareBuffer** pBuffer);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkGetAndroidHardwareBufferPropertiesANDROID(
VkDevice device,
const struct AHardwareBuffer* buffer,
VkAndroidHardwareBufferPropertiesANDROID* pProperties);
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryAndroidHardwareBufferANDROID(
VkDevice device,
const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo,
struct AHardwareBuffer** pBuffer);
#endif
#ifdef __cplusplus
}
#endif
#endif

7576
include/vulkan/vulkan_core.h Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,58 @@
#ifndef VULKAN_IOS_H_
#define VULKAN_IOS_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_MVK_ios_surface 1
#define VK_MVK_IOS_SURFACE_SPEC_VERSION 2
#define VK_MVK_IOS_SURFACE_EXTENSION_NAME "VK_MVK_ios_surface"
typedef VkFlags VkIOSSurfaceCreateFlagsMVK;
typedef struct VkIOSSurfaceCreateInfoMVK {
VkStructureType sType;
const void* pNext;
VkIOSSurfaceCreateFlagsMVK flags;
const void* pView;
} VkIOSSurfaceCreateInfoMVK;
typedef VkResult (VKAPI_PTR *PFN_vkCreateIOSSurfaceMVK)(VkInstance instance, const VkIOSSurfaceCreateInfoMVK* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateIOSSurfaceMVK(
VkInstance instance,
const VkIOSSurfaceCreateInfoMVK* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,58 @@
#ifndef VULKAN_MACOS_H_
#define VULKAN_MACOS_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_MVK_macos_surface 1
#define VK_MVK_MACOS_SURFACE_SPEC_VERSION 2
#define VK_MVK_MACOS_SURFACE_EXTENSION_NAME "VK_MVK_macos_surface"
typedef VkFlags VkMacOSSurfaceCreateFlagsMVK;
typedef struct VkMacOSSurfaceCreateInfoMVK {
VkStructureType sType;
const void* pNext;
VkMacOSSurfaceCreateFlagsMVK flags;
const void* pView;
} VkMacOSSurfaceCreateInfoMVK;
typedef VkResult (VKAPI_PTR *PFN_vkCreateMacOSSurfaceMVK)(VkInstance instance, const VkMacOSSurfaceCreateInfoMVK* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateMacOSSurfaceMVK(
VkInstance instance,
const VkMacOSSurfaceCreateInfoMVK* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,65 @@
#ifndef VULKAN_MIR_H_
#define VULKAN_MIR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_mir_surface 1
#define VK_KHR_MIR_SURFACE_SPEC_VERSION 4
#define VK_KHR_MIR_SURFACE_EXTENSION_NAME "VK_KHR_mir_surface"
typedef VkFlags VkMirSurfaceCreateFlagsKHR;
typedef struct VkMirSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkMirSurfaceCreateFlagsKHR flags;
MirConnection* connection;
MirSurface* mirSurface;
} VkMirSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateMirSurfaceKHR)(VkInstance instance, const VkMirSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceMirPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, MirConnection* connection);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateMirSurfaceKHR(
VkInstance instance,
const VkMirSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceMirPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
MirConnection* connection);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,58 @@
#ifndef VULKAN_VI_H_
#define VULKAN_VI_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_NN_vi_surface 1
#define VK_NN_VI_SURFACE_SPEC_VERSION 1
#define VK_NN_VI_SURFACE_EXTENSION_NAME "VK_NN_vi_surface"
typedef VkFlags VkViSurfaceCreateFlagsNN;
typedef struct VkViSurfaceCreateInfoNN {
VkStructureType sType;
const void* pNext;
VkViSurfaceCreateFlagsNN flags;
void* window;
} VkViSurfaceCreateInfoNN;
typedef VkResult (VKAPI_PTR *PFN_vkCreateViSurfaceNN)(VkInstance instance, const VkViSurfaceCreateInfoNN* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateViSurfaceNN(
VkInstance instance,
const VkViSurfaceCreateInfoNN* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,65 @@
#ifndef VULKAN_WAYLAND_H_
#define VULKAN_WAYLAND_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_wayland_surface 1
#define VK_KHR_WAYLAND_SURFACE_SPEC_VERSION 6
#define VK_KHR_WAYLAND_SURFACE_EXTENSION_NAME "VK_KHR_wayland_surface"
typedef VkFlags VkWaylandSurfaceCreateFlagsKHR;
typedef struct VkWaylandSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkWaylandSurfaceCreateFlagsKHR flags;
struct wl_display* display;
struct wl_surface* surface;
} VkWaylandSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateWaylandSurfaceKHR)(VkInstance instance, const VkWaylandSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceWaylandPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, struct wl_display* display);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateWaylandSurfaceKHR(
VkInstance instance,
const VkWaylandSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceWaylandPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
struct wl_display* display);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,276 @@
#ifndef VULKAN_WIN32_H_
#define VULKAN_WIN32_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_win32_surface 1
#define VK_KHR_WIN32_SURFACE_SPEC_VERSION 6
#define VK_KHR_WIN32_SURFACE_EXTENSION_NAME "VK_KHR_win32_surface"
typedef VkFlags VkWin32SurfaceCreateFlagsKHR;
typedef struct VkWin32SurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkWin32SurfaceCreateFlagsKHR flags;
HINSTANCE hinstance;
HWND hwnd;
} VkWin32SurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateWin32SurfaceKHR)(VkInstance instance, const VkWin32SurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceWin32PresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateWin32SurfaceKHR(
VkInstance instance,
const VkWin32SurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceWin32PresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex);
#endif
#define VK_KHR_external_memory_win32 1
#define VK_KHR_EXTERNAL_MEMORY_WIN32_SPEC_VERSION 1
#define VK_KHR_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME "VK_KHR_external_memory_win32"
typedef struct VkImportMemoryWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportMemoryWin32HandleInfoKHR;
typedef struct VkExportMemoryWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportMemoryWin32HandleInfoKHR;
typedef struct VkMemoryWin32HandlePropertiesKHR {
VkStructureType sType;
void* pNext;
uint32_t memoryTypeBits;
} VkMemoryWin32HandlePropertiesKHR;
typedef struct VkMemoryGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
VkExternalMemoryHandleTypeFlagBits handleType;
} VkMemoryGetWin32HandleInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandleKHR)(VkDevice device, const VkMemoryGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandlePropertiesKHR)(VkDevice device, VkExternalMemoryHandleTypeFlagBits handleType, HANDLE handle, VkMemoryWin32HandlePropertiesKHR* pMemoryWin32HandleProperties);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandleKHR(
VkDevice device,
const VkMemoryGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandlePropertiesKHR(
VkDevice device,
VkExternalMemoryHandleTypeFlagBits handleType,
HANDLE handle,
VkMemoryWin32HandlePropertiesKHR* pMemoryWin32HandleProperties);
#endif
#define VK_KHR_win32_keyed_mutex 1
#define VK_KHR_WIN32_KEYED_MUTEX_SPEC_VERSION 1
#define VK_KHR_WIN32_KEYED_MUTEX_EXTENSION_NAME "VK_KHR_win32_keyed_mutex"
typedef struct VkWin32KeyedMutexAcquireReleaseInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t acquireCount;
const VkDeviceMemory* pAcquireSyncs;
const uint64_t* pAcquireKeys;
const uint32_t* pAcquireTimeouts;
uint32_t releaseCount;
const VkDeviceMemory* pReleaseSyncs;
const uint64_t* pReleaseKeys;
} VkWin32KeyedMutexAcquireReleaseInfoKHR;
#define VK_KHR_external_semaphore_win32 1
#define VK_KHR_EXTERNAL_SEMAPHORE_WIN32_SPEC_VERSION 1
#define VK_KHR_EXTERNAL_SEMAPHORE_WIN32_EXTENSION_NAME "VK_KHR_external_semaphore_win32"
typedef struct VkImportSemaphoreWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkSemaphore semaphore;
VkSemaphoreImportFlags flags;
VkExternalSemaphoreHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportSemaphoreWin32HandleInfoKHR;
typedef struct VkExportSemaphoreWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportSemaphoreWin32HandleInfoKHR;
typedef struct VkD3D12FenceSubmitInfoKHR {
VkStructureType sType;
const void* pNext;
uint32_t waitSemaphoreValuesCount;
const uint64_t* pWaitSemaphoreValues;
uint32_t signalSemaphoreValuesCount;
const uint64_t* pSignalSemaphoreValues;
} VkD3D12FenceSubmitInfoKHR;
typedef struct VkSemaphoreGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkSemaphore semaphore;
VkExternalSemaphoreHandleTypeFlagBits handleType;
} VkSemaphoreGetWin32HandleInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkImportSemaphoreWin32HandleKHR)(VkDevice device, const VkImportSemaphoreWin32HandleInfoKHR* pImportSemaphoreWin32HandleInfo);
typedef VkResult (VKAPI_PTR *PFN_vkGetSemaphoreWin32HandleKHR)(VkDevice device, const VkSemaphoreGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkImportSemaphoreWin32HandleKHR(
VkDevice device,
const VkImportSemaphoreWin32HandleInfoKHR* pImportSemaphoreWin32HandleInfo);
VKAPI_ATTR VkResult VKAPI_CALL vkGetSemaphoreWin32HandleKHR(
VkDevice device,
const VkSemaphoreGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
#endif
#define VK_KHR_external_fence_win32 1
#define VK_KHR_EXTERNAL_FENCE_WIN32_SPEC_VERSION 1
#define VK_KHR_EXTERNAL_FENCE_WIN32_EXTENSION_NAME "VK_KHR_external_fence_win32"
typedef struct VkImportFenceWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkFence fence;
VkFenceImportFlags flags;
VkExternalFenceHandleTypeFlagBits handleType;
HANDLE handle;
LPCWSTR name;
} VkImportFenceWin32HandleInfoKHR;
typedef struct VkExportFenceWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
LPCWSTR name;
} VkExportFenceWin32HandleInfoKHR;
typedef struct VkFenceGetWin32HandleInfoKHR {
VkStructureType sType;
const void* pNext;
VkFence fence;
VkExternalFenceHandleTypeFlagBits handleType;
} VkFenceGetWin32HandleInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkImportFenceWin32HandleKHR)(VkDevice device, const VkImportFenceWin32HandleInfoKHR* pImportFenceWin32HandleInfo);
typedef VkResult (VKAPI_PTR *PFN_vkGetFenceWin32HandleKHR)(VkDevice device, const VkFenceGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkImportFenceWin32HandleKHR(
VkDevice device,
const VkImportFenceWin32HandleInfoKHR* pImportFenceWin32HandleInfo);
VKAPI_ATTR VkResult VKAPI_CALL vkGetFenceWin32HandleKHR(
VkDevice device,
const VkFenceGetWin32HandleInfoKHR* pGetWin32HandleInfo,
HANDLE* pHandle);
#endif
#define VK_NV_external_memory_win32 1
#define VK_NV_EXTERNAL_MEMORY_WIN32_SPEC_VERSION 1
#define VK_NV_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME "VK_NV_external_memory_win32"
typedef struct VkImportMemoryWin32HandleInfoNV {
VkStructureType sType;
const void* pNext;
VkExternalMemoryHandleTypeFlagsNV handleType;
HANDLE handle;
} VkImportMemoryWin32HandleInfoNV;
typedef struct VkExportMemoryWin32HandleInfoNV {
VkStructureType sType;
const void* pNext;
const SECURITY_ATTRIBUTES* pAttributes;
DWORD dwAccess;
} VkExportMemoryWin32HandleInfoNV;
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandleNV)(VkDevice device, VkDeviceMemory memory, VkExternalMemoryHandleTypeFlagsNV handleType, HANDLE* pHandle);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandleNV(
VkDevice device,
VkDeviceMemory memory,
VkExternalMemoryHandleTypeFlagsNV handleType,
HANDLE* pHandle);
#endif
#define VK_NV_win32_keyed_mutex 1
#define VK_NV_WIN32_KEYED_MUTEX_SPEC_VERSION 1
#define VK_NV_WIN32_KEYED_MUTEX_EXTENSION_NAME "VK_NV_win32_keyed_mutex"
typedef struct VkWin32KeyedMutexAcquireReleaseInfoNV {
VkStructureType sType;
const void* pNext;
uint32_t acquireCount;
const VkDeviceMemory* pAcquireSyncs;
const uint64_t* pAcquireKeys;
const uint32_t* pAcquireTimeoutMilliseconds;
uint32_t releaseCount;
const VkDeviceMemory* pReleaseSyncs;
const uint64_t* pReleaseKeys;
} VkWin32KeyedMutexAcquireReleaseInfoNV;
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,66 @@
#ifndef VULKAN_XCB_H_
#define VULKAN_XCB_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_xcb_surface 1
#define VK_KHR_XCB_SURFACE_SPEC_VERSION 6
#define VK_KHR_XCB_SURFACE_EXTENSION_NAME "VK_KHR_xcb_surface"
typedef VkFlags VkXcbSurfaceCreateFlagsKHR;
typedef struct VkXcbSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkXcbSurfaceCreateFlagsKHR flags;
xcb_connection_t* connection;
xcb_window_t window;
} VkXcbSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateXcbSurfaceKHR)(VkInstance instance, const VkXcbSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceXcbPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, xcb_connection_t* connection, xcb_visualid_t visual_id);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateXcbSurfaceKHR(
VkInstance instance,
const VkXcbSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceXcbPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
xcb_connection_t* connection,
xcb_visualid_t visual_id);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,66 @@
#ifndef VULKAN_XLIB_H_
#define VULKAN_XLIB_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_xlib_surface 1
#define VK_KHR_XLIB_SURFACE_SPEC_VERSION 6
#define VK_KHR_XLIB_SURFACE_EXTENSION_NAME "VK_KHR_xlib_surface"
typedef VkFlags VkXlibSurfaceCreateFlagsKHR;
typedef struct VkXlibSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkXlibSurfaceCreateFlagsKHR flags;
Display* dpy;
Window window;
} VkXlibSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateXlibSurfaceKHR)(VkInstance instance, const VkXlibSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceXlibPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, Display* dpy, VisualID visualID);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateXlibSurfaceKHR(
VkInstance instance,
const VkXlibSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceXlibPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
Display* dpy,
VisualID visualID);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,54 @@
#ifndef VULKAN_XLIB_RANDR_H_
#define VULKAN_XLIB_RANDR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2017 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_EXT_acquire_xlib_display 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"
typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);
typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
VkDisplayKHR display);
VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
RROutput rrOutput,
VkDisplayKHR* pDisplay);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -0,0 +1,54 @@
#ifndef VULKAN_XLIB_XRANDR_H_
#define VULKAN_XLIB_XRANDR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_EXT_acquire_xlib_display 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"
typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);
typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
VkDisplayKHR display);
VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
RROutput rrOutput,
VkDisplayKHR* pDisplay);
#endif
#ifdef __cplusplus
}
#endif
#endif

File diff suppressed because it is too large Load Diff

View File

@@ -1,4 +1,4 @@
# Copyright © 2017 Intel Corporation
# Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -20,8 +20,11 @@
option(
'platforms',
type : 'string',
value : 'auto',
type : 'array',
value : ['auto'],
choices : [
'', 'auto', 'x11', 'wayland', 'drm', 'surfaceless', 'haiku', 'android',
],
description : 'comma separated list of window systems to support. If this is set to auto all platforms applicable to the OS will be enabled.'
)
option(
@@ -33,9 +36,10 @@ option(
)
option(
'dri-drivers',
type : 'string',
value : 'auto',
description : 'comma separated list of dri drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
type : 'array',
value : ['auto'],
choices : ['', 'auto', 'i915', 'i965', 'r100', 'r200', 'nouveau', 'swrast'],
description : 'List of dri drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
)
option(
'dri-drivers-path',
@@ -51,9 +55,14 @@ option(
)
option(
'gallium-drivers',
type : 'string',
value : 'auto',
description : 'comma separated list of gallium drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
type : 'array',
value : ['auto'],
choices : [
'', 'auto', 'pl111', 'radeonsi', 'r300', 'r600', 'nouveau', 'freedreno',
'swrast', 'v3d', 'vc4', 'etnaviv', 'imx', 'tegra', 'i915', 'svga', 'virgl',
'swr',
],
description : 'List of gallium drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
)
option(
'gallium-extra-hud',
@@ -91,8 +100,8 @@ option(
'gallium-omx',
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'enable gallium omx bellagio state tracker.',
choices : ['auto', 'disabled', 'bellagio', 'tizonia'],
description : 'enable gallium omx state tracker.',
)
option(
'omx-libs-path',
@@ -141,9 +150,10 @@ option(
)
option(
'vulkan-drivers',
type : 'string',
value : 'auto',
description : 'comma separated list of vulkan drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
type : 'array',
value : ['auto'],
choices : ['', 'auto', 'amd', 'intel'],
description : 'List of vulkan drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'
)
option(
'shader-cache',
@@ -214,6 +224,12 @@ option(
value : true,
description : 'Build assembly code if possible'
)
option(
'glx-read-only-text',
type : 'boolean',
value : false,
description : 'Disable writable .text section on x86 (decreases performance)'
)
option(
'llvm',
type : 'combo',
@@ -248,12 +264,6 @@ option(
value : false,
description : 'Build unit tests. Currently this will build *all* unit tests, which may build more than expected.'
)
option(
'texture-float',
type : 'boolean',
value : false,
description : 'Enable floating point textures and renderbuffers. This option may be patent encumbered, please read docs/patents.txt and consult with your lawyer before turning this on.'
)
option(
'selinux',
type : 'boolean',
@@ -276,7 +286,22 @@ option(
)
option(
'swr-arches',
type : 'string',
value : 'avx,avx2',
description : 'Comma delemited swr architectures. choices : avx,avx2,knl,skx'
type : 'array',
value : ['avx', 'avx2'],
choices : ['avx', 'avx2', 'knl', 'skx'],
description : 'Architectures to build SWR support for.',
)
option(
'tools',
type : 'array',
value : [],
choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'],
description : 'List of tools to build.',
)
option(
'power8',
type : 'combo',
value : 'auto',
choices : ['auto', 'true', 'false'],
description : 'Enable power8 optimizations.',
)

View File

@@ -134,7 +134,9 @@ def check_cc(env, cc, expr, cpp_opt = '-E'):
source.write('#if !(%s)\n#error\n#endif\n' % expr)
source.close()
pipe = SCons.Action._subproc(env, [env['CC'], cpp_opt, source.name],
# sys.stderr.write('%r %s %s\n' % (env['CC'], cpp_opt, source.name));
pipe = SCons.Action._subproc(env, env.Split(env['CC']) + [cpp_opt, source.name],
stdin = 'devnull',
stderr = 'devnull',
stdout = 'devnull')
@@ -352,6 +354,9 @@ def generate(env):
if check_header(env, 'xlocale.h'):
cppdefines += ['HAVE_XLOCALE_H']
if check_header(env, 'endian.h'):
cppdefines += ['HAVE_ENDIAN_H']
if check_functions(env, ['strtod_l', 'strtof_l']):
cppdefines += ['HAVE_STRTOD_L']
@@ -387,10 +392,6 @@ def generate(env):
cppdefines += ['PIPE_SUBSYSTEM_WINDOWS_USER']
if env['embedded']:
cppdefines += ['PIPE_SUBSYSTEM_EMBEDDED']
if env['texture_float']:
print('warning: Floating-point textures enabled.')
print('warning: Please consult docs/patents.txt with your lawyer before building Mesa.')
cppdefines += ['TEXTURE_FLOAT_ENABLED']
env.Append(CPPDEFINES = cppdefines)
# C compiler options

View File

@@ -123,6 +123,10 @@ def generate(env):
'LLVMDemangle', 'LLVMGlobalISel', 'LLVMDebugInfoMSF',
'LLVMBinaryFormat',
])
if env['platform'] == 'windows' and env['crosscompile']:
# LLVM 5.0 requires MinGW w/ pthreads due to use of std::thread and friends.
assert env['gcc']
env['CXX'] = env['CXX'] + '-posix'
elif llvm_version >= distutils.version.LooseVersion('4.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
@@ -211,8 +215,11 @@ def generate(env):
'imagehlp',
'psapi',
'shell32',
'advapi32'
'advapi32',
'ole32',
'uuid',
])
if env['msvc']:
# Some of the LLVM C headers use the inline keyword without
# defining it.

View File

@@ -67,7 +67,6 @@ SUBDIRS += vulkan
endif
EXTRA_DIST += vulkan/registry/vk.xml
EXTRA_DIST += vulkan/registry/vk_android_native_buffer.xml
if HAVE_AMD_DRIVERS
SUBDIRS += amd
@@ -96,11 +95,6 @@ if HAVE_GBM
SUBDIRS += gbm
endif
## Optionally required by EGL
if HAVE_PLATFORM_WAYLAND
SUBDIRS += egl/wayland/wayland-egl
endif
if HAVE_EGL
SUBDIRS += egl
endif

View File

@@ -22,6 +22,7 @@
ADDRLIB_LIBS = addrlib/libamdgpu_addrlib.la
addrlib_libamdgpu_addrlib_la_CPPFLAGS = \
$(DEFINES) \
-I$(top_srcdir)/src/ \
-I$(srcdir)/common \
-I$(srcdir)/addrlib \

View File

@@ -45,8 +45,6 @@ AMD_COMPILER_FILES = \
common/ac_llvm_util.c \
common/ac_llvm_util.h \
common/ac_shader_abi.h \
common/ac_shader_info.c \
common/ac_shader_info.h \
common/ac_shader_util.c \
common/ac_shader_util.h

View File

@@ -1054,7 +1054,7 @@ ADDR_E_RETURNCODE ADDR_API AddrComputePrtInfo(
*/
ADDR_E_RETURNCODE ADDR_API AddrGetMaxAlignments(
ADDR_HANDLE hLib, ///< address lib handle
ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) ///< [out] output structure
ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut) ///< [out] output structure
{
Addr::Lib* pLib = Lib::GetLib(hLib);
@@ -1072,6 +1072,36 @@ ADDR_E_RETURNCODE ADDR_API AddrGetMaxAlignments(
return returnCode;
}
/**
****************************************************************************************************
* AddrGetMaxMetaAlignments
*
* @brief
* Convert maximum alignments for metadata
*
* @return
* ADDR_OK if successful, otherwise an error code of ADDR_E_RETURNCODE
****************************************************************************************************
*/
ADDR_E_RETURNCODE ADDR_API AddrGetMaxMetaAlignments(
ADDR_HANDLE hLib, ///< address lib handle
ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut) ///< [out] output structure
{
Addr::Lib* pLib = Lib::GetLib(hLib);
ADDR_E_RETURNCODE returnCode = ADDR_OK;
if (pLib != NULL)
{
returnCode = pLib->GetMaxMetaAlignments(pOut);
}
else
{
returnCode = ADDR_ERROR;
}
return returnCode;
}
////////////////////////////////////////////////////////////////////////////////////////////////////

View File

@@ -528,7 +528,8 @@ typedef union _ADDR_SURFACE_FLAGS
UINT_32 preferEquation : 1; ///< Return equation index without adjusting tile mode
UINT_32 matchStencilTileCfg : 1; ///< Select tile index of stencil as well as depth surface
/// to make sure they share same tile config parameters
UINT_32 reserved : 2; ///< Reserved bits
UINT_32 disallowLargeThickDegrade : 1; ///< Disallow large thick tile degrade
UINT_32 reserved : 1; ///< Reserved bits
};
UINT_32 value;
@@ -2273,7 +2274,7 @@ typedef struct _ADDR_COMPUTE_DCCINFO_INPUT
typedef struct _ADDR_COMPUTE_DCCINFO_OUTPUT
{
UINT_32 size; ///< Size of this structure in bytes
UINT_64 dccRamBaseAlign; ///< Base alignment of dcc key
UINT_32 dccRamBaseAlign; ///< Base alignment of dcc key
UINT_64 dccRamSize; ///< Size of dcc key
UINT_64 dccFastClearSize; ///< Size of dcc key portion that can be fast cleared
BOOL_32 subLvlCompressible; ///< Whether sub resource is compressiable
@@ -2298,17 +2299,17 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeDccInfo(
/**
****************************************************************************************************
* ADDR_GET_MAX_ALIGNMENTS_OUTPUT
* ADDR_GET_MAX_ALINGMENTS_OUTPUT
*
* @brief
* Output structure of AddrGetMaxAlignments
****************************************************************************************************
*/
typedef struct _ADDR_GET_MAX_ALIGNMENTS_OUTPUT
typedef struct _ADDR_GET_MAX_ALINGMENTS_OUTPUT
{
UINT_32 size; ///< Size of this structure in bytes
UINT_64 baseAlign; ///< Maximum base alignment in bytes
} ADDR_GET_MAX_ALIGNMENTS_OUTPUT;
UINT_32 baseAlign; ///< Maximum base alignment in bytes
} ADDR_GET_MAX_ALINGMENTS_OUTPUT;
/**
****************************************************************************************************
@@ -2320,9 +2321,19 @@ typedef struct _ADDR_GET_MAX_ALIGNMENTS_OUTPUT
*/
ADDR_E_RETURNCODE ADDR_API AddrGetMaxAlignments(
ADDR_HANDLE hLib,
ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut);
ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut);
/**
****************************************************************************************************
* AddrGetMaxMetaAlignments
*
* @brief
* Gets maximnum alignments for metadata
****************************************************************************************************
*/
ADDR_E_RETURNCODE ADDR_API AddrGetMaxMetaAlignments(
ADDR_HANDLE hLib,
ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut);
/**
****************************************************************************************************
@@ -2366,22 +2377,25 @@ typedef union _ADDR2_SURFACE_FLAGS
{
struct
{
UINT_32 color : 1; ///< This resource is a color buffer, can be used with RTV
UINT_32 depth : 1; ///< Thie resource is a depth buffer, can be used with DSV
UINT_32 stencil : 1; ///< Thie resource is a stencil buffer, can be used with DSV
UINT_32 fmask : 1; ///< This is an fmask surface
UINT_32 overlay : 1; ///< This is an overlay surface
UINT_32 display : 1; ///< This resource is displable, can be used with DRV
UINT_32 prt : 1; ///< This is a partially resident texture
UINT_32 qbStereo : 1; ///< This is a quad buffer stereo surface
UINT_32 interleaved : 1; ///< Special flag for interleaved YUV surface padding
UINT_32 texture : 1; ///< This resource can be used with SRV
UINT_32 unordered : 1; ///< This resource can be used with UAV
UINT_32 rotated : 1; ///< This resource is rotated and displable
UINT_32 needEquation : 1; ///< This resource needs equation to be generated if possible
UINT_32 opt4space : 1; ///< This resource should be optimized for space
UINT_32 minimizeAlign : 1; ///< This resource should use minimum alignment
UINT_32 reserved : 17; ///< Reserved bits
UINT_32 color : 1; ///< This resource is a color buffer, can be used with RTV
UINT_32 depth : 1; ///< Thie resource is a depth buffer, can be used with DSV
UINT_32 stencil : 1; ///< Thie resource is a stencil buffer, can be used with DSV
UINT_32 fmask : 1; ///< This is an fmask surface
UINT_32 overlay : 1; ///< This is an overlay surface
UINT_32 display : 1; ///< This resource is displable, can be used with DRV
UINT_32 prt : 1; ///< This is a partially resident texture
UINT_32 qbStereo : 1; ///< This is a quad buffer stereo surface
UINT_32 interleaved : 1; ///< Special flag for interleaved YUV surface padding
UINT_32 texture : 1; ///< This resource can be used with SRV
UINT_32 unordered : 1; ///< This resource can be used with UAV
UINT_32 rotated : 1; ///< This resource is rotated and displable
UINT_32 needEquation : 1; ///< This resource needs equation to be generated if possible
UINT_32 opt4space : 1; ///< This resource should be optimized for space
UINT_32 minimizeAlign : 1; ///< This resource should use minimum alignment
UINT_32 noMetadata : 1; ///< This resource has no metadata
UINT_32 metaRbUnaligned : 1; ///< This resource has rb unaligned metadata
UINT_32 metaPipeUnaligned : 1; ///< This resource has pipe unaligned metadata
UINT_32 reserved : 14; ///< Reserved bits
};
UINT_32 value;

View File

@@ -76,7 +76,7 @@ typedef int INT;
#ifndef ADDR_STDCALL
#if defined(__GNUC__)
#if defined(__AMD64__)
#if defined(__amd64__) || defined(__x86_64__)
#define ADDR_STDCALL
#else
#define ADDR_STDCALL __attribute__((stdcall))
@@ -87,7 +87,9 @@ typedef int INT;
#endif
#ifndef ADDR_FASTCALL
#if defined(__GNUC__)
#if defined(BRAHMA_ARM)
#define ADDR_FASTCALL
#elif defined(__GNUC__)
#if defined(__i386__)
#define ADDR_FASTCALL __attribute__((regparm(0)))
#else

View File

@@ -79,12 +79,14 @@
#define AMDGPU_POLARIS10_RANGE 0x50, 0x5A
#define AMDGPU_POLARIS11_RANGE 0x5A, 0x64
#define AMDGPU_POLARIS12_RANGE 0x64, 0x6E
#define AMDGPU_VEGAM_RANGE 0x6E, 0xFF
#define AMDGPU_CARRIZO_RANGE 0x01, 0x21
#define AMDGPU_BRISTOL_RANGE 0x10, 0x21
#define AMDGPU_STONEY_RANGE 0x61, 0xFF
#define AMDGPU_VEGA10_RANGE 0x01, 0x14
#define AMDGPU_VEGA12_RANGE 0x14, 0x28
#define AMDGPU_RAVEN_RANGE 0x01, 0x81
@@ -116,6 +118,7 @@
#define ASICREV_IS_POLARIS10_P(r) ASICREV_IS(r, POLARIS10)
#define ASICREV_IS_POLARIS11_M(r) ASICREV_IS(r, POLARIS11)
#define ASICREV_IS_POLARIS12_V(r) ASICREV_IS(r, POLARIS12)
#define ASICREV_IS_VEGAM_P(r) ASICREV_IS(r, VEGAM)
#define ASICREV_IS_CARRIZO(r) ASICREV_IS(r, CARRIZO)
#define ASICREV_IS_CARRIZO_BRISTOL(r) ASICREV_IS(r, BRISTOL)
@@ -123,6 +126,8 @@
#define ASICREV_IS_VEGA10_M(r) ASICREV_IS(r, VEGA10)
#define ASICREV_IS_VEGA10_P(r) ASICREV_IS(r, VEGA10)
#define ASICREV_IS_VEGA12_P(r) ASICREV_IS(r, VEGA12)
#define ASICREV_IS_VEGA12_p(r) ASICREV_IS(r, VEGA12)
#define ASICREV_IS_RAVEN(r) ASICREV_IS(r, RAVEN)

View File

@@ -285,10 +285,12 @@ ADDR_E_RETURNCODE Lib::Create(
{
pCreateOut->numEquations =
pLib->HwlGetEquationTableInfo(&pCreateOut->pEquationTable);
}
if ((pLib == NULL) &&
(returnCode == ADDR_OK))
pLib->SetMaxAlignments();
}
else if ((pLib == NULL) &&
(returnCode == ADDR_OK))
{
// Unknown failures, we return the general error code
returnCode = ADDR_ERROR;
@@ -336,6 +338,23 @@ VOID Lib::SetMinPitchAlignPixels(
m_minPitchAlignPixels = (minPitchAlignPixels == 0) ? 1 : minPitchAlignPixels;
}
/**
****************************************************************************************************
* Lib::SetMaxAlignments
*
* @brief
* Set max alignments
*
* @return
* N/A
****************************************************************************************************
*/
VOID Lib::SetMaxAlignments()
{
m_maxBaseAlign = HwlComputeMaxBaseAlignments();
m_maxMetaBaseAlign = HwlComputeMaxMetaBaseAlignments();
}
/**
****************************************************************************************************
* Lib::GetLib
@@ -358,21 +377,21 @@ Lib* Lib::GetLib(
* Lib::GetMaxAlignments
*
* @brief
* Gets maximum alignments
* Gets maximum alignments for data surface (include FMask)
*
* @return
* ADDR_E_RETURNCODE
****************************************************************************************************
*/
ADDR_E_RETURNCODE Lib::GetMaxAlignments(
ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut ///< [out] output structure
ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut ///< [out] output structure
) const
{
ADDR_E_RETURNCODE returnCode = ADDR_OK;
if (GetFillSizeFieldsFlags() == TRUE)
{
if (pOut->size != sizeof(ADDR_GET_MAX_ALIGNMENTS_OUTPUT))
if (pOut->size != sizeof(ADDR_GET_MAX_ALINGMENTS_OUTPUT))
{
returnCode = ADDR_PARAMSIZEMISMATCH;
}
@@ -380,7 +399,54 @@ ADDR_E_RETURNCODE Lib::GetMaxAlignments(
if (returnCode == ADDR_OK)
{
returnCode = HwlGetMaxAlignments(pOut);
if (m_maxBaseAlign != 0)
{
pOut->baseAlign = m_maxBaseAlign;
}
else
{
returnCode = ADDR_NOTIMPLEMENTED;
}
}
return returnCode;
}
/**
****************************************************************************************************
* Lib::GetMaxMetaAlignments
*
* @brief
* Gets maximum alignments for metadata (CMask, DCC and HTile)
*
* @return
* ADDR_E_RETURNCODE
****************************************************************************************************
*/
ADDR_E_RETURNCODE Lib::GetMaxMetaAlignments(
ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut ///< [out] output structure
) const
{
ADDR_E_RETURNCODE returnCode = ADDR_OK;
if (GetFillSizeFieldsFlags() == TRUE)
{
if (pOut->size != sizeof(ADDR_GET_MAX_ALINGMENTS_OUTPUT))
{
returnCode = ADDR_PARAMSIZEMISMATCH;
}
}
if (returnCode == ADDR_OK)
{
if (m_maxMetaBaseAlign != 0)
{
pOut->baseAlign = m_maxMetaBaseAlign;
}
else
{
returnCode = ADDR_NOTIMPLEMENTED;
}
}
return returnCode;

View File

@@ -282,14 +282,38 @@ public:
BOOL_32 GetExportNorm(const ELEM_GETEXPORTNORM_INPUT* pIn) const;
ADDR_E_RETURNCODE GetMaxAlignments(ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) const;
ADDR_E_RETURNCODE GetMaxAlignments(ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut) const;
ADDR_E_RETURNCODE GetMaxMetaAlignments(ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut) const;
protected:
Lib(); // Constructor is protected
Lib(const Client* pClient);
/// Pure virtual function to get max alignments
virtual ADDR_E_RETURNCODE HwlGetMaxAlignments(ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) const = 0;
/// Pure virtual function to get max base alignments
virtual UINT_32 HwlComputeMaxBaseAlignments() const = 0;
/// Gets maximum alignements for metadata
virtual UINT_32 HwlComputeMaxMetaBaseAlignments() const
{
ADDR_NOT_IMPLEMENTED();
return 0;
}
VOID ValidBaseAlignments(UINT_32 alignment) const
{
#if DEBUG
ADDR_ASSERT(alignment <= m_maxBaseAlign);
#endif
}
VOID ValidMetaBaseAlignments(UINT_32 metaAlignment) const
{
#if DEBUG
ADDR_ASSERT(metaAlignment <= m_maxMetaBaseAlign);
#endif
}
//
// Initialization
@@ -341,6 +365,8 @@ private:
VOID SetMinPitchAlignPixels(UINT_32 minPitchAlignPixels);
VOID SetMaxAlignments();
protected:
LibClass m_class; ///< Store class type (HWL type)
@@ -370,6 +396,10 @@ protected:
UINT_32 m_minPitchAlignPixels; ///< Minimum pitch alignment in pixels
UINT_32 m_maxSamples; ///< Max numSamples
UINT_32 m_maxBaseAlign; ///< Max base alignment for data surface
UINT_32 m_maxMetaBaseAlign; ///< Max base alignment for metadata
private:
ElemLib* m_pElemLib; ///< Element Lib pointer
};

View File

@@ -428,6 +428,8 @@ ADDR_E_RETURNCODE Lib::ComputeSurfaceInfo(
}
}
ValidBaseAlignments(pOut->baseAlign);
return returnCode;
}
@@ -895,6 +897,8 @@ ADDR_E_RETURNCODE Lib::ComputeFmaskInfo(
}
}
ValidBaseAlignments(pOut->baseAlign);
return returnCode;
}
@@ -1333,6 +1337,8 @@ ADDR_E_RETURNCODE Lib::ComputeHtileInfo(
}
}
ValidMetaBaseAlignments(pOut->baseAlign);
return returnCode;
}
@@ -1399,6 +1405,8 @@ ADDR_E_RETURNCODE Lib::ComputeCmaskInfo(
}
}
ValidMetaBaseAlignments(pOut->baseAlign);
return returnCode;
}
@@ -1443,9 +1451,11 @@ ADDR_E_RETURNCODE Lib::ComputeDccInfo(
pIn = &input;
}
if (ADDR_OK == ret)
if (ret == ADDR_OK)
{
ret = HwlComputeDccInfo(pIn, pOut);
ValidMetaBaseAlignments(pOut->dccRamBaseAlign);
}
}
@@ -3652,7 +3662,7 @@ VOID Lib::OptimizeTileMode(
tileMode = (thickness == 1) ?
ADDR_TM_1D_TILED_THIN1 : ADDR_TM_1D_TILED_THICK;
}
else if (thickness > 1)
else if ((thickness > 1) && (pInOut->flags.disallowLargeThickDegrade == 0))
{
// As in the following HwlComputeSurfaceInfo, thick modes may be degraded to
// thinner modes, we should re-evaluate whether the corresponding

View File

@@ -295,6 +295,8 @@ ADDR_E_RETURNCODE Lib::ComputeSurfaceInfo(
ADDR_ASSERT(pOut->surfSize != 0);
ValidBaseAlignments(pOut->baseAlign);
return returnCode;
}
@@ -447,6 +449,8 @@ ADDR_E_RETURNCODE Lib::ComputeHtileInfo(
else
{
returnCode = HwlComputeHtileInfo(pIn, pOut);
ValidMetaBaseAlignments(pOut->baseAlign);
}
return returnCode;
@@ -545,6 +549,8 @@ ADDR_E_RETURNCODE Lib::ComputeCmaskInfo(
else
{
returnCode = HwlComputeCmaskInfo(pIn, pOut);
ValidMetaBaseAlignments(pOut->baseAlign);
}
return returnCode;
@@ -688,6 +694,8 @@ ADDR_E_RETURNCODE Lib::ComputeFmaskInfo(
}
}
ValidBaseAlignments(pOut->baseAlign);
return returnCode;
}
@@ -764,6 +772,8 @@ ADDR_E_RETURNCODE Lib::ComputeDccInfo(
else
{
returnCode = HwlComputeDccInfo(pIn, pOut);
ValidMetaBaseAlignments(pOut->dccRamBaseAlign);
}
return returnCode;

View File

@@ -480,12 +480,6 @@ protected:
return HwlGetEquationIndex(pIn, pOut);
}
virtual UINT_32 HwlComputeSurfaceBaseAlign(AddrSwizzleMode swizzleMode) const
{
ADDR_NOT_IMPLEMENTED();
return 0;
}
virtual ADDR_E_RETURNCODE HwlComputePipeBankXor(
const ADDR2_COMPUTE_PIPEBANKXOR_INPUT* pIn,
ADDR2_COMPUTE_PIPEBANKXOR_OUTPUT* pOut) const

View File

@@ -189,10 +189,10 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeHtileInfo(
numCompressBlkPerMetaBlk = 1 << numCompressBlkPerMetaBlkLog2;
Dim3d metaBlkDim = {8, 8, 1};
Dim3d metaBlkDim = {8, 8, 1};
UINT_32 totalAmpBits = numCompressBlkPerMetaBlkLog2;
UINT_32 widthAmp = (pIn->numMipLevels > 1) ? (totalAmpBits >> 1) : RoundHalf(totalAmpBits);
UINT_32 heightAmp = totalAmpBits - widthAmp;
UINT_32 widthAmp = (pIn->numMipLevels > 1) ? (totalAmpBits >> 1) : RoundHalf(totalAmpBits);
UINT_32 heightAmp = totalAmpBits - widthAmp;
metaBlkDim.w <<= widthAmp;
metaBlkDim.h <<= heightAmp;
@@ -221,39 +221,42 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeHtileInfo(
pIn->unalignedWidth, pIn->unalignedHeight, pIn->numSlices,
&numMetaBlkX, &numMetaBlkY, &numMetaBlkZ);
UINT_32 sizeAlign = numPipeTotal * numRbTotal * m_pipeInterleaveBytes;
const UINT_32 metaBlkSize = numCompressBlkPerMetaBlk << 2;
UINT_32 align = numPipeTotal * numRbTotal * m_pipeInterleaveBytes;
if ((IsXor(pIn->swizzleMode) == FALSE) && (numPipeTotal > 2))
{
align *= (numPipeTotal >> 1);
}
align = Max(align, metaBlkSize);
if (m_settings.metaBaseAlignFix)
{
align = Max(align, GetBlockSize(pIn->swizzleMode));
}
if (m_settings.htileAlignFix)
{
sizeAlign <<= 1;
const INT_32 metaBlkSizeLog2 = numCompressBlkPerMetaBlkLog2 + 2;
const INT_32 htileCachelineSizeLog2 = 11;
const INT_32 maxNumOfRbMaskBits = 1 + Log2(numPipeTotal) + Log2(numRbTotal);
INT_32 rbMaskPadding = Max(0, htileCachelineSizeLog2 - (metaBlkSizeLog2 - maxNumOfRbMaskBits));
align <<= rbMaskPadding;
}
pOut->pitch = numMetaBlkX * metaBlkDim.w;
pOut->height = numMetaBlkY * metaBlkDim.h;
pOut->sliceSize = numMetaBlkX * numMetaBlkY * numCompressBlkPerMetaBlk * 4;
pOut->sliceSize = numMetaBlkX * numMetaBlkY * metaBlkSize;
pOut->metaBlkWidth = metaBlkDim.w;
pOut->metaBlkHeight = metaBlkDim.h;
pOut->metaBlkWidth = metaBlkDim.w;
pOut->metaBlkHeight = metaBlkDim.h;
pOut->metaBlkNumPerSlice = numMetaBlkX * numMetaBlkY;
pOut->baseAlign = Max(numCompressBlkPerMetaBlk * 4, sizeAlign);
if (m_settings.metaBaseAlignFix)
{
pOut->baseAlign = Max(pOut->baseAlign, GetBlockSize(pIn->swizzleMode));
}
if ((IsXor(pIn->swizzleMode) == FALSE) && (numPipeTotal > 2))
{
UINT_32 additionalAlign = numPipeTotal * numCompressBlkPerMetaBlk * 2;
if (additionalAlign > sizeAlign)
{
sizeAlign = additionalAlign;
}
}
pOut->htileBytes = PowTwoAlign(pOut->sliceSize * numMetaBlkZ, sizeAlign);
pOut->baseAlign = align;
pOut->htileBytes = PowTwoAlign(pOut->sliceSize * numMetaBlkZ, align);
return ADDR_OK;
}
@@ -333,17 +336,17 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeCmaskInfo(
UINT_32 sizeAlign = numPipeTotal * numRbTotal * m_pipeInterleaveBytes;
if (m_settings.metaBaseAlignFix)
{
sizeAlign = Max(sizeAlign, GetBlockSize(pIn->swizzleMode));
}
pOut->pitch = numMetaBlkX * metaBlkDim.w;
pOut->height = numMetaBlkY * metaBlkDim.h;
pOut->sliceSize = (numMetaBlkX * numMetaBlkY * numCompressBlkPerMetaBlk) >> 1;
pOut->cmaskBytes = PowTwoAlign(pOut->sliceSize * numMetaBlkZ, sizeAlign);
pOut->baseAlign = Max(numCompressBlkPerMetaBlk >> 1, sizeAlign);
if (m_settings.metaBaseAlignFix)
{
pOut->baseAlign = Max(pOut->baseAlign, GetBlockSize(pIn->swizzleMode));
}
pOut->metaBlkWidth = metaBlkDim.w;
pOut->metaBlkHeight = metaBlkDim.h;
@@ -638,16 +641,16 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeDccInfo(
sizeAlign *= (numFrags / m_maxCompFrag);
}
if (m_settings.metaBaseAlignFix)
{
sizeAlign = Max(sizeAlign, GetBlockSize(pIn->swizzleMode));
}
pOut->dccRamSize = numMetaBlkX * numMetaBlkY * numMetaBlkZ *
numCompressBlkPerMetaBlk * numFrags;
pOut->dccRamSize = PowTwoAlign(pOut->dccRamSize, sizeAlign);
pOut->dccRamBaseAlign = Max(numCompressBlkPerMetaBlk, sizeAlign);
if (m_settings.metaBaseAlignFix)
{
pOut->dccRamBaseAlign = Max(pOut->dccRamBaseAlign, GetBlockSize(pIn->swizzleMode));
}
pOut->pitch = numMetaBlkX * metaBlkDim.w;
pOut->height = numMetaBlkY * metaBlkDim.h;
pOut->depth = numMetaBlkZ * metaBlkDim.d;
@@ -670,21 +673,78 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeDccInfo(
/**
************************************************************************************************************************
* Gfx9Lib::HwlGetMaxAlignments
* Gfx9Lib::HwlComputeMaxBaseAlignments
*
* @brief
* Gets maximum alignments
* @return
* ADDR_E_RETURNCODE
* maximum alignments
************************************************************************************************************************
*/
ADDR_E_RETURNCODE Gfx9Lib::HwlGetMaxAlignments(
ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut ///< [out] output structure
) const
UINT_32 Gfx9Lib::HwlComputeMaxBaseAlignments() const
{
pOut->baseAlign = HwlComputeSurfaceBaseAlign(ADDR_SW_64KB);
return ComputeSurfaceBaseAlignTiled(ADDR_SW_64KB);
}
return ADDR_OK;
/**
************************************************************************************************************************
* Gfx9Lib::HwlComputeMaxMetaBaseAlignments
*
* @brief
* Gets maximum alignments for metadata
* @return
* maximum alignments for metadata
************************************************************************************************************************
*/
UINT_32 Gfx9Lib::HwlComputeMaxMetaBaseAlignments() const
{
// Max base alignment for Htile
const UINT_32 maxNumPipeTotal = GetPipeNumForMetaAddressing(TRUE, ADDR_SW_64KB_Z);
const UINT_32 maxNumRbTotal = m_se * m_rbPerSe;
// If applyAliasFix was set, the extra bits should be MAX(10u, m_pipeInterleaveLog2),
// but we never saw any ASIC whose m_pipeInterleaveLog2 != 8, so just put an assertion and simply the logic.
ADDR_ASSERT((m_settings.applyAliasFix == FALSE) || (m_pipeInterleaveLog2 <= 10u));
const UINT_32 maxNumCompressBlkPerMetaBlk = 1u << (m_seLog2 + m_rbPerSeLog2 + 10u);
UINT_32 maxBaseAlignHtile = maxNumPipeTotal * maxNumRbTotal * m_pipeInterleaveBytes;
if (maxNumPipeTotal > 2)
{
maxBaseAlignHtile *= (maxNumPipeTotal >> 1);
}
maxBaseAlignHtile = Max(maxNumCompressBlkPerMetaBlk << 2, maxBaseAlignHtile);
if (m_settings.metaBaseAlignFix)
{
maxBaseAlignHtile = Max(maxBaseAlignHtile, GetBlockSize(ADDR_SW_64KB));
}
if (m_settings.htileAlignFix)
{
maxBaseAlignHtile *= maxNumPipeTotal;
}
// Max base alignment for Cmask will not be larger than that for Htile, no need to calculate
// Max base alignment for 2D Dcc will not be larger than that for 3D, no need to calculate
UINT_32 maxBaseAlignDcc3D = 65536;
if ((maxNumPipeTotal > 1) || (maxNumRbTotal > 1))
{
maxBaseAlignDcc3D = Min(m_se * m_rbPerSe * 262144, 65536 * 128u);
}
// Max base alignment for Msaa Dcc
UINT_32 maxBaseAlignDccMsaa = maxNumPipeTotal * maxNumRbTotal * m_pipeInterleaveBytes * (8 / m_maxCompFrag);
if (m_settings.metaBaseAlignFix)
{
maxBaseAlignDccMsaa = Max(maxBaseAlignDccMsaa, GetBlockSize(ADDR_SW_64KB));
}
return Max(maxBaseAlignHtile, Max(maxBaseAlignDccMsaa, maxBaseAlignDcc3D));
}
/**
@@ -724,9 +784,11 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeCmaskAddrFromCoord(
UINT_32 metaBlkWidthLog2 = Log2(output.metaBlkWidth);
UINT_32 metaBlkHeightLog2 = Log2(output.metaBlkHeight);
const CoordEq* pMetaEq = GetMetaEquation({0, fmaskElementBytesLog2, 0, pIn->cMaskFlags,
Gfx9DataFmask, pIn->swizzleMode, pIn->resourceType,
metaBlkWidthLog2, metaBlkHeightLog2, 0, 3, 3, 0});
MetaEqParams metaEqParams = {0, fmaskElementBytesLog2, 0, pIn->cMaskFlags,
Gfx9DataFmask, pIn->swizzleMode, pIn->resourceType,
metaBlkWidthLog2, metaBlkHeightLog2, 0, 3, 3, 0};
const CoordEq* pMetaEq = GetMetaEquation(metaEqParams);
UINT_32 xb = pIn->x / output.metaBlkWidth;
UINT_32 yb = pIn->y / output.metaBlkHeight;
@@ -798,9 +860,11 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeHtileAddrFromCoord(
UINT_32 metaBlkHeightLog2 = Log2(output.metaBlkHeight);
UINT_32 numSamplesLog2 = Log2(pIn->numSamples);
const CoordEq* pMetaEq = GetMetaEquation({0, elementBytesLog2, numSamplesLog2, pIn->hTileFlags,
Gfx9DataDepthStencil, pIn->swizzleMode, ADDR_RSRC_TEX_2D,
metaBlkWidthLog2, metaBlkHeightLog2, 0, 3, 3, 0});
MetaEqParams metaEqParams = {0, elementBytesLog2, numSamplesLog2, pIn->hTileFlags,
Gfx9DataDepthStencil, pIn->swizzleMode, ADDR_RSRC_TEX_2D,
metaBlkWidthLog2, metaBlkHeightLog2, 0, 3, 3, 0};
const CoordEq* pMetaEq = GetMetaEquation(metaEqParams);
UINT_32 xb = pIn->x / output.metaBlkWidth;
UINT_32 yb = pIn->y / output.metaBlkHeight;
@@ -870,9 +934,11 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeHtileCoordFromAddr(
UINT_32 metaBlkHeightLog2 = Log2(output.metaBlkHeight);
UINT_32 numSamplesLog2 = Log2(pIn->numSamples);
const CoordEq* pMetaEq = GetMetaEquation({0, elementBytesLog2, numSamplesLog2, pIn->hTileFlags,
Gfx9DataDepthStencil, pIn->swizzleMode, ADDR_RSRC_TEX_2D,
metaBlkWidthLog2, metaBlkHeightLog2, 0, 3, 3, 0});
MetaEqParams metaEqParams = {0, elementBytesLog2, numSamplesLog2, pIn->hTileFlags,
Gfx9DataDepthStencil, pIn->swizzleMode, ADDR_RSRC_TEX_2D,
metaBlkWidthLog2, metaBlkHeightLog2, 0, 3, 3, 0};
const CoordEq* pMetaEq = GetMetaEquation(metaEqParams);
UINT_32 numPipeBits = GetPipeLog2ForMetaAddressing(pIn->hTileFlags.pipeAligned,
pIn->swizzleMode);
@@ -948,10 +1014,12 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeDccAddrFromCoord(
UINT_32 compBlkHeightLog2 = Log2(output.compressBlkHeight);
UINT_32 compBlkDepthLog2 = Log2(output.compressBlkDepth);
const CoordEq* pMetaEq = GetMetaEquation({pIn->mipId, elementBytesLog2, numSamplesLog2, pIn->dccKeyFlags,
Gfx9DataColor, pIn->swizzleMode, pIn->resourceType,
metaBlkWidthLog2, metaBlkHeightLog2, metaBlkDepthLog2,
compBlkWidthLog2, compBlkHeightLog2, compBlkDepthLog2});
MetaEqParams metaEqParams = {pIn->mipId, elementBytesLog2, numSamplesLog2, pIn->dccKeyFlags,
Gfx9DataColor, pIn->swizzleMode, pIn->resourceType,
metaBlkWidthLog2, metaBlkHeightLog2, metaBlkDepthLog2,
compBlkWidthLog2, compBlkHeightLog2, compBlkDepthLog2};
const CoordEq* pMetaEq = GetMetaEquation(metaEqParams);
UINT_32 xb = pIn->x / output.metaBlkWidth;
UINT_32 yb = pIn->y / output.metaBlkHeight;
@@ -1055,6 +1123,10 @@ BOOL_32 Gfx9Lib::HwlInitGlobalParams(
break;
}
// Addr::V2::Lib::ComputePipeBankXor()/ComputeSlicePipeBankXor() requires pipe interleave to be exactly 8 bits,
// and any larger value requires a post-process (left shift) on the output pipeBankXor bits.
ADDR_ASSERT(m_pipeInterleaveBytes == ADDR_PIPEINTERLEAVE_256B);
switch (gbAddrConfig.bits.NUM_BANKS)
{
case ADDR_CONFIG_1_BANK:
@@ -1151,6 +1223,19 @@ BOOL_32 Gfx9Lib::HwlInitGlobalParams(
ADDR_ASSERT((m_blockVarSizeLog2 == 0) ||
((m_blockVarSizeLog2 >= 17u) && (m_blockVarSizeLog2 <= 20u)));
m_blockVarSizeLog2 = Min(Max(17u, m_blockVarSizeLog2), 20u);
if ((m_rbPerSeLog2 == 1) &&
(((m_pipesLog2 == 1) && ((m_seLog2 == 2) || (m_seLog2 == 3))) ||
((m_pipesLog2 == 2) && ((m_seLog2 == 1) || (m_seLog2 == 2)))))
{
ADDR_ASSERT(m_settings.isVega10 == FALSE);
ADDR_ASSERT(m_settings.isRaven == FALSE);
if (m_settings.isVega12)
{
m_settings.htileCacheRbConflict = 1;
}
}
}
else
{
@@ -1187,6 +1272,7 @@ ChipFamily Gfx9Lib::HwlConvertChipFamily(
case FAMILY_AI:
m_settings.isArcticIsland = 1;
m_settings.isVega10 = ASICREV_IS_VEGA10_P(uChipRevision);
m_settings.isVega12 = ASICREV_IS_VEGA12_P(uChipRevision);
m_settings.isDce12 = 1;
@@ -3279,10 +3365,11 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlGetPreferredSurfaceSetting(
addrPreferredSwSet.value = AddrSwSetZ;
addrValidSwSet.value = AddrSwSetZ;
if (pIn->flags.depth && pIn->flags.texture)
if (pIn->flags.noMetadata == FALSE)
{
if (((bpp == 16) && (numFrags >= 4)) ||
((bpp == 32) && (numFrags >= 2)))
if (pIn->flags.depth &&
pIn->flags.texture &&
(((bpp == 16) && (numFrags >= 4)) || ((bpp == 32) && (numFrags >= 2))))
{
// When _X/_T swizzle mode was used for MSAA depth texture, TC will get zplane
// equation from wrong address within memory range a tile covered and use the
@@ -3290,6 +3377,16 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlGetPreferredSurfaceSetting(
pOut->canXor = FALSE;
prtXor = FALSE;
}
if (m_settings.htileCacheRbConflict &&
(pIn->flags.depth || pIn->flags.stencil) &&
(slice > 1) &&
(pIn->flags.metaRbUnaligned == FALSE) &&
(pIn->flags.metaPipeUnaligned == FALSE))
{
// Z_X 2D array with Rb/Pipe aligned HTile won't have metadata cache coherency
pOut->canXor = FALSE;
}
}
}
else if (ElemLib::IsBlockCompressed(pIn->format))
@@ -3402,12 +3499,12 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlGetPreferredSurfaceSetting(
if (pIn->bpp == 64)
{
addrPreferredSwSet.value = AddrSwSetD;
addrValidSwSet.value = AddrSwSetD;
addrValidSwSet.value = AddrSwSetS | AddrSwSetD;
}
else
{
addrPreferredSwSet.value = AddrSwSetS;
addrValidSwSet.value = AddrSwSetS | AddrSwSetD;
addrValidSwSet.value = AddrSwSetS;
}
blockSet.micro = FALSE;
@@ -4037,7 +4134,7 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeSurfaceInfoTiled(
pOut->sliceSize = static_cast<UINT_64>(pOut->mipChainPitch) * pOut->mipChainHeight *
(pIn->bpp >> 3) * pIn->numFrags;
pOut->surfSize = pOut->sliceSize * pOut->mipChainSlice;
pOut->baseAlign = HwlComputeSurfaceBaseAlign(pIn->swizzleMode);
pOut->baseAlign = ComputeSurfaceBaseAlignTiled(pIn->swizzleMode);
if (pIn->flags.prt)
{
@@ -4762,15 +4859,12 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeSurfaceAddrFromCoordTiled(
UINT_32 pitchInMacroBlock = localOut.mipChainPitch / localOut.blockWidth;
UINT_32 paddedHeightInMacroBlock = localOut.mipChainHeight / localOut.blockHeight;
UINT_32 sliceSizeInMacroBlock = pitchInMacroBlock * paddedHeightInMacroBlock;
UINT_32 macroBlockIndex =
UINT_64 macroBlockIndex =
(pIn->slice + mipStartPos.d) * sliceSizeInMacroBlock +
((pIn->y / localOut.blockHeight) + mipStartPos.h) * pitchInMacroBlock +
((pIn->x / localOut.blockWidth) + mipStartPos.w);
UINT_64 macroBlockOffset = (static_cast<UINT_64>(macroBlockIndex) <<
GetBlockSizeLog2(pIn->swizzleMode));
pOut->addr = blockOffset | macroBlockOffset;
pOut->addr = blockOffset | (macroBlockIndex << log2blkSize);
}
else
{
@@ -4835,7 +4929,7 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeSurfaceAddrFromCoordTiled(
UINT_32 pitchInBlock = localOut.mipChainPitch / localOut.blockWidth;
UINT_32 sliceSizeInBlock =
(localOut.mipChainHeight / localOut.blockHeight) * pitchInBlock;
UINT_32 blockIndex = zb * sliceSizeInBlock + yb * pitchInBlock + xb;
UINT_64 blockIndex = zb * sliceSizeInBlock + yb * pitchInBlock + xb;
pOut->addr = blockOffset | (blockIndex << log2blkSize);
}

View File

@@ -55,19 +55,19 @@ struct Gfx9ChipSettings
UINT_32 isArcticIsland : 1;
UINT_32 isVega10 : 1;
UINT_32 isRaven : 1;
UINT_32 reserved0 : 29;
UINT_32 isVega12 : 1;
// Display engine IP version name
UINT_32 isDce12 : 1;
UINT_32 isDcn1 : 1;
UINT_32 reserved1 : 29;
// Misc configuration bits
UINT_32 metaBaseAlignFix : 1;
UINT_32 depthPipeXorDisable : 1;
UINT_32 htileAlignFix : 1;
UINT_32 applyAliasFix : 1;
UINT_32 reserved2 : 28;
UINT_32 htileCacheRbConflict: 1;
UINT_32 reserved2 : 27;
};
};
@@ -121,9 +121,6 @@ public:
return (pMem != NULL) ? new (pMem) Gfx9Lib(pClient) : NULL;
}
virtual BOOL_32 IsValidDisplaySwizzleMode(
const ADDR2_COMPUTE_SURFACE_INFO_INPUT* pIn) const;
protected:
Gfx9Lib(const Client* pClient);
virtual ~Gfx9Lib();
@@ -224,7 +221,7 @@ protected:
AddrSwizzleMode swMode,
UINT_32 elementBytesLog2) const;
virtual UINT_32 HwlComputeSurfaceBaseAlign(AddrSwizzleMode swizzleMode) const
UINT_32 ComputeSurfaceBaseAlignTiled(AddrSwizzleMode swizzleMode) const
{
UINT_32 baseAlign;
@@ -400,11 +397,11 @@ protected:
static const UINT_32 MaxCachedMetaEq = 2;
private:
virtual ADDR_E_RETURNCODE HwlGetMaxAlignments(
ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) const;
virtual UINT_32 HwlComputeMaxBaseAlignments() const;
virtual BOOL_32 HwlInitGlobalParams(
const ADDR_CREATE_INPUT* pCreateIn);
virtual UINT_32 HwlComputeMaxMetaBaseAlignments() const;
virtual BOOL_32 HwlInitGlobalParams(const ADDR_CREATE_INPUT* pCreateIn);
VOID GetRbEquation(CoordEq* pRbEq, UINT_32 rbPerSeLog2, UINT_32 seLog2) const;
@@ -434,6 +431,8 @@ private:
UINT_32 mip0Width, UINT_32 mip0Height, UINT_32 mip0Depth,
UINT_32* pNumMetaBlkX, UINT_32* pNumMetaBlkY, UINT_32* pNumMetaBlkZ) const;
BOOL_32 IsValidDisplaySwizzleMode(const ADDR2_COMPUTE_SURFACE_INFO_INPUT* pIn) const;
ADDR_E_RETURNCODE ComputeSurfaceLinearPadding(
const ADDR2_COMPUTE_SURFACE_INFO_INPUT* pIn,
UINT_32* pMipmap0PaddedWidth,

View File

@@ -401,6 +401,7 @@ ChipFamily CiLib::HwlConvertChipFamily(
m_settings.isPolaris10 = ASICREV_IS_POLARIS10_P(uChipRevision);
m_settings.isPolaris11 = ASICREV_IS_POLARIS11_M(uChipRevision);
m_settings.isPolaris12 = ASICREV_IS_POLARIS12_V(uChipRevision);
m_settings.isVegaM = ASICREV_IS_VEGAM_P(uChipRevision);
family = ADDR_CHIP_FAMILY_VI;
break;
case FAMILY_CZ:
@@ -470,6 +471,10 @@ BOOL_32 CiLib::HwlInitGlobalParams(
{
m_pipes = 4;
}
else if (m_settings.isVegaM)
{
m_pipes = 16;
}
if (valid)
{
@@ -736,7 +741,7 @@ ADDR_E_RETURNCODE CiLib::HwlComputeSurfaceInfo(
SiLib::HwlComputeSurfaceInfo(&localIn, pOut);
ADDR_ASSERT(((MinDepth2DThinIndex <= pOut->tileIndex) && (MaxDepth2DThinIndex >= pOut->tileIndex)) || pOut->tileIndex == Depth1DThinIndex);
ADDR_ASSERT((MinDepth2DThinIndex <= pOut->tileIndex) && (MaxDepth2DThinIndex >= pOut->tileIndex));
depthStencil2DTileConfigMatch = DepthStencilTileCfgMatch(pIn, pOut);
}
@@ -2157,29 +2162,27 @@ VOID CiLib::HwlPadDimensions(
/**
****************************************************************************************************
* CiLib::HwlGetMaxAlignments
* CiLib::HwlComputeMaxBaseAlignments
*
* @brief
* Gets maximum alignments
* @return
* ADDR_E_RETURNCODE
* maximum alignments
****************************************************************************************************
*/
ADDR_E_RETURNCODE CiLib::HwlGetMaxAlignments(
ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut ///< [out] output structure
) const
UINT_32 CiLib::HwlComputeMaxBaseAlignments() const
{
const UINT_32 pipes = HwlGetPipes(&m_tileTable[0].info);
// Initial size is 64 KiB for PRT.
UINT_64 maxBaseAlign = 64 * 1024;
UINT_32 maxBaseAlign = 64 * 1024;
for (UINT_32 i = 0; i < m_noOfMacroEntries; i++)
{
// The maximum tile size is 16 byte-per-pixel and either 8-sample or 8-slice.
UINT_32 tileSize = m_macroTileTable[i].tileSplitBytes;
UINT_64 baseAlign = tileSize * pipes * m_macroTileTable[i].banks *
UINT_32 baseAlign = tileSize * pipes * m_macroTileTable[i].banks *
m_macroTileTable[i].bankWidth * m_macroTileTable[i].bankHeight;
if (baseAlign > maxBaseAlign)
@@ -2188,12 +2191,32 @@ ADDR_E_RETURNCODE CiLib::HwlGetMaxAlignments(
}
}
if (pOut != NULL)
return maxBaseAlign;
}
/**
****************************************************************************************************
* CiLib::HwlComputeMaxMetaBaseAlignments
*
* @brief
* Gets maximum alignments for metadata
* @return
* maximum alignments for metadata
****************************************************************************************************
*/
UINT_32 CiLib::HwlComputeMaxMetaBaseAlignments() const
{
UINT_32 maxBank = 1;
for (UINT_32 i = 0; i < m_noOfMacroEntries; i++)
{
pOut->baseAlign = maxBaseAlign;
if ((m_settings.isVolcanicIslands) && IsMacroTiled(m_tileTable[i].mode))
{
maxBank = Max(maxBank, m_macroTileTable[i].banks);
}
}
return ADDR_OK;
return SiLib::HwlComputeMaxMetaBaseAlignments() * maxBank;
}
/**

View File

@@ -137,7 +137,9 @@ protected:
const ADDR_COMPUTE_HTILE_ADDRFROMCOORD_INPUT* pIn,
ADDR_COMPUTE_HTILE_ADDRFROMCOORD_OUTPUT* pOut) const;
virtual ADDR_E_RETURNCODE HwlGetMaxAlignments(ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) const;
virtual UINT_32 HwlComputeMaxBaseAlignments() const;
virtual UINT_32 HwlComputeMaxMetaBaseAlignments() const;
virtual VOID HwlPadDimensions(
AddrTileMode tileMode, UINT_32 bpp, ADDR_SURFACE_FLAGS flags,

View File

@@ -100,11 +100,13 @@ BOOL_32 EgBasedLib::DispatchComputeSurfaceInfo(
ADDR_TILEINFO tileInfoDef = {0};
ADDR_TILEINFO* pTileInfo = &tileInfoDef;
UINT_32 padDims = 0;
UINT_32 padDims = 0;
BOOL_32 valid;
tileMode = DegradeLargeThickTile(tileMode, bpp);
if (pIn->flags.disallowLargeThickDegrade == 0)
{
tileMode = DegradeLargeThickTile(tileMode, bpp);
}
// Only override numSamples for NI above
if (m_chipFamily >= ADDR_CHIP_FAMILY_NI)

View File

@@ -611,6 +611,29 @@ ADDR_E_RETURNCODE SiLib::ComputePipeEquation(
break;
}
if (m_settings.isVegaM && (pEquation->numBits == 4))
{
ADDR_CHANNEL_SETTING addeMsb = pAddr[0];
ADDR_CHANNEL_SETTING xor1Msb = pXor1[0];
ADDR_CHANNEL_SETTING xor2Msb = pXor2[0];
pAddr[0] = pAddr[1];
pXor1[0] = pXor1[1];
pXor2[0] = pXor2[1];
pAddr[1] = pAddr[2];
pXor1[1] = pXor1[2];
pXor2[1] = pXor2[2];
pAddr[2] = pAddr[3];
pXor1[2] = pXor1[3];
pXor2[2] = pXor2[3];
pAddr[3] = addeMsb;
pXor1[3] = xor1Msb;
pXor2[3] = xor2Msb;
}
for (UINT_32 i = 0; i < pEquation->numBits; i++)
{
if (pAddr[i].value == 0)
@@ -754,6 +777,16 @@ UINT_32 SiLib::ComputePipeFromCoord(
ADDR_UNHANDLED_CASE();
break;
}
if (m_settings.isVegaM && (numPipes == 16))
{
UINT_32 pipeMsb = pipeBit0;
pipeBit0 = pipeBit1;
pipeBit1 = pipeBit2;
pipeBit2 = pipeBit3;
pipeBit3 = pipeMsb;
}
pipe = pipeBit0 | (pipeBit1 << 1) | (pipeBit2 << 2) | (pipeBit3 << 3);
UINT_32 microTileThickness = Thickness(tileMode);
@@ -3468,22 +3501,20 @@ VOID SiLib::HwlSelectTileMode(
/**
****************************************************************************************************
* SiLib::HwlGetMaxAlignments
* SiLib::HwlComputeMaxBaseAlignments
*
* @brief
* Gets maximum alignments
* @return
* ADDR_E_RETURNCODE
* maximum alignments
****************************************************************************************************
*/
ADDR_E_RETURNCODE SiLib::HwlGetMaxAlignments(
ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut ///< [out] output structure
) const
UINT_32 SiLib::HwlComputeMaxBaseAlignments() const
{
const UINT_32 pipes = HwlGetPipes(&m_tileTable[0].info);
// Initial size is 64 KiB for PRT.
UINT_64 maxBaseAlign = 64 * 1024;
UINT_32 maxBaseAlign = 64 * 1024;
for (UINT_32 i = 0; i < m_noOfEntries; i++)
{
@@ -3494,7 +3525,7 @@ ADDR_E_RETURNCODE SiLib::HwlGetMaxAlignments(
UINT_32 tileSize = Min(m_tileTable[i].info.tileSplitBytes,
MicroTilePixels * 8 * 16);
UINT_64 baseAlign = tileSize * pipes * m_tileTable[i].info.banks *
UINT_32 baseAlign = tileSize * pipes * m_tileTable[i].info.banks *
m_tileTable[i].info.bankWidth * m_tileTable[i].info.bankHeight;
if (baseAlign > maxBaseAlign)
@@ -3504,12 +3535,29 @@ ADDR_E_RETURNCODE SiLib::HwlGetMaxAlignments(
}
}
if (pOut != NULL)
return maxBaseAlign;
}
/**
****************************************************************************************************
* SiLib::HwlComputeMaxMetaBaseAlignments
*
* @brief
* Gets maximum alignments for metadata
* @return
* maximum alignments for metadata
****************************************************************************************************
*/
UINT_32 SiLib::HwlComputeMaxMetaBaseAlignments() const
{
UINT_32 maxPipe = 1;
for (UINT_32 i = 0; i < m_noOfEntries; i++)
{
pOut->baseAlign = maxBaseAlign;
maxPipe = Max(maxPipe, HwlGetPipes(&m_tileTable[i].info));
}
return ADDR_OK;
return m_pipeInterleaveBytes * maxPipe;
}
/**

View File

@@ -87,6 +87,7 @@ struct SiChipSettings
UINT_32 isPolaris10 : 1;
UINT_32 isPolaris11 : 1;
UINT_32 isPolaris12 : 1;
UINT_32 isVegaM : 1;
// VI fusion
UINT_32 isCarrizo : 1;
};
@@ -263,7 +264,9 @@ protected:
return TRUE;
}
virtual ADDR_E_RETURNCODE HwlGetMaxAlignments(ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) const;
virtual UINT_32 HwlComputeMaxBaseAlignments() const;
virtual UINT_32 HwlComputeMaxMetaBaseAlignments() const;
virtual VOID HwlComputeSurfaceAlignmentsMacroTiled(
AddrTileMode tileMode, UINT_32 bpp, ADDR_SURFACE_FLAGS flags,

Some files were not shown because too many files have changed in this diff Show More