Compare commits

...

5751 Commits

Author SHA1 Message Date
Dylan Baker
1ad0ce42de VERSION: bump to 20.0.8 2020-06-11 18:17:50 -07:00
Dylan Baker
823ff39e46 docs: Add release notes for 20.0.8 2020-06-11 18:17:30 -07:00
Michel Dänzer
1abcfb02a8 util: Change os_same_file_description return type from bool to int
This allows communicating that it wasn't possible to determine whether
the two file descriptors reference the same file description. When
that's the case, log a warning in the amdgpu winsys.

In turn, remove the corresponding debugging output from the fallback
os_same_file_description implementation. It depends on the caller if
false negatives are problematic or not.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3879>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3879>
(cherry picked from commit f5a8958910)
2020-06-10 12:54:38 -07:00
Lionel Landwerlin
05ee65801c iris: fix export of GEM handles
We reuse DRM file descriptors internally. Therefore when we export a
GEM handle we must do so in the file descriptor used externally.

This change also fixes a file descriptor leak of the FD given at
screen creation.

v2: Don't bother checking fd equals, they're always different
    Fix dmabuf leak
    Fix GEM handle leaks by tracking exported handles

v3: Check os_same_file_description error (Michel)
    Don't create multiple exports for a given GEM table

v4: Add WARN_ONCE (Ken)
    Rename external_fd to winsys_fd

v5: Remove export lock in favor of bufmgr's

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2882
Fixes: 7557f16059 ("iris: share buffer managers accross screens")
Tested-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>
(cherry picked from commit aba3aed96e)
2020-06-09 11:02:04 -07:00
Samuel Pitoiset
549e247aeb radv: enable zero VRAM for all VKD3D (DX12->VK) games
To fix rendering issues with Metro Exodus, RE2 and 3 and probably
more titles. It seems the default behaviour of DX12 anyways.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3064
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5262>
(cherry picked from commit d3c937c0e4)
2020-06-09 11:02:04 -07:00
Samuel Pitoiset
a7a9df28a9 radv: enable zero VRAM for Doom Eternal
That fixes some rendering issues. Probably some unitialized data
from the game.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3064
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5262>
(cherry picked from commit fd5ffd3a83)
2020-06-09 11:02:04 -07:00
Jonathan Marek
90b30fb32f freedreno/a6xx: use nonbinning VS when GS is used
The current "ds = state->bs" seems broken, and the "vs = state->bs" is
unnecessary (already set above). Since it was added as part of a GS-related
patch, I think this is what was intended.

Note: tesselation disables GMEM rendering so we shouldn't have to worry
about hs/ds + binning interaction.

Fixes: 0eebedb619 ("freedreno/a6xx: Emit program state for GS")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5370>
(cherry picked from commit 6cc95abb27)
2020-06-09 11:02:04 -07:00
Danylo Piliaiev
134dbf5416 glsl: inline functions with unsupported return type before converting to nir
glsl_to_nir doesn't expect non-vector/scalar return types in functions.

Fixes: 7e60d5a501
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3058
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3060
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5333>
(cherry picked from commit 9f1cf0e491)
2020-06-09 11:02:04 -07:00
Samuel Pitoiset
9ff455439b nir/lower_explicit_io: fix NON_UNIFORM access for UBO loads
Make sure to propagate the NON_UNIFORM access for UBO loads, so
that non-uniform loads are correctly lowered.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5311>
(cherry picked from commit 86f21e4eba)
2020-06-09 11:02:04 -07:00
Timothy Arceri
0f8fad7f3c glsl: fix potential slow compile times for GLSLOptimizeConservatively
See code comment for full description of the change.

Fixes: 0a5018c1a4 ("mesa: add gl_constants::GLSLOptimizeConservatively")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3034

Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5346>
(cherry picked from commit e43ab7bb05)
2020-06-09 11:02:04 -07:00
Eric Engestrom
3f0a6cad4d intel: fix gen_sort_tags.py
The script was failing for me (python 3.8), not sure if this is a recent
python version break or not as I don't know how often people have been
running this script:

    Processing ./gen9.xml... Traceback (most recent call last):
      File "./gen_sort_tags.py", line 177, in <module>
        main()
      File "./gen_sort_tags.py", line 170, in main
        genxml[:] = enums + sorted_structs.values() + instructions + registers
    TypeError: can only concatenate list (not "odict_values") to list

Turning the odict into a list fixes it for me, and the resulting xml
file are identical to before :)

Fixes: 903e142f0d ("genxml: add a sorting script")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5352>
(cherry picked from commit 981d07c74a)
2020-06-09 11:02:04 -07:00
Eric Engestrom
38dc311654 glapi: remove deprecated .getchildren() that has been replace with an iterator
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3086
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5342>
(cherry picked from commit 7a68045b5d)
2020-06-09 11:02:04 -07:00
Erik Faye-Lund
8df640b461 nir: reuse existing psiz-variable
For shaders where there's already a psiz-variable, we should rather
reuse it than create a second one. This can happen if a shader writes
gl_PointSize, but disables GL_PROGRAM_POINT_SIZE.

Fixes: 878c94288a ("nir: add lowering-pass for point-size mov")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5328>
(cherry picked from commit e61a98877c)
2020-06-09 11:02:04 -07:00
Lionel Landwerlin
45403a5a45 i965: fix export of GEM handles
We reuse DRM file descriptors internally. Therefore when we export a
GEM handle we must do so in the file descriptor used externally.

v2: Fix dmabuf leak
    Fix GEM handle leaks by tracking exported handles

v3: Check os_same_file_description error (Michel)
    Don't create multiple exports for a given GEM table

v4: Add WARN_ONCE (Ken)

v5: Remove blank line (Ian)
    Remove unused field (Ian)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2882
Fixes: 4094558e86 ("i965: share buffer managers across screens")
Tested-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>
(cherry picked from commit 57e4d0aa1c)
2020-06-09 11:02:04 -07:00
Lionel Landwerlin
fdb763d6a5 i965: don't forget to set screen on duped image
We'll start using this field more for querying image properties.
Without it we run into a crash.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>
(cherry picked from commit e41e820648)
2020-06-09 11:02:04 -07:00
Lionel Landwerlin
746e76675d iris: fix BO destruction in error path
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>
(cherry picked from commit 604a86e46f)
2020-06-09 11:02:04 -07:00
Vinson Lee
2dc59d1dd5 mesa: Fix NetBSD compiler macro.
Reported-by: Rafał Mikrut <mikrutrafal54@gmail.com>
Fixes: a63b90712a ("mesa: also check for __NetBSD__")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3015
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5191>
(cherry picked from commit c3025bde19)
2020-06-09 11:02:04 -07:00
Vinson Lee
8edafd6aae vdpau: Fix wrong calloc sizeof argument.
Fix warning reported by Coverity Scan.

Wrong sizeof argument (SIZEOF_MISMATCH)
suspicious_sizeof: Passing argument 3544UL (sizeof
(vlVdpPresentationQueue)) to function calloc that returns a pointer of
type vlVdpPresentationQueueTarget * is suspicious because a multiple of
sizeof (vlVdpPresentationQueueTarget) /*16*/ is expected.

Fixes: 65fe0866ae ("vl: implemented a few functions and made stubs to get mplayer running")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3026
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5182>
(cherry picked from commit 8b353524b0)
2020-06-09 11:02:04 -07:00
Dylan Baker
765a7d3207 vulkan-overlay/meson: use install_data instead of configure_file
We don't want to copy the file into the build directory, we want to
install it. That's what install_data is for.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2924
Fixes: 56ccea58ae
       ("vulkan/overlay: Add basic overlay control script.")

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
(cherry picked from commit fb62e642ae)
2020-06-09 11:02:04 -07:00
Marek Olšák
41092aa008 radeonsi: add a hack to disable TRUNC_COORD for shadow samplers
This fixes dEQP-GLES3.functional.shaders.texture_functions.textureprojlodoffset.sampler2dshadow_vertex.

This is probably a dEQP bug.

Fixes: d573d1d825

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5209>
(cherry picked from commit fe3947632c)
2020-06-09 11:02:04 -07:00
Pierre-Eric Pelloux-Prayer
cfb5679577 omx: fix build with gcc 10
bellagio/omx header files reference a global variable without the
extern keyworkd.
Now that gcc-10 enables the '-fno-common' by default the build fails.
Since these are external headers we can't easily fix them, so for
now build the omx module with the '-fcommon' flag to keep the
previous behavior.

See https://gitlab.freedesktop.org/mesa/mesa/issues/2385

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4058>
(cherry picked from commit 283e815339)
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3019
2020-06-09 11:02:04 -07:00
Dylan Baker
c660bf0d61 .pick_status.json: Update to 0795241dde 2020-06-09 11:02:04 -07:00
Yevhenii Kolesnikov
8c35c65f66 intel/compiler: fix cmod propagation optimisations
Knowing following:
 - CMP writes to flag register the result of
   applying cmod to the `src0 - src1`.
   After that it stores the same value to dst.
   Other instructions first store their result to
   dst, and then store cmod(dst) to the flag
   register.
 - inst is either CMP or MOV
 - inst->dst is null
 - inst->src[0] overlaps with scan_inst->dst
 - inst->src[1] is zero
 - scan_inst wrote to a flag register

There can be three possible paths:

 - scan_inst is CMP:

   Considering that src0 is either 0x0 (false),
   or 0xffffffff (true), and src1 is 0x0:

   - If inst's cmod is NZ, we can always remove
     scan_inst: NZ is invariant for false and true. This
     holds even if src0 is NaN: .nz is the only cmod,
     that returns true for NaN.

   - .g is invariant if src0 has a UD type

   - .l is invariant if src0 has a D type

 - scan_inst and inst have the same cmod:

   If scan_inst is anything than CMP, it already
   wrote the appropriate value to the flag register.

 - else:

   We can change cmod of scan_inst to that of inst,
   and remove inst. It is valid as long as we make
   sure that no instruction uses the flag register
   between scan_inst and inst.

Nine new cmod_propagation unit tests:
 - cmp_cmpnz
 - cmp_cmpg
 - plnnz_cmpnz
 - plnnz_cmpz (*)
 - plnnz_sel_cmpz
 - cmp_cmpg_D
 - cmp_cmpg_UD (*)
 - cmp_cmpl_D (*)
 - cmp_cmpl_UD

(*) this would fail without changes to brw_fs_cmod_propagation.

This fixes optimisation that used to be illegal (see issue #2154)

= Before =
 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F
 1: cmp.nz.f0.0(8) null:F, vgrf0:F, 0f
= After =
 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F

Now it is optimised as such (note change of cmod in line 0):

= Before =
 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F
 1: cmp.nz.f0.0(8) null:F, vgrf0:F, 0f
= After =
 0: linterp.nz.f0.0(8) vgrf0:F, g2:F, attr0<0>:F

No shaderdb changes

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2154

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3348>
(cherry picked from commit 32b7ba66b0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5149>
2020-06-01 22:11:49 +00:00
Eric Engestrom
85f380f0d0 tree-wide: fix deprecated GitLab URLs
They will stop working in the next GitLab release, so let's update them
ASAP to make sure things are propagated to everyone by then.

See:
https://about.gitlab.com/releases/2020/05/06/gitlab-com-13-0-breaking-changes/#removal-of-deprecated-project-paths

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5111>

The backport on this was non-trival, git couldn't figure out what had
actually changed and came up with thousands of whitespace changes. I
reimplemented this commit using:

git grep -l -P '/mesa/mesa/(?!-)' | xargs sed -r -i  's@/mesa/mesa/([a-zA-Z0-1])@/mesa/mesa/-/\1@g'
2020-06-01 14:59:33 -07:00
Danylo Piliaiev
9bd42de931 intel/fs: Work around dual-source blending hangs in combination with SIMD16
It was found that dual-source blending hangs with SIMD16 dispatch in some
specific but unknown situation. Which in the wild happen when rgba
anti-aliasing is enabled for fonts.

Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2183
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5037>
(cherry picked from commit 296c04d78c)
2020-06-01 14:52:26 -07:00
Samuel Pitoiset
434d2dc427 spirv,radv,anv: implement no-op VK_GOOGLE_user_type
This extension only allows HLSL shader compilers to optionally embed
unambiguous type information which can be safely ignored by the driver.

This fixes a crash with the recent Vulkan backend of Path Of Exile
(it uses the extension without checking if it's supported).

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5237>
(cherry picked from commit 10c4a7cf59)
2020-06-01 14:52:25 -07:00
Joshua Ashton
5618984b67 radeonsi: Use TRUNC_COORD on samplers
The default behaviour (0) is: "round-nearest-even to n.6 and drop fraction when point sampling" whereas the OpenGL spec simply wants us to floor it (1) "truncate when point sampling".
See 8.14.2 in the OpenGL spec:

https://www.khronos.org/registry/OpenGL/specs/gl/glspec46.core.pdf

The Direct3D spec also mandates this (https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#7.18.7%20Point%20Sample%20Addressing)

On WineD3D:
This fixes some point-sampling texture precision issues in some Direct3D 9 titles such as Guild Wars 2 and htoL#NiQ: The Firefly Diary that are not present on other vendors.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3953>
(cherry picked from commit d573d1d825)
2020-06-01 14:06:45 -07:00
Marek Olšák
694232e53f radeonsi: don't expose 16xAA on chips with 1 RB due to an occlusion query issue
Only Stoney and Raven2 are affected.

Cc: 20.0 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5047>
(cherry picked from commit f80d653d70)
2020-06-01 14:05:10 -07:00
Dylan Baker
7b351f6c99 radonsi/si_state.c: retab 2020-06-01 14:02:06 -07:00
Bas Nieuwenhuizen
824388eaf9 radv: Provide a better error for permission issues with priorities.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4816>
(cherry picked from commit 9e3c6a7ba7)
2020-06-01 14:01:27 -07:00
Gert Wollny
5f1f6ee8c6 nir: lower_tex: Don't normalize coordinates for TXF with RECT
v2: remove the option to actually request normalization and its
    application in Intel < Gen6 (Jason)

v3: Also don't lower for query operations (Jason)

Fixes: 1ce8060c25
    nir/lower_tex: support for lowering RECT textures

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5105>
(cherry picked from commit 682e14d3ea)
2020-06-01 13:52:23 -07:00
Neha Bhende
31cf682484 util: Initialize pipe_shader_state for passthrough and transform shaders
mesa/st is initializing pipe_shader_state for user define shaders.
This patch intialized pipe_shader_state for all passthough
and transform shaders.

This fixes crashes for several opengl apps. Issue is found in vmware
internal testing

Fixes: f01c0565bb ("draw: free the NIR IR.")

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5240>
(cherry picked from commit 838666a41d)
2020-06-01 13:52:21 -07:00
Vinson Lee
ffdd605674 r300g: Remove extra printf format specifiers.
Fix warning reported by Coverity Scan.
Missing argument to printf format specifier (PRINTF_ARGS)
missing_argument: No argument for format specifier %s.

Fixes: 04c1536bf7 ("r300g: rasterizer debug logging")
Fixes: 85efb2fff0 ("r300g: try to use color varyings for texcoords if max texcoord limit is exceeded")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5274>
(cherry picked from commit d2f8105b60)
2020-06-01 13:52:21 -07:00
Ilia Mirkin
e5257b88f4 nouveau: allow invalidating coherent/persistent buffer backings
This is needed to support the core's usage of coherent buffers for
glVertex-style input. The reason why this was disallowed is that any
mappings will be invalidated. Let the state tracker worry about that,
and just reallocate when we're told.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5276>
(cherry picked from commit 6e1c47b98d)
2020-06-01 13:52:20 -07:00
Jason Ekstrand
f27e8fc0b3 intel/fs: Fix unused texture coordinate zeroing on Gen4-5
We were inserting the right number of MOVs but, thanks to the way we
advanced msg_end earlier in the function, were often writing the zeros
past the end of where we actually read in the register file.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5243>
(cherry picked from commit 94aa7997e4)
2020-06-01 13:52:19 -07:00
Jason Ekstrand
0bb69d0216 intel/vec4: Stomp the return type of RESINFO to UINT32
We already do this in the FS back-end; we just weren't doing it in vec4
so RESINFO messages weren't returning the right data.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5243>
(cherry picked from commit a7c8811fe4)
2020-06-01 13:52:18 -07:00
Timothy Arceri
bd4c699f93 radv: fix regression with builtin cache
If the ~/.cache dir already exists continue on without failing.

Fixes: cd61f5234d ("radv: Handle failing to create .cache dir.")

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5249>
(cherry picked from commit e843303d6f)
2020-06-01 13:52:17 -07:00
Vinson Lee
b3098f7dc4 zink: Check fopen result.
Fix warning reported by Coverity.

Dereference null return value (NULL_RETURNS)
dereference: Dereferencing a pointer that might be NULL fp when calling
fwrite.

Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5235>
(cherry picked from commit a2ee293422)
2020-06-01 13:52:14 -07:00
Dylan Baker
e4b235dc5e .pick_status.json: Update to e58112bc08 2020-06-01 13:52:08 -07:00
Rhys Perry
41962e0660 aco: preserve more fields when combining additions into SMEM
Totals from 11 (0.01% of 127638) affected shaders:

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>
(cherry picked from commit e1900ee2c7)
2020-05-28 11:16:39 -07:00
Rhys Perry
94850e33fb aco: check instruction format before waiting for a previous SMEM store
Totals from 7 (0.01% of 127638) affected shaders:
CodeSize: 40336 -> 40320 (-0.04%)
Instrs: 7807 -> 7803 (-0.05%)
Cycles: 118588 -> 118344 (-0.21%); split: -0.23%, +0.02%
SMEM: 331 -> 339 (+2.42%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 1749953ea3 ('aco/gfx10: Wait for pending SMEM stores before loads')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>
(cherry picked from commit 95d5c1b8a1)
2020-05-28 11:16:38 -07:00
Rhys Perry
e6d7576112 aco: fix interaction with 3f branch workaround and p_constaddr
The offset was incorrect if we inserted a nop before the p_constaddr.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5164>
(cherry picked from commit 8aa98cebc1)
2020-05-28 11:16:37 -07:00
Erik Faye-Lund
8cd4f57381 zink: use general-layout when blitting to/from same resource
This avoids a validator warning when for instance generating mipmaps.

Fixes: d2bb63c8d4 ("zink: Use optimal layout instead of general. Reduces valid layer warnings. Fixes RADV image noise.")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5199>
(cherry picked from commit dd2bd68fa6)
2020-05-28 11:16:34 -07:00
Timothy Arceri
1e5bbe8d09 glsl: stop cascading errors if process_parameters() fails
Generally we do not completely stop compilation as soon as we see an error,
instead we continue on to attemp to find any futher errors.

This means we shouldn't be checking state->error to see if any error has
happened during the compilation process, doing so was causing
process_parameters() to fail on completely valid functions if there was
any error found in the shader previously. This then caused the valid
functions not to be found because the paramlist was considered empty,
resulting in the compiler spewing out misleading error messages.

Here we simply add the IR error value to the param list when we have
an issue with processing a parameter, this leads to much better error
messaging.

Fixes: 53e4159eaa ("glsl: stop processing function parameters if error happened")

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5205>
(cherry picked from commit f6214750eb)
2020-05-28 11:16:33 -07:00
Bas Nieuwenhuizen
af4999d0e0 radv: Handle failing to create .cache dir.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5181>
(cherry picked from commit cd61f5234d)
2020-05-28 11:16:28 -07:00
Bas Nieuwenhuizen
566eb598f8 radv/winsys: Remove extra sizeof multiply.
The pointer is already uint64_t*, so the sizeof was too much ...

Fixes: eeff7e1154 "radv: Add userspace fence buffer per context."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5181>
(cherry picked from commit 906435fb0e)
2020-05-28 11:16:27 -07:00
Jason Ekstrand
b081c4bb01 nir/copy_prop_vars: Record progress in more places
Fixes: 96c32d7776 "nir/copy_prop_vars: handle load/store of vector..."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
(cherry picked from commit f0e075ce6e)
2020-05-28 11:16:22 -07:00
Jason Ekstrand
4efaa80139 nir/opt_deref: Report progress if we remove a deref
Fixes: a1c688517d "nir/opt_deref: Properly optimize ptr_as_array..."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
(cherry picked from commit db6d9cdf06)
2020-05-28 11:16:21 -07:00
Jason Ekstrand
1b95522f3d nir/lower_double_ops: Rework the if (progress) tree
Fixes: d7d35a9522 "nir/lower_doubles: Use the new NIR lowering..."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
(cherry picked from commit 111b0a6699)
2020-05-28 11:16:20 -07:00
Danylo Piliaiev
2ddc061692 mesa: Fix double-lock of Shared->FrameBuffers and usage of wrong mutex
Fixes: 7534c536ca
Fixes: 8cfb3e4ee5
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3024
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5160>
(cherry picked from commit 4025583123)
2020-05-28 11:16:18 -07:00
Danylo Piliaiev
5ec4368b36 meson: Disable GCC's dead store elimination for memory zeroing custom new
Some classes use custom new operator which zeroes memory, however gcc does
aggressive dead-store elimination which threats all writes to the memory
before the constructor as "dead stores".

For now we disable this optimization.

The new operators in question are declared via:
 DECLARE_RZALLOC_CXX_OPERATORS
 DECLARE_LINEAR_ZALLOC_CXX_OPERATORS

The issue was found with lto builds, however there is no guarantee that
it didn't happen with ordinary ones.

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2977
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1358
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5104>
(cherry picked from commit 5500a2b7fc)
2020-05-28 11:16:13 -07:00
Dave Airlie
a771dd8f25 llvmpipe: compute shaders work better with all the threads.
I got to benchmarking some vulkan compute benchmark and wondered
why my CPUs weren't being saturated, helps if you actually wake up
all the threads in the threadpool.

Fixes: 1b24e3ba75 (llvmpipe: add compute threadpool + mutex)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5138>
(cherry picked from commit 22554e1fbc)
2020-05-28 11:16:12 -07:00
Nataraj Deshpande
0e9812d0ec dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_R8G8B8X8_UNORM
The commit helps to resolve GL_INVALID_OPERATION error returned
during CTS test when Android format RGBX8888 fallback to RGBA8888
and then set color with glTexSubImage2D(format=GL_RGB).

Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests:
 #SingleLayer_ColorTest_GpuSampledImageCanBeSampled_R8G8B8X8_UNORM

Cc: <mesa-stable@lists.freedesktop.org>
Fixes: bf576772ab ("dri_util: add driImageFormatToSizedInternalGLFormat function")
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5034>
(cherry picked from commit 02a1f95386)
2020-05-28 11:16:11 -07:00
D Scott Phillips
4724bad429 anv/gen11+: Disable object level preemption
An unknown issue is causing vs push constants to become corrupted
during object-level preemption. For now, restrict to command
buffer level preemption to avoid rendering corruption.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5110>
(cherry picked from commit 81201e4617)
2020-05-28 11:16:10 -07:00
Dylan Baker
0cbeea6f8a tests: Make tests aware of meson test wrapper
Meson 0.55.0 will set the MESON_EXE_WRAPPER environment variable to the
joined version of that wrapper if it is needed. Our tests that take
compiled targets as arguments can use that information to run cross
built binaries, or if there isn't a wrapper and we get an ENOEXEC, we
can skip the tests gracefully.

We try to use mesonlib.split_args, which handles windows arguments
better than python's builtin shlex module, but fall back to that if the
meson module isn't available for some reason.

Cc: 20.0 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5103>
(cherry picked from commit 5580322486)
2020-05-28 11:16:09 -07:00
Jason Ekstrand
54973a393b anv:gpu_memcpy: Emit 3DSTATE_VF_INDEXING on Gen8+
If this gets run right after something which uses
VK_VERTEX_INPUT_RATE_INSTANCE on its first vertex binding, we could end
up in serious trouble.

Fixes: 3d9747780b "anv: Add a helper for doing buffer copies with..."

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5090>
(cherry picked from commit 164aed6c81)
2020-05-28 11:16:08 -07:00
Dylan Baker
ccf4cbed47 .pick_status.json: Update to f0c102c075 2020-05-28 11:16:01 -07:00
Lucas Stach
b1a5ca11af etnaviv: retarget transfer to render resource when necessary
If we have a separate render resource, it may contain more up-to-date
data than what is available in the base resource, so we need to retarget
the transfer to this resource. As the most likely reason for the
existence of the render resource is a multi-tiled render layout we need
to allow this transfer to go through the resolve/blit copy path, as we
can't de-/tile those layouts in software.

Fixes: b962776530 (etnaviv: rework compatible render base)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5051>
(cherry picked from commit 9d1821adf0)
2020-05-28 11:15:52 -07:00
Dylan Baker
d7da2ec6f0 .pick_status.json: Update to 4504d6374d 2020-05-28 11:15:52 -07:00
Rhys Perry
e1e05dc463 nir: fix lowering to scratch with boolean access
Backport of 8e2009c448 for 20.0

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5215>
2020-05-26 22:18:11 +00:00
Rob Clark
216a6f8081 freedreno: clear last_fence after resource tracking
The resource tracking in the clear/draw_vbo/blit paths could itself
trigger a flush.  Which would update last_fence.  So we need to clear
last_fence *after* all the dependency tracking.

Fixes: ddb7fadaf8 ("freedreno: avoid no-op flushes by re-using last-fence")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2992
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5190>
2020-05-26 22:00:27 +00:00
Jan Palus
9387009b09 targets/opencl: fix build against LLVM>=10 with Polly support
see https://bugs.llvm.org/show_bug.cgi?id=44870

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4511>
(cherry picked from commit a1b69d101a)
2020-05-15 09:48:39 -07:00
Dylan Baker
1df589274f .pick_status.json: Update to a887ad7c84 2020-05-15 09:38:27 -07:00
Marek Vasut
274851dd78 etnaviv: Disable seamless cube map on GC880
The GC880 on iMX6DL indicates in it's minorFeatures2 register that it
does support SEAMLESS_CUBE_MAP, however when the TE.SAMPLER_CONFIG1
VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit is set on GC880 on iMX6DL,
the result is corrupted image. In particular, the following ~112 dEQPs
are affected and fail:

  dEQP-GLES2.functional.texture.filtering.cube.*

This only happens on MX6DL GC880, MX6Q GC2000 and STM32MP1 GC400(GCnano)
do not report the minorFeatures2 SEAMLESS_CUBE_MAP bit and ignore the
TE_SAMPLER_CONFIG1 VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit (note
that ss->seamless_cube_map is unconditionally set by mesa at times even
PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE returns 0), so there is no visible
problem and there are no failing dEQP tests on the GC2000 and GCnano.

This might imply that the minorFeatures2 SEAMLESS_CUBE_MAP has some
different meaning on GC880 or the SEAMLESS_CUBE_MAP behaves differently
on the GC880.

This patch does not set the SEAMLESS_CUBE_MAP bit on hardware which does
not indicate support for seamless cube map and on GC880, which results
in reduction in failed dEQPs: 635 to 186 on GC880, 274 to 270 on GC2000
and no change on GC400(GCnano).

Fixes: 8dd26fa2f0 ("etnaviv: support GL_ARB_seamless_cubemap_per_texture")
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4865>
(cherry picked from commit 2b535ac61b)
2020-05-14 10:28:43 -07:00
Ian Romanick
fcc8debd5a anv/tests: Don't rely on assert or changing NDEBUG in tests
This is the last part of the fix for #2903.

v2: Add test_common.h.

Fixes: f7c56475d2 ("anv/tests: compile to something sensible in release builds")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4994>
(cherry picked from commit f4638cfdad)
2020-05-14 10:20:01 -07:00
Danylo Piliaiev
3705ec33a5 anv: Fix deadlock in anv_timelines_wait
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2945
Fixes: 34f32a6d66
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5005>
(cherry picked from commit 06b6c687e2)
2020-05-14 10:20:00 -07:00
Danylo Piliaiev
6b4950f2d3 anv: Translate relative timeout to absolute when calling anv_timelines_wait
Fixes: 34f32a6d66
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5025>
(cherry picked from commit 15dd7933bc)
2020-05-14 10:19:59 -07:00
Dylan Baker
ad94986ec1 .pick_status.json: Update to ceae09da15 2020-05-14 10:19:55 -07:00
Dylan Baker
ffec17eaa4 docs/relnotes Add sha256 sums to 20.0.7 2020-05-14 10:09:24 -07:00
Dylan Baker
e925e97746 bump version to 20.0.7 2020-05-14 09:39:46 -07:00
Dylan Baker
4d46849706 docs: Add release notes for 20.0.7 2020-05-14 09:38:57 -07:00
Samuel Pitoiset
cfcb3a0e83 radv: limit the Vulkan version to 1.1 for Android
Vulkan 1.2 seems rejected. This hardcodes the Android version to
1.1.107.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2936
Fixes: 7f5462e349 ("radv: enable Vulkan 1.2")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>
(cherry picked from commit 69430921fc)
2020-05-12 11:08:46 -07:00
Axel Davy
50781aaa1b gallium/util: Fix leak in the live shader cache
When the nir backend is used, the create_shader
call is supposed to release state->ir.nir.
When the cache hits, create_shader is not called,
thus state->ir.nir should be freed.

There is nothing to be done for the TGSI case as the
tokens release is done by the caller.

This fixes a leak noticed in:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/2931

Fixes: 4bb919b0b8

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4980>
(cherry picked from commit 47bfc799da)
2020-05-12 11:08:44 -07:00
Ian Romanick
787cecdb02 nir/algebraic: Optimize ushr of pack_half, not ishr
When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce
0xBC000000.  The ishr will produce 0xFFFFBC00.  The replacement
pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00.

Fixes: 1f72857739 ("nir/algebraic: add some half packing optimizations")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
(cherry picked from commit a2bf41ec65)
2020-05-12 11:08:39 -07:00
Dylan Baker
061d24b4b2 .pick_status.json: Update to d76e722ed6 2020-05-12 11:08:36 -07:00
Samuel Pitoiset
b749209863 aco: fix 64-bit trunc with negative exponents on GFX6
v_frexp_exp returns the exponent as an unsigned value.

Also, v_ashr returns either 0 or -1 depending on the sign of the
source operand, but what we want is only the sign bit.

Fixes a bunch of recent dEQP-VK.glsl.builtin.precision_double.* tests.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4921>
(cherry picked from commit 3fba0a7a6f)
2020-05-11 10:17:24 -07:00
Qiang Yu
6b2e0a0c13 panfrost: don't always build bifrost_compiler
src/panfrost/shared is shared with lima driver, build
bifrost_compiler for lima driver is meaningless and
get link error when only lima driver is enabled.

So only build bifrost_compiler when configued with:
  meson -Dtools=panfrost

Fixes: ec2a59cd7a "panfrost: Move non-Gallium files outside of Gallium"
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4960>
(cherry picked from commit 07b0fbea92)
2020-05-11 10:17:19 -07:00
Dylan Baker
6486fcc79e .pick_status.json: Update to 0bea2a1321 2020-05-11 10:17:17 -07:00
Lionel Landwerlin
299d0f9a81 anv: don't expose VK_INTEL_performance_query without kernel support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2b5f30b1d9 ("anv: implement VK_INTEL_performance_query")
Acked-by: Timothy Strelchun <timothy.strelchun@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4937>
(cherry picked from commit 4f17e9eef6)
2020-05-08 10:29:06 -07:00
Lionel Landwerlin
071ba3898a intel/perf: store the probed i915-perf version
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
(cherry picked from commit aad0e6f810)
2020-05-08 10:29:05 -07:00
Blaž Tomažič
f2237b1381 radeonsi: Fix omitted flush when moving suballocated texture
Fixes: 5e805cc74b "radeonsi: flush the context after resource_copy_region for buffer exports"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4925>
(cherry picked from commit 808eb20186)
2020-05-08 10:21:32 -07:00
Dylan Baker
8fdd01aa0d radeonsi: retab
To allow backports to apply
2020-05-08 10:21:32 -07:00
Dylan Baker
8c7d9a4e51 .pick_status.json: Update to d11e4738a8 2020-05-08 10:14:36 -07:00
Marek Olšák
9109f99add radeonsi: fix compilation of monolithic PS
This was totally broken. Monolithic PS is only used if FBFETCH or
interpolateAtSample are used.

When the PS prolog was built, it overwrote ctx->main_fn.

Discovered by @eefano.

Fixes: 8832a88434 "radeonsi: move PS LLVM code into si_shader_llvm_ps.c"
Closes: #2814

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4918>
(cherry picked from commit 29da521280)
2020-05-07 10:08:37 -07:00
Dylan Baker
c6a65843ff radeonsi: retab si_shader_llvm_ps.c
So that patches apply cleanly

Generated with sed -i 's@\t@   @g'
2020-05-07 10:08:37 -07:00
Samuel Pitoiset
7bb2d35b38 radv: don't report error with other vendor DRM devices
Enumeration should just skip unsupported DRM devices.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4806>
(cherry picked from commit 8d993c9d2c)
2020-05-07 10:08:37 -07:00
Samuel Pitoiset
5f3ab97cb8 radv: report INITIALIZATION_FAILED when the amdgpu winsys init failed
The driver should be capable if it reaches the winsys initialization.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4806>
(cherry picked from commit f03abd5041)
2020-05-07 10:08:37 -07:00
Jose Maria Casanova Crespo
e45959247f v3d: Include supported DXT formats to enable s3tc/dxt extensions
DXT1_RGBA and sRGB variants of DXT[135] formats are enabled as
valid format on V3D.

Once all S3TC formats supported by V3C are enabled the following
extensions become exposed by gallium.

    * GL_ANGLE_texture_compression_dxt3
    * GL_ANGLE_texture_compression_dxt5,
    * GL_EXT_texture_compression_dxt1
    * GL_EXT_texture_compression_s3tc
    * GL_S3_s3tc
    * GL_EXT_texture_compression_s3tc_srgb

This enables 206 passing piglit test related to gl_compressed.*s3tc_dxt

Cc: 20.0 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4934>
(cherry picked from commit 905edc376d)
2020-05-07 09:47:49 -07:00
Jose Maria Casanova Crespo
574b5b37e4 v3d: Fix swizzle in DXT3 and DXT5 formats
Swizzles were ignoring the W component of the format DXT3_RGBA and
DXT5_RGBA.

This fixes 15 piglit tests:

spec/!opengl 1.1/copyteximage 2d
spec/!opengl 1.2/copyteximage 3d
spec/arb_texture_compression/fbo-generatemipmap-formats/gl_compressed_rgba
spec/arb_texture_compression/fbo-generatemipmap-formats/gl_compressed_rgba npot
spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_rgba, swizzled, border color only
spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_rgba, border color only
spec/arb_texture_cube_map/copyteximage cube
spec/arb_texture_cube_map/copyteximage cube samples=2
spec/arb_texture_cube_map/copyteximage cube samples=4
spec/arb_texture_rectangle/copyteximage rect
spec/arb_texture_rectangle/copyteximage rect samples=2
spec/arb_texture_rectangle/copyteximage rect samples=4
spec/ext_texture_array/copyteximage 2d_array
spec/ext_texture_array/copyteximage 2d_array samples=2
spec/ext_texture_array/copyteximage 2d_array samples=4

Fixes: 469bbd8387 "broadcom/vc5: Move the formats table to per-V3D-version compile."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4934>
(cherry picked from commit e3ecf48dda)
2020-05-07 09:47:49 -07:00
Pierre Moreau
ef726459dd clover/nir: Check the result of spirv_to_nir
Fixes: deb04adf2a ("clover: add support for passing kernels as nir to the driver")
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4901>
(cherry picked from commit 38bbfd3a57)
2020-05-07 09:47:48 -07:00
Dylan Baker
1d1d583696 .pick_status.json: Update to 6d513eb0db 2020-05-07 09:47:43 -07:00
Dylan Baker
a35f3fd1d1 .pick_status.json: Mark 9392ddab43 as backported 2020-05-06 19:31:20 -07:00
Rhys Perry
58653838f5 aco: consider blocks unreachable if they are in the logical cfg
backport of 9392ddab43

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4888>
2020-05-06 19:29:30 -07:00
Danylo Piliaiev
7d00190859 i965: Fix out-of-bounds access to brw_stage_state::surf_offset
../src/mesa/drivers/dri/i965/brw_wm_surface_state.c:1378:32: runtime error: index 3503345872 out of bounds for type 'uint32_t [149]'

brw_assign_common_binding_table_offsets has the following comment:
 "Unused groups are initialized to 0xd0d0d0d0 to make it obvious that they're
 unused but also make sure that addition of small offsets to them will
 trigger some of our asserts that surface indices are < BRW_MAX_SURFACES."

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4350>
(cherry picked from commit 784358bd6e)
2020-05-06 16:06:08 -07:00
Dave Airlie
640f810f95 llvmpipo/nir: free compute shader NIR
I forgot this in the last round.

Fixes: 18f896e55d (llvmpipe: add initial nir support)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4899>
(cherry picked from commit 870b6a6050)
2020-05-06 16:06:03 -07:00
Rhys Perry
2fba5c1cd8 nir: add missing group_memory_barrier handling
Totals from 2 (0.00% of 127638) affected shaders:
VGPRs: 164 -> 168 (+2.44%)
CodeSize: 18420 -> 18756 (+1.82%)
Instrs: 3658 -> 3700 (+1.15%)
Cycles: 82912 -> 83080 (+0.20%)
VMEM: 70 -> 69 (-1.43%)
PreVGPRs: 155 -> 168 (+8.39%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4889>
(cherry picked from commit a46aa3dc2e)
2020-05-06 16:06:01 -07:00
Dylan Baker
e12545f3b8 .pick_status.json: Mark d80fb02430 as backported 2020-05-06 16:05:58 -07:00
Dylan Baker
85cdb63e43 .pick_status.json: Update to 6292059662 2020-05-06 16:05:45 -07:00
Christopher James Halse Rogers
bc1b6f4324 egl/wayland: Fix zwp_linux_dmabuf usage
There's no guarantee that the formats advertised by wl_drm and the formats
advertised by zwp_linux_dmabuf_v1 are the same.

get_back_bo() handles this by falling back from createImageWithModifiers() to
createImage() when there's a wl_drm format but no corresponding linux_dmabuf
format, but create_wl_buffer() unconditionally tries to create a linux_dmabuf
buffer unless DRIimage has DRM_FORMAT_MOD_INVALID.

Fix this by always checking if the DRIimage modifier has been advertised
by zwp_linux_dmabuf_v1, and falling back to wl_drm if not.

If DRM_FORMAT_MOD_INVALID has been advertised then we trust the client
has allocated something appropriate and treat any modifier as matching.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2220
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
(cherry picked from commit 98675d34c1)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4869>
2020-05-06 22:18:15 +00:00
Ivan Molodetskikh
433c29ba34 egl: allow INVALID format for linux_dmabuf
As per
fb9b2a8731,
the compositor may advertise DRM_FORMAT_MOD_INVALID as a supported
modifier. This patch makes mesa recognize this fact and allow
linux_dmabuf usage with the INVALID modifier in this case.

In case the driver doesn't support modifiers, we can still use
linux-dmabuf protocol instead of the legacy wl_drm interface to create
wl_buffers. This will help compositors to handle these buffers better.

In this commit, the INVALID modifier is allowed to be added to the list
of supported modifiers, and create_wl_buffer will be able to use
linux_dmabuf with an INVALID modifier if the compositor advertised it as
supported.

Signed-off-by: Ivan Molodetskikh <yalterz@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2147>
(cherry picked from commit c376865f5e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4869>
2020-05-06 22:18:15 +00:00
Bas Nieuwenhuizen
c927a2ae62 winsys/amdgpu: Retrieve WC flags from imported buffers.
Otherwise reading from an imported mapped GTT+WC linear texture
is painfully slow.

Sadly no radeon winsys implementation, as I don't know a suitable
kernel driver operation.

Hit this  in vaGetImage with an image imported from minigbm (which
we are switching to allocate WC for SCANOUT images).

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit d80fb02430)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4906>
2020-05-06 21:37:49 +00:00
Neil Armstrong
6ef2e25c85 ci: disable t820/mali4xx tests
The BayLibre LAVA lab is down for a week now and requires more work
than anticipated to make it available again.

Let's disable the tests running on these lab until the lab is up again.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
2020-05-06 20:19:31 +00:00
Dylan Baker
ee0af67ebd .pick_status.json: Mark bdd2f284d9 as denominated 2020-05-04 14:07:29 -07:00
Lionel Landwerlin
d1e566c705 iris: don't assert on unfinished aux import in copy paths
After a resource is created the first command using it could be a copy
command.

In iris_state we finish the import on surface/view creation but we
don't do that for copies.

v2: Move finish call to gallium entrypoints (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2725
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4657>
(cherry picked from commit 612e35c8d9)
2020-05-04 14:07:29 -07:00
Marek Olšák
5e686abe25 radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size
Rounding down the size fixes:
    KHR-GL45.enhanced_layouts.ssb_member_invalid_offset_alignment

Fixes: 03e2adc990

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
(cherry picked from commit e58dcc47c3)
2020-05-04 10:46:04 -07:00
Dylan Baker
1ce401ffef radeonsi: Retab si_get.c
This was done on master, and is making applying backports awful.
2020-05-04 10:38:04 -07:00
Jason Ekstrand
ac5cb6a66c vulkan: Allow destroying NULL debug report callbacks
Fixes: 086cfa5652 "anv: implementation of VK_EXT_debug_report extension"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4690>
(cherry picked from commit 9d10bde5a8)
2020-05-04 10:21:09 -07:00
Tapani Pälli
52f1498321 st/mesa: destroy only own program variants when program is released
Earlier commit tried to achieve this but actually did more. This makes
sure the variants for other contexts continue to live.

Fixes: de3d7dbed5 ("mesa/st: release variants for active programs before unref")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2865
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4831>
(cherry picked from commit 46b3cb011f)
2020-05-04 10:21:07 -07:00
Pierre-Eric Pelloux-Prayer
52ad2d9bb2 radeonsi: fix export count
Fixes: 17acff01a0 ("radeonsi: skip vs output optimizations for some outputs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2877
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4871>
(cherry picked from commit 7e7bb38bd8)
2020-05-04 10:21:06 -07:00
Bas Nieuwenhuizen
62b2fcdbb8 radv: Extend tiling flags to 64-bit.
SCANOUT is bit 63 ....

Fixes: bfd9e7ff24 "radv: Use new scanout gfx9 metadata flag."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2879
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4859>
(cherry picked from commit df9629e593)
2020-05-04 10:21:05 -07:00
D Scott Phillips
7b7a921c1c anv,iris: Fix input vertex max for tcs on gen12
gen12 does away with the single patch dispatch mode for tcs, and
increases some limits so that 8_patch mode can always work. Make the
necessary changes so we don't try to fall back to single patch mode.

Fixes KHR-GL46.tessellation_shader.single.max_patch_vertices and others

Fixes: 44754279ac ("intel/fs/gen12: Use TCS 8_PATCH mode.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4843>
(cherry picked from commit 65b05ebdda)
2020-05-04 10:21:04 -07:00
D Scott Phillips
2ddc07ee08 intel/fs: Update location of Render Target Array Index for gen12
Render Target Array Index has moved from R0.0[26:16] to
R1.1[26:16] on gen12.

Fixes dEQP-VK.multiview.input_attachments.*

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4836>
(cherry picked from commit 7bd15135a6)
2020-05-04 10:21:03 -07:00
Dylan Baker
606286b349 .pick_status.json: Update to b97cc41aa2 2020-05-04 10:20:56 -07:00
Dylan Baker
fa1dd377e4 .pick_status.json: Mark 3fac55ce0d as denominated 2020-04-30 09:54:07 -07:00
Dylan Baker
1213e7a099 .pick_status.json: Update to 3fac55ce0d 2020-04-30 09:52:36 -07:00
Jason Ekstrand
90934659dc intel/fs: Don't delete coalesced MOVs if they have a cmod
Shader-db results on ICL:

    total instructions in shared programs: 17133088 -> 17133287 (<.01%)
    instructions in affected programs: 61300 -> 61499 (0.32%)
    helped: 0
    HURT: 199

This means it's likely fixing 199 bugs. :-)  All the changed shaders are
in Mad Max.  It's surprisingly difficult to get the back-end compiler to
generate a pattern that hits this we don't tend to emit a lot coalescable
MOVs.  The pattern in Mad Max that's able to hit is fsign(fsat(x)) under
the right conditions.

Closes: #2820
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4773>
(cherry picked from commit e581ddeeee)
2020-04-29 16:12:07 -07:00
Marek Olšák
a0c046f5e7 mesa: report GL_INVALID_OPERATION for invalid glTextureBuffer target
This fixes:
    KHR-GL46.direct_state_access.textures_buffer_errors
    KHR-GL46.direct_state_access.textures_buffer_range_errors

Fixes: 98e64e538a - main: Added entry point for glTextureBuffer

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4759>
(cherry picked from commit a2542deb63)
2020-04-29 16:12:07 -07:00
Jason Ekstrand
60c2ae0240 nir/copy_prop_vars: Report progress when deleting self-copies
Fixes: 62332d139c "nir: Add a local variable-based copy prop..."

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4767>
(cherry picked from commit ed67717167)
2020-04-29 16:12:07 -07:00
Dylan Baker
b7c3e67ab1 .pick_status.json: Update to 2efa76f795 2020-04-29 16:12:07 -07:00
Dylan Baker
e0cc731986 docs: Add SHA256 sums for 20.0.6 2020-04-29 16:10:57 -07:00
Dylan Baker
4c59d9944a VERSION: bump to 20.0.6 2020-04-29 15:35:17 -07:00
Dylan Baker
dda9d5346b docs: Add release notes for 20.0.6 2020-04-29 15:33:42 -07:00
Jason Ekstrand
4072515d57 anv: Expose CS workgroup sizes based on a maximum of 64 threads
Otherwise, we'll hit asserts in brw_compile_cs.

Fixes: cf12faef61 "intel/compiler: Restrict cs_threads to 64"
Closes: #2835
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4746>
(cherry picked from commit 81ac741f89)
2020-04-28 12:01:49 -07:00
Jason Ekstrand
002f718dfa intel/devinfo: Compute the correct L3$ size for Gen12
Fixes: 8125d7960b "intel/dev: Add preliminary device info for Tigerlake"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4782>
(cherry picked from commit 86f67952d3)
2020-04-28 12:01:48 -07:00
Bas Nieuwenhuizen
b7ef444719 radv: Use actual memory type count for setting app-visible bitset.
Otherwise we might make a bitset that is too large.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4751>
(cherry picked from commit 4a8d172d3f)
2020-04-28 12:01:47 -07:00
Eric Anholt
8191b91695 freedreno: Fix calculation of the const buffer cmdstream size.
The HW packet requires padding the number of pointers you emit, and we
would assertion fail about running out of buffer space if the number of
UBOs to be uploaded was odd.

Fixes: b4df115d3f ("freedreno/a6xx: pre-calculate userconst stateobj size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4621>
(cherry picked from commit 69c8dfd49f)
2020-04-28 12:01:47 -07:00
Dylan Baker
660e650dbc .pick_status.json: Update to 6b551d9f36 2020-04-28 12:01:45 -07:00
Danylo Piliaiev
373c5360e4 st/mesa: Treat vertex inputs absent in inputMapping as zero in mesa_to_tgsi
After updating vertex inputs being read based on optimized NIR, they may go out
of sync with inputs in mesa IR. Which is translated to TGSI and used together
with NIR if draw doesn't have llvm.

It's much easier to treat such inputs as zero because there is no pass to
entirely get rid of them and they don't contribute to shader's output.

Fixes: d684fb37bf
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2815
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4705>
(cherry picked from commit eeab9c93db)
2020-04-27 10:33:14 -07:00
Jason Ekstrand
109caeeb43 nir/lower_subgroups: Mask off unused bits in ballot ops
Thanks to VK_EXT_subgroup_size_control, we can end up with
gl_SubgroupSize being as low as 8 on Intel.

Fixes: d10de25309 "anv: Implement VK_EXT_subgroup_size_control"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4694>
(cherry picked from commit fdf9b674ee)
2020-04-27 10:33:13 -07:00
Jason Ekstrand
74e0db6171 anv: Drop an assert
Ever since Vulkan 1.2, this feature has been in core so enabling the
extension is no longer required.

Fixes: 4ef3f7e3d3 "anv: Enable Vulkan 1.2 support"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4694>
(cherry picked from commit 9c009da208)
2020-04-27 10:33:12 -07:00
Jason Ekstrand
2b01692b9e spirv: Fix passing combined image/samplers through function calls
Fixes dEQP-VK.spirv_assembly.instruction.function_params.sampler_param

cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4684>
(cherry picked from commit bc5c438289)
2020-04-27 10:33:11 -07:00
Jason Ekstrand
f1ccead5b8 nir/opt_deref: Remove certain sampler type casts
The SPIR-V parser sometimes generates casts from specific sampler types
like sampler2D to the bare sampler type.  This results in a cast which
causes heartburn for drivers but is harmless to remove.

cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4684>
(cherry picked from commit a1a08a5802)
2020-04-27 10:33:11 -07:00
Jason Ekstrand
081efbbc79 turnip: Properly handle all sizes of specialization constants
cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
(cherry picked from commit 6211e79ba5)
2020-04-27 10:33:10 -07:00
Jason Ekstrand
844b382806 radv: Properly handle all sizes of specialization constants
cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
(cherry picked from commit a4885df9f8)
2020-04-27 10:33:09 -07:00
Jason Ekstrand
ca9452e34c anv: Properly handle all sizes of specialization constants
Closes: #2812
cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
(cherry picked from commit a44e63398b)
2020-04-27 10:33:08 -07:00
Jason Ekstrand
cacb0c0268 spirv: Allow constants and NULLs in SpvOpConvertUToPtr
We were accidentally asserting that the value had to be a vtn_ssa_value
which isn't true if it, for instance, comes from a spec constant.

Fixes: fb282a68bc "spirv: Implement OpConvertPtrToU and OpConvertUToPtr"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
(cherry picked from commit 64e4297629)
2020-04-27 10:33:07 -07:00
Quentin Glidic
a2776c24c7 meson: Use dependency.partial_dependency()
It avoids calling pkg-config which was searched for in a wrong way, thus
breaking setup where unprefixed pkg-config was banned (e.g. on Exherbo).

Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
Fixes: 53f9131205
       ("meson: fix getting cflags from pkg-config")

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4701>
(cherry picked from commit 00f5ea9fdc)
2020-04-27 10:33:06 -07:00
Dylan Baker
5cb1414707 .pick_status.json: Update to 42b1696ef6 2020-04-27 10:33:04 -07:00
Lionel Landwerlin
ea37c93a6b intel/perf: Enable MDAPI queries for Gen12
We're missing the cases for gen12 leading to those metrics going
missing.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15b7b56eb2 ("intel/perf: add TGL support")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4586>
(cherry picked from commit 086ea1ac7e)
2020-04-23 09:37:03 -07:00
Lionel Landwerlin
96750187c1 intel/perf: move mdapi query definitions to their own file
Where they belong.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
(cherry picked from commit dde96d31b7)
2020-04-23 09:37:01 -07:00
Lionel Landwerlin
265c4537ab intel/perf: break GL query stuff away
This stuff is somewhat specific to the GL extension & drivers. On
Vulkan we won't use this, it also made a rather large file.

v2: Fix Android build (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
(cherry picked from commit 33b9c7a7f6)
2020-04-23 09:36:41 -07:00
Lionel Landwerlin
c4b110d33b intel/perf: move register definition to special file
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
(cherry picked from commit f5c5574f42)
2020-04-23 09:36:29 -07:00
Erik Faye-Lund
b754deecb0 meson: correct windows-version define
The macro "_WINVER" does nothing, the macro definitions that matter for
windows API version selection are "_WIN32_WINNT" and "WINVER".

The header "sdkddkver.h" (which is included from thousands of
different windows-headers) defines "WINVER" to the same value as
"_WIN32_WINNT" of only the latter is defined, which explains why this
works right now. But we shouldn't depend on that kind of luck, and
instead define the right maco.

Fixes: 3aee462781 ("meson: add windows compiler checks and libraries")
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4681>
(cherry picked from commit 7f17a0a809)
2020-04-23 09:27:44 -07:00
Joshua Ashton
86e586862f radv: Use TRUNC_COORD on samplers
The default behaviour (0) is: "round-nearest-even to n.6 and drop fraction when point sampling" whereas the Vulkan spec simply wants us to floor it (1) "truncate when point sampling".

See 15.6.1 in the Vulkan spec.
https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#textures-normalized-operations

The Direct3D spec also mandates this (https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#7.18.7%20Point%20Sample%20Addressing)

This fixes some point-sampling texture precision issues in some Direct3D 9 titles such as Guild Wars 2 and htoL#NiQ: The Firefly Diary that are not present on other vendors.

Fixes dEQP-VK.pipeline.sampler.exact_sampling.*

https://github.com/Joshua-Ashton/d9vk/issues/450
https://github.com/doitsujin/dxvk/issues/1433

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3951>
(cherry picked from commit 58f25098a0)
2020-04-23 09:27:42 -07:00
Samuel Pitoiset
a2f0ee9664 radv: make sure to export the viewport index if FS needs it
If FS reads gl_ViewportIndex but VS doesn't export it, it should
be zero to avoid reading garbage.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2818
Fixes: b424d49ac0 ("radv/llvm: fix exporting the viewport index if the fragment shader needs it")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4687>
(cherry picked from commit 7086b38c81)
2020-04-23 09:27:42 -07:00
Dylan Baker
3a8c5e2bc1 .pick_status.json: Update to efdb7fa9a8 2020-04-23 09:27:39 -07:00
Dylan Baker
dab1ccedd2 .pick_status.json: Update to 51c1c4d95a 2020-04-22 22:09:38 -07:00
Pierre-Eric Pelloux-Prayer
d160bd3cf0 radeonsi: skip vs output optimizations for some outputs
If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so
we can't apply the vs output optim.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747
Fixes: 3ec9975555 ("radeonsi: eliminate trivial constant VS outputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559>
(cherry picked from commit 17acff01a0)
2020-04-22 22:09:20 -07:00
Jason Ekstrand
dde2cac42a anv: Apply any needed PIPE_CONTROLs before emitting state
Push constants in particular can get picked up by the hardware at weird
times that happen *before* 3DPRIMITIVE.  Therefore, we need to flush
before we emit all our state to ensure that any data they may pick up is
in memory in time.  This fixes an app which does vkCmdCopyBuffers
immediately followed by a vkCmdBeginRenderPass and vkCmdDraw which uses
the destination of the copy as a UBO which we push.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4601>
(cherry picked from commit 969aeb6a93)
2020-04-22 22:00:45 -07:00
Jason Ekstrand
928580a13d anv: Move vb_emit setup closer to where it's used in flush_state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4601>
(cherry picked from commit ffc84eac0d)
2020-04-22 22:00:40 -07:00
Dylan Baker
02a153fbfa .pick_status.json: Update to 51c1c4d95a 2020-04-22 21:42:01 -07:00
Dylan Baker
a251177ea0 .pick_status.json: Mark 0123b8f634 as denominated 2020-04-22 15:03:25 -07:00
Danylo Piliaiev
cde6f3c122 spirv: Expand workaround for OpControlBarrier on old GLSLang
In SPIRV of compute shader in Aztec Ruins benchmark there is:

OpControlBarrier %uint_1 %uint_1 %uint_0
// ControlBarrier(Device, Device, rdcspv::MemorySemantics(0));

which is an incorrect translation of glsl barrier().

GLSLang, prior to c3f1cdfa, emitted the OpControlBarrier with
Device instead of Workgroup for execution scope.

2365520c covers similar case but isn't applied when execution_scope
is SpvScopeDevice.

Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2742
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4660>
(cherry picked from commit 66229aa169)
2020-04-22 15:03:19 -07:00
Lionel Landwerlin
3b4a9206f4 iris: fail screen creation when kernel support is not there
v2: Bump check to I915_PARAM_HAS_CONTEXT_ISOLATION (v4.16) (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2803
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4643>
(cherry picked from commit f402b7c576)
2020-04-22 15:03:17 -07:00
Erik Faye-Lund
d0a7c5b2e0 mesa/gallium: do not use enum for bit-allocated member
The signedness of enums are undefined, so on platforms with signed
enums, this isn't going to work. One such platform is Microsoft Windows.

So let's just use an unsigned here instead.

Fixes: b1c4c4c7f5 ("mesa/gallium: automatically lower alpha-testing")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4648>
(cherry picked from commit 013d9e40fe)
2020-04-22 15:03:16 -07:00
Dylan Baker
8d13ea1571 meson: update llvm dependency logic for meson 0.54.0
In meson 0.54.0 I fixed the llvm cmake dependency to return "not found"
if shared linking is requested. This means that for 0.54.0 and later we
don't need to do anything, and for earlier versions we only need to
change the logic to force the config-tool method if shared linking is
required.

Fixes: 821cf6942a
       ("meson: Use cmake to find LLVM when building for window")

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4556>
(cherry picked from commit fdd0ce12ac)
2020-04-22 15:03:12 -07:00
Abhishek Kumar
dd7fdda487 anv/android: fix assert in anv_import_ahw_memory
Commit fixes assert that triggers when running
   dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.buffer#bind_export_import_bind

on a debug build of Mesa.

Fixes: c79a528d ("anv/android: support import/export of AHardwareBuffer objects")
Signed-off-by: Abhishek Kumar <abhishek4.kumar@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4655>
(cherry picked from commit f06e4ab319)
2020-04-22 15:03:11 -07:00
Danylo Piliaiev
5ac0076928 st/mesa: Re-assign vs in locations after updating nir info for ffvp/ARB_vp
After call to nir_shader_gather_info - inputs_read may have changed so
st_nir_assign_vs_in_locations should be called for shader to remain in
sync with vbo state.

Fixes piglit tests:
  gl-1.0-fpexceptions
  gl-1.1-color-material-unused-normal-array
  arb_vertex_program-unused-attributes
regression on several gallium drivers.

Fixes: d684fb37bf
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4645>
(cherry picked from commit 829013d0ca)
2020-04-22 15:03:10 -07:00
Dylan Baker
98dd416312 .pick_status.json: Update to c552b5fd1d 2020-04-22 15:02:59 -07:00
Dylan Baker
887e6442d2 docs: Add sha256 sums for 20.0.5 2020-04-22 14:55:34 -07:00
Dylan Baker
728cf6631f VERSION: bump for 20.0.5 2020-04-22 14:35:13 -07:00
Dylan Baker
cb07bd313c docs: Add relnotes for 20.0.5 2020-04-22 14:34:51 -07:00
Jason Ekstrand
4dfb9d9fce anv: Report correct SLM size
Fixes: d787a2d0 "anv: Implement VK_KHR_pipeline_executable_properties"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4597>
(cherry picked from commit b8acf9a3d4)
2020-04-20 10:03:00 -07:00
Jason Ekstrand
e16cb98ce2 intel: Add _const versions of prog_data cast helpers
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4597>
(cherry picked from commit e003104605)
2020-04-20 10:02:56 -07:00
Arcady Goldmints-Orlov
5bbf4cc54e nir: Lower returns correctly inside nested loops
Inside nested flow control, nir_lower_returns inserts predicated breaks
in the outer block. However, it would omit doing this if the remainder
of the outer block (after the inner block) was empty. This is not
correct in the case of loops, as execution just wraps back around to the
start of the loop, so this change doesn't skip the predication inside
loops.

Fixes: 79dec93ead (nir: Add return lowering pass)
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2724

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4603>
(cherry picked from commit ec1b96fdc8)
2020-04-20 10:01:27 -07:00
Lionel Landwerlin
eb39c5fb4f util/sparse_free_list: manipulate node pointers using atomic primitives
Probably doesn't fix anything but those should be accessed in an
atomic way just like the head pointer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e4f01eca3b ("util: Add a free list structure for use with util_sparse_array")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4613>
(cherry picked from commit cdc4377591)
2020-04-20 10:01:27 -07:00
Jason Ekstrand
29200718b5 spirv: Handle OOB vector extract operations
We use vtn_vector_extract to handle vector component level derefs.  This
makes us gracefully handle the case where your vector component is OOB
and give you an undef.  The SPIR-V working group is still working out
whether or not this is technically legal but it's very little code for
us to handle it so we may as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4495>
(cherry picked from commit 380bf556bf)
2020-04-20 10:01:27 -07:00
D Scott Phillips
5ef9dccff3 util/sparse_array: don't stomp head's counter on pop operations
By temporarily storing the new_head by a uint32_t, we wipe out the
counter section of the head pointer.

Fixes: e4f01eca ("util: Add a free list structure for use with util_sparse_array")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4612>
(cherry picked from commit dc3a17997b)
2020-04-20 10:01:27 -07:00
Danylo Piliaiev
76dbcb1f5e st/mesa: Update shader info of ffvp/ARB_vp after translation to NIR
We must update stp->Base.info after translation and before
st_prepare_vertex_program is called, because inputs_read
may become outdated after NIR optimization passes.

For ffvp/ARB_vp inputs_read is populated based on declared
attributes without taking their usage into consideration.
When creating shader variants we expect that their inputs_read
would match the base ones for input mapping to work properly.

Cc: <mesa-stable@lists.freedesktop.org>
Fixes: 8a0dd0af3f
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2758
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4598>
(cherry picked from commit d684fb37bf)
2020-04-20 10:01:27 -07:00
Samuel Pitoiset
9d052b2534 aco: fix exporting the viewport index if the fragment shader needs it
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.

Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
(cherry picked from commit c4ca9e66dd)
2020-04-20 10:01:27 -07:00
Samuel Pitoiset
5cd2b945df radv/llvm: fix exporting the viewport index if the fragment shader needs it
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.

Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
(cherry picked from commit b424d49ac0)
2020-04-20 10:01:27 -07:00
Marek Olšák
feb2f12db8 st/mesa: fix a crash due to passing a draw vertex shader into the driver
Fixes: bc99b22a30
Closes: #2754

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4527>
(cherry picked from commit 80797edd71)
2020-04-20 09:39:39 -07:00
Tapani Pälli
c447fcbcbe mesa/st: initialize all winsys_handle fields for memory objects
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reported-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4547>
(cherry picked from commit a934c8e7ed)
2020-04-20 09:39:39 -07:00
Samuel Pitoiset
d13c6c251e radv: do not abort with unknown/unimplemented descriptor types
To workaround a crash with Wolfeinstein Younglood because the
games creates one descriptor with
VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV...

I reported the problem to Machine Games, but still no answer, so
let's remove the unreachable calls (which are technically not
unreachable for buggy apps) to help gamers.

Note that AMDVLK and AMDGPU-PRO don't crash because they ignore
unsupported descriptor types.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4571>
(cherry picked from commit 35b3963928)
2020-04-20 09:39:38 -07:00
Karol Herbst
ded9c5c490 Revert "nvc0: fix line width on GM20x+"
This reverts commit a0e57432b7.

It's unclear what caused the test to fail back then. Now it's seems to be
reversed. I tested with a close enough piglit and mesa branch and wasn't
able to reproduce the same test result I've got in some older piglit runs.

Fixes:
dEQP-GLES2.functional.rasterization.primitives.lines_wide
dEQP-GLES2.functional.rasterization.primitives.line_strip_wide
dEQP-GLES2.functional.rasterization.primitives.line_loop_wide
dEQP-GLES2.functional.rasterization.limits.points
dEQP-GLES2.functional.clipping.line.wide_line_z_clip
dEQP-GLES2.functional.clipping.line.wide_line_z_clip_viewport_center
dEQP-GLES2.functional.clipping.line.wide_line_z_clip_viewport_corner
dEQP-GLES2.functional.clipping.line.wide_line_clip
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
dEQP-GLES2.functional.clipping.line.wide_line_attrib_clip
dEQP-GLES2.functional.polygon_offset.default_result_depth_clamp
dEQP-GLES2.functional.polygon_offset.default_factor_1_slope
dEQP-GLES2.functional.polygon_offset.fixed16_result_depth_clamp
dEQP-GLES2.functional.polygon_offset.fixed16_factor_1_slope

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4575>
(cherry picked from commit 2d489f76f4)
2020-04-20 09:39:37 -07:00
Dave Airlie
2f5509c44e llvmpipe/nir: free the nir shader
Fixes: 18f896e55d (llvmpipe: add initial nir support)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4563>
(cherry picked from commit befe2ff3a6)
2020-04-20 09:39:36 -07:00
Dave Airlie
b1f087965f draw: free the NIR IR.
Not sure how I missed this, the ownership was a bit blurry,
free the NIR.

Fixes: bf12bc2dd7 (draw: add nir info gathering and building support)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4563>
(cherry picked from commit f01c0565bb)
2020-04-20 09:39:36 -07:00
Dylan Baker
01844f40af .pick_status.json: Update to c3c1f4d6bc 2020-04-20 09:38:28 -07:00
Karol Herbst
b5b56d89a1 clover: fix build with single library clang build
Closes: #2560
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4417>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4417>
(cherry picked from commit ff1a3a00cb)
2020-04-19 15:39:30 +02:00
Emil Velikov
fc140276b4 glx: omit loader_loader() for macOS
Earlier commit added the code unconditionally, since the loader code
itself is already built on macOS.
Although it did not consider the #include mayhem that src/glx is.

In particular, none of the __GLXDRI{screen,context,drawable) are
available for macOS... those are pulled by dri_common.[ch].

Ideally we'll untangle that, but for the time being simply #ifdef out
the include/call.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2726
Fixes: b699d070a6 ("glx: set the loader_logger early and for everyone")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4490>
(cherry picked from commit 22406da756)
2020-04-15 11:05:48 -07:00
Rhys Perry
88f89f7a10 aco: fix 1D textureGrad() on GFX9
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6f718edced ('aco: simplify gathering of MIMG address components')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4550>
(cherry picked from commit c818b5c089)
2020-04-15 11:05:47 -07:00
Lionel Landwerlin
8e81105741 iris: drop cache coherent cpu mapping for external BO
We have to assume any external buffer could be used by the display HW.
In the case that buffer is also CPU mapped, we want to assume no cache
coherency as it is only available between GT & CPU, not display.

Many thanks to Michel Dänzer for the hint!

v2: Move cache coherent drop to bufmgr (Chris)

v3: Also make BO external if created with PIPE_BIND_SHARED (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2552
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4533>
(cherry picked from commit 8ce46f352e)
2020-04-15 11:05:46 -07:00
Vinson Lee
721648e2a3 swr: Remove Byte Order Mark.
before:
$ file src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py
src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py: Python script text executable, UTF-8 Unicode (with BOM) text

after:
$ file src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py
src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py: Python script text executable, ASCII text

This patch also fixes this build error.

  File "src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_types.py", line 1

    # Copyright (C) 2014-2018 Intel Corporation.   All Rights Reserved.

    ^

SyntaxError: invalid character in identifier

Fixes: c6e67f5a93 ("gallium/swr: add OpenSWR rasterizer")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4221>
(cherry picked from commit 68b40cfe27)
2020-04-15 11:05:45 -07:00
Dylan Baker
8be0ceb13c .pick_status.json: Update to 13ce637f1b 2020-04-15 11:05:43 -07:00
pal1000
c71cf85615 scons/windows: Support build with LLVM 10.
LLVM engine component contains 2 new libraries, LLVMCFGuard and LLVMTextAPI.
Without them build fails like in this run: https://ci.appveyor.com/project/Alexpux/mingw-packages/builds/32102425/job/wkmb4gimfqkkb3cg

Cc: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4521>
(cherry picked from commit f0b94262c1)
2020-04-14 10:43:40 -07:00
Tobias Jakobi
38e0d199cf meson: Link Gallium Nine with ld_args_build_id
This fixes an assertion in iris_disk_cache_init() when the initialization
goes through drm_create_adapter(), which lives in d3dadapter9.so.
In this case build_id_find_nhdr_for_addr() fails and returns NULL, since
the shared library does not include a build ID.

The issue can be reproduced with an iris capable GPU and Xnine, while
removing the shader cache prior to launching the application.

Fix this by doing the same as in 29ea92e6a1.

Fixes: 4756864cdc "iris: Start wiring up on-disk shader cache"
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4499>
(cherry picked from commit c38946e62d)
2020-04-14 10:43:39 -07:00
Matt Turner
3323a30266 meson: Specify the maximum required libdrm in dri.pc
When dealing with a regression in libdrm-2.4.101, I masked the package
in Gentoo. In doing so, we discovered that Mesa's dri.pc specifies a
version requirement in dri.pc for >= the version of libdrm Mesa was
built against, thus preventing packages from being rebuilt with the
older version of libdrm installed.

Let's reduce this version requirement to the latest libdrm required by
Mesa instead, since libdrm is backward compatible.

Fixes: a3a16d4aa7 ("meson: use dep_libdrm version for pkg-config")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4534>
(cherry picked from commit e4268ffb99)
2020-04-14 10:43:37 -07:00
Dylan Baker
2111ccfbd9 .pick_status.json: Update to acf7e73be5 2020-04-14 10:43:34 -07:00
Neil Armstrong
7f2539dcc0 gitlab-ci: re-enable mali400/450 and t820 jobs
The FILES_HOST_NAME and FILES_HOST_URL are in the baylibre's runner
environment to make it more flexible.

Also use the new aarch64 mesa-ci-aarch64-lava-baylibre runner with
embedded nginx server to serve the LAVA artifacts.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4295>
(cherry picked from commit 51831537a2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4462>
2020-04-13 21:46:40 +00:00
Neil Armstrong
b637994b2c gitlab-ci: add FILES_HOST_URL and move FILES_HOST_NAME into jobs
The FILES_HOST_URL & FILES_HOST_NAME will be in the Baylibre's runner
environment, move them into the t860/t720/t760 jobs using Collabora's
runner.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
(cherry picked from commit 842f13d8f8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4462>
2020-04-13 21:46:40 +00:00
Tomeu Vizoso
5cbe73152d gitlab-ci: Serve files for LAVA via separate service
Currently, we store the kernel and ramdisk for each LAVA job in the
artifacts of the job that built them. Because artifacts are stored in
GCE and LAVA labs aren't, this causes a lot of egress with is expensive.

To avoid this, have runners download most of the data via the (cached)
container images once, and for each job upload the kernel and ramdisk to
a server outside GCE.

Right now we only have Collabora's runner with a local web server, so
jobs that go to Baylibre's lab have been disabled.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b123849880)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4462>
2020-04-13 21:46:40 +00:00
Tomeu Vizoso
78a1deb958 gitlab-ci: Place files from the Mesa repo into the build tarball
There's some files from the .gitlab-ci directory that are needed in the
test stage and that, because the Mesa repository isn't checked out in
that stage, need to be made available through other means.

Because those files are going to be needed in LAVA devices, place them
ino the tarball containing the built files so it's available to both
gitlab-ci runners and LAVA devices.

Before those files were passed in the artifacts of the Gitlab CI job,
but this commit places them into the built tarball so scripts later in
the pipeline don't need to account for this discrepancy.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 92f3c51560)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4462>
2020-04-13 21:46:40 +00:00
Eric Anholt
9d50c53e47 ci: Remove LLVM from ARM test drivers.
The LLVM libraries were a significant fraction of the entire payload
(55M/250M uncompressed) into the initramfs of the test boards, but
LLVM is only used for the draw module used in select/feedback (which
isn't even tested in CI on ARM yet).

Assume that llvmpipe draw is safe enough for ARM given the coverage on
x86, and disable LLVM for these jobs.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
(cherry picked from commit 257415863b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4462>
2020-04-13 21:46:40 +00:00
Rohan Garg
8980c58000 ci: Split out radv build-testing on arm64
radv needs libllvm which increases our ramdisk size
significantly. Since this driver is only build tested,
we can split it out into a separate job.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
(cherry picked from commit 9c0bbba856)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4462>
2020-04-13 21:46:39 +00:00
Tapani Pälli
8d1baa2854 glsl: stop processing function parameters if error happened
Fixes: d1fa69ed61 ("glsl: do not attempt assignment if operand type not parsed correctly")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2696
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4341>
(cherry picked from commit 53e4159eaa)
2020-04-13 10:09:16 -07:00
Lionel Landwerlin
6ae8c93bf3 i965: share buffer managers across screens
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1373
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4086>
(cherry picked from commit 4094558e86)
2020-04-13 10:09:15 -07:00
Lionel Landwerlin
13a79c1522 i965: store DRM fd on intel_screen
v2: Fix storing of drm fd (Ajax)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1373
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4086>
(cherry picked from commit 865b840a6b)
2020-04-13 10:09:14 -07:00
Lionel Landwerlin
b5a8a2aee6 iris: make resources take a ref on the screen object
Because St creates resources from a screen and attach them onto
another we need to ensure the resources associated to a screen &
bufmgr stay around until we don't need them anymore.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1373
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4086>
(cherry picked from commit 0a497eb130)
2020-04-13 10:09:14 -07:00
Lionel Landwerlin
01bd048138 iris: share buffer managers accross screens
St happilly uses pipe_resources created with one screen with other
screens. Unfortunately our resources have a single identifier that
related to a given screen and its associated DRM file descriptor.

To workaround this, let's share the buffer manager between screens for
a given DRM device. That way handles are always valid.

v2: Don't forget to close the fd that bufmgr now owns
    Take a copy of the fd to ensure it stays alive even if the dri
    layer closes it

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1373
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4086>
(cherry picked from commit 7557f16059)
2020-04-13 10:09:13 -07:00
Lionel Landwerlin
4bf1d033a6 iris: properly free resources on BO allocation failure
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4086>
(cherry picked from commit bd3e505453)
2020-04-13 10:09:12 -07:00
Dylan Baker
e03687984e .pick_status.json: Update to 28d36d26c2 2020-04-13 10:09:11 -07:00
Jose Maria Casanova Crespo
9a0e6d6490 v3d: Primitive Counts Feedback needs an extra 32-bit padding.
Store Primitive Counts operations write 7 counters in 32-bit words
but also a padding 32-bit with 0. So we need 8 32-bit words instead
of the current 7 allocated.

This was causing an corruption in the next buffer when Transform
Feedback was enabled that were exposed on tests like:
dEQP-GLES3.functional.transform_feedback.*.points.*

This patch fixes 196 tests that were failing when they were run isolated
but they were passing when run using cts-runner.

Fixes: 0f2d1dfe65 ("v3d: use the GPU to record primitives written to transform feedback")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2674
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4501>
(cherry picked from commit b06fdb8edd)
2020-04-10 10:09:14 -07:00
Vinson Lee
167e434433 swr/rasterizer: Use private functions for min/max to avoid namespace issues.
This is a similiar fix as bb2287ccdf ("gallivm/tessellator: use
private functions for min/max to avoid namespace issues").

Fixes: ab55708200 ("swr/rasterizer: Add tessellator implementation to the rasterizer")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
Tested-by: Jan Zielinski <jan.zielinski@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4208>
(cherry picked from commit 0536ca20d7)
2020-04-10 10:09:14 -07:00
Dylan Baker
2de66086f6 .pick_status.json: Update to 65e2eaa4d3 2020-04-10 10:09:12 -07:00
Bas Nieuwenhuizen
019bb88019 radv: Use correct buffer count with variable descriptor set sizes.
Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.iublimitlow.frag.ialimitlow.0

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2607
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4489>
(cherry picked from commit a7e2efa7c9)
2020-04-09 14:23:55 -07:00
Bas Nieuwenhuizen
abdb320af4 radv: Consider maximum sample distances for entire grid.
The other pixels in the grid might have samples with a larger
distance than the (0,0) pixel.

Fixes dEQP-VK.pipeline.multisample.sample_locations_ext.verify_location.samples_8_packed
when CTS is compiled with clang.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4480>
(cherry picked from commit a3682670c8)
2020-04-09 14:23:54 -07:00
Timothy Arceri
26dfb62c25 radeonsi: don't lower constant arrays to uniforms in GLSL IR 2020-04-09 14:23:11 -07:00
Bas Nieuwenhuizen
d15f45ca31 radv: Store 64-bit availability bools if requested.
Fixes dEQP-VK.query_pool.*.reset_before_copy.* on RAVEN.

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2296
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4334>
(cherry picked from commit 940ed5078d)
2020-04-09 14:16:57 -07:00
Hyunjun Ko
d4ef9d6ac0 nir: fix wrong assignment to buffer in xfb_varyings_info
Tested with dEQP-VK.transform_feedback.fuzz.various_buffers.buffers100_instance_array_vertex

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4459>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4459>
(cherry picked from commit 9f174eb2df)
2020-04-09 14:16:57 -07:00
Tapani Pälli
de3d7dbed5 mesa/st: release variants for active programs before unref
Programs can be shared among many contexts and each program holds a
variant list which has context specific variants. When context gets
destroyed it must make sure it relases all variants, otherwise remaining
context that utilizes same program will attempt to save a zombie shader
for already deleted context when releasing program and its variants.

Fixes:
   dEQP-EGL.functional.sharing.gles2.program.render

and other flaky multihread dEQP-EGL failures.

v2: pass program pointer via & (Marek)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4386>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4386>
(cherry picked from commit 84e845c969)
2020-04-09 14:16:56 -07:00
Tapani Pälli
1fdf425560 mesa/st: unbind shader state before deleting it
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4386>
(cherry picked from commit 4822cc9700)
2020-04-09 14:16:56 -07:00
Rob Clark
3d5d4ee8e6 nir: fix definition of imadsh_mix16 for vectors
Fixes: c27b3758fa ("nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>
(cherry picked from commit bf64648864)
2020-04-09 14:16:55 -07:00
Jason Ekstrand
e70e9d4e78 nir/load_store_vectorize: Fix shared atomic info
These were clearly copied and pasted from SSBOs.  The shared atomics
don't have an SSBO index so their offset is src0 and data is src1.

Fixes: ce9205c03b "nir: add a load/store vectorization pass"
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
(cherry picked from commit 04d08ea149)
2020-04-09 14:16:54 -07:00
Ilia Mirkin
8afe06115f nv50: don't try to upload MSAA settings for BUFFER textures
We need the MSAA scaling parameters to properly fetch samples from MSAA
textures. These are stored in the miptree which wraps all regular
textures. However it does not wrap buffer textures, so make sure to skip
them rather than accessing out-of-bounds or unmapped memory.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2727
Fixes: 3bd40073b9 ("nv50: add support for texelFetch'ing MS textures")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4424>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4424>
(cherry picked from commit 1288ac7632)
2020-04-09 14:16:53 -07:00
Daniel Stone
143c99ef4b EGL: Add eglSetDamageRegionKHR to GLVND dispatch list
This was missed in the original conversion, which added support for
eglSetDamageRegionKHR to local EGL exports, but forgot to generate
updated dispatch for GLVND.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Fixes: 9827547313 ("egl/android: support for EGL_KHR_partial_update")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4403>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4403>
(cherry picked from commit bfb9c08e5c)
2020-04-09 14:16:52 -07:00
Samuel Pitoiset
7f8b89e030 radv/llvm: enable 16-bit storage features on GFX6-GFX7
Should allow to play Doom Eternal on GFX6-GFX7 because the
driver now supports storageBuffer16BitAccess.

It's now supported and all CTS tests pass.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/857
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
(cherry picked from commit 655e8449d0)
2020-04-09 14:16:50 -07:00
Samuel Pitoiset
053e5a2c13 ac/nir: split 16-bit SSBO stores on GFX6
Due to possible alignment issues, make sure to split stores of
16-bit vectors.

Doom Eternal requires storageBuffer16BitAccess.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
(cherry picked from commit 3cd5450df5)
2020-04-09 14:16:46 -07:00
Samuel Pitoiset
fbef99b5a6 ac/nir: split 16-bit load/store to global memory on GFX6
Due to possible alignment issues, make sure to split loads/stores
of 16-bit vectors.

Doom Eternal requires storageBuffer16BitAccess.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
(cherry picked from commit 55fdcc03de)
2020-04-09 14:16:46 -07:00
Samuel Pitoiset
820c636a06 radv/llvm: enable 8-bit storage features on GFX6-GFX7
It's now supported and all CTS tests pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
(cherry picked from commit 7308f2e912)
2020-04-09 14:16:44 -07:00
Samuel Pitoiset
a0e857c768 ac/nir: split 8-bit SSBO stores on GFX6
Due to possible alignment issues, make sure to split stores of
8-bit vectors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
(cherry picked from commit c6bf1597d1)
2020-04-09 14:16:23 -07:00
Samuel Pitoiset
87f1e7b1d8 ac/nir: split 8-bit load/store to global memory on GFX6
Due to possible alignment issues, make sure to split loads/stores
of 8-bit vectors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
(cherry picked from commit 433f3380eb)
2020-04-09 14:16:14 -07:00
Jason Ekstrand
1cf4f626d0 anv/image: Use align_u64 for image offsets
The ALIGN functions in util/u_math.h work on uintptr_t whose size
changes depending on your platform.  Use ones which take an explicit
64-bit type instead to avoid 32-bit platform issues.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
(cherry picked from commit 5cc27d59a1)
2020-04-09 14:06:48 -07:00
Juan A. Suarez Romero
49a4ba5e05 anv/pipeline: allow more than 16 FS inputs
A fragment shader can have more than 16 inputs, so SBE emission should
deal with all of them.

This fixes dEQP-VK.pipeline.max_varyings.*

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
(cherry picked from commit 191ced539a)
2020-04-09 14:06:44 -07:00
Juan A. Suarez Romero
04067fbe59 intel/compiler: store the FS inputs in WM prog data
Store the fragment shader inputs in the program data so we can use them
later when required without needing the NIR shader.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
(cherry picked from commit 460de2159e)
2020-04-09 14:06:42 -07:00
Mathias Fröhlich
3744a31d6c i965: Move down genX_upload_sbe in profiles.
Avoid looping over all VARYING_SLOT_MAX urb_setup array
entries from genX_upload_sbe. Prepare an array indirection
to the active entries of urb_setup already in the compile
step. On upload only walk the active arrays.

v2: Use uint8_t to store the attribute numbers.
v3: Change loop to build up the array indirection.
v4: Rebase.
v5: Style fix.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
(cherry picked from commit 630154e77b)
2020-04-09 14:06:40 -07:00
Emil Velikov
a93c67d633 Revert "egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure"
This reverts commit 1b87f4058d.

dlclose() of the handle is perfectly reasonable, a follow-up NULL
assignment is missing.

As-is this causes a leak for nearly every platform, since they call
dri2_load_driver* initially, followed by a second swrast fallback call.

Some platforms even loop through the existing drivers probing.

Revert the commit and add the NULL check.

Fixes: 1b87f4058d ("egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>
(cherry picked from commit d3c9143971)
2020-04-09 13:30:56 -07:00
Emil Velikov
e594e7fa27 egl/drm: reinstate (kms_)swrast support
With earlier commit we've added a generic LIBGL_ALWAYS_SOFTWARE handling
yet did not consider that the existing codebase unconditionally errors
out when set. That was fixed with a latter commit, while the fix itself
added erroneous restriction for egl/drm.

As mentioned in the report - the feature was working for ages. It was a
Gnome developer who added kms_swrast support for gbm in the first place.

Admittedly kms_swrast is somewhat in the middle between traditional
swrast and HW drivers, regardless - reinstate support.

Fixes: 47273d7312 ("egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/165
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>
(cherry picked from commit fa5e800e05)
2020-04-09 13:30:55 -07:00
Emil Velikov
c9f4058183 glx: set the loader_logger early and for everyone
Currently we set the logger only for DRI3. Even though it's used nearly
everywhere. For platforms where we don't the function is effectively a
no-op.

With this in place, LIBGL_DEBUG=verbose works across the board.

Fixes: d971a4230d ("loader: Factor out the common driver opening logic from each loader.")
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4084>
(cherry picked from commit b699d070a6)
2020-04-09 13:30:53 -07:00
Dylan Baker
800e02453a .pick_status.json: Update to 089e1fb287 2020-04-09 13:30:47 -07:00
Neil Armstrong
4c2c6e6119 gitlab-ci/lava: fix handling of lava tags
The lava tags was a python array not it's a gitlab CI string,
slit the string with periods in the jinja2 template to avoid having
the following tags :

tags:
 - p
 - a
 - n
 - f
 - r
 - o
 - s
 - t

instead of :
tags:
 - panfrost

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
(cherry picked from commit bbdb4b1a6d)

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4434>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4434>
2020-04-04 14:07:19 +00:00
Thong Thai
ce2921a9c3 gallium/auxiliary/vl: fix bob compute shaders for deint yuv
Scales the Y-axis by 2 when using the Bob deinterlace filter.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2523
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3857>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3857>
(cherry picked from commit c81aa15d64)
2020-04-04 15:38:40 +02:00
Eric Engestrom
9de8d58ab2 docs/relnotes: add sha256sum for 20.0.4 2020-04-03 12:28:20 +02:00
Eric Engestrom
d3586b5291 VERSION: bump to 20.0.4 2020-04-03 11:25:46 +02:00
Eric Engestrom
882256534d docs: add release notes for 20.0.4 2020-04-03 11:24:56 +02:00
Jason Ekstrand
2120f106e0 Revert "spirv: Implement OpCopyObject and OpCopyLogical as blind copies"
This reverts commit 7a53e67816.

(cherry picked from commit 68f325b256)
2020-04-02 23:38:11 +02:00
Eric Engestrom
4601358fa3 .pick_status.json: Update to c71c1f44b0 2020-04-02 22:47:45 +02:00
Eric Engestrom
a680481532 docs/relnotes: add sha256sum for 20.0.3 2020-04-01 23:40:37 +02:00
Eric Engestrom
103f12d23e VERSION: bump to 20.0.3 2020-04-01 23:25:56 +02:00
Eric Engestrom
b04ae1f964 docs: add release notes for 20.0.3 2020-04-01 23:24:57 +02:00
Thomas Hellstrom
5e85ed86eb svga, winsys/svga: Fix persistent memory discard maps
The kernel driver requires immediate notification using a
BindGBSurface command when a graphics coherent memory resource changes
backing MOB, so that it can start dirty-tracking the new MOB.
Since we always use graphics coherent memory for persistent memory, enqueue
and flush a BindGBSurface commmand at map time rather than at unmap time.
Since we're dealing with persistent memory, It's OK to flush while mapped.

This fixes an issue with gnome-shell / Wayland which uses persistent
memory together with discard maps when we advertise ARB_buffer_storage.
XWayland clients will render incorrectly.

Fixes: 71b43490dd ("svga: Support ARB_buffer_storage")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4399>
(cherry picked from commit 46fdc288fb)
2020-04-01 18:05:29 +02:00
Timothy Arceri
9c8dab082f nir: fix crash in varying packing on interface mismatch
For example when the outputs are scalars but the inputs are struct
members.

Fixes: 26aa460940 ("nir: rewrite varying component packing")

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4351>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4351>
(cherry picked from commit 0f4a81430e)
2020-04-01 18:05:28 +02:00
Jason Ekstrand
84531156ee spirv: Implement OpCopyObject and OpCopyLogical as blind copies
Because the types etc. are required to logically match, we can just
copy-propagate the guts of the vtn_value.  This was causing issues with
some new CTS tests that are doing an OpCopyObject of a sampler which is
a special-cased type in spirv_to_nir.  Of course, this is only a partial
solution.  Ideally, we've got a bit of work to do to make all the
composite stuff able to handle all types including images, sampler, and
combined image/samplers but this gets some CTS tests passing.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>
(cherry picked from commit 7a53e67816)
2020-04-01 18:05:24 +02:00
Eric Engestrom
52aafdafb3 .pick_status.json: Update to 70ac7f5b0c 2020-04-01 18:05:22 +02:00
Jason Ekstrand
e51b749f1d anv: Account for the header in anv_state_stream_alloc
If we have an allocation that's exactly the block size, we end up
computing a new block size to allocate that's exactly the block size,
add in the header, and then assert fail.  When computing the block size,
we need to account for the header.

Fixes: 955127db93 "anv/allocator: Add support for large stream..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
(cherry picked from commit 63bec07e14)
2020-03-31 12:29:11 +02:00
Jason Ekstrand
6ddc34f659 nir/lower_int64: Lower 8 and 16-bit downcasts with nir_lower_mov64
We have the code to do the lowering, we were just missing the
boilerplate bits to make should_lower_int64_alu_instr return true.

Fixes: 62d55f1281 "nir: Wire up int64 lowering functions"
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
(cherry picked from commit 14a49f31d3)
2020-03-31 12:29:09 +02:00
Rob Clark
99471dbddc util: fix u_fifo_pop()
Seems like no one ever depended on it to actually return false when fifo
is empty.

Fixes: 6e61d06209 ("util: Add super simple fifo")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4366>
(cherry picked from commit ffd3226678)
2020-03-31 12:29:07 +02:00
Rhys Perry
b571cfabb6 util/u_queue: fix race in total_jobs_size access
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
CC: <mesa-stable@lists.freedesktop.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>
(cherry picked from commit 1ef9658906)
2020-03-31 12:29:04 +02:00
Rhys Perry
1a6ce0f9af glsl: fix race in instance getters
Insertions can modify entry->data. Seems to fix random Fossilize crashes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
CC: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>
(cherry picked from commit d101ca3f5a)
2020-03-31 12:29:01 +02:00
Eric Engestrom
b97550253d .pick_status.json: Update to 5f4d9b419a 2020-03-31 12:27:49 +02:00
Erik Faye-Lund
f44779d7eb vtn/opencl: fully enable OpenCLstd_Clz
Fixes: 7325f6ac98 ("vtn/opencl: add clz support")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4318>
(cherry picked from commit 4821ec6d8f)
2020-03-30 22:08:17 +02:00
Erik Faye-Lund
243cc87032 pipebuffer: clean up cast-warnings
This code produces warnings, so let's fix that. The problem is that
casting a pointer to an integer of non-pointer-size triggers warnings on
MSVC, and on 64-bit Windows unsigned long is 32-bit large.

So let's instead use uintptr_t, which is exactly for these kinds of
things.

While we're at it, let's make the resulting index a plain "unsigned",
which is the type this originated from before we started with this
cast-dance.

Fixes: 1a66ead1c7 ("pipebuffer, winsys/svga: Add functionality to update pb_validate_entry flags")
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4297>
(cherry picked from commit 079cb4949d)
2020-03-30 22:08:17 +02:00
Rhys Perry
3facfb3435 aco: implement 64-bit VGPR constant copies in handle_operands()
64-bit VGPR constant copies can happen because of 64-bit constant copy
propagation. Since this optimization is beneficial and more annoying to
deal with in the optimizer, I've implemented 64-bit VGPR constant copies
in handle_operands().

This also sets copy_operation::size correctly for 64-bit constant copies.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>
(cherry picked from commit 43918c9a7f)
2020-03-30 22:08:17 +02:00
Timur Kristóf
1e598bf8e0 radv/llvm: fix subgroup shuffle for chips without bpermute
bpermute only exists on GFX8+ and only with Wave32 on GFX10. Instead
we have to use readlane with a waterfall loop to defeat the LLVM
backend.

This fixes DOOM Eternal which requires subgroup shuffle.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
(cherry picked from commit 7ac8bb33cd)

Squashed with:

radv: Enable subgroup shuffle on GFX10 when ACO is used.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4159>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4159>
(cherry picked from commit cfa299eadb)
2020-03-30 22:07:47 +02:00
Rob Clark
db199be2c3 freedreno/ir3/ra: fix array liveranges
Fixes: 1b658533e1 ("freedreno/ir3: extend liverange of arrays")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
(cherry picked from commit d2cc92c747)
2020-03-30 12:26:45 +02:00
Timothy Arceri
284f3ce6bc nir: fix packing of TCS varyings not read by the TES
Unlike other stages TCS outputs not read by the TES cannot always
be demoted to globals e.g. when they are read by other TCS
invocations.

We were not taking these outputs into account when packing which
could result in other outputs being assigned to the same location.

Here we make sure to gather information on these outputs and group
them together when packing.

This fixes rendering issues in QUBE 2 via Proton.

Closes: #2653
Fixes: 26aa460940 ("nir: rewrite varying component packing")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4328>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4328>
(cherry picked from commit b5e00f5c2b)
2020-03-30 12:26:45 +02:00
Timothy Arceri
c16dfe1c63 glsl: fix varying packing for 64bit integers
Without this we can incorrectly end up marking things as making
use of ARB_enhanced_layouts style packing.

Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4328>
(cherry picked from commit 8b9ebbcb54)
2020-03-30 12:26:45 +02:00
Samuel Pitoiset
dfc0a5cc14 ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv()
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.

No pipeline-db changes with VEGA10/LLVM 9.

pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 6672 -> 6672 (0.00 %)
VGPRS: 6652 -> 6652 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 561780 -> 561692 (-0.02 %) bytes
Max Waves: 1043 -> 1043 (0.00 %)

pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 84608 -> 83768 (-0.99 %)
VGPRS: 106768 -> 106636 (-0.12 %)
Spilled SGPRs: 1625 -> 1713 (5.42 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 10850936 -> 10726712 (-1.14 %) bytes
Max Waves: 3152 -> 3180 (0.89 %)

LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
(cherry picked from commit ba2ec1f369)
2020-03-30 12:26:45 +02:00
Samuel Pitoiset
6e4db7e726 ac/nir: use llvm.amdgcn.rsq for nir_op_frsq
Instead of emitting 1.0 / sqrt(x) which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.

pipeline-db (VEGA10/LLVM 9):
Totals from affected shaders:
SGPRS: 16872 -> 16864 (-0.05 %)
VGPRS: 15320 -> 15464 (0.94 %)
Spilled SGPRs: 2021 -> 2133 (5.54 %)
Code Size: 1915464 -> 1917476 (0.11 %) bytes
Max Waves: 641 -> 639 (-0.31 %)

pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 43936 -> 44120 (0.42 %)
VGPRS: 41776 -> 41972 (0.47 %)
Spilled SGPRs: 875 -> 875 (0.00 %)
Code Size: 4468164 -> 4468120 (-0.00 %) bytes
Max Waves: 2412 -> 2414 (0.08 %)

pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 60096 -> 60096 (0.00 %)
VGPRS: 63552 -> 63648 (0.15 %)
Spilled SGPRs: 6135 -> 6117 (-0.29 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 6252996 -> 6249772 (-0.05 %) bytes
Max Waves: 2324 -> 2337 (0.56 %)

LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
(cherry picked from commit d548384fc6)
2020-03-30 12:26:45 +02:00
Samuel Pitoiset
f4179f1b17 ac/nir: use llvm.amdgcn.rcp for nir_op_frcp
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.

pipeline-db (VEG10/LLVM 9):
Totals from affected shaders:
SGPRS: 50384 -> 50312 (-0.14 %)
VGPRS: 42572 -> 42696 (0.29 %)
Spilled SGPRs: 1372 -> 1372 (0.00 %)
Code Size: 5692040 -> 5691428 (-0.01 %) bytes
Max Waves: 3954 -> 3951 (-0.08 %)

pipeline-db (VEG10/LLVM 10):
Totals from affected shaders:
SGPRS: 78512 -> 78464 (-0.06 %)
VGPRS: 62408 -> 62484 (0.12 %)
Spilled SGPRs: 1502 -> 1502 (0.00 %)
Code Size: 8106188 -> 8103372 (-0.03 %) bytes
Max Waves: 7759 -> 7753 (-0.08 %)

pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 112760 -> 113232 (0.42 %)
VGPRS: 111132 -> 110568 (-0.51 %)
Spilled SGPRs: 5870 -> 5940 (1.19 %)
Spilled VGPRs: 650 -> 652 (0.31 %)
Code Size: 11887232 -> 11561744 (-2.74 %) bytes
Max Waves: 8964 -> 9015 (0.57 %)

LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
(cherry picked from commit 66426ce119)
2020-03-30 12:26:45 +02:00
Francisco Jerez
cbc7ba2e47 intel/fs/gen12: Fix interaction of SWSB dependency combination with EU fusion workaround.
This has been reported to fix a hang in Shadow of Mordor on Gen12.
One of its compute shaders seems to cause an in-order exec_all
dependency to be merged into an out-of-order SET dependency slot,
which would prevent us from baking the SET dependency into the parent
instruction, leading to an assert failure in emit_inst_dependencies()
(Thanks to Rafael for noticing that).  Prevent that by avoiding
combination of in-order dependencies whenever that would cause a SET
dependency to be demoted to a SYNC.NOP instruction.

Fixes: e14529ff32 "intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow."
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 36c155a017)
2020-03-30 12:26:45 +02:00
Tapani Pälli
c0927e9f72 glsl: set error_emitted true if type not ok for assignment
Patch changes also existing assert to not trigger when we have
error types in assignment.

v2: simplify, cleanup (Ian)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2629
Fixes: d1fa69ed61 ("glsl: do not attempt assignment if operand type not parsed correctly")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4178>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4178>
(cherry picked from commit 0847fe6e7f)
2020-03-30 12:26:45 +02:00
Eric Engestrom
08a2a2025b .pick_status.json: Update to 8970b7839a 2020-03-30 12:26:45 +02:00
Marek Vasut
96868b7297 etnaviv: Emit PE.ALPHA_COLOR_EXT* on GPUs with half-float support
At least GC880 (iMX6S), GC2000 (iMX6Q) blobs do not emit the
PE.ALPHA_COLOR_EXT0 and PE.ALPHA_COLOR_EXT1 into the command
stream. The GCnano (STM32MP1) is not affected by this change
either. This is because neither of these GPUs support the
half-float feature.

Emit PE.ALPHA_COLOR_EXT* in etnaviv only if half-float support
is present in the GPU. This fixes all of the currently failing
dEQPs in this group:
  dEQP-GLES2.functional.fragment_ops.blend.*

Fixes: 76adf041f2 ("etnaviv: fix blend color on newer GPUs")
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4277>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4277>
(cherry picked from commit 9e78f17b74)
2020-03-30 12:26:37 +02:00
Erik Faye-Lund
2e745e0ed3 rbug: do not return void-value
Returning a void-value is nonsensical, and in this case it seems like a
mistake.

This eliminates a warning when building on MSVC.

Fixes: fb04e5da97 ("gallium: add pipe_screen::finalize_nir")
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4297>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4297>
(cherry picked from commit 8c30b9d987)
2020-03-26 13:40:53 +01:00
Eric Engestrom
85c2780f78 .pick_status.json: Update to 05069e1f07 2020-03-26 13:40:47 +01:00
Jordan Justen
64092f4de3 intel: Add TGL PCI ID
Ref: Bspec 44455
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit f02ae69867)
2020-03-25 15:35:03 +01:00
Jordan Justen
fdd4beab68 intel: Update TGL PCI strings
Ref: Bspec 44455
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 1c6ef0165f)
2020-03-25 15:35:02 +01:00
Lionel Landwerlin
ead58ae0c8 intel: add new TGL pci ids
Update following kernel : https://patchwork.freedesktop.org/patch/357921/

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bspec: 44455
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4248>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4248>
(cherry picked from commit 58deebe547)
2020-03-25 15:34:58 +01:00
Samuel Pitoiset
d41c5951ba radv: enable VK_KHR_8bit_storage on GFX6-GFX7
Enabling a Vulkan extension doesn't mean that all features need
to be implemented. DOOM Eternal crashes at launch if that ext
is not supported but it doesn't matter if the features are enabled
or not.

Let's enable it like we did for VK_KHR_16bit_storage.

Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>
(cherry picked from commit 238e2ed210)
2020-03-25 15:32:19 +01:00
Rhys Perry
9cdc30f837 nir/gather_info: fix per-vertex handling in try_mask_partial_io
pipeline-db (Navi, ACO):
Totals from affected shaders:
SGPRS: 6432 -> 6432 (0.00 %)
VGPRS: 11924 -> 11924 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 1596 -> 1596 (0.00 %) dwords per thread
Code Size: 575524 -> 518620 (-9.89 %) bytes
LDS: 12187 -> 12187 (0.00 %) blocks
Max Waves: 2695 -> 2695 (0.00 %)

Helps a few hundred Dark Souls 3 shaders.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4190>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4190>
(cherry picked from commit 9f4ba2d2b4)
2020-03-25 15:32:18 +01:00
Marek Olšák
c81092f789 st/mesa: fix use of uninitialized memory due to st_nir_lower_builtin
reported by valgrind

Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4274>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4274>
(cherry picked from commit 719063d4d0)
2020-03-25 15:32:18 +01:00
Rhys Perry
ca17bf0f81 aco: fix boolean undef regclass
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4285>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4285>
(cherry picked from commit 17c7f4e30e)
2020-03-25 15:32:18 +01:00
Rhys Perry
c2601fe16b aco: emit IR in IF's merge block instead if the other side ends in a jump
Fixes NIR such as:
if (divergent) {
   a = sgpr()
} else {
   break;
}
use(a)

Previously we would have emitted:
if (divergent) {
   a = sgpr()
}
if (!divergent) {
   break;
}
use(a)

But "a" isn't available at it's use. Now we emit:
if (divergent) {
}
if (!divergent) {
   break;
}
a = sgpr()
use(a)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 1936 -> 1936 (0.00 %)
VGPRS: 1264 -> 1264 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 159408 -> 159152 (-0.16 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 81 -> 81 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2557
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
(cherry picked from commit 9d56ed199b)
2020-03-25 15:32:18 +01:00
Rhys Perry
f130cbe716 aco: improve check for unreachable loop continue blocks
The old code would have previously caught:
loop {
   ...
   break
}
when it was meant to just catch:
loop {
   if (...)
      break
   else
      break
}

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
(cherry picked from commit 8d8c864beb)
2020-03-25 15:32:18 +01:00
Rhys Perry
a1fd16e507 aco: skip NIR in unreachable merge blocks
NIR removes most of this but undef instructions for loop header phis can
remain. These were harmless because ACO would DCE them itself.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
(cherry picked from commit 46e94fd854)
2020-03-25 15:32:18 +01:00
Rhys Perry
275e939299 aco: handle missing second predecessors at merge block phis
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
(cherry picked from commit f2c4878de9)
2020-03-25 15:32:18 +01:00
Rhys Perry
cb7517be47 aco: set has_divergent_branch for discards in loops
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
(cherry picked from commit f1a2e1df78)
2020-03-25 15:32:18 +01:00
Marek Olšák
4c2f8b3dd6 ac: fix fast division
This stopped working with LLVM 11 and might occasionally have been broken
on older LLVM, because the metadata was set on the mul, not on the rcp.

Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>
(cherry picked from commit 303842b2db)
2020-03-25 15:32:18 +01:00
Neil Armstrong
d48baa3859 Revert "ci: Remove T820 from CI temporarily"
This reverts commit 089c8f0b8d.

Our office changes are finished and power is now stable in our lab
for T820 CI to run again.

Cc: Daniel Stone <daniels@collabora.com>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4057>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4057>
(cherry picked from commit 4b61ad372d)
2020-03-25 15:32:18 +01:00
Eric Engestrom
4b5df32e82 .pick_status.json: Update to 1271193932 2020-03-25 13:59:43 +01:00
Samuel Pitoiset
f3766dada2 radv: fix optional pSizes parameter when binding streamout buffers
The Vulkan spec 1.2.135 says:

   "pSizes is an optional array of buffer sizes, specifying the maximum
   number of bytes to capture to the corresponding transform feedback
   buffer. If pSizes is NULL, or the value of the pSizes array element
   is VK_WHOLE_SIZE, then the maximum bytes captured will be the size
   of the corresponding buffer minus the buffer offset."

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2650
Fixes: b4eb029062 ("radv: implement VK_EXT_transform_feedback")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>
(cherry picked from commit 2d3223ca90)
2020-03-20 15:59:43 -07:00
Caio Marcelo de Oliveira Filho
3ab95d1846 mesa/main: Fix overflow in validation of DispatchComputeGroupSizeARB
An uint64_t can store the result of multiplying two GLuint (uint32_t),
so use that property to check for overflow when calculating the total.

Change the error message so we don't need to care about the actual
total -- which means we don't need a larger than 64-bit value to hold
it.

Fixes: 45ab63c0cb ("mesa/main: add support for ARB_compute_variable_groups_size")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4240>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4240>
(cherry picked from commit fdc6032928)
2020-03-20 15:59:43 -07:00
Dylan Baker
52b2c50164 .pick_status.json: Update to aee004a7c8 2020-03-20 15:59:40 -07:00
John Stultz
57e746cfc0 vc4_bufmgr: Remove duplicative VC definition
This is already defined in
  src/broadcom/cle/v3d_packet_helpers.h:42:9

And was causing build issues in AOSP when building with mmma

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4175>
(cherry picked from commit 0df48e5d1f)
Signed-off-by: John Stultz <john.stultz@linaro.org>
2020-03-20 00:08:05 +00:00
Lionel Landwerlin
7e7722dca3 isl: drop min row pitch alignment when set by the driver
When the caller of the isl_surf_init() specifies a row pitch, do not
consider the minimum CCS requirement if it's incompatible with the
caller's value.

isl_surf_get_ccs_surf() will check that the main surface alignment
matches CCS expectations.

v2: Simplify checks (Nanley)

v3: Add Comment about isl_surf_get_ccs_surf() (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: a3f6db2c4e ("isl: drop CCS row pitch requirement for linear surfaces")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
(cherry picked from commit 507abc3959)
2020-03-20 00:21:51 +01:00
Lionel Landwerlin
b1cbf7d9fa isl: only apply main surface ccs pitch constraint with CCS
We could be creating a Y-tiled surface that isn't going to use CCS
(this could be the case when clearly indicated through modifiers).
Don't apply the main surface pitch alignment constraint in that case.

v2: Use logical NOT (Sagar)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a3f6db2c4e ("isl: drop CCS row pitch requirement for linear surfaces")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
(cherry picked from commit def3470e9b)
2020-03-20 00:21:50 +01:00
Lionel Landwerlin
18e76206b0 isl: properly filter supported display modifiers on Gen9+
Y tiling is supported for display on Gen9+ so don't filter it from the
possible flags.

v2: Drop Yf from display supported tilings on Gen12+ (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
(cherry picked from commit dab0aadea9)
2020-03-20 00:21:49 +01:00
Lionel Landwerlin
0414dba695 isl: implement linear tiling row pitch requirement for display
We're missing a requirement for alignment of row pitch for the display
HW. In linear tiling, the row pitch must be a 64bytes aligned.

v2: Use correct formula to align to 64bytes (Chad)

v3: Matching {} (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
(cherry picked from commit 157a3cf3ec)
2020-03-20 00:21:49 +01:00
Eric Engestrom
29443dad40 .pick_status.json: Update to 3252041a78 2020-03-20 00:21:27 +01:00
Dylan Baker
4441e00be1 .pick_status.json: Mark 56de6f698e as denominated 2020-03-19 10:01:11 -07:00
Rhys Perry
fb341213fa nir/gather_info: handle emit_vertex_with_counter
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193>
(cherry picked from commit 5193688e1a)
2020-03-19 09:51:52 -07:00
Marek Olšák
5e14037227 nir: fix clip/cull_distance_array_size in nir_lower_clip_cull_distance_arrays
This fixes a GPU hang on radeonsi.

It only works if optimizations have already been run.

Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4194>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4194>
(cherry picked from commit 3c03718fd7)
2020-03-19 09:51:51 -07:00
Jason Ekstrand
226ff465b7 anv: Swizzle fast-clear values
Starting with Gen12, we can fast-clear a lot more surface formats and we
are suddenly in the position of having to fast-clear surfaces with
formats with an implicit swizzle such as VK_FORMAT_R4G4B4A4_UNORM_PACK16
which is represented as ISL_FORMAT_A4B4G4R4 with a BGRA swizzle.  In
order for blorp to do the fast-clear color conversion for us, it needs
a properly swizzled color.

This fixes the following Vulkan CTS groups on TGL:

 - dEQP-VK.pipeline.blend.format.b4g4r4a4_unorm_pack16.*
 - dEQP-VK.api.image_clearing.core.clear_color_image.*.b4g4r4a4*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
(cherry picked from commit 46187bb54f)
2020-03-19 09:51:48 -07:00
Jason Ekstrand
7eb4b33a9a intel/blorp: Add support for swizzling fast-clear colors
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
(cherry picked from commit 3fb8f19481)
2020-03-19 09:51:47 -07:00
Ian Romanick
12ed35a395 soft-fp64: Split a block that was missing a cast on a comparison
This function has code like:

   if (0x7FD <= zExp) {
      if ((0x7FD < zExp) ||
         ((zExp == 0x7FD) &&
            (0x001FFFFFu == zFrac0 && 0xFFFFFFFFu == zFrac1) &&
               increment)) {
         ...
	 return ...;
      }
      if (zExp < 0) {

I saw that, and I thought, "Uh... what?  Dead code?"  I thought it was a
bit fishy, so I grabbed the Berkeley SoftFloat Library 3e code, and
there is similar code in softfloat_roundPackToF64
(source/s_roundPackToF64.c), but it has an extra (uint16_t) cast in the
first comparison.  This is basicially a shortcut for

   if (zExp < 0 || zExp >= 0x7FD) {

So, having the nesting kind of makes sense. On a CPU, nesting the flow
control can be an optimization.  On a GPU, it's just fail.  Split the
block so that we don't need the uint16_t cast magic.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 683638 -> 658127 (-3.73%)
instructions in affected programs: 666839 -> 641328 (-3.83%)
helped: 92
HURT: 0
helped stats (abs) min: 26 max: 2456 x̄: 277.29 x̃: 144
helped stats (rel) min: 3.21% max: 4.22% x̄: 3.79% x̃: 3.90%
95% mean confidence interval for instructions value: -345.84 -208.75
95% mean confidence interval for instructions %-change: -3.86% -3.73%
Instructions are helped.

total cycles in shared programs: 5458858 -> 5344600 (-2.09%)
cycles in affected programs: 5360114 -> 5245856 (-2.13%)
helped: 92
HURT: 0
helped stats (abs) min: 126 max: 10300 x̄: 1241.93 x̃: 655
helped stats (rel) min: 1.71% max: 2.37% x̄: 2.12% x̃: 2.17%
95% mean confidence interval for cycles value: -1539.93 -943.94
95% mean confidence interval for cycles %-change: -2.16% -2.08%
Cycles are helped.

Fixes: f111d72596 ("glsl: Add "built-in" functions to do add(fp64, fp64)")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
(cherry picked from commit bf2eb3e0ee)
2020-03-19 09:51:46 -07:00
Ian Romanick
d6b98e432c soft-fp64/fsat: Correctly handle NaN
fsat is defined as min(max(a, 0.0), 1.0), and IEEE defines both min and
max to return the non-NaN value when one value is NaN.  Based on this,
fsat should definitely return 0.0 for NaN.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841666 -> 841647 (<.01%)
instructions in affected programs: 122033 -> 122014 (-0.02%)
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.71 x̃: 3
helped stats (rel) min: 0.01% max: 0.02% x̄: 0.02% x̃: 0.01%
95% mean confidence interval for instructions value: -3.74 -1.69
95% mean confidence interval for instructions %-change: -0.02% -0.01%
Instructions are helped.

total cycles in shared programs: 6927246 -> 6926904 (<.01%)
cycles in affected programs: 1038987 -> 1038645 (-0.03%)
helped: 7
HURT: 0
helped stats (abs) min: 18 max: 72 x̄: 48.86 x̃: 54
helped stats (rel) min: 0.03% max: 0.05% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -67.38 -30.33
95% mean confidence interval for cycles %-change: -0.05% -0.02%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: a42163cbbc ("compiler: Add lowering support for 64-bit saturate operations to software")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2585
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
(cherry picked from commit 7673dcbd21)
2020-03-19 09:51:45 -07:00
Pierre-Eric Pelloux-Prayer
6923ae24f4 st/mesa: disallow deferred flush if there are multiple contexts
u_threaded can hang in these situation, with one context waiting on a
deferred fence from the other context.
But the other context isn't flushing its pending work (because it's waiting
for more work to pushed) so everything is stuck.

Fixes: d17b35e671 ("gallium: add PIPE_FLUSH_DEFERRED")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1430
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>
(cherry picked from commit e7f3a8d695)
2020-03-19 09:51:44 -07:00
Dylan Baker
8c9b63ee40 .pick_status.json: Mark c923de68dd as backported 2020-03-19 09:51:41 -07:00
Dylan Baker
ee1ebc22ae .pick_status.json: Mark 672d106199 as backported 2020-03-19 09:51:41 -07:00
Dylan Baker
7dc859e2ed .pick_status.json: Update to cf62c2b2ac 2020-03-19 09:51:38 -07:00
Greg V
2af8aeb9a6 amd/addrlib: fix build on non-x86 platforms
regparm(0) attribute does not work on aarch64 (and presumably powerpc64 and others).
Default to not specifying any calling convention on non-amd64/i386 platforms.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 56f31328f2)

Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4239>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4239>
2020-03-19 00:09:51 +00:00
John Stultz
3cabdc38fd gallium: hud_context: Fix scalar initializer warning.
When trying to build mesa/master under AOSP, I've run into the
following error:

external/mesa3d/src/gallium/auxiliary/hud/hud_context.c:1821:31: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init]
   struct sigaction action = {{0}};
                              ^~~
1 error generated.

This patch addresses this by switching to using memset instead of
using an initializer.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4141>
(cherry picked from commit be22995ecf)

Signed-off-by: John Stultz <john.stultz@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4239>
2020-03-19 00:09:35 +00:00
Samuel Pitoiset
41c56b6cbc radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control
If compute shaders require a specific subgroup size (ie. Wave32),
we have to use the correct ballot size.

Fixes dEQP-VK.subgroups.ballot_other.compute.*_requiredsubgroupSize.

Fixes: fb07fd4e6c ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4230>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4230>
2020-03-18 21:58:47 +00:00
Samuel Pitoiset
66c3f0c063 radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control
If compute shaders require a specific subgroup size (ie. Wave32),
we have to return the correct one.

Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*.

Fixes: fb07fd4e6c ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4230>
2020-03-18 21:58:47 +00:00
Dylan Baker
8647cdc7c3 docs/relnotes: Add sha256 sums for 20.0.2 2020-03-18 14:40:36 -07:00
Dylan Baker
fa6e67066b VERSION: bump for 20.0.2 release 2020-03-18 14:22:35 -07:00
Dylan Baker
7a1423d41a Docs: Add release notes for 20.0.2 2020-03-18 14:22:17 -07:00
Jason Ekstrand
c15220de7e anv: Do an end-of-pipe sync before updating AUX table entries
We've found in GL that an actual end-of-pipe sync is required before
invalidating the aux tables and that a simple CS stall is insufficient.
If we're about to modify the actual AUX table entries from the GPU, we
should definitely make sure it's stopped dead before we do so.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
(cherry picked from commit d60375cbc2)
2020-03-18 10:28:45 -07:00
Rafael Antognolli
5d57fe5cb7 iris: Wait for the GPU to be idle before invalidating the aux table.
An end of pipe sync seems to satisfy this restriction. It takes care of
GPU hangs seen in dEQP-GLES31.functional.copy_image.* tests.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit b4ddc6139b)
2020-03-18 10:28:40 -07:00
Rafael Antognolli
e5e0fdf50f iris: Split aux map initialization from invalidation.
We can write the aux map address only once during the batch
initialization, and then only invalidate it once we modify it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit a7de6f1321)
2020-03-18 10:28:32 -07:00
Rafael Antognolli
98cd8c666d anv: Wait for the GPU to be idle before invalidating the aux table.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit 43dc842cb9)
2020-03-18 10:28:24 -07:00
Jason Ekstrand
44e9b6ab62 anv: Do end-of-pipe sync around MCS/CCS ops instead of CS stall
v2: Do end-of-pipe sync after clear depth stencil too (Jason).
v3: Also do end-of-pipe sync before clear depth stencil too (Jason).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit 3ca3050de5)
2020-03-18 10:28:19 -07:00
Jason Ekstrand
8bc42bf9db anv: Use a proper end-of-pipe sync instead of just CS stall
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit 2db471953a)
2020-03-18 10:28:12 -07:00
Jason Ekstrand
5d2f7e96ad anv: Use the PIPE_CONTROL instead of bits for the CS stall W/A
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>
(cherry picked from commit ac8d412ba3)
2020-03-18 10:28:05 -07:00
Samuel Pitoiset
753c61f76d radv: fix random depth range unrestricted failures due to a cache issue
The shader module name is used to compute the pipeline key. The
driver used to load the wrong pipelines because the shader names
were similar.

This should fix random failures of
dEQP-VK.pipeline.depth_range_unrestricted.*

Fixes: f11ea22666 ("radv: fix a performance regression with graphics depth/stencil clears")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
(cherry picked from commit 94e37859a9)
2020-03-18 10:00:49 -07:00
Bas Nieuwenhuizen
f0ac5321f8 amd/llvm: Fix divergent descriptor regressions with radeonsi.
piglit/bin/arb_bindless_texture-limit -auto -fbo:
  Needed to deal with non-NULL dynamic_index without deref in tex instructions.

piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
  Need to deal with non-deref images in enter_waterfall_imae.

Fixes: b83c9aca4a "amd/llvm: Fix divergent descriptor indexing. (v3)"
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
(cherry picked from commit 8e4e2cedcf)
2020-03-18 10:00:49 -07:00
Dave Airlie
a3eb254cfc gallium: fix build with latest meson and gcc10
In Fedora 32 build was failing with meson-0.53.2-1.git88e40c7.fc32
and gcc-10.0.1-0.9.fc32.x86_64.

Worked with meson-0.53.1-1 and same gcc.

/usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri2.c.o): in function `dri2_interop_export_object':
/home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri2.c:1813: undefined reference to `st_finalize_texture'
/usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri_screen.c.o): in function `dri_init_screen_helper':
/home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri_screen.c:580: undefined reference to `st_gl_api_create'

Moving this around seems to fix it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>
(cherry picked from commit 040ce9a1b3)
2020-03-18 10:00:49 -07:00
Dylan Baker
336c187087 .pick_status.json: Update to 94e37859a9 2020-03-18 09:55:31 -07:00
Samuel Pitoiset
aa3fe28d05 radv: only inject implicit subpass dependencies if necessary
The Vulkan 1.2.134 spec update clarified when implicit subpass
dependencies should be injected by the driver. They only make
sense if automatic layout transitions are performed.

This should fix a performance regression with RPCS3 (although
they added a workaround for RADV since the regression has been found).

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2502
Fixes: e60de08547 ("radv: handle missing implicit subpass dependencies")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
(cherry picked from commit 46e8ba1344)
2020-03-17 09:47:26 -07:00
Michel Dänzer
34ca8f82af llvmpipe: Use uintptr_t for pointer values
Instead of uint64_t. Fixes potentially writing beyond the end of the
handles pointer array on 32-bit architectures (and copying all 0s
instead of the computed pointer values to the array on big endian
ones).

Corresponding compiler warning:

../src/gallium/drivers/llvmpipe/lp_state_cs.c: In function ‘llvmpipe_set_global_binding’:
../src/gallium/drivers/llvmpipe/lp_state_cs.c:1312:12: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
 1312 |       va = (uint64_t)((char *)lp_res->data + offset);
      |            ^

Fixes: 264663d55d "gallivm/llvmpipe: add support for global
                     operations."

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>
(cherry picked from commit 106bf59ca9)
2020-03-17 09:47:25 -07:00
Dylan Baker
370247f220 .pick_status.json: Update to 3dd0d12aa5 2020-03-17 09:47:20 -07:00
Jose Fonseca
9a9413a7c5 meson: Avoid duplicate symbols.
All the stubs in src/compiler/glsl/glcpp/pp_standalone_scaffolding.c
are duplicate symbols.  They should only be used as replacement for
Mesa functions when building glcpp and glsl standalone compilers, but
in fact they are getting linked with Mesa.

This change fixes this by moving the standalone stubs to a
libglcpp_standalone target, that's only linked with the glcpp/glsl
tools.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4186>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4186>
(cherry picked from commit f6dad10d04)
2020-03-16 10:47:23 -07:00
Danylo Piliaiev
3cfca286af st/mesa: Fix signed integer overflow when using util_throttle_memory_usage
../src/mesa/state_tracker/st_cb_texture.c:1719:57: runtime error: signed integer overflow: 203489280 * 16 cannot be represented in type 'int'

Fixes: 21ca322e63
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4185>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4185>
(cherry picked from commit 51b1b102bd)
2020-03-16 10:47:23 -07:00
Dylan Baker
872de20b21 .pick_status.json: Update to ee9e0d1eca 2020-03-16 10:47:23 -07:00
Bas Nieuwenhuizen
86a6711451 amd/llvm: Fix divergent descriptor indexing. (v3)
There are multiple LLVM passes that very much move the
intrinsic using the descriptor outside of the loop, defeating
the entire point of creating the loop.

Defeat the optimizer by  splitting the break into a separate
if-statement and putting an optimization barrier on the bool
in between.

v2: Move from a callback based system to begin/end loop.
    This does not make it significantly less intrusive but
    is a bit nicer with all the extra struct and callback
    stubs.
v3: Deal with non-divergent values in divergent path.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2160
Fixes: 028ce52739 "radv: Add non-uniform indexing lowering."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit b83c9aca4a)

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4171>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4171>
2020-03-13 17:15:14 +00:00
Dylan Baker
93b59199db .pick_status.json: Mark b83c9aca4a as backported 2020-03-13 09:45:25 -07:00
Marek Olšák
35dbf55ada gallium/cso_context: remove cso_delete_xxx_shader helpers to fix the live cache
With the live shader cache, equivalent shaders can be backed by the same
CSO. This breaks the logic that identifies whether the shader being deleted
is bound.

For example, having shaders A and B, you can bind shader A and delete
shader B. Deleting shader B will unbind shader A if they are equivalent.

Pierre-Eric figured out the root cause for this issue.

Fixes: 0db74f479b - radeonsi: use the live shader cache
Closes: #2596

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4078>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4078>
(cherry picked from commit 2dc300421d)
2020-03-13 09:45:15 -07:00
Danylo Piliaiev
e2db4624b7 glsl: do not crash if string literal is used outside of #include/#line
Fixes: 67b32190f3
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2619
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4146>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4146>
(cherry picked from commit 1305b93274)
2020-03-13 09:23:41 -07:00
Eric Anholt
f2b5a72f4a glsl/tests: Fix waiting for disk_cache_put() to finish.
We were wasting 4s on waiting for expected-not-to-appear files to show
up on every test.  Using timeouts in test code is error-prone anyway,
as our shared runners may be busy on other jobs.

Fixes: 50989f87e6 ("util/disk_cache: use a thread queue to write to shader cache")
Link: https://gitlab.freedesktop.org/mesa/mesa/issues/2505
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4140>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4140>
(cherry picked from commit d0a52432b1)
2020-03-13 09:23:40 -07:00
Timur Kristóf
97f4d6bdf0 radv: Enable lowering dynamic quad broadcasts.
This will lower dynamic quad broadcasts into something that both
LLVM and ACO can understand. On hardware which supports shuffles,
they are lowered to shuffle, on older hardware (GFX6-7) they will
get lowered to constant quad broadcasts.

Fixes dEQP-VK.subgroups.quad.*.subgroupquadbroadcast_nonconst_*

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147>
(cherry picked from commit 967eb23261)
2020-03-13 09:23:36 -07:00
Timur Kristóf
5f896ad529 nir: Add ability to lower non-const quad broadcasts to const ones.
Some hardware doesn't support subgroup shuffle, and on such hardware
it makes no sense to lower quad broadcasts to shuffle. Instead, let's
lower them to four const quad broadcasts, paired with bcsel instructions.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147>
(cherry picked from commit ec16535b49)
2020-03-13 09:23:36 -07:00
Eric Engestrom
a8fa654cbe gen_release_notes: fix version in "you should wait" message
Fixes: 86079447da ("scripts: Add a gen_release_notes.py script")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4113>
(cherry picked from commit 64af6b3bcf)
2020-03-13 09:23:35 -07:00
Samuel Pitoiset
b35d45f21a ac/llvm: add missing optimization barrier for 64-bit readlanes
Otherwise, LLVM optimizes it but it's actually incorrect.

Fixes: 0f45d4dc2b ("ac: add ac_build_readlane without optimization barrier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585>
(cherry picked from commit cc320ef9af)
2020-03-13 09:23:34 -07:00
Eric Engestrom
1b2c982166 vulkan/wsi: fix cleanup when dup() fails
Fixes: f5433e4d6c ("vulkan/wsi: Add modifiers support to wsi_create_native_image")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4137>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4137>
(cherry picked from commit 1fa259b035)
2020-03-13 09:23:30 -07:00
Dylan Baker
92242753b0 .pick_status.json: Update to 625d8705f0 2020-03-13 09:23:29 -07:00
Martin Fuzzey
ffbafd9c8b freedreno: android: fix build of perfcounters.
Some dependencies were missing on android causing a build failure.

(cherry picked from commit d8bae10bfe)

Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4151>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4151>
2020-03-11 20:15:10 +00:00
Martin Fuzzey
b1b35d1033 freedreno: android: add a6xx-pack.xml.h generation to android build
The generation of a6xx-pack.xml.h was missing in the android build scripts
leading to a build failure.

Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
(cherry picked from commit fad9924315)

Signed-off-by: John Stultz <john.stultz@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4151>
2020-03-11 20:15:10 +00:00
Martin Fuzzey
b26ac839aa freedreno: android: fix build failure on android due to python version
The freedreno gen_header.py script now only works under python3.
It contains a "print()" call which prints a blank line under python3
but prints "()" under python2.7.

However the Android build currently uses python2.

This leads to incorrect code generation and a later build error.

.../STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/adreno_common.xml.h:163:2: error: expected identifier or '('
()

Fix this by adding MESA_PYTHON3 and using it for the freedreno scripts.

Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
(cherry picked from commit cad400a59e)

Signed-off-by: John Stultz <john.stultz@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4151>
2020-03-11 20:15:10 +00:00
Jason Ekstrand
155a550c60 vulkan/wsi: Return an error if dup() fails
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4135>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4135>
(cherry picked from commit af68b0d346)
2020-03-10 18:40:48 +01:00
Jason Ekstrand
7daa262486 vulkan/wsi: Don't leak the FD when GetImageDrmFormatModifierProperties fails
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4135>
(cherry picked from commit 34d2637fa7)
2020-03-10 18:40:47 +01:00
Rob Clark
4fa692872b freedreno: fix FD_MESA_DEBUG=inorder
Fixes: 2c07e03b79 ("freedreno: allow ctx->batch to be NULL")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
(cherry picked from commit b3efa2a4da)
2020-03-10 18:40:42 +01:00
Eric Engestrom
6b0033dc30 .pick_status.json: Update to ba03e308b6 2020-03-10 18:40:40 +01:00
Francisco Jerez
868286d10b intel/fs: Fix workaround for VxH indirect addressing bug under control flow.
The current workaround for this hardware bug involved marking the ADD
instruction used to initialize the address register as NoMask on
Gen12, which was based on the assumption that the problem was caused
by a hardware bug affecting the application of the execution mask to
the address register write.

However that doesn't seem to be the case: The address register write
was working correctly, the real problem leading to hangs on TGL is
that the indirect addressing logic is unable to deal with garbage
values in the address register (e.g. misaligned offsets), even for
channels which are currently inactive due to non-uniform control flow.
The current workaround isn't able to avoid that situation in general,
since the result of the NoMask ADD instruction for a dead channel is
calculated based on the corresponding (dead) component of the
indirect_byte_offset source, which would still be undefined in the
likely case that the source was initialized under control flow itself.

This would lead to hangs whenever MOV_INDIRECT was used under
non-uniform control flow in some scenarios like a tessellation shader
from GFXBench5/gl_4 (AKA Car Chase) on TGL.  In addition I've managed
to reproduce the same issue on earlier platforms by initializing the
whole address register with garbage before the ADD instruction, so
this seems to be a long-standing issue we have avoided mostly by luck.

This patch fixes the problem and applies the workaround to all
platforms, since even when the hardware is able to deal with garbage
address values without hanging there might be a significant
performance cost from reading random GRF registers due to the useless
extra EU cycles spent fetching registers for dead channels and due to
the potential for unintended serialization with respect to other
random instructions that could be executed in parallel, which may have
had a cost of the order of hundreds of cycles in the worst case
scenario.

Fixes: f93dfb509c "intel/fs: Write the address register with NoMask for MOV_INDIRECT"
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 45d4665dc7)
2020-03-10 13:30:15 +01:00
Vinson Lee
5de89fe2ff st/nine: Fix incompatible-pointer-types-discards-qualifiers errors.
../src/gallium/state_trackers/nine/nine_ff.c:129:28: error: initializing 'struct nine_ff_vs_key *' with an expression of type 'const void *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers]
    struct nine_ff_vs_key *vs = key;
                           ^    ~~~
../src/gallium/state_trackers/nine/nine_ff.c:145:28: error: initializing 'struct nine_ff_ps_key *' with an expression of type 'const void *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers]
    struct nine_ff_ps_key *ps = key;
                           ^    ~~~

Fixes: fdd96578ef ("nine: Add state tracker nine for Direct3D9 (v3)")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Andre Heider <a.heider@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4015>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4015>
(cherry picked from commit 5ffa6eab88)
2020-03-10 13:30:15 +01:00
Marek Olšák
c53788cd41 ac: add a bug workaround for the 100% NGG culling case
Fixes: 8db00a51f8 - radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
(cherry picked from commit fc65df5651)
2020-03-10 13:30:15 +01:00
Marek Olšák
0f437dd261 radeonsi: add a bug workaround for NGG - LATE_ALLOC_GS
Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
(cherry picked from commit 7481c4be58)
2020-03-10 13:30:15 +01:00
Jason Ekstrand
1e3edbfbd3 anv: Parse VkPhysicalDeviceFeatures2 in CreateDevice
The client may enable robustBufferAccess2 via either
pCreateInfo->pEnabledFeatures or via a chained-in
VkPhysicalDeviceFeatures2 struct.  We need to parse both.

Fixes: 022e5c7e5a "anv: Implement VK_KHR_get_physical_device_properties2"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3777>
(cherry picked from commit 35ca2ad22e)
2020-03-10 13:30:15 +01:00
Eric Engestrom
558746603b docs/relnotes/20.0: fix vulkan version reported
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4092>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4092>
(cherry picked from commit 0e4c001951)
2020-03-10 13:30:15 +01:00
Eric Engestrom
5f4fd166fd gen_release_notes: fix vulkan version reported
Fixes: 4ef3f7e3d3 ("anv: Enable Vulkan 1.2 support")
Fixes: 7f5462e349 ("radv: enable Vulkan 1.2")
Fixes: 75755e0eba ("turnip: Pretend to support Vulkan 1.2")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4092>
(cherry picked from commit 2557d614d3)
2020-03-10 12:47:01 +01:00
Eric Engestrom
33c6933ef4 .pick_status.json: Update to 24db276d11 2020-03-10 12:46:30 +01:00
Eric Engestrom
f8f4c317fb bin/gen_release_notes.py: fix commit list command
Fixes: 86079447da ("scripts: Add a gen_release_notes.py script")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4069>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4069>
(cherry picked from commit d7a70fbb23)
2020-03-06 16:43:10 -08:00
Samuel Pitoiset
f64c5d0cf6 nir/lower_input_attachments: remove bogus assert in try_lower_input_texop()
It can be a sampler too.

Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2558
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4043>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4043>
(cherry picked from commit 913d2dcd23)
2020-03-06 16:43:10 -08:00
Jason Ekstrand
a1041546ec iris: Don't skip fast depth clears if the color changed
We depend on BLORP to convert the clear color and write it into the
clear color buffer for us.  However, we weren't bothering to call blorp
in the case where the state is ISL_AUX_STATE_CLEAR.  This leads to the
clear color not getting properly updated if we have back-to-back clears
with different clear colors.  Technically, we could go out of our way to
set the clear color directly from iris in this case but this is a case
we're unlikely to see in the wild so let's not bother.  This matches
what we already do for color surfaces.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4073>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4073>
(cherry picked from commit 9d07d59842)
2020-03-06 16:43:10 -08:00
Jason Ekstrand
18508b37cb isl: Set 3DSTATE_DEPTH_BUFFER::Depth correctly for 3D surfaces
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3717>
(cherry picked from commit 9f5f4269a6)
2020-03-06 16:43:09 -08:00
Dylan Baker
acc1dfdc02 .pick_status.json: Update to 70341d7746c177a4cd7377ef633e9f85afd11d54 2020-03-06 16:43:09 -08:00
Andreas Baierl
955226028d gitlab-ci: Add add a set of lima flakes
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
(cherry picked from commit ef0abe5404)

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4017>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4017>
2020-03-06 16:43:09 -08:00
Kristian H. Kristensen
45e14cbdc7 Revert "spirv: Use a simpler and more correct implementaiton of tanh()"
This reverts commit da1c49171d.

The reduced formula has precision problems on fp16 around 0.  Bring
back the old formula, but make sure to keep the clamping.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4054>
(cherry picked from commit 9f9432d56c)
2020-03-05 15:47:09 -08:00
Kristian H. Kristensen
a9ee554df3 Revert "glsl: Use a simpler formula for tanh"
This reverts commit 9807f502eb.

The simplified formula doesn't pass the tanh dEQP tests when we lower
to fp16 math.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4054>
(cherry picked from commit 986e92f0ea)
2020-03-05 15:47:09 -08:00
Samuel Pitoiset
3710f7c6a1 aco: fix image load/store with lod and 1D images
Make sure to add the lod value if non-null as the 2nd operand.

Fixes dEQP-VK.image.load_store_lod.with_format.1d.* on all gens
except GFX9.

Fixes: 4d49a7ac73 ("aco: handle nir_intrinsic_image_deref_{load,store} with lod")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4060>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4060>
(cherry picked from commit 7618fe1b48)
2020-03-05 15:47:09 -08:00
Marek Olšák
b6bb359b64 Revert "mesa: check for z=0 in _mesa_Vertex3dv()"
This reverts commit f04d7439a0.

It no longer helps performance and the current vbo implementation is
faster anyway.

The app that hit this was a CAD program called Spazio3D. It made pretty
terrible use of the OpenGL API and we sent them some tips for improvements.
I'm assuming they've fixed this by now.

Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4052>
(cherry picked from commit df3891e74a)
2020-03-05 15:47:09 -08:00
Dylan Baker
4960149a1f .pick_status.json: Update to 07f1ef5656 2020-03-05 15:47:09 -08:00
Dylan Baker
4392cf2d6d docs: Add sha256sums for 20.0.1 2020-03-05 13:59:11 -08:00
Dylan Baker
53b2b224dc Bump version for 20.0.1 2020-03-05 13:33:32 -08:00
Dylan Baker
bc67340ea5 docs: add relnotes for 20.0.1 2020-03-05 13:33:09 -08:00
Andrii Simiklit
1defcf02b4 Revert "glx: convert glx_config_create_list to one big calloc"
This reverts commit 35fc7bdf0e.

Unfortunately mentioned commit introduced a memory leak because
`driwindowsMapConfigs` and `createDriMode` functions allocate
small memory portions for each element:
 21,576 (232 direct, 21,344 indirect) bytes in 1 blocks are definitely lost in loss record 1,411 of 1,414
    at 0x483A7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
    by 0x5D4AA09: createDriMode (dri_common.c:291)
    by 0x5D4ABF5: driConvertConfigs (dri_common.c:310)
    by 0x5D58414: dri3_create_screen (dri3_glx.c:945)
    by 0x5D39829: AllocAndFetchScreenConfigs (glxext.c:815)
    by 0x5D39C57: __glXInitialize (glxext.c:941)
    by 0x5D3290A: GetGLXPrivScreenConfig (glxcmds.c:174)
    by 0x5D34F38: glXQueryExtensionsString (glxcmds.c:1307)
    by 0x4F83038: glXQueryExtensionsString (in /usr/local/lib/libGL.so.1.7.0)
    by 0x4F2EA6B: ??? (in /usr/lib/x86_64-linux-gnu/libwaffle-1.so.0.6.0)
    by 0x4F2A0D7: waffle_display_connect (in /usr/lib/x86_64-linux-gnu/libwaffle-1.so.0.6.0)
    by 0x498F42A: wfl_checked_display_connect (piglit-util-waffle.h:74)

There is one more thing which disallow us to easily fix it are different element sizes
for instance: `glx_config_create_list` allocates memory just for `glx_config`,
`driwindowsMapConfigs` for `driwindows_config` and
`createDriMode` for `__GLXDRIconfigPrivate`.
Yes it is possible but size of such fix
will be more big and complex than original one.
So it make sense only if the malloc overhead
really is a big problem there.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3406>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3406>
(cherry picked from commit 311c82e192)
2020-03-04 08:27:39 -08:00
Rafael Antognolli
3d203789a9 intel/gen12+: Disable mid thread preemption.
Fixes a GPU hang in Car Chase.

Cc: mesa-stable@lists.freedesktop.org

v2: Add comment explaining why (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4035>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4035>
(cherry picked from commit 5f13996262)
2020-03-04 08:27:37 -08:00
Rhys Perry
5620cd4bca aco: fix carry-out size for wave32 v_add_co_u32_e64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: e0bcefc3a0 ('aco/wave32: Use lane mask regclass for exec/vcc.')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902>
(cherry picked from commit 215df21dea)
2020-03-04 08:27:36 -08:00
Rhys Perry
fe1ba4a041 aco: keep track of which events are used in a barrier
And properly handle unordered events so that they always wait for 0.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3774>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3774>
(cherry picked from commit 9fea90ad51)
2020-03-04 08:27:35 -08:00
Paulo Zanoni
d808374674 intel/device: bdw_gt1 actually has 6 eus per subslice
Found by inspection, I'm not aware of any bugs caused by this typo.

According to Lionel, it seems we only use this to generate masks
of available EUs for perfromance queries, and it's only used when we
can't query the fused parts of the GPU through DRM_IOCTL_I915_QUERY.
So this patch should help for the corner case where the Kernel is too
old to support the query ioctl.

v2: improve commit message, cc stable (Lionel).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4006>
(cherry picked from commit aa78801f0a)
2020-03-04 08:27:34 -08:00
Paulo Zanoni
e7b8a304bd intel: fix the gen 12 compute shader scratch IDs
This is the same idea as "intel: fix the gen 11 compute shader scratch
IDs".

The number of EUs on TGL is not the same as ICL, but the
MEDIA_VFE_STATE restrictions stay the same, so adapt the code to it.
Also, consider the base configuration instead of what we read from the
Kernel.

According to Mark, this fixes the following piglit tests on TGL:

    piglit.spec.arb_compute_shader.execution.shared-atomicmax-uint.tglm64
    piglit.spec.arb_compute_shader.execution.shared-atomicmax-int.tglm64
    piglit.spec.intel_shader_atomic_float_minmax.execution.shared-atomicmax-float.tglm64

v2: s/ICL+/Gen11+/ (Jason).

Cc: mesa-stable@lists.freedesktop.org
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4006>
(cherry picked from commit 9e5ce30da7)
2020-03-04 08:27:33 -08:00
Paulo Zanoni
dd4df57ad9 intel: fix the gen 11 compute shader scratch IDs
Scratch space allocation is based on the number of threads in the base
configuration, and we only have one base configuration for ICL, with 8
subslices.

This fixes an issue with Aztec on Vulkan in a machine with a
configuration that's not the base. The issue looks like a regression
from b9e93db208, but it seems things are broken since forever, just
not easily reproducible.

v2: Reimplement it using the subslices variable. Don't touch TGL.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4006>
(cherry picked from commit 1efe139cad)
2020-03-04 08:27:32 -08:00
Dylan Baker
131136ab99 .pick_status.json: Update to 0ac731b1ff 2020-03-04 08:27:31 -08:00
Tapani Pälli
54574288a1 mesa/st: fix formats required for EXT_texture_norm16
Earlier commit did not take in to account that lists required for
rendering and texturing are parsed separately. This commit simply
removes formats added to the other list.

Fixes: de4eb9a3bb ("mesa/st: toggle EXT_texture_norm16 based on format support")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3961>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3961>
(cherry picked from commit fbd61b3fb6)
2020-03-02 11:29:19 -08:00
Jordan Justen
eb4a39ba2f intel/compiler: Restrict cs_threads to 64
Our current GPGPU_WALKER code only supports up to 64 threads.

On HSW we could use up to 70 and TGL up to 112, but only if the walker
is adjusted so the width does not exceed 64. Work to support this is
in progress.

Previous to this change, we might try to downgrade to SIMD8 if the
SIMD16 shader spilled. Since HSW and TGL have the max number of
threads above 64, we would then try to emit an invalid GPGPU walker
command.

Fixes: 932045061b ("i965/cs: Emit compute shader code and upload programs")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit cf12faef61)
2020-03-02 11:29:18 -08:00
Dylan Baker
d0a1bc1c21 .pick_status.json: Update to 3503cb4c28 2020-03-02 11:28:45 -08:00
Marek Olšák
6f9a26ac25 mesa: fix incorrect prim.begin/end for glMultiDrawElements
This has no effect on Gallium, but it affects tnl.

Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3990>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3990>
(cherry picked from commit 1a61a5b1d4)
Conflicts Resolved by Dylan Baker

Conflicts:
	src/mesa/main/draw.c
2020-02-28 14:34:33 -08:00
Jonathan Marek
8334c60cba turnip: fix srgb MRT
Register packing macros makes this only set the first bit. Set to whole
dword to fix srgb for color attachments >0.

Fixes: 59f29fc8 ("turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros.")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3979>
(cherry picked from commit 6420406f19)
2020-02-28 14:30:45 -08:00
Eric Anholt
a14b7adabd aco: Fix signed-vs-unsigned warning.
The previous instance of this comparision was 1u to avoid the warning, fix
this one too.

Fixes: dba71de5c6 ("aco: only create parallelcopy to restore exec at loop exit if needed")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
(cherry picked from commit b9773631d3)
2020-02-28 14:30:44 -08:00
Samuel Pitoiset
9fc39b9c67 ac/llvm: flush denorms for nir_op_fmed3 on GFX8 and older gens
The hardware doesn't flush denorms, exactly like fmin/fmax, so
we have to do it manually. This doesn't fix anything known.

Fixes: d6a07732c9 ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
(cherry picked from commit 9e5d2a73c5)
2020-02-28 14:30:40 -08:00
Samuel Pitoiset
710388f006 ac/llvm: fix 16-bit fmed3 on GFX8 and older gens
16-bit med3 is only supported on GFX9+.

Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f16.*.

Fixes: d6a07732c9 ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
(cherry picked from commit 30ac733680)
2020-02-28 14:30:39 -08:00
Samuel Pitoiset
c2f6f63f7b ac/llvm: fix 64-bit fmed3
Lower 64-bit fmed3 because LLVM doesn't expose an intrinsic.

Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f64.*.

Fixes: d6a07732c9 ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
(cherry picked from commit 50b8c25274)
2020-02-28 14:30:39 -08:00
Mathias Fröhlich
69edb32eaa mesa: Flush vertices before changing the OpenGL state.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3958>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3958>
(cherry picked from commit 636656bcd7)
2020-02-28 14:30:38 -08:00
Dave Airlie
f1ec137ec9 gallivm/nir: handle mod 0 better.
I haven't seen this crash but TGSI does it so best align with
it to avoid future issues.

Fixes: 44a6b0107b3J (gallivm: add nir->llvm translation (v2))
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3956>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3956>
(cherry picked from commit 2b155b1086)
2020-02-28 14:30:37 -08:00
Dave Airlie
981b2b6241 gallivm/nir: fix integer divide SIGFPE
Blender was crashing with a SIGFPE even though the divide by 0
logic was kicking in. I'm not sure why TGSI doesn't get into this state.

The problem was is the numerator was INT_MIN we'd replace the div by
0 with a divide by -1, which is an exception for INT_MIN as INT_MIN/-1
== INT_MAX + 1 (too large for 32-bits). Instead for integer divides
just replace the mask values with 0x7fffffff. Also fix up the
result handling so it aligns with TGSI usage. (gives 0)

Fixes: c717ac1247 ("gallivm/nir: wrap idiv to avoid divide by 0 (v2)")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3956>
(cherry picked from commit 5370c685da)
2020-02-28 14:30:37 -08:00
Dave Airlie
eb0f6e3ab8 gallivm/tgsi: fix stream id regression
This broke TGSI GS shaders with llvmpipe, it wasn't looking at the
right immediates and it should be cast to an integer type.

Fixes: 163d5fde06 (gallium/swr: Enable GL_ARB_gpu_shader5: multiple streams)

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Acked-by: Jan Zielinski <jan.zielinski@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3949>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3949>
(cherry picked from commit 954cf8e86b)
2020-02-28 14:30:36 -08:00
Marek Olšák
bb589b48eb mesa: call FLUSH_VERTICES before updating CoordReplace
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3947>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3947>
(cherry picked from commit 4449611ffb)
2020-02-28 14:30:36 -08:00
Rafael Antognolli
3aafc6ab8f iris: Apply the flushes when switching pipelines.
Even though the workaround description says:

   "all the listed commands are non-pipelined and hence flush caused due
   to pipeline mode change must not cause performance issues..."

My understanding is that we still need to have the flushes. Also, the
flushes are required not only to stall the pipeline, but also to clear
caches, so I don't think they can simply be discarded.

Additionally, while doing some testing that increased the number of
surface STATE_BASE_ADDRESS emitted, I got a lot more GPU hangs. Adding
these flushes fixes those hangs.

Fixes: b8fbb39a (iris: Implement Gen12 workaround for non pipelined
                 state)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3908>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3908>
(cherry picked from commit a70a605ad6)
2020-02-28 14:30:35 -08:00
Marek Olšák
0942c41582 tgsi_to_nir: set num_images and num_samplers with holes correctly
This fixes the copy_uv shader from st/omx, because it uses image 0 and 2
and image 1 isn't declared.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3936>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3936>
(cherry picked from commit c798aae739)
2020-02-28 14:30:35 -08:00
Eric Anholt
8a9f1293b8 turnip: Fix compiler warning about casting a nondispatchable handle.
Fixes: 1c5d84fcae ("turnip: hook up cmdbuffer event set/wait")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3916>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3916>
(cherry picked from commit bd53f4f56b)
2020-02-28 14:30:34 -08:00
Tapani Pälli
5f0ae1f0b4 mesa/st: toggle EXT_texture_norm16 based on format support
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2556
Fixes: 7f467d4f73 ("mesa: GL_EXT_texture_norm16 extension plumbing")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3941>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3941>
(cherry picked from commit de4eb9a3bb)
2020-02-28 14:30:34 -08:00
Tapani Pälli
0d189290f0 i965: toggle on EXT_texture_norm16
Fixes: 7f467d4f73 ("mesa: GL_EXT_texture_norm16 extension plumbing")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3941>
(cherry picked from commit 200a83a983)
2020-02-28 14:30:33 -08:00
Tapani Pälli
c38d7f390d mesa: introduce boolean toggle for EXT_texture_norm16
Fixes: 7f467d4f73 ("mesa: GL_EXT_texture_norm16 extension plumbing")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3941>
(cherry picked from commit dc531869a9)
2020-02-28 14:30:32 -08:00
Mathias Fröhlich
9f45639452 egl: Fix A2RGB10 platform_{device,surfaceless} PBuffer configs.
The __DRI_IMAGE_FORMAT_* part wants to be handled for the *101010
type formats as well. Factor out a common function for that task.
That again makes the piglit egl_ext_device_base test work again
for hardware drivers.

v2: Factor out a common function for that task.
v3: dri2_pbuffer_visuals -> dri2_pbuffer_visuals

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 9acb94b623 "egl: Enable 10bpc EGLConfigs for platform_{device,surfaceless}"
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3790>
(cherry picked from commit d32c458de7)
2020-02-28 14:30:32 -08:00
Jason Ekstrand
47bfaa8795 anv: Always enable the data cache
Because we set the needs_data_cache bit from the NIR during compilation,
any time a shader was pulled out of the pipeline cache, we wouldn't set
the bit and the data cache was disabled.  Fortunately, on Gen8+, this
bit is ignored because we always use the ALL section in the L3$ config
instead of separate DC and RO sections.  On Gen7, however, this meant
that we were basically never running with the data cache enabled and our
compute performance was suffering massively because of it.  This commit
improves Geekbench 5 scores on my Haswell GT3 by roughly 330% (no,
that's not a typo).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3912>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3912>
(cherry picked from commit 5dfd83d7a1)
2020-02-28 14:30:31 -08:00
Dylan Baker
956a9c7d2a .pick_status.json: Update to 0932363489 2020-02-28 14:30:30 -08:00
Jose Maria Casanova Crespo
20e6bcea31 v3d: Sync on last CS when non-compute stage uses resource written by CS
When a resource is written by a compute shader and then used by a
non-compute stage we sync on last compute job to guarantee that the
resource has been completely written when the next stage reads resources.

In the other cases how flushes are done guarantee the serialization of
the writes and reads.

To reproduce the failure the following tests should be executed in batch
as last test don't fail when run isolated:

KHR-GLES31.core.shader_image_load_store.basic-allFormats-load-fs
KHR-GLES31.core.shader_image_load_store.basic-allFormats-loadStoreComputeStage
KHR-GLES31.core.shader_image_load_store.basic-allTargets-load-cs
KHR-GLES31.core.shader_image_load_store.advanced-sync-vertexArray

v2: Use fence dep instead of bo_wait (Eric Anholt)
v3: Rename struct names (Iago Toral)
    Document why is not needed on graphics->compute case. (Iago Toral)
    Follow same code pattern of the other update of in_sync_bcl.
v4: Fixed comments style. (Iago Toral)

Fixes KHR-GLES31.core.shader_image_load_store.advanced-sync-vertexArray

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
CC: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2700>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2700>
(cherry picked from commit 01496e3d1e)
2020-02-25 08:54:34 -08:00
Andreas Baierl
16615264b2 gitlab-ci: lima: Add flaky tests to the skips list
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Cc: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3884>
(cherry picked from commit 31a8075678)
2020-02-25 08:54:33 -08:00
Dave Airlie
555bf5f991 glx/drisw: fix shm put image fallback
The fallback to the non-shm put path used the wrong width here
as the pixmap is still allocated in a shared segment, so the
width needs to reflect that.

Fixes: 02c3dad0f3 ("Call shmget() with permission 0600 instead of 0777")

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3823>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3823>
(cherry picked from commit 84395190ec)
2020-02-25 08:54:33 -08:00
Dave Airlie
c2dac5a508 glx/drisw: return false if shmid == -1
If an attempt to create an shm pixmap in XCreateDrawable fails
then it ends up with the shmid == -1. This means the get image
path needs to fallback so return false in this case to use the
non-shm get image path.

Fixes: 02c3dad0f3 ("Call shmget() with permission 0600 instead of 0777")

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3823>
(cherry picked from commit 246e4aeaef)
2020-02-25 08:54:32 -08:00
Dave Airlie
01929a5d90 glx/drisw: add getImageShm2 path
This adds return values to the get image path, so the caller can fallback.

Fixes: 02c3dad0f3 ("Call shmget() with permission 0600 instead of 0777")

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3823>
(cherry picked from commit 8d0bab8a93)
2020-02-25 08:54:31 -08:00
Dave Airlie
b5b626396e dri: add another get shm variant.
When Brian in 02c3dad0f3 restricted
the shm permissions it means we hit the fallback paths in some
scenarios we hadn't before.

When you use Xephyr to xdmcp from one user to another the new perms
stop the X server (running as user a) attaching to the SHM segments
from gnome-shell (running as user b).

In this case however only the GLX side of the code had insight into this,
and the dri could was meant of fall back, and it worked for put image
fine but the get image path was broken, since there was no indication
in the broken case of the need to fallback.

This adds a return type to a new interface member that lets the
caller know it has to fallback.

Fixes: 02c3dad0f3 ("Call shmget() with permission 0600 instead of 0777")

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3823>
(cherry picked from commit 466a0b2e49)
2020-02-25 08:54:30 -08:00
Dylan Baker
9243bbc9f9 .pick_status.json: Update to 01496e3d1e 2020-02-25 08:54:29 -08:00
Chris Wilson
5d48c29658 iris: Fix import sync-file into syncobj
When importing a sync-file, the kernel expects to be told which syncobj
to replace with the new fence -- it does not automatically create a new
handle for us. Abide by this rule and create a new syncobj for the
imported sync-file.

Fixes: f459c56be6 ("iris: Add fence support using drm_syncobj")
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3919>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3919>
(cherry picked from commit ae7bda27a0)
2020-02-24 11:16:45 -08:00
Kenneth Graunke
d7f95b226b iris: Fix BLORP vertex buffers to respect ISL MOCS settings
Fixes: a4da6008b6 ("iris: Use mocs from isl_dev.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3720>
(cherry picked from commit 4bac2fa3c6)
2020-02-24 11:16:44 -08:00
Kenneth Graunke
23043719f7 iris: Make mocs an inline helper in iris_resource.h
Now that it uses ISL rather than genxml code, there's no need for it to
live as a vtable function inside the state module.  We can just make it
a static inline helper in iris_resource.h so it's available throughout
the codebase.

Fixes: a4da6008b6 ("iris: Use mocs from isl_dev.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3720>
(cherry picked from commit 1cdf5abdfa)
2020-02-24 11:16:43 -08:00
Arcady Goldmints-Orlov
4fbaecd768 spirv: Remove outdated SPIR-V decoration warnings
spirv_to_nir warns if it encounters XFB decorations and errors if
it encounters a Stream decoration with value other than 0, despite
the fact that these decorations are in fact handled correctly.

Fixes dEQP-VK.transform_feedback.simple.query_1_*
Fixes: cd4a14be06 "spirv: Handle XFB variable decorations"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3910>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3910>
(cherry picked from commit 5f3cbbd958)
2020-02-24 11:16:43 -08:00
Erik Faye-Lund
91de922c7f util: promote u_debug_memory.c to src/util
When os_memory_debug.h was promoted to src/util, this source-file on
which it depends on when the debug-flag is set on windows was left
out. So let's move this also.

It doesn't seem there's any way of triggering this issue right now, but
it seems better to correct this to avoid this from biting us in the ass
in the future.

Fixes: 88c4680b5a ("util: promote u_memory to src/util")
Reviewed-by: Dylan Baker <dylan@pnwbakers>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3844>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3844>
(cherry picked from commit 2e3318b151)
2020-02-24 11:16:42 -08:00
Dylan Baker
410dad2eb4 .pick_status.json: Update to e4baff9081 2020-02-24 11:16:25 -08:00
James Xiong
4d3f535ebb iris: handle the failure of converting unsupported yuv formats to isl
Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d8569baaed)

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3898>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3898>
2020-02-21 19:49:52 +00:00
Danylo Piliaiev
5a7ae6be76 i965: Do not generate D16 B5G6R5_UNORM configs on gen < 8
We don't support MESA_FORMAT_Z_UNORM16 before Gen8, see
intel_screen_init_surface_formats.

As a consequence disables B5G6R5_UNORM configs with depth
on gen < 6.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2275
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3206>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3206>
(cherry picked from commit 5bfd363be4)
2020-02-20 09:10:41 -08:00
Marek Olšák
5adcb0a62a util: remove the dependency on kcmp.h
Fixes: f76cbc7901 "util: Add os_same_file_description helper"

Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3860>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3860>
(cherry picked from commit f7bfb10c69)
2020-02-20 09:10:40 -08:00
Ian Romanick
01020aef25 intel/fs: Correctly handle multiply of fsign with a source modifier
The other source of the multiply will be interpreted as a uint32_t in an
XOR instruction.  Any source modifiers with either not be interpreted at
all or will be misinterpreted due to the differing types.

If the other operand of the multiplication has a source modifier, just
emit an extra move to resolve the source modifiers.

The negation source modifier problem is difficult to reproduce due to an
algebraic optimization that changes (-a*b) to -(a*b).  However, changes
in MR !1359 push the negations back down.

On Gen7+ it might be possible to do slightly better for an abs() source
modifier by using BFI2 as a glorified copysign().

On Gen8+ it might be possible to do slightly better for a neg() source
modifier by emitting (~a ^ b).

There were no shader-db changes on any Intel platform, so I think we can
deal with that problem when it arises.

See also piglit!224.

Fixes: 06d2c11641 ("intel/fs: Add a scale factor to emit_fsign")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3780>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3780>
(cherry picked from commit 273b8cd1ca)
2020-02-20 09:10:40 -08:00
Bas Nieuwenhuizen
d43407e622 radeonsi: Fix compute copies for subsampled formats.
We cannot do image stores (or render) to subsampled formats.

Reinterpret as R32_UINT instead.

si_set_shader_image_desc already uses the blockwidth from
the view formats, so the image width adjustments are
already implemented.

This is still icky with mipmapping on GFX9+ though, but
since it is mostly a video format I don't think that will
be much of an issue and broken mipmapping is still better
than broken everything.

Fixes: e5167a9276 "radeonsi: disable SDMA on gfx8 to fix corruption on RX 580"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2535
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3853>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3853>
(cherry picked from commit 68d1757420)
2020-02-20 09:10:39 -08:00
Ian Romanick
760b8cfd1c nir/search: Use larger type to hold linearized index
"index" is an offset into a linearized 3-dimensional array.  Starting
with fbd5359a0a, the 3-dimensional array can have 43 elements in each
dimension.  43**3 = 79507, and that will overflow the uint16_t.

See also the discussion in MR !3765.

Fixes: fbd5359a0a ("nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select")
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3871>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3871>
(cherry picked from commit 58bdc1c748)
2020-02-20 09:10:39 -08:00
Michel Dänzer
5376a79962 st/vdpau: Only call is_video_format_supported hook if needed
Namely only if *is_supported is true, otherwise the hook result can't
affect it.

Avoids

../src/gallium/state_trackers/vdpau/vdpau_private.h:138: FormatYCBCRToPipe: Assertion `0' failed.

with assertions enabled.

Fixes: 5d5b414a7b "st/vdpau: fix chroma_format handling in
                     VideoSurfaceQueryGetPutBitsYCbCrCapabilities"

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3848>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3848>
(cherry picked from commit 7e6010106f)
2020-02-20 09:10:38 -08:00
Eric Anholt
d499e9b4e2 llvmpipe: Fix real uninitialized use of "atype" for SEMANTIC_FACE
Fixes: 502548a09c ("gallivm/llvmpipe: add support for front facing in sysval.")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3867>
(cherry picked from commit 45b2ccc6b3)
2020-02-20 09:10:38 -08:00
Marek Olšák
387ee33305 mesa: fix immediate mode with tessellation and varying patch vertices
Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3861>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3861>
(cherry picked from commit 2e05a280b6)
2020-02-20 09:10:37 -08:00
Caio Marcelo de Oliveira Filho
cdc4ea1496 intel/gen12: Take into account opcode when decoding SWSB
The interpretation of the fields is different depending whether the
instruction is a SEND/MATH or not.

This fixes the disassembly output for non-SEND/MATH instructions that
have both in-order and out-of-order dependencies.  Their dependencies
were wrongly represented as `@A $B` when the correct would be `@A
$B.dst`.

Fixes: 6154cdf924 ("intel/eu/gen12: Add auxiliary type to represent SWSB information during codegen.")
Fixes: 83612c0127 ("intel/disasm/gen12: Disassemble software scoreboard information.")
Acked-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3660>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3660>
(cherry picked from commit 79788b8f7f)
2020-02-20 09:10:37 -08:00
Dylan Baker
0d9f4578d6 .pick_status.json: Update to 8291d728dc 2020-02-20 09:10:35 -08:00
Dylan Baker
16c364aaf0 docs: Add release notes for 20.0.0 2020-02-19 11:58:58 -08:00
Dylan Baker
9abde3412d VERSION: bump for 20.0.0 release 2020-02-19 11:27:25 -08:00
Dylan Baker
dfdfddc4bb docs: Empty new_features.txt 2020-02-19 11:27:00 -08:00
Dylan Baker
604fa0dcad Docs: Add 20.0.0 release notes 2020-02-19 11:26:47 -08:00
Danylo Piliaiev
555cb9e7d2 st/nir: Unify inputs_read/outputs_written before serializing NIR
Otherwise input/output interfaces won't be unified when reading
NIR from a cache.

Fixes piglit test on iris:
  clip-distance-vs-gs-out.shader_test

Fixes: 19ed12af

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3787>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3787>
(cherry picked from commit b684ba6ce7)
2020-02-18 11:21:47 -08:00
luc
a06fc57fda zink: confused compilation macro usage for zink in target helpers.
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3831>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3831>
(cherry picked from commit 692093fbdc)
2020-02-18 11:21:47 -08:00
Mathias Fröhlich
f27e5d9df5 egl: Implement getImage/putImage on pbuffer swrast.
This change adds getImage/putImage callbacks to the swrast pbuffer
loader extension.
This fixes a recent crash with Weston as well as a crashing
test with classic swrast without an official gitlab issue.

v2: Determine bytes per pixel differently and fix non X11 builds.
v3: Plug memory leak and fix crash on out of bounds access.
    (Daniel Stone)
v4: Follow the code structure of the wayland get/put image
    implementation - hopefully being more obvious.
    Handle 64 bits formats.
    Use BufferSize directly.
    (Emil Velikov)
v5: Change pixel size computation.
    (Eric Engestrom)

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2219
Fixes: d6edccee8d "egl: add EGL_platform_device support"
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3711>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3711>
(cherry picked from commit c7617d8908)
2020-02-18 11:21:46 -08:00
Alyssa Rosenzweig
52714f58ff pan/midgard: Use fprintf instead of printf for constants
I was wondering where those constants disappeared to :-)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 968f36d1fc ("pan/midgard: Support disassembling to a file")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>
(cherry picked from commit 8ab0bf1f93)
2020-02-18 11:21:46 -08:00
Alyssa Rosenzweig
01dab4b7fe pan/midgard: Don't crash with constants on unknown ops
Just use a dummy name instead.. we can't know a priori what type an
unknown op will consume, but we don't want to dereference a null
pointer.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 24360966ab ("panfrost/midgard: Prettify embedded constant
prints")

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>
(cherry picked from commit 6af14d3685)
2020-02-18 11:21:45 -08:00
Alyssa Rosenzweig
3b65640ea5 pan/midgard: Fix missing prefixes
I was wondering where those were going... :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: c1952779d6 ("pan/decode: Dump to a file")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>
(cherry picked from commit fbe1fd3de0)
2020-02-18 11:21:45 -08:00
Francisco Jerez
e6d598b69f intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow.
Together with the fixup_nomask_control_flow() pass introduced in a
previous patch, this implements a less invasive alternative to the
workaround documented in the hardware spec for GEN:BUG:1407528679,
which doesn't involve disabling structured control flow.

Under some conditions Gen12 hardware can end up executing a BB with
all channels disabled, which will lead to the execution of any NoMask
instructions in it, even though any execution-masked instructions will
be correctly shot down.  This could break assumptions of the SWSB pass
if the data computed by a NoMask instruction is synchronized against
by using an SWSB annotation baked into a regular execution-masked
instruction, since the first (NoMask) instruction may be executed
redundantly by the hardware, even though the second will correctly be
shot down, potentially leading to a RaW or WaW hazard if a third
instruction subsequently accesses the destination register of the
first instruction.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e14529ff32)
2020-02-18 11:21:44 -08:00
Francisco Jerez
5188cd51e2 intel/fs/gen12: Fixup/simplify SWSB annotations of SIMD32 scratch writes.
Found by inspection.  Existing code was trying to avoid assuming that
an SBID had been assigned to the virtual instruction, but
synchronizing the header setup with respect to the previous SIMD16
SEND by using SYNC.ALLRD doesn't really seem possible unless the SEND
instruction had been assigned an SBID.  Assert-fail instead if no SBID
has been allocated.

Fixes: 15e3a0d9d2 "intel/eu/gen12: Set SWSB annotations in hand-crafted assembly."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4e4e8d793f)
2020-02-18 11:21:44 -08:00
Francisco Jerez
96052fdc24 intel/fs/gen12: Workaround unwanted SEND execution due to broken NoMask control flow.
This is a less invasive alternative to the workaround documented in
the hardware spec for GEN:BUG:1407528679, which doesn't involve
disabling structured control flow (it's unlikely that switching to
GOTO/JOIN would have actually fixed the problem anyway).

Under some conditions Gen12 hardware can end up executing a BB with
all channels disabled, which will lead to the execution of any NoMask
instructions in it, even though any execution-masked instructions will
be correctly shot down.  This may break assumptions of some NoMask
SEND messages whose descriptor depends on data generated by live
invocations of the shader.

This avoids the problem by predicating certain instructions on an ANY
horizontal predicate that makes sure that their execution is omitted
when all channels of the program are disabled.  The shader-db impact
of this patch seems to be minimal:

total instructions in shared programs: 17169833 -> 17169913 (0.00%)
instructions in affected programs: 30663 -> 30743 (0.26%)
helped: 0
HURT: 42

total cycles in shared programs: 336966176 -> 336968568 (0.00%)
cycles in affected programs: 2367290 -> 2369682 (0.10%)
helped: 0
HURT: 13

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a8ac0bd759)
2020-02-18 11:21:43 -08:00
Francisco Jerez
fa5b4bb2df intel/fs: Add virtual instruction to load mask of live channels into flag register.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 008f95a043)
2020-02-18 11:21:42 -08:00
Francisco Jerez
5df7a39c1b intel/fs/gen7: Fix fs_inst::flags_written() for SHADER_OPCODE_FIND_LIVE_CHANNEL.
We need to pass a width of 32 since the opcode bashes the whole f1.0
register on IVB.  This is unlikely to have caused problems since f1.0
is largely unused currently.  That's likely to change soon though,
even on platforms other than Gen7.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b8b509fb92)
2020-02-18 11:21:42 -08:00
Francisco Jerez
d92cf57345 intel/fs/cse: Make HALT instruction act as CSE barrier.
Found by inspection.  This seems particularly likely to cause problems
with instructions dependent on the current execution mask like
SHADER_OPCODE_FIND_LIVE_CHANNEL or the FS_OPCODE_LOAD_LIVE_CHANNELS
instruction I'm about to introduce, but one could imagine it leading
to data corruption if CSE ever managed to combine two instructions
before and after the FS_OPCODE_PLACEHOLDER_HALT, since the one before
may not be executed for some channels.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: 20.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c9e33e5cbf)
2020-02-18 11:21:41 -08:00
Marek Olšák
5547fc53c0 radeonsi: don't wait for shader compilation to finish when destroying a context
This was a hack for glsl_types deinitialization and it predates the proper
fix, which was the addition of glsl_type_singleton_decref.

This fixes a crash when the context is destroyed via the atexit handler.

Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3800>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3800>
(cherry picked from commit 7e2b4bf256)
2020-02-18 11:21:41 -08:00
Dylan Baker
99d3ad1c71 .pick_status.json: Update to bee5c9b0dc 2020-02-18 11:21:39 -08:00
Pierre-Eric Pelloux-Prayer
0dce180b43 radeonsi/ngg: add VGT_FLUSH when enabling fast launch
(cherry-picked from 3da91b3327)

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3846>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3846>
2020-02-18 17:29:22 +00:00
Krzysztof Raszkowski
1a35b3d286 gallium/swr: simplify environmental variabled expansion code
There were 2 versions of code doing the same thing.
Since std::regexp are locale-sensitive better is to leave old
good way to do this.
This commit fix problem with bd934ff613.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3843>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3843>
2020-02-17 18:05:54 +01:00
Timothy Arceri
ed90606831 glsl: fix gl_nir_set_uniform_initializers() for image arrays
The if was incorrectly checking for an image type on what could
be an array of images. Here we change it to use the type stored
in uniform storage which has already been stripped of arrays,
this is what the above code for samplers does also.

Fixes: 2bf91733fc ("nir/linker: Set the uniform initial values")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3757>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3757>
(cherry picked from commit 676869e1d4)
2020-02-14 09:40:59 -08:00
Dylan Baker
ea43cc6702 .pick_status.json: Update to 946eacbafb 2020-02-14 09:40:58 -08:00
Thong Thai
edcc8ba73a Revert "st/va: Convert interlaced NV12 to progressive"
This reverts commit 2add63060b.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2454
Fixes: 2add63060b "st/va: Convert interlaced NV12 to progressive"
Signed-off-by: Thong Thai <thong.thai@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3815>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3815>
(cherry picked from commit 08cff938b7)
2020-02-13 10:39:31 -08:00
Erik Faye-Lund
6aacd69a02 Revert "nir: Add a couple trivial abs optimizations"
These were already added in 9fdaeb7776 ("nir: add min/max optimisation"),
and there's no point in doing them twice.

This reverts commit e4d346c86d.

Fixes: e4d346c86d ("nir: Add a couple trivial abs optimizations")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3786>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3786>
(cherry picked from commit 0c1ba69a27)
2020-02-13 10:39:30 -08:00
Tapani Pälli
1c0b76a382 iris: fix aux buf map failure in 32bits app on Android
Cc: mesa-stable@lists.freedesktop.org
Reported-by: Zhifang Long <zhifang.long@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3784>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3784>
(cherry picked from commit fdd20be324)
2020-02-13 10:39:29 -08:00
Tapani Pälli
a429930e0a glsl: fix a memory leak with resource_set
==7265== 248 (120 direct, 128 indirect) bytes in 1 blocks are definitely lost in loss record 1,438 of 1,465
   ==7265==    at 0x483980B: malloc (vg_replace_malloc.c:309)
   ==7265==    by 0x598A2AB: ralloc_size (ralloc.c:119)
   ==7265==    by 0x598F861: _mesa_set_create (set.c:127)
   ==7265==    by 0x599079D: _mesa_pointer_set_create (set.c:570)
   ==7265==    by 0x58BD7D1: build_program_resource_list(gl_context*, gl_shader_program*, bool) (linker.cpp:4026)
   ==7265==    by 0x548231B: st_link_shader (st_glsl_to_ir.cpp:170)
   ==7265==    by 0x54DA269: _mesa_glsl_link_shader (ir_to_mesa.cpp:3119)

Fixes: a6aedc66 ("st/glsl_to_nir: use nir based program resource list builder")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3574>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3574>
(cherry picked from commit f7d1bf075a)
2020-02-13 10:39:29 -08:00
Peng Huang
5c0a93b6eb radeonsi: make si_fence_server_signal flush pipe without work
glSignalSemaphoreEXT sometime doesn't signal the semaphore, it is
because radeonsi doesn't flush if gl context doesn't have pending
work. Fix the porblem by always submit ib.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3779>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3779>
(cherry picked from commit 0660cbf426)
2020-02-13 10:39:28 -08:00
Dylan Baker
0e0abec7b6 .pick_status.json: Update to 2a98cf3b2e 2020-02-13 10:39:27 -08:00
Dylan Baker
6bbbef9699 VERSION: bump for 20.0.0-rc3 2020-02-13 10:18:44 -08:00
Samuel Pitoiset
fa0dcef2ef nir: do not use De Morgan's Law rules for flt and fge
In presence of NaNs, "!(flt(a, b) && flt(c, d))" is NOT EQUAL
to "fge(a, b) || fge(c, d)". These optimizations are unsafe for
apps that rely on NaN behaviour.

pipeline-db (GFX9/LLVM):
Totals from affected shaders:
SGPRS: 3176 -> 3136 (-1.26 %)
VGPRS: 2188 -> 2144 (-2.01 %)
Spilled SGPRs: 227 -> 169 (-25.55 %)
Code Size: 150572 -> 151800 (0.82 %) bytes
Max Waves: 307 -> 310 (0.98 %)

pipeline-db (GFX9/ACO):
Totals from affected shaders:
SGPRS: 18744 -> 18744 (0.00 %)
VGPRS: 15576 -> 15580 (0.03 %)
Spilled SGPRs: 164 -> 164 (0.00 %)
Code Size: 1573012 -> 1576492 (0.22 %) bytes
Max Waves: 1534 -> 1532 (-0.13 %)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2127
Fixes: d1ed4ffe0b ("nir: Use De Morgan's Law on logic compounded comparisons")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3696>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3696>
(cherry picked from commit 8e77280774)
2020-02-11 09:49:15 -08:00
Samuel Pitoiset
4558bdb95a aco: fix creating v_madak if v_mad_f32 has two sgpr literals
Do not ignore that src1 can be a sgpr.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2435
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759>
(cherry picked from commit ddd767387f)
2020-02-11 09:49:15 -08:00
Vinson Lee
75ea9c808d panfrost: Remove unused anonymous enum variables.
This patch fix these build errors with GCC 10.

/usr/bin/ld: src/gallium/drivers/panfrost/libpanfrost.a(pan_resource.c.o):src/panfrost/midgard/midgard_compile.h:52: multiple definition of `pan_sysval'; src/gallium/drivers/panfrost/libpanfrost.a(pan_screen.c.o):src/panfrost/midgard/midgard_compile.h:52: first defined here
/usr/bin/ld: src/gallium/drivers/panfrost/libpanfrost.a(pan_resource.c.o):src/panfrost/midgard/midgard_compile.h:68: multiple definition of `pan_special_attributes'; src/gallium/drivers/panfrost/libpanfrost.a(pan_screen.c.o):src/panfrost/midgard/midgard_compile.h:68: first defined here

Fixes: 7e8de5a707 ("panfrost: Implement system values")
Fixes: 306800d747 ("pan/midgard: Lower gl_VertexID/gl_InstanceID to attributes")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3752>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3752>
(cherry picked from commit 63345a3596)
2020-02-11 09:49:14 -08:00
Eric Anholt
e5f13bca20 Revert "gallium: Fix big-endian addressing of non-bitmask array formats."
This reverts the functional part of commit
d17ff2f7f1, leaving the unit test for
mesa/pipe agreement on what's an array.

The issue is that the util_channel_desc.shift values on array formats are
not used for bit addressing in memory, they're bit addressing within a
word treating a pixel of the format as a native type, as seen by
llvmpipe's use of the values to do shifts (see
lp_build_unpack_arith_rgba_aos() for example).  This means the values are
nonsensical for 3-byte RGB, but then llvmpipe doesn't expose those formats
so it works out.

I still want to clean up our big-endian format handling at some point, but
let's fix the s390x regression first, sort out our format unit tests in
CI, then be able to refactor with confidence.

Fixes: d17ff2f7f1 ("gallium: Fix big-endian addressing of non-bitmask array formats.")
Closes: #2472
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3721>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3721>
(cherry picked from commit 1886dbfe73)
2020-02-11 09:49:14 -08:00
Marek Olšák
f93c8d8598 radeonsi: fix the DCC MSAA bug workaround
Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3646>
(cherry picked from commit fbb27eebc8)
2020-02-11 09:49:13 -08:00
Neha Bhende
c4e1dd07eb svga: Use pipe_shader_state_from_tgsi to set shader state
Use pipe_shader_state_from_tgsi() to set shader state for transformed
shader so that we get all correct data for respective shader state.

This fixes several regressed glretrace, piglit crashes found during merging
upsteam mesa

Fixes: bf12bc2dd7 (draw: add nir info gathering and building support)

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 144561dc5e)
2020-02-11 09:49:12 -08:00
Neha Bhende
f86e27156d svga: fix size of format_conversion_table[]
Since we are now using sparse matrix for format_conversion_table,
we have to make sure we have last entry in table which gives the
sense of required size of format_conversion_table

Fixes: 84db6ba7 ("svga: Drop unsupported formats from the format table")

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 470e73e7f8)
2020-02-11 09:49:12 -08:00
Dylan Baker
9724b0f32c .pick_status.json: Update to 2303762735 2020-02-11 09:49:08 -08:00
Samuel Pitoiset
a3bd400c14 aco: fix waiting for scalar stores before "writing back" data on GFX8-GFX9
Seems required also on GFX8-GFX9 to achieve correct behaviour. This
is an undocumented behaviour but it makes real sense to me.

pipeline-db on GFX9:
Totals from affected shaders:
SGPRS: 1018 -> 1018 (0.00 %)
VGPRS: 516 -> 516 (0.00 %)
Code Size: 40516 -> 40636 (0.30 %) bytes
Max Waves: 280 -> 280 (0.00 %)

This fixes some sort of sun flickering with Assassins Creed Origins.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2488
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3750>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3750>
(cherry picked from commit 34fd894e42)
2020-02-10 09:05:28 -08:00
Georg Lehmann
32dc7fff47 Vulkan overlay: use the corresponding image index for each swapchain
pImageIndices should be a pointer to the current image index
otherwise every swapchain but the first one could have a wrong image index

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3741>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3741>
(cherry picked from commit 7283c33b98)
2020-02-10 09:05:28 -08:00
Marek Olšák
027f9c887c radeonsi: don't report that multi-plane formats are supported
Fixes: a554b45d - st/mesa: don't lower YUV when driver supports it natively
Closes: #2376

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3632>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3632>
(cherry picked from commit 35961b10da)
2020-02-10 09:05:27 -08:00
Hyunjun Ko
f3f4751851 freedreno/ir3: put the conversion back for half const to the right place.
The previous commit leads to match immed values unexpectedly.

This makes constlen for each shader including bvert wrong.
Also fixes atan2 for mediump deqp tests.

Fixes: cbd1f47433 ("freedreno/ir3: convert back to 32-bit values for half constant registers.")

v2: Move conversion up above fabs/fneg modifier handling as well.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3737>
(cherry picked from commit 260bd32b58)
2020-02-10 09:05:27 -08:00
Dylan Baker
a25c7674aa .pick_status.json: Update to 689817c9df 2020-02-10 09:05:25 -08:00
Georg Lehmann
1d17f42732 Vulkan Overlay: Don't try to change the image layout to present twice
The render pass already does the transition.
The pipeline barrier is still needed to transfer the queue family ownership.

Fixes: 320b0f66c2 ("vulkan/overlay: bounce image back to present layout")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3740>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3740>
(cherry picked from commit f239bb8020)
2020-02-07 09:21:52 -08:00
Samuel Pitoiset
e393404ff1 aco: do not use ds_{read,write}2 on GFX6
According to LLVM, these instructions have a bounds checking bug.
LLVM only uses them on GFX7+.

This fixes broken geometry in Assassins Creed Origins.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2489
Fixes: 4a553212fa ("radv: enable ACO support for GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3746>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3746>
(cherry picked from commit 4b978cd950)
2020-02-07 09:21:52 -08:00
Tapani Pälli
1f8db81632 intel/vec4: fix valgrind errors with vf_values array
Fixes valgrind errors introduced since commit a8ec4082.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2346
Fixes: a8ec4082 ("nir+vtn: vec8+vec16 support")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3691>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3691>
(cherry picked from commit da76dfb515)
2020-02-07 09:21:51 -08:00
Rhys Perry
8f29aaa2cf aco: fix gfx10_wave64_bpermute
Since 9254fb4fc7, the pass replaced the SCC clobber with the scalar
identity temporary. Just skip most of the temporary setup, since we don't
need it for gfx10_wave64_bpermute.

Although shuffles are disabled on GFX10, Detroit: Become Human seems to
use them anyway.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 9254fb4fc7 ('aco: don't use a scalar
       temporary for reductions on GFX10')

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3683>
(cherry picked from commit 20eb1acb6f)
2020-02-07 09:21:49 -08:00
Georg Lehmann
1e0cc313ba Correctly wait in the fragment stage until all semaphores are signaled
This fixes two issues:
- a crash if the application uses more than one semaphore for presenting because the driver expects one stage per semaphore
- the swapchain image could be not ready yet if the semaphores aren't signaled, #946 is possible related

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3718>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3718>
(cherry picked from commit 1c79afd946)
2020-02-07 09:21:47 -08:00
Thomas Hellstrom
d96f0faacf svga: Fix banded DMA upload
A previous commit ("winsys/svga: Limit the maximum DMA hardware buffer
size") made banded DMA transfer kick in when transfering gnome-shell
window contents under gnome-shell / wayland. This uncovered a bug where
we assumed that banded DMA transfers always occur to the top (y=0) of the
surface.
Fix this by taking the destination y offset into account.

Cc: 19.2 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Fixes: 287c94ea49 ("Squashed commit of the following:")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3733>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3733>
(cherry picked from commit 451cf228d5)
2020-02-07 09:21:46 -08:00
Lionel Landwerlin
203710e94c anv: set MOCS on push constants
v2: Also set MOCS on 3DSTATE_CONSTANT_ALL (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 67d2cb3e93 ("anv: Add get_push_range_address() helper.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3732>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3732>
(cherry picked from commit f9febfae41)
2020-02-07 09:21:45 -08:00
Rafael Antognolli
d189ab9fcc intel: Load the driver even if I915_PARAM_REVISION is not found.
This param is only available starting on kernel 4.1. Use a default
value of 0 if it is not found instead.

v2: Update commit message (Lionel)

Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Mark Janes <mark.a.janes@intel.com>
Fixes: 96e1c945f2 ("i965: Move device info initialization to common
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3727>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3727>
(cherry picked from commit 4aa7af9e9a)
2020-02-07 09:21:44 -08:00
Vinson Lee
bd934ff613 swr: Fix GCC 4.9 checks.
Fixes: f0a22956be ("swr/rast: _mm*_undefined_* implementations for gcc<4.9")
Fixes: e21fc2c625 ("swr/rast: non-regex knob fallback code for gcc < 4.9")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
(cherry picked from commit deb2bbf57e)
2020-02-07 09:21:44 -08:00
James Xiong
51f7d81dd2 gallium: let the pipe drivers decide the supported modifiers
fixes: ac0219cc5b ("gallium: dmabuf support for yuv formats that are not natively supported")

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3527>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3527>
(cherry picked from commit 205ce0bea5)
2020-02-07 09:21:43 -08:00
Timur Kristóf
06a9d51f27 aco/optimizer: Don't combine uniform bool s_and to s_andn2.
Fixes: 8a32f57fff

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714>
(cherry picked from commit 4d34abd15c)
2020-02-07 09:21:43 -08:00
Dylan Baker
419c992e65 .pick_status.json: Update to d8bae10bfe 2020-02-07 09:21:42 -08:00
Dylan Baker
440e52ca03 VERSION: bump to 20.0-rc2 2020-02-07 08:42:33 -08:00
Krzysztof Raszkowski
8dede1745c gallium/swr: Fix gcc 4.8.5 compile error
Stop using C++14 feature so it can be compile on default centos7
gcc compiler.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3679>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3679>
2020-02-05 17:58:23 +00:00
Ian Romanick
222b47c6c5 intel/fs: Don't count integer instructions as being possibly coissue
Integer instructions don't coissue.  Before e64be391dd
("intel/compiler: generalize the combine constants pass"), this pass
only looked at float sources.  There's no shader-db data in that commit,
so I collected some.  The results are not good:

    Haswell
    total instructions in shared programs: 11898805 -> 11908127 (0.08%)
    instructions in affected programs: 1218680 -> 1228002 (0.76%)
    helped: 2
    HURT: 5171
    helped stats (abs) min: 12 max: 111 x̄: 61.50 x̃: 61
    helped stats (rel) min: 1.59% max: 9.20% x̄: 5.40% x̃: 5.40%
    HURT stats (abs)   min: 1 max: 311 x̄: 1.83 x̃: 1
    HURT stats (rel)   min: 0.02% max: 9.91% x̄: 1.05% x̃: 0.70%
    95% mean confidence interval for instructions value: 1.55 2.05
    95% mean confidence interval for instructions %-change: 1.02% 1.08%
    Instructions are HURT.

    total cycles in shared programs: 221664974 -> 221404750 (-0.12%)
    cycles in affected programs: 120012620 -> 119752396 (-0.22%)
    helped: 3464
    HURT: 3159
    helped stats (abs) min: 1 max: 428160 x̄: 314.55 x̃: 16
    helped stats (rel) min: <.01% max: 57.33% x̄: 3.40% x̃: 1.28%
    HURT stats (abs)   min: 1 max: 87846 x̄: 262.54 x̃: 14
    HURT stats (rel)   min: <.01% max: 85.57% x̄: 3.01% x̃: 0.77%
    95% mean confidence interval for cycles value: -224.23 145.65
    95% mean confidence interval for cycles %-change: -0.50% -0.19%
    Inconclusive result (value mean confidence interval includes 0).

    total spills in shared programs: 9804 -> 10047 (2.48%)
    spills in affected programs: 6869 -> 7112 (3.54%)
    helped: 2
    HURT: 41

    total fills in shared programs: 19863 -> 20319 (2.30%)
    fills in affected programs: 17428 -> 17884 (2.62%)
    helped: 2
    HURT: 41

    LOST:   20
    GAINED: 13

This also prevents regressions in "intel/fs: Promote integer constants
after lowering integer multiplication" (note: that patch will probably
not be committed).  When the passes are reorderd, code like

    mul(8)      acc0<1>D    g9<8,8,1>D  -2078209981D    { align1 1Q };

gets turned into

    mov(1)      g23<1>D     2078209981D                 { align1 WE_all 1N };
    ...
    mul(8)      acc0<1>D    g13<8,8,1>D  -g23<0,1,0>D   { align1 1Q compacted };

It's not 100% clear why, but these produce different results.  Note that
-2078209981 & 0x0ffff = 0x0843, and -(2078209981 & 0x0ffff) =
0xffff0843.  It seems like the upper 16-bits of the negation should be
ignored.

Fixes: e64be391dd ("intel/compiler: generalize the combine constants pass")
Cc: Iago Toral Quiroga <itoral@igalia.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

The shaders with spills or fills hurt are the usual suspects.  A couple
compute shaders in Dirt Showdown and a compute shader in Bioshock
Infinite.  On Haswell, a compute shader (that appears twice in
shader-db) from Aztec Ruins was also hurt for spill and fills.

Haswell
total instructions in shared programs: 11573934 -> 11568335 (-0.05%)
instructions in affected programs: 828623 -> 823024 (-0.68%)
helped: 2825
HURT: 6
helped stats (abs) min: 1 max: 134 x̄: 2.16 x̃: 1
helped stats (rel) min: 0.02% max: 9.05% x̄: 0.84% x̃: 0.61%
HURT stats (abs)   min: 1 max: 216 x̄: 81.83 x̃: 56
HURT stats (rel)   min: 0.16% max: 8.65% x̄: 4.21% x̃: 4.68%
95% mean confidence interval for instructions value: -2.31 -1.64
95% mean confidence interval for instructions %-change: -0.85% -0.80%
Instructions are helped.

total cycles in shared programs: 187573593 -> 187004633 (-0.30%)
cycles in affected programs: 82816107 -> 82247147 (-0.69%)
helped: 2186
HURT: 1741
helped stats (abs) min: 1 max: 35230 x̄: 326.96 x̃: 16
helped stats (rel) min: <.01% max: 46.11% x̄: 3.11% x̃: 0.90%
HURT stats (abs)   min: 1 max: 6138 x̄: 83.73 x̃: 16
HURT stats (rel)   min: <.01% max: 104.11% x̄: 2.73% x̃: 0.75%
95% mean confidence interval for cycles value: -197.13 -92.64
95% mean confidence interval for cycles %-change: -0.72% -0.33%
Cycles are helped.

total spills in shared programs: 7870 -> 7743 (-1.61%)
spills in affected programs: 2260 -> 2133 (-5.62%)
helped: 31
HURT: 5

total fills in shared programs: 6320 -> 6263 (-0.90%)
fills in affected programs: 3547 -> 3490 (-1.61%)
helped: 31
HURT: 6

LOST:   9
GAINED: 9

Ivybridge
total instructions in shared programs: 11863372 -> 11859793 (-0.03%)
instructions in affected programs: 757183 -> 753604 (-0.47%)
helped: 2236
HURT: 3
helped stats (abs) min: 1 max: 81 x̄: 1.86 x̃: 1
helped stats (rel) min: 0.03% max: 5.26% x̄: 0.74% x̃: 0.48%
HURT stats (abs)   min: 11 max: 301 x̄: 192.33 x̃: 265
HURT stats (rel)   min: 1.55% max: 10.51% x̄: 6.89% x̃: 8.62%
95% mean confidence interval for instructions value: -2.01 -1.18
95% mean confidence interval for instructions %-change: -0.77% -0.70%
Instructions are helped.

total cycles in shared programs: 178377378 -> 177946087 (-0.24%)
cycles in affected programs: 76261390 -> 75830099 (-0.57%)
helped: 1635
HURT: 1395
helped stats (abs) min: 1 max: 34796 x̄: 333.53 x̃: 16
helped stats (rel) min: <.01% max: 47.15% x̄: 2.82% x̃: 0.64%
HURT stats (abs)   min: 1 max: 4315 x̄: 81.74 x̃: 18
HURT stats (rel)   min: <.01% max: 49.98% x̄: 1.99% x̃: 0.53%
95% mean confidence interval for cycles value: -197.06 -87.62
95% mean confidence interval for cycles %-change: -0.78% -0.43%
Cycles are helped.

total spills in shared programs: 4188 -> 4182 (-0.14%)
spills in affected programs: 1557 -> 1551 (-0.39%)
helped: 30
HURT: 3

total fills in shared programs: 5056 -> 5245 (3.74%)
fills in affected programs: 2708 -> 2897 (6.98%)
helped: 30
HURT: 3

LOST:   5
GAINED: 1

No shader-db changes on any other Intel platform.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3544>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3544>
(cherry picked from commit 59488cbbac)
2020-02-05 08:58:42 -08:00
Eric Engestrom
ccf70209d2 util/disk_cache: check for write() failure in the zstd path
CoverityID: 1458074
Fixes: a8d941091f ("util: Use ZSTD for shader cache if possible")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3672>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3672>
(cherry picked from commit 2799676218)
2020-02-05 08:58:41 -08:00
Lionel Landwerlin
7538851a22 anv: implement gen9 post sync pipe control workaround
We've been missing this workaround for a while and since it's required
for Gen12, let's implement it for Gen9 first.

v2: Update comment for Gen9.

v3: Fix clearing of bits... (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3405>
(cherry picked from commit 8949d27bb8)
2020-02-05 08:58:41 -08:00
Rob Clark
c937b84643 freedreno: allow ctx->batch to be NULL
This was mostly true already, now that we use `fd_context_batch()` for
first access to batch in draw/clear/grid paths.  So we can drop the old
code in `batch_flush()` that tried to prevent `ctx->batch` from being
NULL.

Fixes a crash with a large number of tabs in chromium.

Cc: "20.0" mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3700>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3700>
(cherry picked from commit 2c07e03b79)
2020-02-05 08:58:40 -08:00
Bas Nieuwenhuizen
b8b9233d5a radv: Do not set SX DISABLE bits for RB+ with unused surfaces.
The extra bits in CB_SHADER_MASK break dual source blending in
SkQP on a Stoney device. However:

- As far as I can tell, some other dual source blend tests are passing
  before and after the change.
- A hacked around skqp passes on my Vega desktop and Raven laptop
- Getting Skqp to give any useful info or to run it outside of Android
  on ChromeOS is proving difficult.

I have confirmed 3 strategies that seem to work:
- The old radv behavior of setting CB_SHADER_MASK to 0xF
- AMDVLK: CB_SHADER_MASK = 0xFF, and the 3 RB+ regs
  are 0.
- radeonsi: CB_SHADER_MASK = 0xFF, but does not set DISABLE
  bits in SX_BLEND_OPT_CONTROL for CB 1-7.

Let us use the radeonsi solution as that solution also seems like the correct
thing to do for holes. I have tested on my Raven laptop that setting the high
surfaces to not disabled and downconvert to 32_R does not imply a performance
penalty.

Fixes: e9316fdfd4 "radv: fix setting CB_SHADER_MASK for dual source blending"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3670>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3670>
(cherry picked from commit 65a6dc5139)
2020-02-05 08:58:40 -08:00
Eric Engestrom
e163a39994 freedreno/perfcntrs: fix fd leak
CoverityID: 1110568, 1458071
Fixes: 5a13507164 ("freedreno/perfcntrs: add fdperf")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3671>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3671>
(cherry picked from commit cae6093266)
2020-02-05 08:58:39 -08:00
Danylo Piliaiev
ce7f021269 st/mesa: Handle the rest renderbuffer formats from OSMesa
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2189
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/989
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2036
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3216>
(cherry picked from commit d83abf1d37)
2020-02-05 08:58:39 -08:00
Eric Engestrom
26adf4b532 util/os_socket: fix header unavailable on windows
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2464
Fixes: e62c3cf350 ("util/os_socket: Include unistd.h to fix build error")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
(cherry picked from commit d1165ad18b)
2020-02-05 08:58:38 -08:00
Danylo Piliaiev
27e39daa4b i965: Do not set front_buffer_dirty if there is no front buffer
Otherwise there will be a warning:
 "libEGL warning: FIXME: egl/x11 doesn't support front buffer rendering."

Happens with EGL_KHR_surfaceless_context:

 eglMakeCurrent(egl_display, EGL_NO_SURFACE, EGL_NO_SURFACE, egl_context)
 eglMakeCurrent(egl_display, egl_surface, egl_surface, egl_context)
 glFlush() // Here will be a warning

Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1525
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3628>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3628>
(cherry picked from commit 36126b6211)
2020-02-05 08:58:37 -08:00
Dylan Baker
69604c6fd4 .pick_status.json: Update to 7eaf21cb6f 2020-02-05 08:58:35 -08:00
Erik Faye-Lund
450657e26c st/mesa: use uint-result for sampling stencil buffers
Otherwise, we end up mismatching the result-type and the sampler-type.

Fixes: 642125edd9 ("st/mesa: use uint-samplers for sampling stencil buffers")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3680>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3680>
(cherry picked from commit fd27fb5113)
2020-02-04 08:10:05 -08:00
Marek Vasut
ecf9055e9c etnaviv: Destroy rsc->pending_ctx set in etna_resource_destroy()
Destroy rsc->pending_ctx set in etna_resource_destroy(), otherwise
the memory is allocated, never free'd, and becomes unreachable. This
fixes a memory leak.

Fixes: 9e672e4d20 ("etnaviv: keep references to pending resources")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3633>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3633>
(cherry picked from commit c32bd325e7)
2020-02-04 08:10:05 -08:00
Jan Vesely
66bb55cc68 clover: Use explicit conversion from llvm::StringRef to std::string
Fixes build after llvm 777180a32b61070a10dd330b4f038bf24e916af1
("[ADT] Make StringRef's std::string conversion operator explicit")

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0ccda2ebff)
2020-02-04 08:10:04 -08:00
Dylan Baker
958f06b99f .pick_status.json: Update to 9afdcd64f2 2020-02-04 08:10:02 -08:00
Boris Brezillon
7bc056dd78 panfrost: Fix the damage box clamping logic
When the rendering are is not covering the whole FBO, and the biggest
damage rect is empty, we can have damage.max{x,y} > damage.min{x,y},
which leads to invalid reload boxes.

Fixes: 65ae86b854 ("panfrost: Add support for KHR_partial_update()")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3676>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3676>
(cherry picked from commit b550b7ef3b)
2020-02-03 08:50:51 -08:00
Bernd Kuhls
b5a345595a util/os_socket: Include unistd.h to fix build error
Fixes

In file included from ../src/util/os_socket.c:8:
../src/util/os_socket.h:26:1: error: unknown type name ‘ssize_t’; did you mean ‘size_t’?
 ssize_t os_socket_recv(int socket, void *buffer, size_t length, int flags);

seen with gcc version 8.3.0 (Buildroot 2019.11) and uClibc 1.0.32.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Fixes: ef5266ebd5 ("util/os_socket: Add socket related functions.")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3659>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3659>
(cherry picked from commit e62c3cf350)
2020-02-03 08:50:51 -08:00
Jason Ekstrand
0cac8f332a anv/blorp: Use the correct size for vkCmdCopyBufferToImage
Now that we're using an uncompressed format for the buffer, we have to
scale down the dimensions we pass into BLORP when doing buffer->image
copies.

Fixes: dd92179a72 "anv: Canonicalize buffer formats for image/buffer..."
Closes: #2452
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3664>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3664>
(cherry picked from commit d7fe9af620)
2020-02-03 08:50:50 -08:00
Vinson Lee
e87c0c17b2 lima: Fix build with GCC 10.
This patch fixes this build error with GCC 10.

/usr/bin/ld: src/gallium/drivers/lima/liblima.a(lima_context.c.o):src/gallium/drivers/lima/lima_util.h:32: multiple definition of `lima_dump_command_stream'; src/gallium/drivers/lima/liblima.a(lima_screen.c.o):src/gallium/drivers/lima/lima_util.h:32: first defined here
/usr/bin/ld: src/gallium/drivers/lima/liblima.a(lima_resource.c.o):src/gallium/drivers/lima/lima_util.h:32: multiple definition of `lima_dump_command_stream'; src/gallium/drivers/lima/liblima.a(lima_screen.c.o):src/gallium/drivers/lima/lima_util.h:32: first defined here
/usr/bin/ld: src/gallium/drivers/lima/liblima.a(lima_draw.c.o):src/gallium/drivers/lima/lima_util.h:32: multiple definition of `lima_dump_command_stream'; src/gallium/drivers/lima/liblima.a(lima_screen.c.o):src/gallium/drivers/lima/lima_util.h:32: first defined here
/usr/bin/ld: src/gallium/drivers/lima/liblima.a(lima_bo.c.o):src/gallium/drivers/lima/lima_util.h:32: multiple definition of `lima_dump_command_stream'; src/gallium/drivers/lima/liblima.a(lima_screen.c.o):src/gallium/drivers/lima/lima_util.h:32: first defined here
/usr/bin/ld: src/gallium/drivers/lima/liblima.a(lima_submit.c.o):src/gallium/drivers/lima/lima_util.h:32: multiple definition of `lima_dump_command_stream'; src/gallium/drivers/lima/liblima.a(lima_screen.c.o):src/gallium/drivers/lima/lima_util.h:32: first defined here
/usr/bin/ld: src/gallium/drivers/lima/liblima.a(lima_util.c.o):src/gallium/drivers/lima/lima_util.h:32: multiple definition of `lima_dump_command_stream'; src/gallium/drivers/lima/liblima.a(lima_screen.c.o):src/gallium/drivers/lima/lima_util.h:32: first defined here
/usr/bin/ld: src/gallium/drivers/lima/liblima.a(lima_texture.c.o):src/gallium/drivers/lima/lima_util.h:32: multiple definition of `lima_dump_command_stream'; src/gallium/drivers/lima/liblima.a(lima_screen.c.o):src/gallium/drivers/lima/lima_util.h:32: first defined here

Fixes: d71cd245d7 ("lima: Rotate dump files after each finished pp frame")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
(cherry picked from commit 02658df152)
2020-02-03 08:50:50 -08:00
Jason Ekstrand
e6fbf8d9d3 intel/fs: Write the address register with NoMask for MOV_INDIRECT
This fixes a hang in the following Vulkan CTS test on TGL-LP:

    dEQP-VK.descriptor_indexing.storage_buffer_dynamic_in_loop

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3642>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3642>
(cherry picked from commit f93dfb509c)
2020-02-03 08:50:49 -08:00
Daniel Schürmann
208fb2edfa aco: fix image_atomic_cmp_swap
Fixes: 71440ba0f5 ('aco: reorder VMEM operands in ACO IR')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3652>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3652>
(cherry picked from commit 3b323d6601)
2020-02-03 08:50:49 -08:00
Dylan Baker
623126e741 .pick_status.json: Update to b550b7ef3b 2020-02-03 08:50:47 -08:00
Samuel Pitoiset
4b43bee72f aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6
When some unused channels are skipped and that we expand vec3 loads
to vec4 loads, we have to adjust the fourth component.

While we are at it, add an assertion to make sure we don't use
MUBUF for vec3 loads on GFX6.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2450
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2442
Fixes: 6aecc316 ("aco: fix VS input loads with MUBUF on GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3641>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3641>
(cherry picked from commit 0d14f41625)
2020-01-31 08:50:34 -08:00
Jason Ekstrand
e33167af79 anv: Always fill out the AUX table even if CCS is disabled
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 8c5fd2942b)
2020-01-31 08:50:34 -08:00
Jason Ekstrand
f60209b935 iris: Plumb deref block size through to 3DSTATE_SF
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 2ccdf881ab)
2020-01-31 08:50:33 -08:00
Jason Ekstrand
0aff56bd7a anv: Plumb deref block size through to 3DSTATE_SF
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit e6b39850f0)
2020-01-31 08:50:33 -08:00
Jason Ekstrand
0bacf9963b intel/blorp: Plumb deref block size through to 3DSTATE_SF
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit ce9c45a60e)
2020-01-31 08:50:32 -08:00
Jason Ekstrand
fc7ff68df7 intel/common: Return the block size from get_urb_config
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit fdc0c19328)
2020-01-31 08:50:32 -08:00
Jason Ekstrand
4be5ef5caa anv: Emit URB setup earlier
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit e340a79b9c)
2020-01-31 08:50:31 -08:00
Jason Ekstrand
1a57247da8 iris: Consolodate URB emit
Now that we don't have to carry a URB state emit function for BLORP we
can roll some stuff together and drop a genX helper.

Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit e928676b69)
2020-01-31 08:50:31 -08:00
Jason Ekstrand
deeba167fd intel/blorp: Always emit URB config on Gen7+
Previously, i965/iris tried to reuse the currently programmed URB config
if it was good enough for BLORP, rather than reprogramming it each time.
However, this will make some things harder on Gen12+ and we've not seen
any performance impact from emitting URB more frequently in ANV.

This makes the blorp <-> driver interface a bit simpler on Gen7+ because
now all the driver has to do is to provide the L3$ config rather than
trying to hand off URB re-config to blorp.

Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 09e4c33085)
2020-01-31 08:50:30 -08:00
Jason Ekstrand
c9cf6b5f60 intel: Take a gen_l3_config in gen_get_urb_config
Instead of making each driver pass in the same push constant size and do
it's own L3$ config URB size calculation, just make them pass in their
L3$ configuration.

Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 73a684964b)
2020-01-31 08:50:30 -08:00
Jason Ekstrand
46c7314770 i965: Re-emit l3 state before BLORP executes
If BLORP is the first thing to execute, we may not have set the L3$
config yet.  That's not normally a problem but we're about to add code
to BLORP which will look at brw_context::l3::config and we'd like that
to be initialized.  It's also just good practice.

Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 9d05822cb8)
2020-01-31 08:50:29 -08:00
Jason Ekstrand
f1c8089f97 iris: Use the URB size from the L3$ config
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit bff7b3c7bd)
2020-01-31 08:50:29 -08:00
Jason Ekstrand
5d9a1303ba iris: Store the L3$ configs in the screen
We only calculate them based on device info and never change them so
this seems like a reasonable place to put them.  We could also put them
in the context, but that's not accessible from iris_init_*_context.

Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 99f3178a24)
2020-01-31 08:50:28 -08:00
Jason Ekstrand
8585ae7c03 iris: Set SLMEnable based on the L3$ config
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 6471bac99e)
2020-01-31 08:50:28 -08:00
Jason Ekstrand
6c705ea125 intel/genxml: Drop SLMEnable from L3CNTLREG on Gen11
SML is no longer in the L3$ on Gen11+.  It's not incredibly clear from
the docs but no Gen11 platforms are in the list of platforms on which
this bit exists.  Also, we've been always setting it false on Gen11 in
ANV and i965 thanks to GEN_L3P_SLM being zero with no ill effects.

Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 73434b665b)
2020-01-31 08:50:27 -08:00
Jason Ekstrand
ea7ab69455 anv,iris: Set 3DSTATE_SF::DerefBlockSize to per-poly on Gen12+
According to the BSpec, this should prevent hangs when using shaders
with large URB entries.  A more precise fix can be done but it requires
re-arranging URB setup.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit e1bdb127b6)
2020-01-31 08:50:27 -08:00
Jason Ekstrand
6c0b18c5d1 genxml: Add a new 3DSTATE_SF field on gen12
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
(cherry picked from commit 9da9abf8a7)
2020-01-31 08:50:27 -08:00
Dylan Baker
102aa6d549 .pick_status.json: Update to 0d14f41625 2020-01-31 08:50:23 -08:00
Dylan Baker
b611a616c3 bin/pick-ui: Add a new maintainer script for picking patches
In the long term the goal of this script is to nearly completely
automate the process of picking stable nominations, in a well tested
way.

In the short term the goal is to provide a better, faster UI to interact
with stable nominations.
2020-01-31 08:50:15 -08:00
Dylan Baker
21f5d79c7b VERSION: bump to 20.0.0-rc1 2020-01-30 14:14:01 -08:00
Brian Ho
58fd26c433 turnip: Fix vkCmdCopyQueryPoolResults with available flag
Previously, calling vkCmdCopyQueryPoolResults with the
VK_QUERY_RESULT_WITH_AVAILABILITY_BIT flag set the query result
field in the buffer to 0 if unavailable and the query result if
available. This was a misunderstanding of the Vulkan spec, and this
commit corrects the behavior to emitting a separate available
result in addition to the query result.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560>
2020-01-30 20:30:46 +00:00
Brian Ho
1a3e2a7fa8 turnip: Fix vkGetQueryPoolResults with available flag
Previously, calling vkGetQueryPoolResults with the
VK_QUERY_RESULT_WITH_AVAILABILITY_BIT flag set the query result
field in *pData to 0 if unavailable and the query result if
available. This was a misunderstanding of the Vulkan spec, and this
commit corrects the behavior to eriting a separate available result
in addition to the query result.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560>
2020-01-30 20:30:46 +00:00
Brian Ho
1c3319cf81 turnip: Free event->bo on vkDestroyEvent
Fixes a leak from freeing event but not event->bo.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3639>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3639>
2020-01-30 18:50:06 +00:00
Kenneth Graunke
594cb30356 loader: Fix leak of kernel driver name
This is strdup'd, it needs to be freed.

CID: 1458032
Fixes: f93bb2fb102 ("loader: Check if the kernel driver is i915 before loading iris")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3630>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3630>
2020-01-30 10:08:17 -08:00
Jan Zielinski
f09c466732 docs: Update SWR tessellation support
Update features.txt to reflect ARB_tessellation_shader
support in SWR

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3636>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3636>
2020-01-30 11:18:15 +00:00
Kenneth Graunke
bdba744d70 i965: Use brw_batch_references in tex_busy check
If the batch references the buffer, we will have to flush the batch
immediately before mapping it, at which point it will be busy.

(This bug has existed for a long time...even going back to BLT-era...)

Fixes: 779923194c ("i965/tex_image: Use meta for instead of the blitter PBO TexImage and GetTexImage")
Fixes: d5d4ba9139 ("i965/tex_subimage: use meta instead of the blitter for PBO TexSubImage")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3616>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3616>
2020-01-30 10:01:21 +00:00
Christian Gmeiner
d3fa18a1fa etnaviv: drm-shim: add GC400
These are the ETNAVIV_PARAM's returned from a GC400 found on a
STM32MP157C-DK2 Discovery Board running mainline kernel.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3195>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3195>
2020-01-30 04:05:39 +00:00
Qiang Yu
c5e4d28724 lima: add noheap debug option
Disable using heap buffer when set.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264>
2020-01-30 03:39:21 +00:00
Qiang Yu
b220aec628 lima: create heap buffer with new interface if available
Newly added heap buffer create interface can create a
large enough buffer whose backup memory can increase
dynamically as needed.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264>
2020-01-30 03:39:21 +00:00
Qiang Yu
92465cc999 lima: sync lima_drm.h with kernel
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264>
2020-01-30 03:39:21 +00:00
Icenowy Zheng
cd30c4d719 lima: fix lima_set_vertex_buffers()
When setting the vertex buffers, lima calls
util_set_vertex_buffers_mask() to reference and copy buffers. That
function
function adds dst with start_slot internally, so lima should not offset
the destination address again.

This is discovered when comparing with other drivers, and fixed by
removing the extra offset in lima_set_vertex_buffers().

This fixes draws that get translated in u_vbuf, because u_vbuf adds
extra vertex buffers when translating.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3620>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3620>
2020-01-30 07:51:35 +08:00
Jonathan Marek
1c5d84fcae turnip: hook up cmdbuffer event set/wait
Gets some basic tests under "dEQP-VK.synchronization.*event*" passing

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3123>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3123>
2020-01-29 23:13:43 +00:00
Christian Gmeiner
5b5b762475 etnaviv: drop default state for PE_STENCIL_CONFIG_EXT2
It gets emitted when needed.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3631>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3631>
2020-01-29 23:31:04 +01:00
Daniel Schürmann
d78e0de772 docs: add new features for RADV/ACO.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3627>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3627>
2020-01-29 22:05:37 +00:00
Samuel Pitoiset
3a3b16a395 radv: refactor physical device properties
Based on ANV. This removes a bunch of duplicated code for properties.

Fixes: 1b8d99e288 ("radv: bump conformance version to 1.2.0.0")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3626>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3626>
2020-01-29 21:44:56 +00:00
Rob Clark
5b9fe18485 freedreno: remove flush-queue
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
b3b1fa5e2b freedreno: add gmem_lock
The gmem state is split out now, so it does not require synchronization.
But gmem rendering still accesses vsc state from the context.

TODO maybe there is a better way?  For gen's that don't do vsc resizing,
this is probably easier.. but for a6xx there isn't really a great
position for more fine grained locking.  Maybe it doesn't matter since
in practice the lock shouldn't be contended.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
91f9bb99c5 freedreno: add gmem state cache
Which also has the benefit of getting rid of fd_context::gmem.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
712f8802ee freedreno: get GMEM state from batch
Prep work to reduce churn in next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
4bcc3a0923 freedreno/a2xx: constify gmem state
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
5d442144ae freedreno/a3xx: constify gmem state
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
7236d6dd4c freedreno/a4xx: constify gmem state
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
2d2f4a55eb freedreno/a5xx: constify gmem state
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
637ca78ee2 freedreno/a6xx: constify gmem state
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
82a64af907 freedreno: constify fd_vsc_pipe
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
cbae9f34e9 freedreno: constify fd_tile
In a following patch, when we cache the gmem state, we will want to
treat the gmem state as immuatable.  So start converting things to
const to make this more clear.. fd_tile is a good place to start.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
c7ab8874d0 freedreno: consolidate GMEM state
The tile and vsc_pipe arrays are really part of the GMEM configuration.
So pull these out of fd_context and into fd_gmem_stateobj.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Rob Clark
62c10b395e freedreno: extract vsc pipe bo from GMEM state
Prep work for reorganizing GMEM state and extracting out of fd_context.
The vsc pipe bo was the one thing that doesn't change with GMEM/tile
config.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
2020-01-29 21:19:41 +00:00
Alejandro Piñeiro
d5c32db076 turnip: remove unused descriptor state dirty
It was only used to be initialized to zero. Not even updated as
descriptor sets are bind.

As far as I understand, setting the bit TU_CMD_DIRTY_DESCRIPTOR_SET on
tu_cmd_state.dirty is used instead.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3624>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3624>
2020-01-29 20:52:52 +00:00
Timur Kristóf
e73f604b21 aco: Fix the meaning of is_atomic.
Previously, is_atomic really meant "is not atomic", contrary to its name.
This commit fixes it to mean what one would think it means.

Fixes: 69bed1c918

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3618>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3618>
2020-01-29 20:32:31 +00:00
Kenneth Graunke
ba148813d7 iris: Support multiple chained batches.
There was never much point in artificially limiting chaining to two
batches - we can trivially support arbitrary length chains.

Currently, we should only ever have 1 or 2, but this may change.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>
2020-01-29 19:53:22 +00:00
Kenneth Graunke
94f9c5fff6 iris: Make iris_emit_default_l3_config pull devinfo from the batch
No need to pass it, we can just use batch->screen->devinfo.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>
2020-01-29 19:53:22 +00:00
Kenneth Graunke
afcb6625e3 iris: Drop 'engine' from iris_batch.
For the moment, everything is I915_EXEC_RENDER, so this isn't necessary.
But even should that change, I don't think we want to handle multiple
engines in this manner.

Nowadays, we have batch->name (IRIS_BATCH_RENDER, IRIS_BATCH_COMPUTE,
possibly an IRIS_BATCH_BLIT for blorp batches someday), which describes
the functional usage of the batch.  We can simply check that and select
an engine for that class of work (assuming there ever is more than one).

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>
2020-01-29 19:53:22 +00:00
Eric Anholt
06b13dfed2 tu: Fix binning address setup after pack macros change.
This fixes a regression in "vkcube -m headless" rendering, but upsettingly
none of my CTS tests I've been using.

Fixes: 59f29fc845 ("turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros.")
Caught-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3609>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3609>
2020-01-29 19:30:09 +00:00
Brian Ho
3d5bdea2cf turnip: Enable occlusionQueryPrecise
This commit enables the occlusionQueryPrecise feature. No additonal
work is required as occlusion queries are already implemented to
track exact sample counts.

Also enables a number of extra tests on the Vulkan CTS.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3605>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3605>
2020-01-29 19:05:23 +00:00
Daniel Schürmann
6f718edced aco: simplify gathering of MIMG address components
This patch has a slight effect on pipelinedb:
Totals from affected shaders:
SGPRS: 23616 -> 21504 (-8.94 %)
VGPRS: 15088 -> 14444 (-4.27 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 662660 -> 664600 (0.29 %) bytes
LDS: 49 -> 49 (0.00 %) blocks
Max Waves: 3079 -> 3204 (4.06 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
901f06e9ad aco: simplify adjust_sample_index_using_fmask() & get_image_coords()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
99d032f3cd aco: fix register allocation with multiple live-range splits
This patch fixes register allocation if multiple live-range splits
occur to the same variable within one instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
71440ba0f5 aco: reorder VMEM operands in ACO IR
For all VMEM instructions, the resource constant is now
in operands[0]. For MIMG instructions, the sampler shares
operands[1] with write data in case this instruction writes memory.
Moving the VADDR to be the last operand for MIMG is the first step to
support Navi NSA encoding.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Caio Marcelo de Oliveira Filho
8548fe19f0 nir: Make nir_deref_path_init skip trivial casts
In a NIR generated using SPIR-V initializers to variables, copy
propagation can end up transforming

    vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
    vec1 32 ssa_35 = mov ssa_33
    vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_35 (shared mat2x4)  /* ptr_stride=0 */

into

    vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
    vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_33 (shared mat2x4)  /* ptr_stride=0 */

Before the optimization, the "head" of a path of deref that uses ssa_7
will be the cast.  After, it will be the variable in ssa_33.  Since
the types are the same, this is a trivial cast that would be picked up
by nir_opt_deref.

If we need to compare such deref-chain after optimization with another
deref-chain for the same variable, the compare function will get
confused by the cast in the middle.

One alternative would be to add nir_opt_deref to places that compare
derefs, but that might not scale well, so skip the trivial casts when
generating the paths instead.

Motivated by the discussion in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047#note_383660.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
2020-01-29 18:25:36 +00:00
Rhys Perry
db19e96c8c aco: fix exec mask consistency issues
There seems to be more, these are just the ones found in
Detroit: Become Human shaders.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
c7d0514168 aco: parallelcopy exec mask before s_wqm
It can be used later and we want any uses to not be fixed to exec, so it's
definition can't be fixed to exec because of how exec masks interact with
register demand calculation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
517fc3abc4 aco: fill reg_demand with sensible information in add_coupling_code()
process_block() will use this to determine the register demand of the
before the current instruction. Previously, it was filled with zeroes
which could result in process_block() only using the register demand
of after the current instruction.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
26d2511bcb aco: improve assertion at the end of spiller
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
5ea23ba659 aco: set exec_potentially_empty after continues/breaks in nested IFs
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
4e83e05e62 aco: error when block has no logical preds but VGPRs are live at the start
This would have caught the liveness error fixed in the previous commit.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
d282a292ec aco: don't always add logical edges from continue_break blocks to headers
Otherwise, code like this will be broken:
loop {
   if (...) {
      break;
   } else {
      break;
   }
}
The continue_or_break block doesn't have any logical predecessors but it's
a logical predecessor of the header block. This liveness error breaks the
spiller in init_live_in_vars() (under "keep variables spilled on all
incoming paths") and eventually creates garbage reloads.

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
dba71de5c6 aco: only create parallelcopy to restore exec at loop exit if needed
The operand isn't fixed to exec, which can mess up the spiller. This also
adds a new situation where a phi is needed.

Fixes dEQP-VK.ssbo.layout.random.descriptor_indexing.2 and an assertion
when compiling a Detroit: Become Human shader.

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
4537b97410 aco: don't update demand in add_coupling_code() for loop headers
We don't need to update it since it won't be used later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
521525fc0a aco: don't consider loop header blocks branch blocks in add_coupling_code
Loops without continues create header blocks with only 1 predecessor.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
590c26beab aco: fix target calculation when vgpr spilling introduces sgpr spilling
A shader might require vgpr spilling but not require sgpr spilling. In
that case, the spiller lowers the sgpr target by 5 which could mean sgpr
spilling is then required. Then the vgpr target has to be lowered to make
space for the linear vgprs. Previously, space wasn't make for the linear
vgprs.

Found while testing the spiller on the pipeline-db with a lowered limit

Fixes: a7ff1bb5b9
   ('aco: simplify calculation of target register pressure when spilling')

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Samuel Pitoiset
a61eff8330 radv/gfx10: re-enable NGG GS
Now that NGG GS queries are implemented, it should be safe enough
to enable NGG GS by default. It can be disabled with RADV_DEBUG=nongg
if necessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
2020-01-29 17:40:51 +01:00
Samuel Pitoiset
e4752dafed radv/gfx10: implement NGG GS queries
The number of generated primitives is only counted by the hardware
if GS uses the legacy path. For NGG GS, we need to accumulate that
value in the NGG GS itself. To achieve that, we use a plain GDS
atomic operation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
2020-01-29 17:40:48 +01:00
Samuel Pitoiset
3c1f657f35 radv/gfx10: add a separate flag for creating a GDS OA buffer
For implementing NGG GS queries, we decided to use GDS but GDS OA
is only required for NGG streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
2020-01-29 17:40:46 +01:00
Michel Dänzer
ca6a22305b winsys/amdgpu: Close KMS handles for other DRM file descriptions
When a BO or amdgpu_screen_winsys is destroyed.

Should fix leaking such BOs in other DRM file descriptions.

v2:
* Pass the correct file descriptor to drmIoctl (Pierre-Eric
  Pelloux-Prayer)
* Use _mesa_hash_table_remove
v3:
* Close handles in amdgpu_winsys_unref as well
v4:
* Adapt to amdgpu_winsys::sws_list_lock.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2270
Fixes: 11a3679e3a "winsys/amdgpu: Make KMS handles valid for original
                     DRM file descriptor"

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>
2020-01-29 15:51:01 +00:00
Michel Dänzer
9f2bed49d4 winsys/amdgpu: Re-use amdgpu_screen_winsys when possible
Namely, if os_same_file_description determined that the DRM file
descriptor references the same file description.

v2:
* Adapt to amdgpu_winsys::sws_list_lock.
v3:
* Fix comparison of amdgpu_screen_winsys file descriptions, see
  https://gitlab.freedesktop.org/mesa/mesa/issues/2413 .
* Lock amdgpu_winsys::sws_list_lock for traversing the sws_list in
  amdgpu_winsys_create.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>
2020-01-29 15:51:01 +00:00
Jason Ekstrand
f21b40d0bf anv: Rename a variable
The name "desc" shadows another variable.  Name it "desc_data" like all
of the other descriptor data variables in this file.
2020-01-29 09:43:42 -06:00
Jason Ekstrand
e3f1a08c56 anv/block_pool: Ensure allocations have contiguous maps
Because softpin block pools are made up of a set of BOs with different
maps, it was possible for a single state to end up straddling blocks.
To fix this, we pass a contiguous size to anv_block_pool_grow and it
ensures that the next allocation in the pool will have at least that
size.

We also add an assert in anv_block_pool_map to ensure we always get
contiguous maps.  Prior to the changes to anv_block_pool_grow, the unit
tests failed with this assert.  With this patch, the tests pass.

This was causing problems on Gen12 where we allocate the pages for the
AUX table from the dynamic state pool.  The first chunk, which gets
allocated very early in the pool's history, is 1MB which was enough that
it was getting multiple BOs.  This caused the gen_aux_map code to write
outside of the map and overwrite the instruction state pool buffer which
lead to GPU hangs.

Fixes: 731c4adcf9 "anv/allocator: Add support for non-userptr"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2020-01-29 09:43:42 -06:00
Jason Ekstrand
ee4cdef9ae anv: Re-use one old BT block in reset_batch_bo_chain
We intentionally throw away all but one BT block but then we set
cmd_buffer->bt_block to ANV_STATE_NULL instead of the one we hung on to.
This causes the command buffer to immediately re-emit STATE_BASE_ADDRESS
the first time a BT is needed for no good reason.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2020-01-29 09:43:42 -06:00
Jason Ekstrand
a2e9dd51b3 anv: Set actual state pool sizes when we have softpin
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2020-01-29 09:43:42 -06:00
Rhys Perry
1f72857739 nir/algebraic: add some half packing optimizations
pipeline-db (ACO):
Totals from affected shaders:
SGPRS: 29200 -> 29200 (0.00 %)
VGPRS: 17372 -> 17372 (0.00 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1406576 -> 1389256 (-1.23 %) bytes
LDS: 83 -> 83 (0.00 %) blocks
Max Waves: 3976 -> 3976 (0.00 %)

pipeline-db (LLVM):
Totals from affected shaders:
SGPRS: 21320 -> 21320 (0.00 %)
VGPRS: 17056 -> 17036 (-0.12 %)
Spilled SGPRs: 22 -> 22 (0.00 %)
Spilled VGPRs: 503 -> 487 (-3.18 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 396 -> 396 (0.00 %) dwords per thread
Code Size: 1441244 -> 1423292 (-1.25 %) bytes
LDS: 463 -> 463 (0.00 %) blocks
Max Waves: 3609 -> 3611 (0.06 %)

v2: add pattern for ishr

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271>
2020-01-29 14:30:33 +00:00
Rhys Perry
5476d18183 nir/algebraic: add patterns for a >> #b << #b
Fixes compilation of a Battlefront 2 shader with ACO by removing VGPR
spilling. The reassociation makes it worse on LLVM though.

pipeline-db (ACO):
Totals from affected shaders:
SGPRS: 10704 -> 10688 (-0.15 %)
VGPRS: 18736 -> 18528 (-1.11 %)
Spilled SGPRs: 70 -> 70 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 909696 -> 885796 (-2.63 %) bytes
LDS: 225 -> 225 (0.00 %) blocks
Max Waves: 1115 -> 1129 (1.26 %)

pipeline-db (LLVM):
Totals from affected shaders:
SGPRS: 8472 -> 8424 (-0.57 %)
VGPRS: 14284 -> 14368 (0.59 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 442 -> 503 (13.80 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 268 -> 396 (47.76 %) dwords per thread
Code Size: 862568 -> 853028 (-1.11 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 971 -> 964 (-0.72 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271>
2020-01-29 14:30:33 +00:00
Samuel Pitoiset
6aecc316c0 aco: fix VS input loads with MUBUF on GFX6
Only MTBUF supports vec3.

Fixes: 03a0d39366 ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>
2020-01-29 13:58:37 +00:00
Rhys Perry
404818dd28 aco: run p_wqm instructions in WQM
If the p_wqm ends up creating copies, these need to be in WQM. Helps (but
doesn't completely fix) artifacts in Strange Brigade. The actual issue
still exists and is harder to fix.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
2020-01-29 13:23:03 +00:00
Rhys Perry
2d7386a2d0 aco: ensure predecessors' p_logical_end is in WQM when a p_phi is in WQM
We want any copies to be in WQM. I don't know if this fixes any real
application, but I can create a vkrunner test than reproduces the issue.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
2020-01-29 13:23:03 +00:00
Icecream95
9be9fd8591 pan/midgard: Fix a liveness info leak
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3566>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3566>
2020-01-29 12:59:32 +00:00
Jonathan Marek
6346490a2e etnaviv: implement UBOs
At the same time, use pre-HALTI2 to use address register for indirect
uniform loads, since integers/LOAD instruction isn't always available.

Passes all dEQP-GLES3.functional.ubo.* on GC7000L. GC3000 with an extra
flush hack passes most of them, but still fails on some of the cases with
many loads.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3389>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3389>
2020-01-29 11:47:34 +00:00
Rob Clark
7ff8ce7a3f freedreno/a6xx: convert blend state to stateobj
And move to new register builders while we are at it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>
2020-01-29 11:21:47 +00:00
Rob Clark
f066e3afc7 freedreno/a6xx: remove special handling based on MRT format
Logicop in particular is supposed to work for integer formats.. but
maybe this situation doesn't happen in gles.  The only thing that isn't
required for integer formats is blending.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>
2020-01-29 11:21:47 +00:00
Rob Clark
eb281df1a1 mesa/st: random whitespace cleanup
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>
2020-01-29 11:21:47 +00:00
Rob Clark
d0e0141526 freedreno: use PIPE_CAP_RGB_OVERRIDE_DST_ALPHA_BLEND
This lets us drop a bunch of special handling for xRGB blend.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>
2020-01-29 11:21:47 +00:00
Thomas Hellstrom
9ee3ec348e gallium/util: Increase the debug_flush map depth
Some piglit tests trigger a map depth assert when debug_flush is active.
Fix this by increasing the map depth from 16 to 32.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>
2020-01-29 10:56:06 +00:00
Thomas Hellstrom
8830e9f0ca svga: Avoid discard DMA uploads
Newer versions of the device code will make discard DMA uploads
sub-optimal. Disable them for guest-backed aware code, where we previously
had them conditionally enabled.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>
2020-01-29 10:56:06 +00:00
Thomas Hellstrom
8afe12b212 winsys/svga: Enable transhuge pages for buffer objects
If the kernel supports it, enable transhuge pages for graphics buffer
objects. Except for the syscall itself, this is never expected to cause
any negative performance implications.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>
2020-01-29 10:56:06 +00:00
Roland Scheidegger
3b3c2daf3a winsys/svga: use new ioctl for logging
Use the new ioctl for logging (rather than duplicating what the kernel
is doing). This way it's also independent from the actual guest/host
mechanism to do the logging.

Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>
2020-01-29 10:56:06 +00:00
Samuel Pitoiset
f53b4defad radv: remove the non conformant VK implementation warning on GFX10
It's no longer true.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3597>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3597>
2020-01-29 10:35:15 +00:00
Samuel Pitoiset
1b8d99e288 radv: bump conformance version to 1.2.0.0
https://www.khronos.org/conformance/adopters/conformant-products#submission_472
https://www.khronos.org/conformance/adopters/conformant-products#submission_473
https://www.khronos.org/conformance/adopters/conformant-products#submission_474

Fixes dEQP-VK.api.driver_properties.conformance_version.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3597>
2020-01-29 10:35:15 +00:00
Samuel Pitoiset
401bfe0283 radv: implement VK_AMD_shader_explicit_vertex_parameter
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2402
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
663d5c1399 radv: gather which input PS variables use an explicit interpolation mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
3922d95b51 aco: implement VK_AMD_shader_explicit_vertex_parameter
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
6f4c300919 ac/llvm: implement VK_AMD_shader_explicit_vertex_parameter
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
531a26d5aa spirv: implement SPV_AMD_shader_explicit_vertex_parameter
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
cf6cae832c nir: lower interp_deref_at_vertex to load_input_vertex
This introduces a new NIR intrinsic for loading inputs at a specific
vertex index.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
d29f10a7ca nir: add nir_intrinsic_interp_deref_at_vertex
From the SPV_AMD_shader_explicit_vertex_parameter extension:
   "Returns the value of the input <interpolant> without any
    interpolation, i.e. the raw output value of previous shader
    stage."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
687f170311 nir: lower SYSTEM_VALUE_BARYCENTRIC_* to nir_load_barycentric()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
9021b45b35 nir: add nir_intrinsic_load_barycentric_model
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
df8dd12e5b spirv: add support for SpvBuiltInBaryCoord*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
61d24080bb compiler: add new SYSTEM_VALUE_BARYCENTRIC_*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
15d53d8294 compiler: add PERSP to the existing barycentric system values
We need the LINEAR versions for AMD_shader_explicit_vertex_parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
5c053cc6ec spirv: add support for SpvDecorationExplicitInterpAMD
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
746e9e5d66 compiler: add a new explicit interpolation mode
This introduces one more interpolation mode INTERP_MODE_EXPLICIT,
which is needed for AMD_shader_explicit_vertex_parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Eduardo Lima Mitev
e6b531af66 turnip: Fix issues in tu_compute_pipeline_create() that may lead to crash
The shader object is destroyed even if its creation failed. It is also
not destroyed if its compilation or upload fails, leading to leaks.

Finally, tu_compute_pipeline_create() should set output var
pPipeline to VK_NULL_HANDLE if it fails.

Avoids crash on
dEQP-VK.api.object_management.alloc_callback_fail_multiple.compute_pipeline

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572>
2020-01-29 09:25:20 +00:00
Eduardo Lima Mitev
0e11e8ba89 turnip: Remove failed command buffer from pool
When an error condition occurs during tu_create_cmd_buffer(), the
cmd buffer has already been added to a pool, so the cleanup code should
remove it.

Fixes a crash (assert in tu_device::tu_bo_finish()) in dEQP tests:

dEQP-VK.api.object_management.max_concurrent.command_buffer_primary
dEQP-VK.api.object_management.max_concurrent.command_buffer_secondary

due to pool attempting to destroy an invalid command buffer.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572>
2020-01-29 09:25:20 +00:00
Pierre-Eric Pelloux-Prayer
ab54624d0d radeonsi: stop using the VM_ALWAYS_VALID flag
Allocation all the bo as ALWAYS_VALID means they must all fit in memory
(vram + gtt) at each command submission.
This causes some trouble when the total allocated memory is greater than
the available memory.

Possible solutions:
- being able to tag/untag a bo as ALWAYS_VALID: would require kernel changes
- disable VM_ALWAYS_VALID when memory usage is more than a percentage of the
  available memory
- disable VM_ALWAYS_VALID entirely

v1 of this patch implemented option 2. v2 (this version) implements option 3.

Related issues:
 - https://gitlab.freedesktop.org/drm/amd/issues/607
 - https://gitlab.freedesktop.org/mesa/mesa/issues/1257

It also helps with some piglit tests (-t maxsize -t "max[_-].*size" -t maxuniformblocksize):
instead of crashing the machine, the tests fail cleanly.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2190
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3430>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3430>
2020-01-29 09:05:04 +01:00
Samuel Pitoiset
b05ac4b158 radv: enable VK_AMD_shader_fragment_mask on GFX6-GFX7
Works fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3603>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3603>
2020-01-29 08:08:27 +01:00
Kenneth Graunke
baf9327fa1 loader: Check if the kernel driver is i915 before loading iris
To prevent it from trying to load on say gma500 hardware.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3595>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3595>
2020-01-28 15:35:09 -08:00
Jordan Justen
2969012d03 anv: Emit CS Stall before Instruction Cache flush for gen12 WA
Before flushing the instruction cache with a pipe control, we need to
use a CS Stall pipe control.

Ref: GEN:BUG:1409226450
Rework: Add stall-at-scoreboard (Lionel)
Rework: Merge with other anvil pre-invalidate stalls (Lionel)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457>
2020-01-28 21:57:17 +00:00
Jordan Justen
da03e07cc2 iris: Emit CS Stall before Instruction Cache flush for gen12 WA
Before flushing the instruction cache with a pipe control, we need to
use a CS Stall pipe control.

Ref: GEN:BUG:1409226450
Rework: Add stall-at-scoreboard (Lionel)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457>
2020-01-28 21:57:17 +00:00
Erik Faye-Lund
b175effc72 zink: set compareEnable when setting compareOp
We need to enable compareEnable for compareOp to be valid, and ANV was
recently updated to respect this. So let's update Zink to match.

This fixes the shadow-variants of several piglit regressions, like these:
spec@arb_shader_texture_lod@execution@tex-miplevel-selection
spec@glsl-1.20@execution@tex-miplevel-selection

Fixes: a19cdf989b ("anv: only use VkSamplerCreateInfo::compareOp if enabled")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3473>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3473>
2020-01-28 21:04:26 +00:00
Eric Anholt
f6e59911e5 ci: Enable -Werror on the meson-i386 build.
I find warnings to be very disruptive to my workflow (using emacs's "go to
next error" feature), and I periodically have to go clean up other
people's drivers to get back to finding my own warnings in the noise.  I
know I'm not the only one doing something like this.

We don't want to enable -Werror by default in builds, since it means that
end users will have builds spuriously fail based on what compiler version
and opt flags they have compared to what the devs are using.  However, it
is quite easy to have CI ensure that we at least don't introduce warnings
on the compiler version that it uses.

For now I've just enabled it on meson-i386 to cover a bunch of Mesa core
and get us started on ratcheting up warnings-cleanliness in the tree,
without me having to fix up all the drivers at once.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>
2020-01-28 12:31:07 -08:00
Eric Anholt
527a8c345b mesa/st: Fix compiler warnings from INTEL_shader_integer_functions.
Fixes: 1d165b0548 ("glsl: Add new expressions for INTEL_shader_integer_functions2")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>
2020-01-28 12:31:03 -08:00
Eric Anholt
096921c878 iris: Silence warning about AUX_USAGE_MC.
It was recently introduced and not added to iris yet it looks like.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>
2020-01-28 12:30:48 -08:00
Eric Anholt
05e3ccd8a1 vulkan/wsi: Fix compiler warning when no WSI platforms are enabled.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>
2020-01-28 12:30:48 -08:00
Dylan Baker
71c6208200 docs: update news, calendar, and link release notes for 19.3.3
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3604>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3604>
2020-01-28 11:36:21 -08:00
Dylan Baker
3e49d0efe7 docs: Add SHA 256 sums for 19.3.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3604>
2020-01-28 11:35:30 -08:00
Dylan Baker
f9ef115927 docs: Add relnotes for 19.3.3 release
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3604>
2020-01-28 11:35:28 -08:00
Jason Ekstrand
997040e4b8 intel/mi_builder: Force write completion on Gen12+
Otherwise, we have no guarantee that the write actually lands before we
move on to other things.  Doing this on every SDI is probably a bit
harsh but it's safe.  We should figure out a good way to avoid this when
we can.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593>
2020-01-28 18:15:29 +00:00
Jason Ekstrand
06657e1dda anv: Replace one more aux_surface.isl.size_B check
This one was missed in 41bffe0913.

Fixes: 41bffe0913 "anv: Replace aux_surface.isl.size_B checks with..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593>
2020-01-28 18:15:29 +00:00
Jason Ekstrand
f229579c0a intel/blorp: Handle bit-casting UNORM and BGRA formats
In f132e0fddf, I attempted to allow BLORP to do CCS_E copies by using
the UNORM formats instead.  However, the old BLORP bit-cast code could
only handle RGBA formats and asserted on anything other than UINT
formats.  The reason we didn't catch this is because it only comes up on
Gen12 platforms which aren't in our normal CI yet.

Fixes: f132e0fddf "intel/blorp: Add support for CCS_E copies with..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593>
2020-01-28 18:15:29 +00:00
Daniel Schürmann
396be00640 aco: fix combine_salu_not_bitwise() when SCC is used
Previously, we didn't use the SCC bit, and thus, we didn't care about it.
With 'aco: Transform uniform bitwise instructions to 32-bit if possible.'
that changed, so that we have to handle it.

Fixes: 8a32f57fff ('aco: Transform uniform bitwise instructions to 32-bit if possible.')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>
2020-01-28 18:14:02 +01:00
Drew Davenport
0d99ff54cc radeonsi: Clear uninitialized variable
|view| was not initialized leading to flaky test failures in SkQP
test unitTest_ES2BlendWithNoTexture.

Fixes: 029bfa3d25 "radeonsi: add ability to bind images as image buffers"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3592>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3592>
2020-01-28 16:29:48 +00:00
Brian Ho
815a603889 anv: Handle unavailable queries in vkCmdCopyQueryPoolResults
If VK_QUERY_RESULT_WAIT_BIT is not set, there is currently no
special handling of unavailable queries in vkCmdCopyQueryPoolResults,
and anv will write an invalid value for the query result.

This commit updates vkCmdCopyQueryPoolResults for unavailable
queries to return 0 if the VK_QUERY_RESULT_PARTIAL_BIT flag is set
and if not, skip writing altogether.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
2020-01-28 15:17:21 +00:00
Brian Ho
af92ce50a7 anv: Properly fetch partial results in vkGetQueryPoolResults
Currently, fetching the partial results (VK_QUERY_RESULT_PARTIAL_BIT)
of an unavailable occlusion query via vkGetQueryPoolResults can
return invalid values. anv returns slot.end - slot.begin, but in the
case of unavailable queries, slot.end is still at the initial value
of 0. If slot.begin is non-zero, the occlusion count underflows to
a value that is likely outside the acceptable range of the partial
result.

This commit fixes vkGetQueryPoolResults by always returning 0 if the
query is unavailable and the VK_QUERY_RESULT_PARTIAL_BIT is set.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
2020-01-28 15:17:21 +00:00
Rhys Perry
7edcf4a59d aco: fix rebase error from GS copy shader support
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8f7712666 ('aco: implement GS copy shaders')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3601>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3601>
2020-01-28 13:50:53 +00:00
Tapani Pälli
dd9bf7d291 anv/android: make format_supported_with_usage static
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3532>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3532>
2020-01-28 14:46:38 +02:00
Tapani Pälli
104744f4df anv/android: setup gralloc1 usage from gralloc0 usage manually
This cuts away dependency to libgrallocusage.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3532>
2020-01-28 14:46:25 +02:00
Rhys Perry
03a0d39366 aco: use MUBUF in some situations instead of splitting vertex fetches
Fixes most of the regressions from splitting vertex fetches in an earlier
commit.

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 562696 -> 558344 (-0.77 %)
VGPRS: 395596 -> 393752 (-0.47 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 11600912 -> 11311804 (-2.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 101839 -> 102372 (0.52 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:44:52 +00:00
Rhys Perry
21d2799cee aco: value-number MUBUF instructions
We will have to do this when we start creating MUBUF instructions for
load_input because NIR might not be able to tell they are identical since
it doesn't know whether two vertex attributes have the same offset.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:22 +00:00
Rhys Perry
d39f5519a1 aco: handle unaligned vertex fetch on GFX10
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 795000 -> 802368 (0.93 %)
VGPRS: 579632 -> 581280 (0.28 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 17208408 -> 17583652 (2.18 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 145731 -> 145279 (-0.31 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:10 +00:00
Rhys Perry
d9e357e35b aco: skip unused channels at the start when fetching vertices
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 161320 -> 161224 (-0.06 %)
VGPRS: 153968 -> 149408 (-2.96 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 4331496 -> 4331308 (-0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 27814 -> 28594 (2.80 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 161504 -> 161408 (-0.06 %)
VGPRS: 153836 -> 149440 (-2.86 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 4327572 -> 4327604 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 27837 -> 28618 (2.81 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:01 +00:00
Rhys Perry
525b107347 aco: rework vertex fetching a bit
This will make it easier to skip unused channels at the start and to split
unaligned loads on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:39:57 +00:00
Rhys Perry
4363a1f75b amd/common,radv: move vertex_format_table to ac_shader_util.{h,c}
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:39:52 +00:00
Jan Zielinski
ab7ac1ffda gallium/swr: fix tessellation state save/restore
Tessellation state should be saved with TCS/TES state
when binding new state and restored if old state
is set again.

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3596>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3596>
2020-01-28 13:55:47 +01:00
Vasily Khoruzhick
fe5267d322 lima: disable early-z if fragment shader uses discard
We have to disable early-z if fragment shader uses discard,
otherwise we'll get misrendering.

Reported-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3570>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3570>
2020-01-27 22:35:43 -08:00
Vasily Khoruzhick
650c680545 lima: ppir: always create move and update ld_tex successors for all blocks
Always create a mov for ld_tex since we can't rely on
ppir_node_has_single_src_succ() if we have multiple blocks. And since
ld_tex successor can be in a different block we have to update their
ppir_src as well.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564>
2020-01-28 01:45:29 +00:00
Vasily Khoruzhick
4a0f62f1fc lima: ppir: don't delete root ld_tex nodes without successors in current block
We don't clone ld_tex nodes into each block anymore, so ld_tex may have
successors in another block.

Fixes: c8554f849e ("lima/ppir: don't clone texture loads")
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564>
2020-01-28 01:45:29 +00:00
Rob Clark
63af27bc76 freedreno/drm: fix invalid-cmdstream-size with older kernels
A cmdstream of size zero is invalid.  But this can appear in various
places where we emit a pointer to state.  This doesn't show up with
newer kernels (newer than v5.0) which use "softpin", but on earlier
kernels can result in:

  [drm:msm_ioctl_gem_submit [msm]] *ERROR* invalid cmdstream size: 0

Since the pointer value doesn't matter in these cases, the easy solution
is just to not emit a cmds table entry in this case.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2805>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2805>
2020-01-28 00:09:34 +00:00
Marek Olšák
0c154d9e2d Revert "winsys/amdgpu: Re-use amdgpu_screen_winsys when possible"
This reverts commit b60f5cbc15.

This fixes dmesg errors and X freezes:
[   29.543096] amdgpu 0000:0c:00.0: No GEM object associated to handle 0x00000009, can't create framebuffer
[   29.543103] amdgpu 0000:0c:00.0: No GEM object associated to handle 0x00000009, can't create framebuffer
2020-01-27 17:48:42 -05:00
Marek Olšák
ba06c7620f Revert "winsys/amdgpu: Close KMS handles for other DRM file descriptions"
This reverts commit 552028c013.

Required by the next reverted commit.
2020-01-27 17:48:25 -05:00
Jason Ekstrand
993f866d2e anv: Insert holes for non-existant XFB varyings
Thanks to optimizations, it's possible for varyings to get deleted but
still leave the variable there for nir_gather_xfb_info to find.  If we
get into this case, insert a hole.

Fixes: 36ee2fd61c "anv: Implement the basic form of..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520>
2020-01-27 20:26:23 +00:00
Jason Ekstrand
68b3bfaa42 intel/genxml: Make SO_DECL::"Hole Flag" a Boolean
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520>
2020-01-27 20:26:23 +00:00
Sagar Ghuge
a27542c5dd intel/compiler: Clear accumulator register before EOT
v2: (Francisco Jerez)
- Drop vec4 changes.
- Handle explicit acc0 operand and implicit one.
- Make sure instruction is SIMD16, prediction is off and default mask
  control set to true.

v3: (Francisco Jerez)
- Clear accumulator only when it's written.
- Use BRW_MASK_DISABLE instead of true.
- Use correct width for brw_acc_reg().
- Fix last_inst_offset.

v4: (Francisco Jerez)
- Don't check for last instruction for accummulator write.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3376>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3376>
2020-01-27 19:48:11 +00:00
Alyssa Rosenzweig
480cf7d9bf pan/midgard: Remove float_bitcast
Now unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3588>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3588>
2020-01-27 13:37:36 -05:00
Samuel Pitoiset
83e1fa87a7 radv: do not allow sparse resources with multi-planar formats
It's unsupported.

Fixes some fails or hangs with
dEQP-VK.sparse_resources.image_sparse_binding.*

Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3581>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3581>
2020-01-27 15:47:49 +00:00
Boris Brezillon
24360966ab panfrost/midgard: Prettify embedded constant prints
Until now, embedded constants were printed as all 32 bits integer or
floats, but the compiler can pack constant from different types if
severa instructions with different reg_mode and native type refer to
the constant register. Let's implement something smarter so users don't
have to do a manual conversion when looking at a trace.

Note that 8-bit constants are not decoded yet, as we're not sure how
the writemask is encoded in that case.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536>
2020-01-27 15:24:54 +00:00
Boris Brezillon
aa973fc14e panfrost/midgard: Add a condense_writemask() helper
This way we can convert an 8-bit writemask (Midgard specific
representation) into the more common 1-bit/component representation.

8-bit mode is not supported yet, as we're not sure how the writemask is
encoded for this mode.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536>
2020-01-27 15:24:54 +00:00
Rhys Perry
2dc63d39d3 aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc
No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 0be7409069 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Rhys Perry
827681f921 aco: always add sgprs to sgpr_ids when choosing literals
Even if it's a literal, we should add this to sgpr_ids.
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 0be7409069 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Rhys Perry
92970adb4b aco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Rhys Perry
e6c90e4af9 aco: fix WaR check for >64-bit FLAT/GLOBAL instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 5986e0019 ('aco: improve WAR hazard workaround with >64bit stores')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Alyssa Rosenzweig
8784062abb pan/midgard: Handle tag 0x4 as texture
Used for barriers which work as texture ops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>
2020-01-27 13:38:41 +00:00
Alyssa Rosenzweig
5a271df028 pan/midgard: Validate barriers use a barrier tag
...and that non-barriers don't use a barrier tag. It's not clear what
the difference means quite yet, though.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>
2020-01-27 13:38:41 +00:00
Alyssa Rosenzweig
c9f4eface3 pan/midgard: Disassemble barrier instructions
We don't need to print all the usual texture noise; just the relevant
fields and the rest can be guarded to zero.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>
2020-01-27 13:38:41 +00:00
Alyssa Rosenzweig
556964d927 pan/midgard: Record TEXTURE_OP_BARRIER
It's 0x0B for whatever reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>
2020-01-27 13:38:41 +00:00
Alyssa Rosenzweig
3993969477 pan/decode: Drop MFBD compute shader stuff
This is triggering all sorts of failures in pandecode and is only mostly
spurious. Let's not overwhelm ourselves with this yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>
2020-01-27 13:38:41 +00:00
Icecream95
8004874885 panfrost: Don't copy uniforms when the size is zero
This fixes a crash when using Gallium HUD with QuakeSpasm when gamma
correction shaders (a QuakeSpasm feature, not part of Mesa) are used.

Reviewd-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3549>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3549>
2020-01-27 13:23:34 +00:00
Florian Will
951083768b radv/winsys: set IB flags prior to submit in the sysmem path
This fixes missing scene objects in ZUSI 3 + dxvk. Index / vertex buffer
upload using thousands of CopyBuffer commands in one huge Vulkan command
buffer, mixed with lots of render pass begin/end and draw calls, failed
for some of the buffers.

radv divides the huge command buffer into 3 IBs, and they had random
flags set because the field was uninitialized. Maybe IBs got discarded
if they had the PREAMBLE bit set.

Signed-off-by: Florian Will <florian.will@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3577>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3577>
2020-01-27 11:53:22 +01:00
Pierre-Eric Pelloux-Prayer
90312de551 docs: document AMD_DEBUG variable
See https://gitlab.freedesktop.org/mesa/mesa/issues/2022

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492>
2020-01-27 09:29:10 +01:00
Pierre-Eric Pelloux-Prayer
a803d41248 radeonsi: move AMD_DEBUG tests to AMD_TEST
AMD_DEBUG env var is stored in a 64 bits int and has 64 different values.
This commit makes some space by moving the test* special values to AMD_TEST.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492>
2020-01-27 09:29:10 +01:00
Dave Airlie
58ba7b696d gallivm/nir: add missing break for isub.
Pointed out by coverity scan.

Fixes: 3adf74f2ef ("gallivm: pick integer builders for alu instructions.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3571>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3571>
2020-01-27 08:05:27 +10:00
Lionel Landwerlin
8bd92a15cf isl: add gen12 comment about CCS for linear tiling
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551>
2020-01-26 20:46:14 +00:00
Lionel Landwerlin
a3f6db2c4e isl: drop CCS row pitch requirement for linear surfaces
We were applying row pitch constraint of CCS surfaces to linear
surfaces. But CCS is only supported in linear tiling under some
condition (more on that in the following commit). So let's drop that
requirement for now.

Fixes a bunch of crucible assert where the byte size of a linear image
is expected to be similar to the byte size of buffer for the same
extent in the following category :

   func.miptree.r8g8b8a8-unorm.aspect-color.view-2d.*download-copy-with-draw.*

v2: Move restriction to isl_calc_tiled_min_row_pitch()

v3: Move restrinction to isl_calc_row_pitch_alignment() (Jason)

v4: Update message (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 07e16221d9 ("isl: Round up some pitches to 512B for Gen12's CCS")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551>
2020-01-26 20:46:14 +00:00
Lionel Landwerlin
397ff2976b intel: Implement Gen12 workaround for array textures of size 1
Gen12 does not support RENDER_SURFACE_STATE::SurfaceArray = true &&
RENDER_SURFACE_STATE::Depth = 0. SurfaceArray can only be set to true
if Depth >= 1.

We workaround this limitation by adding the max(value, 1) snippet in
the shaders on the 3 components for texture array sizes.

Tested on Gen9 with the following Vulkan CTS tests :
dEQP-VK.image.image_size.2d_array.*

v2: Drop debug print (Tapani)
    Switch to GEN:BUG instead of Wa_

v3: Fix dEQP-VK.image.image_size.1d_array.* cases (Lionel)

v4: Fix dEQP-VK.glsl.texture_functions.query.texturesize.* cases
    (Missing tex_op handling) (Lionel)

v5: Missing break statement (Lionel)

v6: Fixup comment (Tapani)

v7: Fixup comment again (Tapani)

v8: Don't use sample_dim as index (Jason)
    Rename pass
    Simplify control flow

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v7)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3362>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3362>
2020-01-26 22:27:03 +02:00
Jason Ekstrand
4d03e53127 intel/isl: Allow CCS_E on more formats
Now that BLORP supports copies on everything except R11G11B10_FLOAT,
we should be able to support CCS_E those formats.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>
2020-01-25 17:48:54 +00:00
Jason Ekstrand
f132e0fddf intel/blorp: Add support for CCS_E copies with UNORM formats
Some of the smaller bit-size formats which support CCS_E don't have a
UINT representative in their compression class.  However, we should be
able to use UNORM just fine and still get bit-exact copies.  We just
have to do a conversion to/from UNORM when we bitcast.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>
2020-01-25 17:48:54 +00:00
Erico Nunes
ae0b8ba5d5 lima/ppir: fix src read mask swizzling
The src mask can't be calculated from the dest write_mask.
Instead, it must be calculated from the swizzled operators of the src.
Otherwise, liveness calculation may report incorrect live components for
non-ssa registers.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
2020-01-25 14:48:55 +01:00
Erico Nunes
ab36523ae7 lima/ppir: split ppir_op_undef into undef and dummy again
Those were renamed/merged some time ago but it turns out that
ppir_op_undef can't be shared.
It was being used for undefined ssa operations and for read-before-write
operations that may happen to e.g. uninitialized registers (non-ssa)
inside a loop.
We really don't want to reserve a register for the undef ssa case, but
we must reserve and allocate register for the unitialized register case
because when it happens inside a loop it may need to hold its value
across iterations.

This dummy node might be eliminated with a code refactor in ppir in case
we are able to emit the write and allocate the ppir_reg before we emit
the read. But a major refactor we need this to keep this code to avoid
apparent regressions with the new liveness analysis implementation.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
2020-01-25 14:48:55 +01:00
Erico Nunes
4ca3de06ec lima/ppir: fix ssa undef emit
The ssa doesn't need to be manually added to block->comp->reg_list.
Doing so actually causes other registers to be marked as undef=true
later.

This patch alone fixes a few deqp tests that have undefs.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
2020-01-25 14:48:55 +01:00
Erico Nunes
d6b1917c01 lima/ppir: handle write to dead registers in ppir
nir can output writes to dead registers when expanding vec4 operations
to non-ssa registers. In that case, some components of the vec4 may be
assigned but never read. These are also not currently removed by a nir
dead code elimination pass as they are not ssa.
In order to prevent regalloc from allocating a live register for this
operation, an interference must be assigned to it during liveness
analysis.

This workaround may be removed in the future if the assignments to dead
components can be removed earlier in ppir or nir.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>
2020-01-25 14:48:02 +01:00
Marek Olšák
eb7cd575da radeonsi: fix a regression since the addition of si_shader_llvm_vs.c
Fixes: cd5b99c541 - radeonsi: move VS shader code into si_shader_llvm_vs.c
Closes: #2416
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>
2020-01-25 05:59:24 +00:00
Marek Olšák
688d2901b8 radeonsi: make screen available to shader part compilation
to fix a crash in is_multi_part_shader.

Fixes: 1a0890dcf3 - radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>
2020-01-25 05:59:24 +00:00
Jason Ekstrand
07a441d53f anv: Rework CCS memory handling on TGL-LP
The previous way we were attempting to handle AUX tables on TGL-LP was
very GL-like.  We used the same aux table management code that's shared
with iris and we updated the table on image create/destroy.  The problem
with this is that Vulkan allows multiple VkImage objects to be bound to
the same memory location simultaneously and the app can ping-pong back
and forth between them in the same command buffer.  Because the AUX
table contains format-specific data, we cannot support this ping-pong
behavior with only CPU updates of the AUX table.

The new mechanism switches things around a bit and instead makes the aux
data part of the BO.  At BO creation time, a bit of space is appended to
the end of the BO for AUX data and the AUX table is updated in bulk for
the entire BO.  The problem here, of course, is that we can't insert the
format-specific data into the AUX table at BO create time.

Fortunately, Vulkan has a requirement that every TILING_OPTIMAL image
must be initialized prior to use by transitioning the image from
VK_IMAGE_LAYOUT_UNDEFINED to something else.  When doing the above
described ping-pong behavior, the app has to do such an initialization
transition every time it corrupts the underlying memory of the VkImage
by using it as something else.  We can hook into this initialization and
use it to update the AUX-TT entries from the command streamer.  This way
the AUX table gets its format information, apps get aliasing support,
and everyone is happy.

One side-effect of this is that we disallow CCS on shared buffers.
We'll need to fix this for modifiers on the scanout path but that's a
task for another patch.  We should be able to do it with dedicated
allocations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand
b29cf7daf3 anv: Make anv_vma_alloc/free a lot dumber
All they do now is take a size, align, and flags and figure out which
heap to allocate in.  All of the actual code to deal with the BO is in
anv_allocator.c.  We want to leave anv_vma_alloc/free in anv_device.c
because it deals with API-exposed heaps so it still makes sense to have
it there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand
fd0f9d1196 anv: Make AUX table invalidate a PIPE_* bit
This commit moves it in with all the other cache invalidation operations
as if it were done by PIPE_CONTROL even though it's a pair of register
writes.  This means we only have to write the GFX_AUX_TABLE_BASE_ADDR
register once at device initialization instead of every invalidate.
Invalidates are now a single LRI instead of two.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand
658dc9ca50 anv: Add another align_down helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand
64ca8a3272 isl: Add a helper for calculating subimage memory ranges
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand
4793116036 anv: Delete a redundant calculation
We compute the same thing with the same variable name at the top of the
function.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand
a1e9adc9ce intel/aux-map: Factor out some useful helpers
This breaks add_mapping() into three pieces:

    1. get_aux_entry() adds AUX-TT pages as needed and returns the
       L1 entry index, L1 entry address, and L1 entry map.

    2. gen_aux_map_format_bits_for_isl_surf() computes the format-
       specific information that goes in the AUX-TT entry.

    3. add_mapping() is a lot dumber function that now just adds the
       requested mapping with the requested format bits.

This lets us break out some additional helpers in the API which we want
to use for more direct AUX-TT management in ANV.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Jason Ekstrand
bea62ea566 intel/aux-map: Add some #defines
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
2020-01-25 02:18:33 +00:00
Marek Olšák
0366c8c5b7 radeonsi: expose shader cache stats to the HUD
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák
c046551e60 radeonsi: print shader cache stats with AMD_DEBUG=cache_stats
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák
2fd3bb23ab radeonsi: restructure si_shader_cache_load_shader
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák
0db74f479b radeonsi: use the live shader cache
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák
4bb919b0b8 gallium/util: add a cache of live shaders for shader CSO deduplication
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák
f36f85d958 util/simple_mtx: add a missing include to get ASSERTED
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Caio Marcelo de Oliveira Filho
6a0dda63dd intel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SAT
Fixes: 58907568ec ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>
2020-01-24 23:52:30 +00:00
Caio Marcelo de Oliveira Filho
c1a2ac2abe anv: Always initialize target_stencil_layout
Pass down stencil data from the subpass attachment like we do
elsewhere.  Only stencil attachments will make use of it.

Fixes warnings like

    ../src/intel/vulkan/genX_cmd_buffer.c: In function ‘cmd_buffer_begin_subpass’:
    ../src/intel/vulkan/genX_cmd_buffer.c:4656:41: warning: ‘target_stencil_layout’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     4656 |       att_state->current_stencil_layout = target_stencil_layout;
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
2020-01-24 14:01:38 -08:00
Jason Ekstrand
41bffe0913 anv: Replace aux_surface.isl.size_B checks with aux_usage checks
Now that aux_usage has a unified meaning, aux_usage == NONE if and only
if aux_surface.isl.size_B > 0.  In most of these cases, the question
we're asking is "does have compression?" and not "have we allocated an
aux surface for compression?".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
2020-01-24 21:07:26 +00:00
Jason Ekstrand
e693a57232 anv: Rework the meaning of anv_image::planes[]::aux_usage
Previously, we set aux_usage=ISL_AUX_USAGE_NONE when we really meant
CCS_D.  This sort-of made sense before we had anv_layout_to_aux_usage
but now that we have that helper.  However, in our more modern aux
tracking model, all aux usage goes through anv_layout_to_* and we're
better off making the meaning of anv_image::planes[]::aux_usage be
AUX_USAGE_NONE if and only if there is no compression.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
2020-01-24 21:07:26 +00:00
Samuel Pitoiset
de64719024 radv: print NIR shaders after lowering FS inputs/outputs
This is confusing otherwise.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>
2020-01-24 19:54:58 +00:00
Jason Ekstrand
17e225ee1e intel/isl: Add a hack for the Gen12 A0 texture buffer bug
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand
4cd23420bd intel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand
98aab272a8 intel/disasm: Properly disassemble indirect SENDs
Instead of emitting g[a0]UD for the indirect descriptor, emit a0<0>UD.
This is more correct because there is no GRF involved.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand
3b2eafbea9 intel/fs: Don't unnecessarily fall back to indirect sends on Gen12
The instruction encoding for SENDS changed on Gen12 and it now supports
embedding the entire extended message descriptor in the instruction if
it's an immediate.  Stop falling back to doing an indirect SEND just
because we had something in [15:12] of ex_desc.ud.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:27 +00:00
Jason Ekstrand
c70a786c77 anv: Improve BTI change cache flushing
This commit makes two changes:

 1. We set pending_pipe_bits instead of emitting PIPE_CONTROL directly
    for the flush at the end of cmd_buffer_begin_subpass.

 2. Because BLORP ops such as vkCmdClearAttachments may come in the
    middle of a render pass, we have to also flag the need for a cache
    flush after the blorp op.

Fixes: 185630c6bc "anv/blorp: Do the gen11 BTI flush"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
2020-01-24 19:18:26 +00:00
Alyssa Rosenzweig
e39c52787e panfrost: Fix 32-bit warning for indices
../src/gallium/drivers/panfrost/pan_context.c: In function ‘panfrost_draw_vbo’:
../src/gallium/drivers/panfrost/pan_context.c:1551:70: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
                 ctx->payloads[PIPE_SHADER_FRAGMENT].prefix.indices = (u64) NULL;
                                                                      ^

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Icecream95 <ixn@keemail.me>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig
58aa2b8cfc pan/decode: Remove SHORT_SLIDE indirection
../src/panfrost/pandecode/decode.c: In function ‘pandecode_compute_fbd’:
../src/panfrost/pandecode/decode.c:789:35: warning: taking address of packed member of ‘struct mali_compute_fbd’ may result in an unaligned pointer value [-Waddress-of-packed-member]
  789 |         pandecode_u32_slide(num, s->unknown ## num, ARRAY_SIZE(s->unknown ## num))
      |                                  ~^~~~~~~~~
../src/panfrost/pandecode/decode.c:800:9: note: in expansion of macro ‘SHORT_SLIDE’
  800 |         SHORT_SLIDE(1);
      |         ^~~~~~~~~~~

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig
7d52b3a18b pan/midgard: Remove pack_color define
Unused at the moment.

../src/panfrost/midgard/midgard_compile.c:124:29: warning: ‘m_pack_colour’ defined but not used [-Wunused-function]
  124 |  static midgard_instruction m_##name(unsigned ssa, unsigned address) { \
      |                             ^~
../src/panfrost/midgard/midgard_compile.c:145:22: note: in expansion of macro ‘M_LOAD_STORE’
  145 | #define M_LOAD(name) M_LOAD_STORE(name, false)
      |                      ^~~~~~~~~~~~
../src/panfrost/midgard/midgard_compile.c:213:1: note: in expansion of macro ‘M_LOAD’
  213 | M_LOAD(pack_colour);
      | ^~~~~~

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig
6c95ea6bd7 pan/decode: Remove last_size
Fixes ../src/panfrost/pandecode/decode.c: In function ‘pandecode_jc’:
../src/panfrost/pandecode/decode.c:2859:14: warning: variable ‘last_size’ set but not used [-Wunused-but-set-variable]
 2859 |         bool last_size;

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig
d126515a16 panfrost: Don't use implicit mali_exception_status enum
Fixes ../src/panfrost/pandecode/public.h:53:33: warning: ‘enum mali_exception_access’ declared inside parameter list will not be visible outside of this definition or declaration

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>
2020-01-24 18:53:31 +00:00
Samuel Pitoiset
4a553212fa radv: enable ACO support for GFX6
CTS should pass, as well as Crucible and the few number of Piglit tests.

List of game benchmarks tested:
- Dawn of War 3
- Serious Sam 2017
- Shadow of The Tomb Raider
- The Talos Principle
- Thrones of Britannia
- Total Warhammer 2
- Total War: Three Kingdoms

Note that F12017 hangs with or without ACO on GFX6 at the moment.

My whole pipelinedb (~30 games) doesn't trigger any compiler crashes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2401
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset
d4b4f40595 aco: copy the literal offset of SMEM instructions to a temporary
GFX6 only supports up to 8-bit for the literal offset, so make sure
it's copied to a temporary SGPR before emitting a SMEM instruction.
The optimizer will propagate the literal offset if possible anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset
1ac49ba908 aco: fix a hazard with v_interp_* and v_{read,readfirst}lane_* on GFX6
It's required to insert 1 wait state if the dst VGPR of any v_interp_*
is followed by a read with v_readfirstlane or v_readlane to fix GPU
hangs on GFX6. Note that v_writelane_* is apparently not affected.
This hazard isn't documented anywhere but AMD confirmed it.

This fixes a GPU hang with the texturemipmapgen Sascha demo on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Samuel Pitoiset
b9cc50fbce aco: fix a hardware bug for MRTZ exports on GFX6
GFX6 (except OLAND and HAINAN) has a bug that it only looks at
the X writemask component.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>
2020-01-24 18:34:27 +00:00
Brian Ho
f55e215b8c turnip: Implement vkCmdCopyQueryPoolResults for occlusion queries
Use CP_COND_EXEC and CP_COND_WRITE to conditionally copy the results
of a query to a buffer based off the query's availability.

Fixes: #2238
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho
9a3656b9fd turnip: Implement vkCmdResetQueryPool
Clears the available bit for each requested query on the GPU.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho
97fa4cb3dc turnip: Implement vkGetQueryPoolResults for occlusion queries
Implements fetching the results of a query pool with the
VK_QUERY_RESULT_WAIT_BIT, VK_QUERY_RESULT_WITH_AVAILABILITY_BIT,
and VK_QUERY_RESULT_PARTIAL_BIT flags.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho
24b95485dc turnip: Update query availability on render pass end
Unlike on an immidiate-mode renderer, Turnip only renders tiles on
vkCmdEndRenderPass. As such, we need to track all queries that were
active in a given render pass and defer setting the available bit
on those queries until after all tiles have rendered.

This commit adds a draw_epilogue_cs to tu_cmd_buffer that is
executed as an IB at the end of tu_CmdEndRenderPass. We then emit
packets to this command stream that update the availability bit of a
given query in tu_CmdEndQuery.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho
f750dd2ab8 turnip: Implement vkCmdEndQuery for occlusion queries
Mostly a translation of freedreno's implementation of glEndQuery for
GL_SAMPLES_PASSED query objects with a slight modification to set the
availability bit of the query bo (slot->available) if the query was
not ended inside a render pass.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho
5824a59ee2 turnip: Implement vkCmdBeginQuery for occlusion queries
Mostly a translation of freedreno's implementation of glBeginQuery for
GL_SAMPLES_PASSED query objects with special logic for handling tiled
render passes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho
78dea40b1c turnip: Implement vkCreateQueryPool for occlusion queries
General structure is inspired by anv's implementation in genX_query.c.
We define a packed struct that tracks sample count at the beginning of
the query and at the end; the result of the occlusion query is then
slot->end - slot->begin.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Brian Ho
a155ab93a3 turnip: Update tu_query_pool with turnip-specific fields
tu_query_pool was forked from radv_query_pool, but we will need a
different set of fields to implement queries in turnip.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>
2020-01-24 18:14:01 +00:00
Jason Ekstrand
0aa13245c1 anv: Allow HiZ in read-only depth layouts
This improves the performance of Aztec Ruins by 5% on ICL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand
bf3a262a80 anv: Add a usage parameter to anv_layout_to_aux_usage
Most places we actually know the usage and can provide it.  There are
two exceptions to this:

 1. We pass 0 into get_blorp_surf_for_anv_image when we use
    ANV_IMAGE_LAYOUT_EXPLICIT_AUX because anv_layout_to_aux_usage is
    never actually called so it doesn't matter.

 2. We pass 0 into anv_layout_to_aux_usage in transition_color_buffer.
    However, the coming commits which will begin using the usage
    parameter only care about depth.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand
f8a4de6316 anv: Use isl_aux_state for HiZ resolves
Rather than looking at the aux usage, we look at the isl_aux_state which
provides us with more detailed information.  This commit adds a couple
helpers to isl which let us quickly determine if we have valid depth/hiz
on the initial layout and if we need valid depth/hiz for the final
layout.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand
9a1232a745 anv: Add a layout_to_aux_state helper
This new helper maps VkImageLayout enums to isl_aux_state enums which
are the hardware's concept of image layouts.  We can then use the aux
state to get the fast clear type and the aux usage.  This should yield
no functional change in driver behavior.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand
769d6ba200 anv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolves
As of 52ad1712ed, TRANSFER_SRC_OPTIMAL and SHADER_READ_ONLY_OPTIMAL
are now identical for depth buffers so there's no reason why we need to
use the "wrong" layout.  Technically, according to Vulkan, blits and
MSAA resolves are transfer ops so we should use the transfer layout now
that we can.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
2020-01-24 17:42:36 +00:00
Jason Ekstrand
71c0f9e76d intel/blorp: resize src and dst surfaces separately
When copying to an RGB surface, we treat it as an R only one of three
times the width, which may end up being larger than the maximum size
supported by the hardware and so it hits the shrink path. This forced
both source and destination surfaces to be shrunk, even though it's not
necessary for the former, and may even hit some assertions in some
cases, such as the surface being compressed.

Fixes several tests under dEQP-VK.api.copy_and_blit.core.image_to_image.dimensions.*

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422>
2020-01-24 17:02:40 +00:00
Samuel Pitoiset
918f00eef8 aco: combine MRTZ (depth, stencil, sample mask) exports
Instead of emitting up to 3 for each different components (depth,
stencil and sample mask). This is needed to fix a hw bug on GFX6.

Totals from affected shaders:
SGPRS: 34728 -> 35056 (0.94 %)
VGPRS: 26440 -> 26476 (0.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1346088 -> 1344180 (-0.14 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3922 -> 3915 (-0.18 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>
2020-01-24 16:42:15 +00:00
Timur Kristóf
c787b8d2a1 aco/gfx10: Fix VcmpxExecWARHazard mitigation.
The SOPP instruction shouldn't have a definition, and its block
should be set to -1 in order to prevent it from being recognized
as a branch.
Also fix a typo in the readme.

Fixes: d6dfce02d0
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>
2020-01-24 16:21:08 +00:00
Timur Kristóf
8a32f57fff aco: Transform uniform bitwise instructions to 32-bit if possible.
This allows removing superfluous s_cselect instructions
that come from turning booleans into 64-bit vector condition.

v2 by Daniel Schürmann:
- Make the code massively simpler
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode
- Eliminate extra moves by not always using the SCC definition
- Use s_absdiff_i32 for uniform XOR
- Skip the transformation for uncommon or invalid instructions

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
2020-01-24 14:40:45 +00:00
Martin Fuzzey
d1925fec53 etnaviv: update Android build files
etnaviv no longer builds on Android, fix this.

Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3447>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3447>
2020-01-24 14:03:28 +00:00
Rhys Perry
b046f55086 aco: use nir_move_copies
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
72e9a23443 radv/aco: use ACO for GS copy shaders
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
f8f7712666 aco: implement GS copy shaders
v5: rebase on float_controls changes
v7: rebase after shader args MR and load/store vectorizer MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
de4ce66f5c aco: remove needs_instance_id
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
e192e268de aco: explicitly mark end blocks for exports
For GS copy shaders, whether we want to do exports is conditional. By
explicitly marking the end blocks, we can mark an IF's then branch as an
export block and ensure that's where the assembler inserts null exports.

v6: only fixup exports in the end block, like before
v8: simplify some code

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
d46a54ecff radv/aco: allow ACO for GS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
8bad100f83 aco: implement GS on GFX7-8
GS is the same on GFX6, but GFX6 isn't fully supported yet.

v4: fix regclass
v7: rebase after shader args MR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
40bb81c9dd radv/aco,aco: implement GS on GFX9+
v2: implement GFX10
v3: rebase
v7: rebase after shader args MR
v8: fix gs_vtx_offset usage on GFX9/GFX10
v8: use unreachable() instead of printing intrinsic
v8: rename output_state to ge_output_state
v8: fix formatting around nir_foreach_variable()
v8: rename some helpers in the scheduler
v8: rename p_memory_barrier_all to p_memory_barrier_common
v8: fix assertion comparing ctx.stage against vertex_geometry_gs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
70f63c1988 aco: improve support for s_sendmsg
In particular, the messages needed for GS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Rhys Perry
0da7b3b18b radv: move gs copy shader creation before other variants
ACO lowers output derefs which breaks the shader_info pass used by gs copy
shader creation.

v3: rebase

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Timur Kristóf
23edcf6490 aco: Make a better guess at which instructions need the VCC hint.
Previously, bool_to_vector_condition would always set the VCC hint
on its result. This commit improves it by having the optimizer set
the VCC hint only when the result really needs to be in the VCC.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
2020-01-24 13:14:23 +00:00
Jan Zielinski
83f24b0587 gallium/swr: implementation of tessellation shaders compilation
TCS and TES shaders compilation mechanisms in SWR and state
management implementation.

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3484>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3484>
2020-01-24 11:38:03 +00:00
Bas Nieuwenhuizen
0890482969 radv: Allow DCC & TC-compat HTILE with VK_IMAGE_CREATE_EXTENDED_USAGE_BIT.
I misunderstood the flag when initially disabling. But this flag
only does something with mutable formats. If we have DCC and
mutable formats, the formats are close enough that the allowed
usage flags are not meaningfully different nor used during
allocation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3424>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3424>
2020-01-24 11:16:39 +00:00
Bas Nieuwenhuizen
1b447bd2e6 radv: Expose VK_KHR_swapchain_mutable_format.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2354
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3425>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3425>
2020-01-24 10:47:07 +00:00
Connor Abbott
b103157a0e freedreno: Document CP_INDIRECT_BUFFER_CHAIN
This will let us use batch chaining instead of growing batches on a5xx
and a6xx.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537>
2020-01-24 10:03:08 +00:00
Connor Abbott
f58242b56e freedreno: Document CP_UNK_A6XX_55
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537>
2020-01-24 10:03:08 +00:00
Connor Abbott
3cf1d6b8db freedreno: Document CP_COND_REG_EXEC more
The vulkan blob uses the RENDER_MODE mode to condition a blit on the
render mode in traces of a dEQP triangle test.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3182>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3182>
2020-01-24 09:23:27 +00:00
Samuel Pitoiset
a31bcf2be6 ac/llvm: fix missing casts in ac_build_readlane()
Because ac_build_optimization_barrier() overwrites the original
src_type, we have to keep track of it before emitting that barrier.
Otherwise, wrong conversions are expected for pointers or small
bitsizes.

By doing this, we no longer need to do the cast dance in
ac_build_readlane_no_opt_barrier(), it was just necessary for
ac_build_optimization_barrier().

This fixes a bunch of crashes with subgroups related tests when
RADV_DEBUG=checkir is enabled, and it also fixes a compiler crash
with The Surge 2.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2395
Fixes: 0f45d4dc2b ("ac: add ac_build_readlane without optimization barrier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>
2020-01-24 07:40:07 +01:00
Jason Ekstrand
8a135ff6e5 anv/apply_pipeline_layout: Initialize the nir_builder before use
Fixes: #2410
Fixes: 3c754900b5 "nir: don't emit ishl in _nir_mul_imm() if backend doesn't support bitops"
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3548>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3548>
2020-01-23 19:35:39 -08:00
Kenneth Graunke
adaa3583f5 meson: Prefer 'iris' by default over 'i965'.
This changes the default driver for Intel Gen8-11 hardware to be
the newer 'iris' driver rather than the older 'i965' driver.  To
continue using i965, pass -Dprefer-iris=false when building.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3540>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3540>
2020-01-23 15:34:54 -08:00
Adam Jackson
2fc11e8a05 drisw: Cache the depth of the X drawable
This is not always ->rgbBits, because there are cases where that could
be 32 but we're (legally) bound to a depth-24 pixmap. The important
thing to have match here is the actual server-side notion of depth.  You
can look this up (at modest expense) from the xlib visual info if the
fbconfig has a visual. But it might not, so if not, fetch it (at
slightly greater expense) from XGetGeometry. Do this at GLX drawable
creation so you don't have to do it on the SwapBuffers path.

Apparently this fixes glx/glx-swap-singlebuffer, which is unintentional
but quite pleasant.

Fixes: mesa/mesa#2291
Fixes: 90d58286 ("drisw: Fix and simplify drawable setup")
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3305>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3305>
2020-01-23 23:03:13 +00:00
Eric Anholt
59f29fc845 turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros.
There are only a couple of hard cases left using pkt4, where the register
number to write is computed.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>
2020-01-23 22:46:09 +00:00
Eric Anholt
d67100519e turnip: Convert renderpass setup to the new register packing macros.
This gets a lot of the hard code converted over to the new macros,
resulting in (I feel) much more readable code with
LESS_SHOUTING_ABOUT_THE_REG().  I decided to consistently put the reg on
its own line, so that all the register names line up.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>
2020-01-23 22:46:09 +00:00
Eric Anholt
08837ea3d2 turnip: Port krh's packing macros from freedreno to tu.
This introduces some minor unpacking of the temporary fd_reg_pair structs
to code that previously was packing a whole register field.

In the pack wrapper in tu_cs.h, I added some explanatory docs, dropped the
relocs handling since we don't need it, and removed the extra regs[] in
the __ONE_REG() macro (which was causing gcc's optimizer to fall on its
face in my release build).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>
2020-01-23 22:46:09 +00:00
Eric Anholt
d4bc3c93ea freedreno: Fix OUT_REG() on address regs without a .bo supplied.
Sometimes you want to zero out an address by supplying a NULL BO, but
without this we would end up only emitting one dword.  Increases size of
fd6_gmem.o by .8%, though it's not clear to me why (no obvious terrible
codegen happening)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>
2020-01-23 22:46:09 +00:00
Eric Anholt
c1327bc283 freedreno: Add some missing a6xx address declarations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>
2020-01-23 22:46:09 +00:00
Ian Romanick
4b7de92e5f relnotes: Add GL_INTEL_shader_integer_functions2 and VK_INTEL_shader_integer_functions2
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-23 13:36:14 -08:00
Vasily Khoruzhick
beab31b9bb lima: use imul for calculations with intrinsic src
It's source is supposed to be int, so we have to use integer
multiplication otherwise we'll get undefined result.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3529>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3529>
2020-01-23 21:16:22 +00:00
Vasily Khoruzhick
3c754900b5 nir: don't emit ishl in _nir_mul_imm() if backend doesn't support bitops
Otherwise we'll have to lower it later.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3529>
2020-01-23 21:16:22 +00:00
Icecream95
cf2c5a56a1 pan/decode: Rotate trace files
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>
2020-01-23 20:46:38 +00:00
Icecream95
c1952779d6 pan/decode: Dump to a file
The file name is taken from the environment variable
PANDECODE_DUMP_FILE, defaulting to pandecode.dump if it is not set.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>
2020-01-23 20:46:38 +00:00
Icecream95
be22c0789f pan/decode: Support dumping to a file
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>
2020-01-23 20:46:38 +00:00
Icecream95
20a8957397 pan/bifrost: Support disassembling to a file
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>
2020-01-23 20:46:38 +00:00
Icecream95
968f36d1fc pan/midgard: Support disassembling to a file
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>
2020-01-23 20:46:38 +00:00
Icecream95
7b525ba02b pan/midgard: Fix a memory leak in the disassembler
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>
2020-01-23 20:46:38 +00:00
Eric Anholt
fbd9b4ce08 turnip: Fix execution of secondary cmd bufs with nothing in primary.
We want to finish off cmd emission in the primary CS and add its entry to
the IB, but regardless of whether there had been anything in the primary
CS to emit, we still need a reserved CS entry for the loop below.

Fixes crashes in dEQP-VK.binding_model.shader_access.secondary_cmd_buf.*
and many more in dEQP-VK.renderpass*

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3524>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3524>
2020-01-23 20:27:26 +00:00
Alyssa Rosenzweig
d6d6ef2862 panfrost: Drop mysterious zero=0xFFFF field
It doesn't seem to affect any results and it's not at all clear if/why
the blob sometimes(?) sets it? So let's clean this up since this
solution isn't correct anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3513>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3513>
2020-01-23 19:59:58 +00:00
Icecream95
f8eb4441ae pan/midgard: Fix bundle dynarray leak
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3496>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3496>
2020-01-23 19:35:09 +00:00
Marek Olšák
43d9bac6f2 radeonsi: separate LLVM compilation from non-LLVM code
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
1a0890dcf3 radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
7ce84b256e radeonsi: make si_compile_shader return bool
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
be772182e0 radeonsi: make si_compile_llvm return bool
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
bd19d144a1 radeonsi: move more LLVM functions into si_shader_llvm.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
9a66f3d3e2 radeonsi: fold si_shader_context_set_ir into si_build_main_function
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
beacb414b9 radeonsi: move si_nir_build_llvm into si_shader_llvm.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
1c73d598eb radeonsi: minor cleanup in si_shader_internal.h
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
ab33ba987a radeonsi: move si_shader_llvm_build.c content into si_shader_llvm.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
cd5b99c541 radeonsi: move VS shader code into si_shader_llvm_vs.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
d1c42e2c6a radeonsi: move non-LLVM code out of si_shader_llvm.c
There was also some redundant code in si_shader_nir.c

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
594f085cfa radeonsi: use ctx->ac. for types and integer constants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Jonathan Marek
8aa5d96864 turnip: simplify tu_physical_device_get_format_properties
Fixes the "bad VkImageTiling" error when tiling is
VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>
2020-01-23 18:34:07 +00:00
Jonathan Marek
b7e22b7a35 vulkan/wsi: remove unused image_get_modifier
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>
2020-01-23 18:34:07 +00:00
Jonathan Marek
e8afd40758 turnip: set linear tiling for scanout images
Fixes: 210e6887 "vulkan/wsi: Use the interface from the real modifiers extension"

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>
2020-01-23 18:34:07 +00:00
Jonathan Marek
11f6fba1c9 turnip: hook up GetImageDrmFormatModifierPropertiesEXT
Fixes: 210e6887 "vulkan/wsi: Use the interface from the real modifiers extension"

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>
2020-01-23 18:34:07 +00:00
Guido Günther
c5334d2943 freedreno/drm: Don't miscalculate timeout
The current code overflows (s * 1000000000) for s >= 5 but that is
e.g. used in msm_bo_cpu_prep.

Signed-off-by: Guido Günther <agx@sigxcpu.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3514>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3514>
2020-01-23 18:07:13 +00:00
Eric Anholt
b327501dbf turnip: Add support for fine derivatives.
This does appear to be the required instruction sequence (dsxpp_1 dst src;
dsxpp_1.p dst src) as dropping either instruction fails the testsuite.

Fixes dEQP-VK.glsl.derivate.*

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494>
2020-01-23 17:38:29 +00:00
Eric Anholt
876824908d freedreno/ir3: Plumb the ir3_shader_variant into legalize.
legalize is computing a lot of state that goes in the variant, let's just
store it directly instead of passing pointers around.  This leaves
max_bary in place, which is doing some surprising work (overwriting the
original total_in in some cases).

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494>
2020-01-23 17:38:29 +00:00
Anthony Pesch
f77369086c util/hash_table: update users to use new optimal integer hash functions
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
2020-01-23 17:06:57 +00:00
Anthony Pesch
1496cc92f6 util/hash_table: added hash functions for integer types
A few hash_table users roll their own integer hash functions which
call _mesa_hash_data to perform the hashing which ultimately calls
into XXH32 with a dynamic key length. When using small keys with a
constant size the hash rate can be greatly improved by inlining
XXH32 and providing it a constant key length, see:
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html

Additionally, this patch removes calls to _mesa_key_hash_string and
makes them instead call _mesa_has_string directly, matching the new
integer hash functions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
2020-01-23 17:06:57 +00:00
Anthony Pesch
931388ceca util/hash_table: replace _mesa_hash_data's fnv1a hash function with xxhash
For most key sizes, xxhash outperforms fnv1a's hash rate substantially (bug
2153). In particular, the V3D driver hashes multiple ~200 byte keys as part
of the shader cache lookup which can easily eat up 10-20% of the runtime on
the Raspberry Pi. Swapping over to xxhash drops this to ~1% of the runtime.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
2020-01-23 17:06:57 +00:00
Anthony Pesch
032f8807f7 util: move fnv1a hash implementation into its own header
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
2020-01-23 17:06:57 +00:00
Anthony Pesch
17fac0e32d util: import xxhash
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
2020-01-23 17:06:57 +00:00
Michel Dänzer
552028c013 winsys/amdgpu: Close KMS handles for other DRM file descriptions
When a BO or amdgpu_screen_winsys is destroyed.

Should fix leaking such BOs in other DRM file descriptions.

v2:
* Pass the correct file descriptor to drmIoctl (Pierre-Eric
  Pelloux-Prayer)
* Use _mesa_hash_table_remove
v3:
* Close handles in amdgpu_winsys_unref as well
v4:
* Adapt to amdgpu_winsys::sws_list_lock.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2270
Fixes: 11a3679e3a "winsys/amdgpu: Make KMS handles valid for original
                     DRM file descriptor"

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>
2020-01-23 17:39:34 +01:00
Michel Dänzer
b60f5cbc15 winsys/amdgpu: Re-use amdgpu_screen_winsys when possible
Namely, if os_same_file_description determined that the DRM file
descriptor references the same file description.

v2:
* Adapt to amdgpu_winsys::sws_list_lock.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>
2020-01-23 17:39:34 +01:00
Michel Dänzer
f76cbc7901 util: Add os_same_file_description helper
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>
2020-01-23 17:39:34 +01:00
Michel Dänzer
c6468f66c7 winsys/amdgpu: Only re-export KMS handles for different DRM FDs
When the amdgpu_screen_winsys uses the same FD as the amdgpu_winsys
(which is always the case for the first amdgpu_screen_winsys), we can
just use bo->u.real.kms_handle.

v2:
* Also only create the kms_handles hash table if the
  amdgpu_screen_winsys fd is different from the amdgpu_winsys one.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>
2020-01-23 17:39:34 +01:00
Michel Dänzer
24075ac60f winsys/amdgpu: Keep track of retrieved KMS handles using hash tables
The assumption being that KMS handles are only retrieved for relatively
few BOs, so hash tables should be efficient both in terms of performance
and memory consumption.

We use the address of struct amdgpu_winsys_bo as the key and its
kms_handle field (the KMS handle valid for the DRM file descriptor
passed to amdgpu_device_initialize) as the hash value.

v2:
* Add comment above amdgpu_screen_winsys::kms_handles (Pierre-Eric
  Pelloux-Prayer)
v3:
* Protect kms_handles hash table with amdgpu_winsys::sws_list_lock
  mutex.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>
2020-01-23 17:24:00 +01:00
Michel Dänzer
f4010a6da9 winsys/amdgpu: Keep a list of amdgpu_screen_winsyses in amdgpu_winsys
v2:
* Add dedicated mutex for the list.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>
2020-01-23 17:23:32 +01:00
Samuel Pitoiset
8d5203dad2 aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6
V_TRUNC_F64 and V_FLOOR_F64 needs to be lowered on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:48 +01:00
Samuel Pitoiset
4d92601715 aco: implement 64-bit nir_op_ffloor on GFX6
GFX6 doesn't have V_FLOOR_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Introduce a new function because it will be useful for some other
64-bit operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:45 +01:00
Samuel Pitoiset
fbd169e421 aco: implement 64-bit nir_op_fround_even on GFX6
GFX6 doesn't have V_RNDNE_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:42 +01:00
Samuel Pitoiset
87588801d3 aco: implement 64-bit nir_op_fceil on GFX6
GFX6 doesn't have V_CEIL_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:38 +01:00
Samuel Pitoiset
aad5176c58 aco: implement 64-bit nir_op_ftrunc on GFX6
GFX6 doesn't have V_TRUNC_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.

Introduce a new function because it will be useful for some other
64-bit operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:34 +01:00
Samuel Pitoiset
36e7a5f5b9 aco: implement nir_intrinsic_global_atomic_* on GFX6
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:30 +01:00
Samuel Pitoiset
22d8822683 aco: implement nir_intrinsic_load_global on GFX6
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:27 +01:00
Samuel Pitoiset
d6af7571c2 aco: implement nir_intrinsic_store_global on GFX6
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:24 +01:00
Samuel Pitoiset
01f0bef71e aco: fix wrong IR in nir_intrinsic_load_barycentric_at_sample
Only GFX6 was affected, my mistake. The total number of SGPR operands
should be 4 when we want to create a vec4.

Fixes: dbdf3b3ef9 ("aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>
2020-01-23 14:40:21 +01:00
Lionel Landwerlin
d101907de9 anv/iris: warn gen12 3DSTATE_HS restriction
This should never happen but better off documenting it in case someone
plays with max threads numbers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3489>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3489>
2020-01-23 15:06:59 +02:00
Krzysztof Raszkowski
bf74a7f092 gallium/swr: add option for static link
Set swr-shared to 'false' to link SWR statically into Mesa.
Only one swr arch can be specified if swr-shared is set to false.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3510>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3510>
2020-01-23 12:20:24 +00:00
Samuel Pitoiset
54e54ec3e8 aco: fix printing assembly with CLRXdisasm on GFX6
We thought that CLRXdisasm allowed gfx600 as well as gfx700 but
it actually doesn't. Use the family for GFX6 chips instead.

Fixes: 0099f85232 ("aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3531>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3531>
2020-01-23 11:34:37 +00:00
Pierre Moreau
dda542e912 clover/meson: Define OpenCL header macros
Rather than defining the macros any time right before including an
OpenCL header, set Meson to define them for the whole clover project.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3137>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3137>
2020-01-23 11:12:33 +00:00
Pierre Moreau
dd756b704f clover: Use the dispatch table type from the OpenCL headers
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2243

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3137>
2020-01-23 11:12:33 +00:00
Pierre Moreau
cd1c661cfc include/CL: Update OpenCL headers to latest
This latest update contains a new header that defines the dispatch table
structure in order to avoid OpenCL implementations having to define it
themselves.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3137>
2020-01-23 11:12:33 +00:00
Samuel Pitoiset
12fe19ba3b radv: advertise VK_AMD_shader_fragment_mask
Only for GFX8+ because it's untested on older generations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
e030aef32c aco: add support for nir_texop_fragment_{mask}_fetch
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
9e477d79b7 ac/nir: add support for nir_texop_fragment_{mask}_fetch
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
84b08971fb nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch
These instructions are allowed to fetch from multisampled
subpass input attachments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
76a34f5d3f spirv: add support for SpvOpFragment{Mask}FetchAMD operations
nir_tex_src_ms_index is re-used for the fragment index with
nir_texop_fragment_fetch to avoid introducing a new texture source type.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
603e6ba972 nir: add two new texture ops for multisample fragment color/mask fetches
This introduces:
   - nir_texop_fragment_mask_fetch (fetch a fragment mask from a
     compressed multisampled color surface)
   - nir_texop_fragment_fetch (fetch a color fragment for a
     particular sample at corresponding fragment mask index).

These two texture operations are necessary for implementing
SPV_AMD_shader_fragment_mask.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
dea29b3818 spirv: add SpvCapabilityFragmentMaskAMD
This new capability is for SPV_AMD_shader_fragment_mask.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
e60de08547 radv: handle missing implicit subpass dependencies
When a subpass doesn't declare an explicit dependency from/to
VK_SUBPASS_EXTERNAL, Vulkan says there is an implicit dependency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330>
2020-01-23 11:25:41 +01:00
Samuel Pitoiset
0d2da2a8c0 radv: add explicit external subpass dependencies to meta operations
No functional changes because a subpass dependency with dstStageMask
set to VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT is a no-op.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330>
2020-01-23 11:25:38 +01:00
Dave Airlie
48ab21109c gallivm: fix find lsb
the GLSL return value is different than the llvm intrinsic.

Fixes arb gpu shader5 tests

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
2020-01-23 13:48:16 +10:00
Dave Airlie
1e433c398e galllivm: fix gather offset casting
cast texture offsets to 32-bit integers

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
2020-01-23 13:48:16 +10:00
Dave Airlie
fc9d67394d llvmpipe: fix some integer instruction lowering.
We want to lower to shifts for bitfields, and lower ifind_msb.

Fixes a bunch of gpu shader5 tests.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
2020-01-23 13:48:16 +10:00
Dave Airlie
6c88c81df9 gallivm: fix gather component handling.
Fixes the extended gather test for gpu shader5

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
2020-01-23 13:48:16 +10:00
Eric Anholt
65e432695d turnip: Add support for uniform texel buffers.
Pretty straightforward: Port texture descriptor code from freedreno, fill
in alignment limits from closed vk, and tu_cmd_buffer.c was already
uploading the texture descriptor.

This doesn't implement storage texel buffers (required in the compute
pipeline) yet, since those will need an IBO descriptor for the store path.
Still, making the load path be connected to the texture descriptor won't
hurt.

Part of #2237

Fixes dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_texel_buffer.*

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3522>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3522>
2020-01-23 02:40:09 +00:00
Kenneth Graunke
8dc0540a17 intel: Fix aux map alignments on 32-bit builds.
ALIGN() brilliantly uses uintptr_t, making it unsafe for use with 64-bit
GPU addresses in 32-bit builds of the driver.  Use align64() instead,
which uses uint64_t.

Fixes assertion failures when running any 32-bit program on Tigerlake.

Fixes: 2e6a7ced4d ("iris/gen12: Write GFX_AUX_TABLE base address register")
Fixes: 0d0290bb3f ("intel/common: Add surface to aux map translation table support")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3507>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3507>
2020-01-23 02:16:50 +00:00
Matt Turner
4413537c80 util: Remove tmp argument from BITSET_FOREACH_SET macro
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
2020-01-23 01:52:43 +00:00
Matt Turner
d3eb2a0951 util: Explain BITSET_FOREACH_SET params
__size, in particular, makes this macro rather confusing to understand
how to use. Hopefully this comment saves future users the headache.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
2020-01-23 01:52:42 +00:00
Vasily Khoruzhick
60f9b45802 lima: implement invalidate_resource()
We don't need to resolve invalidated resources, so it should
improve performance for applications that are doing this hint.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3476>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3476>
2020-01-23 01:26:23 +00:00
Timothy Arceri
bf830250a7 glsl_to_nir: update interface type properly
Since 76ba225184 the member variable types were being redefined
but we assigned the old interface type to the variable.

In a following patch series we will use the types to check if we
are dealing with an interface instance when apply GLSL linking
rules.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>
2020-01-23 01:02:25 +00:00
Timothy Arceri
d3a4d1775e glsl: count uniform components and storage better in nir linking
This helps avoid incorrect validation error when linking glsl
shaders and avoids assigning uniform storage slots that will
never be used.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>
2020-01-23 01:02:25 +00:00
Timothy Arceri
e5b3cf433e glsl: fix check for matrices in blocks when using nir uniform linker
We need to stripe any arrays before checking the type. Here we
just use the uniform type which has already be stripped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>
2020-01-23 01:02:25 +00:00
Timothy Arceri
55e4410b34 glsl: remove bogus assert in nir uniform linking
I'm not sure why this was first added but it causes an assert
on any uniform matrix.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>
2020-01-23 01:02:25 +00:00
Ian Romanick
b065d8fb8c nir/algebraic: Optimize some 64-bit integer comparisons involving zero
I noticed that we can do better for these kinds of comparisons while
working on the lowering for iadd_sat@64 and isub_sat@64.  This
eliminated 11 instruction from the fs-addSaturate-int64.shader_test.

My hope is that this will improve the run-time of int64 tests on Ice
Lake.  I have no data to support or refute this.

Unsurprisingly, no changes on shader-db.

v2: Condition the min and max patterns with nir_lower_minmax64.
Suggested by Caio.  Very long discussion in the MR. :)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
c57338b924 anv: Enable SPV_INTEL_shader_integer_functions2 and VK_INTEL_shader_integer_functions2
Currently only implemented in the scalar backend, so only enable for
Gen8+.  If support for the other opcodes is added to the vec4 backend,
Gen7 could be supported.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
76970940a6 iris: Enable INTEL_shader_integer_functions2
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
b14e718e68 gallium: Add a cap bit for integer multiplication between 32-bit and 16-bit
Driver supports integer multiplication between a 32-bit integer and a
16-bit integer.  If the second operand is 32-bits, the upper 16-bits are
ignored, and the low 16-bits are possibly sign extended as necessary.

Iris will eventually enable this.  Not sure about other drivers.

v2: Add default value to u_screen.c.  Suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
9db20748fd gallium: Add a cap bit for OpenCL-style extended integer functions
Iris will eventually enable this.  Looking at the header files, it looks
like Midgard could also enable it.  Basically, any GPU that fully
supports OpenCL can.

v2: Add default value to u_screen.c.  Suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
4e9079d0c7 i965: Enable INTEL_shader_integer_functions2 on Gen8+
v2: Use new lower_hadd64 and lower_usub_sat64 flags.

v3: Enable SPIR-V capability.

v4: Move lowering options to COMMON_SCALAR_OPTIONS.  Suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
4fcddb55f2 spirv: Add support for IntegerFunctions2INTEL capability
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
aa56934e2a spirv: Silence a bunch of unused parameter warnings
The change to get_uniform_nir_atomic_op make it look like the other
get_*_nir_atomic_op functions.  The rest just add UNUSED or ASSERTED
to parameters required for some of the interfaces.

src/compiler/spirv/spirv_to_nir.c: In function ‘struct_member_decoration_cb’:
src/compiler/spirv/spirv_to_nir.c:673:47: warning: unused parameter ‘val’ [-Wunused-parameter]
                             struct vtn_value *val, int member,
                                               ^~~
src/compiler/spirv/spirv_to_nir.c: In function ‘struct_member_matrix_stride_cb’:
src/compiler/spirv/spirv_to_nir.c:778:50: warning: unused parameter ‘val’ [-Wunused-parameter]
                                struct vtn_value *val, int member,
                                                  ^~~
src/compiler/spirv/spirv_to_nir.c: In function ‘type_decoration_cb’:
src/compiler/spirv/spirv_to_nir.c:805:61: warning: unused parameter ‘ctx’ [-Wunused-parameter]
                     const struct vtn_decoration *dec, void *ctx)
                                                             ^~~
src/compiler/spirv/spirv_to_nir.c: In function ‘spec_constant_decoration_cb’:
src/compiler/spirv/spirv_to_nir.c:1359:70: warning: unused parameter ‘v’ [-Wunused-parameter]
 spec_constant_decoration_cb(struct vtn_builder *b, struct vtn_value *v,
                                                                      ^
src/compiler/spirv/spirv_to_nir.c: In function ‘handle_workgroup_size_decoration_cb’:
src/compiler/spirv/spirv_to_nir.c:1407:43: warning: unused parameter ‘data’ [-Wunused-parameter]
                                     void *data)
                                           ^~~~
src/compiler/spirv/spirv_to_nir.c: In function ‘vtn_handle_function_call’:
src/compiler/spirv/spirv_to_nir.c:1806:55: warning: unused parameter ‘opcode’ [-Wunused-parameter]
 vtn_handle_function_call(struct vtn_builder *b, SpvOp opcode,
                                                       ^~~~~~
src/compiler/spirv/spirv_to_nir.c:1807:54: warning: unused parameter ‘count’ [-Wunused-parameter]
                          const uint32_t *w, unsigned count)
                                                      ^~~~~
src/compiler/spirv/spirv_to_nir.c: In function ‘get_uniform_nir_atomic_op’:
src/compiler/spirv/spirv_to_nir.c:2548:47: warning: unused parameter ‘b’ [-Wunused-parameter]
 get_uniform_nir_atomic_op(struct vtn_builder *b, SpvOp opcode)
                                               ^
src/compiler/spirv/spirv_to_nir.c: In function ‘vtn_handle_atomics’:
src/compiler/spirv/spirv_to_nir.c:2633:48: warning: unused parameter ‘count’ [-Wunused-parameter]
                    const uint32_t *w, unsigned count)
                                                ^~~~~
src/compiler/spirv/spirv_to_nir.c: In function ‘vtn_handle_barrier’:
src/compiler/spirv/spirv_to_nir.c:3197:48: warning: unused parameter ‘count’ [-Wunused-parameter]
                    const uint32_t *w, unsigned count)
                                                ^~~~~
src/compiler/spirv/spirv_to_nir.c: In function ‘vtn_handle_execution_mode’:
src/compiler/spirv/spirv_to_nir.c:3618:68: warning: unused parameter ‘data’ [-Wunused-parameter]
                           const struct vtn_decoration *mode, void *data)
                                                                    ^~~~

Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
44471a76e9 nir/spirv: Translate SPIR-V to NIR for new INTEL_shader_integer_functions2 opcodes
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")

v3: Add missing SpvOpUCountTrailingZerosINTEL case to switch in
vtn_handle_body_instruction. Remove stray semicolon in
vtn_nir_alu_op_for_spirv_opcode. Use umin instead of umax for
SpvOpUCountTrailingZerosINTEL "lowering" in vtn_handle_alu.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
de6c0f8487 intel/fs: Implement support for NIR opcodes for INTEL_shader_integer_functions2
v2: Remove smashing type to D for nir_op_irhadd.  Caio noticed it was
odd, and removing it fixes an assertion failure in the crucible
func.shader.averageRounded.int64_t test (because the source should be
W).

v3: Emit BRW_OPCODE_MUL directly for nir_op_umul_32x16 and
nir_op_imul_32x16.  Suggested by Curro.

v4: Smash types of MUL instruction generated for nir_op_umul_32x16 and
nir_op_imul_32x16.  With this change, I get the same assembly now as I
did with v2.

v5: Remove support for pre-Gen7.  The integer multiply path was
incorrect, and, since the extension isn't enabled pre-Gen7, there's no
way to test it.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
58907568ec intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops
v2: Add a big comment explaining the [IU]SUB_SAT lowering.  Suggested by
Caio.

v3: Use get_fpu_lowered_simd_width in get_lowered_simd_width.  Suggested
by Ken on IRC.

v4: Fix a typo in a comment.  Noticed by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
74cd0964d6 intel/fs: Don't lower integer multiplies that don't need lowering
v2: Move the check to fs_visitor::lower_integer_multiplication.
Previously the cases where lowering was skipped, the original
instruction was removed by fs_visitor::lower_integer_multiplication.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
db649fd582 compiler: Translate GLSL IR to NIR for new INTEL_shader_integer_functions2 expressions
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
d3d970166c nir/algebraic: Add lowering for 64-bit iadd_sat and isub_sat
v2: Rearranged and expand the comment about the optimizations applied to
the lowering.  Suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
dcadbd2dd2 nir/algebraic: Add lowering for 64-bit uadd_sat
Fixes piglit fs-addsaturate-uint64 and vs-addsaturate-uint64 on Ice
Lake.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
1bdfc6d7cb nir/algebraic: Add lowering for 64-bit usub_sat
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")

v3: Add a new lower_usub_sat64 flag that only applies to the 64-bit
version of the nir_op_usub_sat instruction.

v4: Also enable the lowering when nir_lower_iadd64 is set.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
a483771045 nir/algebraic: Add lowering for 64-bit hadd and rhadd
v2: Rebase on 272e927d0e ("nir/spirv: initial handling of OpenCL.std
extension opcodes")

v3: Add a new lower_hadd64 flag that only applies to the 64-bit versions
of the instructions.

v4: Also enable the lowering when nir_lower_iadd64 is set.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
ea435560ee nir/algebraic: Add lowering for uabs_usub and uabs_isub
v2: Remove some rebase failures noticed by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
21f0d020fe nir: Add new instructions for INTEL_shader_integer_functions2
uctz isn't added because it will implemented in the GLSL path and the
SPIR-V path using other pre-existing instructions.

v2: Avoid signed integer overflow for uabs_isub(0, INT_MIN).  Noticed by
Caio.

v3: Alternate fix for signed integer overflow for abs_sub(0, INT_MIN).
I tried the previous methon in a small test program with -ftrapv, and it
failed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
cb518df775 glsl: Add built-in functions for INTEL_shader_integer_functions2
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
5eda9f5832 glsl_types: Add function to get an unsigned base type from a signed type
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
1d165b0548 glsl: Add new expressions for INTEL_shader_integer_functions2
v2: Re-write iadd64_saturate and isub64_saturate to avoid undefined
overflow behavior.  Also fix copy-and-paste bug in isub64_saturate.
Suggested by Caio.

v3: Avoid signed integer overflow for abs_sub(0, INT_MIN).  Noticed by
Caio.

v4: Alternate fix for signed integer overflow for abs_sub(0, INT_MIN).
I tried the previous methon in a small test program with -ftrapv, and it
failed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
20d34c4ebf mesa: Extension boilerplate for INTEL_shader_integer_functions2
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Matt Turner
88a0523bd2 intel/compiler: Move Gen4/5 rounding to visitor
Gen4/5's rounding instructions operate differently than later Gens'.
They all return the floor of the input and the "Round-increment"
conditional modifier answers whether the result should be incremented by
1.0 to get the appropriate result for the operation (and thus its
behavior is determined by the round opcode; e.g., RNDZ vs RNDE).

Since this requires a second instruciton (a predicated ADD) that
consumes the result of the round instruction, the round instruction
cannot write its result directly to the (write-only) message registers.
By emitting the ADD in the generator, the backend thinks it's safe to
store the round's result directly to the message register file.

To avoid this, we move the emission of the ADD instruction to the NIR
translator so that the backend has the information it needs.

I suspect this also fixes code generated for RNDZ.SAT but since
Gen4/5 don't support GLSL 1.30 which adds the trunc() function, I
couldn't write a piglit test to confirm. My thinking is that if x=-0.5:

      sat(trunc(-0.5)) = 0.0

But on Gen4/5 where sat(trunc(x)) is implemented as

      rndz.r.f0  result, x             // result = floor(x)
                                       // set f0 if increment needed
      (+f0) add  result, result, 1.0   // fixup so result = trunc(x)

then putting saturate on both instructions will give the wrong result.

      floor(-0.5) = -1.0
      sat(floor(-0.5)) = 0.0
      // +1 increment would be needed since floor(-0.5) != trunc(-0.5)
      sat(sat(floor(-0.5)) + 1.0) = 1.0

Fixes: 6f394343b1 ("nir/algebraic: i2f(f2i()) -> trunc()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2355
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3459>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3459>
2020-01-22 23:47:02 +00:00
Samuel Thibault
2fd85105c6 meson: Do not require libdrm for DRI2 on hurd
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3231>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3231>
2020-01-22 23:15:05 +00:00
Samuel Thibault
4f52425159 util: Do not fail to build on unknown pthread_setname_np
This is only used for debugging, so better making porting on various systems
less hard.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3229>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3229>
2020-01-22 22:39:57 +00:00
Samuel Thibault
e45dc93136 loader: #define PATH_MAX when undefined (eg. Hurd)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3228>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3228>
2020-01-22 22:10:29 +00:00
Eric Engestrom
d60b8fd3cb util/atomic: fix return type of p_atomic_add_return() fallback
Fixes: 385d13f26d ("util/atomic: Add a _return variant of p_atomic_add")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3012>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3012>
2020-01-22 21:42:52 +00:00
James Xiong
ac0219cc5b gallium: dmabuf support for yuv formats that are not natively supported
V2 (Kenneth Graunke):
   added a helper function to check if every format is supported

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2846>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2846>
2020-01-22 21:18:49 +00:00
Emmanuel Gil Peyrot
5f78524d9b intel/compiler: Return early if read() failed
This was the only warning I could see while compiling Iris.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2821>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2821>
2020-01-22 20:52:47 +00:00
Alan Coopersmith
8490b7d917 intel/perf: adapt to platforms like Solaris without d_type in struct dirent
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
[Eric: factor out the is_dir_or_link() check and fix a bug in v1]
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
v3: include directory path when lstat'ing files
v4: fix inverted check in enumerate_sysfs_metrics()

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>
2020-01-22 20:23:51 +00:00
Eric Engestrom
8f140422ed llvmpipe: drop LLVM < 3.4 support
We don't support < 3.9 anymore, so this code is dead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2760>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2760>
2020-01-22 11:21:13 -08:00
Eric Engestrom
7d7d1da1ac egl: drop confusing mincore() error message
A user came to me asking how to fix this error, but it's entirely
expected that `get_wl_surface_proxy()` on recent enough wayland
compositors will always print it.
Let's just remove the message altogether, it is basically never useful.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3219>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3219>
2020-01-22 17:55:26 +00:00
Rhys Perry
15a1cc00d3 aco: fix off-by-one error when initializing sgpr_live_in
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2394
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511>
2020-01-22 17:23:30 +00:00
Samuel Pitoiset
bd51538d28 radv: fix double free corruption in radv_alloc_memory()
If the driver fails to allocate memory for some reasons, it shouldn't
free the 'mem' object twice.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2302
Fixes: 825ddfee59 ("radv: Handle device memory alloc failure with normal free.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3508>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3508>
2020-01-22 17:01:16 +00:00
Michel Dänzer
5a6a88f58c gitlab-ci: Use single if for manual job rules entry
I thought multiple ifs would all need to match, but looks like only the
last one (or either one?) does.

This should prevent a manual pipeline from getting created after merging
changes which can't affect the pipeline.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3474>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3474>
2020-01-22 16:42:11 +00:00
Michel Dänzer
2dd0cc60f1 gitlab-ci: Set GIT_STRATEGY to none for the dummy job
It doesn't need anything from the Git repository.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3474>
2020-01-22 16:42:11 +00:00
X512
eb40c0adfc util/u_thread: Fix build under Haiku 2020-01-22 16:21:54 +00:00
Alexander von Gluck IV
49d2a066c2 haiku/hgl: Fix build via header reordering 2020-01-22 16:21:54 +00:00
Rhys Perry
3f96a1ed86 aco: fix operand kill flags when a temporary is used more than once
Helps create v_mac_f32 from v_mad_f32(b, a, b)

Totals from affected shaders:
SGPRS: 35824 -> 35824 (0.00 %)
VGPRS: 33460 -> 33456 (-0.01 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2187264 -> 2180976 (-0.29 %) bytes
LDS: 127 -> 127 (0.00 %) blocks
Max Waves: 3802 -> 3802 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3486>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3486>
2020-01-22 15:55:00 +00:00
Boris Brezillon
5b810c7de3 panfrost/midgard: Add missing lowering passes for type/size conversion ops
Replace the manual type/size conversion lowering description by one
that's automatically generated and covers all type/size conversions.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
2020-01-22 15:31:28 +00:00
Boris Brezillon
fcceeaffae panfrost/midgard: Add 64 bits float <-> int converters
The 64 bit converter cases were missing, add them now.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
2020-01-22 15:31:28 +00:00
Boris Brezillon
fe5fbadd46 panfrost/midgard: Fix mir_print_instruction() for branch instructions
Branch instructions should not be treated as regular ALUs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
2020-01-22 15:31:28 +00:00
Boris Brezillon
e1f9e8d60b panfrost/midgard: Add f2f64 support
So we can convert floats into doubles.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
2020-01-22 15:31:28 +00:00
Boris Brezillon
f53a0799c7 panfrost/midgard: Factorize f2f and u2u handling
Those size conversion operations work the same way apart from f2f
using an fmov op code and u2u using an imov. Let's handle them in the
same case block to avoid code duplication.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
2020-01-22 15:31:28 +00:00
Boris Brezillon
6548d01b3d panfrost/midgard: Make sure promote_fmov() only promotes 32-bit imovs
mir_constant_float() assumes we're dealing with 32-bit integers/floats,
which is only the case if reg_mode is equal to midgard_reg_mode_32.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
2020-01-22 15:31:28 +00:00
Boris Brezillon
9566f26ed4 panfrost/midgard: Rework mir_adjust_constants() to make it type/size agnostic
Right now, constant combining is not supported in 16 bit mode, and 64
bit mode is simply ignored. Let's rework the function to make it
type/bit-size agnostic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
2020-01-22 15:31:28 +00:00
Boris Brezillon
15c92d158c panfrost/midgard: Use a union to manipulate embedded constants
Each instruction bundle can contain up to 16 constant bytes. The meaning
of those byte is instruction dependent: it depends on the instruction
native type (int, uint or float) and the instruction reg_mode (8, 16, 32
or 64 bit). Those different layouts can be exposed as a union to
facilitate constants manipulation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
2020-01-22 15:31:28 +00:00
Lionel Landwerlin
63461cb7e1 anv: ensure prog params are initialized with 0s
As a result of 9baa33cef0 our backend compiler leaves params pretty
much untouched. So in order to avoid storing uninitialized values in
the shader cache blobs, just 0 out this array.

I've considered not even allocating this array which works on gen8+
but the vec4 backend still makes a copy of this array and so it
crashes on memcpy on HSW.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9baa33cef0 ("anv: Rework push constant handling")
Reported-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3516>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3516>
2020-01-22 16:47:55 +02:00
Alyssa Rosenzweig
4936120230 panfrost: Fix crash in compute variant allocation
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: d8a3501f1b ("panfrost: Dynamically allocate shader variants")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3515>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3515>
2020-01-22 13:48:24 +00:00
Guido Günther
d817f2c696 etnaviv: drm: Don't miscalculate timeout
The current code overflows (s * 1000000000) for s >= 5 but that is
e.g. used in etna_bo_cpu_prep.

Signed-off-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3509>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3509>
2020-01-22 13:22:47 +00:00
Alexander van der Grinten
047162d99c egl: Fix _eglPointerIsDereferencable w/o mincore()
On platforms without mincore(), _eglPointerIsDereferencable()
currently just checks whether p != NULL. This is not sufficient:
In the Wayland platform code (i.e., in get_wl_surface_proxy()),
_eglPointerIsDereferencable() is called on the version field
of `struct wl_egl_window` which is 3 on current versions of
Wayland. This causes a segfault when trying to dereference p.

Fix this behavior by assuming that the first page of the
process is never dereferencable.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3103>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3103>
2020-01-22 12:55:05 +00:00
Tapani Pälli
39e7492d33 egl/android: fix buffer_count for applications setting max count
Problem with previous solution was that it did not take account that
some applications may set a max count for buffers. Therefore we need to
query both min and max and clamp our setting based on that.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2373
Fixes: be08e6a449 ("egl/android: Restrict minimum triple buffering for android color_buffers")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3480>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3480>
2020-01-22 10:37:04 +00:00
Timur Kristóf
1c9ecb2123 aco: Fix signedness compare warning.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>
2020-01-22 11:09:17 +01:00
Timur Kristóf
533a20dbd5 aco: Fix maybe-uninitialized warnings.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>
2020-01-22 11:09:14 +01:00
Timur Kristóf
6fb3df2786 aco: Fix -Wstringop-overflow warnings in aco_span.
GCC does not understand how aco_span works.
This patch fixes it by casting the aco_span's this pointer
to uintptr_t rather than to a char pointer, effectively
telling GCC not to try to figure it out.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>
2020-01-22 11:09:10 +01:00
Timur Kristóf
75e5720e1a radeon: Fix multiple definition error with radeon_debug
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>
2020-01-22 09:36:28 +01:00
Timur Kristóf
8e22df3aec gallium: Fix a couple of multiple definition warnings.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>
2020-01-22 09:36:25 +01:00
Timur Kristóf
a134ac5ee9 r600: Move get_pic_param to radeon_vce.c
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>
2020-01-22 09:36:23 +01:00
Timur Kristóf
b7f9759809 radeon: Move si_get_pic_param to radeon_vce.c
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>
2020-01-22 09:36:16 +01:00
Timur Kristóf
e45ea781f8 intel/compiler: Fix array bounds warning on GCC 10.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-22 08:35:18 +01:00
Eric Anholt
3abfde13be turnip: Add support for non-zero (still constant) UBO buffer indices.
This was actually all ready to go at this point, and just needed to
increment by the value.

Fixes dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_buffer.*

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3504>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3504>
2020-01-22 02:13:38 +00:00
Jonathan Marek
5f791df0d0 turnip: fix array/matrix varyings
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>
2020-01-21 20:36:08 -05:00
Jonathan Marek
c171765223 turnip: remove tu_sort_variables_by_location
nir_assign_io_var_locations already does sorting.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>
2020-01-21 20:36:08 -05:00
Jonathan Marek
1736447f27 freedreno/ir3: allow inputs with the same location
turnip can have multiple inputs with the same location, and different
location_frac.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>
2020-01-21 20:36:08 -05:00
Matt Turner
17c9ec94f5 gitlab-ci: Skip ext_timer_query/time-elapsed
This test's result is unpredictable, so it may occasionally pass when we
expect it to fail, thus causing the CI pipeline to fail.

Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3498>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3498>
2020-01-22 00:53:48 +00:00
Matt Turner
68cfc65ccb intel/compiler: Test compaction on Gen <= 12
With the previous commits we can now enable the unit test on Gen <= 12.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:21 +00:00
Matt Turner
22462ba242 intel/compiler: Validate fuzzed instructions
... before giving them to the instruction compactor.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:21 +00:00
Matt Turner
72cf63cfc6 intel/compiler: Add unit tests for new EU validation checks
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:21 +00:00
Matt Turner
5f4eacaeda intel/compiler: Validate some instruction word encodings
Specifically, execution size, register file, and register type. I did
not add validation for vertical stride and width because I don't believe
it's possible to have an otherwise valid instruction with an invalid
vertical stride or width, due to all of the other regioning
restrictions.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:21 +00:00
Matt Turner
0fc490cdee intel/compiler: Factor out brw_validate_instruction()
In order to fuzz test instructions, we first need to do some sanity
checking first. Factoring out this function allows us an easy way to
validate a single instruction.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:21 +00:00
Matt Turner
40f0ade68e intel/compiler: Handle invalid compacted immediates
16-bit immediates need to be replicated through the 32-bit immediate
field, so we should never see one that isn't.

This does happen however in the fuzzer unit test, so returning false
allows the fuzzer to reject this case.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:21 +00:00
Matt Turner
205cb8a139 intel/compiler: Handle invalid inputs to brw_reg_type_to_*()
Necessary to handle these cases when we test fuzzed instructions.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:21 +00:00
Matt Turner
741cf9a104 intel/compiler: Split hw_type tables
Previously we were sharing tables between generations that were nearly
identical (i.e., Gen8 3-src adds HF support) and used a small bit of
code to handle the differences. This is kind of a mess if you want to
reject 64-bit types on platforms that don't support 64-bit types, so
split the tables, allowing each generation's table to list exactly what
it supports.

Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:21 +00:00
Matt Turner
0b70d46f7a intel/compiler: Add a INVALID_{,HW_}REG_TYPE macros
Since the enum brw_reg_type is packed, comparisons with -1 don't work
directly, necessitating the cast. Add a macro to avoid this confusion.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:20 +00:00
Matt Turner
ab7c25b9aa intel/compiler: Add NF some more places
Necessary to handle these cases when we test fuzzed instructions.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:20 +00:00
Matt Turner
8634286c5d intel/compiler: Limit compaction unit tests to specific gens
Two of the tests emit instructions with MRF destinations, and MRFs
aren't present on Gen7+. I think we were just lucky that this didn't
cause a problem earlier since we were running the tests on Gen7-9.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:20 +00:00
Matt Turner
713c123bfa intel/compiler: Don't disassemble align1 3-src operands on Gen < 10
Since the platforms don't support align1 3-src instructions, the
contents of these operands are not going to be meaningful. Just don't
print them to avoid hitting some assertions in brw_inst functions.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:20 +00:00
Matt Turner
49c21802cb intel/compiler: Split has_64bit_types into float/int
Gen7 has 64-bit floats but not 64-bit ints.

Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:20 +00:00
Matt Turner
bb47aa2124 intel/compiler: Extract GEN_* macros into separate file
Will be used by the instruction compaction unit test.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:20 +00:00
Matt Turner
c69f3ece61 intel/compiler: Use ARRAY_SIZE()
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
2020-01-22 00:19:20 +00:00
Caio Marcelo de Oliveira Filho
45164fc8c5 intel/fs: Don't emit control barrier if only one thread is used
When there's only one hardware thread (i.e. the dispatch width greater
or equal to the workgroup size), there's no need to use a barrier to
ensure all the invocations reach the same point in the shader, because
they are already running lock-step.

Results for SKL running Iris for shader-db tests with compute shaders

    total sends in shared programs: 18361 -> 18339 (-0.12%)
    sends in affected programs: 904 -> 882 (-2.43%)
    helped: 9
    HURT: 0
    helped stats (abs) min: 1 max: 5 x̄: 2.44 x̃: 2
    helped stats (rel) min: 0.84% max: 21.43% x̄: 7.82% x̃: 2.67%
    95% mean confidence interval for sends value: -3.31 -1.58
    95% mean confidence interval for sends %-change: -14.67% -0.97%
    Sends are helped.

Shaders from Aztec Ruins, Car Chase, Manhattan and DeusEx are helped.

Results for ICL and TGL are similar to SKL.

Results for BDW are similar to SKL except for DeusEx shader that has a
workgroup size 16 but in BDW picks the SIMD8.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>
2020-01-21 23:41:35 +00:00
Caio Marcelo de Oliveira Filho
4f431e870c intel/fs: Don't emit fence for shared memory if only one thread is used
When there's only one hardware thread (i.e. the dispatch width greater
or equal to the workgroup size), there's no need to synchronize shared
memory access (SLM) since all the requests from a single thread are
already synchronized.  In such case, we just add a scheduling fence.

To be able to identify that case for all platforms, move the handling
of platforms prior to Gen11 (which don't have a separate SLM fence)
after the optimization.

Results for SKL running Iris for shader-db tests with compute shaders

    total sends in shared programs: 18395 -> 18361 (-0.18%)
    sends in affected programs: 938 -> 904 (-3.62%)
    helped: 9
    HURT: 0
    helped stats (abs) min: 1 max: 5 x̄: 3.78 x̃: 4
    helped stats (rel) min: 1.56% max: 26.32% x̄: 10.33% x̃: 2.60%
    95% mean confidence interval for sends value: -4.85 -2.71
    95% mean confidence interval for sends %-change: -19.12% -1.54%
    Sends are helped.

Shaders from Aztec Ruins, Car Chase, Manhattan and DeusEx are helped.

Results for ICL and TGL are similar to SKL.

Results for BDW are similar to SKL except for DeusEx shader that has a
workgroup size 16 but in BDW picks the SIMD8.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>
2020-01-21 23:41:35 +00:00
Caio Marcelo de Oliveira Filho
ff5b74ef32 intel/fs: Add workgroup_size() helper
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>
2020-01-21 23:41:35 +00:00
Caio Marcelo de Oliveira Filho
18e72ee210 intel/fs: Add FS_OPCODE_SCHEDULING_FENCE
Like a SHADER_OPCODE_MEMORY_FENCE but doesn't doesn't generate any
assembly code.

Will be used when the compiler shouldn't reorder certain instructions
but there's no need to generate code for the HW to do it -- as the
ordering will be guaranteed by other means.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>
2020-01-21 23:41:35 +00:00
Dongwon Kim
9d964da19f gallium: check all planes' pipe formats in case of multi-samplers
Current code only checks whether first plane's format is supported
in case YUV format sampling is done by sampling each plane separately.
It would be safer to check other planes' as well.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2863>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2863>
2020-01-21 23:04:33 +00:00
Kenneth Graunke
d3a0d3a80b anv: Drop some workarounds that are no longer necessary
These workarounds are no longer required by 10th Gen hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3495>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3495>
2020-01-21 13:58:42 -08:00
Kenneth Graunke
311cab27e2 iris: Drop some workarounds which are no longer necessary
These workarounds are no longer required by 10th Gen hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3495>
2020-01-21 13:58:40 -08:00
Eric Anholt
d1166a3b3a turnip: Disable UBWC on images used as storage images.
The closed GL driver doesn't use UBWC on any storage images.  It does tile
mostly (skipping tiling on writeonly images, it seems), but for freedreno
we've been enabling tiling in all cases and it's fine.  We do need to
disable UBWC, as tests fail otherwise and just plugging in the equivalent
UBWC regs like we were setting up a texture isn't enough.

Fixes dEQP-VK.image.atomic_operations.*

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>
2020-01-21 19:29:59 +00:00
Eric Anholt
e5ce365cde turnip: Add limited support for storage images.
So far this doesn't handle the texture state-based storage image access
loads, and doesn't support descriptor arrays (same as SSBOs).  The texture
side is more tricky, since we have another remapping table to work around.

This is enough to get some of dEQP-VK.image.atomic_operations.* working.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>
2020-01-21 19:29:59 +00:00
Eric Anholt
85e424c591 turnip: Refactor the intrinsic lowering.
Too many things in one function, split them out based on the intrinsic.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>
2020-01-21 19:29:59 +00:00
Eric Anholt
3ac662e8df turnip: Fix some whitespace around binary operators.
Conforms to mesa style and the rest of turnip.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>
2020-01-21 19:29:59 +00:00
Eric Anholt
6c10af95c7 radeonsi: Drop PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS.
Now that we don't expose TGSI, we can stop exposing the flag.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
2020-01-21 19:04:22 +00:00
Eric Anholt
609a67461d r300: Remove a bunch of default handling of pipe caps.
u_screen will return 0 for all of these, which means that this is one
less driver to see in git grep when I'm checking who exposes a cap.
The exception is the texel/gather offsets and stream output
components, which will not be exposed since we don't expose the
corresponding GLSL version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
2020-01-21 19:04:22 +00:00
Eric Anholt
e7e034e1de r600: Remove a bunch of default handling of pipe caps.
u_screen will return 0 for all of these, which means that this is one
less driver to see in git grep when I'm checking who exposes a cap.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
2020-01-21 19:04:22 +00:00
Eric Anholt
3e1dd99adc radeonsi: Remove a bunch of default handling of pipe caps.
u_screen will return 0 for all of these, which means that this is one
less driver to see in git grep when I'm checking who exposes a cap.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
2020-01-21 19:04:22 +00:00
Lionel Landwerlin
e618951322 anv: don't report error with other vendor DRM devices
Enumeration should just skip unsupported DRM devices.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 34c8621c3b ("anv: Allow enumerating multiple physical devices")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2386
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3481>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3481>
2020-01-21 18:36:26 +00:00
Eric Anholt
fb6fca0037 freedreno: Stop scattered remapping of SSBOs/images to IBOs.
Just make it be all SSBOs then all storage images.  The remapping table
was there to make it so that the big gap present from gallium's atomic
lowering would get cleaned up, but that's no longer case.  The table has
made it very hard to support Vulkan storage images, so it's time for it to
go.

This does mean that an SSBO/IBO that is only loaded (or size-queried) will
now occupy a slot in the table where it wouldn't before.  This seems like
a minor cost compared to being able to drop this much logic.

With the remapping table gone, SSBO array handling for turnip just falls
out.

Fixes many array cases of
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.*

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jonathan Marek <jonathan@marek.ca> (turnip)
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Eric Anholt
7558b5da13 compiler: Add a note about how num_ssbos works in the program info.
These numbers are always confusing, and it's particularly so for this
field where it has a different meaning in different info structs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Eric Anholt
d0975bfc4a nir: Drop the ssbo_offset to atomic lowering.
The arguments passed in were:
- prog->info.num_ssbos
- prog->nir->info.num_ssbos
- arbitrary values for standalone compilers

The num_ssbos should match between the prog's info and prog->nir's info
until this lowering happens.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Eric Anholt
d5a3971457 gallium: Pack the atomic counters just above the SSBOs.
We carve out half the SSBO space for atomics, and we were just binding
them way up there.  freedreno was then using a remapping table to map the
sparse buffer index back down, since space in the descriptor array is a
shared resource that may limit parallelism.  That remapping table
generated inside of the ir3 compiler is getting thoroughly in the way of
implementing vulkan descriptor sets.

We will be able to get rid of the freedreno's remapping table, and
hopefully save shared resources on other hardware, by packing the atomics
tightly above the SSBOs (like i965 does).  We already rebind the shader
buffers on program change if either the old or new program has SSBOs or
ABOs, so this doesn't necessarily increase the program state change cost
(the only cost increase I can come up with is if you're using the same
atomic counter without rebinding it across changes of programs with
varying SSBO counts, meaning it would now bounce around index space).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Eric Anholt
10dc4ac4c5 mesa: Make atomic lowering put atomics above SSBOs.
Gallium arbitrarily (it seems) put atomics below SSBOs, resulting in a
bunch of extra index management, and surprising shader code when you would
see your SSBOs up at index 16.  It makes a lot more sense to see atomics
converted to SSBOs appear as magic high numbers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Eric Anholt
2dc2055157 turnip: Refactor linkage state setup.
As I touch this for descriptor set reworks, I don't want to have to update
it twice.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Timur Kristóf
28eb481bc2 nouveau/nvc0: add extern keyword to nvc0_miptree_vtbl.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2020-01-21 17:36:36 +01:00
Tapani Pälli
5fede43fe0 anv: initialize clear_color_is_zero_one
Fixes following valgrind warning:

   ==12508== Conditional jump or move depends on uninitialised value(s)
   ==12508==    at 0x2CCD8B79: cmd_buffer_begin_subpass (genX_cmd_buffer.c:4599)
   ==12508==    by 0x2CCDA72B: gen9_CmdBeginRenderPass (genX_cmd_buffer.c:5275)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3487>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3487>
2020-01-21 17:47:30 +02:00
Boris Brezillon
9134f22df2 panfrost/midgard: Print the actual source register for store operations
Store operation use r26/r27 but have a word->reg set to 0 or 1 (base is
r26). Let's take this base offset into account in
print_load_store_instr().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3482>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3482>
2020-01-21 14:57:12 +00:00
Alyssa Rosenzweig
14b37ebd44 panfrost: Add pandecode entries for ASTC/ETC formats
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:23 -05:00
Icecream95
31bd3b5279 panfrost: Add ASTC texture formats
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:23 -05:00
Icecream95
960fe9daea panfrost: Add ETC1/ETC2 texture formats
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:23 -05:00
Alyssa Rosenzweig
2091d311c9 panfrost: Rework linear<--->tiled conversions
There's a lot going on here (it's a ton of commits squashed together
since otherwise this would be impossible to review...)

1. We have a fast path for linear->tiled for whole (aligned) tiles, but we
have to use a slow path for unaligned accesses. We can get a pretty
major win for partial updates by using this slow path simply on the
borders of the update region, and then hit the fast path for the
tile-aligned interior. This does require some shuffling.

2. Mark the LUTs constant, which allows the compiler to inline them,
which pairs well with loop unrolling (eliminating the memory accesses
and just becoming some immediates.. which are not as immediate on
aarch64 as I'd like..)

3. Add fast path for bpp1/2/8/16. These use the same algorithm and we
have native types for them, so may as well get the fast path.

4. Drop generic path for bpp != 1/2/8/16, since these formats are
generally awful and there's no way to tile them efficienctly and
honestly there's not a good reason too either. Lima doesn't support any
of these formats; Panfrost can make the opinionated choice to make them
linear.

5. Specialize the unaligned routines. They don't have to be fully
generic, they just can't assume alignment. So now they should be nearly
as fast as the aligned versions (which get some extra tricks to be even
faster but the difference might be neglible on some workloads).

6. Specialize also for the size of the tile, to allow 4x4 tiling as well
as 16x16 tiling. This allows compressed textures to be efficiently tiled
with the same routines (so we add support for tiling ASTC/ETC textures
while we're at it)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:19 -05:00
Alyssa Rosenzweig
f2d876b2b2 panfrost,lima: De-Galliumize tiling routines
There's an implicit dependence on Gallium here that will add more
complexity than needed when testing/optimizing out of driver as well as
potentially Vulkanizing. We don't need a full pipe_box, just the x/y/w/h
properties directly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:16 -05:00
Alyssa Rosenzweig
0ca7ab1c97 panfrost: Compile tiling routines with -O3
These are major hot spots for panfrost and lima; better let the compiler
do its thing even on debug builds.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:01 -05:00
Bas Nieuwenhuizen
bd4380c63c radv: Remove syncobj_handle variable in header.
I strongly suspect it was supposed to be a typedef. However, used
nowhere, we should remove it.

Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2385
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3479>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3479>
2020-01-21 12:28:00 +00:00
Neil Armstrong
dc594c95dd gitlab-ci/lava: add pipeline information in the lava job name
In order to have more informations in the LAVA jobs list, add the
current pipeline URL and commit ref name in the LAVA job name.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2337>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2337>
2020-01-21 11:29:36 +00:00
Jan Zielinski
a24b3b228a gallium/gallivm: enable linking lp_bld_printf function with C++ code
To enable linking functions declared in lp_bld_printf.h file with C++,
we need to add appropriate macros to the header.

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3470>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3470>
2020-01-21 11:00:18 +00:00
Danylo Piliaiev
3f9a6011a6 iris: Fix value of out-of-bounds accesses for vertex attributes
Having VERTEX_BUFFER_STATE.BufferSize greater than the size of
a bound vertex buffer allows shader to read uninitialized vertex
attributes from BO, instead of allowing hardware to return zeroes
on out-of-bounds access.

OpenGL spec "6.4 Effects of Accessing Outside Buffer Bounds" says:

"Robust buffer access can be enabled by creating a context with robust access
 enabled through the window system binding APIs. When enabled, any command
 unable to generate a GL error as described above, such as buffer object accesses
 from the active program, will not read or modify memory outside of the data
 store of the buffer object and will not result in GL interruption or termination.
 Out-of-bounds reads may return values from within the buffer object or zero
 values."

Fixes three webgl tests:
 conformance/rendering/out-of-bounds-array-buffers.html
 conformance2/rendering/out-of-bounds-index-buffers-after-copying.html
 conformance2/rendering/element-index-uint.html

See #1996

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3427>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3427>
2020-01-21 09:52:40 +00:00
Vasily Khoruzhick
e470116aac ci: Re-enable CI for lima on mali450
Amend fails and skips lists basing on lists from Andreas Baierl,
shard mali400 job across two devices since it takes close to 10min
and rename jobs to lima-mali400-test and lima-mali450-test.

Also don't set MESA_GLES_VERSION_OVERRIDE=3.0 for lima since we don't support
GLES 3.0 and lower DEQP_PARALLEL to 3 for jobs on H3.

Keep mali400 jobs disabled atm since they take too much time to complete
and we also get some unexplicable failures in dEQP-GLES2.functional.default_vertex_attrib.*

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3163>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3163>
2020-01-21 09:33:57 +00:00
Vasily Khoruzhick
5e5b5348f6 ci: lava: pass CI_NODE_INDEX and CI_NODE_TOTAL to lava jobs
deqp-runner.sh uses it to determine whether we split job across multiple
devices and if we do what's the node index.

With this change we now can set 'parallel: N' in job description if we want
to split the job.

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3163>
2020-01-21 09:33:57 +00:00
Hyunjun Ko
26d93a7495 turnip: fix invalid VK_ERROR_OUT_OF_POOL_MEMORY
When VK_DESCRIPTOR_TYPE_SAMPLER is provided, it doesn't need to be
counted as a buffer count. Otherwise it leads to mismatch of allocated
buffer size, hitting VK_ERROR_OUT_OF_POOL_MEMORY finally.

Fixes: c39afe68f0

Also fixes amber tests:
./tests/cases/address_modes_float.amber
./tests/cases/address_modes_int.amber
./tests/cases/magfilter_linear.amber
./tests/cases/magfilter_nearest.amber

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2020-01-21 10:29:16 +01:00
Jan Vesely
87e1f8eca5 clover: Initialize Asm Parsers
Fixes piglits that use ADMGCN inline assembly:
	program@execute@calls
	program@execute@amdgcn-mubuf-negative-vaddr

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2020-01-21 01:39:08 +00:00
Jason Ekstrand
34c8621c3b anv: Allow enumerating multiple physical devices
Instead of having a single physical device in anv_instance, have a
linked list of them.  What we have now works today because we our GPUs
are build into the CPU and so you're guaranteed to only ever have one of
them.  One day, that will change and we want ANV to be ready.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
e963e151d8 anv: Re-arrange physical_device_init
This commit simply moves fetching the device info and checking if ANV
supports the device a bit higher up.  This way we fail earlier and it'll
make error checking easier in the next commit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
3ecfba388a anv: Drop separate chipset_id fields
This already exists in gen_device_info.  There's no reason to keep
duplicate copies.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
02044be23f anv: Move the physical device dispatch table to anv_instance
We don't actually have genX versions of any physical device level
commands so we don't need the trampoline versions and we don't need to
have a separate table per physical device.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
78ff747408 anv: Drop the instance pointer from anv_device
There are very few times when we actually want to fetch the instance
from the anv_device.  We can put up with a bit of pain there in exchange
for strongly discouraging people from doing this in general.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
f0519c9cf9 anv: Stop allocating WSI event fences off the instance
Fixes: 16eb390834 "anv: add VK_EXT_display_control to anv driver [v5]"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
1ec84bd208 anv: Take a device in anv_perf_warn
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
cb6ea77045 anv: Take an anv_device in vk_errorf
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Jason Ekstrand
70e8064e13 anv: Add an anv_physical_device field to anv_device
Having to always pull the physical device from the instance has been
annoying for almost as long as the driver has existed.  It also won't
work in a world where we ever have more than one physical device.  This
commit adds a new field called "physical" to anv_device and switches
every location where we use device->instance->physicalDevice to use the
new field instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
2020-01-20 22:08:52 +00:00
Marek Olšák
735a3ba007 radeonsi/gfx10: enable GS fast launch for triangles and strips with NGG culling
Only non-indexed triangle lists and strips are supported. This increases
performance if there is something to cull.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
c377f45c18 radeonsi/gfx10: rewrite late alloc computation
- Use conservative late alloc when the number of CUs <= 6.
- Move the late alloc GS register to the GS shader state, so that it can be
  tuned for NGG culling.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
4e4b2d13f0 ac: add helper ac_build_triangle_strip_indices_to_triangle
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
8db00a51f8 radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
aa2d846604 radeonsi/gfx10: move GE_PC_ALLOC setting to shader states
The value is not changed. I just use a different way to compute it.

The value will vary with NGG culling.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
41fef6fc09 radeonsi/gfx10: don't initialize VGPRs not used by NGG passthrough
v2: TES doesn't use the GS PrimitiveID

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
943d131e7d radeonsi/gfx10: merge main and pos/param export IF blocks into one if possible
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
a966729c84 radeonsi/gfx10: export primitives at the beginning of VS/TES
This decreases VGPR usage and will allow us to merge some IF blocks
in shaders.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
5a0fcf11f0 radeonsi/gfx10: move s_sendmsg gs_alloc_req to the beginning of shaders
This will allow us to merge some IF blocks in shaders.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
cf9f8d1ea2 radeonsi/gfx10: correct VS PrimitiveID implementation for NGG
We didn't use the correct LDS pointer, though it probably doesn't matter,
because I think that nothing else is using LDS here.

This commit makes it consistent with all other esgs_ring use.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
b2326a7549 radeonsi/gfx10: update comments and remove invalid TODOs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
0f45d4dc2b ac: add ac_build_readlane without optimization barrier
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
77393cf39b ac: add prefix bitcount functions
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
679b6244e1 radeonsi: turn an assertion into return in si_nir_store_output_tcs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:13 -05:00
Marek Olšák
27cc7703d3 radeonsi: fix doubles and int64
Fixes: 57bd73e229 - radeonsi: remove llvm_type_is_64bit

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:10 -05:00
Marek Olšák
df34fa14bb radeonsi: don't invoke decompression inside internal launch_grid
Decompress resources properly but don't do it inside launch_grid
to prevent recursion.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
2020-01-20 15:40:08 -05:00
Marek Olšák
58c929be0d radeonsi: clean up how internal compute dispatches are handled
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
2020-01-20 15:40:07 -05:00
Marek Olšák
d69483270e Revert "radeonsi: unbind image before compute clear"
This reverts commit 3a527eda7c.

It's incorrect.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:05 -05:00
Samuel Pitoiset
dbdf3b3ef9 aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6
GFX6 doesn't have FLAT instructions which means we have to emit
a 64-bit MUBUF load.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
9e2fde84fc aco: add new addr64 bit to MUBUF instructions on GFX6-GFX7
According to the different ISA docs (and to LLVM), this bit seems
to only exists on GFX6-GFX7.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
fe9157a700 aco: do not use the vec3 variant for loads on GFX6
GFX6 only supports vec3 with load/store format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
1b5bb204d9 aco: do not use the vec3 variant for stores on GFX6
GFX6 only supports vec3 with load/store format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
b8abfafe86 aco: fix constant folding of SMRD instructions on GFX6
SMRD instructions have an 8-bit dword offset on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Jason Ekstrand
dd92179a72 anv: Canonicalize buffer formats for image/buffer copies
Some formats, in particular YCbCr formats and ASTC have additional
restrictions.  We already whack ASTC formats to RGBA32_UINT because the
hardware doesn't allow LINEAR with ASTC.  However, we need to fix YCbCr
formats as well because they come with alignment restrictions that we
can't guarantee are satisfied.  We're using blorp_copy to do the copies
so we may as well just stomp formats for everything.

Fixes: b24b93d584 "anv: enable VK_KHR_sampler_ycbcr_conversion"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460>
2020-01-20 16:08:17 +00:00
Jason Ekstrand
14c6e665f7 anv/blorp: Rename buffer image stride parameters
The new names fit better with the Vulkan names and don't pretend to be
an actual image extent.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460>
2020-01-20 16:08:17 +00:00
Daniel Stone
cf5fccb0d9 Revert "gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES"
This reverts commit bec9c90b5e.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>
2020-01-20 12:33:29 +00:00
Daniel Stone
32d45733ae Revert "st/dri: do FLUSH_VERTICES before calling flush_resource"
This reverts commit 3ba16d36c9.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>
2020-01-20 12:33:22 +00:00
Rhys Perry
29bfe18abd aco: fix fall-through test in try_remove_simple_block() with back-edges
3bca0af2 enhanced empty block determination which exposed this bug and
created an infinite loop in a Guild Wars 2 shader.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 3bca0af25d
     ('aco: ignore parallelcopies to the same register on jump threading')

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2364
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452>
2020-01-20 11:51:45 +00:00
Krzysztof Raszkowski
afb75e71e0 docs/GL4: update gallium/swr features
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2020-01-20 11:37:16 +00:00
Rhys Perry
e151398de6 aco: fix stack buffer overflow in apply_sgprs()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: cef7879719 ('aco: rewrite apply_sgprs()')
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2361
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>
2020-01-20 11:13:11 +00:00
Tapani Pälli
9b2ccd6a0e anv: add assert for isl_mod_info in choose_isl_tiling_flags
CID: 1457859
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469>
2020-01-20 12:12:29 +02:00
Tapani Pälli
8eebdd594b anv: fix assert in GetImageDrmFormatModifierPropertiesEXT
CID: 1457861
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469>
2020-01-20 12:11:43 +02:00
Tapani Pälli
31feae1c21 isl/gen12: add reminder comment about missing WA with 3D surfaces
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3441>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3441>
2020-01-20 08:06:19 +02:00
Icecream95
d8a3501f1b panfrost: Dynamically allocate shader variants
This fixes a crash in LZDoom where over 16 shader variants are needed
for a few shaders in some maps, and should also save a few kilobytes
of RAM as most of the time only one or two variants of the 8 previously
allocated are actually needed.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-18 11:47:34 -05:00
Alyssa Rosenzweig
bef716b56c panfrost: Expose some functionality with dEQP flag
These features are stable enough that they don't need to be hidden.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3464>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3464>
2020-01-18 14:57:52 +00:00
Alyssa Rosenzweig
4af8d5b064 pan/midgard: Fix recursive csel scheduling
Corner case causing invalid scheduling on shaders with nested csels,
i.e. GLSL code resembling:

   (foo ? bool1 : bool2) ? x : y

By explicitly disallowing csels this is fixed.

Fixes INSTR_INVALID_ENC on a glamor shader (noticeable with slowdown and
visual corruption when scrolling "too far" on GTK apps).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463>
2020-01-18 14:40:05 +00:00
Alyssa Rosenzweig
564a782ff7 panfrost: Identify un/pack colour opcodes
We still need to identify formats in the disassembler, but this will at
least get the opcode name clear.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>
2020-01-18 14:18:48 +00:00
Alyssa Rosenzweig
13c32e5fed pan/midgard: Bytemasks should round up, not round down
Otherwise we'll lost components in DCE.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>
2020-01-18 14:18:48 +00:00
Icecream95
5e8386c606 panfrost: Compact the bo_access readers array
Previously, the array bo_access->readers was only cleared when there
were no unsignaled fences, which in some situations never happened.

That resulted in the array having thousands of NULL pointers, but only
a handful of active readers.

With this patch, all the unsignaled readers are moved to the front of
the array, effectively building a new array only containing the active
readers in-place. This results in the readers array usually only having
a couple of elements.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3419>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3419>
2020-01-18 13:58:43 +00:00
Erik Faye-Lund
c0ba9000d2 zink: support arrays of samplers
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
a9023ec566 zink: support sampling non-float textures
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
3e1acff560 zink: store image-type per texture
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
5fc1562a72 zink: avoid incorrect vector-construction
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
8112240d29 zink: support offset-variants of texturing
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
f1a5bcdc16 zink: implement nir_texop_txs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
7ee94d1b21 docs: fixup indentation
The most canonical indentation-style here is two spaces, which is what
the standard boilerplate in all documents use. So let's normalize to
that.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:32 +01:00
Erik Faye-Lund
2ef989473a docs: remove pointless, stray newline
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:29 +01:00
Erik Faye-Lund
199572b65b docs: use [1] instead of asterisk for footnote
While we're at it, make it a link as well.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:25 +01:00
Erik Faye-Lund
063a28642e docs: remove trailing newlines
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:22 +01:00
Erik Faye-Lund
9954120b38 docs: remove leading spaces
There's no good reason to have leading space in these pre-formatted
blocks. It looks strange, so let's get rid of it.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:18 +01:00
Erik Faye-Lund
c871862744 docs: remove trailing header
This header has been there since the document was added, but contains
nothing. So let's get rid of it.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:14 +01:00
Erik Faye-Lund
37daddd3e4 docs: use figure/figcaption instead of tables
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:07 +01:00
Erik Faye-Lund
f5983a6eed docs: do not use definition-list for sub-topics
The dl-tag isn't a neat tool for defining sub-headings, it's a semantic
tool for defining definitions and their meaning. Let's insetad use
normal sub-headings instead.

To make the last few paragraphs stand out from the above, let's add a
sub-heading for those as well.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:38:56 +01:00
Rob Clark
95187083c4 freedreno/a6xx: add PROG_FB_RAST stateobj
For the handful of registers that depend on the union of program/
framebuffer/rasterizer state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
6dc9b292d0 freedreno/a6xx: move dynamic program state to streaming stateobj
Move the program state which we can't pre-bake to a streaming state
object, rather than emitting directly in the draw cmdstream.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
d2fd6469c3 freedreno/a6xx: drop a few more per-draw registers
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
4d8f42c851 freedreno/a6xx: separate rast stateobj for prim restart
This lets us move PC_PRIMITIVE_CNTL into the rasterizr stateobj, rather
than unconditionally emitting it directly in the cmdstream on every
draw.

This also starts adding some tracking about previous draw state, so that
following patches can limit some of the register writes we currently
emit on every draw.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
0e063b3079 freedreno/a6xx: cleanup rasterizer state
All but one of the reg values is only used in the stateobj, so we can
inline the register value setup and stateobj construction.  While we
are at it, switch over to the new register builders.

Prep work for next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
fba7e6f896 freedreno/a6xx: limit scratch/debug markers to debug builds
The overhead does seem to matter when you have a high enough # of draw
calls that effect few bins/pixels, because these writes would happen
unconditionally (ie. not part of a state-group).

Possibly we could keep these if we moved them into a state-group so the
register writes would be no-ops on bins with no geometry.  OTOH I
usually end up adding in a WFI when using them scratch reg values to
track down a crash.  (So add a WFI to mitigate the annoyance of needing
to use a debug build to get scratch regs to locate the position of a
crash/hang in the cmdstream.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Jordan Justen
5d7381c645 iris: Fix some indentation in iris_init_render_context
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2020-01-17 15:28:07 -08:00
C Stout
c1104e4cee util/vector: Fix u_vector_foreach when head rolls over
Also add unit tests for u_vector.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3453>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3453>
2020-01-17 22:21:00 +00:00
Francisco Jerez
b54b67e067 intel/fs: Switch to standard vector layout for barycentrics at optimization time.
This involves permuting the registers of barycentric vectors to have
the standard X[0-n] Y[0-n] layout at NIR translation time.
Barycentrics are converted to the format expected by the PLN
instruction in the lower_barycentrics() pass run after the
optimization loop.

Main reason is correctness of SIMD32 fragment shaders.  The
shuffle_from_pln_layout() and shuffle_to_pln_layout() helpers used
during NIR translation are busted for SIMD32.  This leads to serious
corruption at present with INTEL_DEBUG=do32, especially on Gen11+
where these helpers are hit more frequently due to the lack of a
hardware PLN instruction.

Of course one could have chosen to fix those helpers instead, but
there is another far more subtle issue that was reported during review
of the SIMD32 fragment shader codegen changes: The SIMD splitting pass
currently handles SIMD32 barycentric vectors as if they had the
standard X[0-n] Y[0-n] layout, even though they are interleaved for
the PLN instruction, which causes incorrect execution masks to be
applied to the MOVs unzipping barycentric vectors in cases where a
LINTERP instruction occurs under non-uniform control flow.

I'm not aware of any conformance regressions due to the latter issue
at present, but for our peace of mind let's move the conversion to the
PLN layout into the lower_barycentrics() pass run after
lower_simd_width().

This leads to the following shader-db improvements (including SIMD32
shaders) in combination with the previous back-end preparation changes
-- Without them (especially the copy propagation changes) this would
lead to a massive number of regressions.  On ICL:

   total instructions in shared programs: 20662316 -> 20466903 (-0.95%)
   instructions in affected programs: 10538474 -> 10343061 (-1.85%)
   helped: 68775
   HURT: 6

   total spills in shared programs: 8938 -> 8748 (-2.13%)
   spills in affected programs: 376 -> 186 (-50.53%)
   helped: 9
   HURT: 5

   total fills in shared programs: 8965 -> 8663 (-3.37%)
   fills in affected programs: 965 -> 663 (-31.30%)
   helped: 9
   HURT: 6

   LOST:   146
   GAINED: 43

On SKL:

   total instructions in shared programs: 18725867 -> 18614912 (-0.59%)
   instructions in affected programs: 3876590 -> 3765635 (-2.86%)
   helped: 27492
   HURT: 2

   LOST:   191
   GAINED: 417

On SNB:

   total instructions in shared programs: 14573613 -> 13980646 (-4.07%)
   instructions in affected programs: 5199074 -> 4606107 (-11.41%)
   helped: 29998
   HURT: 0

   LOST:   21
   GAINED: 30

Results are somewhat less impressive but still significant without
SIMD32 fragment shaders enabled.  On ICL:

   total instructions in shared programs: 16148728 -> 16061659 (-0.54%)
   instructions in affected programs: 6114788 -> 6027719 (-1.42%)
   helped: 42046
   HURT: 6

   total spills in shared programs: 8218 -> 8028 (-2.31%)
   spills in affected programs: 376 -> 186 (-50.53%)
   helped: 9
   HURT: 5

   total fills in shared programs: 8953 -> 8651 (-3.37%)
   fills in affected programs: 965 -> 663 (-31.30%)
   helped: 9
   HURT: 6

   LOST:   0
   GAINED: 3

On SKL:

   total instructions in shared programs: 14927994 -> 14926738 (-0.01%)
   instructions in affected programs: 168850 -> 167594 (-0.74%)
   helped: 711
   HURT: 2

On SNB:

   total instructions in shared programs: 10770538 -> 10734403 (-0.34%)
   instructions in affected programs: 2702172 -> 2666037 (-1.34%)
   helped: 17818
   HURT: 0

All of the hurt shaders are either spilling slightly more or emitting
additional NOP instructions due to the SIMD16 POW workaround for
Gen8-9 combined with differences in scheduling.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:23:12 -08:00
Francisco Jerez
79bd252d6e intel/fs: Introduce barycentric layout lowering pass.
The goal is to represent barycentrics with the standard vector layout
during optimization and particularly SIMD lowering.  Instead of
emitting the barycentric layout conversions at NIR translation time,
do it later as a lowering pass.  For the moment this is only applied
to PI messages, but we'll give the same treatment to LINTERP
instructions too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:59 -08:00
Francisco Jerez
44d7d66adc intel/fs: Split fetch_payload_reg() into separate helper for barycentrics.
We're about to change the layout of barycentric vectors, which will
involve permuting the GRFs of barycentrics fetched from the thread
payload.  Make room for this in a function separate from the generic
fetch_payload_reg(), since the permutation will only be applicable to
barycentric vectors.  This allows simplifying fetch_payload_reg(),
since there was no need for handling multiple-component payload
registers except for barycentrics.

This causes some minor shader-db noise due to the new helper emitting
a LOAD_PAYLOAD instruction unconditionally, but it will be cleaned up
shortly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:51 -08:00
Francisco Jerez
9c9e80103c intel/fs/gen6: Use SEL instead of bashing thread payload for unlit centroid workaround.
This prevents regressions on SNB due to the redundant MOVs lying
around in cases where fetch_payload_reg() returns a VGRF (currently
only in SIMD32 but soon in pretty much all cases).  The MOVs can't be
register-coalesced due to their source being a FIXED_GRF, and they
can't be copy-propagated either due to the unlit centroid workaround
partial writes.  They can be copy-propagated just fine into a SEL
instruction though.

On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:

   total instructions in shared programs: 13996898 -> 14001982 (0.04%)
   instructions in affected programs: 197461 -> 202545 (2.57%)
   helped: 0
   HURT: 1251

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:39 -08:00
Francisco Jerez
0dd18d70ae intel/fs/gen6: Generalize aligned_pairs_class to SIMD16 aligned barycentrics.
This is mainly meant to avoid shader-db regressions on SNB as we start
using VGRFs for barycentrics more frequently.  Currently the
aligned_pairs_class is only useful in SIMD8 mode, because in SIMD16
mode barycentric vectors are typically 4 GRFs.  This is not a problem
on Gen4-5, because on those platforms all VGRF allocations are
pair-aligned in SIMD16 mode.  However on Gen6 we end up using either
the fast or the slow path of LINTERP rather non-deterministically
based on the behavior of the register allocator.

Fix it by repurposing aligned_pairs_class to hold PLN-aligned
registers of whatever the natural size of a barycentric vector is in
the current dispatch width.

On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:

   total instructions in shared programs: 13983257 -> 14527274 (3.89%)
   instructions in affected programs: 1766255 -> 2310272 (30.80%)
   helped: 0
   HURT: 11608

   LOST:   26
   GAINED: 13

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:34 -08:00
Francisco Jerez
0db4455c1f intel/fs/gen6: Constrain barycentric source of LINTERP during bank conflict mitigation.
This avoids regressions on SNB due to the bank conflict mitigation
pass moving a VGRF-allocated barycentric vector to a misaligned
location, which would prevent the PLN instruction from being used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:29 -08:00
Francisco Jerez
369aef851d intel/fs/gen4-6: Allocate registers from aligned_pairs_class based on LINTERP use.
Previously we would hardcode fs_visitor::delta_xy barycentrics to be
allocated from aligned_pairs_class on hardware with PLN source
alignment restrictions (pre-Gen7).  Instead allocate any registers
consumed by LINTERP from aligned_pairs_class, even if some barycentric
vector had ended up in a temporary.

On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:

   total instructions in shared programs: 13983257 -> 14527274 (3.89%)
   instructions in affected programs: 1766255 -> 2310272 (30.80%)
   helped: 0
   HURT: 11608

   LOST:   26
   GAINED: 13

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:20 -08:00
Francisco Jerez
54b1b71e73 intel/fs: Allow limited copy propagation of a LOAD_PAYLOAD into another.
This is particularly useful in cases where register coalaesce is
unlikely to succeed because the LOAD_PAYLOAD isn't a plain copy --
E.g. when a LOAD_PAYLOAD is shuffling the contents of a barycentric
vector in order to transform it into the PLN layout.

This prevents the following shader-db regressions (including SIMD32
programs) in combination with the interpolation rework part of this
series.  On SKL:

   total instructions in shared programs: 18596672 -> 18976097 (2.04%)
   instructions in affected programs: 7937041 -> 8316466 (4.78%)
   helped: 39
   HURT: 67427

   LOST:   466
   GAINED: 220

On SNB:

   total instructions in shared programs: 13993866 -> 14202963 (1.49%)
   instructions in affected programs: 7611309 -> 7820406 (2.75%)
   helped: 624
   HURT: 52943

   LOST:   6
   GAINED: 18

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:22:09 -08:00
Francisco Jerez
8eb4f2092a intel/fs: Add support for copy-propagating a block of multiple FIXED_GRFs.
In cases where a LOAD_PAYLOAD instruction copies a single block of
sequential GRF registers into the destination (see
is_identity_payload()), splitting the block copy into a number of ACP
entries (one for each LOAD_PAYLOAD source) is undesirable, because
that prevents copy propagation into any instructions which read
multiple components at once with the same source (the barycentric
source of the LINTERP instruction is going to be the overwhelmingly
most common example).

Technically it would also be possible to do this for VGRF sources, but
there is little benefit from that since register coalesce already
covers many of those cases -- There is no way for a block of
FIXED_GRFs to be coalesced into a VGRF though.

This prevents the following shader-db regressions (including SIMD32
programs) in combination with the interpolation rework part of this
series.  On SKL:

   total instructions in shared programs: 18595160 -> 18828562 (1.26%)
   instructions in affected programs: 13374946 -> 13608348 (1.75%)
   helped: 7
   HURT: 108977

   total spills in shared programs: 9116 -> 9106 (-0.11%)
   spills in affected programs: 404 -> 394 (-2.48%)
   helped: 7
   HURT: 9

   total fills in shared programs: 8994 -> 9176 (2.02%)
   fills in affected programs: 898 -> 1080 (20.27%)
   helped: 7
   HURT: 9

   LOST:   469
   GAINED: 220

On SNB:

   total instructions in shared programs: 13996898 -> 14096222 (0.71%)
   instructions in affected programs: 8088546 -> 8187870 (1.23%)
   helped: 2
   HURT: 66520

   total spills in shared programs: 2985 -> 2961 (-0.80%)
   spills in affected programs: 632 -> 608 (-3.80%)
   helped: 2
   HURT: 0

   total fills in shared programs: 3144 -> 3128 (-0.51%)
   fills in affected programs: 1515 -> 1499 (-1.06%)
   helped: 2
   HURT: 0

   LOST:   0
   GAINED: 4

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:21:41 -08:00
Francisco Jerez
e328fbd9f8 intel/fs: Add partial support for copy-propagating FIXED_GRFs.
This will be useful for eliminating redundant copies from the FS
thread payload, particularly in SIMD32 programs.  For the moment we
only allow FIXED_GRFs with identity strides in order to avoid dealing
with composing the arbitrary bidimensional strides that FIXED_GRF
regions potentially have, which are rarely used at the IR level
anyway.

This enables the following commit allowing block-propagation of
FIXED_GRF LOAD_PAYLOAD copies, and prevents the following shader-db
regressions (including SIMD32 programs) in combination with the
interpolation rework part of this series.  On ICL:

   total instructions in shared programs: 20484665 -> 20529650 (0.22%)
   instructions in affected programs: 6031235 -> 6076220 (0.75%)
   helped: 5
   HURT: 42073

   total spills in shared programs: 8748 -> 8925 (2.02%)
   spills in affected programs: 186 -> 363 (95.16%)
   helped: 5
   HURT: 9

   total fills in shared programs: 8663 -> 8960 (3.43%)
   fills in affected programs: 647 -> 944 (45.90%)
   helped: 5
   HURT: 9

On SKL:

   total instructions in shared programs: 18937442 -> 19128162 (1.01%)
   instructions in affected programs: 8378187 -> 8568907 (2.28%)
   helped: 39
   HURT: 68176

   LOST:   1
   GAINED: 4

On SNB:

   total instructions in shared programs: 14094685 -> 14243499 (1.06%)
   instructions in affected programs: 7751062 -> 7899876 (1.92%)
   helped: 623
   HURT: 53586

   LOST:   7
   GAINED: 25

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:21:33 -08:00
Francisco Jerez
5153d06d92 intel/fs: Extend copy propagation dataflow analysis to copies with FIXED_GRF source.
This involves indexing the ACP tables used internally by
fs_copy_prop_dataflow::setup_initial_values() by reg_space() instead
of register number.  Both are nearly equivalent for virtual GRFs
(barring the single bit of entropy lost in the hash), and this makes
handling FIXED_GRFs straightforward.

Because we're only going to support FIXED_GRFs for the source of a
copy, this change is only strictly necessary during the second pass
that checks for source interference, but we also apply the same change
to the first pass for consistency.

Note that this shouldn't change the behavior of the copy propagation
pass until we start inserting FIXED_GRF entries into the ACP.  Even
then FIXED_GRF writes are extremely rare so this change will hardly
ever have an effect, but they aren't completely non-existing so we
need to handle them for correctness.

No functional nor shader-db changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:21:27 -08:00
Francisco Jerez
ab0d1b3b3d intel/fs: Rework fs_inst::is_copy_payload() into multiple classification helpers.
This reworks the current fs_inst::is_copy_payload() method into a
number of classification helpers with well-defined semantics.  This
will be useful later on in order to optimize LOAD_PAYLOAD instructions
more aggressively in cases where we can determine it's safe to do so.

The closest equivalent of the present fs_inst::is_copy_payload()
method is the is_coalescing_payload() helper introduced here.

No functional nor shader-db changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:21:19 -08:00
Francisco Jerez
1873202f44 intel/fs: Generalize fs_reg::is_contiguous() to register files other than VGRF.
No functional nor shader-db changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:20:59 -08:00
Francisco Jerez
d9a57c85cc intel/fs: Try to vectorize header setup in lower_load_payload().
In cases where LOAD_PAYLOAD is provided a pair of contiguous registers
as header sources, try to use a single SIMD16 instruction in order to
initialize them.  This is unlikely to affect the overall cycle count
of the shader, since the compressed instruction has twice the issue
time, except due to the reduced pressure on the instruction cache.

Main motivation is avoiding instruction-count regressions in
combination with the following copy propagation improvements, which
will allow the SIMD16 g0-1 header setup emitted for framebuffer writes
to be copy-propagated into its LOAD_PAYLOAD, leading to the emission
of two SIMD8 MOV instructions instead of a single SIMD16 MOV.

Reverting this commit on top of the copy propagation changes would
lead to the following shader-db regressions on SKL and other
platforms:

 total instructions in shared programs: 14926738 -> 14935415 (0.06%)
 instructions in affected programs: 1892445 -> 1901122 (0.46%)
 helped: 0
 HURT: 8676

Without the following copy propagation changes this doesn't have any
effect on shader-db on Gen7+, because we would typically set up the FB
write header with a separate SIMD16 MOV that isn't currently
copy-propagated into the LOAD_PAYLOAD, so the individual SIMD8 MOVs
result of LOAD_PAYLOAD lowering would get register-coalesced away
under normal circumstances.  However that wasn't the case for MRF
LOAD_PAYLOAD destinations on Gen6 and earlier, because register
coalesce only kicks in for GRFs, leaving a number of redundant SIMD8
MOVs lying around.  On SNB this leads to the following shader-db
improvements:

 total instructions in shared programs: 10770538 -> 10734681 (-0.33%)
 instructions in affected programs: 2700655 -> 2664798 (-1.33%)
 helped: 17791
 HURT: 0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:20:46 -08:00
Marek Olšák
3ba16d36c9 st/dri: do FLUSH_VERTICES before calling flush_resource 2020-01-17 15:04:35 -05:00
Marek Olšák
bec9c90b5e gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES 2020-01-17 15:04:35 -05:00
Lionel Landwerlin
ddb80f9276 anv: enable VK_KHR_swapchain_mutable_format
Enable new tests in dEQP-VK.image.swapchain_mutable.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand
4bdf8547f4 vulkan/wsi: Implement VK_KHR_swapchain_mutable_format
This is only the core WSI code for the extension.  It adds the image
format list and the flags to vkCreateImage as well as handling things
properly in the modifier queries.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand
a218f13278 vulkan/wsi: Filter modifiers with ImageFormatProperties
Just because a modifier is returned for the given format, that doesn't
mean it works with all usages and flags.  We need to filter the list by
calling vkGetPhysicalDeviceImageFormatProperties2.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand
210e68874b vulkan/wsi: Use the interface from the real modifiers extension
The anv implementation still isn't quite complete, but we can at least
start using the structs from the real extension.

v2: Fix circular pNext list (Lionel)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand
c78926b84d vulkan/wsi: Move the ImageCreateInfo higher up
Future changes will be easier if we can modify it based on whether or
not we're using modifiers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand
6790397346 anv: Support modifiers in GetImageFormatProperties2
Images with modifiers come with restrictions:

 1. They have to be simple 2D images right now

 2. They need to have a sensible format (not compressed, multi-plane, or
    non-power-of-two)

 3. If a CCS modifier is being requested, they have to actually support
    CCS_E and be CCS-compatible with any other formats the client may
    wish to use for image views.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Jason Ekstrand
44f5a92c0b anv: Drop some VK_IMAGE_TILING_OPTIMAL checks
The DRM format modifiers extension adds a TILING_DRM_FORMAT_MODIFIER
which will be used for modifiers so we can no longer use OPTIMAL to
indicate tiled inside the driver.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
2020-01-17 18:27:29 +00:00
Samuel Pitoiset
0099f85232 aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system
LLVM only supports GFX8+. Using CLRXdisasm works most of the time,
so it's useful to add support for it.

Original patch by Daniel Schürmann.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439>
2020-01-17 17:41:32 +00:00
Andres Rodriguez
51de5d5ac6 vulkan/wsi: disable the hardware cursor
Ensure the hardware cursor is disabled when we set the mode for a
VkDisplayKHR object. The extension doesn't expose any mechanisms to
program the hardware cursor, so we need to ensure it is hidden.

Currently, it seems like X is responsible for disabling the cursor
before handing over the lease. But that seems a little frail, and we
should be disabling the cursor ourselves so it works correctly
independently of how the lease was prepared for us.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1922>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1922>
2020-01-17 17:15:52 +00:00
Krzysztof Raszkowski
ad820d5aca gallium/swr: Disable showing detected arch message.
When swr driver is in use it print detected architecture
message to std::err. It can be harmfull when swr is using
in multinodes environments.
It can be enabled setting env var SWR_PRINT_INFO to 1.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2020-01-17 16:41:53 +00:00
Samuel Pitoiset
b9b393f0ce aco: fix emitting slc for MUBUF instructions on GFX6-GFX7
Same as GFX10, only GFX8/GFX9 moved that bit near the opcode.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437>
2020-01-17 16:56:04 +01:00
Boris Brezillon
6af63c939b panfrost/midgard: Fix swizzle for store instructions
The current logic considers that the nir_intrinsic_component(store_intr)
encodes the source components start, but it actually encodes the
destination one. Source component offset adjustment is taken care of in
install_registers_instr(), when offset_swizzle() is called.

This fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.45
when PAN_MESA_DEBUG=deqp (looks like exposing GLES3 features has an
impact on the varyings layout).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429>
2020-01-17 12:54:31 +00:00
Erik Faye-Lund
be95c816a7 docs: do not double-close link tag
Fixes: f8148d0cc1 "docs: remove mailing list as way of submitting patches"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:19:16 +01:00
Erik Faye-Lund
b009a7644b docs: remove double-closed definition-list
Fixes: bc17ac5866 "docs: add documentation for building with meson"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:19:10 +01:00
Erik Faye-Lund
b387f68f49 docs: move paragraph closing tag
The pre-tag right before is a block-level tag, which means it implicitly
terminates the paragraph. So there's no paragraph to close after this.
Instead, move the paragraph-closing before the pre-tag, to explicitly
close the paragraph.

Fixes: 41b3eb08d9 "docs: update meson docs for windows"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:19:03 +01:00
Erik Faye-Lund
a370cfd96e docs: use code-tags instead of pre-tags
Similar to the previous two commits, it seems more appropriate to use
code-tags here than pre-tag.

Fixes: 9af6c38def "docs: Add use of Closes: tag for closing gitlab issues"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:52 +01:00
Erik Faye-Lund
1de361e56b docs: use code-tags instead of pre-tags
Similar to the previous commit, code-tags seems more appropriate than
pre-tags here. So let's change it.

Fixes: ca0c1e69ca "docs: update releasing process to use new scripts and gitlab"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:48 +01:00
Erik Faye-Lund
36e0275275 docs: use code-tag instead of pre-tag
It's unlikely the author meant to use <pre>-here, as that starts a whole
new block. Instead, the inline code-tag seems more appropriate here.

Fixes: 41b3eb08d9 "docs: update meson docs for windows"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:42 +01:00
Erik Faye-Lund
f0677086a1 docs: open paragraph before closing it
Fixes: 44c5e634a5 "docs: update meson docs for windows"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:36 +01:00
Erik Faye-Lund
a0d25c4d87 docs: fix paragraphs
Paragraphs are terminated by pre-tags, so the latter one closes a new,
empty one. Let's split the paragraph in two around the pre-tag instead.

Fixes: c0dfe8c6df "docs: do not use div for line-breaking"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:25 +01:00
Erik Faye-Lund
750d664226 docs: fix typo in html tag name
Fixes: 5d11a828e1 "docs: update install docs for meson"
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
2020-01-17 13:18:06 +01:00
Pierre-Eric Pelloux-Prayer
5b1c4e1b75 util: call bind_sampler_states before setting sampler_views
Fixes the following valgrind error:

    Invalid read of size 16
       at 0x28F458A1: si_set_sampler_view_desc (in radeonsi_drv_video.so)
       by 0x28F4657E: si_set_sampler_views (in radeonsi_drv_video.so)
       by 0x28D62BF5: util_compute_blit (in radeonsi_drv_video.so)
       by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
       by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
       by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
     Address 0x18142a10 is 0 bytes inside a block of size 48 free'd
       at 0x48369AB: free (vg_replace_malloc.c:540)
       by 0x28D62D51: util_compute_blit (in radeonsi_drv_video.so)
       by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
       by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
       by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
     Block was alloc'd at
       at 0x4837B65: calloc (vg_replace_malloc.c:762)
       by 0x28EFB2EC: si_create_sampler_state (in radeonsi_drv_video.so)
       by 0x28D62C30: util_compute_blit (in radeonsi_drv_video.so)
       by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
       by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
       by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)

Fixes: 69430d7e59 ("va: use a compute shader for the blit")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2321
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>
2020-01-17 10:14:57 +01:00
Eric Anholt
d55573aac6 nir: Fix printing of ~0 .locations.
I kept wondering what "429" meant in variable declarations, when it was
just a truncated ~0 snprintf.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>
2020-01-16 23:29:10 +00:00
Eric Engestrom
65641e0c7a meson: use github URL for wraps instead of completely unreliable wrapdb
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3391>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3391>
2020-01-16 23:06:43 +00:00
Dylan Baker
d7cef7c67b docs: Update release calendar for 20.0
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3417>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3417>
2020-01-16 22:41:55 +00:00
Andreas Baierl
2ebfc6db16 lima: Fix alpha blending
Introduce separate helper functions to set the blendfactor bits.

Lima uses bits 0-2 for the type, bit 3 sets the inverted function
and bit 4 is set if alpha is used.
alpha_src_factor and alpha_dst_factor don't need the alpha bit, so
they are masked with 0xf. There is only place for 4 bits anyway.
If alpha_src_factor is PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE, we need
to change it to PIPE_BLENDFACTOR_ONE first.
This is exactly what the blob does and we pass all
dEQP-GLES2.functional.fragment_ops.blend.* tests now.
Better than the blob btw...

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
2020-01-16 16:43:41 +00:00
Daniel Schürmann
3bca0af25d aco: ignore parallelcopies to the same register on jump threading
The more conservative lowering to CSSA inserts unnecessary parallelcopies
which might get coalesced and can be ignored on jump threading.

v2: outline is_empty_block() check.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
2020-01-16 16:01:59 +01:00
Daniel Schürmann
427e5eeb02 aco: handle phi affinities transitively through parallelcopies
This can coalesce most unnecessarily inserted parallelcopies
from lowering to CSSA.

v2: refactor loop a bit to make it more efficient and readable.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
2020-01-16 16:01:59 +01:00
Daniel Schürmann
d098024c40 aco: rework lower_to_cssa()
This patch changes lower_to_cssa to be much more conservative
about assumptions which phi operands might interfere.
Previously, this pass wasn't exhaustive and could miss some corner cases.

v2: remove optimizations to find better insertion points as it's hard
to guarantee that they are always correct and have overall no benefit.

Fixes: 0b8216b2cd ('aco: Lower to CSSA')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
2020-01-16 16:01:59 +01:00
Samuel Pitoiset
300f8dec76 aco: implement stream output with vec3 on GFX6
GFX6 doesn't support vec3.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
2020-01-16 14:06:06 +00:00
Samuel Pitoiset
a445cb35bd aco: do not combine additions of DS instructions on GFX6
The offset field doesn't work as expected on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
2020-01-16 14:06:06 +00:00
Samuel Pitoiset
923005bf54 aco: do not select 96-bit/128-bit variants for ds_read/ds_write on GFX6
Only GFX7 and later support large ds_read/ds_write.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
2020-01-16 14:06:06 +00:00
Lionel Landwerlin
44ffeb4fee intel/perf: report query split for mdapi
Also forgotten in the initial implementation.

v2: Report begin timestamp scaled by the timestamp frequency (Windows
    behavior)

v3: Rename split to disjoint to match GL terminology (Tapani)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
2020-01-16 15:29:40 +02:00
Lionel Landwerlin
3bb8a4bfec intel/perf: expose timestamp begin for mdapi
This was forgotten in the initial implementation.

v2: ensure the value is written for both GL & Vulkan queries

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
2020-01-16 15:29:28 +02:00
Tapani Pälli
630cbb45ac anv: set depth stall enabled when depth flush enabled on gen12
This implements HW workaround #1409600907 for anv driver.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
2020-01-16 14:05:54 +02:00
Tapani Pälli
3cec148455 iris: set depth stall enabled when depth flush enabled on gen12
This implements HW workaround #1409600907 for iris driver.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
2020-01-16 14:05:54 +02:00
Lionel Landwerlin
308efbf2f3 anv: implement another workaround for non pipelined states
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
2020-01-16 11:51:30 +02:00
Lionel Landwerlin
9eca823cce iris: implement another workaround for non pipelined states
v2: add comment (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
2020-01-16 11:51:22 +02:00
Lionel Landwerlin
e6e5cbac04 iris: handle new PIPE_CONTROL field
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
2020-01-16 11:48:11 +02:00
Lionel Landwerlin
31f0af5568 genxml: add new Gen11+ PIPE_CONTROL field
PIPE_CONTROL gained a new field in its first DWORD on Gen11. We had no
use for it so far, but we start using it on Gen12.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
2020-01-16 11:48:04 +02:00
Kenneth Graunke
e3405f177b st/mesa: Allocate full miplevels if MaxLevel is explicitly set
Some applications explicitly call glTex[ture]Parameteri[v] to set
GL_TEXTURE_MAX_LEVEL and GL_TEXTURE_BASE_LEVEL before uploading any
texture data.  Core Mesa initializes MaxLevel to 1000, so if it isn't
that, we know they've set it.  (We check for < TEXTURE_MAX_LEVELS to
avoid hardcoding that value, however.)

If MaxLevel - BaseLevel > 0, then the app is trying to tell us that
this texture is going to have multiple miplevels.  In that case, go
ahead and allocate the space for it.

Avoids many resource_copy_region calls at texture finalization time
in the Civilization VI benchmark.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3401>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3401>
2020-01-16 00:06:54 -08:00
Samuel Pitoiset
68abc07317 aco: fix emitting SMEM instructions with no operands on GFX6-GFX7
Like s_memtime.

Fixes dEQP-VK.glsl.shader_clock.* on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3407>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3407>
2020-01-16 08:18:18 +01:00
Vasily Khoruzhick
e5226cff75 lima: fix handling of reverse depth range
Looks like we need to handle cases when near > far and near == far.
In first case we just need to swap near and far, and in second we
need subtract epsilon from near if it's not zero.

Fixes 10 tests in dEQP-GLES2.functional.depth_range.*

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>
2020-01-16 01:57:05 +00:00
Ilia Mirkin
784b84d308 nvc0: disable xfb's which don't have a stride
No stride / no attributes means that nothing is being written to the
buffer. However it might still prevent primitives from being written out
to the other buffers. Disabling it entirely seems to fix it.

Fixes GTF-GL45.gtf30.GL3Tests.transform_feedback.transform_feedback_overflow

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2020-01-15 19:53:18 -05:00
Erico Nunes
9bf210ba98 lima/ppir: implement full liveness analysis for regalloc
The existing liveness analysis in ppir still ultimately relies on a
single continuous live_in and live_out range per register and was
observed to be the bottleneck for register allocation on complicated
examples with several control flow blocks.
The use of live_in and live_out ranges was fine before ppir got control
flow, but now it ends up creating unnecessary interferences as live_in
and live_out ranges may span across entire blocks after blocks get
placed sequentially.

This new liveness analysis implementation generates a set of live
variables at each program point; before and after each instruction and
beginning and end of each block.
This is a global analysis and propagates the sets of live registers
across blocks independently of their sequence.
The resulting sets optimally represent all variables that cannot share a
register at each program point, so can be directly translated as
interferences to the register allocator.

Special care has to be taken with non-ssa registers. In order to
properly define their live range, their alive components also need to be
tracked. Therefore ppir can't use simple bitsets to keep track of live
registers.

The algorithm uses an auxiliary set data structure to keep track of the
live registers. The initial implementation used only trivial arrays,
however regalloc execution time was then prohibitive (>1minute on
Cortex-A53) on extreme benchmarks with hundreds of instructions,
hundreds of registers and several spilling iterations, mostly due to the
n^2 complexity to generate the interferences from the live sets. Since
the live registers set are only a very sparse subset of all registers at
each instruction, iterating only over this subset allows it to run very
fast again (a couple of seconds for the same benchmark).

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
2020-01-15 22:55:31 +00:00
Erico Nunes
7e2765fded lima/ppir: remove orphan load node after cloning
There are some cases in shades using control flow where the varying load
is cloned to every block, and then the original node is left orphan.
This is not harmful for program execution, but it complicates analysis
for register allocation as there is now a case of writing to a register
that is never read.
While ppir doesn't have a dead code elimination pass for its own
optimizations and it is not hard to detect when we cloned the last load,
let's remove it early.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
2020-01-15 22:55:31 +00:00
Kristian H. Kristensen
a3a73d116c iris: Print warning and return *out = NULL when fd to syncobj fails
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-15 14:47:46 -08:00
Kristian H. Kristensen
1ac138694b iris: Advertise PIPE_CAP_NATIVE_FENCE_FD
Enables EGL_ANDROID_native_fence_sync.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-15 14:47:46 -08:00
Kenneth Graunke
e9f9a944d3 iris: Fix export of fences that have already completed.
After flushing batches, iris_fence_flush() asks the kernel whether
each batch's last_syncpt has already signalled or not.  (The idea is
that either the compute or render batch may not have actually had any
work queued up, so last_syncpt there might have been signalled a long
time ago.)  If it's already completed, we don't bother to record it.

A strange corner is the case of repeated flushes.  For example, we
might flush for some reason, and hit a glFlush(), and hit SwapBuffers.
It's possible for all the batches to have been flushed previously, -and-
for them to have actually completed.  In this case, we'll see that there
are no syncobj's to wait on, and record fence->count == 0.

This works fine internally - fence_finish can see count == 0 and realize
that it doesn't need to wait, for example.  But when working with native
FDs, we may be asked to export a fence with count == 0.  So we need an
actual synchronization primitive we can hand off.  Because all of the
relevant batches had been signalled when creating the fence, we want the
new dummy fence to be signalled as well.

So we just make a signalled syncobj and export it.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2020-01-15 14:47:46 -08:00
Robert Foss
6b9fce5d9e android: Fix whitespace issue
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-15 22:30:17 +00:00
Robert Foss
62adb6522b panfrost: Prefix schedule_program to prevent collision
Currently the schedule_program implementation being used is picked
at compile time, which on the Android platform means that the
bifrost compiler & scheduler is used for all targets, including
midgard based hardware.

This commit disambiguates between the two schedule_program functions.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-15 22:30:17 +00:00
Marek Olšák
c4daf2b485 radeonsi: merge si_compile_llvm and si_llvm_compile functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
68586bdd21 radeonsi: remove useless #includes
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
30b14ba67e radeonsi: move code for shader resources into si_shader_llvm_resources.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
da2c12af4b radeonsi: move geometry shader code into si_shader_llvm_gs.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
57bd73e229 radeonsi: remove llvm_type_is_64bit
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
194449a405 radeonsi: move tessellation shader code into si_shader_llvm_tess.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
d7c86b106c radeonsi: move si_insert_input_* functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
8ff8e68e42 radeonsi: work around an LLVM crash when using llvm.amdgcn.icmp.i64.i1
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338>
2020-01-15 20:17:23 +00:00
Marek Olšák
af3fbb410c radeonsi: fix si_build_wrapper_function for compute-based primitive culling
Fixes: 3b143369a5 "ac/nir, radv, radeonsi: Switch to using ac_shader_args"

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338>
2020-01-15 20:17:23 +00:00
Marek Olšák
6d4993c942 radeonsi/gfx10: separate code for determining the number of vertices for NGG
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:34 -05:00
Marek Olšák
7a25521f92 radeonsi/gfx10: separate code for getting edgeflags from the gs_invocation_id VGPR
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:33 -05:00
Marek Olšák
cf65c6f0d2 radeonsi: move VS_STATE.LS_OUT_PATCH_SIZE a few bits higher to make space there
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:31 -05:00
Marek Olšák
34ef0c5083 radeonsi: make si_insert_input_* functions non-static
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:29 -05:00
Marek Olšák
eeb4a11c11 ac/cull: don't read Position.Z if it's not needed for culling
It could be NULL.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:20 -05:00
Marek Olšák
8070402a30 radeonsi: separate code computing info for small primitive culling
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 14:59:11 -05:00
Kenneth Graunke
0a1c47074b intel/compiler: Fix illegal mutation in get_nir_image_intrinsic_image
get_nir_image_intrinsic_image() was incorrectly mutating the value held
by the register which holds the intrinsic's first source (image index).

If this happened to be the register for an SSA def which is also used
elsewhere in the program, this meant that we would clobber that value
in subsequent uses.

Note that this only affects i965, because neither anv nor iris use the
binding table start sections, so nothing is ever added here.

Fixes KHR-GL46.compute_shader.resources-max on i965 with Eric Anholt's
MR !3240 applied.  That MR reorders SSBOs and ABOs, so that test uses
image 0 and SSBO 0, causing this code to brilliantly add binding table
index 45 to both the image (correct) and the SSBO (bzzt, wrong!).

Fixes: 09f1de97a7 ("anv,i965: Lower away image derefs in the driver")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3404>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3404>
2020-01-15 19:25:35 +00:00
Rob Clark
b706a157c5 gitlab-ci: fix missing caselist.css/xsl
My best guess is that this was broken by d62dd8b0

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3413>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3413>
2020-01-15 19:03:56 +00:00
Jason Ekstrand
af6c2f4193 relnotes: Add Vulkan 1.2 2020-01-15 09:25:51 -06:00
Samuel Pitoiset
7f5462e349 radv: enable Vulkan 1.2
This bumps the Vulkan version to 1.2.128.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
68d6bead78 radv: implement Vulkan 1.2 features and properties
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b3033198a8 radv: implement Vulkan 1.1 features and properties
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
a09ab76828 radv: update VK_KHR_timeline_semaphore for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
fab0aa9182 radv: update VK_KHR_uniform_buffer_standard_layout for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
3ff8d12458 radv: update VK_KHR_shader_subgroup_extended_types for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
af25c8d57b radv: update VK_KHR_shader_float_controls for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
5335bb6c39 radv: update VK_KHR_shader_float16_int8 for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
a73d01b1db radv: update VK_KHR_shader_atomic_int64 for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
83d1773a57 radv: update VK_KHR_imageless_framebuffer for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b3bdb4e6ff radv: update VK_KHR_image_format_list for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
a80229941f radv: update VK_KHR_driver_properties for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
af883bf3dc radv: update VK_KHR_draw_indirect_count for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b537be4368 radv: update VK_KHR_depth_stencil_resolve for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
5993f13b27 radv: update VK_KHR_create_renderpass2 for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b2be00fbc1 radv: update VK_KHR_buffer_device_address for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
0eb26aae1c radv: update VK_KHR_8bit_storage for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b4eed4e548 radv: update VK_EXT_scalar_block_layout for Vulkan 1.2
Promoted to Vulkan 1.2 with the EXT suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
efdf9d8969 radv: update VK_EXT_sampler_filter_minmax for Vulkan 1.2
Promoted to Vulkan 1.2 with the EXT suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
65e215e6f3 radv: update VK_EXT_host_query_reset for Vulkan 1.2
Promoted to Vulkan 1.2 with the EXT suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
95ec0c050b radv: update VK_EXT_descriptor_indexing for Vulkan 1.2
Promoted to Vulkan 1.2 with the EXT suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Iván Briano
4ef3f7e3d3 anv: Enable Vulkan 1.2 support
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-15 08:34:57 -06:00
Jason Ekstrand
c616627f63 anv: Implement the new core version property queries
Vulkan 1.2 introduces some new structures to get the properties and
features of a device from extensions that were promoted to core in 1.1
and 1.2.  This commit implements the new property queries and makes all
of the corresponding extension queries map to them.

Reviewed-by: Iván Briano <ivan.briano@intel.com>
2020-01-15 08:34:57 -06:00
Jason Ekstrand
a47152c622 anv: Implement the new core version feature queries
Vulkan 1.2 introduces some new structures to get the properties and
features of a device from extensions that were promoted to core in 1.1
and 1.2.  This commit implements the new feature queries and makes all
of the corresponding extension queries map to them.

Reviewed-by: Iván Briano <ivan.briano@intel.com>
2020-01-15 08:34:57 -06:00
Jason Ekstrand
721666e52a anv,nir: Lower quad_broadcast with dynamic index in NIR
This is required for the subgroupBroadcastDynamicId feature that was
added in Vulkan 1.2.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-15 08:34:57 -06:00
Jason Ekstrand
7e3e2ce702 anv: Bump the patch version to 131 2020-01-15 08:34:57 -06:00
Samuel Pitoiset
f33a68af63 vulkan/overlay: Fix for Vulkan 1.2
v2 (Jason Ekstrand):
 - Add duplicate hooks for both the 1.2 and KHR versions of
   vkCmdDraw[Indexed]IndirectCount.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2020-01-15 08:34:57 -06:00
Jason Ekstrand
75755e0eba turnip: Pretend to support Vulkan 1.2
It doesn't really support any Vulkan properly yet so why not claim 1.2?
This was an easier way of fixing the build than trying to roll it
forward to a later version of ANV's entrypoint generator scripts.
2020-01-15 08:34:57 -06:00
Jason Ekstrand
ac0c7ad2c2 vulkan: Update the XML and headers to 1.2.131
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2020-01-15 08:07:04 -06:00
Michel Dänzer
8775b742ea gitlab-ci: Stop using manual jobs for merge requests
They were causing trouble with Marge Bot: The project settings require
that the pipeline succeeds before a merge request (MR) can be merged,
otherwise Marge doesn't wait for the pipeline to succeed before merging
an MR assigned to her. But Marge can't start manual jobs, so she would
always time out waiting for pipelines with manual jobs.

To avoid this, use these rules:
* Run the pipeline by default for MRs and main project branches changing
  any files affecting it.
* For other MRs, run a single dummy job which always succeeds.
* Don't run any jobs for main project branch changes (e.g. from an MR
  having been merged) not affecting the pipeline.
* Allow jobs to be started manually on branches of forked projects, as
  before.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3361>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3361>
2020-01-15 10:31:01 +00:00
Pierre-Eric Pelloux-Prayer
7b0b085c94 radeonsi: drop the negation from fmask_is_not_identity
This change eases code reading ("fmask_is_identity = true" is clearer than
"fmask_is_not_identity = false").
Initialization is not changed so fmask_is_identity is false when a texture is
created.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
2020-01-15 10:10:15 +00:00
Pierre-Eric Pelloux-Prayer
3a527eda7c radeonsi: unbind image before compute clear
It's not used and avoid infinite recursion when used from si_compute_expand_fmask

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
2020-01-15 10:10:15 +00:00
Pierre-Eric Pelloux-Prayer
c2df5389bb radeonsi: make sure fmask expand is done if needed
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2248
Fixes: 095a58204d ("radeonsi: expand FMASK before MSAA image stores are used")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
2020-01-15 10:10:15 +00:00
Pierre-Eric Pelloux-Prayer
b5e748b49b radeonsi: fix fmask expand compute shader
'coord' variable was using TGSI_WRITEMASK_XYZ so subsequent uses of
TGSI_WRITEMASK_W were dropped.
The result for a 2 samples program was:

  0: UMAD TEMP[0].xy, SV[1].xyyy, IMM[0].xxxx, SV[0].xyyy
  1: STORE IMAGE[0], TEMP[0], TEMP[1], RESTRICT, 2D_MSAA
  2: STORE IMAGE[0], TEMP[0], TEMP[2], RESTRICT, 2D_MSAA
  3: END

instead of the expected:

  0: UMAD TEMP[0].xy, SV[1].xyyy, IMM[0].xxxx, SV[0].xyyy
  1: MOV TEMP[0].w, IMM[0].yyyy
  2: LOAD TEMP[1], IMAGE[0], TEMP[0], RESTRICT, 2D_MSAA
  3: MOV TEMP[0].w, IMM[0].zzzz
  4: LOAD TEMP[2], IMAGE[0], TEMP[0], RESTRICT, 2D_MSAA
  5: MOV TEMP[0].w, IMM[0].yyyy
  6: STORE IMAGE[0], TEMP[0], TEMP[1], RESTRICT, 2D_MSAA
  7: MOV TEMP[0].w, IMM[0].zzzz
  8: STORE IMAGE[0], TEMP[0], TEMP[2], RESTRICT, 2D_MSAA
  9: END

This fixes half of https://gitlab.freedesktop.org/mesa/mesa/issues/2248

Fixes: 095a58204d ("radeonsi: expand FMASK before MSAA image stores are used")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
2020-01-15 10:10:15 +00:00
Nataraj Deshpande
be08e6a449 egl/android: Restrict minimum triple buffering for android color_buffers
The patch restricts triple buffering as minimum at driver for android
color_buffers in order to fix onscreen performance hit for T-Rex and
Manhattan.

v2: Update min_buffer check condition (Tapani Pälli)
v3: further code cleanup (Eric Engestrom)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2332
Fixes: 0661c357c6 ("egl/android: Update color_buffers querying for buffer age")
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3384>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3384>
2020-01-15 09:42:08 +00:00
Lionel Landwerlin
a014105498 anv: fix pipeline switch back for non pipelined states
Setting state base address can happen even before pipeline is
selected. Also we must ensure it is set to 3D for Gen12, we can't
switch back to an invalid pipeline value (UINT32_MAX).

v2: Reuse helpers (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b34422db5e ("anv: Implement Gen12 workaround for non pipelined state")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3396>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3396>
2020-01-15 11:14:43 +02:00
Samuel Pitoiset
fce28a7341 radv/gfx10: simplify some duplicated NGG GS code
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
2020-01-15 07:45:29 +00:00
Samuel Pitoiset
53b50be35c radv/gfx10: enable all CUs if NGG is never used
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
2020-01-15 07:45:29 +00:00
Samuel Pitoiset
5ff12322c9 radv: only use VkSamplerCreateInfo::compareOp if enabled
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2350
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3392>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3392>
2020-01-15 08:16:15 +01:00
Iago Toral Quiroga
3f3ec07be5 v3d: fix bug when checking result of syncobj fence import
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3383>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3383>
2020-01-15 07:53:58 +01:00
Jonathan Marek
222e127e39 st/mesa: run st_nir_lower_tex_src_plane for lowered xyuv/ayuv
Has the effect of removing the nir_tex_src_plane for these formats too.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1896>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1896>
2020-01-15 02:20:00 +00:00
Jonathan Marek
a554b45d73 st/mesa: don't lower YUV when driver supports it natively
This fixes YUYV support on etnaviv.

Fixes: 7404833c "gallium: add handling for YUV planar surfaces"

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1896>
2020-01-15 02:20:00 +00:00
Bas Nieuwenhuizen
4e3c81517b radv: Disable VK_EXT_sample_locations on GFX10.
Workaround for https://gitlab.freedesktop.org/mesa/mesa/issues/2163

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3236>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3236>
2020-01-15 01:54:27 +00:00
Gurchetan Singh
6c978b1362 st/mesa: implement EGLImageTargetTexStorage
We can now support this extension.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375>
2020-01-15 01:18:54 +00:00
Gurchetan Singh
2f1032f8f2 st/mesa: refactor egl image binding a bit
We'll need it for egl image tex storage.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375>
2020-01-15 01:18:54 +00:00
Gurchetan Singh
be347863ba st/dri: track if image is created by a dmabuf
Will be used by EXT_EGL_image_storage later.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375>
2020-01-15 01:18:54 +00:00
Rob Clark
2629cb627c freedreno/ir3: rename instructions
Turns out this range of opcodes are more general purpose if/else/endif
instructions.

We should re-work tess to create a basic block and use normal flow
control.  And possibly (for a6xx+) optimize cases to use if/else/endif
when appropriate.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3398>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3398>
2020-01-15 00:56:24 +00:00
Elie Tournier
22c5c54a4f nir/algebraic: sqrt(x)*sqrt(x) -> fabs(x)
total instructions in shared programs: 12840840 -> 12839341 (-0.01%)
instructions in affected programs: 122581 -> 121082 (-1.22%)
helped: 559
HURT: 0

total cycles in shared programs: 302505756 -> 302490031 (<.01%)
cycles in affected programs: 2022900 -> 2007175 (-0.78%)
helped: 1090
HURT: 130

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948>
2020-01-15 00:30:52 +00:00
Elie Tournier
6f394343b1 nir/algebraic: i2f(f2i()) -> trunc()
total instructions in shared programs: 12840968 -> 12840784 (<.01%)
instructions in affected programs: 17886 -> 17702 (-1.03%)
helped: 77
HURT: 0

total cycles in shared programs: 302508917 -> 302505592 (<.01%)
cycles in affected programs: 249964 -> 246639 (-1.33%)
helped: 70
HURT: 7

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948>
2020-01-15 00:30:52 +00:00
Eric Anholt
3d9a3d0be0 i965: Reuse the new core glsl_count_dword_slots().
The only difference I could see was treating interfaces like structs.
Maintain that case.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
2020-01-14 23:55:00 +00:00
Eric Anholt
bc4f089d01 mesa/st: Move the dword slot counting function to glsl_types as well.
To implement NIR-to-TGSI, we need to be able to get the size of the
uniform variable for the TGSI declaration, not just the
.driver_location.  With its location in mesa/st, drivers couldn't link
to it from nir-to-tgsi.

This feels like a common enough function to want, so let's share it in
the core compiler.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
2020-01-14 23:55:00 +00:00
Eric Anholt
4cabd4812a mesa/prog: Reuse count_vec4_slots() from ir_to_mesa.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
2020-01-14 23:55:00 +00:00
Eric Anholt
74ee3f76de mesa/st: Move the vec4 type size function into core GLSL types.
The only bit that gallium varied on was handling of bindless.  We can
retain previous behavior for count_attribute_slots() by passing in
"true" (though I suspect this is just giving a silly answer to a silly
question), and delete our recursive function from mesa/st.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
2020-01-14 23:55:00 +00:00
Eric Anholt
b807f7a43a mesa/st: Deduplicate the NIR uniform lowering code.
Just a little refactor as I go looking at the type size functions.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
2020-01-14 23:55:00 +00:00
Marek Olšák
8832a88434 radeonsi: move PS LLVM code into si_shader_llvm_ps.c
This is an attempt to clean up si_shader.c.

v2: don't move code that is not specific to LLVM

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
2020-01-14 18:46:07 -05:00
Marek Olšák
9b60b3ce93 radeonsi: remove always constant ballot_mask_bits from si_llvm_context_init
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
37916a66b1 radeonsi: fold si_create_function into si_llvm_create_func
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
42112010a3 radeonsi: rename si_shader_create -> si_create_shader_variant for clarity
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
63b5d85baa radeonsi: rename si_compile_tgsi_main -> si_build_main_function
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
f4ba457e1e radeonsi: clean up si_shader_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
03950473df radeonsi: merge si_tessctrl_info into si_shader_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
5fa2ab831e radeonsi: fork tgsi_shader_info and tgsi_tessctrl_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
18aaceae8d radeonsi: rename si_shader_info -> si_shader_binary_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
7f4a54d5bd radeonsi: remove TGSI from comments
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
b1badf4ad6 radeonsi: rename DBG_NO_TGSI -> DBG_NO_NIR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
b144d4be74 radeonsi: don't adjust depth and stencil PS output locations
this was for compatibility with TGSI

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Caio Marcelo de Oliveira Filho
3cc501be69 nir: Add missing nir_var_mem_global to various passes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>
2020-01-14 14:42:12 -08:00
Caio Marcelo de Oliveira Filho
d8440a3d2f spirv: Handle PhysicalStorageBuffer in memory barriers
PhysicalStorageBuffer is lowered to nir_var_mem_global, and
SPIR-V 1.5rev1 in section "3.25. Memory Semantics <id>" says

    UniformMemory

    Apply the memory-ordering constraints to StorageBuffer,
    PhysicalStorageBuffer, or Uniform Storage Class memory.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>
2020-01-14 14:42:12 -08:00
Caio Marcelo de Oliveira Filho
1ec0d4fdff spirv: Drop EXT for PhysicalStorageBuffer symbols
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>
2020-01-14 14:42:12 -08:00
Timur Kristóf
dfaa3c0af6 aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible.
When possible, get rid of an s_not when all it does is invert the SCC,
and its successor s_cbranch / s_cselect can be inverted instead.

Also modify some parts of instruction_selection to take advantage of
this feature.

Example:
s2: %3900,  s1: %3899:scc = s_andn2_b64 %0:exec, %406
s2: %3902 = s_cselect_b64 -1, 0, %3900:scc
s2: %407,  s1: %3903:scc = s_not_b64 %3902
s2: %3906,  s1: %3905:scc = s_and_b64 %407, %0:exec
p_cbranch_z %3905:scc
Can now be optimized to:
s2: %3900,  s1: %3899:scc = s_andn2_b64 %0:exec, %406
p_cbranch_nz %3900:scc

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
c0f82165a7 aco: Optimize out s_and with exec, when used on uniform bitwise values.
Previously all booleans needed an s_and with exec when they were turned
into a scalar condition. However, this is not needed for uniform booleans.

v2 by Daniel Schürmann:
- Make the code more readable
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
1c44129db3 aco: Don't skip combine_instruction when definitions[1] is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
338d03090f aco: Allow optimizing vote_all and nir_op_iand.
By adding an extra instruction, we can replace the operands of
the s_cselect_b64, which allows it to get picked up by the
optimizer when it looks for uniform booleans.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
d962bbd895 aco: Implement 64-bit constant propagation.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Alyssa Rosenzweig
6bd9c4dc57 panfrost: Fix linear depth textures
As pointed out by Boris, what we were calling PAN_LINEAR depth textures
was in fact u-interleaved tiled (!), but we never noticed since we
flipped the flag used for sampling, leading to all sorts of fun bugs
when attempting to directly acess depth textures from the CPU. Which
begs the question -- if what we called LINEAR was tiled, how do we
actually render linear depth textures? It turns out the flags for AFBC
form a mali_block_format 2-bit code just like their render-target
counterparts, so we can render to any of the above.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393>
2020-01-14 19:42:20 +00:00
Jason Ekstrand
7c16a1ae4e vulkan/wsi: Add a driconf option to force WSI to advertise BGRA8_UNORM first
The Aztec Ruins benchmark just grabs the first format in the list and
SRGB causes it to render washed out.  With this workaround, it renders
the same as OpenGL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>
2020-01-14 19:27:13 +00:00
Caio Marcelo de Oliveira Filho
edf6a40cb2 intel/fs: Only use SLM fence in compute shaders
Fixes: b390ff3517 ("intel/fs: Add support for SLM fence in Gen11")
Fixes: e142061399 ("intel/fs: Implement scoped_memory_barrier")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-14 10:55:48 -08:00
Marek Olšák
9e699ae690 radeonsi: actually enable VBOs in user SGPRs
Fixes: 363b4027fc - radeonsi: put up to 5 VBO descriptors into user SGPRs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-14 13:42:36 -05:00
Marek Olšák
f341db3e17 radeonsi: fix assertion and other failures in si_emit_graphics_shader_pointers
The assertion was failing.

Fixes: 363b4027fc - radeonsi: put up to 5 VBO descriptors into user SGPRs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-14 13:42:36 -05:00
Rhys Perry
cc3ef3643a nir/algebraic: a & ~(a >> 31) -> imax(a, 0)
Found in some Doom shaders

Totals from affected shaders:
SGPRS: 30056 -> 30064 (0.03 %)
VGPRS: 28024 -> 28024 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 4278648 -> 4270852 (-0.18 %) bytes
Max Waves: 1476 -> 1476 (0.00 %)
Instructions: 835287 -> 833338 (-0.23 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3089>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3089>
2020-01-14 17:54:40 +00:00
Marco Felsch
1607123ae7 etnaviv: Fix assert when try to accumulate an invalid fd
Check if it is a valid fd before merging it to the context's fd.

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3381>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3381>
2020-01-14 17:40:10 +00:00
Afonso Bordado
22217f24ec pan/midgard: Fix midgard_compile.h includes
We now use enum mali_format which is defined in panfrost-job.h

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>
2020-01-14 17:16:11 +00:00
Lionel Landwerlin
a19cdf989b anv: only use VkSamplerCreateInfo::compareOp if enabled
The spec says nothing about the validity of the compareOp field when
compareEnable is false.

v2: use vulkan enum to pick default value (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2350
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>
2020-01-14 16:40:16 +00:00
Rhys Perry
d8e05edbd9 nir/sink,nir/move: move/sink nir_op_mov
Can uncover opportunities to move other instructions. This can increase
register usage, but that doesn't seem to actually happen.

This optimizes a pattern of a load_per_vertex_input followed by several
moves and then a store_output in a different block.

v2: add nir_move_copies to make it optional

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Acked-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
2020-01-14 13:56:45 +00:00
Rhys Perry
04fac72ec7 nir/sink,nir/move: move/sink load_per_vertex_input
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
2020-01-14 13:56:45 +00:00
Tomeu Vizoso
22d976454f gitlab-ci: Consolidate container and build stages for LAVA
Use the normal build job to also prepare the artifacts for LAVA jobs.

For that, the build container needs to also build the test suites,
kernel, ramdisk, etc.

Then the build job will place the just-built Mesa in the ramdisk and the
test job can generate a LAVA job and point to those artifacts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3295>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3295>
2020-01-14 13:17:24 +00:00
Rhys Perry
f978e0e516 aco: add integer min/max to can_swap_operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f92a89a979 aco: improve readfirstlane after uniform LDS loads
Totals from affected shaders:
SGPRS: 976 -> 968 (-0.82 %)
VGPRS: 580 -> 584 (0.69 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 106032 -> 103076 (-2.79 %) bytes
Max Waves: 237 -> 237 (0.00 %)
Instructions: 19452 -> 18740 (-3.66 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
92ace0bb31 aco: replace extract_vector with copies
Helps a small number of small shaders with situations like this:
a = p_create_vector ...
b = p_extract_vector a, 3
and copy propagation can't be done

Totals from affected shaders:
SGPRS: 14304 -> 14416 (0.78 %)
VGPRS: 8716 -> 6592 (-24.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 184664 -> 176888 (-4.21 %) bytes
Max Waves: 6260 -> 6260 (0.00 %)
Instructions: 35561 -> 33617 (-5.47 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
20d869079d aco: allow input modifiers on v_cndmask_b32
Totals from affected shaders:
SGPRS: 594099 -> 594019 (-0.01 %)
VGPRS: 441016 -> 441124 (0.02 %)
Spilled SGPRs: 101 -> 101 (0.00 %)
Spilled VGPRs: 18 -> 18 (0.00 %)
Code Size: 30266652 -> 30125256 (-0.47 %) bytes
Max Waves: 67044 -> 67057 (0.02 %)
Instructions: 5753097 -> 5726607 (-0.46 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f9405ceb8a aco: don't move literal to reg when making an instruction VOP3 on GFX10
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 163398 -> 163398 (0.00 %)
VGPRS: 143820 -> 143820 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13065744 -> 13044308 (-0.16 %) bytes
Max Waves: 18921 -> 18921 (0.00 %)
Instructions: 2514644 -> 2509285 (-0.21 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
e686e4765e aco: add min(-max(), ) and max(-min(), ) optimization
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
fa8357eb70 aco: improve clamp optimization
Not sure why it checked the use count, it doesn't apply the constants.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 269409 -> 269745 (0.12 %)
VGPRS: 238120 -> 238132 (0.01 %)
Spilled SGPRs: 305 -> 305 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 22908584 -> 22904672 (-0.02 %) bytes
Max Waves: 20217 -> 20217 (0.00 %)
Instructions: 4275312 -> 4263869 (-0.27 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 155409 -> 155233 (-0.11 %)
VGPRS: 153072 -> 153072 (0.00 %)
Spilled SGPRs: 269 -> 269 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 14650824 -> 14650396 (-0.00 %) bytes
Max Waves: 9609 -> 9609 (0.00 %)
Instructions: 2762802 -> 2755517 (-0.26 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
edc888ccb1 aco: fix clamp optimization
We can't do the optimization if there are neg/abs in-between.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f664cb01ec aco: improve creation of v_madmk_f32/v_madak_f32
Using needs_vop3 check was flawed because it would only combine the
literal if the first operand is the literal. If the second or third
operand is the literal, then needs_vop3 will be true and the literal will
not be combined.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 782051 -> 782051 (0.00 %)
VGPRS: 630048 -> 630048 (0.00 %)
Spilled SGPRs: 195 -> 195 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 54743740 -> 54585548 (-0.29 %) bytes
Max Waves: 67340 -> 67340 (0.00 %)
Instructions: 10182030 -> 10182030 (0.00 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 701990 -> 699590 (-0.34 %)
VGPRS: 566632 -> 566784 (0.03 %)
Spilled SGPRs: 218 -> 218 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 49173564 -> 49007856 (-0.34 %) bytes
Max Waves: 59650 -> 59612 (-0.06 %)
Instructions: 9315135 -> 9293330 (-0.23 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
15e25da3e5 aco: take advantage of GFX10's constant bus limit and VOP3 literals
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 2397159 -> 2392494 (-0.19 %)
VGPRS: 1756036 -> 1753920 (-0.12 %)
Spilled SGPRs: 461 -> 470 (1.95 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 110287304 -> 109946304 (-0.31 %) bytes
Max Waves: 318341 -> 318475 (0.04 %)
Instructions: 21019327 -> 20533618 (-2.31 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
Max Waves: 0 -> 0 (0.00 %)
Instructions: 0 -> 0 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
9c2d37308f aco: allow an extra SGPR with multiple uses to be applied to VOP3
This is in a separate patch from the apply_sgprs() rewrite so that the
rewrite can be more easily tested.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 3056 -> 3056 (0.00 %)
VGPRS: 1632 -> 1632 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156468 -> 156304 (-0.10 %) bytes
Max Waves: 288 -> 288 (0.00 %)
Instructions: 29510 -> 29469 (-0.14 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 1616 -> 1616 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156132 -> 155968 (-0.11 %) bytes
Max Waves: 289 -> 289 (0.00 %)
Instructions: 29426 -> 29385 (-0.14 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f4c2c90e1a aco: allow applying two sgprs to an instruction
We could create VALU instructions which read two sgprs, but only if isel
created an instruction which already read one of them.

This change is in a separate patch from the apply_sgprs() rewrite so that
it can be tested if the rewrite affected anything.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1756 -> 1708 (-2.73 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 312 -> 300 (-3.85 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1784 -> 1736 (-2.69 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 319 -> 307 (-3.76 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
7da07ca3e4 aco: follow through temporary when merging tests into constant comparisons
This can happen with v_mov_b32(s_mov_b32(literal))

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77488 -> 76928 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14426 -> 14332 (-0.65 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77512 -> 76952 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14432 -> 14338 (-0.65 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
dc6c35e1c3 aco: be more careful with literals in combine_salu_{n2,lshl_add}
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
fcf52eb42d aco: add check_vop3_operands()
This will be useful when taking advantage of GFX10 features.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
cef7879719 aco: rewrite apply_sgprs()
This will make it easier to apply two different sgprs (for GFX10) or apply
the same sgpr twice (just remove the break).

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
0be7409069 aco: rewrite literal combining
Should make taking advantage of GFX10's increased constant bus limit and
VOP3 literals easier.

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
84b9f3786b aco: improve can_use_VOP3()
No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
3cb98ed939 aco: combine two sgprs into a VALU if they're the same
This was supposed to be done before but it wasn't done correctly and
everywhere.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 784680 -> 786128 (0.18 %)
VGPRS: 574012 -> 573892 (-0.02 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45477088 -> 45478172 (0.00 %) bytes
Max Waves: 81294 -> 81277 (-0.02 %)
Instructions: 8657970 -> 8622483 (-0.41 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 780664 -> 782072 (0.18 %)
VGPRS: 573880 -> 573760 (-0.02 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45445244 -> 45448340 (0.01 %) bytes
Max Waves: 81178 -> 81161 (-0.02 %)
Instructions: 8649902 -> 8614918 (-0.40 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
c240c1aecf aco: apply literals to split mads
Removing the return is also needed to apply literals to mads (which can be
done on GFX10).

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 368787 -> 367555 (-0.33 %)
VGPRS: 312436 -> 312448 (0.00 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26113388 -> 26098260 (-0.06 %) bytes
Max Waves: 35982 -> 35982 (0.00 %)
Instructions: 5038670 -> 5028941 (-0.19 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 369843 -> 368659 (-0.32 %)
VGPRS: 317224 -> 317196 (-0.01 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26310540 -> 26295156 (-0.06 %) bytes
Max Waves: 36324 -> 36326 (0.01 %)
Instructions: 5073957 -> 5064164 (-0.19 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
8f10e48745 aco: update IR validator
GFX10 increased the constant bus limit and allowed literals on VOP3

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:27 +00:00
Rhys Perry
1ffacc3ce1 nir/lower_gs_intrinsics: add option for per-stream counts
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422>
2020-01-14 12:11:14 +00:00
Rhys Perry
9fb0c2e033 nir/divergence: handle load_primitive_id in GS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2323>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2323>
2020-01-14 11:29:44 +00:00
Erik Faye-Lund
9aab36b6eb mesa/st: use float literals
This removes a warning on MSVC.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2020-01-14 12:01:29 +01:00
Erik Faye-Lund
fcdd3c866b gallium: fix a warning
On some platforms (like Win64), unsigned long is 32-bit, so the first
cast doesn't do anything, and the compiler complains about an implicit
cast to a smaller type. So let's cast to an uintptr_t instead first,
as that's large enough on all platforms.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2020-01-14 12:01:05 +01:00
Erik Faye-Lund
1a1e5a763a st/wgl: eliminate implicit cast warning
I get warnings on MSVC for these implicit casts. Let's use explicit
casts instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2020-01-14 12:01:00 +01:00
Erik Faye-Lund
d5c0fbfd78 util: initialize float-array with float-literals
We currently initialize this float-array with double-literals. Some
compilers generate warnings for this, so let's switch these to
float-literals instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2020-01-14 12:00:27 +01:00
Lionel Landwerlin
b34422db5e anv: Implement Gen12 workaround for non pipelined state
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
2020-01-14 11:52:36 +02:00
Lionel Landwerlin
b8fbb39ab2 iris: Implement Gen12 workaround for non pipelined state
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
2020-01-14 11:52:36 +02:00
Vasily Khoruzhick
55b0aa436e lima: add new findings to texture descriptor
Lower 8 bits of unknown_1_3 seems to be min_lod,
rest of 4 bits + miplevels are max_lod and min_mipfilter seems to be
lod bias. All are in fixed format with 4 bit integer and 4 bit fraction,
lod_bias also has sign bit.

Blob also seems to do some magic with lod_bias if min filter is nearest --
it adds 0.5 to lod_bias in this case. Same story when all filters are
nearest and mipmapping is enabled, but in this case it subtracts 1/16
from lod_bias.

Fixes 134 dEQP tests in dEQP-GLES2.functional.texture.*

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
2020-01-13 22:50:36 -08:00
Kenneth Graunke
a9bd0668d5 intel: Use similar brand strings to the Windows drivers
This updates our product name strings to match the ones reported
by the Windows driver, which is typically the marketing name.

We retain a platform abbreviation and GT level in parenthesis so that
we're able to distinguish similar parts more easily, helping us better
understand at a glance which GPU a bug reporter has.

Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
2020-01-13 19:42:35 -08:00
Kenneth Graunke
f63d6260d1 iris: Simplify iris_get_renderer_string()
We use gen_get_device_name() instead of PCI ID list munging.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
2020-01-13 19:42:30 -08:00
Kenneth Graunke
44bad9c31a i965: Simplify brw_get_renderer_string()
This stops using driGetRendererString() in favor of a simple snprintf().
This should have the same functionality on 64-bit systems, but drops
a "x86/MMX/SSE2" suffix on 32-bit systems.  (People shouldn't be using
the GL_RENDERER string to check for CPU features...)

We also use gen_get_device_name() instead of PCI ID list munging.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
2020-01-13 19:42:22 -08:00
Kenneth Graunke
50c47ba49e Revert "nir: assert that nir_lower_tex runs after lowering derefs"
This reverts commit 4cda61f11e for now,
as it appears to break i965 CI (32,000+ failures).  Rob and I suspect
we need to do the equivalent of 1c6a2efa06
on i965 - we are doing nir_lower_tex and brw_nir_lower_resources in the
wrong order and that's likely triggering this condition.  Once we fix
that, we should put this patch back.
2020-01-13 17:37:40 -08:00
Erik Faye-Lund
09b37ba65f zink: fixup initialization of operand_mask / num_extra_operands
This doesn't change behavior, but makes the code a bit easier to read.
Both values are zero, but I somehow swapped the logical meaning of them
when initializing.
2020-01-14 01:06:59 +00:00
Eric Anholt
3be4b89c03 mesa: Fix detection of invalidating both depth and stencil.
Fixes an extra 1024x1024x4 MSAA Z/S store on WebGL fishtank on cheza.

Reported-by: Dave Airlie <airlied@redhat.com>
Fixes: db2ae51121 ("mesa: Skip partial InvalidateFramebuffer of packed depth/stencil.")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3370>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3370>
2020-01-13 23:37:54 +00:00
Rob Clark
1c6a2efa06 mesa/st: lower samplers before nir_lower_tex
Fixes incorrect lowering of YUV samplers when there are non-yuv
samplers.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>
2020-01-13 23:19:49 +00:00
Rob Clark
4cda61f11e nir: assert that nir_lower_tex runs after lowering derefs
It isn't going to do the right thing, because texture_index/
sampler_index defaults to zero.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>
2020-01-13 23:19:49 +00:00
Gurchetan Singh
d72f178753 i965: support EXT_EGL_image_storage
i965 can support this.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-13 14:57:36 -08:00
Gurchetan Singh
b1c266d5fa i965: refactor intel_image_target_texture_2d
intel_image_target_texture_tex_storage can reuse much of this
code.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-13 14:57:32 -08:00
Gurchetan Singh
34fe560cd6 i965: track if image is created by a dmabuf
Will be used by EXT_EGL_image_storage later.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-13 14:57:27 -08:00
Gurchetan Singh
bf576772ab dri_util: add driImageFormatToSizedInternalGLFormat function
This is needed to implement the EXT_EGL_image_storage spec:

"If <target> is GL_TEXTURE_2D, then the resultant texture must have a
sized internal format which is colorspace and size compatible with the
dma-buf.  If the GL is unable to determine such a format, the error
INVALID_OPERATION is generated."

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-13 14:57:22 -08:00
Gurchetan Singh
b68ff2b873 glapi / teximage: implement EGLImageTargetTexStorageEXT
Check various parts of the EXT_EGL_image_storage spec, and add a
new vfunc for drivers implementing it.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-13 14:57:18 -08:00
Gurchetan Singh
1fe23d0e22 teximage: split out helper from EGLImageTargetTexture2DOES
The major differences between EXT_EGL_image_storage and
EGLImageTargetTexture2DOES are:

(1) The texture target is made immutable
(2) EXT_EGL_image_storage supports non-2D targets.

We can reuse EGLImageTargetTexture2D and FreeTextureImageBuffer
for (1) pretty easily.  For (2), let's just not support the
complicated targets.  Let's reuse aspects of the
EGLImageTargetTexture2DOES implementation.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-13 14:57:07 -08:00
Jason Ekstrand
7978f2401b anv: Memset array properties
This is probably better than possibly leaving those bytes uninitialized
even if the app will theoretically not use them.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
2020-01-13 22:33:55 +00:00
Jason Ekstrand
d36eed3e69 anv: Don't over-advertise descriptor indexing features
We should only advertise sub-features if we advertise the extension.

Fixes: 6e230d7607 "anv: Implement VK_EXT_descriptor_indexing"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
2020-01-13 22:33:55 +00:00
Jason Ekstrand
d7ff137445 intel/blorp: Fill out all the dwords of MI_ATOMIC
This makes us valgrind clean again.

Fixes: 9175c7058e "intel/blorp: Make blorp update the clear color..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3366>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3366>
2020-01-13 21:48:00 +00:00
Tomeu Vizoso
40dd418e14 gitlab-ci: Upgrade kernel for LAVA jobs to v5.5-rc5
Some fixes got in that should prevent hangs in lima jobs.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3363>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3363>
2020-01-13 21:26:11 +00:00
Daniel Schürmann
05c81875d7 aco: fix unconditional demote_to_helper
This patch fixes an out-of-bounds access on p_exit_early
and binds the exec register to the correct operand.

Fixes: 2ea9e59e8d ('aco: move s_andn2_b64 instructions out of the p_discard_if')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>
2020-01-13 21:08:41 +00:00
Marek Olšák
2bb88b2fdc radeonsi: don't enable VBOs in user SGPRs if compute-based culling can be used
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
363b4027fc radeonsi: put up to 5 VBO descriptors into user SGPRs
gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs

We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.

Totals from affected shaders:
SGPRS: 1110528 -> 1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 23766296 -> 22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
220d00314f ac,radeonsi: increase the maximum number of shader args and return values
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
ef253c6789 radeonsi: simplify si_set_vertex_buffers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
312e04689a radeonsi: don't allow draw calls with uninitialized VS inputs
These always hang, because vertex buffer descriptors are not set up.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
c278c73f13 radeonsi: add si_context::num_vertex_elements
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
1e03b63b3b radeonsi: rename desc_list_byte_size -> vb_desc_list_alloc_size
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Lionel Landwerlin
2cc14bd7b8 anv: set stencil layout for input attachments
If an input attachment has a stencil format, we need to set this.

v2: Fish out VkAttachmentReferenceStencilLayoutKHR from
    VkAttachmentReference2KHR::pNext (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: c1c346f166 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>
2020-01-13 21:57:33 +02:00
Jason Ekstrand
21bc16a723 anv: Drop an unused variable 2020-01-13 12:20:48 -06:00
Jason Ekstrand
d3737002ee nir/lower_atomics_to_ssbo: Also lower barriers
This is more correct for a pass which is supposed to completely lower
away atomic counters.  It also lets us stop supporting atomic counter
barriers in most of the drivers.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
e40b11bbcb nir: Rename nir_intrinsic_barrier to control_barrier
This is a more explicit name now that we don't want it to be doing any
memory barrier stuff for us.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
bd3ab75aef intel/nir: Stop adding redundant barriers
Now that both GLSL and SPIR-V are adding shared and tcs_patch barriers
(as appropreate) prior to the nir_intrinsic_barrier, we don't need to do
it ourselves in the back-end.  This reverts commit
26e950a5de01564e3b5f2148ae994454ae5205fe.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
ba43b66dc9 nir/glsl: Emit memory barriers as part of barrier()
The GLSL barrier() intrinsic does an implicit shared memory barrier in
compute shaders and an implicit TCS patch output barrier in tessellation
control shaders.  We'd like NIR's barrier intrinsic to just be a control
flow barrier and not have memory implications.  To satisfy this, we need
to add an extra memory barrier in front of each nir_intrinsic_barrier.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
a4125b4d26 spirv: Add output memory semantics to OpControlBarrier in TCS
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
2365520c9d spirv: Add a workaround for OpControlBarrier on old GLSLang
As per the Vulkan memory model, the proper translation of GLSL barrier()
is an OpControlBarrier with a scope of Workgroup and semantics of
Acquire, Release, and WorkgroupMemory.  Older versions of GLSLang gave
an OpControlBarrier with semantics of None so we need to patch it up on
those versions.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
60097cc840 nir: Add a new memory_barrier_tcs_patch intrinsic
Right now, it's implemented as a no-op for everyone.  For most drivers,
it's a switch case in the NIR -> whatever which just breaks.  For ir3,
they already have code to delete tessellation barriers so we just add a
case to also delete memory_barrier_tcs_patch.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
f2eece773c llmvpipe: No-op implement more barriers
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:46 +00:00
Jason Ekstrand
3498ab98f5 nir: Handle barriers with more granularity in combine_stores
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:46 +00:00
Jason Ekstrand
f09db0bed5 nir: Handle more barriers in dead_write and copy_prop
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:46 +00:00
Jason Ekstrand
ada49bae5e intel/vec4: Support scoped_memory_barrier
Fixes: 06aecb14c0 "anv: Implement VK_KHR_vulkan_memory_model"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:46 +00:00
Andreas Baierl
40aef2bf3e lima: Add stencil support
This re-enables and fixes support for stencil buffer.

It fixes 365 stencil related deqp tests. All tests that use INCR, INCR_WRAR,
DECR and DECR_WRAP as a stencil op still fail, but they also fail with the
blob, so we may ignore that for now.
We still have dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
failing, which is strange because it's the only one out of the
depth_stencil_clear.* set.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
2020-01-13 16:11:37 +00:00
Andreas Baierl
2ce71494f1 lima/parser: Make rsw alpha blend parsing more readable
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
2020-01-13 16:11:37 +00:00
Boris Brezillon
440b0d6eec panfrost: Remove unneeded phi nodes
Add a pass to remove unneeded phi nodes as done in other drivers.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294>
2020-01-13 14:09:47 +00:00
Rhys Perry
809c8feb92 aco: check if multiplication/clamp is live when applying output modifier
It's possible that a multiplication/clamp is dead code and the single use
is from a different user.

Fixes portal rendering in Path of Exile when global illumination is
enabled.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
ef8abfa790 aco: disable add combining for ds_swizzle_b32
ds_bpermute_b32/ds_permute_b32 are fine, I think

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
69bed1c918 aco: don't DCE atomics with return values
We don't create atomics with definitions if they are not used in NIR, but
our own DCE can remove the uses if an export turns out to be null.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
8f291dc146 aco: set exec_potentially_empty for demotes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
21eafe30df aco: better handle neg/abs of sgprs
isel/label_instruction currently doesn't create these but we should
probably check anyway.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
f29a5a205c aco: check usesModifiers() when identifying a neg/abs
This was fine because a literal used to mean that it didn't use modifiers,
but now VOP3 can take a literal on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
46fb341b8d aco: handle omod successors with the constant in the first operand
No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
7ce244b7d1 aco: handle VOP3 modifiers when combining a constant comparison's NaN test
No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
bbac52873f aco: fix uninitialized data in the binary
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:25:32 +00:00
Rhys Perry
fcd6d83245 aco: fix imageSize()/textureSize() with large buffers on GFX8
Tested on Navi by using dEQP-VK.image.image_size.buffer.* and the GFX8
path with the size multipled by the stride.
dEQP-VK.image.image_size.buffer.* was also run with the tests modified to
use a 96bit format.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:25:32 +00:00
Rhys Perry
49bcd06f97 aco: set vm for pos0 exports on GFX10
RADV's LLVM backend and radeonsi does the same thing.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:25:32 +00:00
Daniel Ogorchock
632885741f panfrost: Fix headers and gpu_headers memory leak
The per-batch headers/gpu_headers dynarrays need to be freed during the
batch cleanup to prevent leaking.

Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308>
2020-01-13 09:11:35 +00:00
Daniel Ogorchock
2848edc0ef panfrost: Fix panfrost_bo_access memory leak
The bo access needs to be freed prior to removing it from its hash
table. This prevents leaking them over time.

Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308>
2020-01-13 09:11:35 +00:00
Samuel Pitoiset
ecace26853 radv/gfx10: improve performance for TES using PrimID but not exporting it
This field is for the primitive ID export to the fragment shader.
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-13 08:14:47 +01:00
Samuel Pitoiset
1db276ba23 radv/gfx10: add support for NGG passthrough mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-13 08:14:45 +01:00
Samuel Pitoiset
471738e97b radv/gfx10: do not declare LDS for NGG if useless
Only needed for NGG without passthrough mode or for NGG streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-13 08:14:43 +01:00
Samuel Pitoiset
0758f645d0 radv/gfx10: determine if a pipeline is eligible for NGG passthrough
It can't be enabled for geometry shaders, for NGG streamout and
for vertex shaders that export the primitive ID. NGG passthrough
requires that LDS isn't used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-13 08:14:40 +01:00
Samuel Pitoiset
c65015f83c radv/gfx10: disable vertex grouping
RadeonSI and AMDVLK does that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-13 08:14:38 +01:00
Ilia Mirkin
201b88a93b nvc0: treat all draws without color0 broadcast as MRT
Per the semi-recently-released NVIDIA docs, when this bit is not
enabled, then the result for RT[0] will be used. So if e.g. only a
single RT is drawn to and it's not RT[2], the results will not be
visible. Fixes
GTF-GL45.gtf33.GL3Tests.explicit_attrib_location.explicit_attrib_location_pipeline
which was failing due to a frag shader outputting only to location=2.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2020-01-12 12:11:16 -05:00
Ilia Mirkin
3e9aacb139 gm107/ir: avoid combining geometry shader stores at 0x60
This corresponds to gl_PrimitiveID and gl_Layer. When both of these are
stored in a single AST.64 or AST.128 operation, then it appears as
though the whole store fails. Fixes the recently extended
glsl-1.50-transform-feedback-builtins piglit, and also
gtf30.GL3Tests.transform_feedback.transform_feedback_builtins.

The issue was reproduced on GM107 and GP108 but not GK208 nor GK104.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2020-01-12 12:11:16 -05:00
Ilia Mirkin
3be708eb31 nvc0: add dummy reset status support
Perhaps in a future implementation, such events could be passed back to
the driver, or queried directly. However for now, this is required for
GL 4.3 robustness contexts.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2020-01-12 12:11:16 -05:00
Ilia Mirkin
838118462e nv50,nvc0: fix destination coordinates of blit
The fix was found by Karol Herbst a long time ago, but it was unclear
why it helped or if it would create additional problems. This change
adds a comment that explains what's going on, and in the process also
normalizes the nv50 implementation to match.

The coordinates which are fed to gl_Position map directly to pixel
coordinates, since the viewport transform is disabled. If the
framebuffer is MSAA, then that doesn't affect the pixel coordinates at
all, it's just that each pixel has multiple samples.

Note that this makes it really clear that this approach is inappropriate
for EXT_framebuffer_multisample_blit_scaled, and also the 3d path will
fail terribly for direct copies. Thankfully the 2d path normally takes
care of this.

Fixes KHR-GL43.packed_depth_stencil.blit.depth32f_stencil8 as well as
scaling issues in a number of EXT_framebuffer_multisample-related piglit
tests (although they continue to fail due to inaccuracies).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2020-01-12 12:11:16 -05:00
Bas Nieuwenhuizen
bfd9e7ff24 radv: Use new scanout gfx9 metadata flag.
This updates for the new metadata ABI in radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3244>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3244>
2020-01-12 14:01:59 +01:00
Vasily Khoruzhick
f06be79457 lima: fix PIPE_CAP_* to mark features that aren't supported yet
lima doesn't support alpha test, flat shading, two-sided color nor
clip planes. We can enable these caps when corresponding hw features
are implemented in the driver.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-12 00:10:04 -08:00
Vasily Khoruzhick
8a421135fa lima: implement polygon offset
Fixes some of dEQP-GLES2.functional.polygon_offset.* tests and shadows in Q3A.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-12 00:10:04 -08:00
Vasily Khoruzhick
b936b1f9b4 lima: fix viewport clipping
Apparently Mali4x0 doesn't do viewport clipping, so anything rendered beyond viewport
is still rendered. Looks like we need to use scissors to do clipping.

Fixes most of dEQP-GLES2.functional.clipping.*, 6 out of 7 remaining failures
fail on blob as well. Remaining [1] fails on many other gallium drivers.

[1] dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z

Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-12 00:10:04 -08:00
Vasily Khoruzhick
997a30d709 lima: fix PLBU_CMD_PRIMITIVE_SETUP command
Apparently it doesn't depend on primitive type, the value
only depends on whether we specify point size via PLBU command --
bit 12 is set in this case

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-12 00:10:04 -08:00
Timothy Arceri
6bafd230e3 glsl: fix potential bug in nir uniform linker
The state value of main_uniform_storage_index will be wrong for
add_parameter() when find_and_update_previous_uniform_storage()
finds a uniform if there is more than 1 uniform used in
multiple shader stages.

The new code is also simpler.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-12 11:02:20 +11:00
Christian Gmeiner
db7967ef9f etnaviv: add deqp debug option
This new debug option will fake some driver CAPs to be able to run dEQP
for GLES3.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3351>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3351>
2020-01-11 22:05:35 +00:00
Timur Kristóf
44a6b17df7 aco/wave32: Set the definitions of v_cmp instructions to the lane mask.
The output of v_cmp instructions is s1 (a single SGPR) in wave32 mode,
as opposed to s2 (an SGPR-pair) in wave64 mode.
A couple of cases where this should have been fixed were omitted from
the previous patch by mistake.

Fixes: e0bcefc3a0
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-11 20:15:53 +01:00
Alyssa Rosenzweig
59d30fd4bc pan/midgard: Support indirect UBO offsets
...in case we have arrays in a UBO block that we'd like to access
indirectly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352>
2020-01-10 17:48:42 -05:00
Francisco Jerez
c20dc9b836 intel/fs: Make implied_mrf_writes() an fs_inst method.
This will be convenient in a later commit enabling SIMD32 fragment
shaders, and happens to fix the calculation for MATH instructions
which is currently inaccurate for SIMD-lowered instructions on Gen4-5
platforms (all of them on Gen4 in SIMD16 mode), since it was based on
the shader's dispatch width rather than on the actual execution size
of the instruction.

This causes some shader-db noise on Gen4 due to the more compact
register allocation interacting with the SEND dependency workarounds,
but otherwise no major changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-10 11:02:30 -08:00
Francisco Jerez
591f146fd2 intel/fs/cse: Fix non-deterministic behavior due to inaccurate liveness calculation.
The liveness calculation done by the local CSE pass in order to prune
AEB entries whose sources are no longer live is currently inaccurate,
because the live intervals are calculated once at the beginning of the
pass, so they don't take into account any of the copy instructions
inserted by the CSE pass as it makes progress.  However the IP counter
used in that calculation is based on the start_ip of the basic block,
which is updated automatically whenever any instructions are inserted
into the CFG.  This causes the IP counter and liveness intervals to
get out of sync in programs with multiple basic blocks, causing the
CSE pass to toss AEB entries prematurely, which can lead to missed
optimization opportunities rather non-deterministically.

On BDW this leads to the following shader-db changes:

 total instructions in shared programs: 14952488 -> 14951763 (-0.00%)
 instructions in affected programs: 45416 -> 44691 (-1.60%)
 helped: 40
 HURT: 4

 total spills in shared programs: 20989 -> 20970 (-0.09%)
 spills in affected programs: 103 -> 84 (-18.45%)
 helped: 3
 HURT: 0

 total fills in shared programs: 24981 -> 24926 (-0.22%)
 fills in affected programs: 127 -> 72 (-43.31%)
 helped: 3
 HURT: 0

In addition it avoids a number of regressions in combination with some
of the optimization changes I'm working on for SIMD32, which would
have made CSE more effective...  Causing it to be less effective
elsewhere in the program astonishingly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-10 11:02:06 -08:00
Francisco Jerez
cc0ea482ad intel/fs: Fix nir_intrinsic_load_barycentric_at_sample for SIMD32.
For uniform sample ID, only the first channel of msg_data will be
initialized.  We need to pass that component only to the SEND message
for SIMD lowering to unzip the descriptor source correctly.

Fixes several dozens of conformance test failures with SIMD32 fragment
shaders enabled, including:

dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.dynamic_sample_number.*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-10 11:01:52 -08:00
Francisco Jerez
0703eab012 intel/fs/gen8+: Fix r127 dst/src overlap RA workaround for EOT message payload.
The problem occured when the return payload of a SIMD8 SEND
instruction was re-used as source payload of an EOT SEND message.  In
such cases the interference edge added by that workaround between the
payload and grf127_send_hack_node would have no effect, because the
payload would be allocated to a fixed range of registers containing
r127 by the special handling of EOT message payloads in the same
function.  This would cause things to blow up if the source payload of
the first SIMD8 message ended up being allocated to a range which
happened to overlap the destination.

Fix it by avoiding r127 altogether in the allocation of EOT message
payloads.

The problem can be reproduced on ICL with the fp-indirections2 Piglit
test-case in combination with the other optimizer changes of this
series.

Fixes: 232ed89802 "i965/fs: Register allocator shoudn't use grf127 for sends dest"
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-10 11:00:42 -08:00
Francisco Jerez
0a6e46d44d intel/fs/gen11+: Handle ROR/ROL in lower_simd_width().
Prevents invalid code from being emitted for ROR/ROL instructions in
SIMD32 shaders.

The problem can be reproduced with the following tests while forcing
SIMD32 to be used for fragment shaders:

 piglit.shaders.glsl-rotate-left
 piglit.shaders.glsl-rotate-right

However the issue could occur in production already with compute
shaders and a workgroup size large enough to trigger SIMD32 dispatch.

Fixes: 83fdec0f0d "intel/compiler: Enable the emission of ROR/ROL instructions"
Cc: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-10 11:00:24 -08:00
Francisco Jerez
a30bb25a7a glsl: Fix software 64-bit integer to 32-bit float conversions.
The current implementation was broken for any integers between 2^24
and 2^30 (it would return zero for me on ICL).  The reason is that for
such integers we wouldn't take the 'if (0 <= shiftCount)' early return
path, however 'shiftCount + 7' would be positive, leading to a
negative 'count' argument passed to __shift64RightJamming(), which
would give undefined results.

This reworks the affected conversion functions to use either
__shortShift64Left() or __shift64RightJamming() based on the sign of
the final shift count, which should avoid the problem.  In addition
this should qualify as a clean-up/optimization -- This implementation
of the conversion functions translates to 7 instructions less than the
original on Intel hardware.

This fixes the 'KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot'
conformance tests on soft fp64 hardware with large enough subgroup
size (>16).

Fixes: d5cf6e92b4 "glsl: Add built-in functions to do uint64_to_fp32(uint64_t)"
Fixes: c9d333a6b7 "glsl: Add built-in functions to do int64_to_fp32(int64_t)"
Cc: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
2020-01-10 10:51:58 -08:00
Daniel Schürmann
8b7a42d6d0 aco: compact aco::span<T> to use uint16_t offset and size instead of pointer and size_t.
This reduces the size of the Instruction base class
from 40 bytes to 16 bytes. No pipelinedb changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332>
2020-01-10 17:49:18 +00:00
Daniel Schürmann
ffb4790279 aco: compact various Instruction classes
No pipelinedb changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332>
2020-01-10 17:49:18 +00:00
Andrii Simiklit
ebaab89761 mesa/st: fix a memory leak in get_version
This patch prevents memory leak in get_version function in st_manager.c
This issue was found by valgrind:
16 bytes in 1 blocks are definitely lost in loss record 6 of 1,418
   at 0x483CD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x63D9476: st_init_extensions (st_extensions.c:1679)
   by 0x63B803B: get_version (st_manager.c:1271)
   by 0x63B8124: st_api_query_versions (st_manager.c:1289)
   by 0x63266EF: dri_init_screen_helper (dri_screen.c:583)
   by 0x6321B12: dri2_init_screen (dri2.c:2110)
   by 0x631AACC: driCreateNewScreen2 (dri_util.c:155)
   by 0x5D58192: dri3_create_screen (dri3_glx.c:897)
   by 0x5D39829: AllocAndFetchScreenConfigs (glxext.c:815)
   by 0x5D39C57: __glXInitialize (glxext.c:941)
   by 0x5D3290A: GetGLXPrivScreenConfig (glxcmds.c:174)
   by 0x5D34F38: glXQueryExtensionsString (glxcmds.c:1307)

Fixes: eca8032f20 ("gallium: Add ARB_gl_spirv support")
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3345>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3345>
2020-01-10 17:27:39 +00:00
Lasse Lopperi
3de2774dcb freedreno/drm: Fix memory leak in softpin implementation
Free the memory allocated for cmds/reloc_bos array when destoying the
associated ringbuffer.

For similar fix for the non-softpin implementation see:
d014af98b7

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2324

Fixes: f3cc0d2 ("freedreno: import libdrm_freedreno + redesign submit")

Signed-off-by: Lasse Lopperi <lasse.lopperi@ge.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3342>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3342>
2020-01-10 16:21:35 +00:00
Rhys Perry
b5c9688516 aco: limit register usage for large work groups
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-10 12:10:37 +00:00
Timur Kristóf
eccac46cdc ac/llvm: Fix ac_build_reduce in wave32 mode.
Previously, when cluster_size was set to 0, it always worked as if
the cluster size was 64. This commit fixes it in wave32 mode by
changing to work as if the cluster size was set to 32.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2020-01-10 12:30:44 +01:00
Pierre-Eric Pelloux-Prayer
a5fe84aefb radeonsi: release saved resources in si_compute_do_clear_or_copy
Fixes: 9b331e462e ("radeonsi: use compute shaders for clear_buffer & copy_buffer")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-10 08:41:40 +01:00
Pierre-Eric Pelloux-Prayer
6912149ee5 radeonsi: release saved resources in si_compute_clear_12bytes_buffer
Fixes: 6c901f0675 ("radeonsi: use compute shader for clear 12-byte buffer")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-10 08:41:38 +01:00
Pierre-Eric Pelloux-Prayer
1acf714d57 radeonsi: release saved resources in si_compute_copy_image
Fixes: 1b25d340b7 ("radeonsi: use compute for resource_copy_region when possible")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-10 08:41:35 +01:00
Pierre-Eric Pelloux-Prayer
e1e87466ae radeonsi: release saved resources in si_compute_clear_render_target
Fixes: 984fd73515 ("radeonsi: use compute for clear_render_target when possible")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-10 08:41:33 +01:00
Pierre-Eric Pelloux-Prayer
6c019e28ca radeonsi: release saved resources in si_compute_expand_fmask
Fixes: 095a58204d ("radeonsi: expand FMASK before MSAA image stores are used")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-10 08:41:31 +01:00
Pierre-Eric Pelloux-Prayer
9211cbe07a radeonsi: release saved resources in si_retile_dcc
Fixes: 1f21396431 ("radeonsi: add support for displayable DCC for multi-RB chips")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2330
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-10 08:41:19 +01:00
Samuel Iglesias Gonsálvez
39c1892dd8 main: fix coverity error in _mesa_program_resource_find_name()
We did not take into account if name is NULL, so we could dereference
a NULL pointer in strncmp() call.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-10 08:40:00 +01:00
Icecream95
f2f1277624 panfrost: Add negative lod bias support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-10 06:51:42 +00:00
Gurchetan Singh
daf1d5ad4c virgl/drm: update UAPI
This seems to compile. Header copied over from drm-misc-next
7da5492739db.

Acked-by: Eric Engestrom <eric@engestrom.ch>
2020-01-10 04:12:40 +00:00
Vasily Khoruzhick
438c677859 lima: drop support for R8G8B8 format
We can only sample from 24-bit packed format and can't render into it and
it causes chromium-based browsers to fail when they create FBO with GL_RGB
format. Drop R8G8B8 alltogether so mesa can promote it to RGBX format.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-09 18:46:08 -08:00
Jason Ekstrand
9b71171442 anv: Re-use flush_descriptor_sets in flush_compute_state
There's no reason to hand-roll all of the memory re-allocation fall-back
code for compute shaders.  It's  just duplicated complexity.  This also
makes it more clear in flush_compute_state where the
MEDIA_INTERFACE_DESCRIPTOR_LOAD command gets emitted relative to other
packets in the command stream.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:45:00 -06:00
Jason Ekstrand
ae72d1238c anv: Flag descriptors dirty when gl_NumWorkgroups is used
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:45:00 -06:00
Jason Ekstrand
ca6b3b11af anv: Don't add dynamic state base address to push constants on Gen7
Because Gen7 push constants are already relative to dynamic state base
address, they aren't really an address.  It's deceptive to return an
address from the helper function.  Instead, let's leave it as a
special-case in the gen7-11 helper; we don't need the helper for code
de-duplication for Gen7 anyway.

Fixes: 67d2cb3e93 "anv: Add get_push_range_address() helper"
Closes: #2323
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:44:06 -06:00
Vasily Khoruzhick
044da65f52 lima: add debug flag to disable tiling
Add debug flag to disable tiling. Note that it prevents lima from creating
tiled buffers, but it's still able to import them if modifier is specified

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-10 01:13:47 +00:00
Vasily Khoruzhick
a533d1d4c6 lima: use linear layout for shared buffers if modifier is not specified
Use linear layout for shared buffers if modifier is not specified
and use linear layout when importing buffers with invalid modifier.

Fixes: 01a451b04d ("lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-10 01:13:47 +00:00
Timothy Arceri
87e0dd68f5 glsl: call calculate_subroutine_compat() from the nir linker
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-10 00:41:20 +00:00
Timothy Arceri
726e8f24c6 glsl: move calculate_subroutine_compat() to shared linker code
We will make use of this in the nir linker in the following patch.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-10 00:41:20 +00:00
Timothy Arceri
c60d0bd92f glsl: call uniform resource checks from the nir linker
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-10 00:41:20 +00:00
Timothy Arceri
05c1f7a154 glsl: move uniform resource checks into the common linker code
We will call this from the nir linker in the following patch.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-10 00:41:20 +00:00
Timothy Arceri
b85985dd51 glsl: call check_subroutine_resources() from the nir linker
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-10 00:41:20 +00:00
Timothy Arceri
a6fd1c7752 glsl: move check_subroutine_resources() into the shared util code
We will make use of this in the nir linker in the following patch.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-10 00:41:20 +00:00
Jason Ekstrand
3dec68e682 genxml: Remove a non-existant HW bit 2020-01-09 18:40:20 -06:00
Kristian H. Kristensen
f9d35ea55b ir3: Set up full/half register conflicts correctly
Setting up transitive conflicts between a full register and its two
half registers (eg r0.x and hr0.x and hr0.y) will make the half
registers conflict.  They don't actually conflict and this prevents us
from using both at the same time.

Add and use a new ra helper that sets up transitive conflicts between
a register and its subregisters, except it carefully avoids the
subregister conflict.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2020-01-09 16:03:25 -08:00
Dave Airlie
85eed5def3 llvmpipe: add ARB_derivative_control support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2020-01-10 08:43:40 +10:00
Marek Olšák
269953e779 radeonsi/gfx9: force the micro tile mode for MSAA resolve correctly on gfx9
Fixes: 69ea473 "amd/addrlib: update to the latest version"
Closes: #2325

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-09 16:28:28 -05:00
Lionel Landwerlin
60e0db3bfb anv: fix intel perf queries availability writes
The availability is not written at the location changed in
ee6fbb95a74d...

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ee6fbb95a7 ("anv: Properly handle host query reset of performance queries")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-09 20:42:36 +02:00
Dylan Baker
da2fe9c15e docs: Add release notes for 19.3.2, update calendar and home page 2020-01-09 10:33:49 -08:00
Dylan Baker
2d46a7f26d docs: add SHA256 sums for 19.3.2 2020-01-09 10:32:18 -08:00
Dylan Baker
d4f237dcce docs: Add release notes for 19.3.2 2020-01-09 10:32:14 -08:00
Satyajit Sahu
4e3a09db25 radeon/vcn: Handle crop parameters for encoder
Set proper cropping parameter if frame cropping is enabled

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Boyuan Zhang boyuan.zhang@amd.com
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3328>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3328>
2020-01-09 15:43:18 +00:00
Daniel Schürmann
cd31da4587 nir: fix printing of var_decl with more than 4 components.
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: a8ec4082a4 ('nir+vtn: vec8+vec16 support')
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3320>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3320>
2020-01-09 10:31:26 +01:00
Samuel Pitoiset
e298e78a01 radv: advertise VK_AMD_shader_image_load_store_lod
This extension allows to use LOD with image read/write operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-09 07:58:34 +01:00
Samuel Pitoiset
4d49a7ac73 aco: handle nir_intrinsic_image_deref_{load,store} with lod
Use image_load_mip and image_store_mip respectively if the lod
parameter isn't zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-09 07:58:33 +01:00
Samuel Pitoiset
e77ff89914 amd/llvm: handle nir_intrinsic_image_deref_{load,store} with lod
Use image_load_mip and image_store_mip respectively if the lod
parameter isn't zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-09 07:58:33 +01:00
Samuel Pitoiset
1b808d208f spirv,nir: add new lod parameter to image_{load,store} intrinsics
SPV_AMD_shader_image_load_store_lod allows to use a lod parameter
with OpImageRead, OpImageWrite and OpImageSparseRead.

According to the specification, this parameter should be a 32-bit
integer. It is initialized to 0 when no lod parameter is found
during SPIR-V->NIR translation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-09 07:58:33 +01:00
Samuel Pitoiset
37bfd854c7 spirv: add SpvCapabilityImageReadWriteLodAMD
New SPIR-V capability for SPV_AMD_shader_image_load_store_lod.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-09 07:58:33 +01:00
Tapani Pälli
1e29ff7b3d mesa: create program resource hash in a single place
This is a cleanup but also a fix for commit dd09f1d806. In case of
i965 we did not actually create hash for cached shader programs.

Fixes: dd09f1d806 "mesa/st/i965: add a ProgramResourceHash for quicker resource lookup"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-09 07:28:13 +02:00
Dave Airlie
ee9879335e llvmpipe: add support for ARB_indirect_parameters.
This just adds support for getting the draw count from the
indirect buffer.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3234>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3234>
2020-01-09 10:35:44 +10:00
Dave Airlie
315fa2e5c9 llvmpipe: enable driver side multi draw indirect
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3234>
2020-01-09 10:35:40 +10:00
Dave Airlie
d10a3d528f gallium/util: add multi_draw_indirect to util_draw_indirect.
ARB_indirect_parameters needs drivers to deal with mutli_draw_indirect
themselves.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3234>
2020-01-09 10:35:36 +10:00
Thong Thai
3a4f8c8158 mesa: Prevent _MaxLevel from being less than zero
When decoding using VDPAU, the _MaxLevel value becomes -1 due to
NumLevels being equal to 0 at a certain point, and decoding fails
due to an assertion later on.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
2020-01-08 16:44:20 -05:00
Marek Olšák
9b71041627 ac: add ac_build_s_endpgm
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 16:03:48 -05:00
Marek Olšák
1c44480538 ac: add 128-bit bitcount
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 16:00:41 -05:00
Marek Olšák
d7b565365e ac/gpu_info: add pc_lines and use it in radeonsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 16:00:40 -05:00
Marek Olšák
d1c8aeb24f ac: unify primitive export code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 16:00:38 -05:00
Marek Olšák
1c77a18cc2 ac: unify build_sendmsg_gs_alloc_req
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 16:00:36 -05:00
Marek Olšák
fd84e422b6 radeonsi: clean up messy si_emit_rasterizer_prim_state
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 15:48:49 -05:00
Marek Olšák
b64a3240c2 radeonsi: determine accurately if line stippling is enabled for performance
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 15:48:47 -05:00
Marek Olšák
79cc7e6ff0 radeonsi: test polygon mode enablement accurately
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 15:48:43 -05:00
Marek Olšák
898c9cb797 radeonsi: fix context roll tracking in si_emit_shader_vs
probably harmless, because we don't need to track context rolls on gfx10

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 15:48:39 -05:00
Marek Olšák
4249a90f5d radeonsi: fix monolithic pixel shaders with two-sided colors and SampleMaskIn
They are never used except for testing AMD_DEBUG=mono.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 15:48:35 -05:00
Marek Olšák
186335d17d ac/gpu_info: always use distributed tessellation on gfx10
This might fix a hang on Navi14.

Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 15:48:32 -05:00
Marek Olšák
eb1e10d0be gallium: bypass u_vbuf if it's not needed (no fallbacks and no user VBOs)
This decreases CPU overhead, because u_vbuf is completely bypassed
in those cases.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-08 13:40:59 -05:00
Marek Olšák
9f6020abc6 gallium/cso_context: move non-vbuf vertex buffer and element code into helpers
These will be reused.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-08 13:40:59 -05:00
Marek Olšák
ce648b913f gallium: put u_vbuf_get_caps return values into u_vbuf_caps
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-08 13:40:59 -05:00
Jonathan Marek
472593e9cf etnaviv: remove unnecessary vertex_elements_state_create error checking
PIPE_CAP_MAX_VERTEX_BUFFERS already sets the maximum vertex_buffer_index.

There's no need to error on num_elements == 0 (if that can even happen).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2020-01-08 12:27:35 -05:00
Jonathan Marek
76d93b437b etnaviv: implement gl_VertexID/gl_InstanceID
Fixes:
dEQP-GLES3.functional.instanced.*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2020-01-08 12:27:35 -05:00
Jonathan Marek
93ff6f5919 etnaviv: HALTI2+ instanced draw
Fixes:
dEQP-GLES3.functional.draw.draw_arrays_instanced.*
dEQP-GLES3.functional.draw.draw_elements_instanced.*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2020-01-08 12:27:34 -05:00
Jonathan Marek
ea608ae23b etnaviv: update headers from rnndb
Update to etna_viv commit 46af5f1d.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2020-01-08 12:27:34 -05:00
Lionel Landwerlin
4578d4ae52 anv: don't close invalid syncfd semaphore
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-08 18:20:50 +02:00
Krzysztof Raszkowski
7d33203b44 gallium/swr: Fix glVertexPointer race condition.
Sometimes using user buffer (not VBO) e.g. glVertexPointer
one thread could free memory before other thread used it.
Instead of copying this memory to driver simplier thing is
to block until draw finish.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2020-01-08 15:42:03 +00:00
Jason Ekstrand
b788cccfe2 intel/disasm: Fix decoding of src0 of SENDS
There is no instruction field for the register file for src0 because
it's always GRF.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3309>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3309>
2020-01-08 14:14:16 +00:00
Yevhenii Kolesnikov
8dcff01c8b meta: Add cleanup function for Bitmap
Buffer object and temporary texture were never freed, causing memory leaks.

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-08 15:34:03 +02:00
Juan A. Suarez Romero
ad4fb7ea04 nir/spirv: skip unreachable blocks in Phi second pass
Only the blocks that are reachable are inserted with an end_nop
instruction at the end.

When handling the Phi second pass, if the Phi has a parent block that
does not have an end_nop then it means this block is unreachable, and
thus we can ignore it, as the Phi will never come through it.

Fixes dEQP-VK.graphicsfuzz.uninit-element-cast-in-loop.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-08 11:32:24 +01:00
Pierre-Eric Pelloux-Prayer
5f8daae4d8 radeonsi: check ctx->sdma_cs before using it
e5167a9276 disabled SDMA for gfx8.
This caused 3 piglit arb_sparse_buffer tests (basic, buffer-data
and commit) to crash on GFX8.

Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes: e5167a9276 ("radeonsi: disable SDMA on gfx8 to fix corruption on RX 580")
2020-01-08 09:31:35 +01:00
Samuel Pitoiset
e565fd4255 radv: do not fill keys from fragment shader twice
radv_fill_shader_info() already does that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-08 08:59:04 +01:00
Yevhenii Kolesnikov
ed43dd62ac main: allow external textures for BindImageTexture
From issue 10 of the OES_EGL_image_external_essl3:

  A limited set of use-cases is enabled by making glBindImageTexture
  accept external textures. Shaders can access such external textures
  using the existing <image2D> sampler type.

Fixes: 02a6d901ee ("mesa: add OES_EGL_image_external_essl3 support")

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-08 09:21:39 +02:00
Jason Ekstrand
803fad43c3 intel/nir: Add a memory barrier before barrier()
Our barrier instruction does not implicitly do a memory fence but the
GLSL barrier() intrinsic is supposed to.  The easiest back-portable
solution is to just add the NIR barriers.  We'll sort this out more
properly in later commits.

Cc: mesa-stable@lists.freedesktop.org
Closes: #2138
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-07 21:52:19 -06:00
Bas Nieuwenhuizen
7cc0702bbb radv: Emit a BATCH_BREAK when changing pixel shaders or CB_TARGET_MASK.
Fixes a hang on Raven with Resident Evil 2.

I did not find anything more restricted to fix it:

- Setting persistent_states_per_bin to 1 fixes it too,
  but likely does an internal break on any descriptor set changes
  too.
- Only breaking the batch when cb_target_mask changes does not fix
  it (and looking at AMDVLK comments, I suspect the code in radeonsi
  should really be doing a FLUSH_DFSM).
- Always doing a FLUSH_DFSM on shader switch helps, but that is more
  often than this and I don't think we should be doing that when DFSM
  is disabled.
- Also emitting the existing break on framebuffer change when DFSM is
  disabled does not fix the issue.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2315
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2020-01-07 22:44:31 +01:00
Tapani Pälli
dd09f1d806 mesa/st/i965: add a ProgramResourceHash for quicker resource lookup
Many resource APIs require searching by name, add a hash table to make
this faster. Currently we traverse the whole resource list for name
based queries, this change makes all these cases use the hash.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2203
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3254>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3254>
2020-01-07 10:48:41 +00:00
Michel Dänzer
5f0ff004ca gitlab-ci: Test against LLVM / clang 9 on x86
They're not available for Debian buster yet, so we have to use upstream
snapshot packages again.

In contrast to earlier, we now store the LLVM APT repository key in Git
instead of re-downloading it every time.
2020-01-07 11:00:16 +01:00
Alyssa Rosenzweig
4cd3dc94ad panfrost: Don't double-flip Z/W for 2D arrays
We need to mindful that we don't clobber the shadow comparator.

Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darrayshadow_*

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-07 08:54:52 +01:00
Alyssa Rosenzweig
bc4c853b49 pan/midgard: Account for z/w flip in texelFetch
Required for proper txf of 2D arrays.

Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetch.*2darray*

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-07 08:54:47 +01:00
Alyssa Rosenzweig
4152d45d38 panfrost: Adjust for mismatch between hardware/Gallium in arrays/cube
The hardware separates face selection and array indexing, it looks like,
whereas Gallium smushes them together with some modulus fun. Let's fix
it so mipmapped 2D arrays work without regressing cubemaps.

Fixes dEQP-GLES3.functional.texture.filtering.2d_array.* among others.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-07 08:54:40 +01:00
Alyssa Rosenzweig
0b714f3fa3 panfrost: Respect constant buffer_offset
Fixes dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.* among
others

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-07 08:54:23 +01:00
Timothy Arceri
3bd4bcd418 glsl: use nir version of check_image_resources() for nir linker
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-07 09:53:51 +11:00
Timothy Arceri
feffd1fa65 glsl: add check_image_resources() for the nir linker
This is adapted from the GLSL IR code but doesn't need to
iterate over the IR. I believe this also fixes a potential bug in
the GLSL IR code which potentially counts the same output twice.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-07 09:53:51 +11:00
Timothy Arceri
a853de0c95 glsl: use nir linker to link atomics
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-07 09:50:57 +11:00
Timothy Arceri
8f2cab7767 mesa: add new UseNIRGLSLLinker constant
This will be used to disable some GLSL IR passes in following
patches.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-07 08:39:47 +11:00
Timothy Arceri
4caf3fc8df glsl: reorder link_and_validate_uniforms() calls
This is required for the following commit.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-07 08:39:34 +11:00
Timothy Arceri
ed325ac4dd glsl: add new gl_nir_link_glsl() helper
This will allow us to do some linking in NIR that was previously
done by the GLSL IR linker. To start with this just has calls for
linking atomics.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-07 08:39:16 +11:00
Timothy Arceri
0e60ea1d67 glsl: add gl_nir_link_check_atomic_counter_resources()
This is pretty much a copy of link_check_atomic_counter_resources()
updated to work with the NIR linker.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-07 08:38:52 +11:00
Timothy Arceri
432ed13dec glsl: rename gl_nir_link() to gl_nir_link_spirv()
A NIR based glsl linking function will be too different to the
spirv version to bother attempting any sharing. So lets change
the name to be explicit.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2020-01-07 08:38:41 +11:00
Kristian H. Kristensen
6c1c13e90e st/mesa: Lower vars to ssa and constant prop before gl_nir_lower_buffers
The gl_nir_lower_buffers pass relies on recognizing the same literal
constants as the GLSL compiler so that constant buffer array indices
are constant in nir as well.  Without this, get_block_array_index()
would see

  vec1 32 ssa_723 = deref_var &const_temp@1 (function_temp int)
  vec1 32 ssa_724 = load_const (0x00000001 /* 0.000000 */)
  ...
  vec1 32 ssa_5 = deref_var &const_temp@1 (function_temp int)
  vec1 32 ssa_6 = intrinsic load_deref (ssa_5) (0) /* access=0 */
  vec1 32 ssa_7 = deref_var &blockB (ssbo BlockB[1])
  vec1 32 ssa_8 = deref_array &(*ssa_7)[ssa_6] (ssbo BlockB) /* &blockB[ssa_6] */

instead of a literal 1, and ultimately generate the block name
BlockB[0].  That used to work, since we before the previous commits
we'd compact the block binding points and names. Thus, there would
always be a BlockB[0].

Now, if an entry in a block array isn't used, we don't generate that
block name, which means that if entry 0 isn't used BlockB[0] isn't
present and then get_block_array_index() fails to find the block.

In most cases we would have dealt with this in the call to
st_nir_opts() in st_nir_link_shaders(), but in the num_shaders == 1
case (for example, compute) we would call gl_nir_lower_buffers()
before we lowered GLSL constants.  Move that corner case up next to
where we call st_nir_link_shaders() so we call st_nir_opts() at the
same point in the flow for all shaders.

Fixes: dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.18

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-06 13:01:19 -08:00
Andrii Simiklit
be6d51e1e3 glsl/nir: do not change an element index to have correct block name
When SSBO array is used with packed layout, both IR tree
and as a result, NIR tree will be incorrect.
In fact, the SSBO dereference indices won't
match the array size in some cases like the following:

"layout(packed, binding=1) buffer SSBO { vec4 a; } ssbo[3];
 out vec4 color;
 void main() {
   color = ssbo[2].a;
 }"

After linking the IR and then NIR will have an SSBO array
definition with size 1 but dereference still will have index 2
and linked_shader->Program->sh.ShaderStorageBlocks
will contain just SSBO with name "SSBO[2]"

So this line should be removed at least as a workaround for now
to avoid error like:
Failed to find the block by name "SSBO[0]"

Fixes: 810dde2a "glsl/nir: Add a pass to lower UBO and SSBO access"
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2020-01-06 13:01:19 -08:00
Andrii Simiklit
4beb0a2308 glsl: fix a binding points assignment for ssbo/ubo arrays
This is needed to be in agreement with spec requirements:
https://github.com/KhronosGroup/OpenGL-API/issues/46

Piers Daniell:
   "We discussed this in the OpenGL/ES working group meeting
    and agreed that eliminating unused elements from the interface
    block array is not desirable. There is no statement in the spec
    that this takes place and it would be highly implementation
    dependent if it happens. If the application has an "interface"
    in the shader they need to match up with the API it would be
    quite confusing to have the binding point get compacted.
    So the answer is no, the binding points aren't affected by
    unused elements in the interface block array."

v2: - 'original_dim_size' field moved above to keep
      the struct packed better on 64-bit
    - added a comment for 'total_num_array_elements' field
    - fixed a binding point calculations for SSBOs array of arrays
          ( Ian Romanick <ian.d.romanick@intel.com> )
    - fixed binding point calculations for non-packed SSBOs
v3:
    - rename 'total_num_array_elements' to 'aoa_size'
          ( Jason Ekstrand <jason@jlekstrand.net> )
    - rename 'boffset' to 'binding_stride'
          ( Alejandro Piñeiro <apinheiro@igalia.com> )

Fixes: 8cf1333b "glsl: link uniform block arrays of arrays"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109532
Reported-By: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Fritz Koenig <frkoenig@google.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2020-01-06 13:01:19 -08:00
Andrii Simiklit
a3c9a2881e glsl: fix an incorrect max_array_access after optimization of ssbo/ubo
This is needed to fix these tests:
piglit.spec.arb_shader_storage_buffer_object.compiler.unused-array-element_frag
piglit.spec.arb_shader_storage_buffer_object.compiler.unused-array-element_comp

Fixes: 8cf1333b "glsl: link uniform block arrays of arrays"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109532
Reported-By: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Fritz Koenig <frkoenig@google.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2020-01-06 13:01:19 -08:00
Marek Olšák
420fe1e7f9 radeonsi: remove TGSI
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-06 15:57:20 -05:00
Marek Olšák
e5167a9276 radeonsi: disable SDMA on gfx8 to fix corruption on RX 580
Closes: #1399
Closes: #1889

Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:36 -05:00
Marek Olšák
991328498b radeonsi: move SI and CIK+ SDMA code into 1 common function for cleanups
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:35 -05:00
Marek Olšák
3c265c2586 radeonsi: rename dma_cs -> sdma_cs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:33 -05:00
Marek Olšák
cd6a4f7631 radeonsi: add AMD_DEBUG=nodmacopyimage for debugging
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:32 -05:00
Marek Olšák
0c9e7a67f9 radeonsi: add AMD_DEBUG=nodmaclear for debugging
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:30 -05:00
Marek Olšák
4110e6e564 radeonsi: remove broken and unused SI SDMA image copy code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:28 -05:00
Marek Olšák
503bd821fa radeonsi: rename SDMA debug flags
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:11 -05:00
Tomeu Vizoso
d62dd8b0cb gitlab-ci: Switch LAVA jobs to use shared dEQP runner
Take one step towards sharing code between the LAVA and non-LAVA jobs,
with the goals of reducing maintenance burden and use of computational
resources.

The env var DEQP_NO_SAVE_RESULTS allows us to skip the procesing of the
XML result files, which can take a long time and is not useful in the
LAVA case as we are not uploading artifacts anywhere at the moment.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-01-06 14:27:36 +01:00
Tomeu Vizoso
f5c2807ff2 gitlab-ci: Update kernel for LAVA to 5.5-rc1 plus fixes
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-01-06 14:27:21 +01:00
Alyssa Rosenzweig
b3ff83c107 panfrost: Handle PIPE_FORMAT_R10G10B10A2_USCALED
Same format code as UINT... might be different in how it's fed into a
shader but we'll deal with that when we get there.

Fixes dEQP-GLES3.functional.vertex_arrays.single_attribute.output_types.usigned_int2_10_10_10.components4_vec2_quads1

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-06 07:50:00 -05:00
Alyssa Rosenzweig
5c71547c68 panfrost: Report MSAA 4x supported for dEQP
Fixes dEQP-GLES3.functional.state_query.integers.max_samples_getinteger64

We'll have to actually implement multisampling next, but hey.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-06 07:49:58 -05:00
Alyssa Rosenzweig
32851ff715 panfrost: Cleanup tiling selection logic
Make it a lot more obvious what we're doing and fix more than a few
corner cases in the process.

Fixes
dEQP-GLES3.functional.buffer.map.write.render_as_index_array.pixel*, and
likely others.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-06 07:49:53 -05:00
Alyssa Rosenzweig
dadfca3775 panfrost: Implement sRGB blend shaders
We use the lowering in nir_format_convert. There are native ops for this
so this is far from optimal and not remotely efficient but as with most
blend shader things right now, it's hard enough to get it working, so
let's focus on that for now. We'll make it fast later (once we have
GLES3 stable, we can start optimizing these things).

Fixes dEQP-GLES3.functional.fragment_ops.blend.fbo_srgb.*

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-06 07:49:48 -05:00
Alyssa Rosenzweig
ef00849877 panfrost: Support rendering to non-zero Z/S layers
Fixes abort in STK's shadow implementation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-06 07:49:42 -05:00
Alyssa Rosenzweig
ef8c2ebee1 panfrost: Texture from Z32F_S8 as R32F
Z32F_S8 becomes Z32F in texturing, which in turn just becomes R32F.

Fixes dEQP-GLES3.functional.texture.format.sized.*.depth32f_stencil8*

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2020-01-06 07:49:33 -05:00
Danylo Piliaiev
f3ca47d9f3 iris/query: Implement PIPE_QUERY_GPU_FINISHED
Implementation is similar to radeonsi in 5f1cef76

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-06 12:43:14 +02:00
Erik Faye-Lund
642125edd9 st/mesa: use uint-samplers for sampling stencil buffers
Otherwise, we end up mismatching the sampler types when rendering.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-06 09:23:43 +01:00
Samuel Pitoiset
09ea2de2b8 ac/surface: use uint16_t for mipmap level pitches
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-06 07:59:50 +01:00
Jonathan Marek
680d806950 etnaviv: fix incorrectly failing vertex size assert
Changes the assert to match the comment above.

This assert was failing in some cases while running darkplaces.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2020-01-05 17:04:39 +00:00
Vasily Khoruzhick
c5ae64ebc7 lima: fix PP stream terminator size
PP stream terminator size seems to be 4 words, it worked with full PP
stream because we align stream beginning to 32 bytes and BO is
initialized with zeroes. But with partial PP stream it sometimes break
if for new PP stream we reuse BO that has non-zero value at this place.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-05 00:16:39 -08:00
Vasily Khoruzhick
4f5bfe2a5e lima: don't reload and redraw tiles that were not updated
We don't need to reload and redraw some tiles if framebuffer was not
cleared and scissor test was enabled for some of draws. This simple
optimization fixes cursor lag in X11

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-05 00:16:36 -08:00
Vasily Khoruzhick
83abdf8e45 lima: postpone PP stream generation
This commit postpones PP stream generation till job is submitted.
Doing that this late allows us to skip reloading and redrawing tiles
that were not updated.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-05 00:16:33 -08:00
Andreas Baierl
7ad1896ab8 lima/parser: Fix VS cmd stream parser
prefetch is int, not bool.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
2020-01-05 03:08:01 +00:00
Andreas Baierl
af7dc4675d lima/parser: Fix rsw parser
Drop assert as it is not necessary and used wrong anyway.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
2020-01-05 03:08:01 +00:00
Kenneth Graunke
defb3a9465 anv: Only enable EWA LOD algorithm when doing anisotropic filtering.
Updated documentation renames "Anisotropic Algorithm" to "LOD Algorithm"
and adds a note for Gen9+ saying "The EWA Algorithm should only be
enabled for Anisotropic Filtering modes." and indicating that the extra
accuracy shouldn't be necessary for other modes, and comes at a cost.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-04 14:27:22 -08:00
Kenneth Graunke
c0c899cf78 iris: Allow HiZ for copy_region sources
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-04 12:25:55 -08:00
Jason Ekstrand
7d75bf4f3f i965: Allow HiZ for glCopyImageSubData sources
v2 (Ken): Handle platforms without sampler support for HiZ

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2 changes]
2020-01-04 12:25:55 -08:00
Jason Ekstrand
52ad1712ed anv: Allow HiZ in TRANSFER_SRC_OPTIMAL on Gen8-9
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-04 12:25:54 -08:00
Jason Ekstrand
b274469daa intel/blorp: Use the source format when using blorp_copy with HiZ
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-04 12:25:54 -08:00
Jason Ekstrand
ea7446ba82 i965/blorp: Don't resolve HiZ unless we're reinterpreting
This eliminates 50% of pixels (2M) rendered for a blit in GS:GO.  This
accounts for 3% of pixels rendered in the game.  Total GPU clocks for
the first 900 frames of CSGO improves by 1%.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-04 12:25:54 -08:00
Jason Ekstrand
95cc5438eb blorp: Allow reading with HiZ
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-04 12:25:54 -08:00
Jason Ekstrand
4a1093005c blorp: Stop whacking Z24 depth to BGRA8
The shader code required to do this is int(sat(x) * UINT24_MAX) which
isn't really worth all the effort to avoid.  Doing the format
conversion, on the other hand, prevents us from sampling with HiZ which
is something that we very much want on gen8-9 where we can.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-04 12:25:54 -08:00
Christian Gmeiner
a597a64ae2 etnaviv: move descriptor based texture structs
This moves the descriptor based texture structs and their helpers
into the only user.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2020-01-04 20:44:36 +01:00
Christian Gmeiner
7c687d221d etnaviv: move state based texture structs
This moves the state based texture structs and their helpers
into the only user.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2020-01-04 20:44:36 +01:00
Roman Stratiienko
ed0fa78b46 panfrost: Fix Android build
Include missing `encoder/pan_props.c` into the build.

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-04 16:54:38 +00:00
Gert Wollny
9162e2f03f mesa/st: glsl_to_nir: don't lower atomics to SSBOs if driver supports HW atomics
At least on r600 HW atomic operations are way less expensive than SSBO atomic
operations.

v2: use st->has_hw_atomics (Erik Anholt)

v3: remove second invocation of atomic to ssbo lowering (Erik Anholt)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
b119f8b4a0 r600: Delete vertex buffer only if there is actually a shader state
Fixes: gl-2.0-vertexattribpointer
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
32bb5f2941 r600: Make SID and unsigned value
The value is never negative, and makeing it unsigned fixes some
warnings

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
e8559ae448 r600: Fix maximum line width
There are only 13 bits available to store the line width, hence
it can't be larger than 8191

v2: Add Fixes tag

v3: - Unify value since for all r600 archs (Konstantin Kharlamov)
    - Correct the value the line width value is emitted as a 12.4
      fixed point value of 1/2 line width on r600-r700 and as
      8 * line width on Evergreen and newer.

Fixes: 06bfb2d28f
    r600: fork and import gallium/radeon

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
829107819d r600/sb: Correct SB disassambler for better debugging
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
bfbdaf9a46 r600: Make it possible to include r600_asm.h in a C++ file
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
23c5ba8baa r600: Add functions to dump the shader info
This will be helpful to compare TGSI and NIR code path,

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
570a6c6c79 gallium: tgsi_from_mesa - handle VARYING_SLOT_FACE
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
6c9495b392 nir: make nir_get_texture_size/lod available outside nir_lower_tex
This functions can be useful in other places.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Gert Wollny
f69bf7fe8c gallium/tgsi_from_mesa: Add 'extern "C"' to be able to include from C++
The r600/nir backend is in C++ and needs to include this file.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>
2020-01-04 16:22:40 +00:00
Bas Nieuwenhuizen
96c9483ccf spirv: Fix glsl type assert in spir2nir.
Fixes: 624789e370 "compiler/glsl: handle case where we have multiple users for types"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2020-01-04 15:53:24 +00:00
Christian Gmeiner
b178262cb9 etnaviv: use a better name for FE_VERTEX_STREAM_UNK14680
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2020-01-04 14:15:36 +01:00
Bas Nieuwenhuizen
17741a0a05 radv: Only use the gfx mipmap level offset/pitch for linear textures.
The tiled-case is non-sensical for non-base mips, but Vulkan requires
that this function handles it but at the same time does not require
returning anything useful. So we can basically return anything.

Correct tiled pitch and offset are still required for our own WSI and
in the future getting the layouts of images with DRM format modifiers.
Both don't have to deal with images with more than 1 level though.

Fixes: 824bd0830e "radv: return the correct pitch for linear mipmaps on GFX10"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2301
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2304
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-04 13:04:40 +01:00
Bas Nieuwenhuizen
f0ed67b770 Revert "amd/common: Always initialize gfx9 mipmap offset/pitch."
This reverts commit 973181c06c.

Requested by Marek.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-04 13:04:40 +01:00
Kenneth Graunke
645b195312 iris: Delete remnants of the unimplemented ASTC 5x5 workaround
I copy and pasted some of the boilerplate but never the implementation.
For now, ASTC 5x5 is disabled and faked via uncompressed RGBA; let's
delete these remnants until such a time when we implement it properly.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-03 18:06:38 -08:00
Kenneth Graunke
e858321f09 iris: Disable ASTC 5x5 support on Gen9 for now.
Intel Gen9 hardware has some nasty restrictions where ASTC 5x5 formats
and color compression can't both live in the sampler cache at the same
time.  To properly support it, we have to track which of those exist
in the cache and flush ASTC out or resolve away compression.

As far as I'm aware, very little uses ASTC 5x5 textures, so instead
of replicating all that for iris, we simply turn it off and rely on
the Gallium fallback mechanism to fake it via uncompressed RGBA.

This should avoid GPU hangs any time people use ASTC 5x5 with CCS.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-03 18:06:38 -08:00
Kenneth Graunke
8e6308363b st/mesa: Allow ASTC5x5 fallbacks separately from other ASTC LDR formats.
This patch allows us to fake ASTC 5x5 specifically, while leaving the
other ASTC LDR formats with native support.  I plan to use this in iris,
at least for the time being.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-03 18:06:35 -08:00
Erik Faye-Lund
56fc791b31 etnaviv: use nir_lower_clip_halfz instead of open-coding
We already have a helper for this, so let's use that instead of rolling
our own version.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Paul Cercueil <paul@crapouillou.net>
2020-01-03 22:48:19 +00:00
Erik Faye-Lund
d9ff5f0414 nir/zink: move clip_halfz-lowering to common code
Etnaviv also does the same thing, so let's try to avoid repetition here,
and use the same for it code as well.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Paul Cercueil <paul@crapouillou.net>
2020-01-03 22:48:19 +00:00
Erik Faye-Lund
5c2376af63 zink: remove unused code-path in lower_pos_write
This code is never reached, because we don't call nir_lower_io before
lowering this. So let's get rid of it.
2020-01-03 22:48:19 +00:00
Erik Faye-Lund
87b3d8dce5 zink: use nir_fmul_imm
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Paul Cercueil <paul@crapouillou.net>
2020-01-03 22:48:19 +00:00
Erik Faye-Lund
e51bf4914c zink: implement load_vertex_id
Not 100% sure if this matches the semantics, but it seems to pass the
tests, so it seems like an improvement.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-03 22:20:12 +00:00
Erik Faye-Lund
1b2731f268 zink: factor out builtin-var creation
This is useful so we can reuse it for the next patch

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-03 22:20:12 +00:00
Erik Faye-Lund
ce1ea6e9c2 zink: simplify front-face type
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-03 22:20:12 +00:00
Caio Marcelo de Oliveira Filho
75a19186b2 anv: Ignore some CreateInfo structs when rasterization is disabled
According to the description of VkGraphicsPipelineCreateInfo(),
pViewportState, pMultisampleState, pDepthStencilState and
pColorBlendState must be ignored when rasterization is not enabled.

This avoids potentially invalid pointers being dereferenced when
rasterization is disabled.  Tested with `demos_x64 VK_Parameter_Zoo`
from Renderdoc repository.

v2: Don't store the `raster_enabled` as part of anv_pipeline, just
    query it from the create info.  This avoids storing a state that's
    only used during pipeline creation. (Jason)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2258
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch> [v1]
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-03 13:57:31 -08:00
Caio Marcelo de Oliveira Filho
6755b6315b anv: Drop unused function parameter
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-03 13:29:49 -08:00
Marek Olšák
66483ee017 radeonsi: remove the "display_dcc_offset == 0" assertion
I think it's not needed.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-03 15:07:19 -05:00
Marek Olšák
bfddfd12b6 radeonsi: ignore PIPE_BIND_SCANOUT for imported textures
It's obtained from the BO metadata.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-03 15:07:17 -05:00
Marek Olšák
ba10fb3f7f radeonsi: preserve the scanout flag for shared resources on gfx9 and gfx10
Closes: #2195
Closes: #2294

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-03 15:07:11 -05:00
Vasily Khoruzhick
1de06e540a lima: fix allocation of GP outputs storage for indexed draw
For indexed draw number of VS invocations is (ctx->max_index - ctx->min_index + 1),
so we have to use this number when calculating space for varyings, gl_Position and
gl_PointSize.

Fixes dEQP-GLES2.functional.buffer.write.use.index_array.array and
dEQP-GLES2.functional.buffer.write.use.index_array.element_array

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2020-01-03 18:57:36 +00:00
Jason Ekstrand
9bd8000c6c anv: Drop unneeded struct keywords
All VkFoo structs are typedef'd to not need the struct keyword.  Leaving
it in there is just extra characters and breaks Vulkan's aliasing when
stuff gets promoted to core versions.  It's better to just never use
struct for VkFoo.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-03 11:32:34 -06:00
Thong Thai
8dc7c467e6 r600: Remove HEVC related code since HEVC is not supported
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>
2020-01-03 16:30:22 +00:00
Thong Thai
466001a226 radeon: Use P010 for decoding of 10-bit videos
Previously, P016 was used for the decoding of 10-bit HEVC/H.265 encoded
videos, which worked fine for mpv and ffmpeg. GStreamer specifically looks
for P010, so this patch sets the default buffer type to P010 for HEVC
decoding.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>
2020-01-03 16:30:22 +00:00
Thong Thai
68881af435 st/va: Add support for P010, used for 10-bit videos
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>
2020-01-03 16:30:22 +00:00
Thong Thai
f3569f215d gallium: Add PIPE_FORMAT_P010 support
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>
2020-01-03 16:30:22 +00:00
Thong Thai
ee8344bdcf util/format: Add the P010 format used for 10-bit videos
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>
2020-01-03 16:30:22 +00:00
Erik Faye-Lund
98885e9f61 zink: implement some more trivial opcodes
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-03 17:16:18 +01:00
Erik Faye-Lund
8c18331afe zink: implement txf
texelFetch is a requirement for OpenGL 3.0, so this gets us a step
closer to GL 3.0 support.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-03 15:28:27 +01:00
Samuel Pitoiset
7b70502a5d radv: implement VK_AMD_mixed_attachment_samples
With VK_AMD_mixed_attachment_samples, the number of depth/stencil
samples isn't always equal to the number of color samples. Adjust
the number of Z samples when it's different but make sure to have
a consistent sample count if there are no depth/stencil attachments.

Also adjust the number of samples used for fragment shaders which is
the number of color samples if mixed attachment samples are used.

Only enabled on GFX8+ because it's untested on previous chips.

All dEQP-VK.pipeline.multisample.mixed_attachment_samples.* now pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>
2020-01-03 12:31:53 +00:00
Samuel Pitoiset
7bbf497b68 radv: record number of color/depth samples for each subpass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>
2020-01-03 12:31:53 +00:00
Christian Gmeiner
8d50ab5395 etnaviv: gc400 does not support any vertex sampler
On STM32MP1 fixes the dEQPs below and changes the dEQP run statistics to:

-  Passed:        16856/17346 (97.2%)
-  Failed:        236/17346 (1.4%)
-  Not supported: 199/17346 (1.1%)
+  Passed:        16780/17346 (96.7%)
+  Failed:        86/17346 (0.5%)
+  Not supported: 425/17346 (2.5%)
   Warnings:      55/17346 (0.3%)

dEQP-GLES2.functional.shaders.struct.uniform.sampler_vertex
dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex
dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_vertex
dEQP-GLES2.functional.shaders.struct.uniform.sampler_in_function_arg_vertex
dEQP-GLES2.functional.shaders.struct.uniform.sampler_in_array_function_arg_vertex
dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2d
dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dproj_vec3
dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dproj_vec4
dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dlod
dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dprojlod_vec3
dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dprojlod_vec4
dEQP-GLES2.functional.shaders.texture_functions.vertex.texturecube
dEQP-GLES2.functional.shaders.texture_functions.vertex.texturecubelod
dEQP-GLES2.functional.shaders.random.texture.vertex.0
dEQP-GLES2.functional.shaders.random.texture.vertex.1
dEQP-GLES2.functional.shaders.random.texture.vertex.2
dEQP-GLES2.functional.shaders.random.texture.vertex.3
dEQP-GLES2.functional.shaders.random.texture.vertex.4
dEQP-GLES2.functional.shaders.random.texture.vertex.5
dEQP-GLES2.functional.shaders.random.texture.vertex.6
dEQP-GLES2.functional.shaders.random.texture.vertex.7
dEQP-GLES2.functional.shaders.random.texture.vertex.8
dEQP-GLES2.functional.shaders.random.texture.vertex.9
dEQP-GLES2.functional.shaders.random.texture.vertex.10
dEQP-GLES2.functional.shaders.random.texture.vertex.11
dEQP-GLES2.functional.shaders.random.texture.vertex.12
dEQP-GLES2.functional.shaders.random.texture.vertex.13
dEQP-GLES2.functional.shaders.random.texture.vertex.14
dEQP-GLES2.functional.shaders.random.texture.vertex.16
dEQP-GLES2.functional.shaders.random.texture.vertex.17
dEQP-GLES2.functional.shaders.random.texture.vertex.18
dEQP-GLES2.functional.shaders.random.texture.vertex.19
dEQP-GLES2.functional.shaders.random.texture.vertex.20
dEQP-GLES2.functional.shaders.random.texture.vertex.22
dEQP-GLES2.functional.shaders.random.texture.vertex.23
dEQP-GLES2.functional.shaders.random.texture.vertex.24
dEQP-GLES2.functional.shaders.random.texture.vertex.26
dEQP-GLES2.functional.shaders.random.texture.vertex.28
dEQP-GLES2.functional.shaders.random.texture.vertex.29
dEQP-GLES2.functional.shaders.random.texture.vertex.31
dEQP-GLES2.functional.shaders.random.texture.vertex.34
dEQP-GLES2.functional.shaders.random.texture.vertex.37
dEQP-GLES2.functional.shaders.random.texture.vertex.38
dEQP-GLES2.functional.shaders.random.texture.vertex.39
dEQP-GLES2.functional.shaders.random.texture.vertex.40
dEQP-GLES2.functional.shaders.random.texture.vertex.42
dEQP-GLES2.functional.shaders.random.texture.vertex.43
dEQP-GLES2.functional.shaders.random.texture.vertex.44
dEQP-GLES2.functional.shaders.random.texture.vertex.45
dEQP-GLES2.functional.shaders.random.texture.vertex.48
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_nearest_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_nearest_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_nearest_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_linear_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_linear_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_linear_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_nearest_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_nearest_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_nearest_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_linear_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_linear_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_linear_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_nearest_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_nearest_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_nearest_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_linear_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_linear_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_linear_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_nearest_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_nearest_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_nearest_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_linear_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_linear_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_linear_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_mirror
dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_clamp
dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_repeat
dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_mirror
dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_clamp
dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_repeat
dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_mirror
dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_clamp
dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_repeat
dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_mirror
dEQP-GLES2.functional.fbo.api.attach_names
dEQP-GLES2.functional.uniform_api.info_query.basic.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.info_query.basic.sampler2D_both
dEQP-GLES2.functional.uniform_api.info_query.basic.samplerCube_vertex
dEQP-GLES2.functional.uniform_api.info_query.basic.samplerCube_both
dEQP-GLES2.functional.uniform_api.info_query.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.info_query.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.info_query.basic_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.info_query.basic_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.info_query.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.info_query.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.info_query.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.info_query.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.info_query.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.info_query.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.info_query.unused_uniforms.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.info_query.unused_uniforms.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic.samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic.samplerCube_both
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.initial.render.basic.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.initial.render.basic.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.initial.render.basic.samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.initial.render.basic.samplerCube_both
dEQP-GLES2.functional.uniform_api.value.initial.render.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.initial.render.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic.samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic.samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_array_first_elem_without_brackets.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_array_first_elem_without_brackets.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic.samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic.samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic.samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic.samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_array_first_elem_without_brackets.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_array_first_elem_without_brackets.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic.samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic.samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_full.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_full.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_full.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_full.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_partial.basic_array.sampler2D_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_partial.basic_array.sampler2D_both
dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_partial.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_partial.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.value.assigned.unused_uniforms.sampler2D_samplerCube_vertex
dEQP-GLES2.functional.uniform_api.value.assigned.unused_uniforms.sampler2D_samplerCube_both
dEQP-GLES2.functional.uniform_api.random.0
dEQP-GLES2.functional.uniform_api.random.3
dEQP-GLES2.functional.uniform_api.random.6
dEQP-GLES2.functional.uniform_api.random.11
dEQP-GLES2.functional.uniform_api.random.14
dEQP-GLES2.functional.uniform_api.random.21
dEQP-GLES2.functional.uniform_api.random.22
dEQP-GLES2.functional.uniform_api.random.24
dEQP-GLES2.functional.uniform_api.random.25
dEQP-GLES2.functional.uniform_api.random.29
dEQP-GLES2.functional.uniform_api.random.30
dEQP-GLES2.functional.uniform_api.random.32
dEQP-GLES2.functional.uniform_api.random.33
dEQP-GLES2.functional.uniform_api.random.37
dEQP-GLES2.functional.uniform_api.random.41
dEQP-GLES2.functional.uniform_api.random.49
dEQP-GLES2.functional.uniform_api.random.51
dEQP-GLES2.functional.uniform_api.random.55
dEQP-GLES2.functional.uniform_api.random.61
dEQP-GLES2.functional.uniform_api.random.69
dEQP-GLES2.functional.uniform_api.random.72
dEQP-GLES2.functional.uniform_api.random.78
dEQP-GLES2.functional.uniform_api.random.79
dEQP-GLES2.functional.uniform_api.random.82
dEQP-GLES2.functional.uniform_api.random.87
dEQP-GLES2.functional.uniform_api.random.88
dEQP-GLES2.functional.uniform_api.random.94
dEQP-GLES2.functional.uniform_api.random.95
dEQP-GLES2.functional.uniform_api.random.96

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Vasut <marex@denx.de>
2020-01-03 09:23:41 +01:00
Christian Gmeiner
46b8273eb1 etnaviv: check if MSAA is supported
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2020-01-03 08:31:02 +01:00
Iago Toral Quiroga
2271a187c2 u_vbuf: don't try to delete NULL driver CSO
Since 18a8c3f7f1 we don't create a driver CSO if there are any
incompatible elements, so only ask backends to delete it if it exists.

Fixes multiple CTS crashes in V3D.

Fixes: 18a8c3f7f1 ("u_vbuf: Only create driver CSO if no incompatible elements")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-03 07:58:35 +01:00
Kenneth Graunke
d0d28c783d iris: Set nir_shader_compiler_options::unify_interfaces.
This is technically enabling the option in the common intel backend
code, but only the st/nir linker uses the option, so it's iris-only.

Fixes Piglit's spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out

Closes: #2274
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>
2020-01-03 00:41:50 +00:00
Kenneth Graunke
19ed12afd1 st/nir: Optionally unify inputs_read/outputs_written when linking.
i965 and iris use inputs_read/outputs_written for a shader stage to
determine the layout of input and output storage.  Adjacent stages must
agree on the layout, so adjacent input/output bitfields must match.

This patch adds a new nir_shader_compiler_options::unify_interfaces
flag which asks the linker to unify the input/output interfaces between
adjacent stages.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>
2020-01-03 00:41:50 +00:00
Kenneth Graunke
7a9c0fc0d7 intel: Drop Gen11 WaBTPPrefetchDisable workaround
This isn't needed on production Icelake hardware.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3250>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3250>
2020-01-03 00:20:17 +00:00
Jordan Justen
ed17baab5f intel: Remove unused Tigerlake PCI ID
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2020-01-02 15:18:18 -08:00
Alyssa Rosenzweig
3759b84926 pan/midgard: Use upper ALU tags for MFBD writeout
It's not clear yet what the distinction is.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 17:27:23 -05:00
Alyssa Rosenzweig
2d1e18ee83 pan/midgard: Identity ld_color_buffer as 32-bit
I'm not sure why I mistakenly identified it as an 8-bit op before.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig
5063ab6a9c pan/midgard: Remove old comment
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig
5bc62af2a0 pan/midgard: Generate MRT writeout loops
They need a very particular form; the naive way we did before is not
sufficient in practice, it doesn't look like. So let's follow the rough
structure of the blob's writeout since this is fixed code anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig
db879b034a pan/midgard: Generalize IS_ALU and quadword_size
There are more ALU tags, let's do some cleanup while we're at it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig
66f98ffab0 pan/midgard: Use better heuristic for shader termination
This still may not be perfect (in the sense that legal shaders might
still get cut off) but this fits how writeout is done with both Panfrost
and the blob, so it's good enough for what we need and allows MRT
shaders to be sanely disassembled.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig
c298f25c4e pan/midgard: Fix memory corruption in constant combining
It's a long story... but we'd try to insert constants that weren't there
and end up clobbering fields in the bundle following the constant
array...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig
d58600c0e0 panfrost: Pack MRT blend shaders into a single BO
Blend shader size and location in memory is considerably constrained,
probably to facilitate optimizations (my guess is that blend shaders are
run strictly out of i-cache). We need to pack the blend shaders for each
RT of a single framebuffer together. The easiest way to do this is at
draw time which is not terribly efficient but will hold us over for now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig
1b86e0927d panfrost: Handle RGB16F colour clear
We don't handle this format yet, but we will soon, and the abort in
pan_pack_color is possible even without exposing the format... Handling
this gracefully might not be required by the spec but let's not crash.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 15:20:55 -05:00
Tomeu Vizoso
829f338a59 panfrost: Store internal format
It's needed by u_transfer_helper to know when the depth+stencil buffer
has been split.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 12:41:17 -05:00
Tomeu Vizoso
14bc4c7cce panfrost: Map with size of first layer for 3D textures
As that's what Gallium expects in transfer.layer_stride.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 12:41:15 -05:00
Tomeu Vizoso
ed3eede296 panfrost: Dynamically allocate array of texture pointers
With 3D textures we can have lots of layers, so better allocate it
dynamically at runtime.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-02 12:41:02 -05:00
Bas Nieuwenhuizen
c1a1a86658 meson: Enable -Werror=int-conversion.
I think implicit conversions here are almost always wrong:

1) wrong argument position ptr vs. int
2) will often have issues with 32-bit platforms.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2570>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2570>
2020-01-02 11:47:02 +00:00
Bas Nieuwenhuizen
b72182fcfa turnip: Use VK_NULL_HANDLE instead of NULL.
Only occurrence of implicitly converting pointer->int.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2570>
2020-01-02 11:47:02 +00:00
Bas Nieuwenhuizen
973181c06c amd/common: Always initialize gfx9 mipmap offset/pitch.
The WSI expects pitch to be meaningful even for tiled
textures.

(It is used for the pitch in modesetting and X11)

Fixes: 824bd0830e "radv: return the correct pitch for linear mipmaps on GFX10"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2301
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2304
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3245>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3245>
2020-01-02 11:25:51 +00:00
Bas Nieuwenhuizen
59c4fb9d72 nir: print non-uniform tex fields.
To ease debugging in the future.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246>
2020-01-02 11:42:33 +01:00
Bas Nieuwenhuizen
69bdc1c5fc nir: Add clone/hash/serialize support for non-uniform tex instructions.
These were missed when the fields got added. Added it everywhere where
texture_index got used and it made sense.

Found this in "The Surge 2", where the inliner does not copy the fields,
resulting in corruption and hangs.

Fixes: 3bd5457641 "nir: Add a lowering pass for non-uniform resource access"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1203
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246>
2020-01-02 11:41:33 +01:00
Afonso Bordado
525cbe85ef pan/midgard: Optimize branches with inverted arguments
Remove the invert on arguments to branches, and invert the branch
condition instead. This saves one instruction per inverted argument.

Closes #2088

Signed-off-by: Afonso Bordado <afonsobordado@az8.co>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-31 20:01:16 +00:00
Afonso Bordado
0e83688f47 pan/midgard: Move midgard_is_branch_unit to helpers
Signed-off-by: Afonso Bordado <afonsobordado@az8.co>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-31 20:01:12 +00:00
Marek Vasut
5e9106f7af etnaviv: Do not filter out PIPE_FORMAT_S8_UINT_Z24_UNORM on pre-HALTI2
The format PIPE_FORMAT_S8_UINT_Z24_UNORM is supported even on pre-HALTI
hardware like GCnano. Do not report it as unsupported format.

This fixes the following dEQP on GCnano:
dEQP-GLES2.functional.fbo.completeness.renderable.texture.color0.depth_stencil_unsigned_int_24_8

Fixes: 64c7cdcae5 ("etnaviv: add missing formats")
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3200>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3200>
2019-12-31 15:12:49 +00:00
Marek Vasut
a812cb57e5 etnaviv: Report correct number of vertex buffers
The GCnano has only 4 vertex buffers instead of 16. This information
can be extracted from the GPU status registers and is already stored
in screen->specs.stream_count. Use PIPE_CAP_MAX_VERTEX_BUFFERS to
report this information and permit u_vbuf to reorganize the shaders
to fit.

This fixes the following dEQP on GCnano:
dEQP-GLES2.functional.shaders.conversions.vector_combine.float_float_float_float_to_vec4_vertex

This fixes all the other dEQP-GLES2.functional.shaders.conversions.*
which used to fail on GCnano.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3241>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3241>
2019-12-31 14:55:04 +00:00
Timur Kristóf
11e62a9734 aco: Fix uniform i2i64.
Fixes 240 failing test cases in dEQP-VK.spirv_assembly which
were failing due to a bad s_ashr_i32 instruction. This commit
fixes the instruction format along with the definitions of the
instruction.

Fixes: 11f43caaec
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-31 14:22:31 +01:00
Robert Foss
182679e7c5 android: Fix u_format_table.c being generated twice
Two competing rules for defining u_format_table.c exists,
which is an error.

Additionally the more general rule lacks the inclusion of
format/u_format.csv.

Fixes: 882ca6dfb0 ("util: Move gallium's PIPE_FORMAT utils to /util/format/")

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-12-31 14:06:02 +01:00
Alyssa Rosenzweig
a0d65d860d pan/midgard: Remove prepacked_branch
It's an ugly hack that's no longer used.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-31 03:26:24 +00:00
Alyssa Rosenzweig
02f503ef00 pan/midgard: Convert fragment writeout to proper branches
This eliminates the only use of prepacked_branch, which is a such a
hack anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-31 03:26:24 +00:00
Marek Olšák
84b82f8cd1 winsys/radeon: initialize pte_fragment_size
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>

Closes: #2179

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-30 20:20:59 -05:00
Marek Olšák
5c9dcbea77 Revert "u_vbuf: Regard non-constant vbufs with non-instance elements as free"
This reverts commit c6ef79c488.

It broke torcs.
2019-12-30 18:41:24 -05:00
Alyssa Rosenzweig
3909b16000 panfrost: Respect glPointSize()
We have native support for this somehow. Fixes the mesa demo `points`

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig
8f4b15636b panfrost: Remove MRT indirection in blend shaders
Since we have a separate blend shader for each render target, let's
simplify this structure and reduce the options memory footprint by 88%
or something goofy like that.

Should also enable separate blending per render target.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig
67fe2afa51 panfrost: Implement integer varyings
We need to actually work out the varying format on demand, rather than
assuming rgba32f.

Fixes dEQP-GLES3.functional.fragment_out.basic.int.*

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig
62d056d8e3 panfrost: Disable some CAPs we want lowered
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig
71df7c69bc panfrost: Identify glProvokingVertex flag
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig
c17a441666 pan/midgard: Implement flat shading
We need to shuffle around some lowerings but it's just a flag.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig
66c2696fda pan/midgard: Use type-appropriate st_vary
We would like to store (u)ints as well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig
3996fd7b90 pan/midgard: Promote tilebuffer reads to 32-bit
Fixes (among others)
dEQP-GLES3.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgba16f

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig
ddc5a371b3 glsl: Set .flat for gl_FrontFacing
It is a boolean.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3237>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3237>
2019-12-30 19:54:50 +00:00
Samuel Pitoiset
824bd0830e radv: return the correct pitch for linear mipmaps on GFX10
On GFX9, the pitch of a level is always the pitch of the entire image
but not on GFX10.

This fixes graphics glithes with Halo - The Master Chief Collection.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2188
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-30 14:17:45 +01:00
Yevhenii Kolesnikov
b318bc2072 meta: Cleanup function for DrawTex
Buffer object was never freed, causing memory leaks.

Fixes: 76cfe2bc44 ("meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTex")
CC: Ian Romanick <ian.d.romanick@intel.com>

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1390>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1390>
2019-12-30 12:41:52 +02:00
Jan Zielinski
7040d6c197 gallium/gallivm/tgsi: enable tessellation shaders
Tessellation Control and Evaluation shaders are implementing
tessellation and require special handling of their inputs
and outputs.

TCS can write out not only per-vertex, but also per-patch
(per-primitive) attributes and tessellation factor values
that control the tessellator.

TES can read TCS outputs, plus must be feeded with new
system values (tessellation coordinates) that are
outputs of the tessellator fixed function.

TCS can also contain calls to barrier() function (similar
to compute shaders).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-12-30 11:32:47 +01:00
Dave Airlie
26c5ae80f0 llvmpipe: enable ARB_shader_group_vote.
This just adds the NIR paths for shader group vote.

v2: drop feq for now. (Roland)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3213>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3213>
2019-12-30 05:30:30 +00:00
Bas Nieuwenhuizen
88f567b5ce amd/common: Handle alignment of 96-bit formats.
addrlib doesn't quite do it right, so do it ourselves.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2162
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-30 00:02:46 +01:00
Caio Marcelo de Oliveira Filho
b0203b561c panfrost: Fix Makefile.sources
Add missing `\`.  Fixes Android build.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: de077c2078 ("panfrost: Remove mali_alt_func")
2019-12-28 12:31:41 -08:00
Eric Engestrom
a6873a8df2 mesa: avoid returning a value in a void function
Fixes: 1d1722e910 ("mesa: add EXT_dsa NamedProgram functions")
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-28 11:58:37 +00:00
Eric Engestrom
dcba7731e6 meson: simplify install_megadrivers.py invocation
Note: `find_program()` needs a shebang on scripts.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-12-27 22:43:34 +00:00
Eric Engestrom
ff3a2576a4 nine: fix empty-body-issues
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-12-27 22:09:16 +00:00
Eric Engestrom
51569e525a amd: fix empty-body issues
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-12-27 22:09:00 +00:00
Eric Engestrom
7a4a75a185 u_format: move format tests to util/tests/
Suggested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-27 21:04:44 +00:00
Eric Engestrom
da9937d09b util/format: add trivial srgb<->linear conversion test
This would've caught 8829f9ccb0 ("u_format: add ETC2 to
util_format_srgb/util_format_linear").

Suggested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-27 21:04:43 +00:00
Eric Engestrom
8f4d4c808b util/format: add PIPE_FORMAT_ASTC_*x*x*_SRGB to util_format_{srgb,linear}()
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-27 21:04:43 +00:00
Eric Engestrom
cc7a64f101 util/format: remove left-over util_format_description_table declaration
Fixes: 3c45c4bc44 ("util: Cope with the fact that formats in u_format.csv are not ordered.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-27 21:04:43 +00:00
Dave Airlie
baa064f0f5 gallivm: fixup const int64 builder.
Pointed out by Ilia.

Fixes: 84ba008774 (gallivm: add 64-bit const int creator.)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-28 05:44:31 +10:00
Marek Olšák
e79f55ff86 radeonsi/gfx10: improve performance for TES using PrimID but not exporting it
This field is really for the primitive export to the pixel shader.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-27 13:50:57 -05:00
Marek Olšák
aa3df12fc2 radeonsi/gfx10: enable NGG passthrough for eligible shaders
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-27 13:50:57 -05:00
Marek Olšák
17164d4e27 radeonsi/gfx10: don't declare any LDS for NGG if it's not used
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-27 13:50:57 -05:00
Alyssa Rosenzweig
65e5c1942a panfrost: Remove 32-bit next_job path
It has been unused for a while; let's just remove the abstraction.
Technically the hardware does support 32-bit job descriptors, but we
don't and we can't keep them from breaking so let's not pretend they
work.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-12-27 13:03:22 -05:00
Alyssa Rosenzweig
95ba661b49 panfrost; Update comment about work/uniform_count
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-27 13:01:17 -05:00
Alyssa Rosenzweig
de077c2078 panfrost: Remove mali_alt_func
There's only one way to encode comparison functions in the command
stream, not two. It's just that the semantics for texture comparisons
are flipped from the semantics of stencil comparison. We can factor out
that flip to common Panfrost code, rather than tying it to a second
Gallium routine.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-27 12:58:00 -05:00
Alyssa Rosenzweig
bc1fc29e21 panfrost: Add missing #include in common header
Fixes way back when...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-27 12:58:00 -05:00
Alyssa Rosenzweig
330e9b154e panfrost: Add pan_attributes.c to Android.mk
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 31305e1b28 ("panfrost: Move instancing routines to encoder/")
2019-12-27 12:58:00 -05:00
Alyssa Rosenzweig
5fe58271b2 panfrost: Implement remaining texture wrap modes
Somehow we have native hardware for all of these. Suspected by staring
at the bit pattern; confirmed by poking in various texture wrap modes
into the textures mesa demo and seeing what happens.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-27 12:58:00 -05:00
Alyssa Rosenzweig
4ccd42e0bc panfrost: Inline away MALI_NEGATIVE
It's an awfully fancy way to add one...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-27 12:16:09 -05:00
Alyssa Rosenzweig
76519b216b panfrost: Remove MALI_ATTR_INTERNAL
It's a relic from before we understood the varying builtins. It should
never actually come up if the builtins are decoded correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-27 12:11:37 -05:00
Alyssa Rosenzweig
5f8376101d panfrost: Update information on fixed attributes/varyings
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-27 12:10:24 -05:00
Alyssa Rosenzweig
9bde6e551d panfrost: Remove MALI_SPECIAL_ATTRIBUTE_BASE defines
These are conventions by the blob (a convention we happent to follow).
They are not at all intrinsic to the hardware, so now that the
convention is implemented within the Midgard stack, these defines are
wholly unused. Remove them.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-27 12:08:45 -05:00
Alyssa Rosenzweig
8c188722d9 pan/midgard: Fix minor typo
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-12-27 12:07:45 -05:00
Mauro Rossi
563bd61fee android: radv: build radv_shader_args.c
Updates radv Makefile.sources and fixes the following building error:

external/mesa/src/amd/vulkan/radv_shader.c:1122:
error: undefined reference to 'radv_declare_shader_args'

Fixes: 3b14336 ("ac/nir, radv, radeonsi: Switch to using ac_shader_args")
Fixes: 66c703b ("radv: Move argument declaration out of nir_to_llvm")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-27 09:20:54 +01:00
Mauro Rossi
962b70c259 android: radeonsi,ac: fix building error due to ac changes
Updates amd Makefile.sources and fixes the following building errors:

external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:338: error: undefined reference to 'ac_add_arg'
external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:340: error: undefined reference to 'ac_add_arg'
external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:341: error: undefined reference to 'ac_add_arg'
external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:342: error: undefined reference to 'ac_add_arg'

Fixes: 9885af3 ("ac: Add a shared interface between radv, radeonsi, LLVM and ACO")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-27 09:20:49 +01:00
Mauro Rossi
ad1c65e322 android: radv: fix vk_format_table.c generated source build
RADV Android build rules are now getting the wrong vk_format.h
from src/vulkan/util include, the simplest way to fix is to add
src/amd/vulkan include prior to src/vulkan/util include

Fixes the following building errors:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/vk_format_table.c:39:4:
 error: use of undeclared identifier 'VK_FORMAT_LAYOUT_PLAIN'
...
out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/vk_format_table.c:131:8:
error: use of undeclared identifier 'VK_FORMAT_TYPE_UNSIGNED'; did you mean 'UTIL_FORMAT_TYPE_UNSIGNED'?
      {VK_FORMAT_TYPE_UNSIGNED, true, false, false, 4, 0},      /* x = a */
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

Fixes: 3a28281 ("util: Add a mapping from VkFormat to PIPE_FORMAT.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-27 09:20:44 +01:00
Mauro Rossi
13ef793770 android: util: Add a mapping from VkFormat to PIPE_FORMAT.
Updates Makefile.sources and fixes the following building error:

In file included from external/mesa/src/vulkan/util/vk_format.c:24:
In file included from external/mesa/src/vulkan/util/vk_format.h:28:
external/mesa/src/util/format/u_format.h:33:10: fatal error: 'pipe/p_format.h' file not found
#include "pipe/p_format.h"
         ^~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 3a28281 ("util: Add a mapping from VkFormat to PIPE_FORMAT.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-27 09:20:40 +01:00
Mauro Rossi
200be80858 android: nir: add a load/store vectorization pass
Fixes the following aco building error:

external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:846:
error: undefined reference to 'nir_opt_load_store_vectorize'

Fixes: ce9205c ("nir: add a load/store vectorization pass")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-27 09:20:35 +01:00
Dave Airlie
c8042c289e llvmpipe: add debug option to enable OpenCL support.
LP_DEBUG=cl will enable CL support for now.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
29784bb49c gallivm/nir: add vec8/16 support
Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
5be1ea7d79 gallivm/nir: lower packing
This fixes some CL upsample tests, which lower into packing that needs
lowering.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
31e0e8a51b llvmpipe: lower hadd/add_sat
Fixes some CL piglits.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
0a73eafdbe gallivm: handle non-32 bit undefined
other sized undefs caused llvm asserts

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
b16fd4d9e9 llvmpipe/nir: use nir_max_vec_components in more places
This is prep work for when vec8/16 have landed.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
073734ca7f llvmpipe: add support for compute shader params
Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
22d631e235 llvmpipe: handle serialized nir as a shader type.
Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
264663d55d gallivm/llvmpipe: add support for global operations.
Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:33 +10:00
Dave Airlie
9630c2ddd8 gallivm/llvmpipe: add support for block size intrinsic
We have to pass the main block size into the coroutine
and into the shader.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:32 +10:00
Dave Airlie
336954f7e7 gallivm/llvmpipe: add support for work dimension intrinsic.
We have to pass the work_dim given by the user into the shader.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:32 +10:00
Dave Airlie
b8d403c03f tgsi/mesa: handle KERNEL case
Translate to compute for now.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:32 +10:00
Dave Airlie
dac8cb981f gallivm/nir: allow 8/16-bit conversion and comparison.
This adds the convert to 8/16 and support for 8/16 comparsions

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:32 +10:00
Dave Airlie
3adf74f2ef gallivm: pick integer builders for alu instructions.
This allows these to be used with non 32-bit types.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:26:32 +10:00
Dave Airlie
df3e0fe9d8 gallivm: add support for 8-bit/16-bit integer builders
Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:22:43 +10:00
Dave Airlie
258b9bc02e llvmpipe/gallivm: add kernel inputs
compute shaders need kernel input support

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:22:40 +10:00
Dave Airlie
84ba008774 gallivm: add 64-bit const int creator.
Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-12-27 13:22:35 +10:00
Dave Airlie
41c77dbc1e nir: sanitize work group intrinsics to always be 32-bit.
This saves handling them in the backend later.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-12-27 13:22:34 +10:00
Bas Nieuwenhuizen
a435f002c4 radv: Expose all sample counts for integer formats as well.
Things work the same between float and integer.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2261
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-26 10:48:29 +01:00
Alyssa Rosenzweig
be691ca22d panfrost: Route gl_VertexID through cmdstream
It shows up as a special (magic?) attribute. We could try to be clever
and only include the extra record if gl_VertexID is actually read, but
honestly that's just extra complexity for no good reason. Might as well
just always include it; this won't be a real bottleneck, I don't think.

Fixes dEQP-GLES3.functional.shaders.builtin_variable.vertex_id.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig
8781378224 panfrost: Extend attribute_count for vertex builtins
They stretch beyond the usual limit for attributes so are included
implicitly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig
306800d747 pan/midgard: Lower gl_VertexID/gl_InstanceID to attributes
We have special records for these, put in a fixed location by convention
per the blob.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig
6e68890fd6 pan/midgard: Factor out emit_attr_read
We will load attributes directly for gl_VertexID.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig
695b35605b panfrost: Unset vertex_id_zero_based
We don't want the lowering; we have native gl_VertexID.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig
3b3d9653a7 pan/decode: Handle gl_VertexID/gl_InstanceID
Just like varyings have special records for point coordinates (etc),
attributes have special records for vertex/instance ID. We can parse
these fairly easily, although they don't line up exactly with normal
attribute records.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:54:58 -05:00
Alyssa Rosenzweig
d36ca7c0a3 panfrost: Remove pan_shift_odd
Padded counts are numbers of the form:

   n = (2k + 1) * 2^s

for k, s integers. Rather than explicitly store k and s separately and
then compute this formula on demand, it's much cleaner to store the
padded number itself, which is what you manipulate most of the time.
When you do need k,s it is easy to factor by noticing the bitwise
representation:

   s = ctz(n)
   k = n >> (s + 1)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig
62ce9001c2 panfrost: Slight cleanup of Gallium's pan_attribute.c
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig
385a4f773f pan/decode: Fix reference computation for invocations
Slight bug with instancing. No harm done but let's get rid of the
pandecode warning, it's just noise.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig
9c249d3e6b panfrost: Fix off-by-one in pan_invocation.c
When instance_count=2, the packing code was broken. Fixes a dEQP test.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig
467ae0d39d panfrost: Factor out panfrost_compute_magic_divisor
The algorithm doesn't need to be tangled up in details about the
attribute records themselves. We'll need to compute magic divisors for
gl_InstanceID in a second.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig
31305e1b28 panfrost: Move instancing routines to encoder/
Nothing Gallium specific or stateful about them.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 19:48:57 -05:00
Alyssa Rosenzweig
8a57672673 panfrost: Factor batch/resource out of instancing routines
They don't need them; this will allow us to move the code into encoder/
which in turn will make the messy Gallium code less scary.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 19:48:57 -05:00
Alyssa Rosenzweig
ddcd68f52b panfrost: Rename pan_instancing.c -> pan_attributes.c
Let's follow the naming convention that panfrost command stream code is
organized by command stream structure.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 19:48:57 -05:00
Alyssa Rosenzweig
a0e75adabb pan/midgard: Compute destination override
We shift over the mask in this case.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 19:23:02 -05:00
Alyssa Rosenzweig
9a5d462480 pan/midgard: Add mir_upper_override helper
Checks if we should emit a dest_override=upper, given a mask.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 19:23:02 -05:00
Alyssa Rosenzweig
fc4193d0c7 pan/midgard: Support loads from R11G11B10 in a blend shader
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 19:22:54 -05:00
Alyssa Rosenzweig
3af5a398f3 pan/midgard: Enable lower_(un)pack_* lowering
These show up in some blend shaders. Let's use the shared lowering and
remove our own.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 19:21:52 -05:00
Tomeu Vizoso
843a6db6bb panfrost: Increase PIPE_SHADER_CAP_MAX_OUTPUTS to 16
GL ES 3.0 requires it to be higher, and stuff seems to work just fine.

Fixes: dEQP-GLES3.functional.implementation_limits.max_vertex_output_components

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-24 19:21:52 -05:00
Tomeu Vizoso
f107059bb2 panfrost: Handle Z24_UNORM_S8_UINT as MALI_Z32_UNORM
Fixes dEQP-GLES3.functional.texture.format.sized.2d.depth24_stencil8_pot

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-24 19:21:52 -05:00
Alyssa Rosenzweig
6b7243f28f pan/midgard: Implement shadow cubemaps
We need to reshuffle to sync up the shadow coordinate temporary with the
cubemap coordinate temporary. Once that's in place, it's simple enough
(we load the shadow coordinate into .z like 2D).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 23:46:23 +00:00
Alyssa Rosenzweig
9e5a1412ed pan/midgard: Generalize temp coordinate to non-2D
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 23:46:23 +00:00
Alyssa Rosenzweig
1bce7fdecd pan/midgard: Do witchcraft on texture offsets
My latest divination spell has uncovered a pattern in the aether.
Although the swizzle is unaligned, its format is otherwise standard.
Document this, removing the old incorrect understanding of the swizzle
(which coincided on common special swizzles only).

Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetchoffset.sampler2d_fixed_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 23:46:23 +00:00
Alyssa Rosenzweig
4ec1f95d76 pan/midgard: Fix fallthrough from offset to comparator
Fixes: ccbc9a4e67 ("pan/midgard: Implement textureOffset for 2D textures")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 23:46:22 +00:00
Alyssa Rosenzweig
64b2fe9626 pan/midgard: Expand swizzle for texelFetch
We zero the extra components anyway. Fixes
dEQP-GLES3.functional.shaders.texture_functions.texelfetch.sampler2d_fixed_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 23:46:22 +00:00
Alyssa Rosenzweig
72e5749a63 pan/midgard: Clamp LOD register swizzle
Fixes register allocation failures with textureLodOffset.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 23:46:22 +00:00
Alyssa Rosenzweig
06df977c1c pan/midgard: Extend IS_VEC4_ONLY to arguments
I think both need to be aligned at least for ld_cubemap_coords.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 23:46:22 +00:00
Alyssa Rosenzweig
4e75d75724 pan/midgard: Bounds check lcra_restrict_range
We may call it with sentinel values (~0 in particular) corresponding to
unused arguments; ignore these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 23:46:22 +00:00
Rob Clark
0c32063794 freedreno/ir3: fix flat shading again
These days `ctx->inputs` is the split scalar input components and
`ir->inputs` is the full vecN.  This got fixed in the load_input case,
but the load_interpolated_input case was missed.

Fixes: bdf6b7018c ("freedreno/ir3: re-work shader inputs/outputs")
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-24 17:16:31 +00:00
Alyssa Rosenzweig
a8beef332d pan/midgard: Fix disassembler cycle/quadword counting
Due to the succeeding break we would fall into some off-by-one errors.
These should be resolved now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 16:55:46 +00:00
Alyssa Rosenzweig
0cc6e33537 pan/decode: Append 0:0 spills:fills to blobber-db
At the moment there's no need to actually count these but we do need a
placeholder for report.py to be happy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 16:55:46 +00:00
Alyssa Rosenzweig
6a74934e7a pan/decode: Prefix blobberdb with MESA_SHADER_*
We use these prefixes in panfrost shader-db and they need to match for
shader-db to be happpy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 16:55:46 +00:00
Alyssa Rosenzweig
ead35f586c pan/decode: Skip COMPUTE in blobber-db
The blob uses COMPUTE jobs for some internal purposes. These are
essentially free but panfrost doesn't use them, so it messes up the
numbering. Just filter them out.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 16:55:46 +00:00
Alyssa Rosenzweig
09671c8d68 panfrost: Decode shader types in pantrace shader-db
We see some COMPUTE jobs that were mistakenly identified as VERTEX.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-24 16:55:46 +00:00
Jason Ekstrand
ac70442ce1 anv: Properly advertise sampledImageIntegerSampleCounts
We support the same set of samples for integer color formats as for
non-integer.  We've been advertising it wrong since before the initial
Vulkan 1.0 release. :-(

Fixes: d689745303 "vk/0.210.0: Rework device features and limits"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-24 08:31:44 -06:00
Roman Stratiienko
c411d4896c Android: Fix build issue without LLVM
Some of the latest changes are causing the following build error on Android:

```
external/mesa3d/src/gallium/auxiliary/nir/nir_to_tgsi_info.c:403:6:
error: redefinition of 'nir_tgsi_scan_shader'
void nir_tgsi_scan_shader(const struct nir_shader *nir,
     ^
external/mesa3d/src/gallium/auxiliary/nir/nir_to_tgsi_info.h:37:20:
note: previous definition is here
static inline void nir_tgsi_scan_shader(const struct nir_shader *nir,
                   ^
```

Include nir_to_tgsi_info.c and nir_to_tgsi_info.h into the build
only if LLVM is enabled.

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2978>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2978>
2019-12-23 10:22:02 +02:00
Kenneth Graunke
97e9de1795 iris: Avoid replacing backing storage for buffers with no contents
We might get asked to pitch the storage on a buffer that already has
no meaningful contents.  In this case, the existing buffer is as good
as a new one.
2019-12-22 16:18:30 -08:00
Kenneth Graunke
c96c1141fb iris: Fix shader recompile debug printing
I was passing iris keys to brw_debug_key_recompile, leading to out of
bounds memory reads.

Fixes: 2e654db27a ("iris: Create smaller program keys without legacy features")
2019-12-22 16:18:30 -08:00
Kenneth Graunke
1ef4514c5b iris: Make helper functions to turn iris shader keys into brw keys.
We'll need to use these in recompile debugging in the next commit.

Fixes: 2e654db27a ("iris: Create smaller program keys without legacy features")
2019-12-22 16:18:30 -08:00
Vinson Lee
2d971cc1ca swr: Fix build with llvm-10.0.
Fix build error after llvm-10 commit 5d986953c8b9 ("[IR] Split out
target specific intrinsic enums into separate headers").

../src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp:78:37: error: ‘x86_bmi_bextr_32’ is not a member of ‘llvm::Intrinsic’
         {"meta.intrinsic.BEXTR_32", Intrinsic::x86_bmi_bextr_32},
                                     ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-12-21 16:36:27 -08:00
Eric Engestrom
bc943d00aa travis: autodetect python version instead of hard-coding it
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-12-21 20:23:08 +00:00
Marek Vasut
45e1443fd8 etnaviv: tgsi: Fix gl_FrontFacing support
The GPU presents the state of the hardware front_face in internal
register 0 (i0), the range of which is 0.0f..1.0f.

This patch assigns the fragment shader input to this internal register.
Moreover, based on the internal front_ccw state, the value of the i0
register is inverted accordingly using SET.EQ/SEQ.NE instruction before
being further processed in the shader. This mimics the operation of the
NIR compiler.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2868>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2868>
2019-12-21 20:17:27 +01:00
Paul Cercueil
63b33120b7 u_vbuf: Return true in u_vbuf_get_caps if nb of vbufs is below minimum
Return true in u_vbuf_get_caps if the number of vertex buffers is below
the minimum required for proper OpenGL 2.0.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
2019-12-21 18:29:30 +00:00
Paul Cercueil
c6ef79c488 u_vbuf: Regard non-constant vbufs with non-instance elements as free
In the case of unroll_indices, we can regard all non-constant
vertex buffers with only non-instance vertex elements as incompatible
and thus free.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
2019-12-21 18:29:30 +00:00
Wladimir J. van der Laan
87a6029ccf u_vbuf: use single vertex buffer if it's not possible to have multiple
Put CONST, VERTEX and INSTANCE attributes into one vertex buffer if
necessary due to hardware constraints.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
2019-12-21 18:29:30 +00:00
Paul Cercueil
18a8c3f7f1 u_vbuf: Only create driver CSO if no incompatible elements
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
2019-12-21 18:29:30 +00:00
Paul Cercueil
88d041a6b9 u_vbuf: Mark vbufs incompatible if more were requested than HW supports
More vertex buffers are used than the hardware supports.  In
principle, we only need to make sure that less vertex buffers are
used, and mark some of the latter vertex buffers as incompatible.
For now, mark all vertex buffers as incompatible.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
2019-12-21 18:29:30 +00:00
Wladimir J. van der Laan
5f37e38b81 u_vbuf: add logic to use a limited number of vbufs
Make it possible to limit the number of vertex buffers as there exist
GPUs with less then 32 supported vertex buffers.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
2019-12-21 18:29:30 +00:00
Christian Gmeiner
5bd6a5c41b gallium: add PIPE_CAP_MAX_VERTEX_BUFFERS
Add PIPE_CAP_MAX_VERTEX_BUFFERS param, which defaults to 16.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>
2019-12-21 18:29:30 +00:00
David Heidelberg
5343124932 .mailmap: use correct email address
Signed-off-by: David Heidelberg <david@ixit.cz>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3190>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3190>
2019-12-21 17:50:01 +00:00
Paul Cercueil
2bbf8ebadc kmsro: Extend to include ingenic-drm
This enables Mesa to work with Ingenic SoCs through the use of the
ingenic-drm modesetting driver along with the render-only drivers,
such as Etnaviv on the JZ4770 SoC.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-21 18:27:51 +01:00
Stephan Gerhold
4da46a1c3c kmsro: Add "mcde" entry point
ST-Ericsson Ux500 boards use a Mali 400 GPU together with MCDE
("Multi Channel Display Engine"), which is supported by the "mcde"
DRM driver.

Adding an entry point for it in kmsro seems to be enough to make
Lima work - at least kmscube is working correctly.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3139>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3139>
2019-12-21 16:43:35 +00:00
Rhys Perry
afe1a8ff5b aco: fix vgpr alloc granule with wave32
We still need to increase the number of physical vgprs

Totals from affected shaders:
SGPRS: 671976 -> 675288 (0.49 %)
VGPRS: 550112 -> 562596 (2.27 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 27621660 -> 27606532 (-0.05 %) bytes
Max Waves: 81083 -> 87833 (8.32 %)
Instructions: 5391560 -> 5389031 (-0.05 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-21 12:38:42 +01:00
Rhys Perry
01ccd7839c aco: improve jump threading with wave32
Totals from affected shaders:
SGPRS: 748746 -> 748746 (0.00 %)
VGPRS: 636984 -> 636984 (0.00 %)
Spilled SGPRs: 387 -> 387 (0.00 %)
Spilled VGPRs: 15 -> 15 (0.00 %)
Code Size: 61138824 -> 60928620 (-0.34 %) bytes
Max Waves: 48602 -> 48602 (0.00 %)
Instructions: 11967660 -> 11915084 (-0.44 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-21 12:38:42 +01:00
Rhys Perry
6ff92f3d68 aco/wave32: fix comparison optimizations
Previously, they weren't done in wave32.

Totals from affected shaders:
SGPRS: 507726 -> 508006 (0.06 %)
VGPRS: 450340 -> 450268 (-0.02 %)
Spilled SGPRs: 298 -> 298 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 39689708 -> 39384488 (-0.77 %) bytes
Max Waves: 39631 -> 39636 (0.01 %)
Instructions: 7865919 -> 7793650 (-0.92 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-21 12:38:42 +01:00
Karol Herbst
4dd08b710b nv50ir/nir: support vec8 and vec16
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-12-21 11:00:17 +00:00
Rob Clark
a8ec4082a4 nir+vtn: vec8+vec16 support
This introduces new vec8 and vec16 instructions (which are the only
instructions taking more than 4 sources), in order to construct 8 and 16
component vectors.

In order to avoid fixing up the non-autogenerated nir_build_alu() sites
and making them pass 16 src args for the benefit of the two instructions
that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() is has
nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is
used for the > 4 src args case).

v2 (Karol Herbst):
  use nir_build_alu2 for vec8 and vec16
  use python's array multiplication syntax
  add nir_op_vec helper
  simplify nir_vec
  nir_build_alu_tail -> nir_builder_alu_instr_finish_and_insert
  use nir_build_alu for opcodes with <= 4 sources
v3 (Karol Herbst):
  fix nir_serialize
v4 (Dave Airlie):
  fix serialization of glsl_type
  handle vec8/16 in lowering of bools
v5 (Karol Herbst):
  fix load store vectorizer

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-21 11:00:17 +00:00
Karol Herbst
b35e583c17 aco: use NIR_MAX_VEC_COMPONENTS instead of 4
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-21 11:00:16 +00:00
Karol Herbst
c83b1a4560 nir/serialize: cast swizzle before shifting
fixes undefined behaviour with enabled vec16

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-21 11:00:16 +00:00
Dave Airlie
e6b2af56cb llvmpipe: switch to NIR by default
Add LP_DEBUG=tgsi_ir (tgsi already taken) to fallback to TGSI paths.

Disable NIR_VALIDATE in CI (Michel/Eric acked)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303>
2019-12-21 13:07:17 +10:00
Dave Airlie
c717ac1247 gallivm/nir: wrap idiv to avoid divide by 0 (v2)
This code is taken from the TGSI paths, and should fix the regression
seens with GLES2

v2: use the udiv path which has d3d10 defined return.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303>
2019-12-21 13:06:58 +10:00
Marek Olšák
7d65614422 ac/surface: fix an assertion failure on gfx9 in CMASK computation
addrlib only allows the 2D resource type with CMASK.

Fixes: 69ea473eeb "amd/addrlib: update to the latest version"

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3187>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3187>
2019-12-20 22:57:08 +00:00
Afonso Bordado
3e1e4ad13d pan/midgard: Optimize comparisions with similar operations
Optimizes comparisions by removing the invert flag on operands
which we can prove to be equal without the invert.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036>
2019-12-20 22:36:06 +00:00
Erico Nunes
8e9e94d084 lima: set shader caps to optimize control flow
With these new caps, nir is able to unroll loops and optimize
conditionals much more efficiently in both gpit and ppir.
panfrost and vc4 were used as reference for the values.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
2019-12-20 20:59:15 +01:00
Erico Nunes
4322656dee lima/ppir: remove assert on ppir_emit_tex unsupported feature
This assert causes testing tools such as shaderdb to abort on some test
cases. This is an unsupported feature and not a compiler bug. The
compilation error is already propagated correctly, so we can remove the
assert to allow testing tools to run to completion.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
2019-12-20 20:58:50 +01:00
Erico Nunes
d56710ab82 lima/ppir: fix lod bias src
ppir has some code that operates on all ppir_src variables, and for that
uses ppir_node_get_src.
lod bias support introduced a separate ppir_src that is inaccessible by
that function, causing it to be missed by the compiler in some routines.
Ultimately this caused, in some cases, a bug in const lowering:

  .../pp/lower.c:42: ppir_lower_const: Assertion `src != NULL' failed.

This fix moves the ppir_srcs in ppir_load_texture_node together so they
don't get missed.

Fixes: 721d82cf06 lima/ppir: add lod-bias support

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
2019-12-20 19:39:55 +00:00
Andreas Baierl
1b0743dbb6 lima: Fix dump file creation
Otherwise lima_dump_file_next() always opens a new file and creates the
dumps regardless of what the environment variables say.

Fixes d71cd245d7 ('lima: Rotate dump files after each finished pp frame')

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>
2019-12-20 17:44:12 +01:00
Pierre-Eric Pelloux-Prayer
9c2a3b4e75 radeon/vcn2: enable rate control for hevc encoding
Based on b0626c1f30 ("radeon/vcn: enable rate control for hevc encoding").

Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2225
Fixes: 587b9c5dae ("radeon/vcn: implement vcn 2.0 encode")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3134>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3134>
2019-12-20 16:51:53 +01:00
Samuel Pitoiset
02dd1fb859 radv: rely on pipeline layout when creating push descriptors with template
descriptorSetLayout should be ignored for push descriptors. While
we are it, also ignore pipelineBindPoint.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2210
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3180>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3180>
2019-12-20 13:41:29 +01:00
Marek Vasut
f51ee564f5 etnaviv: Replace bitwise OR with logical OR
The test here is testing whether either variable is non-zero.
While currently the test works fine, it's fragile. Replace it
with logical OR to avoid the fragility.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-20 13:15:37 +01:00
Christian Gmeiner
6e75f2172b etnaviv: update resource status after flushing
Currently piglit spec@arb_occlusion_query@occlusion_query_conform
spins for ever as the resource status is never reset. See
etna_hw_get_query_result(..) for more details.

Fixes: 1456aa61cc ("etnaviv: Rework resource status tracking")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Marek Vasut <marex@denx.de>
2019-12-20 12:43:23 +01:00
Ross Zwisler
cabcbb4db0 intel: limit shader geometry on BDW GT1
Similar to the SKL GT1 fix introduced here:

b1ba7ffdbd

we need to limit the .urb.max_entries[MESA_SHADER_GEOMETRY] on BDW GT1
to address failures in these two tests:

dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d
dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_2d_array

The value 690 was found via bisection.  691 is the actual max on the
hardware I'm using, but 690 seemed like a nice round number.

Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3173>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3173>
2019-12-20 10:47:52 +00:00
Alyssa Rosenzweig
c57337bbd3 pan/midgard: Lower txd with lower_tex
This is a hack since we do have native gradient stuff, but for the
moment I'm more interested in conformance and the lowered code is good
enough. Fixes
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2d_fixed_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
2019-12-20 09:10:39 +01:00
Alyssa Rosenzweig
da73651da4 pan/midgard: Fix crash with txs
This regressed since we implemented RECT textures natively, oops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
2019-12-20 09:10:36 +01:00
Alyssa Rosenzweig
ccbc9a4e67 pan/midgard: Implement textureOffset for 2D textures
Fixes dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler2d_fixed_fragment.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
2019-12-20 09:10:26 +01:00
Samuel Pitoiset
2eef9e050f radv: ignore pColorBlendState if rasterization is disabled
Or if the subpass has no color attachments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:21:02 +01:00
Samuel Pitoiset
021c7b5309 radv: tidy up radv_pipeline_init_blend_state()
This is needed for the next commit because pColorBlendState can
actually be NULL but some fields might have to be initialized
(eg. alpha to coverage with no color attachments).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:58 +01:00
Samuel Pitoiset
ebc7a77869 radv: ignore pDepthStencilState if rasterization is disabled
Or if the subpass has no depth stencil attachment.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:55 +01:00
Samuel Pitoiset
ce67e41535 radv: ignore pTessellationState if the pipeline doesn't use tess
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:52 +01:00
Samuel Pitoiset
7735f314b7 radv: ignore pMultisampleState if rasterization is disabled
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:49 +01:00
Samuel Pitoiset
589bfcbde3 radv: init a default multisample state for the resolve FS path
pMultisampleState must be a valid pointer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:44 +01:00
Caio Marcelo de Oliveira Filho
4fbc99c124 spirv: Implement SPV_KHR_non_semantic_info
Do nothing for OpExtInst from extended instruction sets that name
start with "NonSemantic.".

Since they can be used within the "preamble" to annotate global
decorations, also don't stop iterating when one of them is found.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3154>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3154>
2019-12-19 22:49:39 -08:00
Jonathan Marek
13adce2845 turnip: disable B8G8R8 vertex formats
Looks like swap doesn't work as expected on these, disable them.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>
2019-12-19 19:03:02 -05:00
Jonathan Marek
54f72c83d6 util/format: add missing vulkan formats
Add some missing vulkan formats to util/format, this solves all the missing
pipe format cases for the formats that turnip supports.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>
2019-12-19 19:03:02 -05:00
Jonathan Marek
b9d4c10e4b turnip: minor warning fixes
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3177>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3177>
2019-12-19 23:21:01 +00:00
Andreas Baierl
d71cd245d7 lima: Rotate dump files after each finished pp frame
This rotates the dump files like the mali-syscall-tracker does.

After each finished pp frame a new file is generated. They are
numbered like lima.dump.0000, lima.dump.0001 ...
The filename and path can be given with the new environment
variable LIMA_DUMP_FILE.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>
2019-12-19 23:53:22 +01:00
Vasily Khoruzhick
039f3f6adb lima: drop suballocator
Since we're using a separate per-draw BO for GP outputs we don't
need suballocator anymore.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
2019-12-19 14:28:32 -08:00
Vasily Khoruzhick
9f72d7195a lima: use single BO for GP outputs
Varyings, gl_Position and gl_PointSize are all GP outputs, so we
can use a single BO for them all. Also that allows us to get rid
of suballocator.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
2019-12-19 14:28:32 -08:00
Jonathan Marek
06ae0674fd nir: fix assign_io_var_locations for vertex inputs
Also fixes fragment inputs using the wrong "base" value (which was working
only because FRAG_RESULT_DATA0 is less than VARYING_SLOT_VAR0)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3108>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3108>
2019-12-19 21:26:52 +00:00
Jonathan Marek
e9a32af3bf turnip: implement secondary command buffers
Uses a new "tu_cs_add_entries" function because tu_cs_emit_call doesn't
work inside draw_cs (which is already called by tu_cs_emit_call).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>
2019-12-19 20:42:08 +00:00
Jonathan Marek
85fff42d08 turnip: compute gmem offsets at renderpass creation time
This makes it easier to implement secondary command buffers, since we no
longer need to know the render area to set the gmem offsets for input
attachments and CmdClearAttachments.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>
2019-12-19 20:42:08 +00:00
Jonathan Marek
f81c41a812 turnip: emit_compute_driver_params fixes
Offset was wrong, it is in vec4 not dwords.

There's a hole between DP_NUM_WORK_GROUPS_Z and DP_LOCAL_GROUP_SIZE_X so
use the IR3 enums.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>
2019-12-19 15:13:40 -05:00
Jonathan Marek
bb134c5316 turnip: emit base instance vs driver param
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>
2019-12-19 15:13:40 -05:00
Jonathan Marek
a3a70588c0 freedreno/ir3: support load_base_instance
Not supported by hardware, uses same mechanism as base vertex.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>
2019-12-19 15:13:40 -05:00
Jonathan Marek
5c17d9b9ca freedreno/registers: document vertex/instance id offset bits
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>
2019-12-19 15:13:40 -05:00
Neha Bhende
83ad2e5084 st/mesa: release tgsi tokens for shader states
Since we are using st_common_variant while creating variant for vertext
program, we can release tokens created in st_create_vp_variant which
are already stored in respective states.
This fix memory leak found with piglit tests

Fixes bc99b22a30 ('st/mesa: use a separate VS variant for the draw module')

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-12-19 14:40:08 -05:00
Juan A. Suarez Romero
7f821289cb Revert "nir/lower_double_ops: relax lower mod()"
This reverts commit 8172b1fa03.

This commit was done taking in account Vulkan spec, but did not realize
it was affecting OpenGL too.

Closes: #2252
2019-12-19 20:01:16 +01:00
Kristian H. Kristensen
a4db9a1512 freedreno/a6xx: Set up multisample sysmem MRTs correctly
We had an extra factor of num_samples in the stride.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
e688a16e2b freedreno/a6xx: Rewrite compressed blits in a helper function
Similar to how we handle zs blits.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
f8c0ea61e4 freedreno/a6xx: Move handle_rgba_blit() up
If we move this function up, we don't have to forward declare it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
183d482f7f freedreno/a6xx: Handle srgb blits on the blitter
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
3a18e5d420 freedreno/a6xx: Use A6XX_SP_2D_SRC_FORMAT_MASK macro
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
e4c2bb6a93 freedreno/a6xx: RB6_R8G8B8 is actually 32 bit RGBX
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
8089fb2e62 freedreno/a6xx: Use blitter for resolve blits
We have a SAMPLES_AVERAGE bit that does what we need for resolving
multisample buffers - let's use it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
1d7267fc91 freedreno/a6xx: Add fd_resource_swap() helper
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
e0ebaa819d freedreno/a6xx: Pick blitter swap based on resource tiling
The linear levels in a tiled resource are stored in the canonical
swap, WZYX.  We need to pick the swap based on whether or not the
resource is tiled, not whether the the level in question is tiled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
b59222640e freedreno/a6xx: Program sampler swap based on resource tiling
It doesn't matter whether or not the level in question is linear.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
a2f6c44a1c freedreno: Add debug flag for forcing linear layouts
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Kristian H. Kristensen
d908a2ab18 freedreno/a6xx: Make DEBUG_BLIT_FALLBACK only dump fallbacks
Use new macro, DEBUG_BLIT, for dumping all blits.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
2019-12-19 09:56:05 -08:00
Jonathan Marek
fe4a8df9a8 freedreno/ir3: fix vertex shader sysvals with pre_assign_inputs
The first pre_assign_inputs loop doesn't pre-assign sysvals, so skip the
second part for sysvals.

The sysvals don't need to be pre-assigned since the state for those isn't
shared between binning / nonbinning shaders.

Fixes assert failures in cases where the sysvals didn't end up in the same
registers for binning / nonbinning.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3168>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3168>
2019-12-19 11:31:12 -05:00
Thong Thai
2add63060b st/va: Convert interlaced NV12 to progressive
In vlVaDeriveImage, convert interlaced NV12 buffers to progressive.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1193
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3157>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3157>
2019-12-19 15:49:09 +00:00
Alyssa Rosenzweig
5710250074 pan/midgard: Add uniform/work heuristic
Uniform/work registers are partitioned on a shader-by-shader basis as
determined by the compiler. We add a simple heuristic here running
before scheduling that prioritizes mitigating spilling at all costs.

A more sophisticated heuristic should run *after* scheduling, doing a
dry run of the register allocator itself to determine spilling. Fitting
this into our current scheduling model is difficult, so while this
heuristic does hurt some shaders, overall the results are acceptable:

total instructions in shared programs: 50065 -> 38747 (-22.61%)
instructions in affected programs: 37187 -> 25869 (-30.44%)
helped: 59
HURT: 77
helped stats (abs) min: 1 max: 757 x̄: 198.46 x̃: 151
helped stats (rel) min: 0.48% max: 62.89% x̄: 32.95% x̃: 36.27%
HURT stats (abs)   min: 1 max: 9 x̄: 5.08 x̃: 6
HURT stats (rel)   min: 0.92% max: 14.29% x̄: 6.71% x̃: 4.60%
95% mean confidence interval for instructions value: -111.15 -55.29
95% mean confidence interval for instructions %-change: -14.33% -6.67%
Instructions are helped.

total bundles in shared programs: 30606 -> 19157 (-37.41%)
bundles in affected programs: 23907 -> 12458 (-47.89%)
helped: 58
HURT: 74
helped stats (abs) min: 6 max: 757 x̄: 203.09 x̃: 152
helped stats (rel) min: 5.19% max: 77.00% x̄: 49.38% x̃: 53.79%
HURT stats (abs)   min: 1 max: 9 x̄: 4.46 x̃: 5
HURT stats (rel)   min: 1.85% max: 26.32% x̄: 11.70% x̃: 9.57%
95% mean confidence interval for bundles value: -115.46 -58.01
95% mean confidence interval for bundles %-change: -20.87% -9.41%
Bundles are helped.

total quadwords in shared programs: 31305 -> 32027 (2.31%)
quadwords in affected programs: 20471 -> 21193 (3.53%)
helped: 0
HURT: 133
HURT stats (abs)   min: 1 max: 9 x̄: 5.43 x̃: 5
HURT stats (rel)   min: 0.76% max: 15.15% x̄: 5.47% x̃: 4.65%
95% mean confidence interval for quadwords value: 5.00 5.86
95% mean confidence interval for quadwords %-change: 4.85% 6.08%
Quadwords are HURT.

total registers in shared programs: 2256 -> 2545 (12.81%)
registers in affected programs: 708 -> 997 (40.82%)
helped: 0
HURT: 95
HURT stats (abs)   min: 1 max: 8 x̄: 3.04 x̃: 3
HURT stats (rel)   min: 12.50% max: 100.00% x̄: 39.41% x̃: 37.50%
95% mean confidence interval for registers value: 2.64 3.45
95% mean confidence interval for registers %-change: 34.62% 44.19%
Registers are HURT.

total threads in shared programs: 1776 -> 1709 (-3.77%)
threads in affected programs: 134 -> 67 (-50.00%)
helped: 0
HURT: 67
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.

total spills in shared programs: 3868 -> 2 (-99.95%)
spills in affected programs: 3868 -> 2 (-99.95%)
helped: 60
HURT: 0

total fills in shared programs: 6456 -> 4 (-99.94%)
fills in affected programs: 6456 -> 4 (-99.94%)
helped: 60
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>
2019-12-19 15:22:39 +00:00
Samuel Pitoiset
13b4e9adcf ac: declare an enum for the OOB select field on GFX10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
2019-12-19 15:15:32 +01:00
Samuel Pitoiset
f3cccd05d9 radv/gfx10: fix the out-of-bounds check for vertex descriptors
When stride is 0, it should check against the offset not the index.

This fixes black character models with Beat Saber and missing snow
with Dragon Quest.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2233
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1975
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
2019-12-19 15:15:30 +01:00
Juan A. Suarez Romero
8172b1fa03 nir/lower_double_ops: relax lower mod()
Currently when lowering mod() we add an extra instruction so if
mod(a,b) == b then 0 is returned instead of b, as mathematically
mod(a,b) is in the interval [0, b).

But Vulkan spec has relaxed this restriction, and allows the result to
be in the interval [0, b].

This commit takes this in account to remove the extra instruction
required to return 0 instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922>
2019-12-19 12:36:30 +00:00
Erik Faye-Lund
af65bfb38f zink: implement nir_texop_txd
This lets us enable PIPE_CAP_FRAGMENT_SHADER_TEXTURE_LOD, which in turns
gives us ARB_shader_texture_lod.

Still fails one piglit test on ANV, namely
spec@arb_shader_texture_lod@execution@arb_shader_texture_lod-texgradcube,
but with 33 new passing tests, I think this is worth it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3140>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3140>
2019-12-19 13:14:29 +01:00
Erik Faye-Lund
b31d1b73bc zink: enable PIPE_CAP_MIXED_COLORBUFFER_FORMATS
This just works in Vulkan, there's no work neeed to enable it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3148>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3148>
2019-12-19 10:08:13 +01:00
Jonathan Marek
5785bcc8a0 turnip: don't set SP_FS_CTRL_REG0_VARYING if only fragcoord is used
Fixes artifacts in the subpasses demo, which has a shader using fragcoord
without any varyings. It looks like setting this bit when there are no
varyings can cause weirdness in some cases (without this change, if the
previous shader had <= 8 varyings it would work, but with 9 varyings it
would have artifacts).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
2019-12-18 19:03:37 -05:00
Jonathan Marek
4a59bc6df2 turnip: add cache invalidate to fix input attachment cases
Fixes artifacts in the subpasses demo.

Workaround texture cache with input attachments from GMEM by adding a cache
invalidate between subpasses.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>
2019-12-18 19:03:37 -05:00
Lionel Landwerlin
fc2552b644 loader: fix close on uninitialized file descriptor value
Using a drm syscall layer faking a kernel driver :

  ==581460== Conditional jump or move depends on uninitialised value(s)
  ==581460==    by 0x48A4C2B: close (drm-hooks.cpp:185)
  ==581460==    by 0x5A815F1: dri3_alloc_render_buffer (loader_dri3_helper.c:1469)
  ==581460==    by 0x5A82050: dri3_get_buffer (loader_dri3_helper.c:1827)
  ==581460==    by 0x5A82662: loader_dri3_get_buffers (loader_dri3_helper.c:2028)
  ==581460==    by 0x6C78109: intel_update_image_buffers (brw_context.c:1870)
  ==581460==    by 0x6C77805: intel_update_renderbuffers (brw_context.c:1499)
  ==581460==    by 0x6C7789D: intel_prepare_render (brw_context.c:1520)
  ==581460==    by 0x6C773D4: intelMakeCurrent (brw_context.c:1341)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 069fdd5f9f ("egl/x11: Support DRI3 v1.1")
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3152>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3152>
2019-12-19 00:51:36 +02:00
Connor Abbott
648cc22afb freedreno: Fix CP_MEM_TO_REG flag definitions
These actually mean something completely different, at least on A5xx
and A6xx. The only other usage of the old flags on something older than
A6xx was a typo, so I don't know if it was always this way, but at the
same time it means that we don't have to worry too much about that.

Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
2019-12-18 23:09:05 +01:00
Connor Abbott
4c5ac156c3 freedreno: Use new macros for CP_WAIT_REG_MEM and CP_WAIT_MEM_GTE
Similar to the existing usage for CP_COND_WRITE5, this makes it clear
what each of the magic parameters are for.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
2019-12-18 23:09:00 +01:00
Connor Abbott
cfa1fb895a a6xx: Add more CP packets
And add fields uncovered by looking at the firmware. I think this covers
all the memory, register, and scratch manipulation opcodes that exist on
A6xx, plus one additional nice find for Vulkan and describing a
previously unknown opcode and documenting CP_WAIT_REG_MEM.

Note that the bits for the CP_REG_TO_MEM count, as well as the formula
for computing the actual count for both CP_REG_TO_MEM and CP_MEM_TO_REG,
are changed because the A630 SQE firmware actually does something
different. I haven't investigated older microcodes to see whether this
extends back to A5xx and A4xx, but the only non-A6xx uses of this
field result in the same bit-pattern when using the A6xx bit range and
formula, so it should be safe to change the definition universally.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
2019-12-18 23:08:55 +01:00
Bas Nieuwenhuizen
a9a3108be7 radv: Limit workgroup size to 1024.
Fixes a hang with geekbench.

The existence of RX 580 and NAVI10 results shows that the generations
before and after this do not have the issue. (They show up on the
website). So this is likely a GFX9 only issue.

This is not something weird like LDS size since none of the shaders
seem to use LDS.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
2019-12-18 20:41:18 +00:00
Dylan Baker
69decdb28a docs: Add release notes, news, and update calendar for 19.2.8 2019-12-18 11:25:32 -08:00
Dylan Baker
7017f69a64 docs/relnotes/19.2.8: Add SHA256 sum 2019-12-18 11:24:46 -08:00
Dylan Baker
2f724d2202 docs: add relnotes for 19.2.8 2019-12-18 11:24:44 -08:00
Dylan Baker
d32e1257c0 docs: Add release notes, update calendar, and add news for 19.3.1 2019-12-18 10:58:54 -08:00
Dylan Baker
636175da6d dcos: add releanse notes for 19.3.1 2019-12-18 10:57:54 -08:00
Lionel Landwerlin
afdc0121b5 i965/iris/perf: factor out frequency register capture
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
2019-12-18 14:23:17 +02:00
Jonathan Marek
072e95e07a freedreno/ir3: update prefetch input_offset when packing inlocs
If the input location changes then prefetch input_offset needs to change.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3141>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3141>
2019-12-17 16:41:13 -05:00
Eric Anholt
62998f6e2d ci: Fix caselist results archiving after parallel-deqp-runner rename.
Noticed while reviewing some lava parallel-deqp-runner changes.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3138>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3138>
2019-12-17 20:13:10 +00:00
Kristian H. Kristensen
9aaa23fbad freedreno/a6xx: Document the CP_SET_DRAW_STATE enable bits
There are bits for binning, gmem and sysmem.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>
2019-12-17 11:45:20 -08:00
Caio Marcelo de Oliveira Filho
c61ad77cd2 anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT)
For the sake of our testing infrastructure, disable this extension
for TGL until we can sort out a hang in Vulkan CTS.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-17 11:07:41 -08:00
Caio Marcelo de Oliveira Filho
766fdeccf9 intel/vec4: Fix lowering of multiplication by 16-bit constant
Existing code was ignoring whether the type of the immediate source
was signed or not.  If the source was signed, it would ignore small
negative values but it also would wrongly accept values between
INT16_MAX and UINT16_MAX, causing the atual value to later be
reinterpreted as a negative number (under 16-bits).

Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for older
platforms that don't support MUL with 32x32 types and use vec4.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-17 10:45:22 -08:00
Caio Marcelo de Oliveira Filho
2137be22fa intel/fs: Fix lowering of dword multiplication by 16-bit constant
Existing code was ignoring whether the type of the immediate source
was signed or not.  If the source was signed, it would ignore small
negative values but it also would wrongly accept values between
INT16_MAX and UINT16_MAX, causing the atual value to later be
reinterpreted as a negative number (under 16-bits).

Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for platforms
that don't support MUL with 32x32 types, including ICL and TGL.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2186
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-17 10:45:22 -08:00
Alyssa Rosenzweig
66013cb1be pan/midgard: Set Z to shadow comparator for 2D
We still need to generalize for other types of (non-2D / array) shadow
samplers, but this is enough for sampler2DShadow to work with initial
dEQP tests passing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig
1a53bed41c pan/midgard: Set .shadow for shadow samplers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig
d183f84585 pan/midgard: Hoist temporary coordinate for cubemaps
We'll reuse some of this code for shadow samplers, which are represented
by a distinct source in NIR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig
96df5f1fbf pan/midgard: Use a reg temporary for mutiple writes
Bug in texelfetch implementation from inspection.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig
bf5d8cfd28 panfrost: Handle empty shaders
I didn't realize this was in spec, but it fixes a crash in shaderdb.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig
35418f6770 panfrost: Let precompile imply shaderdb
This cuts down the number of random environmental variables we need
flying around; now PAN_MESA_DEBUG=precompile is sufficient and
MIDGARD_MESA_DEBUG=shaderdb will be implied.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig
271726eaca panfrost: Add PAN_MESA_DEBUG=precompile for shader-db
We would like to use run.c for shader-db runs (rather than capturing in
real-time, which is limiting).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
2019-12-17 17:42:57 +00:00
Lionel Landwerlin
2c8742ed85 mesa: avoid triggering assert in implementation
When tearing down a GL context with an active performance query, the
implementation can be confused by a query marked active when it's
being deleted.

This shouldn't happen in the implementation because the context will
already be idle.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2235
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>
2019-12-17 12:52:04 +00:00
Samuel Pitoiset
d399f4f414 radv/gfx10: fix ngg_get_ordered_id
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
2019-12-17 12:34:18 +00:00
Neil Armstrong
089c8f0b8d ci: Remove T820 from CI temporarily
Our lab will have continuous programmed power cuts until the 6th January 2020,
so it's safer to disable the T820 CI running on the BayLibre kernelCI lab
to avoid breaking CI.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3135>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3135>
2019-12-17 11:40:07 +01:00
Tapani Pälli
75caae2268 i965: expose MESA_FORMAT_B8G8R8X8_SRGB visual
Patch adds BGRX sRGB visuals, required format translation information
to the __DRI_IMAGE_FOURCC_SXRGB8888 format and makes all BGRX visuals
sRGB capable just like is done with BGRA.

squashed patches from Yevhenii Kolesnikov:
  dri: Add __DRI_IMAGE_FOURCC_SXRGB8888 conversion
  i965: force visuals without alpha bits to use sRGB

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1501
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>
2019-12-17 09:28:25 +00:00
Tapani Pälli
8b6b5ce669 dri: add __DRI_IMAGE_FORMAT_SXRGB8
Add format definition and required plumbing to create images.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>
2019-12-17 09:28:25 +00:00
Gert Wollny
cffa7bb990 virgl: Increase the shader transfer buffer by doubling the size
With only linearly increasing the size of the shader transfer buffer
the transfer of very large shaders may fail, so with each attempt double
the size of the buffer.

CTS:
  dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.48
  for VTK-GL-CTS b5dcfb9c5 and newer

virglrenderer bug:
  https://gitlab.freedesktop.org/virgl/virglrenderer/issues/150

Fixes: a8987b88ff
    virgl: add driver for virtio-gpu 3D (v2)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>
2019-12-17 08:07:51 +00:00
Eric Anholt
2da68c8649 turnip: Fix support for immutable samplers.
We were setting up the hardware sampler state when updating a combined
image sampler, but never looking at the immutable sampler for in the
separate case.

Fixes failures in
dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_immutable.fragment.*

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3127>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3127>
2019-12-16 19:51:27 -08:00
Jonathan Marek
edfc4daab8 turnip: don't set LRZ enable at end of renderpass
Fixes hanging with cases that use more than one renderpass.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3122>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3122>
2019-12-17 00:59:00 +00:00
Jonathan Marek
c7c5a84cf3 freedreno/ir3: lower pack/unpack ops
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>
2019-12-16 19:20:07 -05:00
Jonathan Marek
004797002f nir: add option to lower half packing opcodes
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>
2019-12-16 19:20:07 -05:00
Eric Anholt
2d3182b429 turnip: Add support for descriptor arrays.
I had a bigger rework I was working on, but this is simple and gets tests
passing.

Fixes 36 failures in
dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_mutable.fragment.*
(now all passing)

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>
2019-12-16 23:57:22 +00:00
Eric Anholt
02d764b96a turnip: Drop unused variable.
We really need -Werror in CI.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>
2019-12-16 23:57:22 +00:00
Alyssa Rosenzweig
0eb84eb702 panfrost: Don't double-create scratchpad
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 4f7fddbd71 ("panfrost: Pass size to panfrost_batch_get_scratchpad")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119>
2019-12-16 23:32:07 +00:00
Alyssa Rosenzweig
73bd9fe20c panfrost: Simplify sampler upload condition
Makes it more obvious what's going on.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119>
2019-12-16 23:32:06 +00:00
Icecream95
37bc028367 gallium/auxiliary: Handle count == 0 in u_vbuf_get_minmax_index_mapped
This makes u_vbuf_get_minmax_index_mapped return min = 0 / max = 0
when info->count == 0.

That should never happen anyway, but this commit makes it at least
return a sane value that callers expect, and also allows us - and
GCC - to assume count != 0 for optimization purposes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
2019-12-16 22:57:35 +00:00
Icecream95
80aca96803 gallium/auxiliary: Reduce conversions in u_vbuf_get_minmax_index_mapped
With this patch, GCC generates vectorized code that does the comparisons
without converting the indices to 32-bit first.

This optimization makes the aforementioned function almost twice as fast
for ARM NEON, and should speed up vectorised code on other platforms.

Without vectorisation, the function is still a percent or two faster,
but slightly larger.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>
2019-12-16 22:57:35 +00:00
Marek Olšák
69ea473eeb amd/addrlib: update to the latest version
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-16 17:04:57 -05:00
Jonathan Marek
a3ea4805aa turnip: remove duplicate A6XX_SP_CS_CONFIG_NIBO
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
2019-12-16 21:04:42 +00:00
Jonathan Marek
2d3492bc62 turnip: change emit_ibo to be like emit_textures
Adds missing alignment and error checking.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
2019-12-16 21:04:42 +00:00
Jonathan Marek
718bd4f8b4 turnip: fix emit_ibo
Based on the GL driver:
-Compute needs different opcode (this fixes a GPU hang problem)
-REG_A6XX_SP_IBO_LO/REG_A6XX_SP_CS_IBO_LO were swapped

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
2019-12-16 21:04:42 +00:00
Jonathan Marek
65007d438c turnip: remove compute emit_border_color
Current tu6_emit_border_color doesn't work for compute and there's no
example from the GL driver to base it on, so replace it with a finishme.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
2019-12-16 21:04:42 +00:00
Jonathan Marek
c9b12c71d7 turnip: fix emit_textures for compute shaders
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>
2019-12-16 21:04:42 +00:00
Rafael Antognolli
ed43d01dec utils/os_socket: Define ssize_t on windows.
Fixes: ef5266ebd5 ("util/os_socket: Add socket related functions.")

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-16 20:35:22 +00:00
Marek Olšák
43f05e0421 radeonsi/gfx10: fix ngg_get_ordered_id
This could have caused issues with NGG streamout.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
8edf3df3e4 radeonsi: reset more fields in si_llvm_context_set_ir to fix reusing ctx
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
1436c261e9 radeonsi: fix determining whether the VS prolog is needed
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
378444ce90 radeonsi: allow generating VS prologs with 0 inputs
If "ls_vgpr_fix" is set, we use a prolog, but it can have 0 inputs.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
4846aeaf57 radeonsi/gfx10: don't insert NGG streamout atomics if they are never used
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
de4a4595f6 radeonsi: don't wrap the VS prolog in if (ES thread) .. endif
We can execute it unconditionally and the values computed for disabled
threads won't be used anyway.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
db67e51903 radeonsi: set is_monolithic for VS prologs when the shader is really monolithic
This fixes a bug with NGG that is probably harmless.

Basically, !is_monolithic makes the VS prolog emit
llvm.amdgcn.init.exec.from.input, which sets the EXEC mask to only enable
ES threads. In the NGG non-GS case, the GS threads <= ES threads, so it was
never an issue.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
451bc91158 radeonsi: disallow compute-based culling if polygon mode is enabled
Polygon mode can generate thick points or lines.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
1a07df840e radeonsi: deduplicate ES and GS thread enablement code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
f90cbd18ff ac: fix the return value in cull_bbox when bbox culling is disabled
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
e5e3ffa6b9 ac: fix ac_get_i1_sgpr_mask for Wave32
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Alyssa Rosenzweig
5386b7e011 panfrost: Remove asserts in panfrost_pack_work_groups_compute
It's a hot routine and these are exceedingly unlikely to break.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>
2019-12-16 19:48:28 +00:00
Alyssa Rosenzweig
6378797a6d panfrost: Pack invocation_shifts manually instead of a bit field
gcc generates exceptionally bad code for panfrost_pack_work_groups_fused
otherwise ... although that routine is somehow still hot ...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>
2019-12-16 19:48:28 +00:00
Iván Briano
a649bbffee anv: Export VK_KHR_buffer_device_address only when really supported
Fixes: 1b6991ba1d ("anv: Implement VK_KHR_buffer_device_address")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
2019-12-16 19:24:46 +00:00
Iván Briano
0fd93b9589 anv: Export filter_minmax support only when it's really supported
Fixes: bea4d4c78c ("anv: add VK_EXT_sampler_filter_minmax support")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
2019-12-16 19:24:46 +00:00
Jonathan Marek
b936143327 freedreno/ir3: lower mul_2x32_64
lower_mul_2x32_64 generates mul_high opcodes, and lower_mul_high is done by
nir_lower_alu, so call nir_lower_alu after nir_opt_algebraic.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-16 13:37:09 -05:00
Jonathan Marek
d4676d7a16 turnip: implement CmdFillBuffer/CmdUpdateBuffer
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-16 13:13:53 -05:00
Jonathan Marek
8d893a2071 turnip: don't require src image to be set for clear blits
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-16 13:13:53 -05:00
Jonathan Marek
f78c4251f1 turnip: use common blit path for buffer copy
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-16 13:13:53 -05:00
Jonathan Marek
d6c8aa2b72 turnip: use single substream cs
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-16 13:13:53 -05:00
Alyssa Rosenzweig
8959364937 panfrost: Remove fbd_type enum
Just use the MALI_MFBD tag directly; it's clean.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
2019-12-16 12:51:03 -05:00
Alyssa Rosenzweig
5408700a12 ci: Reinstate Panfrost CI
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
2019-12-16 12:51:03 -05:00
Alyssa Rosenzweig
caf55e7bfd panfrost: Fix FBD issue
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: b0e915b4e6 ("panfrost: Emit SFBD/MFBD after a batch, instead of before")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>
2019-12-16 12:50:26 -05:00
Lionel Landwerlin
bc36160ccb vulkan/wsi: error out when image fence doesn't signal
If for some reason the fence associated with an image doesn't signal,
we're likely in a device lost scenario, we should report that error.

We can't really wait for a given amount of time because we could get a
timeout and that is not a valid error to report for vkQueuePresentKHR,
so just wait forever.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/830
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-12-16 14:59:10 +02:00
Lionel Landwerlin
c056193288 anv: drop unused parameter from apply layout pass
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-16 14:35:25 +02:00
Lionel Landwerlin
7c223cf316 anv: constify pipeline layout in nir passes
Was hoping to find potential issues but nothing. Still probably a good
idea.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-16 14:35:22 +02:00
Alyssa Rosenzweig
e7721d8775 pan/midgard: Set r1.w magic
I'm honestly unsure what this is for, but it's needed on MFBD systems
for unknown reasons, at least when MRT is actually in use and then
sometimes without MRT (it fixes a blend shader issue on T760?)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
2019-12-16 09:10:33 +00:00
Alyssa Rosenzweig
3448b2641a pan/midgard: Fix liveness analysis with multiple epilogues
Epilogues are special fixed-function blocks, so they need special
handling for liveness analysis to work completely. This in turns fixes
RA issues for many shaders using MRT.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
2019-12-16 09:10:33 +00:00
Alyssa Rosenzweig
60396340f5 pan/midgard: Writeout per render target
The flow is considerably more complicated. Instead of one writeout loop
like usual, we have a separate write loop for each render target. This
requires some scheduling shenanigans to get right.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
2019-12-16 09:10:33 +00:00
Alyssa Rosenzweig
281cc6f9a6 pan/midgard: Add schedule barrier after fragment writeout
This is a branch, like discard, so we need a barrier to make it safe.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
2019-12-16 09:10:33 +00:00
Alyssa Rosenzweig
a2d5503b68 panfrost: Pass blend RT number through
We have to key the blend shader for the render target number due to
writeout silliness.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>
2019-12-16 09:10:33 +00:00
Pierre-Eric Pelloux-Prayer
2c1983aefe gallium: refuse to create buffers larger than UINT32_MAX
pipe_resource.width0 is 32 bits and hardware support for bigger buffer is
limited (eg: AMD hardware doesn't support buffer shader resources bigger
than 4GB).

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2053
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2948>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2948>
2019-12-16 09:30:14 +01:00
Pierre-Eric Pelloux-Prayer
0e286f6cbf radeonsi: disable dcc for 2x MSAA surface and bpe < 4
This fixes a series of dEQP tests on Raven platforms:
  - dEQP-GLES3.functional.fbo.msaa.2_samples.rgba4
  - dEQP-GLES3.functional.fbo.msaa.2_samples.rgb5_a1
  - dEQP-GLES3.functional.fbo.msaa.2_samples.rgb565
  - dEQP-GLES3.functional.fbo.msaa.2_samples.rg8
  - dEQP-GLES3.functional.fbo.msaa.2_samples.r16f

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3090>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3090>
2019-12-16 08:08:08 +00:00
Iago Toral Quiroga
4202cf8bf1 v3d: expose OES_geometry_shader
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
ba7bc83dd5 v3d: support precompiling geometry shaders
At present, this is only relevant for shader-db.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
7cee56b1df v3d: disable lowering of indirect inputs
V3D can do indirect inputs so we don't need it. Also, the lowering
produces horrible if-ladder code that is particularly bad for geometry
shaders where inputs are always arrays and shader bodies usually have
a loop indexing into them.

This fixes a couple of geometry shader tests in CTS that would fail to
register allocate otherwise.

There are no changes in shader-db.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
a1b7c0844d v3d: fix primitive queries for geometry shaders
With geometry shaders the number of emitted primitived is decided
at run time, so we cannot precompute it in the CPU and we need to
use the PRIMITIVE_COUNTS_FEEDBACK commands to have the GPU provide
the number like we do for the number of primitives written to
transform feedback. This may have a performance impact though, since
it requires a sync wait for the draw to complete, so we only do
it when geometry shaders are present.

v2: remove '> 0' comparison for ponter type (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
6c7a2b69f8 v3d: handle writes to gl_Layer from geometry shaders
When geometry shaders write a value to gl_Layer that doesn't correspond to
an existing layer in the target framebuffer the rendering behavior is
undefined according to the spec, however, there are CTS tests that trigger
this scenario on purpose, probably to ensure that nothing terrible happens.

For V3D, this situation is problematic because the binner uses the layer
index to select the offset to write into the tile state data, and we only
allocate tile state for MAX2(num_layers, 1), so we want to make sure we
don't produce values that would lead to out of bounds writes. The simulator
has an assert to catch this, although we haven't observed issues in actual
hardware it is probably best to play safe.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
45bc61add0 v3d: move layer rendering to a separate helper
This helps with reducing nesting level after adding the loop
to handle layered rendering.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
74a59fdc6e v3d: support rendering to multi-layered framebuffers
When doing layered rendering the binning stage will prepare per-tile
lists for each layer in the framebuffer, so we need to make sure
we allocate enough space for them .

We also need to emit the NUMBER_OF_LAYERS packet. This is required
even when the number of layers is only 1, otherwise the simulator
detects buffer overflows in the tile_state BO during some CTS test
cases involving layered FBOs.

When rendering, we need to emit commands for each layer of the
framebuffer separately and make sure we address the correct layers for
each one.

v2: fixed typo in comment (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
a0c94c70ee v3d: do not limit new CL space allocations with branch to 4096 bytes
For layered rendering we need to emit per layer rendering commands
lists so we we can end up requiring a fairly large buffer for this
if the number of layers is large enough.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
56ba6f42e2 v3d: remove obsolete assertion
OES_geometry_shader introduced the concept of layered framebuffers.
Removing this assertion gets a bunch of CTS tests to pass. We will
also need layered images to implement layered rendering with geometry
shaders.

v2: fix typo in commit message (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
e054fe0167 v3d: support transform feedback with geometry shaders
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
e54cf64939 v3d: save geometry shader state for blitting
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
a6b318ef52 v3d: predicate geometry shader outputs inside non-uniform control flow
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
b636d4ebc7 v3d: don't try to render if shaders failed to compile
This is the same we do in the compute path to avoid crashes
at draw time.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
e2f2263433 v3d: add support for adjacency primitives
v2: remove obsolete comment (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
a07d70c54b v3d: we always have at least one output segment
If we program an output size of 0 the simulator asserts. This was
not a problem until now because our VS would always have to
emit fixed function outputs, however, now that it can be paired
with a GS we can end up with a VS shader that no longer emits
any outputs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
76fc8c8bb1 v3d: compute appropriate VPM memory configuration for geometry shader workloads
Geometry shaders can output many vertices and thus have higher VPM memory
pressure as a result. It is possible that too wide geometry shader dispatches
exceed the maximum available VPM output allocated, in which case we need
to reduce the dispatch width until we can fit the VPM memory requirements.
Supported dispatch widths for geometry shaders are 16, 8, 4, 1.

There is a limit in the number of VPM output sectors that can be used by a
geometry shader that we can meet by lowering the dispatch width at compile
time, however, at draw time we need to revisit this number and, together with
other elements that can contribute to total VPM memory requirements, decide
on a configuration that can fit the program into the available VPM memory.
Ideally, we also want to aim for not using more than half of the available
memory so we that we can run a pair of bin and render programs in parallel.

v2: fixed language in comment and typo in commit log. (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
76f4c83815 v3d: add 1-way SIMD packing definition
According to the documentation, the 1-way dispatch width is only supported
with geometry shaders.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
4f5fbd6490 v3d: implement geometry shader instancing
v2:
 - Remove unused field uses_iid from v3d_gs_prog_data (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
8a81ac2eed v3d: emit geometry shader state commands
This is good enough to get basic GS workloads working, later patches will
improve this by adding instancing support, proper SIMD configuration, etc.

Notice that most of the TESSELLATION_GEOMETRY_SHADER_PARAMS fields are only
relevant when tessellation shaders are present. We do not support tessellation
yet, but we still need to fill in these tessellation state with default values
since our packing functions require some of these to have non-zero values.

v2:
 - Add a comment in the code explaining why we fill in
   tessellation fields (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
0934bd4460 v3d: fix packet descriptions for geometry and tessellation shaders
Every code address starts at bit 3 (addresses must be 64-bit aligned),
with the first 3 bits used to specify threading and NaN propagation
parameters for the shader program.

We generally skip "reserved" bits, however, doing this when the
reserved field is the last in a struct and it is large enough can
make us compute incorrect (smaller) struct sizes which can
lead to corrupt CLs. In particular, the "Tess/Geom Common Params"
struct has a reserved field at the end that is 8-bit, so if we
don't include this we compute a packet size that is 1 byte smaller
than it shold, making the next packet we emit start 1 byte
earlier and therefore leading to incorrect CL data from that point
forward.

The name of one of the fields was not correct.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
5d578c27ce v3d: add initial compiler plumbing for geometry shaders
Most of the relevant work happens in the v3d_nir_lower_io. Since
geometry shaders can write any number of output vertices, this pass
injects a few variables into the shader code to keep track of things
like the number of vertices emitted or the offsets into the VPM
of the current vertex output, etc. This is also where we handle
EmitVertex() and EmitPrimitive() intrinsics.

The geometry shader VPM output layout has a specific structure
with a 32-bit general header, then another 32-bit header slot for
each output vertex, and finally the actual vertex data.

When vertex shaders are paired with geometry shaders we also need
to consider the following:
  - Only geometry shaders emit fixed function outputs.
  - The coordinate shader used for the vertex stage during binning must
    not drop varyings other than those used by transform feedback, since
    these may be read by the binning GS.

v2:
 - Use MAX3 instead of a chain of MAX2 (Alejandro).
 - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro)
 - Update comment in IO owering so it includes the GS stage (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
f63750accf v3d: remove unused variable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
52cbef0039 v3d: enable debug options for geometry shader dumps
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
d6b0786a38 v3d: add debug assert
While lowering vpm outputs we look for the NIR variables matching
particular store output instructions and we expect to find a match,
so assert on that.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
6e68f74395 v3d: add missing plumbing for VPM load instructions
We will need to use LDVPMG_IN specifically to read VPM inputs
in geometry shaders.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Eric Anholt
f58ef5d481 turnip: Lower usub_borrow.
Fixes dEQP-VK.glsl.builtin.function.integer.usubborrow.uvec2_mediump_fragment.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>
2019-12-16 04:52:09 +00:00
Caio Marcelo de Oliveira Filho
c06ba83589 intel/fs: Lower 64-bit MOVs after lower_load_payload()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>
2019-12-14 21:12:21 +00:00
Bas Nieuwenhuizen
b53856aca3 amd/common: Always use addrlib for HTILE tc-compat.
Even without depth+stencil addrlib can (correctly!) decide to
disable tc compatible HTILE.

One example is 8x sampling with 32-bit depth on Stoney. The row size
on Stoney is 1024, while the tile size is 2048, which results in
tile splits which are not supported with tc-compat.

On Stoney, this fixes
dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
2019-12-14 20:39:29 +00:00
Bas Nieuwenhuizen
e197fb1c2f amd/common: Fix tcCompatible degradation on Stoney.
addrlib sometimes returns smaller sizes for tcCompat as it does
not seem to take into account the depth+stencil matching config
gymnastics with tcCompat.

This fixes
dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
2019-12-14 20:39:29 +00:00
Denis Pauk
6bf14e9c47 docs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe, swr
Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
CC: Bruce Cherniak <bruce.cherniak@intel.com>
CC: Matt Turner <mattst88@gmail.com>
2019-12-14 20:02:10 +00:00
Denis Pauk
3acc15f4f0 gallium/swr: Enable support bptc format.
Reuse Code from:
f69bc797e1 gallium/auxiliary: Add helper support for bptc format compress/decompress

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Tim Rowley <timothy.o.rowley@intel.com>
2019-12-14 20:02:10 +00:00
Rob Clark
1bf3837395 freedreno/a6xx: fix OUT_REG() vs growable cmdstream
BEGIN_RING() could decide we can't fit the next packet in the current
cmdstream segment, and grow a new segment.  So we need to grab ring->cur
*after* BEGIN_RING(), otherwise we are writing cmdstream past the end of
the previous segment.

Fixes: bdd98b892f ("freedreno: New struct packing macros")
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-14 09:12:39 -08:00
Erico Nunes
ce52b49348 lima: split draw calls on 64k vertices
The Mali400 only supports draws with up to 64k vertices per command.
To handle this, break the draw_vbo call into multiple commands.
Indexed drawing is left to a separate code path.
This implementation was ported from vc4_draw_vbo.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Erico Nunes
6d46d0e82b vc4: move the draw splitting routine to shared code
This can also be useful for other hardware which has similar limitations
on vertex count per single draw.
The Mali400 has a similar limitation and can reuse this.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Erico Nunes
2d7be5f01f lima: refactor indexed draw indices upload
As of this commit this is just a refactor in preparation to enable
support for more than 64k vertices.
To support splitting the draw_vbo call, indices shouldn't be re-uploaded
every time.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Erico Nunes
270c282a43 lima: allocate separate bo to store varyings
The current strategy using the suballocator with fixed size doesn't
scale and causes some programs with large number of vertices (like some
glmark2 scenes) to crash.
Change it to dynamically allocate a separate bo to accomodate for
arbitrary number of vertices.
This also fixes the buffer read/write flags for gp.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Erico Nunes
8bf2b5db78 gallium/util: add alignment parameter to util_upload_index_buffer
At least on Mali Utgard, index buffers need to be aligned on 0x40.
To avoid duplicating this, add an alignment parameter.
Keep the previous default for the other existing users.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Kenneth Graunke
9fb45c5bbd drirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version
This gets it running on i965 with Mesa master.  (The game won't start
without GL 3.3 compatibility, but uses 1.20 with GL_EXT_gpu_shader4
for shaders.)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>
2019-12-13 17:58:42 -08:00
Timothy Arceri
7564c5fc6d st/glsl_to_nir: fix SSO validation regression
Fixes: b77907edb554 ("st/glsl_to_nir: use nir based program resource list builder")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2216

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-13 23:09:57 +00:00
Alyssa Rosenzweig
46f0b9ecc5 ci: Remove T760/T860 from CI temporarily
I feel really bad about this but this one test is flaking. I don't want
to do a mass revert (and bisection is extremely difficult with
nondeterministic/Heisenbugs), but it's Friday night and master needs to
pass. This commit should be reverted asap (once the flake is solved)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 22:52:39 +00:00
Rafael Antognolli
59de5d9b6a iris: Implement WA for push constants.
v2: Apply WA to gen11+ instead of gen12+ (Jordan).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-12-13 14:15:04 -08:00
Andreas Baierl
8adeeaa7f2 lima/parser: Add texture descriptor parser
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
2019-12-13 22:02:03 +00:00
Andreas Baierl
5456916309 lima/parser: Add RSW parsing
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
2019-12-13 22:02:03 +00:00
Andreas Baierl
31ed081ca3 lima/parser: Some fixes and cleanups
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
2019-12-13 22:02:03 +00:00
Rafael Antognolli
6a3b8811ea vulkan/overlay: Update docs.
Add mention to overlay control socket.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
56ccea58ae vulkan/overlay: Add basic overlay control script.
This can be used to start/stop statistics capturing from the command
line.

v3:
 - Install script (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
a94fa1da93 vulkan/overlay: Add a command to start capturing data to a file.
By default, if an output_file is specified, the overlay layer will start
capturing data immediately. After this commit, when a control socket is
used, the capture starts disabled by default, and is only enabled when a
command ":capture=1;" is received.

when the capture is enabled, we might have already accumulated some
stats. To avoid capturing such noise, we discard and reset the fps and
stats, updating the display and capturing only data from that point on.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
606dff1b73 vulkan/overlay: Add support for a control socket.
Add support for socket from which the overlay layer can receive
commands. This control socket can be useful to allow setting options
once the application is already running. For instance, triggering the
capture of fps data at a certain point.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
e87d7fea8a vulkan/overlay: Add a control socket.
v2: Use a socket instead of named pipe.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
ef5266ebd5 util/os_socket: Add socket related functions.
v3:
 - Add os_socket.c/h into Makefile.sources (Lionel)
 - Add empty non-linux implementation to public functions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Eric Engestrom
c327245257 anv: drop unused #include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-13 20:42:40 +00:00
Eric Engestrom
1a837e803b util/simple_mtx: don't set the canary when it can't be checked
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-13 20:20:21 +00:00
Eric Engestrom
d600b19640 intel/compiler: replace 0 pointer with NULL
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-13 20:16:20 +00:00
Eric Engestrom
8074f68b3b intel/compiler: add ASSERTED annotation to avoid "unused variable" warning
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-13 20:16:20 +00:00
Kenneth Graunke
91efae4f80 iris: Alphabetize source files after iris_perf.c was added 2019-12-13 11:03:13 -08:00
Rob Clark
3b8feefd9c freedreno/ir3: add iterator macros
So many open coded list iterators were getting annoying.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-13 09:25:40 -08:00
Rob Clark
ad92aa36ac freedreno/ir3: add scheduler traces
Add some infrastructure to trace scheduler decisions.  The next patch
will add some more traces, just splitting this out to reduce clutter.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-13 09:25:40 -08:00
Rob Clark
dd34ccb2c5 freedreno/ir3: add last-baryf shaderdb stat
Sometimes sched changes that are a win in terms of instruction count
and/or register pressure, are worse in real life, due to keeping varying
storage locked for too long.  Add a shader-db stat to give this more
visibility.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-13 09:25:40 -08:00
Alejandro Piñeiro
2865d79a33 nir/opt_peephole_select: remove unused variables
To avoid "unused variable" warnings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-12-13 17:14:58 +01:00
Alyssa Rosenzweig
7c972eba40 panfrost: Report GPU name in es2_info
We can prettify the ID.

Closes #2093

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
09a2c74cfd panfrost: Add panfrost_model_name helper
This gives us a string representation of a GPU ID.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
a215289176 panfrost: Move property queries to _encoder
We'll want these in non-Gallium devices.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
102789886c panfrost: Move nir_undef_to_zero to Midgard compiler
Nothing Gallium about it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
ddbbb2db48 pandecode: Add cast
Fixes minor coverity warning about the format specifier.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
4f7fddbd71 panfrost: Pass size to panfrost_batch_get_scratchpad
We'll compute the size with the new scratchpad helpers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
bc887e8281 panfrost: Calculate maximum stack_size per batch
We'll need this so we can allocate a stack for the batch large enough
for all the jobs within it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
a337bf319c pan/midgard: Handle misc. cppcheck warnings
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
f204791cd6 pan/midgard: Remove unused ld/st packing hepers
Identified by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
709d8c29cd panfrost: Handle minor cppcheck issues
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
b0e915b4e6 panfrost: Emit SFBD/MFBD after a batch, instead of before
The size of the scratchpad (as well as some tiler details) depend on the
contents of the batch, so we need to wait to defer filling out the FBD
until after all draws are queued.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
7597015b85 panfrost: Route stack_size from compiler
We'll need it in pan_context.c

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Jonathan Marek
440cd835de etnaviv: add missing vs_needs_z_div handling to NIR backend
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 09:31:40 -05:00
Jonathan Marek
64c7cdcae5 etnaviv: add missing formats
Add missing texture/render formats supported by hardware.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 09:10:29 -05:00
Jonathan Marek
d30499a3c8 etnaviv: remove swizzle from format table
The only format that needs swizzle is R8 emulated with L8, so we can get
rid of the SWIZ(X, Y, Z, W) everywhere.

Note: R8G8 also had a swizzle, but it wasn't necessary.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 09:10:28 -05:00
Jonathan Marek
017cbab5b0 etnaviv: disable integer vertex formats on pre-HALTI2 hardware
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 09:10:28 -05:00
Jonathan Marek
d34705c891 etnaviv: update INT_FILTER choice for GLES3 formats
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 09:09:08 -05:00
Jonathan Marek
15e9704ccb etnaviv: set output mode and saturate bits
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 09:09:08 -05:00
Jonathan Marek
b7730c54a9 etnaviv: sRGB render target support
Note: no srgb render target support before HALTI3

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 09:09:08 -05:00
Jonathan Marek
39349e629a etnaviv: remove sRGB formats from format table
This supports all sRGB formats, without having them in the format table.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 09:09:08 -05:00
Tomasz Pyra
b62217780a gallium/swr: Fix arb_transform_feedback2
Added support for pause/resume transform feedback.
Fixed DrawTransformFeedback.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
2019-12-13 10:58:36 +00:00
Samuel Pitoiset
b37c91c12e radv: handle unaligned vertex fetches on GFX6/GFX10
The Vulkan spec doesn't have any words for vertex attributes alignment.

Fixes a test failure on GFX6 and a GPU hang on GFX10 with:
dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point

vkpipeline-db results on GFX10:
Totals from affected shaders:
SGPRS: 463772 -> 472972 (1.98 %)
VGPRS: 343208 -> 343752 (0.16 %)
Spilled SGPRs: 323 -> 336 (4.02 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13806200 -> 14164472 (2.60 %) bytes
Max Waves: 84021 -> 83755 (-0.32 %)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2161
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-13 09:54:07 +00:00
Lionel Landwerlin
bd888bc1d6 i965/iris: perf-queries: don't invalidate/flush 3d pipeline
Our current implementation of performance queries is fairly harsh
because it completely flushes and invalidates the 3d pipeline caches
at the beginning and end of each query. An argument can be made that
this is how performance should be measured but it probably doesn't
reflect what the application is actually doing and the actual cost of
draw calls.

A more appropriate approach is to just stall the pipeline at
scoreboard, so that we measure the effect of a draw call without
having the pipeline in a completely pristine state for every draw
call.

v2: Use end of pipe PIPE_CONTROL instruction for Iris (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-13 11:27:22 +02:00
Lionel Landwerlin
a575b3cd5c intel/perf: drop batchbuffer flushing at query begin
This was initially intended to fix issues with the query timings going
occassionally high.

It turns out there was a bug in the attribution of OA reports to our
context when parsing the OA data. This led to reports flagged with
other context IDs to be included in our queries results.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-13 11:27:17 +02:00
Iago Toral Quiroga
ca475d5fba v3d: actually root the first BO in a command list in the job
We were passing cl->bo, which is NULL, so v3d_job_add_bo was a no-op.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-13 08:58:10 +00:00
Christian Gmeiner
06db271a6c etnaviv: drop compiled_rs_state forward declaration
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 08:11:03 +00:00
Christian Gmeiner
5f7c5f5dd2 etnaviv: remove not used etna_bits_ones(..)
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-12-13 08:11:03 +00:00
Vinson Lee
8d20b5cba5 swr: Fix build with llvm-10.0.
Fix build error after llvm-10.0 commit ("1b2842bf902a [Alignment][NFC]
CreateMemSet use MaybeAlign").

../src/gallium/drivers/swr/swr_shader.cpp: In member function ‘void (* BuilderSWR::CompileGS(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
../src/gallium/drivers/swr/swr_shader.cpp:738:65: error: no matching function for call to ‘BuilderSWR::MEMSET(llvm::Value*&, llvm::Constant*, int, long unsigned int)’
       MEMSET(pStream, C((char)0), VERTEX_COUNT_SIZE + CONTROL_HEADER_SIZE, sizeof(float) * KNOB_SIMD_WIDTH);
                                                                 ^
In file included from ../src/gallium/drivers/swr/rasterizer/jitter/builder.h:163:0,
                 from ../src/gallium/drivers/swr/swr_shader.cpp:43:
src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:51:11: note: candidate: llvm::CallInst* SwrJit::Builder::MEMSET(llvm::Value*, llvm::Value*, uint64_t, llvm::MaybeAlign, bool, llvm::MDNode*, llvm::MDNode*, llvm::MDNode*)
 CallInst* MEMSET(Value *Ptr, Value *Val, uint64_t Size, MaybeAlign Align, bool isVolatile = false, MDNode *TBAATag = nullptr, MDNode *ScopeTag = nullptr, MDNode *NoAliasTag = nullptr)
           ^
src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:51:11: note:   no known conversion for argument 4 from ‘long unsigned int’ to ‘llvm::MaybeAlign’
In file included from ../src/gallium/drivers/swr/rasterizer/jitter/builder.h:163:0,
                 from ../src/gallium/drivers/swr/swr_shader.cpp:43:
src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:56:11: note: candidate: llvm::CallInst* SwrJit::Builder::MEMSET(llvm::Value*, llvm::Value*, llvm::Value*, llvm::MaybeAlign, bool, llvm::MDNode*, llvm::MDNode*, llvm::MDNode*)
 CallInst* MEMSET(Value *Ptr, Value *Val, Value *Size, MaybeAlign Align, bool isVolatile = false, MDNode *TBAATag = nullptr, MDNode *ScopeTag = nullptr, MDNode *NoAliasTag = nullptr)
           ^
src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:56:11: note:   no known conversion for argument 4 from ‘long unsigned int’ to ‘llvm::MaybeAlign’

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-12-12 23:43:38 -08:00
Jonathan Marek
828f8f5531 turnip: implement subpass input attachments
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
3b4b5f549f turnip: CmdClearAttachments fixes
Partial depth/stencil clear and skipping unused attachments.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
aac7d6c1dc turnip: subpass rework
A renderpass is a tile load/store cycle.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
4322cf34c4 turnip: add dirty bit for push constants
Fixes push constants not updating in some cases.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
27d2174508 turnip: no 8x msaa on 128bpp formats
We don't have an entry for cpp 128 in the tile_alignment table, but I don't
think the HW supports this at all (blob driver just doesn't have 8x msaa).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
5fd9fd3516 turnip: fix VK_IMAGE_ASPECT_STENCIL_BIT image view
Use a special format which allows sampling the stencil and set the correct
swizzle.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
e71f79f6c6 turnip: set FRAG_WRITES_SAMPMASK bit
GPU hangs if SAMPMASK_REGID is used without this bit.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
99a4f7c79f turnip: set load_layer_id to zero
We don't have layered rendering and ir3 doesn't support this intrinsic, so
just set it to zero for now.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
7bbcf7deff turnip: update tile_align_w/tile_align_h
It looks like the actual tile alignment requirement is less than 32x32, but
in some cases input attachment texture needs 64 alignment.

Reduced the h alignment to 16 to compensate and it seems to work fine.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
402bc111fc turnip: fix tile layout logic
Use DIV_ROUND_UP and stop trying to increase the tile_count width/height
once tile_align_w/tile_align_h are reached.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
14cbe2dea5 turnip: fix hw binning render area
Fix a mistake in the y2 coordinate.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
029322c100 freedreno/registers: add a6xx texture format for stencil sampler
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
2db03867f6 freedreno/ir3: add GLSL_SAMPLER_DIM_SUBPASS to tex_info
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:17 -05:00
Jonathan Marek
ab54aceaa8 turnip: fix incorrectly failing assert
pColorBlendState is allowed to be NULL if subpass has >0 color attachments
but they are all unused.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-12 20:33:16 -05:00
Alyssa Rosenzweig
07d8b98b54 panfrost: Query core count and thread tls alloc
This is supported only on newer kernels.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 00:47:23 +00:00
Alyssa Rosenzweig
315324614e panfrost: Factor out panfrost_query_raw
We would like to query properties other than product ID.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 00:47:23 +00:00
Timothy Arceri
a6aedc662e st/glsl_to_nir: use nir based program resource list builder
Here we use the NIR based builder to add everything to the resource
list execpt for SSO packed varyings. Since the details of those
varyings get lost during packing we leave the special handing to
the GLSL IR pass for now. In order to do this we add some bools
to the build resource list functions.

Using the NIR based resource list builder gets us a step closer to
using a native NIR based linker. It should also be faster than the
GLSL IR builder, one because the NIR optimisations should mean we
add less entries due to better optimisations, and two because nir
gives us better lists to work with and we don't need to walk the
entire IR to find the resources.

Ack-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-13 00:07:19 +00:00
Timothy Arceri
144f54e483 st/glsl_to_nir: call gl_nir_lower_buffers() a little later
In a following commit we will use a NIR based builder to build the
OpenGL resource list, so we want to delay this call a little.

Ack-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-13 00:07:19 +00:00
Timothy Arceri
d0259f4159 glsl: add subroutine support to nir_build_program_resource_list()
This is required so we can use the NIR linker to link GLSL in
addition to spirv.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-13 00:07:19 +00:00
Timothy Arceri
46f9f74c57 glsl: add support for named varyings in nir_build_program_resource_list()
This adds support for adding names of varying to the resource list
which is required for us to use this function with the glsl linker.
Support for names is optional for spirv which is why it had not been
added yet.

This is mostly a copy of the GLSL IR code adapted to nir.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-13 00:07:19 +00:00
Timothy Arceri
3c364f90fd glsl: copy the new data fields when converting to nir
These fields added in the previous commit will be used to make use
of a NIR based GLSL linker.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-13 00:07:19 +00:00
Timothy Arceri
56c25b938c nir: add some fields to nir_variable_data
These will be used to provide NIR linking functionality to GLSL.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-13 00:07:19 +00:00
Timothy Arceri
89b2b0f767 glsl: copy the how_declared field when converting to nir
This is needed to make use of nir_build_program_resource_list().

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-13 00:07:19 +00:00
Timothy Arceri
c3823d2d29 glsl: move nir_remap_dual_slot_attributes() call out of glsl_to_nir()
In order to be able to implement a NIR based glsl linker we need to
build the program resource list with NIR. This change delays the
remaping so that a later commit can call the NIR based resource
list builder.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-13 00:07:19 +00:00
Dylan Baker
e37115c912 docs: Update release notes, index, and calendar for 19.3.0 2019-12-12 12:05:00 -08:00
Dylan Baker
941aa31572 docs/19.3.0: Add SHA256 sums 2019-12-12 11:57:54 -08:00
Dylan Baker
2ab4c2bc22 docs: add release notes for 19.3.0 2019-12-12 11:57:53 -08:00
Jason Ekstrand
fa4d981f6f i965: Enable GL_EXT_gpu_shader4 on Gen6+
It's already enabled for all gallium drivers that support GLSL 1.40 or
above and we already support everything in our compiler on SNB+

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-12-12 18:43:17 +00:00
Samuel Pitoiset
eda1b77cc2 radv: enable SpvCapabilityImageMSArray
The Vulkan spec says that StorageImageMultisample and ImageMSArray
SPIRV-V capabilities must be enabled if the
shaderStorageImageMultisample feature is supported.

This fixes a warning with RenderDoc.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2212
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-12 18:52:08 +01:00
Alyssa Rosenzweig
eac9247b2d panfrost: Add routines to calculate stack size/shift
These implement the aforementioned formulas.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
e6f8ef93ca panfrost: Split stack_shift nibble from unk0
It's conceptually independent from the upper part (which is not yet
understood, but for spilling generally remains equal to 0x1e).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
6c6372770c panfrost: Rename unknown_address_0 -> scratchpad
It's the analogue pointer in SFBD.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
8b290bb13d panfrost: Describe thread local storage sizing rules
Deeply nested powers-of-two, basically :-)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
2b4da476f4 pan/midgard: Fix shift for TLS access
Due to this issue we were using 4x the memory we should have for TLS,
which was messing up the size calculations. Oops!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
05b839f354 pan/midgard: Simplify and fix vector copyprop
Fixes a regression in QuakeSpasm. See
https://gitlab.freedesktop.org/mesa/mesa/issues/2169 for apitrace.

Closes #2169

Fixes: f72873e6aa ("pan/midgard: Copypropagate vector creation")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Icecream95
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
4308d75281 pan/midgard: Don't try to free NULL in LCRA
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 12e393bacf ("panfrost: add lcra_free() to free lcra state")
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
5e75eb547f pan/midgard: Force alignment for csel_v
The swizzle on the conditional gets lost.

Fixes "horizontal mirroring" in godot. See
https://gitlab.freedesktop.org/mesa/mesa/issues/2108 which has attached
apitrace.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: d3b3daa9d3 ("pan/midgard: Use new scheduler")
Reported-by: Icecream95
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
8c79467a0d pan/midgard: Don't use no_spill for memory spill src
I'm not totally sure why this would *break* things, but it's certainly
not necessary and it does break things. Somehow this gives the RA more
freedom, fixing some spill issues.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
d48c195acf pan/midgard: Use no_spill bitmask
We would like no_spill decisions to be class-specific -- spilling from
special register to a work register doesn't preclude also spilling that
work register to stack.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
08b16fb321 pan/midgard: Dynamically allocate r26/27 for spills
This allows us to spill two 128-bit values in the same bundle, since we
have two registers we can spill with. This improves the
register allocation flexibility in programs with heavy spilling, though
unfortunately it isn't sufficient (theoretically, 3.5 128-bit values can
be spilled from 3 vector units and 2 scalar units).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig
8e7f2b9ae3 pan/midgard: Remove code marked "TODO: remove me"
It's a fossil, how cute :-)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig
b6d1b32d58 pan/midgard: Remove consecutive_skip code
This has been unused since the beginning since it's broken. Let's toss
it so it doesn't get in the way of further fixes. Bigger to fish to fry.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig
3c0f1ea58c pan/midgard: Move bounds checking into LCRA
This simplifies the cost calculation code a bit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig
e985ae25a6 pan/midgard: Remove spill cost heuristic
We do need some sort of a cost heuristic, but this one is just causing
spilling to behave worse on shaders I'm looking at, and I don't need
more noise in the spill implementation right now.

Get it working first. We can optimize this later.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig
cacb4bc022 pan/midgard: Simplify spillability test
Let's not worry about spilling twice in a bundle; that's too
restrictive. We'll need to change the schedule itself -- unfortunately,
this can have second-order effects due to pipeline registers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig
7cf5bee5aa pan/midgard: Split spill node selection/spilling
Instead of having a giant function for both, split into the two
subtasks so we can handle errors better.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig
9dc3b18e49 pan/midgard: Move spilling code out of scheduler
We move it to the register allocator itself. It doesn't belong in
midgard_schedule.c!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 11:42:06 -05:00
Tomeu Vizoso
88f9522f83 st/mesa: Don't access members of NULL pointers
Should be harmless, but UBSAN complains about it and fills the logs with
noise.

../src/mesa/state_tracker/st_manager.c:523:27: runtime error: member access within null pointer of type 'struct st_framebuffer'"}
    #0 0xaad4e89c in st_framebuffer_reference ../src/mesa/state_tracker/st_manager.c:523"}
    #1 0xaad4e89c in st_api_make_current ../src/mesa/state_tracker/st_manager.c:1091"}
    #2 0xaab69e0e in dri_make_current ../src/gallium/state_trackers/dri/dri_context.c:301"}
    #3 0xaab48fd2 in driBindContext ../src/mesa/drivers/dri/common/dri_util.c:581"}
    #4 0xb682a122 in dri2_make_current ../src/egl/drivers/dri2/egl_dri2.c:1625"}
    #5 0xb67f95a4 in eglMakeCurrent ../src/egl/main/eglapi.c:884"}
    #6 0x4c2b0e in tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) (/deqp/modules/gles2/deqp-gles2+0x29b0e)"}
    #7 0x4c3302 in tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (/deqp/modules/gles2/deqp-gles2+0x2a302)"}
    #8 0x73a9b0 in glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (/deqp/modules/gles2/deqp-gles2+0x2a19b0)"}
    #9 0x73ad86 in glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (/deqp/modules/gles2/deqp-gles2+0x2a1d86)"}
    #10 0x4c6a78 in deqp::gles2::Context::Context(tcu::TestContext&) (/deqp/modules/gles2/deqp-gles2+0x2da78)"}
    #11 0x4c3ba0 in deqp::gles2::TestPackage::init() (/deqp/modules/gles2/deqp-gles2+0x2aba0)"}
    #12 0x852fd8 in tcu::TestHierarchyIterator::next() (/deqp/modules/gles2/deqp-gles2+0x3b9fd8)"}
    #13 0x829660 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x390660)"}
    #14 0x810aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"}
    #15 0x4c1d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"}
    #16 0xb64b6aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"}

../src/mesa/state_tracker/st_atom.c:115:8: runtime error: member access within null pointer of type 'struct st_program'"}
    #0 0xaae11a58 in check_program_state ../src/mesa/state_tracker/st_atom.c:115"}
    #1 0xaae128f6 in st_validate_state ../src/mesa/state_tracker/st_atom.c:192"}
    #2 0xaadc58c2 in prepare_draw ../src/mesa/state_tracker/st_draw.c:132"}
    #3 0xaadc58c2 in st_draw_vbo ../src/mesa/state_tracker/st_draw.c:184"}
    #4 0xabc4f924 in _mesa_validated_drawrangeelements ../src/mesa/main/draw.c:816"}
    #5 0xabc50240 in _mesa_DrawElements ../src/mesa/main/draw.c:970"}
    #6 0x73ebd2 in glu::CallLogWrapper::glDrawElements(unsigned int, int, unsigned int, void const*) (/deqp/modules/gles2/deqp-gles2+0x2d4bd2)"}
    #7 0x6d86b2 in deqp::gls::FragOpInteractionCase::iterate() (/deqp/modules/gles2/deqp-gles2+0x26e6b2)"}
    #8 0x494d16 in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x2ad16)"}
    #9 0x7f9cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"}
    #10 0x7fa5f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"}
    #11 0x7e1aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"}
    #12 0x492d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"}
    #13 0xb64b9aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"}

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 16:26:58 +01:00
Tomeu Vizoso
99d4c71f7e panfrost: Don't lose bits!
UBSAN complained that when alpha was 255 and we shifted it 24 positions
to the left, it didn't fit in a signed int. That's because bitwise
operations automatically promote to signed int.

../src/gallium/drivers/panfrost/pan_job.c:1130:64: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'"}
    #0 0xacf953d6 in pan_pack_color ../src/gallium/drivers/panfrost/pan_job.c:1130"}
    #1 0xacf953d6 in panfrost_batch_clear ../src/gallium/drivers/panfrost/pan_job.c:1204"}
    #2 0xaae3226a in st_Clear ../src/mesa/state_tracker/st_cb_clear.c:513"}
    #3 0x4c3d0e in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x2ad0e)"}
    #4 0x828cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"}
    #5 0x8295f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"}
    #6 0x810aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"}
    #7 0x4c1d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"}
    #8 0xb64b6aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"}

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 16:26:54 +01:00
Tomeu Vizoso
165cb0a5fe util: Don't access members of NULL pointers
Should be harmless, but UBSAN complains about it and fills the logs with
noise.

../src/gallium/auxiliary/util/u_inlines.h:110:8: runtime error: member access within null pointer of type 'struct pipe_surface'"}
    #0 0xaaccf186 in pipe_surface_reference ../src/gallium/auxiliary/util/u_inlines.h:110"}
    #1 0xaaccf186 in util_copy_framebuffer_state ../src/gallium/auxiliary/util/u_framebuffer.c:105"}
    #2 0xaabfb60e in cso_set_framebuffer ../src/gallium/auxiliary/cso_cache/cso_context.c:723"}
    #3 0xaae195ce in st_update_framebuffer_state ../src/mesa/state_tracker/st_atom_framebuffer.c:207"}
    #4 0xaae12316 in st_validate_state ../src/mesa/state_tracker/st_atom.c:261"}
    #5 0xaae31302 in st_Clear ../src/mesa/state_tracker/st_cb_clear.c:438"}
    #6 0x4c3d0e in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x2ad0e)"}
    #7 0x828cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"}
    #8 0x8295f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"}
    #9 0x810aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"}
    #10 0x4c1d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"}
    #11 0xb64b6aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"}

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 16:26:50 +01:00
Tomeu Vizoso
fb579b0347 nir: Don't copy empty array
It's undefined behavior UBSAN complains about, so fixing this will
reduce the noise a bit.

../src/compiler/nir/nir_clone.c:710:4: runtime error: null pointer passed as argument 2, which is declared to never be null"}
    #0 0xac781be4 in clone_function ../src/compiler/nir/nir_clone.c:710"}
    #1 0xac781be4 in nir_shader_clone ../src/compiler/nir/nir_clone.c:740"}
    #2 0xacf99442 in panfrost_shader_compile ../src/gallium/drivers/panfrost/pan_assemble.c:54"}
    #3 0xacf6b268 in panfrost_bind_shader_state ../src/gallium/drivers/panfrost/pan_context.c:1960"}
    #4 0xaae326bc in set_fragment_shader ../src/mesa/state_tracker/st_cb_clear.c:135"}
    #5 0xaae326bc in clear_with_quad ../src/mesa/state_tracker/st_cb_clear.c:335"}
    #6 0xaae326bc in st_Clear ../src/mesa/state_tracker/st_cb_clear.c:518"}
    #7 0x494d0e in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x2ad0e)"}
    #8 0x7f9cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"}
    #9 0x7fa5f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"}
    #10 0x7e1aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"}
    #11 0x492d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"}
    #12 0xb64b9aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"}

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 16:26:45 +01:00
Tomeu Vizoso
47a73888f5 pan/midgard: Remove undefined behavior
As found by UBSAN, it should be harmless but it's good to remove any UB
so the tool's output is useful.

../src/panfrost/midgard/midgard_schedule.c:1094:9: runtime error: index -1 out of bounds for type 'midgard_instruction *[6]'"}
    #0 0xad047872 in schedule_block ../src/panfrost/midgard/midgard_schedule.c:1094"}
    #1 0xad04d41a in schedule_program ../src/panfrost/midgard/midgard_schedule.c:1116"}
    #2 0xad031f98 in midgard_compile_shader_nir ../src/panfrost/midgard/midgard_compile.c:2588"}
    #3 0xacf9874e in panfrost_shader_compile ../src/gallium/drivers/panfrost/pan_assemble.c:68"}
    #4 0xacf6b268 in panfrost_bind_shader_state ../src/gallium/drivers/panfrost/pan_context.c:1960"}
    #5 0xaae2596e in st_update_fp ../src/mesa/state_tracker/st_atom_shader.c:168"}
    #6 0xaae12316 in st_validate_state ../src/mesa/state_tracker/st_atom.c:261"}
    #7 0xaadc58c2 in prepare_draw ../src/mesa/state_tracker/st_draw.c:132"}
    #8 0xaadc58c2 in st_draw_vbo ../src/mesa/state_tracker/st_draw.c:184"}
    #9 0xabc4f924 in _mesa_validated_drawrangeelements ../src/mesa/main/draw.c:816"}
    #10 0xabc50240 in _mesa_DrawElements ../src/mesa/main/draw.c:970"}
    #11 0x73ebd2 in glu::CallLogWrapper::glDrawElements(unsigned int, int, unsigned int, void const*) (/deqp/modules/gles2/deqp-gles2+0x2d4bd2)"}
    #12 0x6d86b2 in deqp::gls::FragOpInteractionCase::iterate() (/deqp/modules/gles2/deqp-gles2+0x26e6b2)"}
    #13 0x494d16 in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x2ad16)"}
    #14 0x7f9cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"}
    #15 0x7fa5f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"}
    #16 0x7e1aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"}
    #17 0x492d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"}
    #18 0xb64b9aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"}

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 16:26:40 +01:00
Tomeu Vizoso
5dfe41239c panfrost: Hold a reference to sampler views
Before we were just copying, but we need to hold a reference as well.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-12 16:26:35 +01:00
Jan Zielinski
bd5077ae1d gallium/swr: Fix Windows build
Tessellator defines own fmin/fmax functions that conflict
with those defined in cmath header. Need to use legacy math.h
which was originally used in MS code.

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
2019-12-12 14:35:23 +01:00
Samuel Pitoiset
a0f1a5fa05 ac/nir: fix out-of-bound access when loading constants from global
Global load/store instructions can't know if the offset is
out-of-bound because they don't use descriptors (no range).

Fix this by clamping the offset for arrays that are indexed
with a non-constant offset that's greater or equal to the array
size.

This fixes VM faults and GPU hangs with Dead Rising 4.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2148
Fixes: 71a6794200 ("ac/nir: Enable nir_opt_large_constants")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-12 10:12:56 +00:00
Lionel Landwerlin
2c5eb1df68 anv: fix assumptions about temporary fence payload
Since f9a3d9738b temporary BO_WSI are definitely a thing so we have
an assert wrong.

Take that opportunity to expand a bit on an existing comment.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f9a3d9738b ("anv: Use BO fences/semaphores for AcquireNextImage")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
2019-12-12 10:10:48 +00:00
Lionel Landwerlin
52bc235f2a anv: fix fence underlying primitive checks
We appear to have got lucky that the only type of temporary fence
payload we could have was a syncobj and that would only happen when
the type of the permanent payload was also a syncobj.

This code was broken if that assumption changed and it did in commit
f9a3d9738b.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
2019-12-12 10:10:48 +00:00
Dave Airlie
790bc9a17e vtn/opencl: add shuffle/shuffle support
This adds nir encoding for these, generating them from libclc
was very expensive, and this is a lot simpler.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-12-12 19:40:58 +10:00
Dave Airlie
5471ef7532 vtn: convert vload/store to single value loops
There is an alignment issue doing this the other way, the
spec clearly says vload/store don't require alignment.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-12-12 19:40:15 +10:00
Kenneth Graunke
dcb4230e5e iris: Default to X-tiling for scanout buffers without modifiers
Neither Mutter nor KWin's wayland compositors appear to use modifiers.
In the non-modifier case, iris was still trying to use Y-tiling for
scan-out surfaces, leading to this error:

(gnome-shell:7247): mutter-WARNING **: 09:23:47.787: meta_drm_buffer_gbm_new failed: drmModeAddFB failed: Invalid argument

We now fall back to the historical X-tiling for scanout buffers, which
ought to work everyone, at lower performance.  To regain that, we need
to ensure modifiers are actually supported in environments people use.

Fixes: fbf3124771 ("iris: Rework tiling/modifiers handling")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-11 22:03:48 -08:00
Dave Airlie
3cd903a6c3 llvmpipe: enable ARB_shader_draw_parameters.
All the bits should be in place for this now.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 10:29:43 +10:00
Dave Airlie
75f21895de gallivm: fixup base_vertex support
base vertex should be 0 for non-indexed draws according to the
piglit tests.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 10:16:19 +10:00
Dave Airlie
73f5e2d7ef gallivm/draw: add support for draw_id system value.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 10:16:19 +10:00
Dave Airlie
22a40dd1c1 gallivm: add base instance sysval support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 10:16:19 +10:00
Karol Herbst
20d0ae464c nv50/ir: implement global atomics and handle it for nir
TGSI doesn't have any concept of global memory right now.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-12-11 23:54:39 +00:00
Karol Herbst
70c6bff2f0 nir: handle nir_deref_type_ptr_as_array in rematerialize_deref_in_block
I forgot why that was required, but it still is the correct thing to do.

Hit it at some point when working on implementing more CL features.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-11 23:54:39 +00:00
Rob Clark
ddb9701a3c spirv: add OpLifetime*
These are just hints so we can ignore them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-12-11 23:54:39 +00:00
Karol Herbst
acc0658942 clover/spirv: allow Int64 Atomics for supported devices
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-11 23:54:39 +00:00
Karol Herbst
dba8bf1169 clover/nir: set spirv environment to OpenCL
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-11 23:54:39 +00:00
Karol Herbst
6d08f034ce clover/nir: treat UniformConstant as global memory
Just like we already do in the llvm backend. The current constant buffer code
seems fundamentally flawed and right now we are thinking on how we want to
reimplement all of that.

But until that happens, just treat is as global memory and go on.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-11 23:54:39 +00:00
Karol Herbst
2402232c90 spirv: handle UniformConstant for OpenCL kernels
The caller is responsible for setting up the ubo_addr_format value as
contrary to shared and global, it's not controlled by the spirv.

Right now clovers implementation of CL constant memory uses a 24/8 bit format
to encode the buffer index and offset, but that code is dead as all backends
treat constants as global memory to workaround annoying issues within OpenCL.

Maybe that will change, maybe not. But just in case somebody wants to look at
it, add a toggle for this inside vtn.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-11 23:54:39 +00:00
Dave Airlie
123f90cf36 gallivm/nir: copy compare ordering code from tgsi
This fixes some isinf/isnan tests copying what the tgsi code
paths do for float compares

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 09:16:41 +10:00
Dave Airlie
8f56ba5da4 gallivm/nir: cleanup code and call cmp wrapper
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 09:16:37 +10:00
Dave Airlie
63b3d38a50 gallivm: fix perspective enable if usage_mask doesn't have 0 bit set
The current code looks like a typo, and fails if the usage_mask
is for a y/z enabled input.

Fixes piglit ext_transform_feedback-immediate-reuse-index-buffer
with llvmpipe/nir

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 09:16:33 +10:00
Dave Airlie
bf29040103 gallivm: fix transpose for when first channel isn't created
The previous fix worked when the second channel wasn't exposed, but
a couple of piglit tests have inputs with just the y/z chans, no x/w.

Partly Fixes piglit ext_transform_feedback-immediate-reuse-index-buffer
with llvmpipe/nir

Fixes: 5363cda52b ("gallivm: add swizzle support where one channel isn't defined.")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 09:16:28 +10:00
Dave Airlie
e35b2c37cd llvmpipe/nir: handle texcoord requirements
Switch to using texcoord intrinsic support.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-12 09:16:24 +10:00
Kristian H. Kristensen
b6f8c42846 freedreno/a6xx: Silence warning for unused perf counters
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
9b09776846 freedreno/a6xx: Convert some tile setup to OUT_REG()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
8a4b0d852c freedreno/a6xx: Convert gmem blits to OUT_REG()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
201caa7281 freedreno/a6xx: Convert VSC pipe setup to OUT_REG()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
c71348f84a freedreno/a6xx: Convert emit_zs() to OUT_REG()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
ffa7d9cbeb freedreno/a6xx: Convert emit_mrt() to OUT_REG()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
781b2dd63b freedreno/a6xx: Include fd6_pack.h in a few files
Including non-functional changes to get the value from the fd_reg_pair
in places.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
9783f6bc5d freedreno/a6xx: Drop stale include
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
9b05466144 freedreno/registers: Add 64 bit address registers
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
bdd98b892f freedreno: New struct packing macros
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Kristian H. Kristensen
b27b0e8550 freedreno/registers: Remove duplicate register definitions
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 22:25:47 +00:00
Timothy Arceri
f8148d0cc1 docs: remove mailing list as way of submitting patches
All developers now use gitlab, don't confuse newcomers by suggesting
they might use the mailing list. We want everyone to use gitlab so
that patches get run through basic CI before they are merged.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-12-12 09:09:50 +11:00
Jason Ekstrand
776cfde699 anv: Bump the advertised patch version to 129
We've been keeping up with the spec updates.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-11 18:52:08 +00:00
Jason Ekstrand
5f5f5019bd anv: Unconditionally advertise Vulkan 1.1
Vulkan 1.1 requires VK_KHR_external_fence which requires syncobj support
to be actually usable.  However, it doesn't strictly require that we
support any external handle types.  We should be able to advertise 1.1
even on old kernels that don't have syncobj support.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-11 18:52:08 +00:00
Jason Ekstrand
98a83d0fce anv: Flush the queue on DeviceWaitIdle
When we have syncobj_wait, we can trust in WAIT_FOR_SUBMIT but when we
don't, we only have BO waits and those aren't quite as nice.  This
commit adds a flag to _anv_queue_submit to wait for the queue to drain
before returning.  This gives us the behavior we need to implement
DeviceWaitIdle.

Fixes: 246261f0ad "anv: prepare the driver for delayed submissions"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-11 18:52:08 +00:00
Karol Herbst
0bafde717d nir/tests: MSVC build fix
Fixes: 11f736a6f9 "nir/tests: add serializer tests"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-12-11 17:12:48 +00:00
Jan Zielinski
ab55708200 swr/rasterizer: Add tessellator implementation to the rasterizer
This is initial commit on the way to implement ARB_tessellation_shader
extension in OpenSWR. It introduces tessellator implementation
taken from Microsoft GitHub (published under MIT license):

https://github.com/microsoft/DirectX-Specs/blob/master/d3d/archive/images/d3d11/tessellator.cpp
https://github.com/microsoft/DirectX-Specs/blob/master/d3d/archive/images/d3d11/tessellator.hpp

It also adds some glue code that connects the tessellator
to the internals of SWR rasterizer.

Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviwed-by: Alok Hota <alok.hota@intel.com>
2019-12-11 16:54:37 +00:00
Samuel Pitoiset
ff2e11b210 gitlab-ci: set RADV_DEBUG=checkir for RADV test jobs
This is used to validate if the driver emits correct LLVM IR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-11 15:44:40 +00:00
Eric Engestrom
b2dac806f8 intel: add mi_builder_test for gen12
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-11 15:38:19 +00:00
Rohan Garg
2129b4152c gitlab-ci: Use lavacli from packages
lavacli 0.9.8 is now available in Debian Testing.
Ref: https://tracker.debian.org/news/1066828/lavacli-098-1-migrated-to-testing/
Fixes: 555c0de ("gitlab-ci: Move LAVA-related files into top-level ci dir")

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-11 15:19:43 +00:00
Erico Nunes
7701b7b7ee lima/ppir: enable lower_fdph
Otherwise we may lower some fdot to fdph which is not implemented in pp.

Fixes #2126

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-12-11 15:55:48 +01:00
Karol Herbst
11f736a6f9 nir/tests: add serializer tests
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-12-11 13:00:44 +01:00
Karol Herbst
676232d76f nir/serialize: fix vec8 and vec16
Nir serializes uses nir_ssa_alu_instr_src_components in a few places to
determine how many components a src has, but that's not what this function
returns. It simply returns how many channels are used, which is still fine
for most of the code.

This was breaking code like this:

vec16 32 ssa_1 = intrinsic load_global
vec1  32 ssa_2 = fmax ssa_1.a, ssa_2.b

v2: make the 16bit encoding work for identify swizzles again

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-12-11 13:00:44 +01:00
Bas Nieuwenhuizen
2e44bfc14f radv: Fix RGBX Android<->Vulkan format correspondence.
This is correct per the Vulkan spec format equivalence table.

Fixes: f36b52740a "radv/android: Add android hardware buffer queries."
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-11 11:40:13 +01:00
Tomeu Vizoso
63ae9e61c1 panfrost: Add PAN_MESA_DEBUG=sync
Sometimes it's useful to get information about GPU faults in the
console, so it's synchronized with other messages.

This commit will cause Mesa to wait for completion and check if there
are any faults raised by the GPU.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-11 08:01:20 +01:00
Kenneth Graunke
2e654db27a iris: Create smaller program keys without legacy features
A lot of the brw_*_prog_key fields are for emulating features on legacy
hardware that iris doesn't support.  In particular, all of the texture
swizzle fields take up a lot of space.  These dead fields make hashing
the shader keys more expensive than it ought to be.

We introduce iris-specific keys with only the information we need, and
translate them to brw keys when actually compiling new variants.  This
way, key comparisons can use the small keys.  The size reductions are:

   VS:  328 bytes ->  8 bytes
   TCS: 312 bytes -> 24 bytes
   TES: 304 bytes -> 24 bytes
   GS:  284 bytes ->  8 bytes
   FS:  304 bytes -> 16 bytes
   CS:  280 bytes ->  4 bytes

Scores for the Piglit drawoverhead microbenchmark case with a shader
program change improve by roughly 30%.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-10 22:25:41 -08:00
Pierre Moreau
8ccd3f48a0 compiler/spirv: Fix uses of gnu struct = {} extension
Fixes: a24d6fbae6 ("meson: Add -Werror=gnu-empty-initializer to MSVC compat args")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
2019-12-11 06:03:22 +00:00
Vinson Lee
9661fc9cdb util/u_thread: Restrict u_thread_get_time_nano on macOS.
macOS does not have pthread_getcpuclockid.

src/util/u_thread.h:156:4: error: implicit declaration of function 'pthread_getcpuclockid' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   pthread_getcpuclockid(thread, &cid);
   ^

Fixes: 4913215d14 ("util/u_thread: don't restrict u_thread_get_time_nano() to __linux__")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2171
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-12-10 21:35:47 -08:00
Eric Anholt
8bf590b46b tu: Move UBWC layout into fdl6_layout() and use that function.
This gets us shared non-UBWC layout code between gallium and turnip.
Until I fix up the rest of gallium to handle UBWC mipmapping, we do the
single-level UBWC setup in gallium as a fixup after layout.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 04:24:18 +00:00
Eric Anholt
de619d7503 freedreno: Switch the 16-bit workaround to match what turnip does.
Prevents regressions on argb1555 and rgb565 when making turnip use
freedreno's layout.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 04:24:18 +00:00
Eric Anholt
d9cf3e76bd freedreno: Move a6xx's setup_slices() to a shareable helper function.
We pass in all the parameters for setting up the layout, though freedreno
still sets a few of them up early (since it uses layout helpers in making
some decisions about the layout setup parameters that will be cleaned up
once krh's blitter work lands).
2019-12-11 04:24:18 +00:00
Eric Anholt
67258a44d2 tu: Move our image layout into a freedreno_layout struct.
This lets us start using some of the fdl_* helpers and have more obviously
matching code between gallium and turnip.  We can't yet use the fdl_* UBWC
helpers, since the gallium driver doesn't do UBWC mipmaps (which I'm
working on in another branch).

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 04:24:18 +00:00
Eric Anholt
ea7631a9a6 freedreno: Move UBWC layout into a slices array like the non-UBWC slices.
This is a little refactor in preparation for UBWC mipmapping support.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 04:24:18 +00:00
Eric Anholt
bbe84c6c31 freedreno: Refactor the UBWC flags registers emission.
It's the same logic for each of these being emitted, and I was about to
change the rsc->layout.* for UBWC.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 04:24:18 +00:00
Eric Anholt
97be9503bb freedreno: Drop the extra offset field for mipmap slices.
We can just bake the UBWC-goes-first delta into the slices at setup time.
I did have to fix up the resource shadowing swap path to swap the slice
fields, as it was missing and regressed the format reinterpets otherwise.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-11 04:24:18 +00:00
Kenneth Graunke
69d7782b15 intel/decoder: Make get_state_size take a full 64-bit address and a base
i965 wants to use an offset from a base because everything is in a
single buffer whose address may be relocated, and all base addresses
are set to the start of that buffer.

iris wants to use a full 64-bit address, because state lives in separate
buffers which may be in the shader, surface, and dynamic memory zones,
where addresses grow downward from the top of a 4GB zone,  So it's very
possible for a 32-bit offset to exist relative to multiple bases,
leading to the wrong state size.
2019-12-10 19:10:49 -08:00
Dongwon Kim
8a8534a698 iris: INTEL performance query implementation
low-level implementation of INTEL-performance-query APIs in
Intel iris driver. Most of functions and procedures defined here
are adopted from i965 driver (brw_performance_query.c)

v2: - replace genX_init_performance_query with
      iris_init_perfquery_functions which is gen's version agnositic
    - general code clean-up

v3: include gen_perf_gens.h as some of defines were moved to this new
    header file

v4: - checking for kernel 4.13+ won't be needed here as Iris won't be
      loaded anyway without DRM_SYNCOBJ that is enabled after Kernel
      4.13.

    - checking whether gen < 8 or is_cherryview won't be required as
      well because those cases are screened in iris_screen_create.

v5: remove genX(init_performance_query)

v6: - remove oa_metrics_kernel_support as iris works only with kernel
    4.18 and newer.

    - use perf functions defined in separate file, iris_perf.h/c

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-10 17:02:58 -08:00
Mark Janes
ca2dd99bf6 iris: separating out common perf code
The configuration of the gen_perf vtable will be the same for
INTEL_performance_query and AMD_performance_monitor.
Initialize the table in a single routine that can be called from both
implementations.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-10 17:02:58 -08:00
Dongwon Kim
106054ef79 gallium: enable INTEL_PERFORMANCE_QUERY
new state tracker APIs added for INTEL_performance_query
This extension is enabled if all vendor specific functions for it
exist.

v2: add st_cb_perfquery.* to the list of sources in Makefile
v3: minor code clean-up
v4: - add driver hooks for intel-performance-query apis
    - add PIPE level performance counter and type enums that
      match to OpenGL enums
    - do conversion of pipe_perf_counter_type and
      pipe_perf_counter_data_type enums to GL defines in state_tracker

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-10 17:02:58 -08:00
Dylan Baker
d0eebda990 meson/broadcom: libbroadcom_cle also needs zlib
Fixes: 1ae8018a6a
       ("meson: Add support for the vc4 driver.")
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-11 00:49:44 +00:00
Kenneth Graunke
0f2f561a10 anv: Enable Gen11 Color/Z write merging optimization
TCCNTLREG contains additional L3 cache write merging optimizations.

The default value on my system appears to be:
- URB Partial Write Merging (bit 0)
- L3 Data Partial Write Merging (bit 2)
- TC Disable (bit 3)

Windows drivers appear to set bit 1 as well to enable "Color/Z Partial
Write Merging".  This should solve an issue we were seeing where MRT
benchmarks were using substantially more bandwidth than they ought.
However, we have not observed it to cause measurable FPS gains.

It is unclear whether we should be setting bit 0 or bit 3, so for now
we leave those at the hardware default value.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-10 16:19:46 -08:00
Kenneth Graunke
5cc7636993 iris: Enable Gen11 Color/Z write merging optimization
TCCNTLREG contains additional L3 cache write merging optimizations.

The default value on my system appears to be:
- URB Partial Write Merging (bit 0)
- L3 Data Partial Write Merging (bit 2)
- TC Disable (bit 3)

Windows drivers appear to set bit 1 as well to enable "Color/Z Partial
Write Merging".  This should solve an issue we were seeing where MRT
benchmarks were using substantially more bandwidth than they ought.
However, we have not observed it to cause measurable FPS gains.

It is unclear whether we should be setting bit 0 or bit 3, so for now
we leave those at the hardware default value.

Improves performance in Manhattan 3.0 by 6% on ICL 8x8 at a fixed
frequency, according to Felix Degrood.  I didn't see any improvements
at out-of-the-box power management settings, however.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-10 16:19:43 -08:00
Kenneth Graunke
0b74f85870 intel/genxml: Add a partial TCCNTLREG definition
TCCNTLREG contains additional cache programming settings.  In
particular, there are several write combining controls we'd like to use.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-10 16:19:33 -08:00
Kenneth Graunke
74665eaf3a util: Detect use-after-destroy in simple_mtx
This makes simple_mtx_destroy set the counter to an invalid canary
value and then makes lock/unlock assert that the value is legal.

That way, calling lock/unlock after destroy will assert fail,
rather than deadlocking or potentially even working.

This has caught real deadlocks in dEQP multithreaded tests (in st/mesa
shader variant zombie list handling), which have since been fixed.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-12-10 23:48:40 +00:00
Rob Clark
fc97643c57 freedreno/a6xx: enable LRZ by default
Now that dEQP should be happy, lets flip the switch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-10 22:55:21 +00:00
Rob Clark
1b4c12d3ee freedreno/a6xx: fix LRZ logic
In particular, we need to invalidate the LRZ state when we cannot be
confident in what the Z state would be during rendering:

1) depth test modes not supported by LRZ
2) stencil test, which would require full rasterization and stencil
   test in the binning pass (whereas LRZ normally just needs to
   determine the min and max z value in an 8x8 quad)

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-10 22:55:21 +00:00
Rob Clark
3c479849c5 freedreno/a6xx: fix LRZ layout
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-10 22:55:21 +00:00
Rob Clark
6cf101402d freedreno/a5xx+a6xx: split LRZ layout to per-gen
Seems to be a bit different for a6xx, so let's split this out.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-10 22:55:21 +00:00
Rob Clark
3b074a2e53 freedreno/a6xx: disable LRZ when blending
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-10 22:55:21 +00:00
Marek Olšák
a305543c8d radeonsi: don't rely on CLEAR_STATE to set PA_SC_GENERIC_SCISSOR_*
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-10 16:32:37 -05:00
Marek Olšák
aced18aa61 radeonsi/gfx10: simplify the tess_turns_off_ngg condition
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-10 16:32:36 -05:00
Marek Olšák
42f921387b radeonsi/gfx10: disable vertex grouping
based on PAL.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-10 16:32:34 -05:00
Marek Olšák
75ce078a0a radeonsi: enable NIR by default and document GL 4.6 support
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-10 15:48:58 -05:00
Marek Olšák
42b28e7ac3 st/dri: assume external consumers of back buffers can write to the buffers
This was reverted needlessly because if was part of another series.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-12-10 15:37:37 -05:00
Jason Ekstrand
41691ac016 ANV: Stop advertising smoothLines support on gen10+
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
2019-12-10 20:13:56 +00:00
Dylan Baker
85a9698ac3 meson/broadcom: libbroadcom_cle needs expat headers
Fixes: 1ae8018a6a
       ("meson: Add support for the vc4 driver.")
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-10 10:48:38 -08:00
Lionel Landwerlin
5fdea9f401 anv: fix incorrect VMA alignment for CCS main surfaces
Maybe finer way of dealing with this requirement would be to increase
the number of pdevice->memory.types[] to add a category for special
alignment cases.

Meanwhile this fixes the problem of CCS surface alignment and it's
probably not going to cause issues given the size of our address
space.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6af8a4acc4 ("anv: Add aux-map translation for gen12+")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-10 16:06:54 +00:00
Lionel Landwerlin
dcfe1903c3 anv: fix missing gen12 handling
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 181be14d43 ("anv: Build for gen12")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-10 16:06:54 +00:00
Eric Engestrom
865f4b193f docs: reword a bit and list HTTPS before FTP
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-12-10 15:21:23 +00:00
Eric Engestrom
d90e656fa7 meson: drop intel_ prefix on imgui_core
Again, no real effect, just the name of a temporary build file.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-12-10 15:16:02 +00:00
Eric Engestrom
2b0e3e9fd1 meson: drop duplicate lib prefix on libiris_gen*
This has no real effect other than the names of the temporary files in
the build folder.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-12-10 15:16:02 +00:00
Samuel Pitoiset
e4c8491bdf radv: implement VK_KHR_separate_depth_stencil_layouts
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:16:17 +01:00
Samuel Pitoiset
48ee62178f radv: initialize HTILE for separate depth/stencil aspects
It either clears the whole HTILE buffer or part of it depending
on the HTILE mask parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:09:29 +01:00
Samuel Pitoiset
41cebfc9c1 radv: do not init HTILE as compressed state when dst layout allows it
I don't think this makes much differences and a potential clear
following the initialization will overwrite HTILE anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:09:26 +01:00
Samuel Pitoiset
b603cc8c84 radv: synchronize after performing a separate depth/stencil fast clears
For depth+stencil images, the driver might use an optimized path
if only one aspect is cleared. It either clears the depth or the
stencil part of HTILE. Because the two separate aspects might use
the same HTILE memory we have to synchronize.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:09:22 +01:00
Michel Dänzer
dadd609664 gitlab-ci: Don't exclude any piglit quick_shader tests
Now that we're running these with process isolation enabled, their
results will hopefully be stable.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-10 11:19:11 +00:00
Krzysztof Raszkowski
cfe00a52f0 gallivm: add TGSI bit arithmetic opcodes support
Add TGSI_OPCODE_BFI, TGSI_OPCODE_POPC, TGSI_OPCODE_LSB,
TGSI_OPCODE_IMSB, TGSI_OPCODE_UMSB, TGSI_OPCODE_IBFE,
TGSI_OPCODE_UBFE, TGSI_OPCODE_BREV support.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-12-10 10:34:18 +00:00
Samuel Pitoiset
008fe909ca radv: fix possibly wrong PA_SC_AA_CONFIG value for conservative rast
PA_SC_AA_CONFIG might be updated when conversative rasterization is
enabled. Because the driver only re-emits the multisample state if
the number of samples is different, that register value might not
be updated correctly.

Found by inspection, doesn't fix anything known.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 11:04:43 +01:00
Samuel Pitoiset
4f659224c8 radv: move emission of two PA_SC_* registers to the pipeline CS
They don't have to be updated dynamically.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 11:04:40 +01:00
Pierre-Eric Pelloux-Prayer
87f7ec8a2c st/dri: use st->flush callback to flush the backbuffer
Previously the flush was done before the call to st->flush but
could lead to problems as FLUSH_VERTICES could push some work
that would change the backbuffer (or modify it).

With this commit, all the backbuffer flushing code is executed
right before the call to st_flush.

Closes: https://gitlab.freedesktop.org/drm/amd/issues/842
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205049

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer
cc0d0afe3b st/mesa: add a notify_before_flush callback param to flush
The new callback is called right before the flush is done to allow
users of st->flush to do some work after all the previous work has
been flushed.

This will be used by dri_flush in the next commit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer
f5c1cb2383 radeonsi: dcc dirty flag
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer
e3e91cebcd radeonsi: fix multi plane buffers creation
When using 3 planes, the sequence produces this chain:
  plane0 -> plane2
This commit fixes this to produce:
  plane0 -> plane1 -> plane2

Fixes: 86e60bc265 ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2193
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 08:52:16 +01:00
Pierre-Eric Pelloux-Prayer
ff0f108666 radeonsi: use gfx9.surf_offset to compute texture offset
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2177
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 08:52:07 +01:00
Sonny Jiang
6c901f0675 radeonsi: use compute shader for clear 12-byte buffer
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 23:25:57 -05:00
Marek Olšák
38e9eb9561 st/mesa: release the draw shader properly to fix driver crashes (iris)
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 22:41:41 -05:00
Marek Olšák
41118246c6 draw, st/mesa: generate TGSI for ffvp/ARB_vp if draw lacks LLVM
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
a3de63fbb3 st/mesa: don't generate VS TGSI if NIR is enabled
it's no longer needed

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
a90f4453fe st/mesa: remove struct st_vp_variant in favor of st_common_variant
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
6299b90fd4 st/mesa: remove st_vp_variant::num_inputs
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
bc99b22a30 st/mesa: use a separate VS variant for the draw module
instead of keeping the IR indefinitely in st_vp_variant.

This trivially fixes Selection/Feedback/RasterPos for NIR.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
17e8839a2f st/mesa: support shader images for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
b7393f1115 st/mesa: support SSBOs for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
e91b044bd8 st/mesa: support samplers for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
2891c4b2e2 st/mesa: save currently bound vertex samplers and sampler views in st_context
for st_draw_feedback.c

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
226e7aee70 st/mesa: support UBOs for Selection/Feedback/RasterPos
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
60db75cb77 gallivm: implement LOAD with CONSTBUF but don't enable it for llvmpipe
This is already used in st_draw_feedback.c, because it uses shaders
generated for drivers.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-09 21:09:28 -05:00
Marek Olšák
525c8b90c7 llvmpipe: implement TEX_LZ and TXF_LZ opcodes
gallivm receives these opcodes anyway because st_draw_feedback.c uses
shaders that were assembled for drivers, not llvmpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-09 21:09:28 -05:00
Gurchetan Singh
3c8ddc8f4b drirc: set allow_higher_compat_version for Faster Than Light
With 781a78 ("mesa: enable ARB_direct_state_access in compat for
GL3.1+), it's possible to have DSA with GL3.1+.

FTL creates a GL3.1 compat context, but fails the
_mesa_has_geometry_shaders(..) check in frame_buffer_texture.

Bump the compat version to pass the check.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 15:27:02 -08:00
Roland Scheidegger
23f1b78e8f util/atomic: Fix p_atomic_add for unlocked and msvc paths
Braces mismatch (flagged by CI, untested).

Fixes: 385d13f26d "util/atomic: Add a _return variant of p_atomic_add"

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-09 15:02:58 -08:00
Eric Anholt
0470a03769 freedreno: Track the set of UBOs to be uploaded in UBO analysis.
We were iterating over the entire 32-entry array each time, when we
can just use a bitset to know that we're only uploading from the first
entry normally.

Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL
fishtank.

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-12-09 14:13:50 -08:00
Eric Anholt
10da0a9d18 freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off.
The default is to not throw GL errors when drawing with mapped
buffers, but we were forcing it on for unclear reasons.  Internally we
keep all our buffers mapped anyway, so it should be a no-op other than
reducing CPU overhead (.23% in a perf report for WebGL fishtank)

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-12-09 14:13:47 -08:00
Rob Clark
dc791d3c68 freedreno/fdperf: use drmOpen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-09 13:09:58 -08:00
Alyssa Rosenzweig
a37822f5f7 gallium/util: Support POLYGON in u_stream_outputs_for_vertices
u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is
trivial to support as a special case directly (since we have the number
of vertices directly).

Fixes aborts in Panfrost in apps using GL_POLYGON.

Fixes: e881aa8c12 ("gallium/util: Add u_stream_outputs_for_vertices helper")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Revewied-by: Eric Anholt <eric@anholt.net>
2019-12-09 21:09:05 +00:00
Anuj Phogat
1a32fbd48c intel: Add pci-ids for Jasper Lake
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-09 12:22:57 -08:00
Anuj Phogat
11fdd5f52c intel: Add device info for 1x4x6 Jasper Lake
Also removing the FIXME comments after matching the numbers with
updated documentation.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-09 12:22:56 -08:00
Vasily Khoruzhick
9f5fa496cb lima: expose tiled format modifier in query_dmabuf_modifiers()
Fixes: 8c12f4e5f2 ("lima: enable tiling")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-12-09 15:21:55 +00:00
Vasily Khoruzhick
01a451b04d lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()
Assume that resource is tiled if we get DRM_FORMAT_MOD_INVALID
in resource_from_handle() and we don't have RO.

Fixes: 8c12f4e5f2 ("lima: enable tiling")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-12-09 15:21:55 +00:00
Jonathan Marek
9d78cf4584 turnip: add hw binning
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-09 08:22:18 -05:00
Samuel Pitoiset
86dfe92bd0 radv: do not use VK_TRUE/VK_FALSE
For consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-09 09:21:26 +01:00
Dave Airlie
d7dc14628a gallivm: add bitfield reverse and ufind_msb
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
2019-12-09 06:05:02 +10:00
Roland Scheidegger
1c7693e3bd gallium/scons: fix graw_gdi build
Fixes: 44a6b0107b (gallivm: add nir->llvm translation (v2))
Reviewed-by: Dave Airlie <Airlied@redhat.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-12-07 17:50:53 +01:00
Daniel Schürmann
8259c97b2d aco: propagate temporaries into expanded vectors
Gives a very slight decrease in code size:
Totals from affected shaders:
Code Size: 1708488 -> 1702768 (-0.33 %) bytes
Max Waves: 2858 -> 2855 (-0.10 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
df3e674fb3 aco: improve readfirstlane after uniform ssbo loads on GFX7
pipeline-db changes for GFX7:

80310 shaders in 40472 tests
Totals:
SGPRS: 3655900 -> 3643916 (-0.33 %)
VGPRS: 2678324 -> 2686324 (0.30 %)
Spilled SGPRs: 1730 -> 1634 (-5.55 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 15540 -> 15536 (-0.03 %) dwords per thread
Code Size: 136106120 -> 135457616 (-0.48 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601014 -> 600206 (-0.13 %)

Totals from affected shaders:
SGPRS: 307832 -> 295848 (-3.89 %)
VGPRS: 267864 -> 275864 (2.99 %)
Spilled SGPRs: 770 -> 674 (-12.47 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 16 -> 12 (-25.00 %) dwords per thread
Code Size: 22007488 -> 21358984 (-2.95 %) bytes
LDS: 65 -> 65 (0.00 %) blocks
Max Waves: 28668 -> 27860 (-2.82 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
0837471463 aco: use soffset for MUBUF instructions on SI/CI
pipeline-db changes for GFX7:

80310 shaders in 40472 tests
Totals:
SGPRS: 3655300 -> 3655900 (0.02 %)
VGPRS: 2677732 -> 2678324 (0.02 %)
Spilled SGPRs: 1730 -> 1730 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 15540 -> 15540 (0.00 %) dwords per thread
Code Size: 136488364 -> 136106120 (-0.28 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601039 -> 601014 (-0.00 %)

Totals from affected shaders:
SGPRS: 316312 -> 316912 (0.19 %)
VGPRS: 273844 -> 274436 (0.22 %)
Spilled SGPRs: 770 -> 770 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 16 -> 16 (0.00 %) dwords per thread
Code Size: 22724904 -> 22342660 (-1.68 %) bytes
LDS: 114 -> 114 (0.00 %) blocks
Max Waves: 30861 -> 30836 (-0.08 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
7b38d95b32 radv: Enable ACO on GFX7 (Sea Islands)
This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used.
Note that shader_ballot works correctly, but performance seems inferior.
To enable shader_ballot use RADV_PERFTEST=shader_ballot.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
28c95cc402 aco: return to loop_active mask at continue_or_break blocks
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
0f9447ccb0 radv: disable Youngblood app profile if ACO is used
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
746165e540 aco: implement exclusive scan for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
7ae227effd aco: implement inclusive_scan for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
f895a8b1df aco: implement (clustered) reductions for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
9254fb4fc7 aco: don't use a scalar temporary for reductions on GFX10
This patch also adds the scalar temporary for scans on SI/CI

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
8ad43d8838 aco: flush denorms after fmin/fmax on pre-GFX9
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
21f67a3bdc radv: only flush scalar cache for SSBO writes with ACO on GFX8+
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
79ce6c1b33 aco: disable disassembly for SI/CI due to lack of support by LLVM
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
1c4afe38f2 aco: implement 64bit ine/ieq for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
1e1356b2ad aco: implement 64bit i2b for SI /CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
da7ff58835 aco: make 1/2*PI a literal constant on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
90fad7360d aco: implement 64bit VGPR shifts for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
6a586a6006 aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
23319add93 aco: fix disassembly of writelane instructions.
ACO writes an unused 3rd operand for internal usage
which makes LLVM recoginize it as illegal instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
6fc9ddfef8 aco: recognize SI/CI SMRD hazards
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
3eed4d2be5 aco: implement quad swizzles for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
bde9c1e3a1 aco: move buffer_store data to VGPR if needed
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
a8195bdf2e aco: implement nir_op_isign on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
b8783973cd aco: only use scalar loads for readonly buffers on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
f27783a667 aco: implement nir_op_fquantize2f16 for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
caea4bbfdc aco: fix SMEM offsets for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
8aab92b393 aco: SI/CI - fix sampler aniso
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Dave Airlie
9b533a2ca3 aco: handle gfx7 int8/10 clamping on exports
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
0d42e4d7a0 aco: Initial GFX7 Support
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
3177346bfc aco: refactor visit_store_fs_output() to use the Builder
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Jason Ekstrand
0f60aa4037 anv: Re-emit all compute state on pipeline switch
It's a very odd case to hit in the real world.  However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.

Fixes: bc612536eb "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-07 04:03:35 +00:00
Jason Ekstrand
bce1c3c668 anv: Re-capture all batch and state buffers
When we moved from allocating BOs directly to using the BO cache, we
lost the EXEC_OBJECT_CAPTURE flag on all our state buffers.

Fixes: 3119b96bdf "anv: Allocate block pool BOs from the cache"
Fixes: ee77938733 "anv: Allocate batch and fence buffers from..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-07 04:03:35 +00:00
Jason Ekstrand
865ffe4e02 anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 22:32:05 +00:00
Eric Anholt
e3b249f166 freedreno: Enable texture upload memory throttling.
Fixes oom-killer during streaming-texture-upload, which I found while
trying to enable piglit in CI.

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-12-06 14:03:50 -08:00
Fritz Koenig
c496d44284 freedreno: reorder format check
With the addition of the planar formats helper, the
planar formats no longer have a valid block.bits field.
Calling util_format_get_blocksize therefore asserts.

Reorder the check to see if the format is supported
before doing the query to get the blocksize.

Fixes: 20f132e5ef ("gallium/util: add planar format layouts and helpers")

Signed-off-by: Fritz Koenig <frkoenig@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-12-06 21:27:10 +00:00
Nanley Chery
21376cffb3 iris: Fix import of multi-planar surfaces with modifiers
Multi-planar surfaces are allowed to have modifiers. Don't require
DRM_FORMAT_MOD_INVALID in order to create a surface for each plane
defined by the format.

Fixes: 246eebba4a ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-06 20:31:48 +00:00
Nanley Chery
51ee8fff9b gallium: Store the image format in winsys_handle
This format will be used to properly handle planar images with modifiers
in iris.

Fixes: 246eebba4a ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-06 20:31:48 +00:00
Nanley Chery
d5c857837a gallium/dri2: Fix creation of multi-planar modifier images
The commit noted below assumed and enforced that DRM_MOD_INVALID was the
only valid modifier for multi-planar imported images. Due to that, it
required that modifier on multi-planar images to:

   1. Allow multiple planes.
   2. Perform YUV format lowering and extent adjustments.
   3. Use buffer_index to correctly map the given planes.

Fix these issues by removing or updating the code built on that
assumption.

Fixes: 2066966c10 ("gallium/dri2: Support creating multi-planar modifier images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-06 20:31:48 +00:00
Kenneth Graunke
ab016a6a2d meson: Include iris in default gallium-drivers for x86/x86_64
We build i965 by default on x86/x86_64 platforms; let's build iris too.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-12-06 12:27:26 -08:00
Jason Ekstrand
f9a3d9738b anv: Use BO fences/semaphores for AcquireNextImage
Instead of doing a dummy submit on the command buffer for the fence or a
dummy semaphore and trusting in implicit sync, this commit moves us to
take advantage of implicit sync and just use the WSI image BO as the
fence.  Both semaphores and fences require a tiny bit of extra plumbing
to do this but the result is that we can get rid of a bunch of the extra
synchronization we're doing today.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
ecc119a96e anv: Add a fence_reset_reset_temporary helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
ccb7d606f1 anv: Use submit-time implicit sync instead of allocate-time
In 83b943cc2f, we started making all VkDeviceMemory BOs resident all
the time.  One unfortunate side-effect of this is that every
vkQueueSubmit sets EXEC_OBJECT_WRITE on every WSI memory object which
means that X server or Wayland compositor, instead of waiting on the
last vkQueueSubmit to actually write the buffer, now waits on the last
vkQueueSubmit to from that driver instance relative to whenever the
compositor's GL driver instance calls execbuf.  This potentially leads
to a lot of extra synchronization that we didn't intend to have.

Instead, this commit makes it so that we leave WSI memory objects with
EXEC_OBJECT_ASYNC most of the time and only unset EXEC_OBJECT_ASYNC and
set EXEC_OBJECT_WRITE in the dummy execbuf that we do as part of
vkQueuePresent.  This should hopefully result in tighter integration
with the compositor, lower latency, and better performance.

Testing with DOOM 2016, this seems to reduce latency by at least a frame
if not two and makes the game much more responsive.  Testing was,
however, subjective, so we don't have any hard data on that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
6ebf677cfd anv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags
Otherwise, we're trusting in the execbuf_add_bo which sets
EXEC_OBJECT_WRITE to to always be the first one that gets called.  This
is likely true for fences but it seems somewhat fragile.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
778b51f491 vulkan/wsi: Add a hooks for signaling semaphores and fences
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
48e23a6406 vulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit
This lets us treat the implicit synchronization that we need for X11 and
Wayland like a semaphore.  Instead of trusting the driver to somehow
figure out when that memory object needs to be signaled, we provide an
explicit point where the driver can set EXEC_OBJECT_WRITE and signal the
dma_fence on the BO.  Without this, we have to somehow track inside the
driver when WSI buffers are actually used to avoid extra synchronization
dependencies.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:06 +00:00
Urja Rannikko
d07ed0c9c9 panfrost: free spill cost table in mir_spill_register
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 15:26:13 +00:00
Urja Rannikko
12e393bacf panfrost: add lcra_free() to free lcra state
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 15:26:13 +00:00
Urja Rannikko
5b6108834b panfrost: free allocations in schedule_block
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 15:26:13 +00:00
Urja Rannikko
e2dbea683c panfrost: free last_read/write tables in mir_create_dependency_graph
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 15:26:13 +00:00
Alyssa Rosenzweig
adf716dc7f panfrost: Rename SET_VALUE to WRITE_VALUE
See
https://lists.freedesktop.org/archives/dri-devel/2019-December/247601.html

Write value emphasises that it's just a generic write primitive.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 14:37:17 +00:00
Alyssa Rosenzweig
9eae950342 panfrost: Update SET_VALUE with information from igt
It's not a tiler specific initialization; it's a generic GPU-side write
primitive that may be used for tiler reset on midgard.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 14:37:17 +00:00
Samuel Pitoiset
c1a362722f gitlab-ci: add a job that runs Vulkan CTS with RADV conditionally
Only Polaris10 is tested at the moment, and I disabled a TON of
tests to keep a CTS run within 5 minutes because my local runner
is a bit slow. A full CTS run takes more than 1h, which means it
will hit the timeout.

RADV CI can only be triggered manually on personal branches to
avoid breaking the world because one runner is definitely not
enough. This will allow us to test it until it's stable enough
to be enabled by default.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:58:03 +01:00
Samuel Pitoiset
40c6a56751 gitlab-ci: build RADV in meson-testing
This requires to bump LLVM to 8 because it's the minimum supported
version by RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:58:00 +01:00
Samuel Pitoiset
f32bf4f1e2 gitlab-ci: configure the Vulkan ICD export with VK_DRIVER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-12-06 10:57:57 +01:00
Samuel Pitoiset
16b999b7d1 gitlab-ci: allow to run dEQP Vulkan with DEQP_VER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:57:55 +01:00
Samuel Pitoiset
0b246d3558 gitlab-ci: add a new base test job for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:57:54 +01:00
Samuel Pitoiset
35a7ec79db gitlab-ci: build dEQP VK 1.1.6 in the x86 test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:57:52 +01:00
Samuel Pitoiset
4bbb1d3b06 gitlab-ci: build cts_runner in the x86 test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:57:50 +01:00
Samuel Pitoiset
f2a594f384 gitlab-ci: add a new job that builds a base test image for VK
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:57:48 +01:00
Samuel Pitoiset
520a77d486 gitlab-ci: add a gl suffix to the x86 test image and all test jobs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:57:46 +01:00
Samuel Pitoiset
7e0ab6aae0 gitlab-ci: rename build-deqp.sh to build-deqp-gl.sh
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-06 10:57:45 +01:00
Michel Dänzer
41797a1fed gitlab-ci: Overhaul job run policy
Use new rules: instead of only:

For container stage jobs:

* In the main Mesa project, run them by default.

* In merge requests, run them by default if any files affecting pipeline
  results are changed.

* In all other cases (in particular branches in personal projects),
  don't run them by default but allow triggering them manually.

build & test stage jobs are left at the default (when: on_success), so
they will run automatically once all their dependencies are satisified.
(Using the same rules as above would require these jobs to be manually
triggered as well, which is only possible once all dependency jobs have
passed) Please be considerate of CI runner resources and cancel unneeded
jobs on personal branches with no corresponding merge requests (this can
be done before the jobs start running).

In summary: No more special branch names. Unnecessary job runs are
avoided by default, but jobs which don't run by default can be triggered
manually.

v2:
* Split out LAVA changes to separate commit
* Clarify commit log a little, in particular WRT build/test stage jobs

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> # v1
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> # v1
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> # v1
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-12-06 10:02:01 +01:00
Michel Dänzer
ebd1309fef gitlab-ci: Use the common run policy for LAVA jobs as well again
Having different policies could have some weird results, e.g. changes
only touching documentation (where the intention is not to run the
pipeline by default) would still create a pipeline with the LAVA jobs
running by default.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-12-06 09:39:40 +01:00
Jonathan Marek
0796e7e70d turnip: implement border color
Fixes the deqp fails in:
dEQP-VK.pipeline.sampler.*border*
(minus 1d array/d24 cases which fail for other reasons)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-05 22:12:30 -05:00
Jonathan Marek
095d35eff8 turnip: improve emit_textures
Two things:
* Texture/sampler pointers aligned to the size of texture/sampler state
* Returning errors instead of crashing on OOM

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-05 22:12:30 -05:00
Jonathan Marek
3ab4f99461 turnip: add function to allocate aligned memory in a substream cs
To use with texture states that need alignment (texconst, sampler, border)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-05 22:12:29 -05:00
Timothy Arceri
1abca2b3c8 glsl/nir: iterate the system values list when adding varyings
Iterate the system values list when adding varyings to the program
resource list in the NIR linker. This is needed to avoid CTS
regressions when using the NIR to build the GLSL resource list in
an upcoming series. Presumably it also fixes a bug with the current
ARB_gl_spirv support.

Fixes: ffdb44d3a0 ("nir/linker: Add inputs/outputs to the program resource list")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-05 22:04:31 +00:00
Dave Airlie
201ed4b4e7 llvmpipe: enable support for primitives generated outside streamout
This enables the draw support when the queries are enabled.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-06 06:48:30 +10:00
Dave Airlie
5f8af9731e draw: add support for collecting primitives generated outside streamout
GL/gallium require gathering primitives generated outside streamout
stats. This introduces the draw interfaces to enabling collecting this.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-06 06:48:30 +10:00
Dave Airlie
f137672197 llvmpipe: disable occlusion queries when requested by state tracker
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-06 06:48:30 +10:00
Dave Airlie
3b8e1b3ee4 llvmpipe: add queries disabled flag
This flag is set when the state tracker request queries
be disabled for meta operations.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-06 06:48:30 +10:00
Kenneth Graunke
ef893db468 main: Change u_mmAllocMem align2 from bytes (old API) to bits (new API)
The main and Gallium implementations were recently merged, and the
align2 parameter in the Gallium one is in bits.  execmem.c expected
bytes still.  This led to every call here asserting.

Fixes: b6fd679a9e("mesa/main/util: moving gallium u_mm to util, remove main/mm")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2019-12-05 21:07:09 +01:00
Eric Anholt
3097efe5f0 ci: Disable egl_ext_device_drm tests in piglit.
If the runner has a HW device that would be supported, even without
/dev/dri forwarded into the container, it will be enumerated and the tests
on llvmpipe fail with (for example):

libEGL warning: Not allowed to force software rendering when API explicitly selects a hardware device.
libEGL warning: MESA-LOADER: failed to open i965 (search paths /builds/anholt/mesa/install/lib/dri)

Given that we can't necessarily control the DRI devices present on the
runners (particularly for developers bringing their own runners to reduce
the demands on fd.o's shared resources), just skip these tests in CI.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-12-05 18:06:10 +00:00
Jason Ekstrand
752196a493 util/atomic: Add p_atomic_add_return for the unlocked path
Fixes: 385d13f26d "util/atomic: Add a _return variant of p_atomic_add"
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-12-05 11:55:21 -06:00
Jason Ekstrand
1b6991ba1d anv: Implement VK_KHR_buffer_device_address
The primary difference between the KHR and EXT versions of the extension
is that the KHR provides the address at AllocateMemory time for replay
so we can replay it safely without moving to a sparse address model.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
4428cd9127 anv: Use a pNext loop in AllocateMemory
This function has a lot of possible extensions and some of them we can
easily handle on-the-fly so it's easier to just have a loop than to find
each structure manually.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
a8e59b3708 anv: Add allocator support for client-visible addresses
When a BO is flagged as having a client visible address, we put it in
its own heap.  We also support the client explicitly specifying an
address in said heap.  If an address collision happens, we return false
from anv_vma_alloc which turns into a VK_ERROR_OUT_OF_DEVICE_MEMORY.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
96e3328ac2 util/vma: Add a function to allocate a particular address range
This new function lets you request to remove a specific address range
from the allocator.  It returns true on success and leaves the allocator
unmodified and returns false on failure.  It doesn't need to return an
offset because, if it succeeds, the offset passed in is the allocated
offset.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
782fb5407d util/vma: Factor out the hole splitting part of util_vma_heap_alloc
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
03450e9cfc anv: Add an explicit_address parameter to anv_device_alloc_bo
We already have a mechanism for specifying that we want a fixed address
provided by the driver internals.  We're about to let the client start
specifying addresses in some very special scenarios as well so we want
to pass this through to the allocation function.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
597fdb9e21 anv: Stop advertising two heaps just for the VF cache WA
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
b47bc0202a anv: Set up VMA heaps independently from memory heaps
Our VMA allocations are really independent from the memory heaps we
expose via the API.  The only thing that really matters is the GTT size
so we can make the high heap the right size.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
1037b52cf4 anv: Stop tracking VMA allocations
util_vma_heap_alloc will already return 0 if it doesn't have enough
space.  The only thing the vma_*_available tracking was doing was
preventing us from allocating too much on any given heap.  Now that
we're tracking that in the heap itself, we can drop these.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
a4e3d8f0db anv: Disallow allocating above heap sizes
We're already tracking the amount of memory used in each heap.  This
commit just makes us start rejecting memory allocations if the heap
would grow too large.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
385d13f26d util/atomic: Add a _return variant of p_atomic_add
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
0a36fafa95 anv: Don't leak when set_tiling fails
Fixes: a44744e01d "anv: Require a dedicated allocation for..."
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
46af0ecc1d anv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
1b5cb92b62 anv: Apply cache flushes after setting index/draw VBs
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
7ce39a55c1 anv: Always invalidate the VF cache in BeginCommandBuffer
I think the reason why we only do this for primaries is that we didn't
expect to have blorp calls in secondaries.  However, you are allowed to
have a full render pass in a secondary command buffer so resolves and
clears can end up in there.  We should just always invalidate.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
a500a6b7f1 blorp: Pass the VB size to the VF cache workaround
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
c142a40a92 anv: Add a has_softpin boolean
This separates "has" from "use" which will make the next commit a bit
cleaner.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
0bba88081b anv: Drop bo_flags from anv_bo_pool
In ee77938733, we started using the BO cache for anv_bo_pool and
stopped using the bo_flags parameter.  However, we never dropped it from
the struct or the init function.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:58:14 -06:00
Michel Dänzer
f6a913bb95 glsl/tests: Use splitlines() instead of strip()
strip() removes leading and trailing newlines, but leaves newlines
between multiple lines in the string. This could cause failures when
comparing the output of cross-compiled Windows binaries (producing
Windows-style newlines) to the expected output with Unix-style newlines.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-12-05 12:31:17 +01:00
Mauro Rossi
96aef08dc6 android: radeonsi: fix build after vl refactoring (v2)
vl functions moved from radeonsi to gallium/auxiliary/vl have left
android build of radeonsi in broken state.

libmesa_galliumvl static is need to build readeonsi,
gallium_dri building rules are reworked to avoid multiple symbols
and libmesa_galliumvl static dependency is needed in radeonsi.

Here is the changelog:
- android: gallium/auxiliary: add libmesa_galliumvl static
- android: gallium_dri: move libmesa_gallium to static to prevent multiple symbols
- android: radeonsi: fix build after vl refactoring

Fixes the following building error:

external/mesa/src/gallium/drivers/radeonsi/si_uvd.c:47:
error: undefined reference to 'vl_video_buffer_create_as_resource'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 86e60bc ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-05 08:08:23 +00:00
Tapani Pälli
32ebd4207a intel/compiler: force simd8 when dual src blending on gen8
Patch introduces option to force simd8 and uses it as a workaround for
dual source blending issues seen with skqp (skia testsuite) on gen8.

Fixes following Piglit test on gen8 platforms:
   arb_blend_func_extended-dual-src-blending-issue-1917

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1917
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
c: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 09:42:50 +02:00
Tapani Pälli
f6004bac1f intel/compiler: add newline to limit_dispatch_width message
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 08:13:58 +02:00
Eric Anholt
c3efeac4c6 turnip: Add support for compute shaders.
Since compute shares the FS state with graphics, we have to re-upload the
pipeline state when switching between compute dispatch and graphics draws.
We could potentially expose graphics and compute as separate queues and
then we wouldn't need pipeline state management, but the closed driver
exposes a single queue and consistency with them is probably good.

So far I'm emitting texture/ibo state as IBs that we jump to.  This is
kind of silly when we could just emit it directly in our CS, but that's a
refactor we can do later.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Eric Anholt
ccf8230547 turnip: Move pipeline BO list adding to BindPipeline.
We only need to do it once when we bind, rather than having to check at
every draw call.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Eric Anholt
e26962f756 turnip: Sanity check that we're adding valid BOs to the list.
I tripped over this during CS enabling when my program BO wasn't set up.
Easier to debug this way than the kernel telling us a 0 handle is invalid.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Eric Anholt
4365e955d8 turnip: Add a helper function for getting tu_buffer iovas.
Easier than remembering to add all 3 offsets.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Eric Anholt
70d6428be5 turnip: Refactor the graphics pipeline create implementation.
The loop over the pipelines to create (and the failure handling) was
noisy, and the stub for compute setup looked nicer to me.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Eric Anholt
e46da7dbea turnip: Add basic SSBO support.
This is enough to pass
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.fragment.single_descriptor.*
with fragmentStoresAndAtomics set, and thus to be able to start working on
compute.  I haven't enabled that flag yet, because it also implies image
load/store support, which I haven't filled in.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Eric Anholt
1f4e8f3c46 turnip: Reuse tu6_stage2opcode() more.
A bit of cleanup for adding more stages later.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Eric Anholt
5b23671f6a turnip: Drop redefinition of VALIDREG now that it's in ir3.h.
Fixes: 937b905569 ("freedreno/ir3: fix neverball assert in case of unused VS inputs")

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Eric Anholt
bb49f19c1b turnip: Fix unused variable warnings.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-12-04 20:32:15 -08:00
Timothy Arceri
1b1b436fa7 glsl: make use of active_shader_mask when building resource list
This allows us to avoid walking the entire IR looking for used
uniforms.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-12-05 13:18:30 +11:00
Timothy Arceri
f0cb0fe1c0 glsl: don't set uniform block as used when its not
The spec requires unused uniform block to be set as active in the
program resource list. To support this we tell opt dead code not to
remove them. However we can mark them as unused internally and
avoid unnecessarily state changes.

This change is also required for the folowing clean-up patch.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-12-05 13:18:23 +11:00
Timothy Arceri
50dc4b77f6 glsl: move calculate_array_size_and_stride() to link_uniforms.cpp
This is where all the other uniform values are populated so it
makes much more sense here. Moving it will also allow us to better
share code between the NIR and GLSL IR resource list builders.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-12-05 13:18:02 +11:00
Ian Romanick
c9acf0739f anv: Fix error message format string
See also 246261f0ad

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
CID: 1455892
Fixes: 246261f0ad ("anv: prepare the driver for delayed submissions")
2019-12-04 15:34:03 -08:00
Ian Romanick
7840985609 mesa: Silence unused parameter warning
Unused since e4da8b9c33 ("mesa/compiler: rework tear down of
builtin/types").

src/mesa/main/context.c: In function ‘_mesa_free_context_data’:
src/mesa/main/context.c:1321:54: warning: unused parameter ‘destroy_compiler_types’ [-Wunused-parameter]
 1321 | _mesa_free_context_data(struct gl_context *ctx, bool destroy_compiler_types)
      |                                                      ^

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-12-04 15:34:03 -08:00
Ian Romanick
a7e607641a mesa: Silence 'left shift of negative value' warning in BPTC compression code
src/util/format/../../mesa/main/texcompress_bptc_tmp.h:830:31: warning: left shift of negative value [-Wshift-negative-value]
  830 |       value |= (~(int32_t) 0) << n_bits;
      |                               ^~

v2: Rewrite to just shift left then shift right.  Based on conversation
with Neil in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2792#note_320272,
this should be fine.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> [v1]
Reviewed-by: Neil Roberts <nroberts@igalia.com>
2019-12-04 15:34:03 -08:00
Ian Romanick
668635abd2 intel/compiler: Fix 'comparison is always true' warning
Without looking at the assembly or something, I'm not sure what the
compiler does here.  The brw_reg_type enum is marked packed, so I'm
guess that it gets represented as a uint8_t.  That's the only reason I
could think that comparing with -1 would be always true.

This patch adds the same cast that exists in brw_hw_type_to_reg_type.
It might be better to add a #define outside the enum for
BRW_REGISTER_TYPE_INVALID as (enum brw_reg_type)-1.

src/intel/compiler/brw_eu_compact.c: In function ‘has_immediate’:
src/intel/compiler/brw_eu_compact.c:1515:20: warning: comparison is always true due to limited range of data type [-Wtype-limits]
 1515 |       return *type != -1;
      |                    ^~
src/intel/compiler/brw_eu_compact.c:1518:20: warning: comparison is always true due to limited range of data type [-Wtype-limits]
 1518 |       return *type != -1;
      |                    ^~

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
CID: 1455194
Fixes: 12d3b11908 ("intel/compiler: Add instruction compaction support on Gen12")
Cc: @mattst88
2019-12-04 15:34:03 -08:00
Dylan Baker
5b3d6979a6 docs: Update mesa 19.3 release calendar 2019-12-04 14:42:41 -08:00
Dylan Baker
953d20e6f5 docs: update calendar, add news item and link release notes for 19.2.7 2019-12-04 14:42:41 -08:00
Dylan Baker
bd518aa208 docs: Add SHA256 sums for 19.2.7 2019-12-04 14:42:41 -08:00
Dylan Baker
26aa024cdf docs: Add release notes for 19.2.7 2019-12-04 14:42:41 -08:00
Jonathan Marek
ec28714b78 turnip: allow writes to draw_cs outside of render pass
This is for state commands like CmdSetViewport that can be used outside of
a renderpass. Accumulating those into draw_cs outside of the renderpass
should have the desired effect.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-04 17:35:18 -05:00
Rob Clark
372ed42d22 nir/lower_clip: Fix incorrect driver loc for clipdist outputs
Somehow adjusting maxloc based on existing outputs got lost, resulting
in the clipdist varying clobbering the position varying.  Causing a
shader that had no position output in freedreno/ir3, which triggers GPU
hangs in neverball.

Fixes: d0f746b645 ("nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-04 13:08:52 -08:00
Rob Clark
937b905569 freedreno/ir3: fix neverball assert in case of unused VS inputs
The logic to ensure VS and BS inputs are aligned wasn't accounting for
unused inputs in VS.  This *usually* doesn't happen, but it seems it
can in the case of ARB programs?

Fixes assert:
```
fd6_program_create: Assertion `bs->inputs[i].regid == vs->inputs[i].regid' failed.
```

Fixes: 882d53d8e3 ("freedreno/ir3+a6xx: same VBO state for draw/binning")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-04 13:08:52 -08:00
Rob Clark
4e47c205b9 freedreno/ir3: remove store_output lowered to store_shared_ir3
Fixes crashes that were unnoticed in CI because debug_assert() was not
enabled (but become real crashes after the next patch):

dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_highp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_lowp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_mediump_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_highp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_lowp_geometry
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_mediump_geometry

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-04 13:08:52 -08:00
Rafael Antognolli
50f60d69e4 iris: Add restriction to 3DSTATE_CONSTANT_ packets.
The following programming note shows up in all 3DSTATE_CONSTANT_*
packets:

   "The sum of all four read length fields must be less than or equal to
   the size of 64."

The backend compiler should guarantee this for us, so let's just add a
check here.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
d3e339364f anv: Use 3DSTATE_CONSTANT_ALL when possible.
Use this new instruction introduced in Gen12. The instruction itself is
smaller, and it also allows us to emit a single instruction to all
stages that have the same push constant buffers (e.g. when they don't
have constant buffers).

There's one restriction to use this instruction, though: the length
field is only 5 bits long, so we need to check whether we can use it,
and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32.

v2:
 - Rebased on top of the lasted changes from Jason.
 - Added review suggestions by Caio.
 - Removed struct push_bos and merged some code into
 anv_nir_compute_push_layout().

v3:
 - Remove code churn due to gen8+ workaround in
 anv_nir_compute_push_layout(). This code has been removed in an earlier
 commit, and implemented in cmd_buffer_emit_push_constant().

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
7d5da53d27 anv: Move code for emitting push constants into its own function.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
67d2cb3e93 anv: Add get_push_range_address() helper.
Add a helper function to get the push range address. Once we have a
separate function for emitting gen12 push constants, we can use this
helper and avoid duplicating code.

v3: Do not add range->start to the address in gen7 (Caio).
v4: Do not drop range->start from gen7 (Caio, Jason).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
c0225a728e anv: Move gen8+ push constant packet workaround.
Store push_ranges in ascending order, and only "shift" them to the end
of the array during state packet emission.

We don't need this workaround with the new 3DSTATE_CONSTANT_ALL packet.
So instead of applying the workaround here just for GEN < 12 (which
requires and extra loop through all the ranges to figure out if we
should shift them or not), we simply move the whole logic to the state
emission code. At that point, in a later commit, we are already looping
through all of the ranges anyway to check which packet we will be using,
so we might as well implement the workaround there, where it is going to
be used.

v3: Move gen8+ workaround to the state emission code (Caio).
v4: Add explanation of why we moved the workaroudn (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
06438ea7fa iris: Use 3DSTATE_CONSTANT_ALL when possible.
Use this new instruction introduced in Gen12. The instruction itself is
smaller, and it also allows us to emit a single instruction to all
stages that have the same push constant buffers (e.g. when they don't
have constant buffers).

There's one restriction to use this instruction, though: the length
field is only 5 bits long, so we need to check whether we can use it,
and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32.

v2 (Suggestions from Caio):
 - use max_length instead of large_buffers.
 - remove UNUSED and use #if GEN_GEN >= 12 instead.
 - inline "buffers" and drop BITSET_RANGE() usage.
 - add assert(n <= max_pointers)
 - move emit to outside of the loop.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
1ba9a18911 iris: Rework push constants emitting code.
Split into a function the logic to gather the push constant buffers,
which now stores them in struct push_bos. Another function is added to
emit the packet, using data from the push_bos struct.

This will be useful when adding a new function for emitting push
constants for newer platforms.

v2 (Suggestions from Caio):
   - rename 'n' -> 'buffer_count'
   - remove large_buffers (for now)
   - initialize push_bos
   - remove assert
   - change for() condition (i <= 3 -> i < 4)
v3:
   - Add comment about size limit.
   - Rework "shift" logic and 'for' loop.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
9db044792f intel/blorp: Use 3DSTATE_CONSTANT_ALL to setup push constants.
In blorp, all the push constants are disabled, so we only need to emit a
single 3DSTATE_CONSTANT_ALL with the bitmask for stage update
appropriately set.

v2: Update comment (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
8983622995 intel/aubinator: Decode 3DSTATE_CONSTANT_ALL.
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-04 20:48:25 +00:00
Rafael Antognolli
2d127614a2 intel/genxml: Add 3DSTATE_CONSTANT_ALL packet.
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-04 20:48:25 +00:00
Jonathan Marek
1576ff5fbb turnip: MSAA resolve directly from GMEM
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-04 14:39:06 -05:00
Jonathan Marek
abaaf0b2e7 turnip: don't set unused BLIT_DST_INFO bits for GMEM clear
These bits are ignored when clearing so don't bother setting them.

Note: MSAA samples when clearing comes from other registers (tu6_emit_msaa)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-04 14:39:06 -05:00
Jonathan Marek
4babdc7381 turnip: implement CmdClearAttachments
Passes these deqp tests: dEQP-VK.api.image_clearing.core.*attach*single*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-04 14:39:06 -05:00
Jonathan Marek
1dfa2e6c99 turnip: don't skip unused attachments when setting up tiling config
This makes it easier to find the gmem_offset associated with an attachment.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-04 14:39:06 -05:00
Vasily Khoruzhick
8c12f4e5f2 lima: enable tiling
Now that we have tiled format modifier merged into linux we can enable tiling.

That should improve overall performance and also workaround broken mipmapping
for linear textures since now we prefer tiled textures.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-12-04 08:20:56 -08:00
Tapani Pälli
272ef5d39a glsl: additional interface redeclaration check for SSO programs
Patch adds additional linker check for SSO programs to make sure they
are redeclaring built-in blocks as required by the desktop spec.

This fixes following Piglit tests:
   arb_separate_shader_objects/linker/pervertex-*

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-12-04 15:27:41 +00:00
Tapani Pälli
2d26cc077d gitlab-ci: bump piglit checkout commit
Commit also updates the Piglit quick_gl.txt, list modifications happened
due to following Piglit commits: c248bf201,c acff58ca, 5603e2e60.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-12-04 15:27:41 +00:00
Rhys Perry
3e67aa2e4e nir/load_store_vectorize: fix combining stores with aliasing loads between
v2: add test

Fixes: ce9205c03b ('nir: add a load/store vectorization pass')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v2)
2019-12-04 12:21:40 +00:00
Timur Kristóf
637c5a1dd9 aco/wave32: Fix reductions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
21db083504 aco/wave32: Allow setting the subgroup ballot size to 64-bit.
Previously, it would only work when the ballot size was set to the
lane mask. This patch makes is possible to set the ballot size
to either 32-bit or 64-bit for wave32 mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
ed815d503e aco/wave32: Use wave_size for barrier intrinsic.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
b8f2edb452 aco/wave32: Fix load_local_invocation_index to support wave32.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
e0bcefc3a0 aco/wave32: Use lane mask regclass for exec/vcc.
Currently all usages of exec and vcc are hardcoded to use s2 regclass.
This commit makes it possible to use s1 in wave32 mode and
s2 in wave64 mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
b4efe179ed aco/wave32: Add wave size specific opcodes to aco_builder.
Several places in ACO we use SOP1 or SOP2 instructions to operate over the
exec mask or VCC, and these need to be adapted to the new size in wave32
mode.

This commit adds a way to deal with this problem in aco_builder: the caller
can specify a wave size specific opcode and the builder will translate that
to the correct opcode based on the current wave size.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
c44af6cbc7 aco/wave32: Introduce emit_mbcnt which takes wave size into account.
This is relevant because in wave32 mode the v_mbcnt_hi_u32_b32
instruction is superfluous.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
07754a9c9e aco/wave32: Replace hardcoded numbers in spiller with wave size.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
c0dbf42a03 aco/wave32: Change uniform bool optimization to work with wave32.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
dd9dad731b aco: Optimize load_subgroup_id to one bit field extract instruction.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
753670e902 aco: Remove lower_linear_bool_phi, it is not needed anymore.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
0d2d672020 aco: Remove superfluous argument from emit_boolean_logic.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
9a43d26b74 aco: Fix operand of s_bcnt1_i32_b64 in emit_boolean_reduce.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Michel Dänzer
5585b8eadd gitlab-ci: Run piglit glslparser & quick_shader tests separately
And only use --process-isolation false for the quick_gl tests.

This will hopefully avoid variance in the test results that we've been
seeing lately. But even if it doesn't, it should at least help narrow
down the cause of the variance.

Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-04 10:36:33 +01:00
Lionel Landwerlin
ddacd3d43b intel/perf: fix improper pointer access
This expression was unused by the macro, probably why it didn't
register in the compilation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
8c0b058263 intel/perf: simplify the processing of OA reports
This is a more accurate description of what happens in processing the
OA reports.

Previously we only had a somewhat difficult to parse state machine
tracking the context ID.

What we really only need to do to decide if the delta between 2
reports (r0 & r1) should be accumulated in the query result is :

   * whether the r0 is tagged with the context ID relevant to us

   * if r0 is not tagged with our context ID and r1 is: does r0 have a
     invalid context id? If not then we're in a case where i915 has
     resubmitted the same context for execution through the execlist
     submission port

v2: Update comment (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
b364e920bf intel/perf: take into account that reports read can be fairly old
If we read the OA reports late enough after the query happens, we can
get a timestamp in the report that is significantly in the past
compared to the start timestamp of the query. The current code must
deal with the wraparound of the timestamp value (every ~6 minute). So
consider that if the difference is greater than half that wraparound
period, we're probably dealing with an old report and make the caller
aware it should read more reports when they're available.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
9d0a5c817c intel/perf: set read buffer len to 0 to identify empty buffer
We always add an empty buffer in the list when creating the query.
Let's set the len appropriately so that we can recognize it when we
read OA reports up to the end of a query.

We were using an 0 timestamp value associated with the empty buffer
and incorrectly assuming this was a valid value. In turn that led to
not reading enough reports and resulted in deltas added to our counter
values which should have been discarded because those would be flagged
for a different context.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
acea59dbf8 intel/perf: fix invalid hw_id in query results
Accumulation happens between 2 reports, it can be between a start/end
report from another context. So only consider updating the hw_id of
the results when it's not already valid and that we have a valid value
to put in there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Pierre-Eric Pelloux-Prayer
a7bbebcfb9 radeonsi: display cs blit count for AMD_DEBUG=testdma
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-04 09:08:28 +01:00
Pierre-Eric Pelloux-Prayer
082d1c1686 radeonsi: implement sdma for GFX9
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-04 09:08:28 +01:00
Samuel Pitoiset
4cacba0c86 radv/gfx10: fix the vertex order for triangle strips emitted by a GS
My fix wasn't totally correct as pointed out by Marek.
Ported from RadeonSI.

Fixes: deafe4cc58 ("radv/gfx10: fix primitive indices orientation for NGG GS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-04 08:28:57 +01:00
Samuel Pitoiset
dac6bd29ae radv: simplify a check in radv_fixup_vertex_input_fetches()
The number of loaded channels should always be > 0 now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-04 08:04:05 +01:00
Samuel Pitoiset
3b51259f06 radv: remove dead shader input/output variables
No pipeline-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-04 08:04:05 +01:00
Jason Ekstrand
0604768ae4 iris: Stop setting up fake params
In d1c4e64a69, we added a parameter to tell the back-end compiler to
ignore the param array and just push however many constants you ask it
to push.  Iris doesn't want to push anything so it gives a bogus number
of parameters and trusts the back-end compiler to dead-code all of them.
Now that we can tell the back-end compiler to stop re-arranging things,
delete the hack and enable the new simpler code path.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 04:52:20 +00:00
Dave Airlie
713636766d gallium/scons: fix graw-xlib build on OSX.
Fixes: 44a6b0107b (gallivm: add nir->llvm translation (v2))

Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-12-04 13:24:44 +10:00
Dave Airlie
3263c9824e llvmpipe: enable texcoord semantics
To make NIR transitioning easier, move the driver to using
texcoord semantics.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-04 12:08:14 +10:00
Jason Ekstrand
178a2946c0 anv: Respect the always_flush_cache driconf option
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-03 17:10:51 -06:00
Krzysztof Raszkowski
07adc47460 gallium/swr: Fix crash when use GL_TDFX_texture_compression_FXT1 format.
Reject the new formats in swr to prevent crashes because it doesn't
know how to handle the new formats.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-12-03 16:51:24 +00:00
Rob Clark
b31637c453 gitlab-ci: disable junit results for deqp
They don't seem to be hugely useful, and seem to be bogging down gitlab.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-03 08:46:39 -08:00
Jason Ekstrand
b1f37688ba anv: Set up SBE_SWIZ properly for gl_Viewport
gl_Viewport is also in the VUE header so we need to whack the read
offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that
case as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-03 16:20:50 +00:00
Michel Dänzer
0c88d5952a gitlab-ci: Update to current ci-templates master
Fixes skopeo copy failures.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-03 16:03:31 +01:00
Samuel Pitoiset
f63a3132e8 ac/llvm: fix atomic var operations if source isn't a deref
Fixes some CTS regressions.

Fixes: e61a826f39 ("ac/llvm: fix pointer type for global atomics")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-03 09:41:33 +01:00
Neil Armstrong
dde734030b Add support for T820 CI Jobs
Tomeu: - Small rebase fixups

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 06:44:08 +01:00
Dave Airlie
502548a09c gallivm/llvmpipe: add support for front facing in sysval.
This wires up the front facing value as a sysval, I'd like to
remove the other facing code but I'd need to confirm VMware
don't use it first.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-03 15:29:04 +10:00
Dave Airlie
f52cdaa517 llvmpipe/images: handle undefined atomic without crashing
just return 0 for unbound atomic operations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-03 15:29:04 +10:00
Alyssa Rosenzweig
71dd52e056 panfrost: Remove blend shader hack
This is no longer used.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Tomeu Vizoso
c707b4d0f9 gitlab-ci: Test Panfrost on T720 GPUs
Now that the Mali T720 GPU is supoprted at the same level as the T760,
test it on PINE64 H64 boards.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
6d05e38a96 gitlab-ci: Remove non-default skips from Panfrost
During the past months, Panfrost has matured considerably and several
tests stopped being flaky or failing at all.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Tomeu Vizoso
b655be7252 panfrost: White list the Mali T720
Support for this GPU is equal now to that of T760, so whitelist it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
8555bffafd pan/midgard: Splatter on fragment out
Make sure that the fragment is complete when writing it out.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 04:25:04 +00:00
Tomeu Vizoso
ab81a23d36 panfrost: Simplify shader patching
We need to always upload anyway.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
6ddaa5558a panfrost: Simplify draw_flags
Fixes dEQP-GLES3.functional.primitive_restart.*. Note the 0x18000 value
is accidentally somehow enabling primitive restart for some reason.
I'm not sure where this value came from but let's not.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
9fb0904712 panfrost: Implement pan_tiler for non-hierarchy GPUs
The algorithm is as described. Nothing fancy here, just need to add some
new code paths depending on which model we're running on.

Tomeu:
- Also disable tiling when !hierarchy and !vertex_count
- Avoid creating polygon lists smaller than the minimum when
  vertex_count > 0 but tile size smaller than 16 byte
- Take into account tile size when calculating polygon list size for
  !hierarchy
- Allow 0-sized tiles in a single dimension

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig
63cd5b8198 panfrost: Add information about T720 tiling
We've figured out most of the big pieces, and though it looks faintly
like other Midgards, it's much simpler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-03 04:25:04 +00:00
Tomeu Vizoso
6887ff4e79 panfrost: Add quirks system to cmdstream
Similarly to how it's already done in the compiler, add a way to express
differences between GPU models that need to be taken into account when
assembling the cmdstream.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-12-03 04:25:04 +00:00
Ian Romanick
fbd5359a0a nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select
Reviewed-by: Matt Turner <mattst88@gmail.com>

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14660366 -> 14653437 (-0.05%)
instructions in affected programs: 316166 -> 309237 (-2.19%)
helped: 905
HURT: 10
helped stats (abs) min: 1 max: 36 x̄: 7.67 x̃: 6
helped stats (rel) min: 0.13% max: 18.75% x̄: 4.28% x̃: 3.60%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.10% max: 1.33% x̄: 0.70% x̃: 0.97%
95% mean confidence interval for instructions value: -7.91 -7.23
95% mean confidence interval for instructions %-change: -4.46% -3.99%
Instructions are helped.

total cycles in shared programs: 228571646 -> 228549759 (<.01%)
cycles in affected programs: 56239919 -> 56218032 (-0.04%)
helped: 681
HURT: 216
helped stats (abs) min: 1 max: 5156 x̄: 45.49 x̃: 10
helped stats (rel) min: <.01% max: 10.45% x̄: 1.29% x̃: 0.65%
HURT stats (abs)   min: 1 max: 320 x̄: 42.09 x̃: 14
HURT stats (rel)   min: <.01% max: 37.04% x̄: 1.38% x̃: 0.49%
95% mean confidence interval for cycles value: -41.51 -7.29
95% mean confidence interval for cycles %-change: -0.80% -0.49%
Cycles are helped.

LOST:   1
GAINED: 0
2019-12-02 16:46:20 -08:00
Ian Romanick
780b5c1037 nir/algebraic: Simplify some Inf and NaN avoidance code
Since a is non-negative, neither fsqrt nor frsq should return NaN.  frsq
should only return Inf when fsqrt returns 0.

The changes are pretty small, but this turns a few hundred hurt shaders
in the next patch into helped shaders.

An alternative to the intBitsToFloat is to import numpy and do
np.finfo(np.float32).max.  That's more explicit, but we may also want to
have specific bit encodings of float values later.  I could be convinced
either way, but intBitsToFloat(0x7f7fffff) was what I implemented first.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14661140 -> 14661104 (<.01%)
instructions in affected programs: 7520 -> 7484 (-0.48%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.32% max: 0.61% x̄: 0.49% x̃: 0.52%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.52% -0.47%
Instructions are helped.

total cycles in shared programs: 228585416 -> 228584806 (<.01%)
cycles in affected programs: 56321 -> 55711 (-1.08%)
helped: 32
HURT: 0
helped stats (abs) min: 2 max: 98 x̄: 19.06 x̃: 10
helped stats (rel) min: 0.08% max: 6.41% x̄: 1.09% x̃: 0.65%
95% mean confidence interval for cycles value: -28.32 -9.80
95% mean confidence interval for cycles %-change: -1.63% -0.54%
Cycles are helped.

Sandy Bridge
total cycles in shared programs: 152991077 -> 152991075 (<.01%)
cycles in affected programs: 11525 -> 11523 (-0.02%)
helped: 2
HURT: 2
helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.07% max: 0.11% x̄: 0.09% x̃: 0.09%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: -5.27 4.27
95% mean confidence interval for cycles %-change: -0.16% 0.15%
Inconclusive result (value mean confidence interval includes 0).

No changes on Iron Lake or GM45.
2019-12-02 16:46:20 -08:00
Ian Romanick
d15344c0f5 intel/compiler: Increase nir_opt_peephole_select threshold
I tried 2, 4, 6, 8, and 10.  8 seemed to be the sweet spot across all
Intel platforms.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14736141 -> 14661140 (-0.51%)
instructions in affected programs: 2272413 -> 2197412 (-3.30%)
helped: 8416
HURT: 140
helped stats (abs) min: 1 max: 1152 x̄: 8.99 x̃: 6
helped stats (rel) min: 0.13% max: 42.55% x̄: 4.15% x̃: 3.20%
HURT stats (abs)   min: 1 max: 140 x̄: 4.73 x̃: 1
HURT stats (rel)   min: 0.03% max: 3.44% x̄: 0.87% x̃: 0.60%
95% mean confidence interval for instructions value: -9.36 -8.17
95% mean confidence interval for instructions %-change: -4.14% -3.99%
Instructions are helped.

total cycles in shared programs: 231560416 -> 228585416 (-1.28%)
cycles in affected programs: 126536021 -> 123561021 (-2.35%)
helped: 7092
HURT: 1898
helped stats (abs) min: 1 max: 419320 x̄: 519.02 x̃: 159
helped stats (rel) min: <.01% max: 77.25% x̄: 13.52% x̃: 11.77%
HURT stats (abs)   min: 1 max: 14518 x̄: 371.91 x̃: 36
HURT stats (rel)   min: <.01% max: 103.23% x̄: 5.92% x̃: 2.55%
95% mean confidence interval for cycles value: -514.34 -147.50
95% mean confidence interval for cycles %-change: -9.69% -9.14%
Cycles are helped.

total spills in shared programs: 5763 -> 5848 (1.47%)
spills in affected programs: 1797 -> 1882 (4.73%)
helped: 13
HURT: 13

total fills in shared programs: 17163 -> 16931 (-1.35%)
fills in affected programs: 7214 -> 6982 (-3.22%)
helped: 22
HURT: 19

total sends in shared programs: 730410 -> 730246 (-0.02%)
sends in affected programs: 2705 -> 2541 (-6.06%)
helped: 114
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.60% max: 20.00% x̄: 7.26% x̃: 5.88%
95% mean confidence interval for sends value: -1.55 -1.33
95% mean confidence interval for sends %-change: -7.90% -6.62%
Sends are helped.

LOST:   4
GAINED: 0

Sandy Bridge
total instructions in shared programs: 10760511 -> 10724637 (-0.33%)
instructions in affected programs: 961305 -> 925431 (-3.73%)
helped: 3734
HURT: 110
helped stats (abs) min: 1 max: 151 x̄: 9.66 x̃: 8
helped stats (rel) min: 0.14% max: 41.21% x̄: 4.93% x̃: 3.95%
HURT stats (abs)   min: 1 max: 20 x̄: 1.68 x̃: 1
HURT stats (rel)   min: 0.12% max: 5.41% x̄: 0.88% x̃: 0.52%
95% mean confidence interval for instructions value: -9.76 -8.91
95% mean confidence interval for instructions %-change: -4.90% -4.63%
Instructions are helped.

total cycles in shared programs: 153359411 -> 152991077 (-0.24%)
cycles in affected programs: 11615401 -> 11247067 (-3.17%)
helped: 2725
HURT: 1138
helped stats (abs) min: 1 max: 2844 x̄: 164.27 x̃: 80
helped stats (rel) min: 0.02% max: 48.60% x̄: 7.47% x̃: 3.91%
HURT stats (abs)   min: 1 max: 4351 x̄: 69.69 x̃: 25
HURT stats (rel)   min: 0.02% max: 40.00% x̄: 3.39% x̃: 1.47%
95% mean confidence interval for cycles value: -103.18 -87.52
95% mean confidence interval for cycles %-change: -4.57% -3.97%
Cycles are helped.

total sends in shared programs: 584038 -> 583855 (-0.03%)
sends in affected programs: 3512 -> 3329 (-5.21%)
helped: 157
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.17 x̃: 1
helped stats (rel) min: 2.38% max: 25.00% x̄: 6.52% x̃: 6.06%
95% mean confidence interval for sends value: -1.26 -1.07
95% mean confidence interval for sends %-change: -7.17% -5.87%
Sends are helped.

LOST:   23
GAINED: 0

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8122617 -> 8111592 (-0.14%)
instructions in affected programs: 380503 -> 369478 (-2.90%)
helped: 912
HURT: 86
helped stats (abs) min: 1 max: 129 x̄: 12.19 x̃: 9
helped stats (rel) min: 0.30% max: 39.21% x̄: 3.69% x̃: 2.57%
HURT stats (abs)   min: 1 max: 2 x̄: 1.05 x̃: 1
HURT stats (rel)   min: 0.12% max: 3.64% x̄: 0.54% x̃: 0.36%
95% mean confidence interval for instructions value: -12.00 -10.10
95% mean confidence interval for instructions %-change: -3.56% -3.10%
Instructions are helped.

total cycles in shared programs: 188509780 -> 188534398 (0.01%)
cycles in affected programs: 7211542 -> 7236160 (0.34%)
helped: 859
HURT: 132
helped stats (abs) min: 2 max: 690 x̄: 46.59 x̃: 16
helped stats (rel) min: 0.01% max: 26.76% x̄: 1.53% x̃: 0.33%
HURT stats (abs)   min: 2 max: 1592 x̄: 489.67 x̃: 618
HURT stats (rel)   min: 0.03% max: 185.92% x̄: 23.35% x̃: 6.26%
95% mean confidence interval for cycles value: 9.58 40.10
95% mean confidence interval for cycles %-change: 0.65% 2.93%
Cycles are HURT.
2019-12-02 16:46:20 -08:00
Ian Romanick
e342d6970b nir/opt_peephole_select: Don't count some unary operations
In many cases, fsat, fneg, fabs, ineg, and iabs will get folded into
another instruction as either source or destination modifiers.
Counting them as instructions means that some if-statements won't get
converted to selects.  For example,

        vec1 32 ssa_25 = flt32 ssa_0, ssa_23.x
        /* succs: block_1 block_2 */
        if ssa_25 {
                block block_1:
                /* preds: block_0 */
                vec1 32 ssa_26 = fabs ssa_24
                vec1 32 ssa_27 = fneg ssa_26
                vec1 32 ssa_28 = fabs ssa_20
                vec1 32 ssa_29 = fneg ssa_28
                vec1 32 ssa_30 = fmul ssa_27, ssa_29
                vec1 32 ssa_31 = fsat ssa_30
                /* succs: block_3 */
        } else {
                block block_2:
                /* preds: block_0 */
                /* succs: block_3 */
        }
        block block_3:
        /* preds: block_1 block_2 */

block_1 isn't really 6 instructions, but it will be counted that way.

Most callers of the peephole_select pass use either 1 or 8.  It's very
easy to blow way past either of these limits with things that are really
only one or two actual instructions.

I also tried some fancier things like making sure the fsat was of
another SSA def from the same block, but the simple test was actually
better.

The i965 back-end SEL peephole pass still helps ~700 shaders in
shader-db with this change.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14743694 -> 14738910 (-0.03%)
instructions in affected programs: 156575 -> 151791 (-3.06%)
helped: 1204
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 3.97 x̃: 3
helped stats (rel) min: 0.15% max: 19.57% x̄: 5.15% x̃: 4.55%
95% mean confidence interval for instructions value: -4.12 -3.82
95% mean confidence interval for instructions %-change: -5.35% -4.95%
Instructions are helped.

total cycles in shared programs: 231749141 -> 231602916 (-0.06%)
cycles in affected programs: 2818975 -> 2672750 (-5.19%)
helped: 876
HURT: 322
helped stats (abs) min: 2 max: 788 x̄: 180.99 x̃: 220
helped stats (rel) min: <.01% max: 43.82% x̄: 20.75% x̃: 19.44%
HURT stats (abs)   min: 1 max: 1188 x̄: 38.27 x̃: 20
HURT stats (rel)   min: 0.09% max: 102.67% x̄: 5.17% x̃: 1.70%
95% mean confidence interval for cycles value: -130.47 -113.64
95% mean confidence interval for cycles %-change: -14.85% -12.72%
Cycles are helped.

total sends in shared programs: 730495 -> 730491 (<.01%)
sends in affected programs: 46 -> 42 (-8.70%)
helped: 2
HURT: 0

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8122757 -> 8122617 (<.01%)
instructions in affected programs: 14716 -> 14576 (-0.95%)
helped: 46
HURT: 1
helped stats (abs) min: 1 max: 8 x̄: 3.07 x̃: 3
helped stats (rel) min: 0.36% max: 10.00% x̄: 2.54% x̃: 1.06%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.59% max: 1.59% x̄: 1.59% x̃: 1.59%
95% mean confidence interval for instructions value: -3.42 -2.54
95% mean confidence interval for instructions %-change: -3.28% -1.62%
Instructions are helped.

total cycles in shared programs: 188510100 -> 188509780 (<.01%)
cycles in affected programs: 58994 -> 58674 (-0.54%)
helped: 32
HURT: 1
helped stats (abs) min: 2 max: 96 x̄: 10.06 x̃: 6
helped stats (rel) min: 0.05% max: 15.29% x̄: 1.37% x̃: 0.31%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.68% max: 0.68% x̄: 0.68% x̃: 0.68%
95% mean confidence interval for cycles value: -16.34 -3.06
95% mean confidence interval for cycles %-change: -2.46% -0.15%
Cycles are helped.
2019-12-02 16:46:19 -08:00
Jordan Justen
e277009d8d iris: Allow max dynamic pool size of 2GB for gen12
Reworks:
 * Adjust comment to list the state packets that curro found to be
   affected.

Fixes: 8125d7960b ("intel/dev: Add preliminary device info for Tigerlake")
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-12-02 16:34:12 -08:00
Marek Olšák
7730d583c2 radeonsi/gfx10: fix the vertex order for triangle strips emitted by a GS
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-02 18:22:27 -05:00
Marek Olšák
91da6a98e7 radeonsi/gfx10: simplify some duplicated NGG GS code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-02 18:22:25 -05:00
Jonathan Gray
4913215d14 util/u_thread: don't restrict u_thread_get_time_nano() to __linux__
pthread_getcpuclockid() and clock_gettime() are also available on at least
OpenBSD, FreeBSD, NetBSD, DragonFly, Cygwin.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-12-02 17:23:49 -05:00
Jonathan Gray
c91997b6c4 util/futex: use futex syscall on OpenBSD
Make use of the futex syscall added in OpenBSD 6.2.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-12-02 17:23:49 -05:00
Kenneth Graunke
dbe923bff9 meson: Add a "prefer_iris" build option
Enabling this option makes Intel Gen8-11 hardware load the 'iris'
driver by default instead of the older 'i965' driver.

Regardless of how this option is set, users can still override which
driver the loader selects via two methods.  The first is to create a
~/.drirc or /etc/drirc file with the following snippet:

   <driconf>
     <device driver="loader" kernel_driver="i915">
       <option name="dri_driver" value="i965" />
     </device>
   </driconf>

The other option is to set an environment variable:

   export MESA_LOADER_DRIVER_OVERRIDE=i965

For now, "prefer_iris" defaults to i965 (the historical choice).
A separate future patch will change the default driver to iris.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1893
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-02 12:56:27 -08:00
Jonathan Marek
bebfb17a2b turnip: fix display wsi fence timing out
Fixes: df9f2adf ("turnip: add display wsi")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-02 14:29:47 -05:00
Rhys Perry
5404b7aaa3 nir/lower_io_to_vector: don't create arrays when not needed
Some backends require that there are no array varyings.

If there were no arrays in the input shader, the pass shouldn't have to
create new ones.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2103
Fixes: bcd14756ee ('nir/lower_io_to_vector: add flat mode')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-12-02 17:45:01 +00:00
Rhys Perry
01cacdb71e aco: fix block_kind_discard s_andn2 definition to exec
Improves generated code of dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop
because a loop exit phi can then be fixed to exec, removing copies and
improving jump threading.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-02 16:56:24 +00:00
Rhys Perry
0e8da9f607 aco: handle loop exit and IF merge phis with break/discard
ACO considers discards jumps and creates edges in the CFG for them but NIR
does neither of these.

This can be fixed instead by keeping track of whether a side of an IF had
a break/discard, but this doesn't solve the issue with discards affecting
loop exit phis. So this reworks phi handling a bit.

Fixes these tests:
dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop
dEQP-VK.graphicsfuzz.loop-call-discard
dEQP-VK.graphicsfuzz.complex-nested-loops-and-call

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-02 16:56:19 +00:00
Rhys Perry
06fc83989c aco: validate the CFG
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-02 16:56:05 +00:00
Alejandro Piñeiro
b6fd679a9e mesa/main/util: moving gallium u_mm to util, remove main/mm
Right now there are two copies of mm:
   * mesa/main/mm.[ch]
   * gallium/auxiliary/util/u_mm.[ch]

At some point they splitted, and from the commit message it was not
clear why it was not possible to have only one copy at a common place.

Taking into account that was several years ago, Im assuming that it
was not possible then.

This change would allow to have one copy of the same code, and also
being able to use that code out of mesa/main or gallium, if needed.

This commit moves u_mm and removes mm, as u_mm has slightly more
changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-12-02 13:59:28 +01:00
Rhys Perry
35fab1ba33 radv: set writes_memory for global memory stores/atomics
Fixes: 13ab63bb62 ('radv: Implement VK_EXT_buffer_device_address.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-02 11:47:12 +00:00
Rhys Perry
a814f3d8a7 ac/llvm: improve sync scope for global atomics
Stronger ordering is implemented in SPIRV->NIR with barriers.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-02 10:48:27 +00:00
Rhys Perry
e61a826f39 ac/llvm: fix pointer type for global atomics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-02 10:48:18 +00:00
Kenneth Graunke
1d416ffd09 iris: Map FXT1 texture formats
This exposes GL_TDFX_texture_compression_FXT1 support.  It's ancient,
only Intel GPUs appear to support it, and I seriously doubt anybody
uses it.  But i965 supports it, and it's trivial to do, so we may as
well support it in the new iris driver as well.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-01 22:55:56 -08:00
Kenneth Graunke
1bdd342b60 st/mesa: Add GL_TDFX_texture_compression_FXT1 support
Eric recently added PIPE_FORMAT_FXT1_RGB[A] as part of his format
unification work.  This was really most of the work of implementing
the extension.  We just need to handle it in a couple of places and
expose the extension.

v2: Reject the new formats in llvmpipe_is_format_supported to prevent
    crashes because it doesn't know how to handle the new formats.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
2019-12-01 22:55:21 -08:00
Dave Airlie
3e21e17b2f nir/samplers: don't zero samplers_used/txf.
This allows this pass to be run multiple times and the results are
just or'ed together.

It fixes on test on llvmpipe nir, and regresses none.

Suggested by Kenneth

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-02 09:15:55 +10:00
Samuel Pitoiset
0eb78a078e aco: drop useless lowering of deref operations for shared memory
Moved to RADV. No pipeline-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 21:58:25 +01:00
Samuel Pitoiset
c105e6169c radv,ac/nir: lower deref operations for shared memory
This shouldn't introduce any functional changes for RadeonSI
when NIR is enabled because these operations are already lowered.

pipeline-db (NAVI10/LLVM):
SGPRS: 9043 -> 9051 (0.09 %)
VGPRS: 7272 -> 7292 (0.28 %)
Code Size: 638892 -> 621628 (-2.70 %) bytes
LDS: 1333 -> 1331 (-0.15 %) blocks
Max Waves: 1614 -> 1608 (-0.37 %)

Found this while glancing at some F12019 shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-29 21:58:18 +01:00
Daniel Schürmann
b690543851 aco: fix a couple of value numbering issues
Fixes: 3a20ef4a32 'aco: refactor value numbering'

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-29 21:54:27 +01:00
Daniel Schürmann
8861a82be7 aco: don't split live-ranges of linear VGPRs
Fixes: 93c8ebfa78 'aco: Initial commit of independent AMD compiler'

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-29 21:54:27 +01:00
Rhys Perry
73783ed389 aco: implement global atomics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:02 +00:00
Rhys Perry
389ee819c0 aco: improve FLAT/GLOBAL scheduling
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:02 +00:00
Rhys Perry
cc742562c1 aco: don't enable store_global for helper invocations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:02 +00:00
Rhys Perry
31e68e230f aco: fix SADDR with FLAT on GFX10
The reference guide is incorrect and SADDR is actually used with FLAT on
GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:01 +00:00
Rhys Perry
082e3a68fa aco: fix assembly of FLAT/GLOBAL atomics
They can take both a definition and data operand

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:01 +00:00
Rhys Perry
f1381e6715 aco: fix GFX10 opcodes for some global/flat atomics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:01 +00:00
Rhys Perry
5986e00194 aco: improve WAR hazard workaround with >64bit stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:01 +00:00
Rhys Perry
a9fc81b098 aco: add v_nop inbetween exec write and VMEM/DS/FLAT
LLVM and the proprietary compiler seem to do this

Fixes: b01847bd9 ("aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:01 +00:00
Rhys Perry
54742e157d aco: fix incorrect cast in parse_wait_instr()
s_waitcnt is SOPP, not SOPK

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:01 +00:00
Rhys Perry
11f43caaec aco: fix i2i64
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:01 +00:00
Rhys Perry
ff70ccad16 aco: propagate p_wqm on an image_sample's coordinate p_create_vector
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2156
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:19:52 +00:00
Christian Gmeiner
1be220833c etnaviv: remove dead code
ptiled is always NULL so the if statement is useless.

CoverityID: 1415572
Fixes: b962776530 ("etnaviv: rework compatible render base")
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-29 16:22:40 +01:00
Christian Gmeiner
1dfe6a3e9a etnaviv: handle integer case for GENERIC_ATTRIB_SCALE
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-29 15:06:18 +01:00
Christian Gmeiner
5361ea2a9b etnaviv: fix R10G10B10A2 vertex format entries
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-29 15:06:18 +01:00
Christian Gmeiner
06d7071bca etnaviv: use NORMALIZE_SIGN_EXTEND
The blob driver does something like this for all vertex formats:

if (normalize) {
   if (OPENGL_ES30)
      val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_SIGN_EXTEND;
   else
      val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_ON;
} else {
   val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_OFF;
}

As there is no way to get to that information in gallium we always
assume OPENGL_ES30.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-29 15:06:18 +01:00
Christian Gmeiner
ca6c73f335 etnaviv: fix integer vertex formats
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-29 15:06:18 +01:00
Jonathan Gray
34dda0ca65 i965: update Makefile.sources for perf changes
brw_performance_query_metrics.h was removed in
134e750e16 and
brw_performance_query.h was removed in
8ae6667992

remove reference to these files from Makefile.sources

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Fixes: 134e750e16 ("i965: extract performance query metrics")
Fixes: 8ae6667992 ("intel/perf: move query_object into perf")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-29 13:20:55 +00:00
Vinson Lee
0d21fe5397 scons: Bump C standard to gnu11 on macOS 10.15.
Fix build error on macOS 10.15 Catalina.

src/util/u_queue.c:179:7: error: implicit declaration of function 'timespec_get' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
      timespec_get(&ts, TIME_UTC);
      ^

timespec_get needs C11 starting with macOS 10.15.

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/time.h
   193	#if (__DARWIN_C_LEVEL >= __DARWIN_C_FULL) && \
   194	        ((defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L) || \
   195	        (defined(__cplusplus) && __cplusplus >= 201703L))
   196	/* ISO/IEC 9899:201x 7.27.2.5 The timespec_get function */
   197	#define TIME_UTC	1	/* time elapsed since epoch */
   198	__API_AVAILABLE(macosx(10.15), ios(13.0), tvos(13.0), watchos(6.0))
   199	int timespec_get(struct timespec *ts, int base);
   200	#endif

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-11-29 12:38:29 +00:00
Boris Brezillon
c6e2096c47 panfrost: Make sure we reset the damage region of RTs at flush time
We must reset the damage info of our render targets here even though a
damage reset normally happens when the DRI layer swaps buffers. That's
because there can be implicit flushes the GL app is not aware of, and
those might impact the damage region: if part of the damaged portion
is drawn during those implicit flushes, you have to reload those areas
before next draws are pushed, and since the driver can't easily know
what's been modified by the draws it flushed, the easiest solution is
to reload everything.

Reported-by: Carsten Haitzler <raster@rasterman.com>
Fixes: 65ae86b854 ("panfrost: Add support for KHR_partial_update()")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-29 10:20:29 +01:00
Boris Brezillon
b196e1a8cf gallium: Fix the ->set_damage_region() implementation
BACK_LEFT attachment can be outdated when the user calls
KHR_partial_update() (->lastStamp != ->texture_stamp), leading to a
damage region update on the wrong pipe_resource object.
Let's delay the ->set_damage_region() call until the attachments are
updated when we're in that case.

Reported-by: Carsten Haitzler <raster@rasterman.com>
Fixes: 492ffbed63 ("st/dri2: Implement DRI2bufferDamageExtension")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-29 10:20:29 +01:00
Erik Faye-Lund
5fcb503c73 zink: silence coverity error
Coverity doesn't know that we always have coordinates if we have lod. To
avoid annoying errors, let's just zero-initialize this.

CoverityID: 1455202
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:54:25 +01:00
Erik Faye-Lund
7a63124a06 zink: error-check right variable
That's not the value we just allocated...

CoverityID: 1455177
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:54:25 +01:00
Erik Faye-Lund
c8769ff8dd zink: avoid NULL-deref
Same story as the previous two commits; these functions dereference the
memory they are pointed at. We can't do that.

CoverityID: 1455180
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:54:25 +01:00
Erik Faye-Lund
e54240f153 zink: avoid NULL-deref
Similar to the previous commit, pipe_resource_reference also dereference
the memory pointed at. Let's avoid it.

CoverityID: 1455198
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:54:25 +01:00
Erik Faye-Lund
bda64440e4 zink: avoid NULL-deref
zink_render_pass_reference will dereference the memory 'dst' points at,
which can't really go well. All we want to do here is to increase the
reference-count, so let's use a different helper for that instead.

CoverityID: 1455200
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:54:25 +01:00
Erik Faye-Lund
8e1dca35ab zink: handle calloc-failure
In case we fail to allocate the context, we should notice and fail
gracefully.

CoverityID: 1455193
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:54:25 +01:00
Erik Faye-Lund
8772d95d40 zink: do not try to destroy NULL-fence
destroy_fence doesn't handle NULL-pointers gracefully. So let's avoid
hitting that code-path, by simply returning NULL early here instead.

CoverityID: 1455179
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:54:25 +01:00
Erik Faye-Lund
49f53ee336 zink: delete query rather than allocating a new one
It seems I had some fat fingers when writing this function, and I
accidentally ended up allocating a new query and immediately trying to
delete an uninitialized pool instead of just deleting the pool of the
query that was passed.

CoverityID: 1455196
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:54:25 +01:00
Erik Faye-Lund
f2188e58ce zink: fix crash when restoring sampler-states
When I changed to heap-allocated sampler-objects, I missed the code-path
that restores sampler-states after the blitter; it needs an array of
pointers, not an array of VkSampler objects to behave.

This fixes spec@arb_texture_cube_map@copyteximage for me.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5ea787950f ("zink: heap-allocate samplers objects")
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 09:19:54 +01:00
Erik Faye-Lund
655b9aa711 zink: reject invalid sample-counts
Vulkan only allows power-of-two sample counts. We already kinda checked
for this, but forgot to validate the result in the end. Let's check the
result and error properly.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 08:58:05 +01:00
Erik Faye-Lund
927363e0b9 zink: use true/false instead of TRUE/FALSE
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-11-29 08:57:33 +01:00
Erik Faye-Lund
c7c0bd9f1e st/mesa: unmap pbo after updating cache
Unmapping first leads to accessing an invalid pointer. So let's switch
these lines around.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-29 07:45:06 +00:00
Vinson Lee
de2e5f6f54 panfrost: Fix gnu-empty-initializer build errors.
Fixes: a24d6fbae6 ("meson: Add -Werror=gnu-empty-initializer to MSVC compat args")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-28 16:12:38 -08:00
Timothy Arceri
9d2d609cce docs: update source code repository documentation
This drops all the old documentaion around applying for push access.

Also this removes the documentation stating that you can push
directly to mesa rather than using merge requests.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1969

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-29 11:09:00 +11:00
Bas Nieuwenhuizen
48fc65413c radv: Fix timeline semaphore refcounting.
Was totally broken ...

Removed two if(point) {} because point is always non-NULL and we
were counting on that already for counting, since we NULL our
references to semaphores without active point earlier.

Fixes: 4aa75bb3bd "radv: Add wait-before-submit support for timelines."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2137
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-28 23:46:09 +01:00
Jonathan Gray
3fe3bde4f2 winsys/amdgpu: avoid double simple_mtx_unlock()
pthread_mutex_unlock() when unlocked is documented by posix as
being undefined behaviour.  On OpenBSD pthread_mutex_unlock() will call
abort(3) if this happens.

This occurs in amdgpu_winsys_create() after
cb446dc0fa
winsys/amdgpu: Add amdgpu_screen_winsys

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-11-28 15:03:59 -05:00
Marek Olšák
5e81fbf44a util/driconfig: print ATTENTION if MESA_DEBUG=silent is not set
unix-bytebenchmark refuses to run if the driver prints ATTENTION to stderr.

Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-11-28 14:36:32 -05:00
Tapani Pälli
d61a21f439 glsl: handle max uniform limits with lower_const_arrays_to_uniforms
Fixes arb_tessellation_shader-large-uniforms Piglit test.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-11-28 14:11:46 +02:00
Bas Nieuwenhuizen
4cde0e04e3 radv: Unify max_descriptor_set_size.
They were out of sync. Besides syncing, lets ensure they never diverge
again.

Fixes: 8d2654a419 "radv: Support VK_EXT_inline_uniform_block."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-28 12:06:44 +01:00
Bas Nieuwenhuizen
e09426ad6b amd/llvm: Refactor ac_build_scan.
Split out the logic for exclusive scans into a separate function
that makes clear what it does instead of having this opaque 60
line if.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-28 11:35:11 +01:00
Samuel Pitoiset
d347f2805d radv: add more constants to avoid using magic numbers
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-28 10:59:14 +01:00
Samuel Pitoiset
52aadbfd04 ac/llvm: convert src operands to pointers if necessary
To avoid generating invalid LLVM IR when both operands don't have
the same type. This might happen when performing pointer comparisons
with SPIRV 1.4.

Fixes invalid LLVM IR for:
dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrequal.variable_pointers_ssbo_equal
dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrnotequal.variable_pointers_ssbo_not_equal

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-28 08:26:51 +01:00
Dave Airlie
18f896e55d llvmpipe: add initial nir support
This adds the hooks between llvmpipe and the gallivm NIR
code, for compute and fragment shaders.

NIR support is hidden behind LP_DEBUG=nir for now until
all the intergration issues are solved

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:49:23 +10:00
Dave Airlie
5363cda52b gallivm: add swizzle support where one channel isn't defined.
NIR doesn't always define all output channels
relies on outputs being memset to 0

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:49:16 +10:00
Dave Airlie
3eb27cfccd gallium: add nir lowering passes for the draw pipe stages. (v2)
This transforms the NIR shaders like the TGSI transforms worked.

v2: fix some nir info requirements, use 32-bit bools

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:49:05 +10:00
Dave Airlie
bf12bc2dd7 draw: add nir info gathering and building support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:48:56 +10:00
Dave Airlie
44a6b0107b gallivm: add nir->llvm translation (v2)
This add the initial implementation of the NIR->LLVM conversion
for llvmpipe NIR support.

v2: lower bool to int32 in nir not llvm

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:48:44 +10:00
Dave Airlie
18ed09d449 gallivm: add selection for non-32 bit types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:48:38 +10:00
Dave Airlie
9461f2b5df gallivm: add cttz wrapper
this will be used to write find_lsb support

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:48:32 +10:00
Dave Airlie
1a608901cc gallivm: add popcount intrinsic wrapper
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:48:25 +10:00
Dave Airlie
3b9950098b gallivm: nir->tgsi info convertor (v2)
This is a port of the old radeonsi code to be used for llvmpipe NIR support.

Once we remove TGSI support from llvmpipe (I can dream? :-), then
we should be able to refine most of this down and remove it.

v2: port to later radeonsi code for vertex inputs and sampler/io parsing.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:48:11 +10:00
Dave Airlie
c879efec09 gallivm: split out the flow control ir to a common file.
We can share a bunch of flow control handling between NIR and TGSI.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-28 14:47:54 +10:00
Marek Olšák
754c7b8939 radeonsi: enable SPIR-V and GL 4.6 for NIR
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:35 -05:00
Marek Olšák
cf240ea6a5 radeonsi/nir: support interface output types to fix SPIR-V xfb piglits
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:34 -05:00
Marek Olšák
1b45da15a9 radeonsi/nir: fix location_frac handling for TCS outputs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:32 -05:00
Marek Olšák
268e42e4f8 radeonsi/nir: don't rely on data.patch for tess factors
GLCTS SPIR-V tests have this issue.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:30 -05:00
Marek Olšák
59daac686d radeonsi/nir: validate is_patch because SPIR-V doesn't set it for tess factors
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:29 -05:00
Marek Olšák
272f1369ec radeonsi: simplify get_tcs_tes_buffer_address_from_generic_indices
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:28 -05:00
Marek Olšák
1e3aab4cd0 radeonsi: simplify the interface of get_dw_address_from_generic_indices
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:26 -05:00
Marek Olšák
756fc9f1bb radeonsi/nir: implement subgroup system values for SPIR-V
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:23 -05:00
Marek Olšák
42318f9197 ac/nir: don't rely on data.patch for tess factors
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-27 19:28:10 -05:00
Kenneth Graunke
51cc380894 drirc: Set vs_position_always_invariant for Shadow of Mordor on Intel
When drawing the main character in Shadow of Mordor, the game appears
to draw Talion with one vertex shader, and the Wraith with another.
If the compiler optimizes those in different ways which lead to slight
imprecisions, then the resulting positions may not line up, leading to
Z-fighting occurring as the game decides which of the two are in front.

brw_nir_opt_peephole_ffma looks at usages of multiply adds across the
entire shader, and may make different decisions between the two, leading
to such imprecisions and Z-fighting.  This started happening recently
after a NIR change to eliminate unnecessary MOVs (7025dbe7), but that
change simply exposed the existing problem.

Improves performance on Skylake GT4e by 1.22945% +/- 0.398672% (n=3),
likely due to the fixed rendering.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1985
Fixes: 7025dbe794 ("nir: Skip emitting no-op movs from the builder.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-11-27 18:48:04 +00:00
Kenneth Graunke
9b577f2a88 driconf, glsl: Add a vs_position_always_invariant option
Many applications use multi-pass rendering and require their vertex
shader position to be computed the same way each time.  Optimizations
may consider, say, fusing a multiply-add based on global usage of an
expression in a shader.  But a second shader with the same expression
may have different code, causing that optimization to make the other
choice the second time around.

The correct solution is for applications to mark their VS outputs
'invariant', indicating they need multiple shaders to compute that
output in the same manner.  However, most applications fail to do so.

So, we add a new driconf option - vs_position_always_invariant - which
forces the gl_Position output in vertex shaders to be marked invariant.

Fixes: 7025dbe794 ("nir: Skip emitting no-op movs from the builder.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-11-27 18:48:04 +00:00
Eric Anholt
424d5e4e11 turnip: Disable timestamp queries for now.
They're not implemented, and not critical to bring up immediately.  Avoids
failures in the CTS when nothing gets written to the query.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 10:05:59 -08:00
Jonathan Marek
080c92e7d4 freedreno/perfcntrs/fdperf: add missing a2xx case in select_counter
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Jonathan Marek
98d7125b36 freedreno/perfcntrs/fdperf: add missing a20x compatible
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Jonathan Marek
24cde37e8d freedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Jonathan Marek
baab4017b9 freedreno/perfcntrs: add a2xx MH counters
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Jonathan Marek
0d0c8a9e82 freedreno/registers: add missing MH perfcounter enum for a2xx
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-11-27 12:11:57 -05:00
Michel Dänzer
a3b3d3bfcc gitlab-ci: Put HTML summary in artifacts for failed piglit jobs
This will make it easier to look at details of failed / skipped tests.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-27 10:20:31 +01:00
Michel Dänzer
07c1346113 gitlab-ci: Stop storing piglit test results as JUnit
Since we're not reporting test results as JUnit anymore, we can use the
default JSON format.

This affects how test results are summarized, update the reference files
accordingly.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-27 10:19:22 +01:00
Michel Dänzer
c9cdb7cef0 gitlab-ci: Stop reporting piglit test results via JUnit
It was basically useless in this form, and processing the JUnit data in
the GitLab backend was pretty expensive.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-27 10:18:33 +01:00
Iago Toral Quiroga
18a09e788d v3d: fix indirect BO allocation for uniforms
We were always ensuring a minimum size of 4 bytes for uniforms
for the case where we don't have any, to account for hardware pre-fetching
of the uniform stream, however, pre-fetching could also lead to to out
of bounds reads when have read the last uniform in the stream, so we
probably want to have the extra 4 bytes to prevent the kernel from
observing invalid memory accesses when the uniform stream sits right at
the end of a page.

This seems to fix MMU exceptions reported with a Linux 5.4 kernel.

Credit goes to Phil Elwell for identifying the problem and narrowing
it down to memory accesses in the uniform stream.

Reported-by: Phil Elwell <phil@raspberrypi.org>
Tested-by: Phil Elwell <phil@raspberrypi.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-27 08:43:13 +01:00
Samuel Pitoiset
a24f1c8f7f radv: enable VK_KHR_shader_subgroup_extended_types on GFX10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 07:42:44 +01:00
Samuel Pitoiset
0812dbd403 ac: add 8-bit and 16-bit supports to ac_build_permlane16()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 07:42:42 +01:00
Samuel Pitoiset
c9aa843961 radv/gfx10: fix implementation of exclusive scans
This implementation is loosely based on ROCm.
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10.

Fixes: 227c29a80d ("amd/common/gfx10: implement scan & reduce operations")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 07:39:26 +01:00
Samuel Pitoiset
86a5fbfd4a radv: fix enabling sample shading with SampleID/SamplePosition
When a fragment shader includes an input variable decorated with
SampleId or SamplePosition, sample shading should be enabled
because minSampleShadingFactor is expected to be 1.0.

Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-27 07:22:54 +01:00
Jonathan Marek
62ff90cc5e turnip: fix integer render targets
Add missing required bits.  Fixes at least:

dEQP-VK.pipeline.render_to_image.dedicated_allocation.1d.small.r16g16_sint_d24_unorm_s8_uint
dEQP-VK.pipeline.render_to_image.dedicated_allocation.2d.mipmap.r16g16_sint_d24_unorm_s8_uint
dEQP-VK.renderpass.dedicated_allocation.attachment.4.401
dEQP-VK.renderpass2.suballocation.formats.r16_uint.load.draw
dEQP-VK.synchronization.op.single_queue.barrier.write_draw_read_copy_image_to_buffer.image_128x128_r16_uint

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-26 16:01:19 -08:00
Jason Ekstrand
a8965c076b anv: Push constants are relative to dynamic state on IVB
Fixes: aecde2351 "anv: Pre-compute push ranges for graphics pipelines"
Closes: #2136
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 22:15:54 +00:00
Dylan Baker
a24d6fbae6 meson: Add -Werror=gnu-empty-initializer to MSVC compat args
Only clang has this argument (at least as of clang 8 and gcc 9), which
errors when using the gcc empty initializer syntax in C:

```C
struct foo f = {};
```

GCC has a warning for this, but only when using -Wpedantic, which is a
lot of noise to lose useful warnings in.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-26 12:48:11 -08:00
Dylan Baker
25e58e3718 gallium/auxiliary: Fix uses of gnu struct = {} extension
Most of these will never actually be compiled by windows, but in the
interest of being able to make using struct foo = {}; an error and
avoiding breaking windows removing a handful of safe uses seems like a
good trade off.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-26 12:48:11 -08:00
Marek Olšák
ed1ff99da7 st/mesa: add st_variant base class to simplify code for shader variants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
b8772a559a st/mesa: don't use ** in the st_nir_link_shaders signature
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
adbba2142d st/mesa: simplify looping over linked shaders when linking NIR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
8567e06046 st/mesa: propagate gl_PatchVerticesIn from TCS to TES before linking for NIR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
e8f0a39d45 st/mesa: don't call ProgramStringNotify in glsl_to_nir
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
5a714531f7 st/mesa: don't use redundant stp->state.ir.nir
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Marek Olšák
6cf011fcc8 st/mesa: don't serialize all streamout state if there are no SO outputs
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-26 15:14:10 -05:00
Kenneth Graunke
3fdf2bb313 iris: Disable VF cache partial address workaround on Gen11+
The vertex cache uses the full 48-bit address on Gen11+.  See the
documentation for 3DSTATE_VERTEX_BUFFERS, which describes the
workaround and lists it as pre-Icelake.

Interestingly, the docs don't mention index buffers as needing a
workaround at all.  So either we've been overzealous, or the docs
never got updated to record that.  Which begs the question of whether
the issue there was fixed, if there was one...

Cuts 40% of the PIPE_CONTROLs from Civilization VI's benchmark; appears
that it improves performance by about 1-2% on Icelake 8x8 (not frequency
locked).
2019-11-26 12:13:34 -08:00
Rob Clark
8d9f5a28e3 freedreno: switch to layout helper
The slices table and most of the other layout fields in the
freedreno_resource moves into fdl_layout.

v2: Changes by anholt to not have duplicate fields, which was introducing
    a surprising behavior change in resource layout (using the
    level_linear helper before the setup of the shadowed fields)

Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:08 +00:00
Eric Anholt
997b8d4749 freedreno/a6xx: Log the tiling mode in resource layout debug.
This was important for figuring out what went wrong with the layout
refactor.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
2e62a622e7 freedreno: Convert the slice struct to the new resource header.
This gets the worst of the sed required for shared resource layout out of
the way.  The texture layout comment is dropped now that we're referencing
the shared header, which has a more complete description.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
930432577f freedreno: Introduce a resource layout header.
This will be used for sharing resource layout code between freedreno and
tu.  Mostly copied from a commit by Rob, with a new location and the slice
struct renamed for consistency.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
2ec420b264 freedreno: Introduce a fd_resource_tile_mode() helper.
Multiple places were doing the same thing to get the tile mode of a level,
so refactor it out.  This will make the shared resource helper transition
cleaner.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
6b09227ede freedreno: Introduce a fd_resource_layer_stride() helper.
This factors out a bit of duplicated code, but will also make the shared
resource layout transition process clearer.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Rob Clark
9e9a26c768 freedreno: use rsc->slice accessor everywhere
This will make it easier to extract the slice table out into a layout
helper.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-11-26 18:46:07 +00:00
Eric Anholt
d845dca0f5 nir: Make algebraic backtrack and reprocess after a replacement.
The algebraic pass was exhibiting O(n^2) behavior in
dEQP-GLES2.functional.uniform_api.random.3 and
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 (along with
other code-generated tests, and likely real-world loop-unroll cases).
In the process of using fmul(b2f(x), b2f(x)) -> b2f(iand(x, y)) to
transform:

result = b2f(a == b);
result *= b2f(c == d);
...
result *= b2f(z == w);

->

temp = (a == b)
temp = temp && (c == d)
...
temp = temp && (z == w)
result = b2f(temp);

nir_opt_algebraic, proceeding bottom-to-top, would match and convert
the top-most fmul(b2f(), b2f()) case each time, leaving the new b2f to
be matched by the next fmul down on the next time algebraic got run by
the optimization loop.

Back in 2016 in 7be8d07732 ("nir: Do opt_algebraic in reverse
order."), Matt changed algebraic to go bottom-to-top so that we would
match the biggest patterns first.  This helped his cases, but I
believe introduced this failure mode.  Instead of reverting that, now
that we've got the automaton, we can update the automaton's state
recursively and just re-process any instructions whose state has
changed (indicating that they might match new things).  There's a
small chance that the state will hash to the same value and miss out
on this round of algebraic, but this seems to be good enough to fix
dEQP.

Effects with NIR_VALIDATE=0 (improvement is better with validation enabled):

Intel shader-db runtime -0.954712% +/- 0.333844% (n=44/46, obvious throttling
  outliers removed)
dEQP-GLES2.functional.uniform_api.random.3 runtime
  -65.3512% +/- 4.22369% (n=21, was 1.4s)
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 runtime
  -68.8066% +/- 6.49523% (was 4.8s)

v2: Use two worklists, suggested by @cwabbott, to cut out a bunch of
    tricky code.  Runtime of uniform_api.random.3 down -0.790299% +/-
    0.244213% compred to v1.
v3: Re-add the nir_instr_remove() that I accidentally dropped in v2,
    fixing infinite loops.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-26 10:13:46 -08:00
Eric Anholt
90ad6304bf nir: Refactor algebraic's block walk
My motivation was to clarify the changes in the following commit, but
incidentally, it reduces runtime of
dEQP-GLES2.functional.uniform_api.random.3 (an algebraic-heavy
testcase) by -5.39524% +/- 2.21179% (n=15)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-26 10:13:40 -08:00
Connor Abbott
305d1300f9 nir: Maintain the algebraic automaton's state as we work.
In order to have nir_opt_algebraic be able to do further algebraic
work on the output of a replacement, we need to maintain the
automaton's state.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-26 10:13:19 -08:00
Jonathan Marek
2da4a58ed9 etnaviv: support 3d/array/integer formats in texture descriptors
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-26 19:07:04 +01:00
Jonathan Marek
7806e058c9 etnaviv: blt: fix partial ZS clears with TS
If not all bits are cleared, then BLT needs to be given the current clear
value and not the new one.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-26 19:04:51 +01:00
Daniel Schürmann
7cd548d352 aco: don't value-number instructions from within a loop with ones after the loop.
Fixes:
Wolfenstein:Youngblood (w/o shader_ballot)
dEQP-VK.descriptor_indexing.combined_image_sampler_in_loop_with_lod

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-26 14:39:27 +00:00
Rhys Perry
46420dd294 aco: set dlc/glc correctly for image loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-11-26 14:39:27 +00:00
Rhys Perry
37843e454e aco: allow constant offsets for global/scratch instructions on GFX10
I don't think the bug applies for global/scratch instructions and
load_barycentric_at_sample selection expects this feature to work.

Fixes various dEQP-VK.pipeline.multisample_interpolation.* tests on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-11-26 14:39:27 +00:00
Bas Nieuwenhuizen
02375b8436 radv: Enable VK_KHR_buffer_device_address.
Still no capture/replay or multi device support.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-26 11:59:52 +00:00
Samuel Pitoiset
34dd4251e2 radv: fix reporting subgroup size with VK_KHR_pipeline_executable_properties
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-26 10:48:48 +01:00
Bas Nieuwenhuizen
25bc9102d8 radv: Allocate cmdbuffer space for buffer marker write.
Fixes: 946193ae00 "radv: add support for VK_AMD_buffer_marker"
Reviewed-by:  Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-26 09:35:02 +00:00
Gert Wollny
e41958e344 r600: Disable eight bit three channel formats
Commit 0899bf55 made some deqp-gles3 tests related to RGB8 PBOs fail
on R600 because it exposed PIPE_FORMAT_R8G8B8_UNORM and R600 doesn't
propely handle this. Disabling this format also for buffers fixes the
issue.

In addition, disabling also the related RGB8 integer formats for buffers
fixes some deqp-gles3 tests:

  dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb8ui_cube
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8i_2d
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8i_cube
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8ui_2d
  dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8ui_cube
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8i_2d_array
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8i_3d
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8ui_2d_array
  dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8ui_3d
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8i_2d_array
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8i_3d
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8ui_2d_array
  dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8ui_3d

Fixes: 0899bf55
  st/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM

Closes #2118

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-26 09:28:52 +01:00
Samuel Pitoiset
f6770b9726 ac/llvm: fix warning in ac_build_canonicalize()
../src/amd/llvm/ac_llvm_build.c: In function ‘ac_build_canonicalize’:
../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘intr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 4567 |  return ac_build_intrinsic(ctx, intr, type, params, 1,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 4568 |       AC_FUNC_ATTR_READNONE);
      |       ~~~~~~~~~~~~~~~~~~~~~~
../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘type’ may be used uninitialized in this function [-Wmaybe-uninitialized]

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-26 08:35:10 +01:00
Tapani Pälli
5d58fea660 mapi: add GetInteger64vEXT with EXT_disjoint_timer_query
From EXT_disjoint_timer_query spec:

   "Interaction: This extension adds GetInteger64vEXT if
    OpenGL ES 3.0 is not supported"

See https://github.com/KhronosGroup/OpenGL-Registry/issues/326.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-26 07:41:24 +02:00
Jason Ekstrand
200a3301e2 vulkan: Update the XML and headers to 1.1.129
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 02:48:42 +00:00
Jason Ekstrand
854859fefa anv/entrypoints: Better handle promoted extensions
In the case of promoted extensions we can end up with an entrypoint that
we support being an alias of an entrypoint we do not support.  For
instance, if an extension gets promoted from EXT to KHR, the EXT entry-
points may be aliases of the KHR ones.  We want to leave everything as
EXT until we get around to advertising the KHR so that we don't break
things when we update the XML and headers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 02:48:42 +00:00
Jason Ekstrand
121551bfdb vulkan/enum_to_str: Handle out-of-order aliases
The current code can only handle enum aliases if the original enum is
declared first followed by the alias as we walk the XML in a linear
fashion.  This commit allows us to handle aliases where the alias
declaration comes before the thing it's aliasing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-26 02:48:42 +00:00
Kenneth Graunke
f6aa51103b iris: Update SURFACE_STATE addresses when setting sampler views
We may have replaced the backing storage for a texture buffer while it
was unbound, at which point iris_rebind_buffer would not have caught it
and updated it.  We need to ensure that the current resource's address
matches the one our SURFACE_STATE points at.  If not, update addresses
and re-upload the SURFACE_STATE.

Shader images and buffers do not suffer from this problem because we
re-stream the surface state on every set call, since there isn't a
created CSO object for those with a saved SURFACE_STATE.  Constant
buffers are also currently re-streamed (we pitch the SURFACE_STATE
on every set_constant_buffer call).  Surfaces would need this
treatment (as they're created CSOs) except that we never swap out
their backing storage today (we only do it for buffers), so it's OK
for now.

Fixes misrendering in Unreal 4 demos (Elemental, Matinee Fight Scene).
Huge thanks to Andrii Simiklit for tracking down the problem - it was
quite difficult to find!  Also fixes Andrii's new Piglit test for the
bug, 'arb_texture_buffer_object-re-init'.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1365
2019-11-25 15:54:54 -08:00
Kenneth Graunke
060a2c52fa iris: Maintain CPU-side SURFACE_STATE copies for views and surfaces.
When replacing the backing storage for texture buffers, image buffers,
and so on, we may need to update the "Surface Base Address" field in
any corresponding SURFACE_STATE.  This is easier to accomplish if we
have a copy on the CPU - we can just compare the current field, update
it, and re-upload.

This patch adds a CPU-side copy to the new iris_surface_state wrapper
struct, and reworks allocation and upload to fill things out on the
CPU copy first, then upload that to the GPU when finished.

This will be necessary to fix iris_invalidate_resource bugs shortly.

Technically, we never replace the backing storage for pipe_surfaces
(render targets), so we don't need to make this change there.  However,
it's nice to have surfaces, sampler views, and image views handled
similarly.  Plus, if we ever wanted to swap out backing storage for
busy textures, we'd need this infrastructure.

v2: Properly free memory (caught by Andrii Simiklit)
2019-11-25 15:54:54 -08:00
Kenneth Graunke
2b09e818dc iris: Create an "iris_surface_state" wrapper struct
Today, we only have a state reference to the GPU buffer containing our
uploaded SURFACE_STATEs.  However, we're going to want a CPU-side copy
soon.  Making a wrapper struct means we can talk about both together,
and also put both in the field called "surface_state".
2019-11-25 15:54:54 -08:00
Kenneth Graunke
4c1f81ad62 iris: Drop 'old_address' parameter from iris_rebind_buffer
We can just compare the VERTEX_BUFFER_STATE address field to the
current BO's address.  When calling rebind, we've already updated
the resource to the new buffer, but the state will have the old
address.
2019-11-25 15:54:54 -08:00
Kenneth Graunke
518be59c1a iris: Stop mutating the resource in get_rt_read_isl_surf().
Mutating fields of global resources is generally not safe, and the only
reason we were doing it was to avoid passing an extra parameter to
the fill_surface_state helper.
2019-11-25 15:54:54 -08:00
Marek Olšák
b02e0d2604 radeonsi/nir: don't run si_nir_opts again if there is no change
0.3% less overhead

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-25 16:48:27 -05:00
Marek Olšák
4675cb2019 radeonsi: initialize the per-context compiler on demand
This takes a noticable amount of time in piglit and some tests don't
need it.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-25 16:48:27 -05:00
Marek Olšák
f671cc4d95 ac: set swizzled bit in cache policy as a hint not to merge loads/stores
LLVM now merges loads and stores for all opcodes, so this must be set.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 16:48:27 -05:00
Eric Anholt
8afab607ac nir: Add a scheduler pass to reduce maximum register pressure.
This is similar to a scheduler I've written for vc4 and i965, but this
time written at the NIR level so that hopefully it's reusable.  A notable
new feature it has is Goodman/Hsu's heuristic of "once we've started
processing the uses of a value, prioritize processing the rest of their
uses", which should help avoid the heuristic otherwise making such
systematically bad choices around getting texture results consumed.

Results for v3d:

total instructions in shared programs: 6497588 -> 6518242 (0.32%)
total threads in shared programs: 154000 -> 152828 (-0.76%)
total uniforms in shared programs: 2119629 -> 2068681 (-2.40%)
total spills in shared programs: 4984 -> 472 (-90.53%)
total fills in shared programs: 6418 -> 1546 (-75.91%)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> (v1)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v2)

v2: Use the DAG datastructure, fold in the scheduling-for-parallelism
    patch, include SSA defs in live values so we can switch to bottom-up
    if we want.
v3: Squash in improvements from Alejandro Piñeiro for getting V3D to
    successfully register allocate on GLES3.1 dEQP.  Make sure that
    discards don't move after store_output.  Comment spelling fix.
2019-11-25 21:12:21 +00:00
Jonathan Marek
5159db60fc etnaviv: implement 64bpp clear
At the same time, update etna_clear_blit_pack_rgba to work with integer
formats.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-25 20:23:22 +01:00
Jonathan Marek
2214f99c07 etnaviv: avoid using RS for 64bpp formats
At the same time, this change allows using BLT for 8bpp formats

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-25 20:22:43 +01:00
Christian Gmeiner
92d5e3c692 etnaviv: add support for extended pe formats
Use the extended format if an such a format was passed.

v1 -> v2:
 - set FORMAT_MASK bit when using ext PE format as suggested
   by Wladimir J. van der Laan

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-11-25 20:12:52 +01:00
Christian Gmeiner
396818fd9d etnaviv: handle 8 byte block in tiling
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-11-25 20:11:30 +01:00
Samuel Pitoiset
2af39c719e radv: select the depth decompress path based on the aspect mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-25 16:29:23 +01:00
Samuel Pitoiset
905c005561 radv: create decompress pipelines for separate depth/stencil layouts
No functional changes as the driver still uses the depth+stencil
pipeline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-25 16:29:21 +01:00
Samuel Pitoiset
faa58201f3 radv: rework creation of decompress/resummarize meta pipelines
This refactoring will help for creating more decompress pipelines.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-25 16:29:18 +01:00
Samuel Pitoiset
8f0fb38825 radv: set the image view aspect mask before resolves
No functional changes, but it will be used to decompress
separate depth/stencil aspects.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-25 16:29:16 +01:00
Samuel Pitoiset
9dec90b7bc radv: set the image view aspect mask during subpass transitions
No functional changes because the aspect mask is still not used
during image transitions but it will be needed for the separate
depth/stencil aspects logic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-25 16:29:13 +01:00
Rhys Perry
459bc77763 aco: enable load/store vectorizer
Totals from affected shaders:
SGPRS: 1890373 -> 1900772 (0.55 %)
VGPRS: 1210024 -> 1215244 (0.43 %)
Spilled SGPRs: 828 -> 828 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 252 -> 252 (0.00 %) dwords per thread
Code Size: 81937504 -> 74608304 (-8.94 %) bytes
LDS: 746 -> 746 (0.00 %) blocks
Max Waves: 230491 -> 230158 (-0.14 %)

In NeiR:Automata and GTA V, the code decrease is especially large: -13.79%
and -15.32%, respectively.

v9: rework the callback function
v10: handle load_shared/store_shared in the callback

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)
2019-11-25 13:59:11 +00:00
Rhys Perry
0a759c3be6 nir: add load/store vectorizer tests
v7: run nir_opt_algebraic
v9: rework the callback function
v9: update alignment on all loads/stores, even if they're not vectorized
v10: add tests for 64-bit offsets
v10: add tests for signed offsets

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)
2019-11-25 13:59:11 +00:00
Rhys Perry
ce9205c03b nir: add a load/store vectorization pass
This pass combines intersecting, adjacent and identical loads/stores into
potentially larger ones and will be used by ACO to greatly reduce the
number of memory operations.

v2: handle nir_deref_type_ptr_as_array
v3: assume explicitly laid out types for derefs
v4: create less deref casts
v4: fix shared boolean vectorization
v4: fix copy+paste error in resources_different
v4: fix extract_subvector() to pass
    nir_load_store_vectorize_test.ssbo_load_intersecting_32_32_64
v4: rebase
v5: subtract from deref/offset instead of scheduling offset calculations
v5: various non-functional changes/cleanups
v5: require less metadata and preserve more
v5: rebase
v6: cleanup and improve dependency handling
v6: emit less deref casts
v6: pass undef to components not set in the write_mask for new stores
v7: fix 8-bit extract_vector() with 64-bit input
v7: cleanup creation of store write data
v7: update align correctly for when the bit size of load/store increases
v7: rename extract_vector to extract_component and update comment
v8: prevent combining of row-major matrix column acceses
v9: rework process_block() to be able to vectorize more
v9: rework the callback function
v9: update alignment on all loads/stores, even if they're not vectorized
v9: remove entry::store_value, since it will not be updated if it's was
    from a vectorized load
v9: fix bug in subtract_deref(), causing artifacts in Dishonored 2
v9: handle nir_intrinsic_scoped_memory_barrier
v10: use nir_ssa_scalar
v10: handle non-32-bit offsets
v10: use signed offsets for comparison
v10: improve create_entry_key_from_offset()
v10: support load_shared/store_shared
v10: remove strip_deref_casts()
v10: don't ever pass NULL to memcmp
v10: remove recursion in gcd()
v10: fix outdated comment
v11: use the new nir_extract_bits()
v12: remove use of nir_src_as_const_value in resources_different
v13: make entry key hash function deterministic
v13: simplify mask_sign_extend()
v14: add comment in hash_entry_key() about hashing pointers

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)
2019-11-25 13:59:11 +00:00
Rhys Perry
b3a3e4d1d2 radv: set alignment for load_ssbo/store_ssbo in meta shaders
Otherwise, nir_intrinsic_align() will assert when called on the intrinsics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 13:59:11 +00:00
Rhys Perry
c14f823ee5 nir: add nir_num_variable_modes and nir_var_mem_push_const
These will be useful in the upcoming load/store vectorizer.

v11: rebase

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-25 13:59:11 +00:00
Connor Abbott
01eb6ef870 aco: Make unused workgroup id's 0
It shouldn't matter, but the 1 was leftover from when it was handled
together with workgroup_size and num_work_groups.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-25 14:17:51 +01:00
Connor Abbott
bb78f9b4e4 aco: Use common argument handling
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-25 14:17:51 +01:00
Connor Abbott
e7f4cadd02 radv: Replace supports_spill with explict_scratch_args
The former was always true and hence dead code. We will want to
explicitly declare the ring offset register with ACO, but we also want
to declare the scratch offset too, and we can't try to disable it since
ACO also supports spilling and the determination of whether spilling has
to happen occurs well after setting up registers. So replace
supports_spill with something that will actually be used for ACO.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 14:17:51 +01:00
Connor Abbott
4d6676d78a aco: Make num_workgroups and local_invocation_ids one argument each
To match the LLVM argument setup code.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-25 14:17:51 +01:00
Connor Abbott
a7f1c63442 aco: Split vector arguments at the beginning
Due to how LLVM works we have to make some of the FS inputs become
vectors, and therefore have to split them early so that they don't take
up extra register pressure due to how RA currently works.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-25 14:17:51 +01:00
Connor Abbott
b45c54ff8d aco: Use radv_shader_args in aco_compile_shader()
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-25 14:17:51 +01:00
Connor Abbott
680b086db1 aco: Constify radv_nir_compiler_options in isel
It's already const for everything else.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-25 14:17:51 +01:00
Connor Abbott
66c703b3e8 radv: Move argument declaration out of nir_to_llvm
Now it's executed for ACO too.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 14:17:51 +01:00
Connor Abbott
3b143369a5 ac/nir, radv, radeonsi: Switch to using ac_shader_args
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-11-25 14:17:10 +01:00
Connor Abbott
9885af3bdf ac: Add a shared interface between radv, radeonsi, LLVM and ACO
ac_shader_args will be similar to ac_shader_abi, except for being free
from LLVM-specific concepts and therefore capable of being shared
between LLVM and ACO. This will help us accomplish a few different
things:

- Decouple setting up SGPR and VGPR arguments from translating to LLVM,
so that we can reference these arguments in NIR lowering passes, which
will let us lower e.g. descriptor sets in NIR.

- Stop using radv-specific structures for things like determining the
chip generation in ACO.

In the end, we should replace ac_shader_abi with this structure +
driver-specific lowering passes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 14:12:46 +01:00
Connor Abbott
43da33c169 radv: Rename ac_arg_regfile
We'll duplicate this in a header file in the next commit, and then
remove the original enum. Just rename it temporarily so that things
keep building.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 14:12:46 +01:00
Danylo Piliaiev
29081c671f drirc: Add glsl_zero_init workaround for GpuTest
GiMark benchmark from GpuTest has such code in VS:

 out vec4 lightDir0;
 out vec4 lightDir1;

 ...

 lightDir0.xyz = lp0 - vVertex.xyz;
 lightDir1.xyz = lp1 - vVertex.xyz;

In FS:

 float distSqr = dot(lightDir0, lightDir0);

So due to the usage of uninitialized .w channel in the dot product,
distSqr may become undefined which results in many black dots
in the test on Iris.

In https://www.geeks3d.com/forums/index.php/topic,6242.0.html
developer stated that this benchmark most likely won't be updated.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1919
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-25 12:22:37 +02:00
Samuel Pitoiset
d6db858771 meson: only build imgui when needed
Only required for Intel tools or the Vulkan overlay layer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-25 07:51:56 +00:00
Samuel Pitoiset
bfb307aea9 ac/llvm: fix the local invocation index for wave32
Fixes dEQP-VK.compute.builtin_var.local_invocation_index with
RADV_PERFTEST=cswave32.

My initial fix was to lower it but Rhys suggested the shift-right
and it's much better like this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-25 07:25:48 +00:00
Samuel Pitoiset
b99295fb33 radv: disable subgroup shuffle operations on GFX10
They are broken like on GFX6-GFX7. It seems better to disable them
instead of enabling a broken feature.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-25 08:03:24 +01:00
Dave Airlie
1c5dc4eaf9 docs: add llvmpipe to ARB_query_buffer_object. 2019-11-25 12:37:58 +10:00
Dave Airlie
506e51b856 llvmpipe: initial query buffer object support. (v2)
This fails a couple of piglits due to other bugs in llvmpipe,
but it adds support for the feature properly.

v2: don't reset pipestats, just recalc, fix CI expectation
2019-11-25 12:37:32 +10:00
Timothy Arceri
f54c4e85ce radv: create a fresh fork for each pipeline compile
In order to prevent a potential malicious pipeline tainting our
secure compile process and interfering with successive pipelines
we want to create a fresh fork for each pipeline compile.

Benchmarking has shown that simply forking on each pipeline
creation doubles the total time it takes to compile a fossilize db
collection. So instead here we fork the process at device creation
so that we have a slim copy of the device and then fork this
otherwise idle and untainted process each time we compile a
pipeline. Forking this slim copy of the device results in only a
20% increase in compile time vs a 100% increase.

Fixes: cff53da3 ("radv: enable secure compile support")
2019-11-25 10:10:14 +11:00
Timothy Arceri
1663bb1f77 radv: add a secure_compile_open_fifo_fds() helper
This will be used to create a communication pipe between the user
facing device and a freshly forked (per pipeline compile) slim copy
of that device.

We can't use pipe() here because the fork will not be a direct fork
of the user facing process. Instead we use a previously forked
copy of the process that was forked at device creation in order to
reduce the resources required for the fork and avoid performance
issues.

Fixes: cff53da374 ("radv: enable secure compile support")
2019-11-25 10:10:14 +11:00
Timothy Arceri
ef54f15da9 radv: add some infrastructure for fresh forks for each secure compile
In the following commits we want to be able to fork an existing lightweight
fork created at device creation time. In order for the user facing process
to communicate with this new fresh fork we create some members here to hold
FIFO file descriptors and a unique id.

Here we also add a new fork enum that we use to tell the lightweight
process to create a fresh fork.

For more information on why we create a fresh fork see the following
commits.
2019-11-25 10:10:14 +11:00
Brian Paul
a2689ebcd6 nir: no-op C99 _Pragma() with MSVC
This fixes a build failure on MSVC.

BTW, it looks like clang supports _Pragma() but I don't know if it
understands the "gcc unroll N" directive.

Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-11-23 10:34:24 -07:00
Michel Zou
95fdde5a60 Meson: Add llvm>=9 modules
Fixes build with MinGW, with shared LLVM and lto
/tmp/opengl32.dll.BxiIYm.ltrans59.ltrans.o:<artificial>:(.text+0x1674): undefined reference to `LLVMAddInstructionCombiningPass'

See also scons/llvm.py

Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-11-23 16:09:52 +00:00
Michel Zou
02d63ee5a4 disk_cache_get_function_timestamp: check for dladdr
instead of dlopen

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-23 12:01:11 +01:00
Michel Zou
bfd9f3201e Meson: Check for dladdr with MinGW
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-23 12:01:11 +01:00
Marek Olšák
ad40715f35 nir/serialize: support any num_components for remaining instructions
Only NPOT vectors greater than vec4 use the extra uint32.

This is for instructions that share the dest code.
load_const and undef already support 1-16 in the header.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
c028449c01 nir/serialize: use 3 unused bits in intrinsic for packed_const_indices
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
3d44aed09e nir/serialize: don't serialize redundant nir_intrinsic_instr::num_components
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
a2df670b14 nir/serialize: serialize writemask for vec8 and vec16
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
a5c5388234 nir/serialize: serialize swizzles for vec8 and vec16
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
f1a48d54ea nir/serialize: reuse the writemask field for 2 src X swizzles of SSA ALU
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
487a495cc0 nir/serialize: remove up to 3 consecutive equal ALU instruction headers
vec4 scalarized ALUs typically have 4 equal instruction headers, so remove
the last 3.

There are no bits left in the ALU header for more flags, so future
extensions of NIR will have to use something like instr_type == 15
to describe more complex ALU instructions.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
c3fa9de2a9 nir/serialize: try to pack both deref array src into 32 bits
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
ed6b01d5e0 nir/serialize: cleanup - fold nir_deref_type_var cases into switches
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
a0cd67d292 nir/serialize: try to put deref->var index into the unused bits of the header
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
ca201bfe70 nir/serialize: don't serialize mode for deref non-cast instructions
It can be derived from src and var. This frees 10 bits in the header
that will be used later.

"mode" is moved in the structure, because those bits will be used for
something else later.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
2286340fde nir/serialize: don't store deref types if not needed
- type_cast: deduplicate types if the last one is the same
- derive the type from the parent for other derefs

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
70a7f85149 nir/serialize: try to pack two alu srcs into 1 uint32
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
ef4630cf4f nir/serialize: pack nir_intrinsic_instr::const_index[] better
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
d3346b275a nir/serialize: pack 1-component constants into 20 bits if possible
The majority of constants can be packed like this.

v2: - use enum for the packing encoding,
    - trim packed_value to 20 bits add 1 bit to last_component,
      which simplifies a later commit

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
75f7c38863 nir/serialize: pack load_const with non-64-bit constants better
v2: use blob_write_uint8/16

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
a572ba673b nir/serialize: try to store a diff in var data locations instead of var data
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
c8314678ee nir/serialize: deduplicate serialized var types by reusing the last unique one
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
545415f45f nir/serialize: don't serialize var->data for temporaries
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
c358c2b2bf nir/serialize: pack src better and limit the object count to 1M from 1G
We need to limit the object count to 1M to free 10 bits for the src
modifiers.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
35655865cb nir/serialize: pack instructions better
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
4fe1d7822b util/blob: add 8-bit and 16-bit reads and writes
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Eric Anholt
59b489f44b ci: Use a tag from the parallel-deqp-runner repo.
If the repo continues development, we don't want to accidentally pick
up potentially breaking changes on our next container rebuild.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 15:37:04 -08:00
Rob Clark
215866523b gitlab-ci/freedreno/a6xx: remove most of the flakes
xfb + lines/points still flakes too frequently (and the problem isn't
even related to xfb), but we can add the rest back into this mix now.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-22 13:48:29 -08:00
Rob Clark
9f422cbe1c gitlab-ci/deqp: generate junit results
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
415d565d96 gitlab-ci/deqp: generate xml results for fails/flakes
Extract .qpa for the individual unexpected results and flakes, and
translate to xml, preserved with the artifacts.  This allows easy
browsing of the test logs for fails/flakes, for easier debugging.

The # of logs to preserve is capped at 50 to avoid saving 100s of
megabytes of logs in case someone pushes a change that breaks
everything.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
8af7551a9e gitlab-ci: bump arm test container
To pick up updated cts_runner and netcat for the flake reporting.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
fdaf777076 gitlab-ci/deqp: detect and report flakes
If there are a small number of fails, re-run to determine if they are
flakes, and optionally (if `$FLAKES_CHANNEL` configured) report the
flakes.

This way flakes don't interfere with developers working on other
drivers, but get logged so that the developers working on the flaking
driver can monitor the situation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
cc6484f164 gitlab-ci/deqp: preserve caselists for blocks with fails
Bump cts_runner to pick up the change to preserve .qpa and caselist .txt
files for blocks of tests that contain fails, and preserve the caselist
files.  To reproduce fails that depend on order of running tests, these
are useful.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
59ed90fc74 gitlab-ci/deqp: preserve full list of unexpected results
The log only shows the first 50, but preserve the full list for easier
browsing.

(Also move return of exit code to end which makes later patches in the
series easier)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Rob Clark
5fa397a0d9 gitlab-ci: update deqp build so we can generate xml
Update the deqp build to preserve testlog-to-xml and stylesheets, so
deqp runner can extract .qpa for failed/flaked tests, and convert to
xml.  With this, will be able to browse output from failed tests
directly from the artifacts.

The main motiviation is to give better visibility into what happens with
flaked tests, when it is difficult/impossible to reproduce the flake
locally (ie. when it happens once out of N million tests).  But this
should also make it easier to debug regressions that a MR triggers,
especially when it is on hw that you don't have.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-22 13:48:29 -08:00
Markus Wick
dba903ed0b drirc: Enable glthread for dolphin/citra/yuzu.
Dolphin: 75 fps -> 88 fps - Super Mario Galaxy
Citra:   81 fps -> 91 fps - A Link Between Worlds
Yuzu:    21 fps -> 27 fps - Super Mario Odyssey

Dolphin still has many syncs because of glFenceSync and glClientWaitSync.
Moving them to the dispatcher thread might yield another speedup.

Yuzu uses a compatible profile by default. This benchmark used the variable
MESA_GL_VERSION_OVERRIDE=4.5FC to overwrite this behavior.

This profilation was done on a mobile i7-8550U CPU with i965.

Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-22 15:29:29 -05:00
Markus Wick
f4c61d422d mesa/glthread: Implement ARB_multi_bind.
Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-22 15:29:07 -05:00
Rhys Perry
517728477c aco: fix waitcnts for barriers at block ends
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: d1b9deee ('aco: improve waitcnt insertion around loops')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-22 19:56:31 +00:00
Zebediah Figura
a3c8bc10aa Revert "draw: revert using correct order for prim decomposition."
This reverts commit f97b731c82.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/250

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-22 20:37:42 +01:00
Kenneth Graunke
acd36e488d iris: Change keybox parenting
For temporary lookups, just allocate out of the NULL ralloc context,
so we don't have to edit the linked list of ralloc children to add it
and then immediately remove it again.

When uploading a new shader, allocate the keybox off the shader, so
if we delete the shader the keybox also goes away.  Less manual cleanup.
2019-11-22 09:50:59 -08:00
Ian Romanick
ca353285cb nir/range_analysis: Make sure the table validation only occurs once
All of the tables are static const, so they only need to be validated
once.  As noted in the previous commit, the compiler should be able to
eliminate all of this code when the assertions would pass.  Even with
the help of the previous commit, this does not always occur.

-Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5
-O1: No difference proven at 95.0% confidence. N=5
-O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
Ian Romanick
ccefce46cb nir/range-analysis: Add pragmas to help loop unrolling
I was pretty liberal with these assertions when I wrote this code
because I had assumed that GCC would unroll the loops, inline the look ups
of static const arrays with now constant indices, and then elmininate
all the actuall assertions.  It seems none of this happens even at -O3.

Adding the pragmas helps encourage loop unrolling at some optimization
levels.  I tested by running shader-db with NIR_VALIDATE=false on a Core
i7 Haswell desktop system.

-Og: No difference proven at 95.0% confidence. N=5
-O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5
-O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5

v2: Add a _Pragma to an inner loop that was accidentally dropped during
a rebase.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
Danylo Piliaiev
25a00b449f glsl: Add varyings to "zero-init of uninitialized vars" workaround
Varyings are similar to already handled cases. And "glsl_zero_init"
name of the workaround already looks like it should include varyings.

The issue was observed in GiMark subtest from GpuTest.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 15:25:56 +00:00
Alyssa Rosenzweig
4c43b354c3 pan/midgard: Use lower_tex_without_implicit_lod
Just a bit of cleanup. lower_tex can do this lowering for us, which
should also eliminate some special cases (one less thing to fix if we
ever need texturing in tess/geom/etc, perhaps?)

Closes #2133

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 08:38:57 -05:00
Christian Gmeiner
47c7c4263c etnaviv: use a more self-explanatory param name
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-22 10:47:13 +00:00
Christian Gmeiner
a949fa9d5d etnaviv: drop not used config_out function param
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-22 10:47:13 +00:00
Samuel Pitoiset
6f7ec6ee39 gitlab-ci: reduce the number of scons build
It seems overkill to me to build scons 7x for every pipeline.
Scons is now build with the oldest llvm version in scons-old-llvm
and with the newest llvm version in scons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 10:39:21 +01:00
Alyssa Rosenzweig
2e14fe6490 panfrost: Add lcra.c to Android.mk
This was forgotten.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
bda2bb31b1 pan/midgard: Enable LOD lowering only on buggy chips
T720 and earlier need this workaround, so check the quirk before
lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
68c2c7962a pan/midgard: Describe quirk MIDGARD_BROKEN_LOD
Corresponds to errata #10471, applies to T6xx and T720. Fixed in T760.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
d32d4acf68 pan/midgard: Add LOD bias/clamp lowering
We fetch the info with the new intrinsic and lower with ALU ops for txl
instructions, which seemingly correspond to "TEXGRD" instructions (what
we call textureLod).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
4e07e7b232 pan/midgard: Implement load_sampler_lod_paramaters_pan
We can stuff this information in as parametrized system values, like we
currently do texture size and SSBO addresses.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig
deaebc82a7 nir: Add load_sampler_lod_paramaters_pan intrinsic
This loads in the <min_lod, max_lod, lod_bias> settings for a given
sampler, which is necessary for lowering clamps/biases on certain
Midgard chips.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Markus Wick
b1156ecdf2 mapi/glapi: Generate sizeof() helpers instead of fixed sizes.
Generating a source code with a fixed size leads to issues with plattform dependent types.
We either hard code 4 or 8 bytes there, and both are wrong on the other plattform.
So this patch solves this issue by generating eg sizeof(GLsizeiptr), which is valid both
on 32 and on 64 bit plattforms.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-11-21 22:52:55 -05:00
Ian Romanick
e51eda99df intel/fs: Disable conditional discard optimization on Gen4 and Gen5
The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of
valid data and 31 bits of junk.  Results of comparisons that are used as
Boolean values need to have a fixup applied to generate the proper 0/~0
values.

Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup
code from being generated.  This results in a sequence like:

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
(+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD     g8<8,8,1>UD

instead of

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
        or(16) g4<1>UD g4<8,8,1>UD     g8<8,8,1>UD
(+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD     1UD

I examined a couple of the shaders hurt by this change, and ALL of them
would have been affected by this bug. :(

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836
Fixes: 0ba9497e66 ("intel/fs: Improve discard_if code generation")

Iron Lake
total instructions in shared programs: 8122757 -> 8122957 (<.01%)
instructions in affected programs: 8307 -> 8507 (2.41%)
helped: 0
HURT: 100
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.58% 3.03%
Instructions are HURT.

total cycles in shared programs: 188510100 -> 188510376 (<.01%)
cycles in affected programs: 76018 -> 76294 (0.36%)
helped: 0
HURT: 55
HURT stats (abs)   min: 2 max: 12 x̄: 5.02 x̃: 4
HURT stats (rel)   min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56%
95% mean confidence interval for cycles value: 4.33 5.71
95% mean confidence interval for cycles %-change: 0.60% 1.12%
Cycles are HURT.

GM45
total instructions in shared programs: 4994403 -> 4994503 (<.01%)
instructions in affected programs: 4212 -> 4312 (2.37%)
helped: 0
HURT: 50
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.45% 3.07%
Instructions are HURT.

total cycles in shared programs: 128928750 -> 128928982 (<.01%)
cycles in affected programs: 67442 -> 67674 (0.34%)
helped: 0
HURT: 47
HURT stats (abs)   min: 2 max: 12 x̄: 4.94 x̃: 4
HURT stats (rel)   min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53%
95% mean confidence interval for cycles value: 4.19 5.68
95% mean confidence interval for cycles %-change: 0.50% 1.00%
Cycles are HURT.
2019-11-21 16:40:50 -08:00
Dylan Baker
bba44ef176 docs: update calendar, add news item and link release notes for 19.2.6 2019-11-21 16:34:00 -08:00
Dylan Baker
3531d74e82 docs: Add SHA256 sum for 19.2.6 2019-11-21 16:32:35 -08:00
Dylan Baker
f8070577a4 docs: Add release notes for 19.2.6 2019-11-21 16:32:34 -08:00
Marek Olšák
0b1452ffdd nir/serialize: do ctx = {0} instead of manual initializations
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-21 18:49:57 -05:00
Marek Olšák
ff71fae440 nir: strip as we serialize to remove the nir_shader_clone call
Serializing stripped NIR is faster now.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-21 18:49:57 -05:00
Christian Gmeiner
8acaab1aa7 etnaviv: add drm-shim
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 22:56:04 +00:00
Eric Engestrom
609a6ae23e vk_util: drop duplicate formats in vk_format_map[]
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 22:52:40 +00:00
Jonathan Marek
773d640efa turnip: implement UBWC
This enables UBWC for everything except 3D textures.

It breaks many image_to_image copies but those aren't important and it can
be worked around later (image_to_image copy needs to be done in two steps,
decode from the source format and then encode to the destination format).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 22:21:57 +00:00
Jonathan Marek
91fd83d142 freedreno/regs: update UBWC related bits
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 22:21:57 +00:00
Vinson Lee
6613a4a029 swr: Fix build with llvm-10.0.
Fix build error after llvm-10.0 commit 1dfede3122ee ("Move
CodeGenFileType enum to Support/CodeGen.h").

../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp: In member function ‘void JitManager::DumpAsm(llvm::Function*, const char*)’:
../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp:428:45: error: ‘CGFT_AssemblyFile’ is not a member of ‘llvm::TargetMachine’
             *pMPasses, filestream, nullptr, TargetMachine::CGFT_AssemblyFile);
                                             ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-11-21 13:20:08 -08:00
Rhys Perry
29d131d619 aco: fix copy+paste error
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-21 20:28:57 +00:00
Rhys Perry
d1b9deeea8 aco: improve waitcnt insertion around loops
Do this by repeating processing of loops until no progress is made.

Totals from affected shaders:
SGPRS: 162576 -> 162576 (0.00 %)
VGPRS: 145228 -> 145228 (0.00 %)
Spilled SGPRs: 668 -> 668 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 15778640 -> 15771336 (-0.05 %) bytes
LDS: 146 -> 146 (0.00 %) blocks
Max Waves: 6087 -> 6087 (0.00 %)

v2: use block_kind_loop_header/block_kind_loop_exit to repeat at the end
    of loops instead of at each continue

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-21 20:28:57 +00:00
Rob Clark
1a8c49d76c freedreno/perfctrs/fdperf: periodically restore counters
When GPU is idle and suspends, the currently selected countables
will all reset to the first one.  So periodically restore the selected
countables.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
5a13507164 freedreno/perfcntrs: add fdperf
Port from the envytools tree, but converted to use the .c tables for
describing the perfcounter groups/countables, rather than using rnndec
to get this at runtime from the register xml.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
b2338a5b00 freedreno/perfcntrs/a6xx: remove RBBM counters
Currently this are getting blocked by the kernel.. these counters don't
seem to be the most useful ones, and to use them we'd have to somehow
probe the kernel by submitting cmdstream to write the selector regs and
see if that triggers a GPU fault.  So let's just skip them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
6a517b3079 freedreno/perfctrs/a2xx: move CP to be first group
fdperf expects this, to find the ALWAYS_COUNT counter

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
e35c4e6ad2 freedreno/perfcntrs: add accessor to get per-gen tables
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
b21f03ae7e freedreno/perfcntrs: move to shared location
This should eventually be useful for VK_KHR_performance_query as well.
And in the more near term, for fdperf.

Attempt to not break android build is best-effort and untested.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:03 +00:00
Rob Clark
6727114cba freedreno/perfcntrs: remove gallium dependencies
Prep work to move to a shared location.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:02 +00:00
Rob Clark
3fb6aaf42e freedreno/perfcntrs: small cleanup
When we had one gen supporting performance counters, it made sense to
have these builder macros in the .c file with the table.  But time has
come to de-duplicate.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 20:01:02 +00:00
Dave Airlie
cce07ea835 nir: fix deref offset builder
Use the correct bit size

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:41 +10:00
Dave Airlie
7325f6ac98 vtn/opencl: add clz support
This is needed for OpenCL

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:41 +10:00
Dave Airlie
e3b21dfcb1 nouveau: request ufind_msb64 lowering in the frontend.
This passes the piglit CL builtin-ulong-clz-1.0.generated.cl
test.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-11-22 04:37:41 +10:00
Dave Airlie
d0d96053e6 nir: add 64-bit ufind_msb lowering support. (v2)
This adds the option to lower 64-bit ufind_msb opcodes.

v2: use split_x/y removes component loops (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:37 +10:00
Dave Airlie
12913bcf86 spirv/nir/opencl: handle some multiply instructions.
This adds support for some missing 24-bit and hi multiply
variants.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
Dave Airlie
5375c30234 spirv: get the correct type for function returns.
This needs to be derived from the address format, not always 1/32.

Suggested by Jason

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
Dave Airlie
b62a925ad1 spirv: don't store 0 to cs.ptr_size for non kernel stages.
cs is a union so storing this there is wrong.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
Jonathan Marek
1496e1164f util: add missing R8G8B8A8_SRGB format to vk_format_map
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-21 17:46:27 +00:00
Elie Tournier
72b44d148d docs: fix ascii html representation
v2 (Eric): Use more readable ascii version

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-11-21 16:51:18 +00:00
Elie Tournier
64d7bd96b8 Docs: remove duplicate meson docs for windows
This block is duplicated, we already have the windows instruction above.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-11-21 16:51:18 +00:00
Eric Anholt
dd76a6f198 ci: Move freedreno's parallelism to the runner instead of gitlab-ci jobs.
I set the runners to concurrency=1, so they serve only one gitlab-ci job
at at time.  Swap over to using the parallel runner now to keep the
runners busy, more efficiently than spawning many docker containers and
downloading artifacts multiple times, and producing easier-to-understand
results for browsing on the web.

This bumps the a306 runners to 4x parallel instead of 2x like before, but
cheza gles3 drops from 6 to 4.  Current rough timings of the jobs (if no
container download):

db410c-gles2: 5:00
a630-gles2: 1:30
a630-gles3: 6:00
a630-gles31: 5:30

a630-gles3 is a bit longer than I like, but it should come back down once
I can sort out the NIR algebraic rewinding.
2019-11-21 05:48:17 -08:00
Iago Toral Quiroga
c573b50179 glsl: add missing initialization of the location path field
This was apparently missed in 67b32190f3, which added support
for ARB_shading_language_include to #line, including the 'path'
field for the location.

Fixes crashes in CTS with all drivers as they attempt to access
an uninitialized path string during parsing.

Fixes: 67b32190f3 ("glsl: add ARB_shading_language_include support to #line")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2132
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-11-21 12:55:15 +01:00
Rhys Perry
1a0500cd04 docs: update features.txt for RADV
[skip ci]

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-21 11:00:50 +00:00
Michel Dänzer
32618ee719 gitlab-ci: Directly use host-mapped directory for ccache
Use hardcoded /cache/mesa/ccache for the cache, so it will be shared by
all jobs of all Mesa projects running on the same runner host. This
should increase the hit rate and decrease the worst case storage used.

Further benefits of directly using a host-mapped directory:

* Saves up to ~1 minute per job for restoring and saving the cache
  contents via the GitLab CI cache mechanism
* Cache contents generated by failed jobs are no longer lost
* Jobs running in parallel on the same runner host can get hits from
  each other

Also enable compression, so the default maximum cache size of 5G might
be sufficient.

v2:
* Move CCACHE_DIR variable to the .build-linux template

Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net> # v1
2019-11-21 10:13:43 +01:00
Samuel Pitoiset
0d1085ac4a gitlab-ci: remove now useless meson-swr-glvnd build job
All things are already part of meson-main.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-21 09:35:05 +01:00
Samuel Pitoiset
7362176cfe gitlab-ci: build GLVND in meson-clang
Building GLVND in meson-main doesn't work because this disables
libEGL and it's needed for running shader-db.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-21 09:35:05 +01:00
Samuel Pitoiset
e6d26d77a3 gitlab-ci: build swr in meson-main
Now that debugoptimized isn't set and that all test jobs depend on
meson-testing, enabling swr shouldn't slowdown the CI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-21 09:35:05 +01:00
Samuel Pitoiset
6cf9b53fa2 gitlab-ci: do not build with debugoptimized for meson-main
This should reduce compile time because optimizations are costly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-21 09:35:05 +01:00
Samuel Pitoiset
66b5627074 gitlab-ci: add a job that only build things needed for testing
For turnip and RADV testing, we will need a debugoptimized build
without UBSAN. This introduces meson-testing which builds only the
things that are needed by the test stage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-21 09:35:04 +01:00
Samuel Pitoiset
eab328fbe9 gitlab-ci: fix ldd check for Vulkan drivers
The 'dri' directory isn't created when building Vulkan drivers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-21 09:34:08 +01:00
Samuel Pitoiset
24dd730efc gitlab-ci: move building piglit into a separate script
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-21 09:33:39 +01:00
Samuel Pitoiset
8fc8e8e8be pipe-loader: check that the pointer to driconf_xml isn't NULL
This happens when mesa is built with only swrast. The default
driver being kmsro and the default driconf file being v3d,
it's NULL and then strdup crashes.

This fixes a crash with piglit spec/egl_mesa_query_driver/conformance.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-21 07:34:20 +01:00
Alyssa Rosenzweig
046097c092 panfrost: Add the lod_bias field
Enough trial and error ... just think even *more* Midgard about where
this field might be!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-21 06:05:12 +00:00
Timothy Arceri
cd6322366d compiler: move build definition of pp_standalone_scaffolding.c
This should fix android build issues while still allowing scons to
build the standalone compiler.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2129

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2019-11-21 16:07:08 +11:00
Karol Herbst
5934a53bfe nir/validate: validate num_components on registers and intrinsics
also make 8 and 16 compoments invalid. We will enable that later again
when we actually support it.

v2: fix validation of nir_intrinsic_instr::num_components
    correct validation of instr->num_components

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-21 01:10:24 +01:00
Mark Janes
eae8dfef58 Revert "st/mesa: keep serialized NIR instead of nir_shader in st_program"
This reverts commit db0c89d4bf.

Gitlab: mesa/mesa#2128
Acked-by: Marek Olšák <maraeo@gmail.com>
2019-11-20 15:22:32 -08:00
Mark Janes
f1f19b6445 Revert "st/mesa: call nir_serialize only once per shader"
This reverts commit 3a8d686889.

Acked-by: Marek Olšák <maraeo@gmail.com>
2019-11-20 15:22:32 -08:00
Arno Messiaen
721d82cf06 lima/ppir: add lod-bias support
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-11-20 22:24:00 +00:00
Jason Ekstrand
2fca325ea6 Revert "i965/fs: Merge CMP and SEL into CSEL on Gen8+"
This reverts commit 52c7df1643.  The pass,
while clearly useful for some shaders, has at least three bugs that I
was able to find fairly quickly:

 1. It doesn't work for type-converting MOVs because f > 0 is not the
    same as f2i(f) > 0

 2. CSEL is a 3src instruction and only supports one source type; it
    doesn't take this into account and tries to create instructions
    which do a F compare and a D select.  This is especially nasty to
    debug because you don't see that in the dumped assembly because we
    don't properly assert that types are the same in codegen.

 3. While you can handle 2, in theory, by reinterpreting types, you
    can't do that in the presence of source modifiers.  This pass
    doesn't even attempt to detect that.

Those are just the ones I found with the one almost trival shader I was
debugging.  There very likely may be more and.  Best thing to do for now
is just shut it off until someone has the time to figure out how to do
this properly and write tests to ensure it's correct.

Fixes: 3cb085e6d61a "i965/fs: Merge CMP and SEL into CSEL on Gen8+"
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-11-20 20:47:32 +00:00
Daniel Schürmann
8d7621a53f radv: Enable Subgroup Arithmetic and Clustered for SI
This patch also allows to enable VK_AMD_shader_ballot on SI.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-20 20:31:45 +00:00
Daniel Schürmann
0cbcfc071e amd/llvm: Add Subgroup Scan functions for SI
The idea of this implementation is taken from the ROCm Device Libs:
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-20 20:31:45 +00:00
Andreas Baierl
fca2d3ce3f lima/streamparser: Add findings introduced with gl_PointSize
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
2019-11-20 19:24:12 +00:00
Andreas Baierl
804c295039 lima/streamparser: Fix typo in vs semaphore parser
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
2019-11-20 19:24:12 +00:00
Yevhenii Kolesnikov
9af22ccddc meson: Fix linkage of libgallium_nine with libgalliumvl
Do not link libgallium_nine with libgalliumvl_stub if it's already
linked with libgalliumvl. Linking with stub leads to "duplicate
symbol" errors.

Fixes: 6b4c7047d5
       ("meson: build gallium nine state_tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2040

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-11-20 19:16:20 +00:00
Dylan Baker
bcfc9c0fec docs/release-calendar: Update for extended 19.3 rc period 2019-11-20 09:57:05 -08:00
Dylan Baker
ff21acc91c docs: update calendar, add news item and link release notes for 19.2.5 2019-11-20 09:22:29 -08:00
Dylan Baker
d35429239b docs/relnotes/19.2.5: Add SHA256 sum 2019-11-20 09:19:02 -08:00
Dylan Baker
6567b2daa9 docs: Add relnotes for 19.2.5 2019-11-20 09:19:00 -08:00
Rhys Perry
ca2de7ae9c nir/large_constants: use nir_index_vars and nir_variable::index
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-20 15:05:42 +00:00
Rhys Perry
9f92e8b721 nir: add nir_variable::index and nir_index_vars
This will be useful as a deterministic identifier/index for the variable.

v2: fix comment style

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)
2019-11-20 15:05:42 +00:00
Rhys Perry
45a0b53490 nir: make nir_variable::{num_members,num_state_slots} a uint16_t
Doesn't shrink it (at least, on x86-64) and leaves space for more members.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-20 15:05:42 +00:00
Samuel Pitoiset
645332f3f5 docs: add missing new features for RADV
[skip ci]

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-20 16:04:15 +01:00
Hyunjun Ko
02f4c39b8d freedreno/ir3: enable half precision for pre-fs texture fetch
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Hyunjun Ko
407f8c71d3 freedreno/ir3: fixup when changing to mad.f16
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Hyunjun Ko
d0f38394b1 freedreno/ir3: fix printing output registers of FS.
Fixes: cea39af2fb ("freedreno/ir3: Generalize ir3_shader_disasm()")

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
37f5395783 freedreno/ir3: Enabling lowering 16-bit flrp
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Hyunjun Ko
35124b0311 freedreno: support 16b for the sampler opcode
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
b934716bd8 freedreno/ir3: Implement f2b16 and i2b16
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
030b046df8 freedreno/ir3: Add implementation of nir_op_b16csel
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
f0a046024d freedreno/ir3: Support 16-bit comparison instructions
v2. [Hyunjun Ko (zzoon@igalia.com)]
Avoid using too much open code like "instr->regs[n]->flags |= FOO"

v3. [Hyunjun Ko (zzoon@igalia.com)]
Remove redundant code for both 16b and 32b operations.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Hyunjun Ko
138542499f freedreno/ir3: cleanup by removing repeated code
Prep-work for the corresponding patch.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
f6b5abe91a nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops
Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
634eb9c04b nir: Add a 8-bit bool type
Adds nir_type_bool8 as well as 8-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
0f5640c577 nir: Add a 16-bit bool type
Adds nir_type_bool16 as well as 16-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
2ec97e78a9 nir/opcodes: Add a helper function to generate reduce opcodes
Adds binop_reduce_all_sizes which generates both 1-bit and 32-bit
versions of the reduce operation. This reduces the code duplication a
bit and will make it easier to later add 16-bit versions as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
9a96afb97e nir/opcodes: Add a helper function to generate the comparison binops
Adds binop_compare_all_sizes which generates both 1-bit and 32-bit
versions of the comparison operation. This reduces the code
duplication a bit and will make it easier to later add 16-bit versions
as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Samuel Pitoiset
7ecd8a3471 radv: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7
Most of DEQP-VK.subgroups are skipped because 16-bit float aren't
supported but others pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-20 11:09:58 +00:00
Alejandro Piñeiro
b4bc59e37e v3d: adds an extra MOV for any sig.ld*
Specifically when we are in non-uniform control flow, as we would need
to set the condition for the last instruction. If (for example) a
image atomic load stores directly their value on a NIR register,
last_inst would be a nop, and would fail when set the condition.

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")

v2: (Changes suggested by Eric Anholt)
   * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of
     them have the same restriction.
   * Update comment explaining why we add a MOV in that case
   * Tweak commit message.

v3:
   * Drop extra set of parens (Eric)
   * Add missing ld signal to is_ld_signal to fix shader-db regression.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-20 11:21:16 +01:00
Jose Maria Casanova Crespo
d983055184 v3d: Fix predication with atomic image operations
Fixes dEQP test:
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-20 11:20:55 +01:00
Tomeu Vizoso
36b099a7b0 panfrost: Don't print the midgard_blend_rt structs on SFBD
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 08:04:25 +01:00
Tomeu Vizoso
2dc720cb2c gitlab-ci: Fix dir name for VK-GL-CTS sources
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 08:03:44 +01:00
Tomeu Vizoso
409f6c40ca panfrost: Rework buffers in SFBD
Support cases such as depth-only renders and only set stencil buffers
when needed, to match the blob's behaviour.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 08:03:36 +01:00
Tomeu Vizoso
697f02c2a1 panfrost: Just print tiler fields as-is for Tx20
The tiler unit in these GPUs is quite different and we haven't reverse
engineered enough of it yet to validate and pretty print it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 08:00:41 +01:00
Alyssa Rosenzweig
fcf144d96a pan/midgard: Introduce quirks checks
Rather than open-coding checks on gpu_id in the compiler, let's track
quirks applying to whatever we're compiling for, to allow us to manage
the complexity of many heterogenous GPUs in the compiler.

It was discovered that a workaround used on T720 is also required on
T820 (and presumably T830), so let's fix this. This will also decrease
friction as we continue improving T720 support.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-20 07:41:39 +01:00
Timothy Arceri
614fba0ce1 gitlab-ci: update for arb_shading_language_include 2019-11-20 05:05:56 +00:00
Timothy Arceri
530d3b2900 gitlab-ci: bump piglit checkout commit 2019-11-20 05:05:56 +00:00
Timothy Arceri
af432be538 mesa: enable ARB_shading_language_include
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/999

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
49cdbba9f6 mesa: implement glCompileShaderIncludeARB()
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
bad2c77aa8 mesa: add shader include lookup support for relative paths
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
1201d3377e mesa: add support cursor support for relative path shader includes
This will allow us to continue searching the current path for
relative shader includes.

From the ARB_shading_language_include spec:

   "If it is quoted with double quotes in a previously included
   string, then the first search point will be the tree location
   where the previously included string had been found."

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
db5197cec5 glsl: delay compilation skip if shader contains an include
If the shader contains an include when need to first run the
preprocessor before deciding if we can skip compilation based
on the shader cache.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
17df8f8b5d glsl: add can_skip_compile() helper
We will reuse this in the following commit.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
5327b756bf glsl: error if #include used while extension is disabled
In other words make sure the shader does this:

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
13a1426b97 glsl: add preprocessor #include support
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
e0fd2fa689 glsl: pass gl_context to glcpp_parser_create()
This is a small tidy up and will be useful in the following commit.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
67b32190f3 glsl: add ARB_shading_language_include support to #line
From the ARB_shading_language_include spec:

   "#line must have, after macro substitution, one of the following
    forms:

       #line <line>
       #line <line> <source-string-number>
       #line <line> "<path>"

    where <line> and <source-string-number> are constant integer
    expressions and <path> is a valid string for a path supplied in the
    #include directive. After processing this directive (including its
    new-line), the implementation will behave as if it is compiling at
    line number <line> and source string number <source-string-number>
    or <path> path. Subsequent source strings will be numbered
    sequentially, until another #line directive overrides that
    numbering."

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
2497c51717 mesa: implement glDeleteNamedStringARB()
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
f2d01cac7e mesa: split _mesa_lookup_shader_include() in two
The new local function lookup_shader_include() will be used by
glDeleteNamedStringARB() in the following patch.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
ae2e41841f mesa: implement glGetNamedStringivARB()
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
575137e613 mesa: implement glIsNamedStringARB()
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
fafda32127 mesa: make error checking optional in _mesa_lookup_shader_include()
This will be usefull when implementing glIsNamedStringARB() which
doesn't do error checking, it just returns false for invalid
lookups instead.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
a47bfbe189 mesa: implement glGetNamedStringARB()
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
fc573c9816 mesa: add glNamedStringARB() support
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
628d34fddd mesa: add copy_string() helper
This will be used by the various ARB_shading_language_include
functions in the following patches.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
8acab84f93 mesa: add _mesa_lookup_shader_include() helper
This will be used both by the glsl compiler and the GL API.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
643a533fc2 mesa: add helper to validate tokenise shader include path
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
06f33d82ca mesa: add ARB_shading_language_include infrastructure to gl_shared_state
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
35108caa71 glsl: add infrastructure for ARB_shading_language_include
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
906f1a2933 mesa: add ARB_shading_language_include stubs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Bas Nieuwenhuizen
4eb2a1dc6f radv: Do not change scratch settings while shaders are active.
When the scratch ringbuffer settings are changed, the shader unit has
to be idle or we will have shaders using old and new settings.

That combination is not supported on the HW (likely the offset is
ringbuffer idx * WAVESIZE * 1024).

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-20 01:18:36 +00:00
Eric Anholt
bdf03b738d turnip: Drop the copy of the formats table.
Now that we can (mostly) generate a pipe format for a VkFormat, use that
to answer queries about formats.  This will let us refactor the freedreno
format table surface layout code to be shared between gallium and vulkan.

This causes us to expose fewer formats for now (on a 1/100 CTS run I'm
doing, skips go from 3671 to 3835 out of 5145 tests).  Fails stay about
the same (478 -> 434, but the run is pretty flaky and we're doing fewer
tests now).

v2: Rebase on master, throw a finishme on missing vk-to-pipe formats that
    tu used to support.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (v1)
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-11-19 15:35:52 -08:00
Eric Anholt
3a28281bf8 util: Add a mapping from VkFormat to PIPE_FORMAT.
I'm planning on using this from radv and tu for queries about formats.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-11-19 15:35:52 -08:00
Marek Olšák
36c055c9b7 winsys/amdgpu: detect noop dependencies on the same ring correctly
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:32:56 -05:00
Marek Olšák
e7fb9c73a7 ac: fill num_rings for remaining IPs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:31:53 -05:00
Marek Olšák
e9cc4f670f ac: add radeon_info::num_rings and move ring_type to amd_family.h
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:31:53 -05:00
Marek Olšák
654efd38bb nir: don't use GLenum16 in nir.h
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:12 -05:00
Marek Olšák
ec7d37c9c0 nir: move data.descriptor_set above data.index for better packing
4 bytes down

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:10 -05:00
Marek Olšák
b160acb9f5 glsl_to_nir: rename image_access to mem_access
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:09 -05:00
Marek Olšák
193e2c9625 nir/print: only print image.format for image variables
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:07 -05:00
Marek Olšák
ebe7579655 nir: move data.image.access to data.access
The size of the data structure doesn't change.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:05 -05:00
Marek Olšák
3a8d686889 st/mesa: call nir_serialize only once per shader
It was called twice.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
db0c89d4bf st/mesa: keep serialized NIR instead of nir_shader in st_program
This decreases memory usage, because serialized NIR is more compact.

If shader_has_one_variant is true and the shader is uncached, the first
variant is created from nir_shader, otherwise the first variant and
all other variants are created from serialized NIR.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
610fb0e19c st/mesa: call nir_sweep in st_finalize_nir
This is invoked sooner before (pre-)compiling the first variant and is
also applied to fixed-func and ARB programs.
2019-11-19 18:02:06 -05:00
Marek Olšák
4e70cba638 st/mesa: subclass st_vertex_program for VP-specific members
Inheritance:
    gl_program -> st_program -> st_vertex_program

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
16e5f13b64 st/mesa: more cleanups after unification of st_vertex/common_program
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
6b3d72b041 st/mesa: rename occurences of stcp to stp to correspond to st_program
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
1375217116 st/mesa: cleanups after unification of st_vertex/common program
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
5fed208285 st/mesa: rename st_common_program to st_program
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
2e39e8b972 st/mesa: trivially merge st_vertex_program into st_common_program
a later commit will add back st_vertex_program as a subclass of
st_common_program

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
c97df7b4c7 st/mesa: consolidate and simplify code flagging program::affected_states
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
f71e93db0a st/mesa: initialize affected_states and uniform storage earlier in deserialize
This matches the uncached codepath.

affected_states was used before initialization, which was technically
a bug, but probably not reproducible due to _NEW_PROGRAM rebinding
everything.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
60398e2d45 st/mesa: start deduplicating some program code
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
445ec0fc63 st/mesa: decrease the size of st_fp_variant_key from 48 to 40 bytes
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Marek Olšák
2c8652f98a st/mesa: rename delete_basic_variant -> delete_common_variant
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-19 18:02:06 -05:00
Eric Engestrom
51e214c1db anv: add missing "fall-through" annotation
CoverityID: 1455884
Fixes: c1c346f166 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-19 22:03:00 +00:00
Eric Engestrom
99788de909 egl: use EGL_CAST() macro in eglmesaext.h
Allows eglmesaext.h to be used in C++ code.

This aligns this file with the rest of EGL.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-11-19 22:00:24 +00:00
Eric Engestrom
344859c32d vulkan: delete typo'd header
Two files exist in that directory:
- vulkan_xlib_randr.h
- vulkan_xlib_xrandr.h

Both were imported in 205c271562 ("vulkan: Update the XML and
headers to 1.1.70") with identical contents (ie. the
VK_EXT_acquire_xlib_display extension), but the former was never
included anywhere and can't be found upstream [1], while the latter is
included in vulkan.h and found upstream.

[1] https://github.com/KhronosGroup/Vulkan-Headers/tree/master/include/vulkan

Fixes: 205c271562 ("vulkan: Update the XML and headers to 1.1.70")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-19 21:56:22 +00:00
Eric Engestrom
0d69c2e932 CL: sync C++ headers with Khronos
https://github.com/KhronosGroup/OpenCL-CLHPP at commit
cf9fc1035e8298c7ce65ee33066a660fd9892ebb

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-19 21:50:26 +00:00
Eric Engestrom
a15aef0d39 CL: sync C headers with Khronos
https://github.com/KhronosGroup/OpenCL-Headers at commit
0d5f18c6e7196863bc1557a693f1509adfcee056

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-19 21:50:25 +00:00
Rafael Antognolli
dadb6ebbd1 intel: Add workaround for stencil state.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-11-19 21:43:09 +00:00
Jonathan Marek
d2cf3cad91 turnip: fix sRGB GMEM clear
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-19 21:35:37 +00:00
Jonathan Marek
d68acdb3b9 turnip: implement CmdClearColorImage/CmdClearDepthStencilImage
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-19 21:35:37 +00:00
Rhys Perry
7eb7969213 radv/aco: enable VK_KHR_shader_subgroup_extended_types
We could enable it on GFX10 if LLVM wasn't used as a fallback for
unsupported stages. Note that the CTS only tests it if
VK_KHR_shader_float16_int8 is enabled, even though it's not a
requirement.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-19 18:58:04 +00:00
Rhys Perry
56c06c79fc aco: implement 64-bit integer reductions
The multiplication reduction is larger than it could be, but it should be
easier to implement this way.

No failures with dEQP-VK.subgroups.*int64* except those caused by LLVM
being used for other stages.

v2: don't call setFixed() for v_add carry-out, since setHint sets physReg
v3: add and use emit_vadd32() helper
v4: use num_opcodes instead of last_opcode

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)
2019-11-19 18:58:04 +00:00
Rhys Perry
33277bd66e aco: refactor reduction lowering helpers
Should make 64-bit integer reductions easier to implement.

v4: use num_opcodes instead of last_opcode

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)
2019-11-19 18:56:21 +00:00
Samuel Pitoiset
c93f2cefd5 radv: advertise VK_KHR_shader_subgroup_extended_types on GFX8-GFX9
This extension allows to use subgroup operations with 8 and 16-bits

Untested on GFX6-GFX7, and most of subgroup operations are broken
on GFX10, so don't enable it for now. Not enabled on ACO because
it's still doesn't support 8-bits/16-bits.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
80c71cbbd8 ac: add 16-bit float support to ac_build_alu_op()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
670aa24c69 ac: add 8-bit and 16-bit supports to ac_build_optimization_barrier()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
21a9243f5e ac: add 8-bit and 16-bit supports to ac_build_wwm()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
ef352a2466 ac: add 8-bit and 16-bit supports to get_reduction_identity()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
c8af1d51d4 ac: add 8-bit and 16-bit supports to ac_build_swizzle()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
1565118d8f ac: add 8-bit and 16-bit supports to ac_build_dpp()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
2113867f0c ac: add 8-bit and 16-bit supports to ac_build_set_inactive()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
c29514bd22 ac: add 8-bit and 16-bit supports to ac_build_readlane()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
58d5ab98a3 ac: add 8-bit and 16-bit supports to ac_build_shuffle()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
204cf54b70 ac: remove useless cast in ac_build_set_inactive()
The return type is always the src type (32 or 64 bits).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Samuel Pitoiset
194bee193c spirv: fix lowering of OpGroupNonUniformAllEqual
It should rely on the source type, not on the return type which
is always a boolean anyways, so vote_feq was never selected. For
OpSubgroupAllEqualKHR it's always an integer comparison.

This fixes some VK_KHR_shader_subgroup_extended_types tests with RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
Tomeu Vizoso
2941a734a0 gitlab-ci: Remove limit on kernel logging
We don't seem to fault any more when running dEQP GLES2, and we don't
scrape serial output any more anyway so no problems should be caused by
that.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-19 15:39:13 +01:00
Pierre-Eric Pelloux-Prayer
99f0feb9e2 mesa: fix warning in 32 bits build
Fixes: febedee4f6 ("mesa: add EXT_dsa glGetVertexArray* 4 functions")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
3a5a55e5a5 mesa: enable EXT_direct_state_access
Always enabled; this doesn't require any driver work, it's just
core mesa bits.

quick_gl.txt is also updated because previously piglit ext_dsa
tests were skipped.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
1ef297645c mesa: add ARB_sparse_buffer NamedBufferPageCommitmentEXT function
The spec is unclear on how to handle the buffer argument so we reuse
the logic from the EXT_direct_state_access spec.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
8b6d19413f mesa: add ARB_vertex_attrib_binding glVertexArray* functions
We can't simply alias ARB_direct_state_access functions because
those fail if the vao has never been bound before.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
657396aa10 mesa: extend vertex_array_attrib_format to support EXT_dsa
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
bb2241bf06 mesa: implement ARB_texture_storage_multisample + EXT_dsa functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
a0d667036d mesa: add ARB_texture_buffer_range glTextureBufferRangeEXT function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
b78e2a197a mesa: add ARB_instanced_arrays EXT_dsa function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
a807b8c0a8 mesa: add ARB_gpu_shader_fp64 selector-less functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
e3385eb0c1 mesa: add ARB_clear_buffer_object named functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer
442fd3d007 mesa: add ARB_vertex_attrib_64bit VertexArrayVertexAttribLOffsetEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:44 +01:00
Pierre-Eric Pelloux-Prayer
8cfb3e4ee5 mesa: add ARB_framebuffer_no_attachments named functions
The wording in ARB_framebuffer_no_attachments and EXT_direct_state_access
is different.
In the former framebuffer names must have been generated using glGenFramebuffers
before using the named functions.
In the latter framebuffer names have no such constraints, so we can't use
the _mesa_lookup_framebuffer_dsa function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:44 +01:00
Pierre-Eric Pelloux-Prayer
dc057f638c mesa: update features.txt to reflect EXT_dsa status
All features from the EXT_dsa spec are implemented.

Interactions with other specs:
- GL_AMD_gpu_shader_int64: not needed, since it's not enabled in
  compatibility profile.
- GL_ARB_bindless_texture is DONE
    "INVALID_OPERATION is generated when calling various functions
    to modify the state of a texture object from which handles have
    been extracted"
- GL_ARB_buffer_storage/GL_EXT_buffer_storage is DONE (NamedBufferStorageEXT function)
- GL_ARB_texture_storage is DONE (3 TextureStorage*DEXT functions)
- GL_ARB_vertex_attrib_binding is DONE (6 VertexArray* functions)
- GL_EXT_external_buffer is not supported by Mesa

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-19 08:49:44 +01:00
Alyssa Rosenzweig
8b1548a12f panfrost: Set PIPE_COMPUTE_CAP_ADDRESS_BITS to 64
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-19 06:22:31 +00:00
Alyssa Rosenzweig
9c28700aaf panfrost: Disable tiling for GLOBAL resources
It doesn't make sense to have nonlinear layouts for a buffer that can be
accessed as direct memory for a compute kernel. Turn that off so things
work as expected.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-19 06:22:31 +00:00
Alyssa Rosenzweig
21dd7574a8 panfrost: Pass kernel inputs as uniforms
We can take the OpenCL kernel inputs and interpret them as uniforms by
simply reusing the Gallium callback.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-19 06:22:31 +00:00
Alyssa Rosenzweig
a7b5dd1290 panfrost: Stub out clover callbacks
We don't implement these yet but let's not crash.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-19 06:22:31 +00:00
Miguel Casas-Sanchez
b196958574 i965: Ensure that all 2101010 image imports can pass framebuffer completeness.
Chrome OS would like to import and render to any supported format that has
a corresponding display plane format, and this prevents throwing
framebuffer incomplete for FBOs using these textures.

See: crbug.com/949260

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-19 02:21:12 +00:00
Dave Airlie
1468a4f1f3 nir/serialize: fix serializing functions with no implementations.
Store a flag stating if there was an implmentation, and use
fxn->impl as a temporary flag between deserializsation stages.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-19 09:30:32 +10:00
Dave Airlie
0fd6b8aa98 nir/serialize: pack function has name and entry point into flags.
Suggested by Jason.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-19 09:30:12 +10:00
Jason Ekstrand
fc72df1d93 iris: Re-enable param compaction
In d1c4e64a69, we added a parameter to tell the back-end compiler to
ignore the param array and just push however many constants you ask it
to push.  I enabled it for iris because this is really what iris wants
but it seems to have caused a number of regressions.  Revert to the old
behavior for now.

Fixes: d1c4e64a69 "intel/compiler: Add a flag to avoid compacting..."
2019-11-18 16:54:07 -06:00
Marek Olšák
189c0cc45b mesa: enable glthread for 7 Days To Die
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-18 17:25:57 -05:00
Iván Briano
ca94717035 intel/compiler: Don't change hstride if not needed
Alignment requirements may have changed the horizontal stride already,
so don't set it if not required to avoid breaking said requirements.

Fixes several tests such as
dEQP-VK.subgroups.vote.graphics.subgroupallequal_int8_t

Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-18 14:19:41 -08:00
Jonathan Marek
3cd44839fa turnip: add x11 wsi
Copied from radv

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-18 22:18:05 +00:00
Jonathan Marek
df9f2adfa3 turnip: add display wsi
Copied from radv (minus the fence change)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-18 22:18:05 +00:00
Jason Ekstrand
7260df5894 nir: Validate that variables are in the right lists
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-18 16:15:30 -06:00
Jonathan Marek
e2b9d6277e etnaviv: blt: set TS dirty after clear
RS engine does this already, it is missing for BLT engine. This fixes
cases where a clear isn't immediately at the start of the frame.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-18 20:59:02 +01:00
Jonathan Marek
d819d4b344 etnaviv: separate PE and RS formats, use only RS only for tiling
There are PE formats not supported by RS, so we can't have a single
to translate both.

Use RS only for same formats until we have a translate_rs_format and test
the possible different format blits.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-18 20:58:14 +01:00
Jonathan Marek
e1a86bd634 etnaviv: blt: use only for tiling, and add missing formats
* Removes the incorrect usage of translate_rs_format
* Disables use of BLT engine for different src/dst format

We only really need the BLT engine for tiling/detiling right now, but it
would be nice to support as many blit cases as possible to avoid using PE
for that.

To deal with different formats we need to:
 * Have a translate_blt_format which has all supported formats
 * Fix the swizzle translation from gallium (current version was wrong)
 * Set the src/dst sRGB bits as needed
 * Find which type conversions the BLT engine can actually do

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-11-18 20:57:40 +01:00
Brian Paul
02c3dad0f3 Call shmget() with permission 0600 instead of 0777
A security advisory (TALOS-2019-0857/CVE-2019-5068) found that
creating shared memory regions with permission mode 0777 could allow
any user to access that memory.  Several Mesa drivers use shared-
memory XImages to implement back buffers for improved performance.

This path changes the shmget() calls to use 0600 (user r/w).

Tested with legacy Xlib driver and llvmpipe.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-18 12:28:59 -07:00
Jason Ekstrand
fdaf8144a8 anv: Emit a NULL vertex for zero base_vertex/instance
If both are zero (the common case), we can emit a null vertex buffer
rather than emitting a vertex buffer with zeros in it.  The packing of
the VERTEX_BUFFER_STATE is faster because no relocation is emitted and
we can avoid creating the vertex buffer which means one less
anv_state_stream_alloc.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
bc9d7836bc anv: Use an anv_state for the next binding table
This is a bit more natural because we're already getting an anv_state
most places in the pipeline.  The important part here, however, is that
we're no longer calling anv_block_pool_map on every alloc_binding_table
call.  While it's probably pretty cheap, it is potentially a linear walk
over the list of BOs and it was showing up in profiles.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
98dc179c1e anv: More carefully dirty state in BindPipeline
Instead of blindly dirtying descriptors and push constants the moment we
see a pipeline change, check to see if it actually changes the bind
layout or push constant layout.  This doubles the runtime performance of
one CPU-limited example running with the Dawn WebGPU implementation when
running on my laptop.

NOTE: This effectively reverts beca63c6c0.  While it was a nice
optimization, it was based on prog_data and we can't do that anymore
once we start allowing the same binding table to be used with multiple
different pipelines.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
22f16ff54a anv: More carefully dirty state in BindDescriptorSets
Instead of dirtying all graphics or all compute based on binding point,
we're now much more careful.  We first check to see if the actual
descriptor set changed and then only dirty the stages used by that
descriptor set.  For dynamic offsets, we keep a bitfield per-stage of
which offsets are actually used in that stage and we only dirty push
constants and descriptors if that stage has dynamic offsets AND those
offsets actually change.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
ca8117b5d5 anv: Use a switch statement for binding table setup
It theoretically could be more efficient but the real point here is that
it's no longer really a matter of dealing with special cases and then
the "real" thing.  The way we're handling binding tables, it's more of a
multi-step process and a switch is more natural.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
9baa33cef0 anv: Rework push constant handling
This substantially reworks both the state setup side of push constant
handling and the pipeline compile side.  The fundamental change here is
that we're no longer respecting the prog_data::param array and instead
are just instructing the back-end compiler to leave the array alone.
This makes the state setup side substantially simpler because we can now
just memcpy the whole block of push constants and don't have to
upload one DWORD at a time.

This also means that we can compute the full push constant layout
up-front and just trust the back-end compiler to not mess with it.
Maybe one day we'll decide that the back-end compiler can do useful
things there again but for now, this is functionally no different from
what we had before this commit and makes the NIR handling cleaner.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
ca91ab8015 anv: Re-arrange push constant data a bit
This moves the compute stuff into a anv_push_constants::cs sub-struct.
It also moves dynamic offsets into the push constants.  This means we
have to duplicate the data per-stage but that doesn't seem like the end
of the world and one day we may wish to make dynamic offsets per-stage
anyway.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
d1c4e64a69 intel/compiler: Add a flag to avoid compacting push constants
In vec4, we can just not run the pass.  In fs, things are a bit more
deeply intertwined.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
aecde23519 anv: Pre-compute push ranges for graphics pipelines
It turns off that emitting push constants is one of the hottest paths in
the driver and ANY work we do there costs us.  By pre-computing things a
bit ahead of time, we shave 5% off the runtime of a CPU-limited example
running with the Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
4b392ced2d anv: Stop bounds-checking pushed UBOs
The bounds checking is actually less safe than just pushing the data.
If the bounds checking actually ever kicks in and it's not on the last
UBO push range, then the shrinking will cause all subsequent ranges to
be pushed to the wrong place in the GRF.  One of the behaviors we
definitely don't want is for OOB UBO access to result in completely
unrelated UBOs returning garbage values.  It's safer to just push the
UBOs as-requested.  If we're really concerned about robustness, we can
emit shader code to do bounds checking which should be stupid cheap (a
CMP followed by SEL).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
ebad00d9e7 anv: Delete dead shader constant pushing code
As of 2d78e55a8c, nir_intrinsic_load_constant with a constant offset
is constant-folded so we should never end up with any that trigger
brw_nir_analyze_ubo_ranges.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
0709c0f6b4 anv: Flatten descriptor bindings in anv_nir_apply_pipeline_layout
This lets us stop tracking the pipeline layout.  It also means less
indirection on a very hot path.  As an extra bonus, we can make some of
our data structures smaller.  No measurable CPU overhead improvement.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
fa120cb31c anv: Input attachments are always single-plane
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
0a02f2a278 genxml: Mark everything in genX_pack.h always_inline
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Jason Ekstrand
abfd4651ed anv/pipeline: Assume layout != NULL
In the early days of the driver we allowed layout to be VK_NULL_HANDLE
and used that for some internal pipelines when we wanted to be lazy.
Vulkan doesn't actually allow NULL layouts, however, so there's no
reason to have this check.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-18 18:35:14 +00:00
Italo Nicola
59623f211b intel/compiler: remove old comment
This comment was correct some time ago, but since commit
d3c10ad427, it isn't true anymore.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-11-18 10:20:34 -08:00
Alyssa Rosenzweig
3663340049 pan/midgard: Use shader stage in mir_op_computes_derivative
A 'normal' texture op may be emitted in a vertex shader on T720 but it
still doesn't take any derivatives.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-18 08:48:54 -05:00
Danylo Piliaiev
6f17fe0606 i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround
Re-emitting 3DSTATE_CC_STATE_POINTERS after emitting
3DSTATE_BLEND_STATE_POINTERS fixes the shadow flickering in
SuperTuxCart and Tropico 6 which was seen only on Haswell.
The reason for this is unknown and fix was found empirically.

The closest mention in PRM is that it should improve performance.
From the HSW PRM, volume 2b, page 823 (3DSTATE_BLEND_STATE_POINTERS):
 "When the BLEND_STATE pointer changes but not the CC_STATE pointer,
  driver needs to force a CC_STATE pointer change to improve
  blend performance in pixel backend."

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1834
Fixes: eca4a654 ("i965: Disable dual source blending when shader doesn't support it on gen8+")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-11-18 11:00:23 +02:00
Samuel Pitoiset
1ebd9459e7 radv: implement VK_AMD_device_coherent_memory
This extension adds the device coherent and device uncached memory
types. It's known to be slower than non-device coherent memory but
it might be useful for debugging.

This is only exposed for chips that support L2 uncached.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-18 08:20:19 +00:00
Samuel Pitoiset
2af7511ed2 ac: add radeon_info::has_l2_uncached
For chips that have uncached device memory (ie. MTYPE_UC).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-18 08:20:19 +00:00
Pierre-Eric Pelloux-Prayer
3c9ea6bdfd radeonsi: enable mesa_glthread for GfxBench
It improves offscreen tests performance.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-18 09:16:18 +01:00
Alyssa Rosenzweig
bc9a7d0699 pan/midgard: Represent ld/st offset unpacked
This simplifies manipulation of the offsets dramatically, fixing some
UBO access related bugs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-17 22:19:31 -05:00
Alyssa Rosenzweig
1798f6bfc3 pan/midgard: Fix masks/alignment for 64-bit loads
These need to be handled with special care.

Oh, Midgard, you're *extra* special.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-17 22:19:31 -05:00
Alyssa Rosenzweig
34a860b9e3 pan/midgard: Expose more typesize helpers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-17 21:30:14 -05:00
Alyssa Rosenzweig
2236904f72 pan/midgard: Implement non-aligned UBOs
The field is more fine-grained than we had assumed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-17 21:18:45 -05:00
Christian Gmeiner
ee3ad0fad2 etnaviv: rs: upsampling is not supported
This change makes it possible to support different downsample cases
like 4 -> 2 or 4 -> 1.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-11-17 18:42:31 +00:00
Jonathan Marek
75e58d1fae freedreno/registers: fix a6xx_2d_blit_cntl ROTATE
A change from b7093882 got overwritten by 610c8c93

Fixes: 610c8c93 ("freedreno/registers: Update with GS, HS and DS registers")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-17 17:40:53 +00:00
Jonathan Marek
0f5743429c freedreno/ir3: disable texture prefetch for 1d array textures
Prefetch only supports the basic 2D texture case, checking is_array is
needed because 1d array textures pass the coord num_components==2 test.

Fixes: 2a0d45ae ("freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-17 17:01:18 +00:00
Andreas Baierl
ef9635d0bc lima: Parse VS and PLBU command stream while making a dump
This makes the streams more readable and comparable with the blob's parser
as it parses the VS and PLBU stream and shows the currently known values.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
2019-11-17 05:39:17 +00:00
Andreas Baierl
c76eb7ea84 lima: Beautify stream dumps
Change the dump, that the output looks more like the output of
mali-syscall-tracker [1].
This is a preparation for a more detailed stream analysis.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>

[1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker
2019-11-17 05:39:17 +00:00
Aaron Watry
3b3494174d clover/llvm: fix build after llvm 10 commit 1dfede3122ee
CodeGenFileType moved from ::llvm::TargetMachine in
llvm/Target/TargetMachine.h to ::llvm:: in llvm/Support/CodeGen.h

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-11-15 22:54:31 -06:00
Mauro Rossi
09ab297e9f android: util/format: fix include path list
To avoid following building error:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_util_intermediates/format/u_format_table.c:30:10:
fatal error: 'u_format.h' file not found
         ^~~~~~~~~~~~
1 error generated.

Fixes: 882ca6d ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-11-16 00:06:31 +01:00
Mauro Rossi
3cd522c70a android: radeonsi: fix build error due to wrong u_format.csv file path
GEN10_FORMAT_TABLE_INPUTS requires correction of u_format.csv file path
in order to avoid following build error:

ninja: error: 'external/mesa/util/format/u_format.csv',
needed by 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/gfx10_format_table.h',
missing and no known rule to make it

Fixes: 882ca6d ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-11-15 23:20:03 +01:00
Eric Anholt
b30589cbd3 mesa/st: Reuse st_choose_matching_format from st_choose_format().
We had this ad-hoc exact size matching for unsized internalformats,
but st_choose_matching_format() can do exactly what we want.  This
means, that, for example, we'll now prefer the matching ordering for
565/565_REV if the driver supports both orders.  We also pass
Unpack.SwapBytes through from ChooseTextureFormat so that we can hit
the memcpy path for 8888 formats when that flag is set.

Some interesting format choice changes from this (on softpipe):
intf/form/type        before            after
----------------------------------------------------
RGBA/RGBA/USHORT:     R8G8B8A8_UNORM -> RGBA_UNORM16
RGB/RGBA/8888:        X8B8G8R8_UNORM -> R8G8B8X8_UNORM
RGB/ABGR/8888_REV:    X8B8G8R8_UNORM -> R8G8B8X8_UNORM
RGBA/RGBA/5551:       B5G5R5A1_UNORM -> A1B5G5R5_UNORM
RGBA/RGBA/4444:       R8G8B8A8_UNORM -> A4B4G4R4_UNORM
RGBA/GL_RGBA/1010102: R8G8B8A8_UNORM -> A2B10G10R10_UNORM
DEPTH/DEPTH/UINT:     Z24X8          -> Z_UNORM32
DEPTH/DEPTH/USHORT:   Z24X8          -> Z_UNORM16

v2: Make sure that the baseformat still matches.  v1 would pick
    MESA_FORMAT_L16_UNORM for RED/LUMINANCE/SHORT, when we clearly
    want a red format.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-11-15 20:32:17 +00:00
Eric Anholt
bc2b14a4a3 mesa: Don't put sRGB formats in the array format table.
sRGB vs unorm was the only conflict case being guarded against in this
function.  Before the PIPE_FORMAT conversion, we always listed the
unorm before the sRGB in the enums, but PIPE_FORMAT_A8B8G8R8_SRGB
happens to be before _UNORM.  We always want the unorm result here.

Fixes: 807a800d8c ("mesa: Redefine MESA_FORMAT_* in terms of PIPE_FORMAT_*.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-11-15 20:32:17 +00:00
Eric Anholt
e5b06008f1 mesa/st: Simplify st_choose_matching_format().
We now have a nice helper function for finding those memcpy formats,
without needing to go through each entry of the mesa format table to
see if it happens to match.

While looking at sysprof of a softpipe GLES2 CTS run, we were spending
~8% of the CPU on ChooseTextureFormat.  With this, roughly the same
region of the testsuite was .4%.

v2: Add Ken's fix for canonicalizing array formats.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-11-15 20:32:17 +00:00
Kenneth Graunke
69f109cc37 mesa: Handle GL_COLOR_INDEX in _mesa_format_from_format_and_type().
Just return MESA_FORMAT_NONE to avoid triggering unreachable; there's
really no sensible thing to return for this case anyway.

This prevents regressions in the next commit, which makes st/mesa
start using this function to find a reasonable format from GL format
and type enums.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-15 20:32:17 +00:00
Alyssa Rosenzweig
ea232c7cfd pan/midgard: Use generic constant packing for 8/64-bit
Eventually, we will want to combine constants across types, but for now
let's not break the world.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
4c182a6d11 pan/midgard: Pack 64-bit swizzles
64-bit ops have their own funky swizzles. Let's pack them, both for
native 64-bit sources as well as extended 32-bit sources.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
ba2fb98d36 pan/midgard: Fix mir_round_bytemask_down for !32b
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
2655a300a3 pan/midgard: Implement i2i64 and u2u64
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
855eec93b1 pan/midgard: Expand 64-bit writemasks
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Marek Olšák
bda3ec5d55 radeonsi/nir: don't lower fma, instead, fuse fma
We want fma. This decreases compile times by 4% for Borderlands 2.

48505 shaders in 30515 tests
Totals:
SGPRS: 2206584 -> 2204784 (-0.08 %)
VGPRS: 1647892 -> 1648964 (0.07 %)
Spilled SGPRs: 6256 -> 6078 (-2.85 %)
Spilled VGPRs: 72 -> 72 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2240 -> 2240 (0.00 %) dwords per thread
Code Size: 49680804 -> 49837988 (0.32 %) bytes
LDS: 74 -> 74 (0.00 %) blocks
Max Waves: 371387 -> 371352 (-0.01 %)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-15 14:34:49 -05:00
Marek Olšák
dec34e880d radeonsi/nir: call nir_lower_flrp only once per shader
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-15 14:34:49 -05:00
Marek Olšák
0714b3d57e radeonsi/nir: remove dead function temps
glxgears has dead temps after lowering color inputs to load intrinsics.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-15 14:34:49 -05:00
Marek Olšák
bc5097a7d9 gallium/noop: call finalize_nir
For measuring st/mesa compile time.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-15 14:34:49 -05:00
Tomeu Vizoso
27801b90fa panfrost: Make sure the shader descriptor is in sync with the GL state
State was leaking from previous frames as we weren't updating the
descriptor in all cases.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:34 +00:00
Alyssa Rosenzweig
095654e3c2 pan/midgard: Prioritize texture registers
On newer GPUs, this is a no-op. On older GPUs, this prevents needless
spilling since texture registers are shared with a subset of work
registers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:34 +00:00
Alyssa Rosenzweig
339401b53c pan/midgard: Disassemble with old pipeline always on T720
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig
8344d7425b pan/midgard: Use texture, not textureLod, on early Midgard
We have to disable the fixup.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig
29f5b00e6e pan/midgard: Fix vertex texturing on early Midgard
We use a different set of texture registers, probably to save hardware.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig
3866d0776f pan/midgard: Generalize texture registers across GPUs
Early Midgard uses a different set of texture registers; let's not
hardcode.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:33 +00:00
Rhys Perry
df645fa369 aco: implement VK_KHR_shader_float_controls
This actually supports more of the extension than the LLVM backend but we
can't enable it because ACO doesn't work with all stages yet.

With more of it enabled, some CTS tests fail because our 64-bit sqrt
is very imprecise. I can't find any precision requirements for it
anywhere, so I'm thinking it might be a CTS issue.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-15 17:36:21 +00:00
Rhys Perry
be1d11249b aco: fix 64-bit fsign with 0
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-15 17:36:21 +00:00
Rhys Perry
b062b92ab1 aco: don't combine literals into v_cndmask_b32/v_subb/v_addc
No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-15 17:36:21 +00:00
Rhys Perry
d7b0d9a8d8 radv: enable FP16/FP64 denormals earlier and only for LLVM
ACO sets this itself and will have to set it differently in the future to
support shaderDenormFlushToZeroFloat64.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-15 17:36:21 +00:00
Michel Dänzer
c6c7652753 gitlab-ci: Organize images using new REPO_SUFFIX templates feature
Two benefits:

Most docker image related environment variables can now be defined in
the jobs where they're used instead of globally. The DEBIAN_TAG values
are propagated to other jobs via YAML anchors.

Images on https://gitlab.freedesktop.org/mesa/mesa/container_registry
are now organized in separate repositories with a suffix matching the
name of the job which makes sure the image is there.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-15 16:23:22 +01:00
Michel Dänzer
506e9d5fc7 gitlab-ci: Rename container install scripts to match job names (better)
Cleans up .gitlab-ci/ a little, and allows using a single DEBIAN_EXEC
line for all container jobs.

v2:
* Use lava_arm.sh instead of arm_lava.sh for consistency with v2 of the
  previous change

Reviewed-by: Eric Anholt <eric@anholt.net> # v1
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-15 16:21:10 +01:00
Michel Dänzer
3a48f4565e gitlab-ci: Use functional container job names
This makes it easier to tell which job is which in a pipeline.

v2:
* Use lava_arm{64,hf} instead of arm{64,hf}_lava to keep these jobs
  together in pipeline overviews

Reviewed-by: Eric Anholt <eric@anholt.net> # v1
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-15 16:20:16 +01:00
Michel Dänzer
670277846d gitlab-ci: Document that ci-templates refs must be in sync
Otherwise there can be weird breakage.

(Removing the include from .gitlab-ci/lava-gitlab-ci.yml doesn't seem
possible unfortunately:
https://gitlab.freedesktop.org/daenzer/mesa/pipelines/79458)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-15 16:06:54 +01:00
Tomeu Vizoso
7d24cef200 panfrost: Multiply offset_units by 2
Per the spec, the units passed to glPolygonOffset are to be multiplied
by an implementation-defined constant.

On Midgard, this constant seems to be 2.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 14:45:29 +01:00
Lionel Landwerlin
c061185e17 intel/perf: add EHL performance query support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-11-15 13:14:30 +00:00
Lionel Landwerlin
39fd11a9f8 intel/dev: flag the Elkhart Lake platform
We'll use this for performance metrics which are different from ICL.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-11-15 13:14:30 +00:00
Tapani Pälli
7a893a0d57 gitlab-ci: update Piglit commit, update skips
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-15 12:06:15 +02:00
Tapani Pälli
1d970f15e2 mesa: allow bit queries for EXT_disjoint_timer_query
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-15 12:05:56 +02:00
Samuel Pitoiset
41a1152cdc radv: make sure to not clear the ds attachment after resolves
To not overwrite the resolve if there is pending clear aspects,
same as color resolves.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-15 09:36:43 +01:00
Samuel Pitoiset
519d9b30de radv: remove useless RADV_DEBUG=unsafemath debug option
This option is useless and shouldn't be used at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-15 09:07:34 +01:00
Nathan Kidd
9a80b7fd8f llvmpipe: Check thread creation errors
In the case of glibc, pthread_t is internally a pointer.  If
lp_rast_destroy() passes a 0-value pthread_t to pthread_join(), the
latter will SEGV dereferencing it.

pthread_create() can fail if either the user's ulimit -u or Linux
kernel's /proc/sys/kernel/threads-max is reached.

Choosing to continue, rather than fail, on theory that it is better to
run with the one main thread, than not run at all.

Keeping as many threads as we got, since lack of threads severely
degrades llvmpipe performance.

Signed-off-by: Nathan Kidd <nkidd@opentext.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-11-15 02:43:22 +01:00
Ben Crocker
9c3be6d21f llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders
Large programs, e.g. gnome-shell and firefox, may tax the
addressability of the Medium code model once a (potentially unbounded)
number of dynamically generated JIT-compiled shader programs are
linked in and relocated.  Yet the default code model as of LLVM 8 is
Medium or even Small.

The cost of changing from Medium to Large is negligible:
- an additional 8-byte pointer stored immediately before the shader entrypoint;
- change an add-immediate (addis) instruction to a load (ld).

Testing with WebGL Conformance 
(https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html)
yields clean runs with this change (and crashes without it).

Testing with glxgears shows no detectable performance difference.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/223

Co-authored by: Nemanja Ivanovic <nemanjai@ca.ibm.com>, Tom Stellard <tstellar@redhat.com>

CC: mesa-stable@lists.freedesktop.org

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
2019-11-14 23:07:26 +00:00
Kenneth Graunke
4242c57227 iris: Wrap iris_fix_edge_flags in NIR_PASS
So nir_validate happens properly.  Unfortunately this means we have
to play the metadata song and dance, so walk over all impls and say
that we didn't hurt anything.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-14 14:50:11 -08:00
Kenneth Graunke
39c23fd1bb iris: Properly move edgeflag_out from output list to global list
When demoting it from an output to a global, we need to actually move
it to the correct list.  While here, we also refactor so it's clear
we aren't mutating the list while iterating.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2106
Fixes: f9fd04aca1 ("nir: Fix non-determinism in lower_global_vars_to_local")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-14 14:50:09 -08:00
Eric Anholt
790d0ebef3 mesa: Move compile of common Mesa core files to a static lib.
We were compiling them twice, costing extra build time.  Reduces my
ccache-hot clean build time by a second (24.3s to 23.3s, 3 runs each).

The windows args are a little strange -- it's not clear to me that
they're actually used for building these files, but keep them in place
just in case, since we don't have a good windows CI story yet.  We
should want them on both gallium and classic regardless: Only osmesa
could be built for windows in classic, and classic OSMesa's scons
build defines these flags too.

Closes: #2052
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-11-14 21:46:10 +00:00
Prodea Alexandru-Liviu
cc758f1224 Appveyor: Quickly fix meson build.
As this required use of Python 3.8, mako module also had to be updated.

v2 - Unbind mako module version when using Meson.
Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-11-14 21:45:23 +00:00
Danylo Piliaiev
0904ee0c60 intel/fs: Do not lower large local arrays to scratch on gen7
On gen7 and earlier the scratch space size is limited to 12kB.
By enabling this optimization we may easily exceed this limit
without having any fallback.

arb_compute_shader/linker/bug-93840.shader_test crashes with
this lowering on IVB due to exceeding scratch size limit.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2092
Fixes: 69244fc7
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-14 20:08:30 +00:00
Eric Anholt
882ca6dfb0 util: Move gallium's PIPE_FORMAT utils to /util/format/
To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to
move their helpers out of gallium.  Since u_format used
util_copy_rect(), I moved that in there, too.

I've put it in a separate directory in util/ because it's a big chunk
of related code, and it's not clear to me whether we might want it as
a separate library from libmesa_util at some point.

Closes: #1905
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 10:47:20 -08:00
Eric Engestrom
ac78ca4b39 gitlab-ci: auto-cancel CI runs when a newer commit is pushed to the same branch
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-14 17:32:31 +00:00
Timur Kristóf
9b8dc6929e aco: Optimize out trivial code from uniform bools.
This should remove most of the excess code size that was
introduced by making all booleans per-lane.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-14 17:27:11 +01:00
Timur Kristóf
8995c0b30a aco: Treat all booleans as per-lane.
Previously, instruction selection had two kinds of booleans:
1. divergent which was per-lane and stored in s2 (VCC size)
2. uniform which was stored in s1
Additionally, uniform booleans were made per-lane when they resulted
from operations which were supported only by the VALU.

To decide which type was used, we relied on the destination size,
which was not reliable due to the per-lane uniform bools, but it
mostly works on wave64.
However, in wave32 mode (where VCC is also s1) this approach
makes it impossible keep track of which boolean is uniform and
which is divergent.

This commit makes all booleans per-lane.
The resulting excess code size will be taken care of by the optimizer.

v2 (by Daniel Schürmann):
- Better names for some functions
- Use s_andn2_b64 with exec for nir_op_inot
- Simplify code due to using s_and_b64 in bool_to_scalar_condition

v3 (by Timur Kristóf):
- Fix several subgroups regressions

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-14 17:27:11 +01:00
Daniel Schürmann
a1622c1a11 aco: use s_and_b64 exec to reduce uniform booleans to one bit
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-14 17:27:10 +01:00
Timur Kristóf
94e355148f aco: Make sure not to mistakenly propagate 64-bit constants.
ACO's optimizer would try to propagate 64-bit constants, but
does so in such a way that wouldn't work due to how the 64-bit
constants are handled in the IR.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-14 17:27:10 +01:00
Daniel Schürmann
9d3e070524 aco: value number instructions using the execution mask
This patch tries to give instructions with the same execution
mask also the same pass_flags and enables VN for SALU instructions
using exec as Operand.
This patch also adds back VN for VOPC instructions and removes VN for phis.

v2 (by Timur Kristóf):
- Fix some regressions.
v3 (by Daniel Schürmann):
- Fix additional issues

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-14 17:27:10 +01:00
Daniel Schürmann
8657eede8a aco: check if SALU instructions are predeceeded by exec when calculating WQM needs
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-14 17:27:10 +01:00
Samuel Pitoiset
ee9811a0bb ac: fix build with recent LLVM
Build is broken since "Move CodeGenFileType enum to Support/CodeGen.h".

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-11-14 14:41:55 +00:00
Tapani Pälli
94cb4916e3 Revert "mesa: allow bit queries for EXT_disjoint_timer_query"
This reverts commit 66d24a9ef7.

This commit made Mesa CI red because commit depends on a Piglit test
change.
2019-11-14 13:34:33 +00:00
Connor Abbott
f9fd04aca1 nir: Fix non-determinism in lower_global_vars_to_local
Using a hash-table walk means that variables will get inserted in
different orders on different runs. Just walk the list of globals
instead, even if some of them can't be turned into locals.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-14 13:10:58 +00:00
Iago Toral Quiroga
f512965b0b mesa/st: make sure we remove dead IO variables before handing NIR to backends
Commit "1c2bf82d24a glsl: disable lower_fragdata_array() for NIR drivers"
disabled the GLSL IR lowering that turned gl_FragData from an array into a
collection of scalar outputs under the assumption that this was already being
handled properly elsewhere, however there are some corner cases where NIR
would fail to do this, leaving gl_FragData[] as an array variable. This can
break backends that assume that all their outputs will be scalar and use the
variable definitions from the shader to do their output setup, such as the
case of V3D.

At least one corner case was found in some Portal shaders from shader-db, where
NIR would optimize out the full body of a fragment shader. In this scenario,
the empty shader would keep the original array definition of gl_FragData[],
causing the backend to assert.

We need to do this late enough for it to be effective, since doing it in
st_nir_preprocess does not fix the original problem.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2091
Fixes: 1c2bf82d ("glsl: disable lower_fragdata_array() for NIR drivers")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-14 10:49:00 +01:00
Tapani Pälli
66d24a9ef7 mesa: allow bit queries for EXT_disjoint_timer_query
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-14 09:27:13 +02:00
Tapani Pälli
1a093a06d6 Revert "dri_interface: add interface for EGL_EXT_image_flush_external"
This reverts commit 7520478461.

This series caused unexpected flickering artifacts with Iris driver on
Chrome OS and EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-14 07:46:36 +02:00
Tapani Pälli
7951eb146c Revert "st/dri: assume external consumers of back buffers can write to the buffers"
This reverts commit 1d1b457821.

This series caused unexpected flickering artifacts with Iris driver on
Chrome OS and EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-14 07:46:30 +02:00
Tapani Pälli
25f596e6ba Revert "st/dri: add support for EGL_EXT_image_flush_external"
This reverts commit 1d122c104a.

This series caused unexpected flickering artifacts with Iris driver on
Chrome OS and EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-14 07:46:20 +02:00
Tapani Pälli
ff05f16c99 Revert "egl: handle EGL_IMAGE_EXTERNAL_FLUSH_EXT"
This reverts commit 34b1aa957a.

This series caused unexpected flickering artifacts with Iris driver on
Chrome OS and EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-14 07:46:14 +02:00
Tapani Pälli
e64b91e34a Revert "egl: implement new functions from EGL_EXT_image_flush_external"
This reverts commit c1c574fdf1.

This series caused unexpected flickering artifacts with Iris driver on
Chrome OS and EGL_EXT_image_flush_external spec has not been published
yet.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-14 07:46:04 +02:00
Alyssa Rosenzweig
ad6b2ac374 pan/midgard: Fix copypropagation for textures
total instructions in shared programs: 3562 -> 3457 (-2.95%)
instructions in affected programs: 575 -> 470 (-18.26%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 6.56 x̃: 10
helped stats (rel) min: 5.71% max: 24.56% x̄: 16.83% x̃: 18.87%
95% mean confidence interval for instructions value: -9.07 -4.06
95% mean confidence interval for instructions %-change: -19.00% -14.66%
Instructions are helped.

total bundles in shared programs: 1846 -> 1830 (-0.87%)
bundles in affected programs: 338 -> 322 (-4.73%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.50% max: 20.00% x̄: 8.85% x̃: 3.33%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -13.02% -4.67%
Bundles are helped.

total quadwords in shared programs: 3191 -> 3144 (-1.47%)
quadwords in affected programs: 606 -> 559 (-7.76%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 2.94 x̃: 3
helped stats (rel) min: 5.17% max: 22.22% x̄: 11.20% x̃: 5.62%
95% mean confidence interval for quadwords value: -4.58 -1.29
95% mean confidence interval for quadwords %-change: -15.16% -7.24%
Quadwords are helped.

total registers in shared programs: 312 -> 303 (-2.88%)
registers in affected programs: 27 -> 18 (-33.33%)
helped: 9
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%
95% mean confidence interval for registers value: -1.00 -1.00
95% mean confidence interval for registers %-change: -33.33% -33.33%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig
f72873e6aa pan/midgard: Copypropagate vector creation
total instructions in shared programs: 3457 -> 3431 (-0.75%)
instructions in affected programs: 787 -> 761 (-3.30%)
helped: 14
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 1.86 x̃: 1
helped stats (rel) min: 1.01% max: 11.11% x̄: 9.22% x̃: 11.11%
95% mean confidence interval for instructions value: -3.55 -0.16
95% mean confidence interval for instructions %-change: -11.41% -7.03%
Instructions are helped.

total bundles in shared programs: 1830 -> 1826 (-0.22%)
bundles in affected programs: 279 -> 275 (-1.43%)
helped: 2
HURT: 0

total quadwords in shared programs: 3144 -> 3121 (-0.73%)
quadwords in affected programs: 645 -> 622 (-3.57%)
helped: 13
HURT: 0
helped stats (abs) min: 1 max: 11 x̄: 1.77 x̃: 1
helped stats (rel) min: 2.09% max: 16.67% x̄: 12.61% x̃: 14.29%
95% mean confidence interval for quadwords value: -3.45 -0.09
95% mean confidence interval for quadwords %-change: -15.43% -9.79%
Quadwords are helped.

total registers in shared programs: 303 -> 301 (-0.66%)
registers in affected programs: 14 -> 12 (-14.29%)
helped: 2
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig
39b5f2fa0b pan/lcra: Use Chaitin's spilling heuristic
Not much of a difference but slightly better and slightly less
arbitrary.

total instructions in shared programs: 3560 -> 3559 (-0.03%)
instructions in affected programs: 44 -> 43 (-2.27%)
helped: 1
HURT: 0

total bundles in shared programs: 1844 -> 1843 (-0.05%)
bundles in affected programs: 23 -> 22 (-4.35%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig
23c83f3f05 pan/midgard: Compute spill costs
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 02:36:21 +00:00
Paulo Zanoni
eb6352162d intel/compiler: fix nir_op_{i,u}*32 on ICL
On ICL we have the src1 restriction which is applied through
fix_byte_src() and potentially changes the type of the operands from 8
to 32 bits. When this change happens, we fall into the "else if
(bit_size < 32)" case and miscompute src_type because it takes into
consideration bit_size (8) instead of the adjusted size of temp_op
(32). This results in the shader reading unused memory, giving us
mostly failures, but occasional passes due to whatever was already in
the registers we were reading.

This commit fixes a lot of dEQP subgroup i8vec2 tests on ICL, such as:
    dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2

This can also be verified by simply changing fix_byte_src() to apply
on all platforms.

Fixes: 5847de6e9a ("intel/compiler: don't use byte operands for src1 on ICL")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-11-13 22:13:52 +00:00
Caio Marcelo de Oliveira Filho
7ae506e5b8 spirv: Consider the sampled_image case in wa_glslang_179 workaround
Fixes: 9e440b8d0b ("spirv: Sort out the mess that is sampled image")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-13 12:02:29 -08:00
Dylan Baker
943f630f8e docs: update calendar, add news item and link release notes for 19.2.4 2019-11-13 11:12:53 -08:00
Dylan Baker
ff5bcd7ce9 docs: Add SHA256 sum for for 19.2.4 2019-11-13 11:10:51 -08:00
Dylan Baker
67fd2b936d docs: Add release notes for 19.2.4 2019-11-13 11:10:47 -08:00
Eric Anholt
f0eeb98c6c ci: Expand the freedreno blit skip regex to cover more cases.
We've had flaps on at least:
- r16f_to_r16f
- r16i_to_rg16i

Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-11-13 10:58:52 -08:00
Caio Marcelo de Oliveira Filho
0aaf47f7cd anv: Initialize depth_bounds_test_enable when not explicitly set
This was causing uninitialized value to end up propagated to the
3DSTATE_DEPTH_BOUNDS packet, leading to asserts on packet
building due to the value being greater than 1.

Fixes: 939ddccb7a ("anv: Add support for depth bounds testing.")
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2019-11-13 10:13:27 -08:00
Alyssa Rosenzweig
771d23584a pan/midgard: Remove util/ra support
It's now unused, in favour of LCRA.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
e343f2ceb9 pan/midgard: Integrate LCRA
Pretty routine, we do have a hack to force swizzle alignment for !32-bit
for until we implement !32-bit the right way.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
66ad64d73d pan/midgard: Implement linearly-constrained register allocation
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
fd81916ee5 pan/midgard: Add blend shader selection bits for MRT
This is less complicated than previously thought. Note we have no way of
specifying the work register count for blend shaders; it must be
strictly less than the work register count of the corresponding fragment
shader (which is fine since we force the fragment shader to report a
count of 16 with a blend shader as a major hack until we get register
pressure down for blend shaders).

TODO: pandecode the flags.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Christian Gmeiner
e101af8671 drm-shim: fix EOF case
Close input end of the pipe after data was written. Without this
fix I have seen a hang in sysfs_uevent_get(.., "OF_FULLNAME")
when key was not found.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-13 12:39:14 +00:00
Tapani Pälli
b12911c88e util/android: fix android build errors
Fixes: 9020f519 ("util/u_endian: Add error checks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2078
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-13 12:31:31 +02:00
Samuel Pitoiset
47ba227448 gitlab-ci: build RADV on ARM64
The ARMHF LLVM package is LLVM 7 but RADV requires LLVM 8.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-13 10:52:10 +01:00
Samuel Pitoiset
cb19f69ff0 gitlab-ci: build a specific libdrm version for ARM64
RADV requires libdrm-2.4.100 but the distrib package is too old.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-13 10:52:08 +01:00
Erik Faye-Lund
4c1cef68cf zink: move drawing separate source
This code is kinda stand-alone, and it makes it a bit easier to find the
right source in the source-tree.
2019-11-13 09:14:05 +01:00
Erik Faye-Lund
589e8651e6 zink: move blitting to separate source
This code is kinda stand-alone, and it makes it a bit easier to find the
right source in the source-tree
2019-11-13 09:14:05 +01:00
Erik Faye-Lund
1605a0c8f2 zink: move filter-helper to separate helper-header
This will help code-reuse a bit in the next commit.
2019-11-13 09:12:36 +01:00
Erik Faye-Lund
36f3902213 zink: move format-checking to separate source
This code is more or less stand-alone, and this keeps the formats array
a bit more encapsulated.
2019-11-13 09:12:36 +01:00
Eric Anholt
fd777d2cea ci: Disable flappy blit tests on a630.
These have shown up with the new CTS runner, which has changed test
ordering.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-11-12 16:43:04 -08:00
Rob Clark
0f33c255d3 freedreno/ir3: remove unused parameter
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-12 13:57:52 -08:00
Rob Clark
df7a88dca3 freedreno/ir3: legalize cleanups
We can clear the "needs" flags once we emit a flag.  And also, don't
open-code the opcode name.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00
Rob Clark
b22617fb57 freedreno/ir3: fix gpu hang with pre-fs-tex-fetch
For pre-fs-dispatch texture fetch, we need to assign bary_ij to r0.x,
even if it is not used in the shader (ie. only varying use is for tex
coords).  But if, for example, gl_FragCoord is used, it could get
assigned on top of bary_ij, resulting in a GPU hang.

The solution to this is two-fold: (1) the inputs/outputs rework has the
benefit of making RA realize bary_ij is a vec2, even if there are no
split/collect instructions (due to no varying fetches in the shader
itself).  And (2) extend the live ranges of meta:input instructions to
the first non-input, to prevent RA from assigning the same register to
multiple inputs.

Backport note: because of (1) above, a better solution for 19.3 would be
to revert f30c256ec0.

Fixes: f30c256ec0 ("freedreno/ir3: enable pre-fs texture fetch for a6xx")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00
Rob Clark
4bb697d938 freedreno/ir3: only tex instructions have wrmask
At the ir3 level, we would assume that we could use wrmask to mask
off other components of an instruction returning a vecN when they are
not used.  Which would let RA use components not written for other live
values.  But this is only true for tex instructions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00
Rob Clark
bdf6b7018c freedreno/ir3: re-work shader inputs/outputs
Allow inputs/outputs to be vecN (ie. whatever their actual size is), and
use split to get scalar components of inputs, and collect to gather up
scalar components of outputs.

The main motivation is to simplify RA, by only having to consider split/
collect to figure out where values need to land in consecutive scalar
registers, rather than having to also deal with left/right neighbors.

Because of varying packing, and the resulting fractional location
(location_frac), to implement load_input/store_output, it is still
convenient to have a table of scalar inputs/outputs.  We move this to
the compile ctx (since it is only needed for nir->ir3).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00
Rob Clark
2aae13f642 freedreno/ir3: simplify creating sysval inputs
In almost all places, the add_sysval_input() is paired directly with a
create_input().  (The one exception is frag shader ij bary coord, and
this exception will go away in a later patch.)

So go ahead and clean this up before reworking input/output handling.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
68d2ec5f7e freedreno/ir3: remove first-vertex sysval
This is a driver-param (loaded from uniform), not a sysval (populated by
hw into a register).  So it has no value to having a sysval slot.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
7b2166785a freedreno/ir3: helper to print ir if debug enabled
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
7a5f073da3 freedreno/ir3: show input/output wrmask's in disasm
Currently it is always 0x1 (scalar), but that will change in a later
patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
c00a67171c freedreno/ir3: add input/output iterators
We can at least get rid of the if-not-NULL check in a bunch of places.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
b2417801e5 freedreno/ir3: remove impossible condition
We keep kill's alive w/ keeps these days, rather than a fake output.
This condition was left over from prior to that change.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
611258d578 freedreno/ir3: rename fanin/fanout to collect/split
If I'm going to refactor a bit to use these meta instructions to also
handle input/output, then might as well cleanup the names first.
Nouveau also uses collect/split for names of these meta instructions,
and I like those names better.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
4af86bd0b9 freedreno/ir3: remove half-precision output
This doesn't really work, we can't necessarily just change the outputs
to half-precision like this in anything but simple cases.

Keep the shader key entry around though, eventually with proper mediump
support we could use this with a nir pass to use lower precision frag
shader outputs when the render target format has <= 16b/component.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
089b105396 freedreno/ir3: fix valgrind complaint with STLW
The instruction has 3 src regs, so `instr->regs[0..3]` are valid, but
`instr->regs[4]` is not.

```
Test case 'dEQP-GLES31.functional.shaders.linkage.es31.tessellation.varying.rules.output_superfluous_declaration'..
==29239== Invalid read of size 8
==29239==    at 0x5BE9CDC: emit_cat6 (ir3.c:841)
==29239==    by 0x5BEA1BF: ir3_assemble (ir3.c:921)
==29239==    by 0x5BDF0A7: ir3_shader_assemble (ir3_shader.c:133)
==29239==    by 0x5BDF193: assemble_variant (ir3_shader.c:162)
==29239==    by 0x5BDF407: create_variant (ir3_shader.c:215)
==29239==    by 0x5BDF4DB: shader_variant (ir3_shader.c:241)
==29239==    by 0x5BDF553: ir3_shader_get_variant (ir3_shader.c:257)
==29239==    by 0x5BA85F7: ir3_shader_variant (ir3_gallium.c:80)
==29239==    by 0x5BA7703: ir3_cache_lookup (ir3_cache.c:96)
==29239==    by 0x5B8B8B3: fd6_emit_get_prog (fd6_emit.h:119)
==29239==    by 0x5B8C137: fd6_draw_vbo (fd6_draw.c:186)
==29239==    by 0x5BB1FBB: fd_draw_vbo (freedreno_draw.c:290)
==29239==  Address 0xb97f2d0 is 0 bytes after a block of size 240 alloc'd
==29239==    at 0x4848D54: malloc (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so)
==29239==    by 0x61BD35B: ralloc_size (ralloc.c:119)
==29239==    by 0x61BD41B: rzalloc_size (ralloc.c:151)
==29239==    by 0x5BE599B: ir3_alloc (ir3.c:45)
==29239==    by 0x5BEA583: instr_create (ir3.c:984)
==29239==    by 0x5BEA5DF: ir3_instr_create2 (ir3.c:1000)
==29239==    by 0x5BEE317: ir3_STLW (ir3.h:1431)
==29239==    by 0x5BF12D3: emit_intrinsic_store_shared_ir3 (ir3_compiler_nir.c:903)
==29239==    by 0x5BF418B: emit_intrinsic (ir3_compiler_nir.c:1802)
==29239==    by 0x5BF5D07: emit_instr (ir3_compiler_nir.c:2339)
==29239==    by 0x5BF603F: emit_block (ir3_compiler_nir.c:2426)
==29239==    by 0x5BF624B: emit_cf_list (ir3_compiler_nir.c:2474)
==29239==
```

Probably this only triggers in non-optimized builds?

Fixes: 1f3b52ce50 ("freedreno/a6xx: Add register offset for STG/LDG")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Eric Anholt
f3244c6019 ci: Remove old commented copy of freedreno artifacts.
This path was from an older version of freedreno CI.
2019-11-12 12:54:04 -08:00
Eric Anholt
52843ec5d3 ci: Enable all of GLES3/3.1 testing for softpipe.
Now that we're not using so many job slots, it's easy to get these
jobs run in a reasonable amount of time (gles3 took 10 minutes for 4
cores, and gles31 was 15 minutes for 4 cores).

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-12 12:54:04 -08:00
Eric Anholt
f08c810028 ci: Use cts_runner for our dEQP runs.
This runner is a little project by Bas, written in C++, that spawns
threads that then loop grabbing chunks of the (randomly shuffled but
consistently so) test list and hand it to a dEQP instance.  As the
remaining list gets shorter, so do the chunks, so hopefully the
threads all complete effectively at once.  It also handles restarting
after crashes automatically.  I've extended the runner a bit to do
what I was doing in the bash scripts before, like the skip list and
expected failures handling.  This project should also be a good
baseline for extending to handle retesting of intermittent failures.

By switching to it, we can have the swrast tests just take up one job
slot on the shared runners and keep their allotment of CPUs busy,
instead of taking up job slots with single-threaded dEQP jobs.  It
will also let us (eventually, once I reprovision) switch the freedreno
runners over to threading within the job instead of running concurrent
jobs, so that memory scribbles in one pipeline don't affect unrelated
pipelines, and I can experiment with their parallelism (particularly
on a306 where we are frequently backed up) without trashing other
people's jobs.

What we lose in this process is per-test output in the log (not a big
loss, I think, since we summarize fails at the end and reducing log
length keeps chrome from choking on our logs so badly).  We also drop
the renderer sanity checking, since it's not saving qpa files for us
to go poke through.  Given that all the drivers involved have fail
lists, if we got the wrong renderer somehow, we'd get a job failure
anyway.

v2: Rebase on droppong of the autoscale cluster and the arm64
    build/test split.  Use a script to deduplicate the cts-runner
    build.
v3: Rebase on the amd64 build/test container split.

Acked-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> (v2)
2019-11-12 12:54:04 -08:00
Eric Anholt
7f52df7fc9 ci: Make the skip list regexes match the full test name.
The bash scripts were using grep in the manner that matches any subset
of the line, but the new CTS runner matches the whole line and I think
that's a pretty good behavior.  Given that some of the skip lists
already were written to match the full test name, just make them
consistently do so.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-12 12:54:04 -08:00
Eric Anholt
66719e0242 ci: Use several debian buster packages instead of hand-building.
This helps cut down our container build time.  I've left a few that
we're likely to rev more frequently or I was less confident in
dropping.

v2: Rebase on the build/test container split, now bumps the build
    container tag in this commit.

Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Daniel Stone <daniels@collabora.com> (v1)
2019-11-12 12:54:04 -08:00
Rafael Antognolli
a4da6008b6 iris: Use mocs from isl_dev.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rafael Antognolli
d4f628235e anv: Use mocs settings from isl_dev.
v2: Remove device->default_mocs and external_mocs (Jason).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rafael Antognolli
2b01636ddb intel/isl: Add MOCS settings to isl_device.
Centralize mocs settings into isl.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rob Clark
d509a46225 freedreno: fix eglDupNativeFenceFD error
We can end up with scenarios where last_fence is associated with a batch
that is flushed through some other path before needs_out_fence_fd gets
set.  Resulting in returning a fence that has no backing fd.

The simplest thing is to just skip the optimization to try and avoid
no-op batches when a fence-fd is requested.  This should normally be
just once a frame anyways.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-12 11:38:16 -08:00
Brian Paul
bd49dedae0 nir: fix a couple signed/unsigned comparison warnings in nir_builder.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-12 11:44:02 -07:00
Brian Paul
a69e105361 s/APIENTRY/GLAPIENTRY/ in teximage.c
The later is the right symbol for entrypoint functions.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-12 11:44:01 -07:00
Lepton Wu
5c2d307a10 android: mesa: Revert "android: mesa: revert "Enable asm unconditionally""
Commit 45206d7673 fixed PIC issue of x86 asm stub.
We can enable asm for Android x86 now. This should sightly improve performance.

Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-11-12 18:09:43 +00:00
Rhys Perry
6914b0236f aco: combine read_invocation and shuffle implementations
They do mostly the same thing now.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Rhys Perry
2c98d79d11 aco: don't propagate vgprs into v_readlane/v_writelane
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
5a1bacb6f9 aco: fix read_invocation with VGPR lane index
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
c877f4d320 nir/divergence: improve DA of shuffle
If the data is uniform, then it's really a uniform copy. If the index is
uniform, then it's really a read_invocation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Rhys Perry
f97d933426 aco: fix shuffle with uniform operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
3204e83768 aco: use DPP instead of exec modification when lowering GFX10 shuffles
Seems we can use DPP's row_mask field to get an effect similar to
modifying exec.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Eric Engestrom
06347989a0 gitlab-ci: build libdrm using meson instead of autotools
Autotools was deprecated for a while and has now been removed, so let's
start using meson here so that we won't have any issues next time we
update libdrm.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-12 17:08:02 +00:00
Daniel Schürmann
746b9380bd aco: rematerialize s_movk instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Daniel Schürmann
b6f5085dfe aco: preserve kill flag on moved operands during RA
Fixes: 93c8ebfa78 aco: Initial commit of independent AMD compiler

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Daniel Schürmann
a2a6880743 aco: fix invalid access on Pseudo_instructions
Fixes: 93c8ebfa78 aco: Initial commit of independent AMD compiler

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Erik Faye-Lund
5b09a7e2e4 zink: remove no-longer-needed hack
It seems whatever was causing this is no longer an issue. So let's get
rid of the hack here.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-12 13:30:35 +00:00
Erik Faye-Lund
e1c87bbb4b zink: implement buffer-to-buffer copies 2019-11-12 12:40:49 +00:00
Erik Faye-Lund
9352991880 zink: always allow transfer to/from buffers 2019-11-12 12:40:49 +00:00
Danylo Piliaiev
d4c8182018 intel/blorp: Fix usage of uninitialized memory in key hashing
The automatically generated padding in structs contains
undefined values, force pack the structs to eliminate the
padding. Otherwise structs with the same values may generate
different hashes.

Valgrind output:

Conditional jump or move depends on uninitialised value(s)
 util_fast_urem32 (fast_urem_by_const.h:71)
 hash_table_search (hash_table.c:262)
 _mesa_hash_table_search (hash_table.c:296)
 anv_pipeline_cache_search_locked (anv_pipeline_cache.c:318)
 anv_pipeline_cache_search (anv_pipeline_cache.c:335)
 lookup_blorp_shader (anv_blorp.c:38)
 blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1112)
 blorp_mcs_partial_resolve (blorp_clear.c:1205)
 anv_image_mcs_op (anv_blorp.c:1742)
 anv_cmd_predicated_mcs_resolve (genX_cmd_buffer.c:774)
 transition_color_buffer (genX_cmd_buffer.c:1159)
 cmd_buffer_end_subpass (genX_cmd_buffer.c:4840)

Uninitialised value was created by a stack allocation
 blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1103)

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 13:59:29 +02:00
Danylo Piliaiev
3349b4b056 i965/program_cache: Lift restriction on shader key size
This will allow usage of packed structs which may have size
not divisible by 4.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 13:59:24 +02:00
Michel Dänzer
af684753f3 gitlab-ci: Delete install/bin from artifacts as well
This cuts the x86 artifacts zip file size in less than half.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 10:18:31 +01:00
Michel Dänzer
aebf43dcc1 gitlab-ci: Use separate docker images for x86 build/test jobs
Same as was done for the ARM images before.

This should make it less painful to update to newer dEQP / piglit as
well as to make changes to the build/test environment.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 10:17:21 +01:00
Michel Dänzer
576f7b6ea5 gitlab-ci: Run piglit tests with llvmpipe
One job for the quick_gl profile, one for the glslparser & quick_shader
profiles (doing these together takes hardly any more time than
quick_shader alone).

v2:
* Don't break lava tests
v3:
* Remove piglit test artifacts paths:
* Exclude some quick_shader tests again:
  - Test whose result flips between pass/fail/skip
  - *@vs_in tests, as not the same one of these gets picked every time
v4:
* Do not list passing tests in .gitlab-ci/piglit/*.txt (Eric Anholt)
* Include the test number summary in .gitlab-ci/piglit/*.txt
* Completely disable generating any vs_in tests in the piglit build.
* Remove some more unneded files from the piglit build tree.
* Exclude quick_gl arb_gpu_shader5 tests; they were all skipped anyway,
  as llvmpipe doesn't support this extension yet, but occasionally they
  would spuriously fail instead.
v5:
* Set LD_LIBRARY_PATH, so we actually test the Mesa build from the
  pipeline...
* Verify that wflinfo reports the expected Mesa version
* Pass -noreset to Xvfb
v6:
* Don't use autoscale runners, run piglit with -j4 (Eric Anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 10:16:23 +01:00
Michel Dänzer
4b25b5885b gitlab-ci: Sort packages in debian-install.sh
And remove duplicates.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 10:16:08 +01:00
Michel Dänzer
df26e18b9f gitlab-ci: Share dEQP build process between x86 & ARM test image scripts
See https://gitlab.freedesktop.org/mesa/mesa/issues/2056

v2:
* Rename .gitlab-ci/deqp-build.sh => .gitlab-ci/build-deqp.sh
  (Eric Anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 10:14:49 +01:00
Michel Dänzer
59fcb019d0 gitlab-ci: Move artifact preparation to separate script
It's currently only needed for the meson-main and meson-arm64 jobs, not
the other meson build jobs.

Also remove MESON_SHADERDB, just run .gitlab-ci/run-shader-db.sh
directly from the meson-main job.

v2:
* Also run prepare-artifacts.sh in meson-arm64 script
v3:
* Move tarball creation into the new script as well, as it prevented
  ccache --show-stats from running in after_script

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 10:14:26 +01:00
Michel Dänzer
2921a38484 gitlab-ci: Use ninja -j4 for building dEQP
By default, ninja tries to saturate all cores of the runner host
machine, which could overload it due to other jobs running in parallel.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 10:14:04 +01:00
Jason Ekstrand
0c7e0c5599 spirv: Fix the MSVC build
Fixes: 9cc4c2c916 "spirv: Add a vtn_decorate_pointer helper"
Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-12 08:34:55 +00:00
Erik Faye-Lund
9b8964d064 nir: patch up deref-vars when lowering clip-planes
Otherwise, we fail validation and potentially generate invalid code.
Let's fix up the mode of the accesses to the variable.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-12 09:13:22 +01:00
Samuel Pitoiset
bef7b2f805 ac: handle pointer types to LDS in ac_get_elem_bits()
This fixes crashes with some
dEQP-VK.spirv_assembly.instruction.spirv1p4.* tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-12 08:32:15 +01:00
Jonathan Marek
01cae57c80 freedreno: add Adreno 640 ID
A640 seems to work without any other changes (glmark and vkcube).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-11 20:46:01 -05:00
Luis Mendes
0cb5c96a83 radv: fix radv secure compile feature breaks compilation on armhf EABI and aarch64
__NR_select is not defined the same way across architectures, sometimes is
not even defined, like in armhf EABI and aarch64.

Signed-off-by: Luis Mendes <luis.p.mendes@gmail.com>

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2042
2019-11-12 11:47:20 +11:00
Marek Olšák
3a23af9f44 st/mesa: remove unused TGSI-only debug printing functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:12 -05:00
Marek Olšák
d29a332862 st/mesa: add ST_DEBUG=nir to print NIR shaders
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:10 -05:00
Marek Olšák
265abc54f8 st/mesa: print TCS/TES/GS/CS TGSI in the right place & keep disk cache enabled
The old place only printed on a disk cache miss, which is why the disk
cache was disabled.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:08 -05:00
Marek Olšák
98e27e5e28 st/mesa: remove \n being only printed in debug builds after printed TGSI
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:07 -05:00
Marek Olšák
c3351bb44b st/mesa: rename DEBUG_TGSI -> DEBUG_PRINT_IR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:04 -05:00
Marek Olšák
e00791c552 st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them
They use the "sample" keyword as a variable name.

Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:23:37 -05:00
Lionel Landwerlin
34f32a6d66 anv: implement VK_KHR_timeline_semaphore
v2: Fix inverted condition in vkGetPhysicalDeviceExternalSemaphoreProperties()

v3: Add anv_timeline_* helpers (Jason)

v4: Avoid variable shadowing (Jason)
    Split timeline wait/signal device operations (Jason/Lionel)

v5: s/point/signal_value/ (Jason)
    Drop piece of drm-syncobj timeline code (Jason)

v6: Add missing sync_fd semaphore signaling (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Jason Ekstrand
5a4f15ef2c anv: Plumb timeline semaphore signal/wait values through from the API
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
edc6606d4e anv/wsi: signal the semaphore in the acquireNextImage
We seem to have forgotten about the semaphore in the
acquireNextImageInfo.

v2: Signal semaphore/fence regardless of presentation status (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Jason Ekstrand
b10b455c1d anv: Lock around fetching sync file FDs from semaphores
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
246261f0ad anv: prepare the driver for delayed submissions
Timeline semaphore introduce support for wait before signal behavior,
which means that it is now allowed to call vkQueueSubmit() with wait
semaphores not yet submitted for execution. Our kernel driver requires
all of the wait primitives to be created before calling the execbuf
ioctl. As a result, we must delay submissions in the userspace driver.
This change store the necessary information to be able to delay a
VkSubmitInfo submission to the kernel driver.

v2: Fold count++ into array access (Jason)
    Move queue list to another patch (Jason)

v3: Document cleanup of temporary semaphores (Jason)

v4: Track semaphores of SYNC_FD type that needs updating after delayed
    submission

v5: Don't forget to update sync_fd in signaled semaphores after
    submission (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
3e22363537 anv: refcount semaphores
Delayed submissions required by timeline semaphores mean we need to be
able to update the sync fd backed semaphores in a delayed fashion.
This could mean a race between the application destroying the
semaphore and the submission code trying to update it with the new
sync fd.

This change prepares semaphores to be refcounted, we'll most likely
only take a reference for cases where we signal a sync fd semaphore.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
3da798c9f1 anv: prepare driver to report submission error through queues
When we will submit to i915 from a submission thread, we won't be able
to directly report the error to the user (in particular through the
debug report callbacks). So prepare 2 paths to report errors device ->
notifying the user immediately, queue -> notifying the user the next
time an entry point is called.

In this change we still report directly for both paths, this will
change in the next commit.

v2: Split NULL batch parameter handling in
    anv_queue_submit_simple_batch() in a different commit

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
89de271bc2 anv: allow NULL batch parameter to anv_queue_submit_simple_batch
We can reuse device->trivial_batch_bo

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
f606c12731 anv: move queue init/finish to anv_queue.c
Prepare the queue initialization to take on more responsabilities and
possibly fail.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
206ab49ba1 anv: expose timeout helpers outside of anv_queue.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
2f4dcc8a1c anv: detach batch emission allocation from device
In the future we'll have 2 different allocations depending on whether
we're using threaded submission or not.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
935f8f0e56 anv: remove list items on batch fini
This doesn't seem to fix anything because those destroy() calls happen
right before the command buffer object & its list of batch_bo is also
destroyed. Still looks a bit cleaner.

v2: Found a second occurence

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Fixes: 26ba0ad54d ("vk: Re-name command buffer implementation files")
Cc: <mesa-stable@lists.freedesktop.org>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
048f0690ee anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit
We always close the in_fence at the end the anv_cmd_buffer_execbuf()
so when we take it from the semaphore, let's not forget to invalidate
it.

Note that the code leaks the fence_in if we get any error before
reaching the close(). Let's fix that in another patch or better,
rewrite the whole thing!

v2: drop redundant fd = -1 (Jason)

v3: Update commit message (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Rhys Perry
de998d3eb5 radv: fix radv_nir_get_max_workgroup_size when nir=NULL
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 84a1a2578 ('compiler: pack shader_info from 160 bytes to 96 bytes')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-11 20:44:12 +00:00
Lionel Landwerlin
f93bb90302 mesa: check framebuffer completeness only after state update
The change made in 88d665830f ("mesa: check draw buffer completeness
on glClearBufferfi/glClearBufferiv") correctly updated the state prior
to checking the framebuffer completeness on glClearBufferiv but not in
glClearBufferfi.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: 88d665830f ("mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv")
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/2072
2019-11-11 22:04:55 +02:00
Caio Marcelo de Oliveira Filho
d4a3b09c4b glsl: Check earlier for MaxTextureImageUnits and MaxImageUniforms
Currently the linker do all the work then check for the limits, which
means num_textures and num_images in shader_info may have to store more
than the limit.  This breaks down now since shader_info was packed and
doesn't expect to store larger invalid values.

To fix this, pull the check before we set the counts in shader_info.
Add necessary plumbing to make sure we bail once those errors are
found.

Fixes: 84a1a2578d ("compiler: pack shader_info from 160 bytes to 96 bytes")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 10:58:40 -08:00
Caio Marcelo de Oliveira Filho
fce76ae769 glsl: Check earlier for MaxShaderStorageBlocks and MaxUniformBlocks
Currently the linker do all the work then check for the limits, which
means num_ssbos and num_ubos in shader_info may have to store more
than the limit.  This breaks down now since shader_info was packed and
doesn't expect to store larger invalid values.

To fix this, pull the check before we set the counts in shader_info.
One drawback of this approach is that for some cases we might not see
the collected errors from various stages, but bail as soon as a stage
breaks the limits.

Fixes: 84a1a2578d ("compiler: pack shader_info from 160 bytes to 96 bytes")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 10:58:40 -08:00
Dylan Baker
a8d941091f util: Use ZSTD for shader cache if possible
This allows ZSTD instead of ZLIB to be used for compressing the shader
cache.

On a 72 core system emulating skl with a full shader-db (with i965):
ZSTD:
    1915.10s user 229.27s system 5150% cpu 41.632 total (cold cache)
    225.40s user 10.87s system 3810% cpu 6.201 total (warm cache)
    154M (235M on disk)
ZLIB:
    2231.33s user 194.24s system 1899% cpu 2:07.72 total (cold cache)
    229.15s user 10.63s system 3906% cpu 6.139 total (warm cache)
    163M (244M on disk)

Tim Arceri sees (8 core ryzen and a full shader-db):
ZSTD:
    2505.22 user 40.50 system 3:18.73 elapsed 1280% CPU (cold cache)
    418.71 user 14.93 system 0:46.53 elapsed 931% CPU (warm cache)
    454.3 MB (681.7 MB on disk)
ZLIB:
    3069.83 user 40.02 system 4:20.13 elapsed 1195% CPU (cold cache)
    425.50 user 15.17 system 0:46.80 elapsed 941% CPU (warm cache)
    470.3 MB (701.4 MB on disk)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-11 18:53:45 +00:00
Laurent Carlier
57acf921e2 egl: avoid local modifications for eglext.h Khronos standard header file
Move differences in eglextchromium.h header file, then provide the same header than libglvnd-1.2
So program that omit to include eglextchromium.h will fail to build with both mesa and libglvnd headers.

Fixes: a0a8109f "include: add the definition of EGL_EXT_image_flush_external"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-11 17:20:16 +00:00
Eric Engestrom
eaf4396602 egl: move #include of local headers out of Khronos headers
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-11 17:20:16 +00:00
Jason Ekstrand
69244fc72a intel/fs: Lower large local arrays to scratch
Shader-db results on Kaby Lake:

    total instructions in shared programs: 14929212 -> 14880028 (-0.33%)
    instructions in affected programs: 72428 -> 23244 (-67.91%)
    helped: 6
    HURT: 2
    helped stats (abs) min: 2165 max: 15981 x̄: 8590.00 x̃: 7624
    helped stats (rel) min: 56.06% max: 74.52% x̄: 67.55% x̃: 72.08%
    HURT stats (abs)   min: 1178 max: 1178 x̄: 1178.00 x̃: 1178
    HURT stats (rel)   min: 350.60% max: 361.35% x̄: 355.97% x̃: 355.97%
    95% mean confidence interval for instructions value: -11947.03 -348.97
    95% mean confidence interval for instructions %-change: -125.72% 202.37%
    Inconclusive result (%-change mean confidence interval includes 0).

    total cycles in shared programs: 368585300 -> 342557344 (-7.06%)
    cycles in affected programs: 28144921 -> 2116965 (-92.48%)
    helped: 6
    HURT: 2
    helped stats (abs) min: 1404978 max: 7766106 x̄: 4353922.00 x̃: 3890682
    helped stats (rel) min: 82.01% max: 95.57% x̄: 89.95% x̃: 92.28%
    HURT stats (abs)   min: 47778 max: 47798 x̄: 47788.00 x̃: 47788
    HURT stats (rel)   min: 278.20% max: 282.98% x̄: 280.59% x̃: 280.59%
    95% mean confidence interval for cycles value: -5900438.73 -606550.27
    95% mean confidence interval for cycles %-change: -140.79% 146.16%
    Inconclusive result (%-change mean confidence interval includes 0).

    total spills in shared programs: 9243 -> 8901 (-3.70%)
    spills in affected programs: 2718 -> 2376 (-12.58%)
    helped: 4
    HURT: 4

    total fills in shared programs: 21831 -> 10141 (-53.55%)
    fills in affected programs: 11804 -> 114 (-99.03%)
    helped: 6
    HURT: 2

    total sends in shared programs: 815912 -> 815912 (0.00%)
    sends in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    LOST:   1
    GAINED: 3

The helped shaders are all compute shaders in Aztec Ruins.  There is
also a compute shader in synmark2 OglCSDof that's helped but it doesn't
show up in above shader-db results because it went from SIMD8 to SIMD16.
That shader improves enough to yield an 15-20% performance boost to the
benchmark as a whole on my KBL laptop.  The hurt shaders are a couple
shaders in Kerbal Space Program and a couple in Aztec Ruins.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand
53bfcdeecf intel/fs: Implement the new load/store_scratch intrinsics
This commit fills in a number of different pieces:

 1. We add support to brw_nir_lower_mem_access_bit_sizes to handle the
    new intrinsics.  This involves simple plumbing work as well as a
    tiny bit of extra logic to always scalarize scratch intrinsics

 2. Add code to brw_fs_nir.cpp to turn nir_load/store_scratch intrinsics
    into byte/dword scattered read/write messages which use the A32
    stateless model.

 3. Add code to lower_surface_logical_send to handle dword scattered
    messages and the A32 stateless model.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand
e2297699de intel/nir: Plumb devinfo through lower_mem_access_bit_sizes
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand
1dff48af05 intel/fs: refactor surface header setup
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand
a0999bc049 intel/fs: Add DWord scattered read/write opcodes
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand
83f04d80b0 intel/nir: Use nir_extract_bits in lower_mem_access_bit_sizes
The new helper solves most of the annoying problems with data wrangling
in brw_nir_lower_mem_access_bit_sizes.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Jason Ekstrand
b8d45d9307 nir: Add tests for nir_extract_bits 2019-11-11 17:17:02 +00:00
Jason Ekstrand
d0bbf98c96 nir/builder: Add a nir_extract_bits helper
This new helper is better than nir_bitcast_vector because it's able to
take a (mostly) arbitrary range from the source vector.  The only
requirement is that first_bit has to be aligned to the smaller of the
two bit sizes.  It wouldn't be hard to lift that requirement but it's
reasonable for now.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
Eric Engestrom
86d3a346f1 egl: fix _EGL_NATIVE_PLATFORM fallback
When the X11 or Haiku platforms were compiled in, they would bypass the
`_EGL_NATIVE_PLATFORM` fallback by always returning themselves instead.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-11 17:14:07 +00:00
Ricardo Garcia
20b403aad0 anv: Unify GetDeviceQueue and GetDeviceQueue2
Avoid duplicating some checks and code by making anv_GetDeviceQueue a
subcase of anv_GetDeviceQueue2, like radv does.

Signed-off-by: Ricardo Garcia <rgarcia@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 16:14:56 +00:00
Alyssa Rosenzweig
5b31182665 panfrost: Select format-specific blending intrinsics
If we have an accelerated path for a particular framebuffer format,
let's use it to save a bunch of instructions in a blend shader.

[Tomeu: Only use the faster intrinsic on >T760]

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig
3295edaadf pan/midgard: Pack load/store masks
While most load/store operations on 32-bit/vec4 intriniscally, some are
not and have special type-size-dependent semantics for the mask. We need
to convert into this native format.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig
843874c7c3 pan/midgard: Implement nir_intrinsic_load_output_u8_as_fp16_pan
We can use the native Midgard ops for this, depending what chip we're
on.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig
5885b64e42 pan/midgard: Identify ld_color_buffer_u8_as_fp16*
There are two versions of this opcode, depending what version of the ISA
you're using. I'm not sure if there's a semantic difference; I think
there might be some slight subtleties but it's too early to know at this
stage.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig
03f73c7fc6 nir: Add load_output_u8_as_fp16_pan intrinsic
This is a single opcode, at least on newer Midgard chips. It's easier to
have this represented in NIR rather than trying to optimize out the
conversions, so let's add the intrinsic.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Tomeu Vizoso
ee5321f239 panfrost: Set depth and stencil for SFBD based on the format
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-11 15:23:44 +00:00
Erik Faye-Lund
b4d47e21d7 zink: correct depth-stencil format
When using packed vulkan-formats on little-endian systems, we need to
swap the components for the gallium formats. And since Zink isn't
big-endian safe yet, little-endian is the only endianess we care about
right now.

This fixes a bunch of piglit tests, amongs others:
- spec@arb_depth_texture@depth-level-clamp
- spec@arb_depth_texture@depthstencil-render-miplevels * d=z24
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-blit
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-copypixels
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-drawpixels
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-readpixels

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
2019-11-11 14:35:53 +00:00
Erik Faye-Lund
d7a6cc8f4a zink/spirv: add support for nir_op_flrp
This fixes the following piglit:

spec@ati_fragment_shader@ati_fragment_shader-render-fog

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-11 14:25:30 +01:00
Chris Wilson
863872e141 egl: Mention if swrast is being forced
The system can be disabling HW acceleration unbeknown to the user,
leading to a long debug session trying to work out which component is
failing. A quick mention that it is the environment override would be
very useful.

v2: Use more generic "CPU renderer" and so try to avoid jargon.

Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Martin Peres <martin.peres@linux.intel.com>
2019-11-11 11:52:02 +00:00
Jason Ekstrand
9e440b8d0b spirv: Sort out the mess that is sampled image
This commit makes two major changes.  First, we add a second case to
OpLoad for sampled images which constructs a vtn_sampled_image and
stashes that rather than stashing a pointer to the combined image
sampler like we do for bare samplers and images.  This should be more in
line with how SPIR-V is intended to work and hopefully doesn't cause any
weird problems.  The second is a rework of vtn_handle_texture to assume
that everything has an image but not everything has a sampler.  We also
add a vtn_fail_if for the case where a texture instructions require a
sampler but none is provided.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-09 15:29:01 +00:00
Jason Ekstrand
9cc4c2c916 spirv: Add a vtn_decorate_pointer helper
This helper makes a duplicate copy of the pointer if any new access
flags are set at this stage.  This way we don't end up propagating
access flags further than they actual SPIR-V decorations.  In several
instances where we create new pointers, we still call the decoration
helper directly because no copy is needed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-09 15:29:01 +00:00
Jason Ekstrand
4f9688e571 spirv: Remove the type from sampled_image
We have types on all vtn_values at this point so there's no reason to
carry the redundant type information.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-09 15:29:01 +00:00
Rob Clark
a3dc975ee7 freedreno/ir3: also track # of nops for shader-db
The instruction count is (mostly) a measure of what optimization passes
can do, while # of nops is more an indication of how effectively the
scheduler is balancing register pressure vs instruction count.  So track
these independently.

(There could be opportunities to rematerialize values to reduce register
pressure, swapping some nop's with other alu instructions, so nothing is
truely independent.. but it is still useful to break these stats out.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
5f45818673 freedreno/ir3: sync disasm changes from envytools
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
f3980a8ef7 freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION
Set flag based on actual output reg type.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
f0f9ec6882 freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION
We should really be setting this based on the actual output register
type.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
df229977c3 freedreno/ir3: remove obsolete comment
The meta PHI instruction was removed long ago.  And fanin/fanout
themselves to not contribute actual instructions (at least not by the
time you get to sched, they may prevent copy-propagating away a mov)

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
e804b42fd7 freedreno/ir3/ra: remove ir print after livein/out
The IR hasn't changed at this point, so it isn't really adding any
value.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
8b92052f10 freedreno/ir3/ra: move regs_count==0 check
Fold it in to writes_gpr() (since a register that does not reference any
registers by definition does not write a register).  This lets us avoid
having to handle this case in a few other places.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
bd21c73d3f freedreno/ir3: ir3_print tweaks
Handle HALF/HIGH flags in all cases, and colorize SSA src notation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:15 +00:00
Rob Clark
5da10704bb freedreno/ir3: use SSA flag on dest register too
We did this in some places before, but not consistantly.  But it will be
useful for two-pass RA, to identify which registers have already been
assigned.

While we are cleaning this up, use __ssa_src() and new __ssa_dst()
helper more consistently.  (If nothing else, this reduces the # of
callers of ir3_reg_create() to audit that we didn't miss something)

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:14 +00:00
Rob Clark
8449f6183f freedreno/ir3: split pre-coloring to it's own function
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-09 02:49:14 +00:00
Caio Marcelo de Oliveira Filho
087ecd9ca5 spirv: Don't leak GS initialization to other stages
The stage specific fields of shader_info are in an union.  We've
likely been lucky that this value was either overwritten or ignored by
other stages.  The recent change in shader_info layout in commit
84a1a2578d ("compiler: pack shader_info from 160 bytes to 96 bytes")
made this issue visible.

Fixes: cf2257069c ("nir/spirv: Set a default number of invocations for geometry shaders")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-08 16:28:21 -08:00
Marek Olšák
84a1a2578d compiler: pack shader_info from 160 bytes to 96 bytes
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-08 16:54:08 -05:00
Marek Olšák
9950523368 glsl/linker: pass shader_info to analyze_clip_cull_usage directly
This will be needed by the next commit.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-08 16:54:06 -05:00
Marek Olšák
3ef50b023e radeonsi/nir: fix compute shader crash due to nir_binary == NULL
This partially reverts 8b30114dda.

Fixes: 8b30114dda "radeonsi/nir: call nir_serialize only once per shader"
2019-11-08 16:47:59 -05:00
Marek Olšák
8b30114dda radeonsi/nir: call nir_serialize only once per shader
We were calling it twice.

First serialize it, then use it to compute the cache key.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-08 15:30:28 -05:00
Marek Olšák
ad56022b0d util: add blob_finish_get_buffer
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-08 15:30:28 -05:00
Eric Anholt
b1f38aed84 u_format: Fix swizzle of A1R5G5B5.
Found once I started using the generated unpack code from the Mesa side.

Fixes: 4bbaac3782 ("gallium: Add some more channel orderings of packed formats.")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-08 11:56:02 -08:00
David Stevens
0466239aae virgl: support emulating planar image sampling
Mesa emulates planar format sampling with per-plane samplers. Virgl now
supports this by allowing the plane index to be passed when creating a
sampler view from a planar image. With this change, mesa now passes that
information to virgl.

Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Lepton Wu <lepton@chromium.org>
2019-11-08 17:06:56 +00:00
Krzysztof Raszkowski
084431ce45 gallium/swr: Enable some ARB_gpu_shader5 extensions
Enable / add to features.txt:
- Enhanced textureGather.
- Geometry shader instancing.
- Geometry shader multiple streams.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-11-08 16:04:47 +00:00
Krzysztof Raszkowski
e5ed9a1b91 gallium/swr: Fix GS invocation issues
- Fixed proper setting gl_InvocationID.
- Fixed GS vertices output memory overflow.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-11-08 14:52:16 +00:00
Timur Kristóf
911a826141 ac: Handle invalid GFX10 format correctly in ac_get_tbuffer_format.
It happens that some games try to access a vertex buffer without
a valid format. This case was incorrectly handled by
ac_get_tbuffer_format which made ACO emit an invalid instruction.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-08 13:30:30 +01:00
Boris Brezillon
ee82f9f07e panfrost: Try to evict unused BOs from the cache
The panfrost BO cache can only grow since all newly allocated BOs are
returned to the cache (unless they've been exported).

With the MADVISE ioctl that's not a big issue because the kernel can
come and reclaim this memory, but MADVISE will only be available on 5.4
kernels. This means an app can currently allocate a lot memory without
ever releasing it, leading to some situations where the OOM-killer kicks
in and kills the app (or even worse, kills another process consuming
more memory than the GL app) to get some of this memory back.

Let's try to limit the amount of BOs we keep in the cache by evicting
entries that have not been used for more than one second (if the app
stopped allocating BOs of this size, it's likely to not allocate
similar BOs in a near future).

This solution is based on the VC4/V3D implementation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 11:26:47 +01:00
Boris Brezillon
25059cc41f panfrost: Move BO cache related fields to a sub-struct
We will soon introduce an LRU list to evict BOs that have been unused
for more than 1 second. Let's first move all BO cache fields to a
sub-struct to clarify which fields are used by the BO caching logic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 11:26:47 +01:00
Alyssa Rosenzweig
5f768eda43 pan/midgard: Switch base for vertex texturing on T720
There aren't texture pipeline registers anymore; instead, space is
shared with work and ldst registers for output and input respectively.
We need to shift the base registers to represent this correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig
ac14facf7a pan/midgard: Pass shader stage to disassembler
Vertex texturing behaves differently from fragment texturing on some
GPUs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig
515941202d pan/midgard: Disassemble half-steps correctly
The meaning of some bits shifts; we need to account for this to print
swizzles sanely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig
ec2af6bc97 pan/midgard: Fix printing of half-registers in texture ops
We were using old style half-registers; let's update that to be
consistent, preparing us for more disassmbler changes in this area.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Kristian H. Kristensen
4a4fad7f40 freedreno/ir3: Use regid() helper when setting up precolor regs
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:46:21 -08:00
Kristian H. Kristensen
3699a74a43 freedreno/a6xx: Turn on tessellation shaders
Wow. Very triangle. So shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
53782571ae freedreno/a6xx: Only use merged regs and four quads for VS+FS
When other geometry stages are present, we chose two quads and no
merged regs.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
07aedc367c freedreno/blitter: Save tessellation state
We have tessellation state now.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
d2d0c8186d freedreno/a6xx: Only set emit.hs/ds when we're drawing patches
At least the gallium blitter helper will call us to draw with
tessellation shaders set but a non-patch primitive.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
e584790885 freedreno: Use bypass rendering for tessellation
It seems like tiling could work in the Adreno architecture, but we've
only ever seen bypass rendering with tessellation.  For now, let's do
that too.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
47e2c19511 freedreno/a6xx: Program state for tessellation stages
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
03a30e7c3d freedreno/a6xx: Emit constant parameters for tessellation stages
Assemble the information the stages need and emit the constants.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
5dd51d2da7 freedreno/a6xx: Allocate and program tessellation buffer
Tessellation needs a couple of buffers that should hold the entire
output from a full VS+TCS draw call.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
f0ef3e9697 freedreno/a6xx: Build the right draw command for tessellation
We need to select the right primitive type, set a bit to turn on
tessellation and or in the TES output primitive type.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
7272e8a709 freedreno/ir3: Allocate const space for tessellation parameters
The tessellation stages need size and stride or the patch layout as
well as locations of attributes in the patch.  The tesselation stages
also use two system memory BOs and need the iovas of those.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
8739ea3ab5 freedreno/ir3: Pre-color TCS header and primitive ID inputs
Similar to GS, the registers are shared and not reinitialized betewen
VS and TCS, so we need to make sure to allocate the same registers for
the system values between stages.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
b12ebe3e81 freedreno/ir3: Don't assume binning shader is always VS
In tessellation mode, the TES is (probably) the binning shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
3cedeba7c9 freedreno/ir3: Setup inputs and outputs for tessellation stages
Similar to GS, some inputs are reused when the chsh from VS to TCS or
TES to GS, so we need to make sure we setup the right inputs and make
the shared system values outputs so they don't get clobbered.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
e28fbbd861 freedreno/ir3: Implement TCS synchronization intrinsics
We add two new IR3 specific nir intrinsics that map to the new condend
and endpatch instructions.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:40:27 -08:00
Kristian H. Kristensen
4915231b8a freedreno/ir3: Implement tess coord intrinsic
Our lowering pass made the z component unused by replacing its uses
by 1 - x - y.  The intrinsic implementation then just need to return
the x and y components.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:08 -08:00
Kristian H. Kristensen
e16e48d00c freedreno/ir3: End TES with chsh when using GS
When we have both TES and GS, the TES needs to chain to the VS with
chmask and chsh GS just like the VS does to either TCS or GS.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:05 -08:00
Kristian H. Kristensen
581cd59692 freedreno/ir3: Add new synchronization opcodes
There are two new opcodes in use in tesselation control shaders:
category 0, opcodes 13 and 15.  unk13 is a kill type of instruction
that terminates threads where !p0.x and it used to narrow down a patch
wavefront to just thread 0.  Then, once thread 0 has written the tess
levels, it issues unk15, which might signal the TE that another patch
has been fully written.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:37:02 -08:00
Kristian H. Kristensen
56ed835bff freedreno/ir3: Extend geometry lowering pass to handle tessellation
VS and TCS pass varyings the same way as VS and GS does. TCS then
writes entire patch to a system memory BO and TES eventually reads
back from the BO once the TE starts generating vertices.  TES outputs
vertices the same way as VS and GS, except when there's a GS as well,
in which case TES passes varyings to GS same way the VS would.

In addition, the TCS needs a little bit of control flow massaging so
that it only runs for valid invocations needs a couple of unknown
instructions to synchronize with the TE.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:59 -08:00
Kristian H. Kristensen
8621fbc37b freedreno/ir3: Add tessellation field to shader key
Whether we're tessellating and which primitives the TES outputs
affects the entire pipeline so let's add a field to the key to track
that.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:56 -08:00
Kristian H. Kristensen
77b96b843e freedreno/ir3: Use imul24 in offset calculations
With the imul24 opcode in place, we can now use it for computing local
offsets (ie for ldlw/stlw).

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:53 -08:00
Kristian H. Kristensen
41984c8422 freedreno/ir3: Add ir3 intrinsics for tessellation
These provide the iovas for system memory buffers used for
tessellation as well as a new HW specific system value.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:50 -08:00
Kristian H. Kristensen
d6209a50bb freedreno: Don't count primitives for patches
The gallium helper doesn't like patches and we can't determine how
many primitives it gets tessellated into anyway.  On gens where we
have tessellation, we get the prim count from a HW counter so just
skip counting on the CPU.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:47 -08:00
Kristian H. Kristensen
fe450ef4cf freedreno/ir3: Add load and store intrinsics for global io
These intrinsics take a ivec2 for the 64 bit base address and a
integer offset.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:44 -08:00
Kristian H. Kristensen
5d67da13a3 freedreno/ir3: Emit link map as byte or dwords offsets as needed
Stages that load inputs with ldlw (TCS, GS) need byte offsets, stages
that load with ldg (TES) need dwords offsets.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:42 -08:00
Kristian H. Kristensen
1f3b52ce50 freedreno/a6xx: Add register offset for STG/LDG
These instructions take a 64 bit iova as two conescutive registers and
a immediate offset.  This patch adds support for the offset to be a
single register, which is added to the 64 bit iova.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:39 -08:00
Kristian H. Kristensen
3d16ec4a71 freedreno/a6x: Rename z/s formats
What we call eRB6_Z24_UNORM_S8_UINT now is actually
RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually
RB6_Z24_UNORM_S8_UINT.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:36 -08:00
Kristian H. Kristensen
50124afe34 freedreno/a6xx: Fix layered texture type enum
2D array textures and 3D textures are different enum values after all.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:33 -08:00
Kristian H. Kristensen
0276d0766d freedreno: Add nogmem debug option to force bypass rendering
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:31 -08:00
Kristian H. Kristensen
7fed7c2a7d freedreno/a6xx: Clear sysmem with CP_BLIT
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:28 -08:00
Kristian H. Kristensen
b0b443dcab freedreno/a6xx: Fix primitive counters again
We use one mechanism for (REG_A6XX_RBBM_PRIMCTR_8_LO)
PIPE_QUERY_PRIMITIVES_GENERATED, which counts all primitives that exit
the geometry pipeline, whether or not xfb is on.  Then for
PIPE_QUERY_PRIMITIVES_EMITTED, we use the CP_EVENT_WRITE subfunction
that writes out per-stream counts for generated and emitted, but only
when xfb is enabled.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:22 -08:00
Kristian H. Kristensen
835f8d1ba1 freedreno/registers: Add comments about primitive counters
Adding comments about best guess at what the counters count.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:19 -08:00
Kristian H. Kristensen
96968d0ba2 freedreno/registers: Move SP_PRIMITIVE_CNTL and SP_VS_VPC_DST
Move these two to be in order with the other VS regs.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:36:16 -08:00
Kristian H. Kristensen
ba54f7dd03 freedreno/registers: Fix typo
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-07 16:35:27 -08:00
Rhys Perry
78e3ea9a0f aco: add Instruction::usesModifiers() and add more checks in the optimizer
No pipeline-db changes.

v2: use early-exit for VOP3

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1)
2019-11-08 00:14:06 +00:00
Rhys Perry
76544f632d radv: adjust loop unrolling heuristics for int64
In particular, increase the cost of 64-bit integer division.

Fixes huge shaders with dEQP-VK.spirv_assembly.type.scalar.i64.mod_geom
, with ACO used for GS this creates shaders requiring a branch with
>32767 dword offset.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-07 23:29:12 +00:00
Erico Nunes
9817bff4da lima: fix bo submit memory leak
Fix memory leak on allocation for lima submit, reported by valgrind.

128 bytes in 1 blocks are definitely lost in loss record 38 of 84
   at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
   by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91)
   by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139)
   by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113)
   by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57)
   by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802)
   by 0x586378F: lima_draw_vbo (lima_draw.c:1351)
   by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184)
   by 0x55D0A57: st_draw_vbo (st_draw.c:268)
   by 0x55576CB: _mesa_draw_arrays (draw.c:374)
   by 0x55576CB: _mesa_draw_arrays (draw.c:351)
   by 0x43610B: Mesh::render_vbo() (mesh.cpp:583)
   by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242)
   by 0x41131B: MainLoop::draw() (main-loop.cpp:133)
   by 0x411947: MainLoop::step() (main-loop.cpp:108)

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-11-07 23:03:01 +00:00
Erico Nunes
d939f5d463 lima: fix nir shader memory leak
Fix memory leak on allocation for nir shader, reported by valgrind.

3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84
   at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
   by 0x5750817: ralloc_size (ralloc.c:119)
   by 0x5750977: rzalloc_size (ralloc.c:151)
   by 0x575C173: nir_shader_create (nir.c:45)
   by 0x5763ACB: nir_shader_clone (nir_clone.c:728)
   by 0x55D5003: st_create_fp_variant (st_program.c:1242)
   by 0x55D789F: st_get_fp_variant (st_program.c:1522)
   by 0x55D789F: st_get_fp_variant (st_program.c:1507)
   by 0x56400C3: st_update_fp (st_atom_shader.c:163)
   by 0x563D333: st_validate_state (st_atom.c:261)
   by 0x55D07CB: prepare_draw (st_draw.c:132)
   by 0x55D08DF: st_draw_vbo (st_draw.c:184)
   by 0x55576CB: _mesa_draw_arrays (draw.c:374)
   by 0x55576CB: _mesa_draw_arrays (draw.c:351)

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-11-07 23:03:01 +00:00
Prodea Alexandru-Liviu
1a05811936 Meson: Remove lib prefix from graw and osmesa when building with Mingw.
Also remove version sufix from osmesa swrast on Windows.

v2: Make sure we don't remove lib prefix on *nix platforms.

Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

Cc: "19.3" <mesa-stable@lists.freedesktop.org>
2019-11-07 22:04:50 +00:00
Marek Olšák
0b3111ed84 mesa: expose SPIR-V extensions in the Compatibility profile too
We would like to have GL 4.6 Compatibility too.

The extensions don't support compatibility features, so no other changes
are needed.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-11-07 16:04:30 -05:00
Drew DeVault
299c55df88 st_get_external_sampler_key: improve error message
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 15:57:23 -05:00
Eric Anholt
9d2c8df3eb mesa/st: Make st_pipe_format_to_mesa_format an effective no-op.
All callers other than the unit test just wanted to convert back from
a known-mesa-equivalent format, which is now a no-op.

v2: Fix assertion failure in iris GL startup with BGR565 by continuing
    to return MESA_FORMAT_NONE for non-Mesa formats.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-11-07 19:43:41 +00:00
Eric Anholt
75921a0912 mesa/st: Gut most of st_mesa_format_to_pipe_format().
Now that MESA_FORMAT_x is just a PIPE_FORMAT_x define, we can strip
this function down to just the compression fallbacks.

v2: Restore the SRGB format for ASTC SRGB fallback case.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
807a800d8c mesa: Redefine MESA_FORMAT_* in terms of PIPE_FORMAT_*.
There are various places in Mesa where we would like to be able to
have a shared format enum between Mesa and gallium (NIR compiler's
image formats, for example, or mapping from gallium's formats to
mesa's and vice versa in st_format.c).  Rewriting all MESA_FORMAT to
PIPE_FORMAT would be disruptive and possibly more work than it's worth
(And I actually prefer MESA_FORMAT's name scheme), so for now just
make it so that there's one shared set of enum values.

The #defines here were generated by printing out from the
tests/st_format.c round-tripping loop, with the exception of 8888
formats where I hand-edited the #defines to point at the corresponding
gallium packed format define.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
d27dda907a mesa: Prepare for the MESA_FORMAT_* enum to be sparse.
To redefine MESA_FORMAT in terms of PIPE_FORMAT enums, we need to fix
places where we iterated up to MESA_FORMAT_COUNT.  I use
_mesa_get_format_name(f) == NULL as the signal that it's not an enum
value with a MESA_FORMAT.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
6b1c250245 mesa/st: Test round-tripping of all compressed formats.
We checked round-tripping of formats without fallbacks, but weren't
setting the compression support flags in the mock context and thus
needed to skip testing those.  Just set all the flags and assert that
no fallbacks are triggered, so we get full test coverage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
80a8021d6c mesa: Stop defining a full separate format for RGBA_UINT8.
We have packed formats for RGBA and ABGR already, so we can just
pack/unpack code.

v2: Rebase on endianness macro rename

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-11-07 19:43:41 +00:00
Eric Anholt
b28eb044cd gallium: Add equivalents of packed MESA_FORMAT_*UINT formats.
These are the last formats that MESA_FORMAT had and PIPE_FORMAT
didn't.  The .csv entries channel sizes and swizzles all came from the
corresponding UNORM format.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
6fab4a7b59 gallium: Add an equivalent of MESA_FORMAT_BGR_UNORM8.
This is the last unorm format that MESA_FORMAT had and PIPE_FORMAT
didn't.  Note that it's an array format on gallium's side as well,
since it's a NPOT pixel size.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
4bbaac3782 gallium: Add some more channel orderings of packed formats.
This covers everything that MESA_FORMAT had for packed unorm.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
6196259d95 gallium: Add defines for FXT1 texture compression.
This texture compression is exposed by 830 and 915, and to make
MESA_FORMAT match PIPE_FORMAT defines I need a corresponding
PIPE_FORMAT.

v2: Set is_hand_written so we don't try to generate pack/unpack code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Eric Anholt
cb9fefe1db mesa/st: Add mapping of MESA_FORMAT_RGB_SNORM16 to gallium.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-07 19:43:41 +00:00
Samuel Pitoiset
deafe4cc58 radv/gfx10: fix primitive indices orientation for NGG GS
The primitive indices have to be swapped to follow the drawing
order.

This fixes corruption with Overwatch when NGG GS is force enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-07 19:21:15 +00:00
Kenneth Graunke
49ee657ef8 Revert "intel/blorp: Fix usage of uninitialized memory in key hashing"
This reverts commit 4432a2d14d.

Pretty much every SKQP test dies with this assertion:
skqp: ../src/mesa/drivers/dri/i965/brw_program_cache.c:102: hash_key: Assertion `item->key_size % 4 == 0' failed.
2019-11-07 09:27:12 -08:00
Danylo Piliaiev
4432a2d14d intel/blorp: Fix usage of uninitialized memory in key hashing
The automatically generated padding in structs contains
undefined values, force pack the structs to eliminate the
padding. Otherwise structs with the same values may generate
different hashes.

Valgrind output:

Conditional jump or move depends on uninitialised value(s)
 util_fast_urem32 (fast_urem_by_const.h:71)
 hash_table_search (hash_table.c:262)
 _mesa_hash_table_search (hash_table.c:296)
 anv_pipeline_cache_search_locked (anv_pipeline_cache.c:318)
 anv_pipeline_cache_search (anv_pipeline_cache.c:335)
 lookup_blorp_shader (anv_blorp.c:38)
 blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1112)
 blorp_mcs_partial_resolve (blorp_clear.c:1205)
 anv_image_mcs_op (anv_blorp.c:1742)
 anv_cmd_predicated_mcs_resolve (genX_cmd_buffer.c:774)
 transition_color_buffer (genX_cmd_buffer.c:1159)
 cmd_buffer_end_subpass (genX_cmd_buffer.c:4840)

Uninitialised value was created by a stack allocation
 blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1103)

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-07 16:02:55 +00:00
Dylan Baker
0013af540d osmesa/tests: Extend render test to cover other working cases
Only the GL_UNSIGNED_BYTE cases actually work, the rest all fail, but we
should test the working cases to ensure that they continue to work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-11-07 06:11:19 -08:00
Dylan Baker
7bfb56a135 gallium/osmesa: Convert osmesa test to gtest
This uses a bunch of additional C++ features for niceness and safety.

Reviewed-by: Brian Paul <brianp@vmware.com>
2019-11-07 06:11:19 -08:00
Dylan Baker
d1767362aa meson: gtest needs pthreads
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-11-07 06:11:19 -08:00
Tomeu Vizoso
072207bc18 panfrost: Pipe the GPU ID into compiler and disassembler
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-07 08:48:45 +00:00
Daniel Schürmann
a47e232ccd aco: workaround Tonga/Iceland hardware bug
The workaround got accidentally moved to the wrong place

Fixes: 08d510010b aco: increase accuracy of SGPR limits

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-07 09:19:50 +01:00
Boris Brezillon
b60ed3c7b2 panfrost: Release the ctx->pipe_framebuffer ref
ctx->pipe_framebuffer contains the last bound FB state, let's release
resources pointed by this FB state when the context is destroyed.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-07 08:33:08 +01:00
Boris Brezillon
8c8e4fd5c6 panfrost: Destroy the upload manager allocated in panfrost_create_context()
pipe->stream_uploader has been allocated with u_upload_create_default()
in panfrost_create_context(), let's destroy it in the context destroy
path.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-07 08:33:08 +01:00
Kai Wasserbäch
ddc588ff71 intel/gen_decoder: Fix unused-but-set-variable warning
This commit fixes the following warning:
../src/intel/common/gen_decoder.c: In function ‘gen_spec_load_from_path’:
../src/intel/common/gen_decoder.c:741:11: warning: variable ‘len’ set but not used [-Wunused-but-set-variable]
  741 |    size_t len, filename_len = strlen(path) + 20;
      |           ^~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-07 11:32:55 +11:00
Kai Wasserbäch
acfea09dbd nir: fix unused function warning in src/compiler/nir/nir.c
This commit fixes the following warning:
../src/compiler/nir/nir.c:1827:1: warning: ‘dest_is_ssa’ defined but not used [-Wunused-function]
 1827 | dest_is_ssa(nir_dest *dest, void *_state)
      | ^~~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-07 11:32:55 +11:00
Kai Wasserbäch
4f8cc032b7 nir: fix unused variable warning in find_and_update_previous_uniform_storage
This commit fixes the following warning:
../src/compiler/glsl/gl_nir_link_uniforms.c: In function ‘find_and_update_previous_uniform_storage’:
../src/compiler/glsl/gl_nir_link_uniforms.c:166:16: warning: unused variable ‘num_blks’ [-Wunused-variable]
  166 |       unsigned num_blks = nir_variable_is_in_ubo(var) ?
      |                ^~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-07 11:32:55 +11:00
Kai Wasserbäch
8aa4d0bff6 nir: fix unused variable warning in nir_lower_vars_to_explicit_types
This commit fixes the following warning:
../src/compiler/nir/nir_lower_io.c: In function ‘nir_lower_vars_to_explicit_types’:
../src/compiler/nir/nir_lower_io.c:1435:22: warning: unused variable ‘supported’ [-Wunused-variable]
 1435 |    nir_variable_mode supported = nir_var_mem_shared | nir_var_shader_temp | nir_var_function_temp;
      |                      ^~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-07 11:32:55 +11:00
Lepton Wu
5a40e153fd gallium: dri2: Use index as plane number.
This fix wrong color when playing video under Android + virgl
configuration.

Fixes: 2decad495f ("gallium/dri2: Support images with multiple planes for modifiers")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-11-06 21:58:28 +00:00
Lionel Landwerlin
c1c346f166 anv: implement VK_KHR_separate_depth_stencil_layouts
v2: Use ternary to simplify code (Jason)

v3: Reorder switch cases to follow existing section ordering (Nanley)
    Add missing comment in cmd_buffer_end_subpass() about new layout (Nanley)

v4: Fix layout comparison for stencil case (Nanley)
    Update a few more comments (Nanley)
    Move VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL_KHR in color
    attachment case for future stencil-CCS support (Nanley)

v5: Missed comments update (Nanley)
    Updated relnotes.txt (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-11-06 20:13:30 +00:00
Eric Anholt
cb655d2554 Revert "ci: Switch over to an autoscaling GKE cluster for builds."
This reverts commit c9df92bf79.

It turns out that gitlab-runner uses kubernetes all wrong, spawning Pods
and sshing into them to run the script instead of Jobs containing the
script to run.  This means that when anything goes wrong with the pod
(autoscale, preemption, VM maintenance, cluster reconfiguration), the job
fails and only sometimes gets handled as a runner system failure.  Even
worse, due to bugs in either the runner or k8s itself, some classes of
timeout-related failure end up not being reported as failures, and the job
will incorrectly report success!

Disable using the "autoscale" cluster until we can do something else
(docker-machine instead of k8s, or the custom third-party k8s-native
runner).

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2019-11-06 11:38:07 -08:00
Tomeu Vizoso
94e6d17043 panfrost: Print the right zero field
Copy paste error.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-11-06 18:13:16 +01:00
Dylan Baker
401d7221ed docs: update calendar, add news item and link release notes for 19.2.2 2019-11-06 09:07:02 -08:00
Dylan Baker
6fb82263d4 docs: add sha256 sum to 19.2.3 release notes 2019-11-06 09:05:58 -08:00
Dylan Baker
d7418d67af docs: add release notes for 19.2.3 2019-11-06 09:05:56 -08:00
Tomeu Vizoso
6469c1a445 panfrost: Generate polygon list manually for SFBD
On clears without draws, the SFBD GPUs need for userspace to generate
the trivial polygon list.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:19:31 +01:00
Tomeu Vizoso
8e1ae5fa14 panfrost: Decode blend shaders for SFBD
Also set MALI_HAS_BLEND_SHADER as needed.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:18:46 +01:00
Tomeu Vizoso
afeda06062 panfrost: Take into account texture layers in SFBD
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:18:46 +01:00
Tomeu Vizoso
9447a84f69 panfrost: Rework format encoding on SFBD
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:18:46 +01:00
Tomeu Vizoso
e40d11ccb2 panfrost: Set 0x10 bit on mali_shader_meta.unknown2_4 on T720
Testing shows that it's needed.

Also remove ctx->is_t6xx as it was the last use of it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:17:13 +01:00
Tomeu Vizoso
23fe7cd2d6 panfrost: Add checksum fields to SFBD descriptor
During tests on T720, these fields were discovered.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:17:13 +01:00
Erik Faye-Lund
bc80900b6c zink: do advertize integer support in shaders
This is supported, so let's correct this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-06 13:43:14 +01:00
Erik Faye-Lund
8920689a58 zink/spirv: implement ball_fequal[2-4] 2019-11-06 13:43:14 +01:00
Erik Faye-Lund
ea2d9b3d38 zink/spirv: implement ball_iequal[2-4] 2019-11-06 13:43:14 +01:00
Erik Faye-Lund
0515ac4571 zink/spirv: implement bany_inequal[2-4] 2019-11-06 13:43:14 +01:00
Erik Faye-Lund
c18c81edc6 zink/spirv: implement bany_fnequal[2-4] 2019-11-06 13:43:14 +01:00
Erik Faye-Lund
4e0ca477d8 zink/spirv: support loading bool constants
Seems I missed this before; let's add support for this.
2019-11-06 13:43:14 +01:00
Erik Faye-Lund
6630baecf1 zink/spirv: drop temp-array for component-count 2019-11-06 13:43:14 +01:00
Michel Dänzer
e0fff37f70 gitlab-ci: Don't build libdrm for ARM
The Debian packages work fine. Saves a little bit of time and disk
space.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-06 13:10:07 +01:00
Michel Dänzer
b4d3ae2269 gitlab-ci: Use separate arm64 build/test docker images
The image used for test jobs is only about 1/6 as big as before, which
may help avoid some issues with some of the test boards.

Inspired by https://gitlab.freedesktop.org/mesa/mesa/issues/2046 .

v2:
* Leave LIBDRM_VERSION at 2.4.99 (Daniel Stone)
* Delete more build artifacts from dEQP tree (Daniel Stone)
v3:
* Set LD_LIBRARY_PATH for ldd

Acked-by: Daniel Stone <daniels@collabora.com> # v2
Reviewed-by: Eric Anholt <eric@anholt.net> # Except for the ldd line
2019-11-06 13:10:07 +01:00
Erik Faye-Lund
dd4587b55c zink: use u_blitter when format-reinterpreting 2019-11-06 11:37:36 +00:00
Erik Faye-Lund
7b9d17fe84 zink: always allow sampling of images
This is required if we're going to blit from/to it using u_blitter.
2019-11-06 11:37:36 +00:00
Erik Faye-Lund
1277192d55 zink: transition resources before resolving 2019-11-06 11:37:36 +00:00
Erik Faye-Lund
b385ad0c75 zink: disable fragment-shader texture-lod
We don't support nir_texop_txd, which is required by this cap. So let's
disable it for now.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
2019-11-06 11:37:36 +00:00
Duncan Hopkins
aa64b6dc7f zink: make sure src image is transfer-src-optimal
Fixes: d2bb63c8d4 ("zink: Use optimal layout instead of general. Reduces valid layer warnings. Fixes RADV image noise.")
2019-11-06 11:37:36 +00:00
Erik Faye-Lund
a32a92f53a zink: do not advertize coherent mapping
We do not support them yet, so let's not pretend.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
2019-11-06 11:37:36 +00:00
Erik Faye-Lund
ca87a53b46 zink: always allow mutating the format
There's no good way to know if a texture-view will be created, so we
just have to accept it for all resources.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
2019-11-06 11:37:36 +00:00
Erik Faye-Lund
f3a72fd61c zink: use actual format for render-pass
We should use the format derived from the image-view here, not from the
image itselt. Otherwise, we'll end up with incompatible render-passes.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
2019-11-06 11:37:36 +00:00
Pierre-Eric Pelloux-Prayer
21be5c8edd radeonsi: fix shader disk cache key
Use unsigned values otherwise signed extension will produce a 64 bits value where
the 32 left-most bits are 1.

Fixes: 2afeed3010 ("radeonsi: tell the shader disk cache what IR is used")
2019-11-06 10:15:37 +01:00
Samuel Pitoiset
fb07fd4e6c radv: implement VK_EXT_subgroup_size_control
This extension allows to control the subgroup size by allowing a
varying subgroup size and also specifying a required subgroup size.

This implementation only allows to specify a required subgroup
size for compute shaders because there is some caveats with
other shader stages (eg. NGG with geometry shader). This
basically allows apps to use Wave32 for compute shaders.

This extension is enabled for all chips but only GFX10 supports
Wave32. ACO doesn't support it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 09:20:39 +01:00
Samuel Pitoiset
da6c30f9f6 radv: rely on shader's wavesize when computing NGG info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 09:20:36 +01:00
Samuel Pitoiset
d3f9957de4 radv: determine shaders wavesize at pipeline level
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 09:20:34 +01:00
Samuel Pitoiset
d1e1f7c4d5 radv: hardcode the number of waves for the GFX6 LS-HS bug
It's always 64.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 09:20:32 +01:00
Samuel Pitoiset
f010b90ac5 radv/gfx10: enable wave32 for compute based on shader's wavesize
This will allow to change wavesize on-demand.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 09:20:30 +01:00
Samuel Pitoiset
c0f76528ae nir: fix packing of nir_variable
The maximum number of descriptor sets is indeed 32 but without
the sign bit.

The maximum number of bindings for RADV is way larger, keep it
as 32-bit.

Fixes: 96e6ef80d9 ("nir: pack the rest of nir_variable::data")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2019-11-06 08:51:53 +01:00
Samuel Pitoiset
0b3bd1a7c2 radv: fix 32-bit compiler warnings
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2031
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 08:00:33 +01:00
Samuel Pitoiset
50b3ec35d2 radv: add a note about perftest/debug options
Now that all environment variables are documented, it would be
appreciated if we can keep this up-to-date.

[skip ci]

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 07:58:33 +01:00
Samuel Pitoiset
cc66976d0a docs: document all RADV environment variables
Requested by https://gitlab.freedesktop.org/mesa/mesa/issues/2022

[skip ci]

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 07:58:22 +01:00
Marek Olšák
8145492f4a nir/serialize: pack nir_variable flags
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-05 23:35:31 -05:00
Marek Olšák
3aa72a394a nir/serialize: store 32-bit object IDs instead of 64-bit
That means we have only 30 bits for object IDs, because 2 bits are
sometimes used for something else.

This decrease the uncompressed shader size for the biggest Borderlands 2
shader from 33.6 KB to 23.2 KB. (31% decrease)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-05 23:35:31 -05:00
Marek Olšák
d5768fcd45 nir/serialize: don't expand 16-bit variable state slots to 32 bits
the swizzle also needs only 16 bits

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-05 23:35:31 -05:00
Marek Olšák
96e6ef80d9 nir: pack the rest of nir_variable::data
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-05 23:32:34 -05:00
Marek Olšák
442ef8c3e3 radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector
This decreases memory usage, because serialized NIR is more compact.

The main shader part is compiled from nir_shader.
Monolithic shader variants are compiled from nir_binary.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:45 -05:00
Marek Olšák
abb8011f9d radeonsi: don't keep compute shader IR after compilation
not needed. We also need to free TGSI in the destroy function for the case
when an app is terminated and si_create_compute_state_async is never
executed because of util_queue_drop_job.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:43 -05:00
Marek Olšák
62229e8949 radeonsi: use IR SHA1 as the cache key for the in-memory shader cache
instead of using whole IR binaries. This saves some memory.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:42 -05:00
Vasily Khoruzhick
65a5b24aee lima: add support for gl_PointSize
GP handles gl_PointSize similar to gl_Position, i.e. it needs
separate buffer and it has special type in varying descriptors, also
for indexed draw we need to emit special PLBU command to pass
address of gl_PointSize buffer.

Blob also clamps gl_PointSize to 1 .. 100 (as well as line width),
so let's do the same.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-11-05 17:44:56 -08:00
Eric Engestrom
73cc2fec10 mesa/imports: let the build system detect strtok_r()
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2013
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-11-05 22:38:04 +00:00
Eric Engestrom
66dd53584e meson: require nm again on Unix systems
This was made optional in ff9bf223c2 ("meson: make nm binary optional")
for Windows, but proper windows has been added and `nm` is now only used
on Unix systems.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviwed-by: Dylan Baker <dylan@pnwbakers>
2019-11-05 20:56:44 +00:00
Eric Engestrom
4d5cde1fff meson: add windows support to symbols checks
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviwed-by: Dylan Baker <dylan@pnwbakers>
2019-11-05 20:31:37 +00:00
Eric Engestrom
2f652e0b36 meson: move the generic symbols check arguments to a common variable
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviwed-by: Dylan Baker <dylan@pnwbakers>
2019-11-05 20:30:47 +00:00
Eric Engestrom
2c4395e61c meson: add variable to control the symbols checks
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviwed-by: Dylan Baker <dylan@pnwbakers>
2019-11-05 20:12:32 +00:00
Pierre-Eric Pelloux-Prayer
67718ca352 mesa: fix call to _mesa_lookup_vao_err
Fixes: 3e842a0b0e ("mesa: rework _mesa_lookup_vao_err to allow usage from EXT_dsa")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2055
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-11-05 12:05:33 -08:00
Dylan Baker
5d085ad052 meson: Add dep_glvnd to egl deps when building with glvnd
Otherwise if glvnd is not installed systemwide, but only in a prefix,
it's headers wont be found. This happens because if it's headers are in
/usr/include/ then another dependence will provide the necessary -I
arguments and compilation will work.

Fixes: 035ec7a2bb
       ("meson: Add support for EGL glvnd")
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:44:41 +00:00
Dylan Baker
9020f519d2 util/u_endian: Add error checks
As suggested by Eric Engestrom and Michel Dänzer.
2019-11-05 16:39:55 +00:00
Dylan Baker
ee4f1bc187 util: rename PIPE_ARCH_*_ENDIAN to UTIL_ARCH_*_ENDIAN
As requested by Tim.

This was generated with:
grep 'PIPE_ARCH_.*_ENDIAN' -rIl | xargs sed -ie 's@PIPE_ARCH_\(.*\)_ENDIAN@UTIL_ARCH_\1_ENDIAN@'g

v2: - add this patch

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
6b6897a9f9 gallium/osmesa: Use PIPE_ARCH_*_ENDIAN instead of little_endian function
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
39b9fe03a9 mesa/main: delete now unused _mesa_little_endian
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
f73a9c6586 mesa/swrast: replace instances of _mesa_little_endian with preprocessor
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
453d52acd8 mesa/main: replace uses of _mesa_little_endian with preprocessor
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
f9f60da813 util/u_endian: set PIPE_ARCH_*_ENDIAN to 1
This will allow it to be used as a drop in replacement for
_mesa_little_endian in a number of cases.

v2: - Always define PIPE_ARCH_LITTLE_ENDIAN and PIPE_ARCH_BIG_ENDIAN,
      define the one that reflects the host system to 1 and the other to 0
    - replace all uses of #ifdef, #ifndef, and #if defined() with #if
      and #if ! with PIPE_ARCH_*_ENDIAN

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
37e54736a7 util/u_endian: Use _WIN32 instead of _MSC_VER
_WIN32 is defined by basically all windows compilers (MSVC, ICL, MinGW),
wereas _MSC_VER is not defined by MinGW. Without this change MinGW falls
through and doesn't define PIPE_ARCH at all, and is caught by some extra
code in gallium.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
cb0dbdd369 dri/osmesa: use preprocessor for selecting endian code paths
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
68d8c1f971 r100: Use preprocessor to select big vs little endian paths
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Dylan Baker
a550b6b7f8 r200: use preprocessor for big vs little endian checks
Instead of using a function at runtime we can just build the right code
for the right platform.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-11-05 16:39:55 +00:00
Philipp Sieweck
38e706656d svga: check return value of define_query_vgpu{9,10}
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-11-05 16:23:39 +00:00
Tomeu Vizoso
427d0c4b6a gitlab-ci: Run only LAVA jobs in special-named branches
Run only jobs needed for testing on LAVA devices if a branch starts with
lava-ci-.

This allows developers to have faster test cycles as these pipelines
take only a bit above 8 minutes. Also has the advantage of conserving
resources.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-05 16:09:47 +01:00
Pierre-Eric Pelloux-Prayer
febedee4f6 mesa: add EXT_dsa glGetVertexArray* 4 functions
The implementation doesn't share much with get.c because:
  * the refactoring needed for get.c to not depend on ctx->Array.VAO would
    be quite large
  * glGetVertexArray* would still need to filter pname to only accept the one
    specified by the spec
  * these functions are getter, the implementation is trivial (the complexity
    is in the correct filtering of pname input)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
2b44ca779b mesa: extract helper function from _mesa_GetPointerv
Will be used by EXT_dsa gllGetVertexArrayPointervEXT implementation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
5adeff8033 mesa: add EXT_dsa EnableVertexArrayAttribEXT / DisableVertexArrayAttribEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
f793a8663d mesa: add EXT_dsa glEnableVertexArrayEXT / glDisableVertexArrayEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
a053361793 mesa: add gl_vertex_array_object parameter to client state helpers
This will allow to use the same helper for the EXT_direct_state_access
implementation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
aef5d99671 mesa: add EXT_dsa glVertexArray* functions implementation
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
a78d4e7e75 mesa: add vao/vbo lookup helper for EXT_dsa
Add a single helper dealing with the lookup of both the vao
and the vbo to avoid duplicating this code in all the
glVertexArray* functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
3e842a0b0e mesa: rework _mesa_lookup_vao_err to allow usage from EXT_dsa
ARB_dsa and EXT_dsa slightly differs when an uninitialized VAO
is requested.
In this case ARB_dsa fails while EXT_dsa requires to initialize
the object.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
a26bb93943 mesa: add EXT_dsa glVertexArray* functions declarations
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer
bfc1e4c112 mesa: pass vao as a function paramter
This change will allow reusing the same function for the
EXT_direct_state_access implementation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-05 13:58:28 +01:00
Michel Dänzer
d80dece065 gitlab-ci: Set arm job CCACHE_DIR properly
$PWD doesn't work for variables:, it ended up as "/ccache", always
starting with an empty cache.

v2:
* Use relative path and realpath
v3:
* Use $CI_PROJECT_DIR (Eric Anholt)
* Clear ccache stats in before_script if the cache is in $CI_PROJECT_DIR

Fixes: c9df92bf79 "ci: Switch over to an autoscaling GKE cluster for
                     builds."
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-11-05 09:27:32 +01:00
Kenneth Graunke
337f58438e nir: Handle image arrays when setting variable data
Fixes a ton of regressions in image load store tests.

Fixes: 4319cc8c0f ("nir: pack nir_variable::data::xfb_*")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 18:16:06 -08:00
Paulo Zanoni
b57383a944 intel/compiler: remove the operand restriction for src1 on GLK
Commit 5847de6e9a implemented a restriction that applies to ICL, but
wrongly marked it as also applying to GLK. Reviewers or MR !1125
pointed this, and the commit history shows removal of GLK to parts of
the patch, but it turns there was still a left-over GLK check in the
code.

This code was breaking some of the i8vec2 tests on GLK, for example:
  dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2

Removing the GLK check solves the issue for GLK. I don't see a reason
on why implementing this restriction would actually break GLK, so
there's still more to investigate here since this bug may be affecting
ICL+, but let's apply the real GLK fix while we analyze and discuss
the other possible issues.

Fixes: 5847de6e9a ("intel/compiler: don't use byte operands for src1
on ICL")
BSpec: 3017
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-11-05 00:08:34 +00:00
Marek Olšák
4319cc8c0f nir: pack nir_variable::data::xfb_*
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 18:17:34 -05:00
Marek Olšák
08dc541b66 nir: pack nir_variable::data::stream
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 18:17:34 -05:00
Ian Romanick
9be4a422a0 nir/algebraic: Mark other comparison exact when removing a == a
This prevents some additional optimizations that would change the
original result.  This includes things like (b < a && b < c) => b <
min(a, c) and !(a < b) => b >= a.  Both of these optimizations were
specifically observed in the piglit tests added in piglit!160.

This was discovered while investigating
https://gitlab.freedesktop.org/mesa/mesa/issues/1958.  However, the
problem in that issue was Chrome or Angle is replacing calls to isnan()
with some stuff that we (correctly) optimize to false.  If they had left
the calls to isnan() alone, everything would have just worked.

No shader-db changes on any Intel platform.

I also tried marking the comparison generated by the isnan() function
precise.  The precise marker "infects" every computation involved in
calculating the parameter to the isnan() function, and this severely
hurt all of the (few) shaders in shader-db that use isnan().

I also considered adding a new ir_unop_isnan opcode that would implement
the functionality.  During GLSL IR-to-NIR translation, the resulting
comparison operation would be marked exact (and the samething would need
to happen in SPIR-V translation).

This approach taken by this patch seemed easier, but we may want to do
the ir_unop_isnan thing anyway.

Fixes: d55835b8bd ("nir/algebraic: Add optimizations for "a == a && a CMP b"")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 14:05:49 -08:00
Ian Romanick
ea19f2fb68 nir/algebraic: Add the ability to mark a replacement as exact
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 14:05:49 -08:00
Marek Olšák
af94600484 compiler: make variable::data::binding unsigned
Nothing seems to set a negative value.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 16:49:46 -05:00
Marek Olšák
4b4b383f38 st/mesa: call nir_lower_flrp only once per shader
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 16:49:44 -05:00
Marek Olšák
7d00218aed st/mesa: call nir_opt_access only once
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 16:49:42 -05:00
Leo Liu
352b57d709 ac: add missing Arcturus to the info of pc lines
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
2019-11-04 16:27:35 -05:00
Alyssa Rosenzweig
4da648a170 panfrost/ci: Update T760 expectations
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
12d071024b pan/midgard: Extend default_phys_reg to !32-bit
We can pass through a size.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
762623381d pan/midgard: Extend swizzle packing for vec4/16-bit
We would like to pack not just xyzw swizzles but also efgh swizzles.
This should work for vec4/16-bit. More work will be needed to pack
swizzles for vec8/16-bit and even more work for 8-bit, of course.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
bf5508f7b9 pan/midgard: Extend offset_swizzle to non-32-bit
We take a size parameter; use it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
f538981384 pan/midgard: offset_swizzle doesn't need dstsize
This argument should be omitted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
9eac9389fb pan/midgard: Add bizarre corner case
Someone really needs to look into this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
4ae4d82e21 pan/midgard: Compute bundle interference
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
45ac8ea8bd pan/midgard: Fix quadword_count handling
Spilling can mess with this considerably.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
0a77dd3203 pan/midgard: Validate tags when branching
Midgard prefetches instructions based on tag (ALU, LD/ST, texture *
size). To do so, the shader descriptor specifies the tag of the first
instruction, all instructions specify the tag of the next linear
instruction is, and all branches explicitly specify the tag of the
branch target.

If you mess this up, you get an INSTR_TYPE_MISMATCH, which unambiguously
refers to this problem, but it's still annoying to try to work out all
the branch targets in your head to debug.

Instead, let's track the tags of various blocks over time, so we can
automatically validate tags of branch targets, to make
INSTR_TYPE_MISMATCH issues immediately obvious in a disassembly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Daniel Schürmann
efe737fc4f aco: fix accidential reordering of instructions when scheduling
Fixes: 8678699918 "aco: implement VGPR spilling"

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-04 20:14:14 +01:00
Daniel Schürmann
5c7dcb15e0 aco: only use single-dword loads/stores for spilling
Fixes: 8678699918 "aco: implement VGPR spilling"

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-04 20:14:14 +01:00
Daniel Schürmann
d97c0bdd55 aco: fix immediate offset for spills if scratch is used
Fixes: 8678699918 "aco: implement VGPR spilling"

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-04 20:14:14 +01:00
Lionel Landwerlin
ee6fbb95a7 anv: Properly handle host query reset of performance queries
The host query reset entry point didn't use the availability offset
for performance queries.

To fix this, reorder the availability of performance queries to match
other queries.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2b5f30b1d9 ("anv: implement VK_INTEL_performance_query")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-04 19:04:38 +00:00
Paul Gofman
ecc31d032e state_tracker: Handle texture view min level in st_generate_mipmap()
Signed-off-by: Paul Gofman <gofmanp@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-11-04 13:24:31 -05:00
James Xiong
b6d45e7f74 iris: try to set the specified tiling when importing a dmabuf
When importing a dmabuf with a specified tiling, the dmabuf user
should always try to set the tiling mode because: 1) the exporter
can set tiling AFTER exporting/importing. 2) a dmabuf could be
exported from a kernel driver other than i915, in this case the
dmabuf user and exporter need to set tiling separately.

This patch fixes a problem when running vkmark under weston with
iris on ICL, it crashed to console with the following assert. i965
doesn't have this problem as it always tries to set the specified
tiling mode.

weston: ../src/gallium/drivers/iris/iris_resource.c:990: iris_resource_from_handle: Assertion `res->bo->tiling_mode == isl_tiling_to_i915_tiling(res->surf.tiling)' failed.

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-11-04 17:59:52 +00:00
Kenneth Graunke
fc7b748086 iris: Fix "Force Zero RTA Index Enable" setting again
In 2ca0d913ea, we began updating cso_fb->layers to the actual layer
count, rather than 0.  This fixed cases where we were setting "Force
Zero RTA Index Enable" even when doing layered rendering.  Sadly, it
also broke the check entirely: cso_fb->layers is now 1 for non-layered
cases, but the Force Zero RTA Index check was still comparing for 0.

Fixes: 2ca0d913ea ("iris: Fix framebuffer layer count")
2019-11-04 08:57:37 -08:00
Dylan Baker
717606f9f3 nir: correct use of identity check in python
Python has the identity operator `is`, and the equality operator `==`.
Using `is` with strings sometimes works in CPython due to optimizations
(they have some kind of cache), but it may not always work.

Fixes: 96c4b135e3
       ("nir/algebraic: Don't put quotes around floating point literals")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-11-04 16:06:39 +00:00
Boris Brezillon
28440820ef panfrost: MALI_DEPTH_TEST is actually MALI_DEPTH_WRITEMASK
MALI_DEPTH_TEST should only be set when depth->writemask is true,
not when the depth test is enabled. Let's rename the flag and patch
panfrost_bind_depth_stencil_state() to do the right thing.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 16:14:09 +01:00
Lionel Landwerlin
71634b1003 vulkan: bump headers/registry to 1.1.127
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-04 17:07:11 +02:00
Samuel Pitoiset
9ab27647ff radv: fix compute pipeline keys when optimizations are disabled
If an app first creates a compute pipeline with
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT set, then re-compile it
without that flag, the driver should re-compile the compute shader.
Otherwise, it will return the unoptimized one.

Fixes: ce188813bf ("radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-04 08:50:00 +01:00
Karol Herbst
538d2c33b8 nv50/ir: fix crash in isUniform for undefined values
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-11-03 01:02:52 +01:00
Lionel Landwerlin
88d665830f mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-02 09:14:26 +00:00
Vasily Khoruzhick
103378f332 lima: set dithering flag when necessary
Bit 13 in aux1 enables dithering

Reviewed-by: Qiang.Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-11-01 21:44:31 -07:00
Marek Olšák
c236e6c1e3 glsl: encode struct/interface types better
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-01 19:19:03 -04:00
Marek Olšák
5dde2aa8d9 glsl: encode array types better
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-01 19:19:03 -04:00
Marek Olšák
c141366560 glsl: encode explicit_stride for basic types better
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-01 19:19:03 -04:00
Marek Olšák
86adce4fef glsl: encode vector_elements and matrix_columns better
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-01 19:19:03 -04:00
Marek Olšák
21d2fbb8c3 glsl: encode/decode types using a union with bitfields for readability
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-01 19:19:03 -04:00
Vasily Khoruzhick
dd52744201 lima: ignore flags while looking for BO in cache
Any BO would work, we don't have any BO types yet anyway. Moreover
lima_submit_add_bo() changes BO flags so they won't match allocation
flags.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-11-01 13:12:07 -07:00
Vasily Khoruzhick
ae0b05d8db lima: align size before trying to fetch BO from cache
Otherwise we may be looking in wrong bucket

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-11-01 13:12:03 -07:00
Vasily Khoruzhick
08d6416a1d lima: add debug prints for BO cache
LIMA_DEBUG=bocache now activates debug prints for BO allocation,
destruction and BO cache.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-11-01 13:11:47 -07:00
Alyssa Rosenzweig
b32caa6f1f pan/midgard: Use fp32 blend shaders
Clearly we do want to have fp16 at some point ... but I kind of give up
debugging and it turns out the issues with fp16 support in 'frost are so
deeply rooted that I might as well disable this non-opt and land
LCRA now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 13:47:52 -04:00
Bas Nieuwenhuizen
8efb8f55a6 radv: Close all unnecessary fds in secure compile.
The seccomp filter allows read/write, let us make sure nobody can
do anything with this.

Fixes: cff53da374 "radv: enable secure compile support"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-01 17:15:34 +01:00
Erik Faye-Lund
dd77bdb34b anv: remove incorrect polygonMode=point early-out
This is incorrect, because polygonMode only applies if the final
primitive type is a polygon; polygonMode doesn't apply to
line-primitives as the comment suggests.

The Vulkan 1.1 spec, section 26.11, "Polygons" defines that polygons are
separate from points and line segments:

" A polygon results from the decomposition of a triangle strip, triangle
  fan or a series of independent triangles. Like points and line segments,
  polygon rasterization is controlled by several variables in the
  VkPipelineRasterizationStateCreateInfo structure. "

Further, section 26.11.2, "Polygon Mode", only define polygonMode to
apply to polygons:

" Possible values of the VkPipelineRasterizationStateCreateInfo::polygonMode
  property of the currently active pipeline, specifying the method of
  rasterization for polygons, are: "

This seems to clearly define that polygonMode doesn't apply to points
and lines, so let's make sure that we don't early out with the wrong
value.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-01 07:26:03 +00:00
Alyssa Rosenzweig
c3a46e7644 pan/midgard: Eliminate blank_alu_src
We don't need it in practice, so this is some more cleanup.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig
70072a20e0 pan/midgard: Refactor swizzles
Rather than having hw-specific swizzles encoded directly in the
instructions, have a unified swizzle arary so we can manipulate swizzles
generically.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig
e7fd14ca8a pan/midgard: Add a dummy source for loads
We want symmetry between loads and stores, so we add a dummy source. So
we get, e.g.

   st_int4 _,    val, arg_1, arg_2
   ld_int4 dest,   _, arg_1, arg_2

Semantically, this dummy source represents the data itself, as if the
load is simply a move. That means it has a swizzle that acts as a
source.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig
b5938be51d pan/midgard: Remove OP_IS_STORE_VARY
Unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 01:01:46 +00:00
Timothy Arceri
1c2bf82d24 glsl: disable lower_fragdata_array() for NIR drivers
This function was added in 7e414b5864 to work around a defect in
lower_output_reads(). As of the previous commit no NIR driver calls
lower_output_reads().

This change means we don't need the special GLSL IR style
gl_FragData handling for building the resource list in a NIR based
linker.

No shader-db change on SKL i965.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-01 11:33:54 +11:00
Timothy Arceri
0e186c18ba glsl: just use NIR to lower outputs when driver can't read outputs
This will allow us to stop lowering gl_FragData in GLSL IR for NIR
drivers which means we won't need the special GLSL IR type
handling for building the resource list in a NIR based linker.

i965 has been doing this since b828f7a27b.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-01 11:33:33 +11:00
Icenowy Zheng
8fa13db251 lima: support indexed draw with bias
When doing an indexed draw with index_bias set to a non-zero value (e.g.
by glDrawElementsBaseVertex), the vertex buffer should be offseted by
index_bias vertices.

Add this offset when setting the vertex buffer address.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-10-31 21:56:45 +00:00
Jason Ekstrand
f60ef0fff4 anv: Move the RT BTI flush workaround to begin_subpass
Now that we're no longer compacting binding table entries, the only time
they can possibly change is when we actually switch subpasses.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-31 21:07:15 +00:00
Jason Ekstrand
6a8f43030c anv: Stop compacting render targets in the binding table
Instead, always emit one entry for every color attachment in the subpass
or one NULL if there are no color attachments.  This will let us adjust
an Ice Lake workaround so we don't get a stall on every draw call.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-31 21:07:15 +00:00
Jason Ekstrand
c765e2156a anv: Don't claim the null RT as a valid color target
If it's NULL, we can let the compiler go ahead and delete it or flag it
as NULL.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-31 21:07:15 +00:00
Jason Ekstrand
df7a730b4f anv: Don't delete fragment shaders that write sample mask
Also, use color_outputs_valid rather than nr_color_outputs since it
should be a bit more accurate.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-31 21:07:15 +00:00
Yevhenii Kolesnikov
265e4d9432 glsl: Enable textureSize for samplerExternalOES
From OES_EGL_image_external_essl3

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1901

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-31 20:23:56 +00:00
Eric Anholt
c9df92bf79 ci: Switch over to an autoscaling GKE cluster for builds.
The GKE pool we're using is 1-3 32-core VMs, preemptible (to keep
costs down), with 8 jobs concurrent per system.  We have plenty of
memory (4G/core), so we run make -j8 to try to keep the cores busy even
when one job is in a single-threaded step (docker image download, git
clone, artifacts processing, etc.)  When all jobs are generating work
for all the cores, they'll be scheduled fairly.

The nodes in the pool have 300GB boot disks (over-provisioned in space
to provide enough iops and throughput) mounted to /ccache, and
CACHE_DIR set pointing to them.  This means that once a new
autoscaled-up node has run some jobs, it should have a hot ccache from
then on (instead of having to rely on the docker container cache
having our ccache laying around and not getting wiped out by some
other fd.o job).  Local SSDs would provide higher performance, but
unfortunately are not supported with the cluster autoscaler.

For now, the softpipe/llvmpipe test runs are still on the shared
runners, until I can get them ported onto Bas's runner so they can be
parallelized in a single job.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-31 11:19:43 -07:00
Eric Anholt
da6cc72237 ci: Make lava inherit the ccache setup of the .build script.
It was just duplicating the code.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-31 11:19:43 -07:00
Eric Engestrom
6e21dcc5a3 meson: revert glvnd workaround
This effectively reverts MR !2112.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 17:09:59 +00:00
Eric Engestrom
0f201e9dbc meson: require glvnd 1.2.0
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 17:09:59 +00:00
Eric Engestrom
9b58ab803d gitlab-ci: build a recent enough version of GLVND (ie. 1.2.0)
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 17:09:59 +00:00
Eric Engestrom
c32236811d meson: move idep_xmlconfig_headers to xmlpool/
That's where `xmlpool_options_h` is defined, and this way we can make sure
nobody starts making use of it in the future :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 16:03:57 +00:00
Jason Ekstrand
02d9403067 anv: Use the new BO alloc API for Android
Fixes: a44f5ee0d8 "anv: Rework the internal BO allocation API"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 15:46:39 +00:00
Erik Faye-Lund
b7674829a1 zink: emit line-width when using polygon line-mode
When switching this to dynamic state, I forgot that this also needs to
be emitted when we use a polygon-mode set to lines.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 6d30abb4f1 ("zink: use dynamic state for line-width")
2019-10-31 15:38:21 +00:00
Eric Engestrom
fbb98ae0ed radeon: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
2c9898a329 r200: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
ea36ddae1e nouveau: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
4c5c31a651 i915: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
039797bef9 dri: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
5774abe725 targets/xvmc: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
ad8cd21def targets/xa: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
ec2555a3d6 targets/vdpau: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
8be89b4319 targets/va: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
375094c70b targets/omx: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
71ca5fb68a loader: replace xmlpool_options_h with idep_xmlconfig_headers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
0bd6fc0a84 pipe-loader: drop unnecessary xmlpool_options_h
idep_xmlconfig already covers that

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
a2eba4b17d radv: drop unnecessary xmlpool_options_h
idep_xmlconfig already covers that

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
791ece114e anv: add missing xmlconfig headers dependency
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Eric Engestrom
4072b3360b meson: split out idep_xmlconfig_headers from idep_xmlconfig
A bunch of components need the former but not the latter.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-31 15:29:06 +00:00
Alyssa Rosenzweig
bf15318991 pipe-loader: Build kmsro loader for with all kmsro targets
Build failure reported by i965 CI, triggered by building dynamic
pipeloaders with kmsro drivers (besides 'frost). At this point, there's
no reason to actually do that -- mesa CI didn't mind -- but let's not
break the build.

v2: Simplify script. Add extra dependencies for v3d.

Fixes: afb0d08cb0 ("pipe-loader: Default to kmsro if probe fails")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Clayton Craft <clayton.a.craft@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-10-31 15:26:10 +00:00
Erik Faye-Lund
5ea787950f zink: heap-allocate samplers objects
VkSampler is 64-bit even on 32-bit systems, so casting it to a pointer
is a bad idea there. So let's heap-allocate the sampler-object instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2017
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-10-31 13:57:43 +00:00
Jason Ekstrand
0ca0ad1252 anv: Zero released anv_bo structs
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
b3c0b1b218 anv: Use a bitset for tracking residency
Now that we can conveniently map between GEM handles and struct anv_bo
pointers, we can use a simple bitset for residency tracking instead of
the complex hash set.  This shaves about 3% off of a CPU-limited example
running with the Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
9ef198c59a anv: Set the batch allocator for compute pipelines
Otherwise relocations just up and crash.

Fixes: a3153162a9 "anv: Delay allocation of relocation lists"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
9f665d9c1c anv: Add a device parameter to anv_execbuf_add_bo
We're about to start needing to lookup BO pointers by GEM handle so we
need access to the device.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
63d7a38630 anv: Drop anv_bo_init and anv_bo_init_new
BOs are now only ever allocated through the BO cache so there's no need
to have these exposed.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
853d3b59fd anv: Allocate misc BOs from the cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
d0ec55d5a3 anv: Allocate scratch BOs from the cache
While we're here, we get rid of the locking and use a lock-free
algorithm.  The chances of spilling contention are low and this is
actually a bit simpler in some ways.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
ee77938733 anv: Allocate batch and fence buffers from the cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
e4f01eca3b util: Add a free list structure for use with util_sparse_array
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
0a6d2593b8 anv: Allocate descriptor buffers from the BO cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
e0ee23660f anv: Set more flags on descriptor pool buffers
the ASYNC flag, in particular, has the potential to help performance
because it means less sync tracking in the kernel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
c3eb4b3ba5 anv: Allocate query pool BOs from the cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
0d2787f7c9 anv: Use the query_slot helper in vkResetQueryPoolEXT
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
3119b96bdf anv: Allocate block pool BOs from the cache
This commit switches block pools over to being allocated from the BO
cache rather than being allocated manually by the block pool.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
cc972d72c7 anv/tests: Initialize the BO cache and device mutex
We're about to start depending on the BO cache in the state and block
pools so we need them properly initialized for the tests to work.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
9076e9f375 anv/tests: Zero-initialize instances
Some of the tests were actually relying on some of those uninitialized
bits to be non-zero.  In particular, a couple want use_softpin = true.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
5c664dff75 anv: Choose BO flags internally in anv_block_pool
All block pools are allocated with the same flags.  There's no good
reason why it needs to be configurable.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
a44f5ee0d8 anv: Rework the internal BO allocation API
This makes a number of changes to the current API:

 1. Everything is renamed to anv_device_* instead of anv_bo_cache_*
    because the BO cache is soon going to be the sole BO allocation path
    and not some special case to make import/export work.

 2. Drop the cache parameter.  It's totally redundant with the device
    and just annoying to keep typing.

 3. Rework flags so that they go the convenient direction for usage in
    ANV rather than whichever awkward way the i915 specified it to
    maintain backwards compatibility.  This also gives us the
    opportunity to set some defaults.

 4. Add flags for mapping and coherency.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
1be2e4c0ef anv: Use anv_block_pool_foreach_bo in get_bo_from_pool
While we're at it, use gen_48b_address().

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
3178e583c8 anv: Rework anv_block_pool_expand_range
The growing algorithms for the softpin case and the userptr version are
almost entirely different.  Having this weird join doesn't make the code
more comprehensible.  This rework does a few things:

 1. Move the comment about 48-bit addresses to anv_device_init where we
    actually unset the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag.

 2. Separate the paths in anv_block_pool_expand_range so it's easier to
    see what happens in the two different cases.

 3. Use the anv_block_poo::bos array for storing all allocated BOs in
    both paths rather than using the cleanup list in both paths.  This
    lets us make the cleanups array only used for mmaps of the memfd for
    the userptr case.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
bb257e1852 anv: Fix a potential BO handle leak
Fixes: 731c4adcf9 "anv/allocator: Add support for non-userptr"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
6f4fa81769 anv: Handle state pool relocations using "wrapper" BOs
Instead of depending on a mutable BO in the state pool for handling
growing state pools, add a concept of "wrapper" BOs which just wrap an
actual BO.  This way, the wrapper can exist once for all of time and we
can put it in relocation lists even if the actual BO it references gets
swapped out.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
b781c85c79 anv: Replace ANV_BO_EXTERNAL with anv_bo::is_external
We're not THAT strapped for space that we can't burn one extra bit for
a boolean.  If we're really worried about it, we can always shrink the
flags field to 16 bits because the kernel only uses 7 currently.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
5534358ef6 anv: Inline anv_block_pool_get_bo
It has exactly one caller and we're about to change some of the dynamics
which would make this confusing as a separate function.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
c0a4722f29 anv: Declare the bo in the anv_block_pool_foreach_bo loop
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
325345b2bd anv: Stop storing the GEM handle in anv_reloc_list_add
We have to go through and rewrite them all anyway so it doesn't do us
any good to put them in the list in anv_reloc_list_add.  Also, for state
pools the handles are likely wrong by the time vkQueueSubmit is called.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
c4be72934e anv: Fix a relocation race condition
Previously, we would read the offset from the BO in anv_reloc_list_add
to generate the presumed offset and then again in the caller to compute
the 64-bit address to write into the buffer.  However, if the offset
somehow changed between these two points, the presumed offset would no
longer match the written offset.  This is unlikely to actually ever be a
problem in practice because the presumed offset gets recorded first and
so if the written address is wrong then the presumed offset is almost
certainly wrong and the relocation will trigger.  However, it's much
safer to simply have anv_reloc_list_add return the 64-bit address.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
bbf389013f anv: Use a util_sparse_array for the GEM handle -> BO map
This lets us do less allocation because the anv_bo's are now embedded in
the sparse array and it also allows lock-free translation from GEM
handle to BO which will be useful in future commits.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
821ce7be36 anv: Move refcount to anv_bo
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jason Ekstrand
09ec6917c1 util: Add a util_sparse_array data structure
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Pierre-Eric Pelloux-Prayer
8a723282e3 mesa: enable msaa in clear_with_quad if needed
If the DrawBuffer sample count is > 1 and msaa is enabled we must also
enable msaa when clearing it.

Fixes: ea5b7de138 ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1991

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-10-31 12:30:53 +01:00
Lionel Landwerlin
b087b7bd90 intel/perf: fix Android build
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15b7b56eb2 ("intel/perf: add TGL support")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-10-31 11:20:30 +00:00
Tomeu Vizoso
01af59b2d9 gitlab-ci: Disable lima jobs
The runner that submits jobs there is down and will turn some time to
get fixed. Disable them for now to keep the CI green.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-10-31 11:08:11 +00:00
Bas Nieuwenhuizen
6ced684e27 radv: Fix disk_cache_get size argument.
Got some int->pointer warnings and 20 is not a valid pointer ....

Fixes: 2e3a635ee6 "radv: Add an early exit in the secure compile if we already have the cache entries."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-31 11:40:43 +01:00
Andrii Simiklit
e06fcbe2c8 main: fix several 'may be used uninitialized' warnings
This patch fixes approximately 39 warnings in 'texcompress_etc.c'
for the release configuration

v2: Fixed by adding the unreachable case to the etc2_rgb8_fetch_texel
       ( Eric Engestrom <eric.engestrom@intel.com> )

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-10-31 10:14:09 +00:00
Bas Nieuwenhuizen
3e86d553a4 anv: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.

CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-31 09:47:56 +00:00
Bas Nieuwenhuizen
72f858fc07 turnip: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.

CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-31 09:47:56 +00:00
Bas Nieuwenhuizen
344ba56b0f radv: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.

CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-31 09:47:56 +00:00
Pierre-Eric Pelloux-Prayer
2afeed3010 radeonsi: tell the shader disk cache what IR is used
Until 8bef4df196 the IR (TGSI or NIR) was used in disk_cache driver_flags.
This commit restores this features to avoid crashing when switching from
one IR to the other.

As radeonsi's default is TGSI, I used "driver_flags & 0x8000000 = 0" for TGSI
to keep the same driver_flags.

Fixes: 8bef4df196 ("radeonsi: add si_debug_options for convenient adding/removing of options")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-31 10:43:40 +01:00
Lionel Landwerlin
15b7b56eb2 intel/perf: add TGL support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-31 09:13:20 +00:00
Robert Foss
f140467b5b android: Add panfrost support to build scripts
Currently the Android build system doesn't expose the panfrost
driver.

This patch enables the panfrost driver to be build on for the
Android platform.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-31 10:03:54 +01:00
Robert Foss
6f3f855320 nir: Build nir_lower_point_size.c in libmesa_nir
nir_lower_point_size.c was not build into the libmesa_nir library for non-meson
builds. However it was included in the meson build.

This patch fixes that.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-31 10:03:54 +01:00
Iago Toral Quiroga
e7e501efce v3d: rename vertex shader key (num)_fs_inputs fields
Until now this made sense because we always paired vertex shaders
with fragment shaders, but as soon as we implement geometry and
tessellation shaders that will no longer be the case, so rename
this to (num_)used_outputs.

v2: Use 'used_outputs' instead of ns_outputs, which is more explicit (Eric).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-31 08:46:35 +00:00
Mauro Rossi
d688e4166c android: aco: fix Lower to CSSA
Fixes the following building error:

external/mesa/src/amd/compiler/aco_spill.cpp:1768:
error: undefined reference to 'aco::lower_to_cssa(aco::Program*, aco::live&, radv_nir_compiler_options const*)'

Fixes: 0b8216b ("aco: Lower to CSSA")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-10-31 07:38:46 +00:00
Jan Zielinski
7baedc9162 gallium/swr: Fix depth values for blit scenario
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-10-31 07:25:54 +00:00
Jordan Justen
bb0c5c487e iris/gen11+: Move flush for render target change
When starting a BLORP operation, we do the BTI-change flush.  However,
when ending it and transitioning back to regular drawing, we change the
render target again - without a set_framebuffer_state() call.  We need
to do the BTI flush there too.  BLORP flags IRIS_DIRTY_RENDER_BUFFER
now, which will cause the next draw to get the BTI flush again.

(explanation of fix by Ken)

Fixes: 2b956a093a ("iris: totally untested icelake support")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-31 00:24:25 -07:00
Jordan Justen
a2c3c65a31 iris: Add IRIS_DIRTY_RENDER_BUFFER state flag
Fixes: 2b956a093a ("iris: totally untested icelake support")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-31 00:24:25 -07:00
Samuel Pitoiset
1e36a8f41d radv: declare NGG scratch for VS or TES and only on GFX10
Do not need to declare it for other stages because this is for
streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-31 06:51:01 +00:00
Arno Messiaen
a9391a1a01 lima: add cubemap support
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Arno Messiaen
9890590fba lima: introduce ppir_op_load_coords_reg to differentiate between loading texture coordinates straight from a varying vs loading them from a register
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Arno Messiaen
28e1d55d6e lima: add layer_stride field to lima_resource struct
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Arno Messiaen
f3686083a4 lima: fix stride in texture descriptor
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Ian Romanick
7b3f38ef69 intel/compiler: Report the number of non-spill/fill SEND messages on vec4 too
This make shader-db's report.py work on Haswell and earlier platforms.
The problem is that the script would detect the "sends" output for
scalar shaders and expect in in vec4 shaders too.  When it didn't find
it, the script would fail with:

    Traceback (most recent call last):
      File "./report.py", line 351, in <module>
        main()
      File "./report.py", line 182, in main
        before_count = before[p][m]
    KeyError: 'sends'

Fixes: f192741ddd ("intel/compiler: Report the number of non-spill/fill SEND messages")

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 21:27:03 -07:00
Tapani Pälli
b380d47998 nir: fix couple of compile warnings
Fixes "warning: braces around scalar initializer" warnings.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 00:21:44 +00:00
Bas Nieuwenhuizen
ec770085c2 radv: Fix timeout handling in syncobj wait.
libdrm returns -errno instead of directly the ioctl ret of -1.

Fixes: 1c3cda7d27 "radv: Add syncobj signal/reset/wait to winsys."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-31 00:48:17 +01:00
Ilia Mirkin
1b9d1e13d8 nv50/ir: mark STORE destination inputs as used
Observed an issue when looking at the code generatedy by the
image-vertex-attrib-input-output piglit test. Even though the test
itself worked fine (due to TIC 0 being used for the image), this needs
to be fixed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2019-10-30 19:13:18 -04:00
Ilia Mirkin
869e32593a gm107/ir: fix loading z offset for layered 3d image bindings
Unfortuantely we don't know if a particular load is a real 2d image (as
would be a cube face or 2d array element), or a layer of a 3d image.
Since we pass in the TIC reference, the instruction's type has to match
what's in the TIC (experimentally). In order to properly support
bindless images, this also can't be done by looking at the current
bindings and generating appropriate code.

As a result all plain 2d loads are converted into a pair of 2d/3d loads,
with appropriate predicates to ensure only one of those actually
executes, and the values are all merged in.

This goes somewhat against the current flow, so for GM107 we do the OOB
handling directly in the surface processing logic. Perhaps the other
gens should do something similar, but that is left to another change.

This fixes dEQP tests like image_load_store.3d.*_single_layer and GL-CTS
tests like shader_image_load_store.non-layered_binding without breaking
anything else.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "20.0" <mesa-stable@lists.freedesktop.org>
2019-10-30 19:12:36 -04:00
Lionel Landwerlin
e02c181bfd intel/dev: set default num_eu_per_subslice on gen12
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8125d7960b ("intel/dev: Add preliminary device info for Tigerlake")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-30 22:30:09 +00:00
Dylan Baker
4226952199 docs/new_features: Empty the feature list for the 20.0 cycle 2019-10-30 15:18:27 -07:00
Dylan Baker
1fdcc2494e Bump VERSION to 20.0.0-devel 2019-10-30 14:56:02 -07:00
Jordan Justen
98da208660 docs/relnotes/new_features.txt: Add note about gen12 support
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-30 14:08:51 -07:00
Jordan Justen
2b186264cc intel/eu/validate/gen12: Add TGL to eu_validate tests.
These reworks were combined into this patch:

 * Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+
 * Francisco Jerez: intel/eu/validate/gen12: Disable
   qword_low_power_no_depctrl eu_validate test.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 14:08:51 -07:00
Jordan Justen
8125d7960b intel/dev: Add preliminary device info for Tigerlake
Reworks:
 * adjust 64-bit support, hiz (Jason Ekstrand)
 * sim-id (Lionel Landwerlin)
 * adjust threads, urb size (Rafael Antognolli)
 * adjust urb size (Kenneth Graunke)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 14:08:48 -07:00
Lionel Landwerlin
632995227c intel/dump_gpu: handle context create extended ioctl
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-30 21:58:31 +02:00
Bas Nieuwenhuizen
ae454a03b7 radv: Allocate space for temp. semaphore parts.
Calculated the number for allocation and did not
reserve space ....

Fixes: 2117c53b72 "radv: Add temporary datastructure for submissions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 20:51:39 +01:00
Rafael Antognolli
3c317e8187 anv: Add Tile Cache Flush for Unified Cache. 2019-10-30 19:51:03 +00:00
Rafael Antognolli
a99c67b690 blorp: Add Tile Cache Flush for Unified Cache. 2019-10-30 19:51:03 +00:00
Rafael Antognolli
d3995c19eb iris: Add Tile Cache Flush for Unified Cache. 2019-10-30 19:51:03 +00:00
Jordan Justen
f573cd4757 intel/genxml: Add gen12 tile cache flush bit
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-30 19:51:03 +00:00
Daniel Schürmann
8678699918 aco: implement VGPR spilling
VGPR spilling is implemented via MUBUF instructions and scratch memory.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
c79972b604 aco: always set scratch_offset in startpgm
This patch also moves private_segment_buffer and
scratch_offset to Program to easily access it.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
b0de16b7de aco: omit linear VGPRs as spill variables
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
aded548e66 aco: ensure that spilled VGPR reloads are done after p_logical_start
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
a7ff1bb5b9 aco: simplify calculation of target register pressure when spilling
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Rhys Perry
e73de4e1d8 aco: fix new_demand calculation for first instructions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
93b42a1907 aco: don't add interferences between spilled phi operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
fdf8ad0256 aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
d48d72e98a aco: don't insert the exec mask into set of live-out variables when spilling
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
cd20e29de1 aco: fix transitive affinities of spilled variables
Variables spilled on both branch legs need to be assigned to the same spilling slot.
These affinities can be transitive through multiple merge blocks.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
8023dcd71e aco: fix live-range splits of phis
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
655a703349 aco: remove potential critical edge on loops.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
78bca0d0ce aco: improve live variable analysis
This patch makes the live variable analysis more precise
w.r.t. killed phi operands and the block's register pressure.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:32 +00:00
Daniel Schürmann
0b8216b2cd aco: Lower to CSSA
Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes.
Previously, it was possible that phi operands have intersecting live-ranges, and thus,
couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid to
spill phis, even if it was beneficial.
This patch implements a conversion pass which is currently only called if spilling is necessary.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:32 +00:00
Jonathan Marek
329d322a16 etnaviv: fix non-pointsprite points on GC7000L
Fixes these deqp tests (and more):
dEQP-GLES2.functional.draw.draw_arrays.points.single_attribute
dEQP-GLES2.functional.draw.draw_arrays.points.multiple_attributes
dEQP-GLES2.functional.draw.draw_arrays.points.default_attribute
dEQP-GLES2.functional.draw.draw_elements.points.single_attribute
dEQP-GLES2.functional.draw.draw_elements.points.multiple_attributes
dEQP-GLES2.functional.draw.draw_elements.points.default_attribute

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-30 19:44:41 +00:00
Jonathan Marek
ad5cbbd228 etnaviv: stencil fix
The final version of previous stencil fix patch ended up breaking one-sided
stencil.

Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.fragment_ops.depth_stencil.*

Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0

Fixes: 05da025f ("etnaviv: fix two-sided stencil")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-30 19:44:41 +00:00
Jonathan Marek
7b524e1acb etnaviv: fix depth bias
Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.polygon_offset.*

Fixes: 6c3c05dc ("etnaviv: fix polygon offset")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-30 19:44:41 +00:00
Jordan Justen
b529db00ee iris: Set MOCS for external surfaces to uncached
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 12:42:54 -07:00
Rafael Antognolli
ffb46b2bb7 iris: Align fast clear color state buffer to a page.
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as it's on the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.

v2: Fix typo case in the comment (Nanley)
v3: Rebase and fix conflicts.
v4: Fix rebase mistake (Nanley).

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-30 19:41:29 +00:00
Rafael Antognolli
e51722a7c7 anv: Align fast clear color state buffer to a page.
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as it's on the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.

v2: Assert that image->planes[plane].offset is 4K aligned (Nanley)

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-30 19:41:29 +00:00
Erik Faye-Lund
477f019812 zink: only enable KHR_external_memory_fd if supported
While we're at it, make sure we error out if it's not supported when
required.

This brings us a bit closer to being able to test on SwiftShader, which
doesn't currently support KHR_external_memory_fd.
2019-10-30 19:40:50 +00:00
Bas Nieuwenhuizen
780c937a5d radv: Start signalling semaphores in WSI acquire.
Winsys semaphores without signal operation get silently ignored.

Not so for syncobjs, so actually signal them.

Fixes: 84d9551b23 "radv: Always enable syncobj when supported for all fences/semaphores."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 19:42:10 +01:00
Rhys Perry
e1bcc7a828 aco: rename README to README.md
Closes: #1974
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 18:16:00 +00:00
Rhys Perry
d4684a294b aco: a couple loop handling fixes for GFX10 hazard pass
It was joining from the wrong blocks and block.kind is a bitmask instead
of an enum.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-30 18:13:53 +00:00
Matt Turner
12d3b11908 intel/compiler: Add instruction compaction support on Gen12
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
c8fbc8823f intel/compiler: Make separate src0/src1 index tables
TGL uses different data (and even a different format!) for each source.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
cde73625f8 intel/compiler: Inline get_src_index()
TGL will have separate tables for src0 and src1, so the shared function
will no longer make sense.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
d0eff8a539 intel/compiler: Restructure instruction compaction in preparation for Gen12
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
ded9fb2b18 intel/compiler: Remove unreachable() from brw_reg_type.c
The EU compaction unit test fuzzes the compaction code by flipping bits.
We use a simple skip_bits() function with a list of reserved bits to
ignore, but for more complex cases like invalid combinations of register
file:type, we need either machinery to check validity or for these
functions to simply inform us whether a combination was valid.

enum brw_reg_type a 4-bit field in brw_reg, so rather than expanding it
with an "INVALID" value, just return -1 and let the caller check for
that.

Scott suggested redefining unreachable() within the unit test to
longjmp() which would allow driver code like this to still use it and
allow the test to handle expected failures like this. If that plan works
out, I plan to revert this.
2019-10-30 11:11:50 -07:00
Jonathan Marek
fa3baeab76 freedreno/a2xx: add missing vertex formats (SSCALE/USCALE/FIXED)
Mostly for vertex formats, but they are supported as texture formats too
(untested however).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-10-30 18:04:17 +00:00
Pierre-Eric Pelloux-Prayer
03a132912f radeonsi: disable sdma for gfx10
Disable sdma on gfx10 until all timeouts bugs are fixed.

See:
    https://gitlab.freedesktop.org/mesa/mesa/issues/1907
    https://bugs.freedesktop.org/show_bug.cgi?id=111481

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-30 18:03:14 +01:00
Pierre-Eric Pelloux-Prayer
2fb4b3c476 radeonsi: sdma misc fixes
SDMA IB doesn't need to be padded for SDMA.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-30 18:03:14 +01:00
Pierre-Eric Pelloux-Prayer
21b9a6b590 radeonsi: align sdma byte count to dw
If src/dst addresses are dw aligned and size is > 4 then we align
byte count to dw as well.

PAL implementation works like this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-30 18:03:14 +01:00
Timur Kristóf
f53811aeac radv: Enable ACO on Navi.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 16:54:41 +00:00
Leo Liu
a886ae5162 radeonsi: enable 8K video decode support for HEVC and VP9
HW 8K decode support starts at Renoir

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
2019-10-30 12:43:04 -04:00
Leo Liu
b4c812a269 radeon/vcn: Add VP9 8K decode support
Require increase of context buffer size

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
2019-10-30 12:43:04 -04:00
Rhys Perry
8235bc6411 aco: try to group together VMEM loads of the same resource
v2: remove accidental shaderInt16 change
v2: simplify can_move_down initialization
v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-30 17:23:49 +01:00
Daniel Schürmann
8b5aee78cc aco: don't schedule instructions through depending VMEM instructions
Previously, the scheduler tried to move up instructions from below depending
VMEM instructions only to move them down again when scheduling the VMEM
instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
636d45e46a aco: add can_reorder flags to load_ubo and load_constant
These got lost due to some refactoring.
Due to the way our scheduler works currently, for now
we add back the reorder flag for divergent loads only.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
576f92d900 aco: only skip RAR dependencies if the variable is killed somewhere
This patch changes VMEM scheduling in a way that they can only
be moved upwards by previous VMEM instructions but not downwards.
This way, it improves the order of VMEM instructions in relation
to their users.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
703ce617ca aco: restrict scheduling depending on max_waves
Previously, we allowed all shaders to reduce the number of max_waves to as low as 5.
Restricting this on shaders with low register demand, increases the total number of waves
while the VMEM def-use distances hardly change.
This patch also changes the max number of move operations per MEM instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Jason Ekstrand
beca63c6c0 anv: Avoid emitting UBO surface states that won't be used
This shaves around 4-5% off of a CPU-limited example running with the
Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 16:05:57 +00:00
Jason Ekstrand
24c0545b2d intel/vec4: Set brw_stage_prog_data::has_ubo_pull
In 0e4a75f917, Ken added a flag brw_stage_prog_data which indicates
whether any UBO pulls ever occur.  Unfortunately, he neglected to set
the bit in the vec4 back-end.  This was fine at the time because the
optimization was intended for iris which does not support gen7 and using
the vec4 back-end on Gen8+ requires an environment variable.  We want to
use this in Vulkan which does support Gen7 so we want the information
from the vec4 back-end as well as scalar.

Fixes: 0e4a75f917 "intel/compiler: Record whether any pull constant..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 16:05:57 +00:00
Samuel Pitoiset
5a9d777f5a radv: fix perftest options
RADV_PERFTEST=outooforder has been removed a while ago. This fixes
dumping the options into hang reports.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 14:49:30 +01:00
Samuel Pitoiset
c895e08281 radv: move nomemorycache debug option at the right palce
Fixes: 6571000071 ("radv: add debug option to turn off in memory cache")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 14:49:28 +01:00
Samuel Pitoiset
d4e0bef1bb radv: fix dumping SPIR-V into hang reports
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 13:02:08 +00:00
Tapani Pälli
4f8c86e6a5 mesa: enable ARB_gpu_shader_int64 in compat profile
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 14:37:27 +02:00
Tapani Pälli
2d8b8d3bd1 mesa: add [Program]Uniform*64ARB display list support
This is required for int64 to be enabled in compat profile.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 14:37:27 +02:00
Bas Nieuwenhuizen
396195e8f1 radv: Enable VK_KHR_timeline_semaphore.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
4aa75bb3bd radv: Add wait-before-submit support for timelines.
This is actually a non-threaded implementation. I'd summarize this
as event-based submission.

When submit happens we walk a tree of submissions that depend on
the syncobj signal operations to be submitted and if those submission
we no other dependencies we start to execute them immediately.

Or, well I still use a list to avoid issues with long chains and
the stacksize when using recursion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
88d41367b8 radv: Add timelines with a VK_KHR_timeline_semaphore impl.
This does not fully do wait-before-submit, to be done in a follow
up patch.

For kernels without support for timeline syncobjs, this adds an
implementation of non-shareable timelines using legacy syncobjs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
2117c53b72 radv: Add temporary datastructure for submissions.
So we can defer them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
c3eae659e7 radv: Split semaphore into two parts as enum+union.
This is in preparation to adding more types.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
84d9551b23 radv: Always enable syncobj when supported for all fences/semaphores.
This simplifies code for timeline semaphores by needing to support
less configurations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
45f4a639a8 radv: Improve fence signalling in QueueSubmit.
Only signalling it once.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
a9c8424e08 radv: Do sparse binding in queue submission.
So we have one place to do queue things if we end up deferring
submissions.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
915e9178fa radv: Split out commandbuffer submission.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
43ba44357c radv: Clean up unused variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
2e3a635ee6 radv: Add an early exit in the secure compile if we already have the cache entries.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-30 11:38:50 +01:00
Bas Nieuwenhuizen
d78809632f radv: Compute hashes in secure process for secure compilation.
To prevent poisoning arbitrary cache entries.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-30 11:37:41 +01:00
Erik Faye-Lund
4c4ac2d4d5 zink: drop nop descriptor-updates
If there's nothing to be done, let's actually do nothing. Seems like a
good idea.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-10-30 10:29:23 +00:00
Erik Faye-Lund
b222f28357 zink: use bitfield for dirty flagging
Bitfields are a bit more ideomatic than explicit flags, and harder to
get wrong.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-10-30 10:29:23 +00:00
Erik Faye-Lund
6d30abb4f1 zink: use dynamic state for line-width
This will lead to fewer pipelines in the cache, which is assumed to
become our most unavoidable performance bottle-neck down the line.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-10-30 10:29:23 +00:00
Duncan Hopkins
d2bb63c8d4 zink: Use optimal layout instead of general. Reduces valid layer warnings. Fixes RADV image noise.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-10-30 09:09:49 +00:00
Michel Dänzer
aaf1b09270 gitlab-ci: Disable meson-windows job for the time being
It needs a CI runner carrying the mesa-windows tag, but there's none
available currently.
2019-10-30 09:38:20 +01:00
Timothy Arceri
cf25664686 radv: make use of radv_sc_read()
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Timothy Arceri
28fff3efbc radv: add radv_sc_read() helper
This is a function with timeout support for reading from the pipe
between processes used for secure compile.

Initially we hardcode the timeout to 5 seconds. We can adjust the
timeout limit in future if needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Timothy Arceri
23a6827e4d radv: allow select() calls in secure compile
This will be used in the following patch to support timeouts for
reading the pipe between processes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Lepton Wu
1abf05764b mapi: Improve the x86 tsd stubs performance.
This skips touching %ebx most times and it shows that glGetString performance
increased from 114M/s to 120M/s on my desktop.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-29 20:50:05 -07:00
Lepton Wu
41407d5e9f mapi: Inline call x86_current_tls.
This saves one return and a simple benchmark which calls glGetString
repeatedly on my desktop shows it improves calls per second from 123M
to 141M.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1997
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-29 17:18:06 -07:00
Lepton Wu
b2b8639d8e mapi: Clean up entry_patch_public for x86 tls
Remove hard coded 16 and use entry_generate_or_patch to patch
public stubs. The generated code actually is sightly tighter
than before since the "nop" instructions before the final "jmp"
get removed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-29 17:18:06 -07:00
Lepton Wu
1fb75bee90 mapi: split entry_generate_or_patch for x86 tls
The code works exactly the same with before. Just split this function
out so we can reuse it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-29 17:18:06 -07:00
Jonathan Gray
45206d7673 mapi: Adapted libglvnd x86 tsd changes
The x86 assembly language stub in src/mapi/entry_x86_tsd.h does not
generate PIC (position-independent code). This causes text relocations
which bring troubles on recent versions of FreeBSD, OpenBSD, Android.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108541
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-29 17:13:14 -07:00
Caio Marcelo de Oliveira Filho
9c3c206e71 spirv: Don't fail if multiple ordering semantics bits are set
Vulkan requires that only one bit for the ordering is set, but old
versions of GLSLang just set all the bits.  This was fixed as part of
c51287d744
but we can still find older versions (or shaders compiled with it)
around.

So instead of failing, emit a warning and fallback to the effective
result of any combination of multiple bits: AcquireRelease.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2018
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-29 14:53:46 -07:00
Sagar Ghuge
f0db4c5204 intel/isl: Allow stencil buffer to support compression on Gen12+
v2: (Nanley Chery)
- Fix commit title
- Fix comment

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
b22b349443 iris: Resolve stencil resource prior to copy or used by CPU
v2: Decide aux usage in get_copy_region_aux_settings (Nanley Chery)

v3: Use isl_surf_usage_is_stencil function (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
5d331251cf iris: Prepare resources before stencil blit operation
We have to resolve destination surfaces if we are bliting to and from
the same surface.

v2: Revert unrelated change (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
4e0ed40ed7 iris: Prepare depth resource if clear_depth enable
Avoid preparing depth resource, if we did fast depth clear before.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
81de49a9f2 iris: Prepare stencil resource before clear depth stencil
Let aux surface state tracker track the stencil buffer's aux state while
clearing depth stencil buffer.

v2: Fix condition check (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
b8223991b5 iris: Resolve stencil buffer lossless compression with WM_HZ_OP packet
Even though stencil buffer compression looks like regular lossless color
compression w/o fast clear support, we have to resolve stencil buffer
with WM_HZ_OP packet.

v2: Check if resource is stencil with helper function (Nanley Chery)

v3: Remove unnecessary included file (Nanley Chery)

v4: (Nanley Chery)
- Avoid stencil buffer aux state transition by improving condition check

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
87c57b8dae intel/blorp: Set stencil resolve enable bit
When set, the stencil buffer is filled with the true stencil values and
we have to disable stencil buffer clear enable bit.

v2: 1) Refactor code little bit (Nanley Chery)
    2) Fix assertion (Nanley Chery)

v3: 1) Remove unncessary assignment (Nanley Chery)
    2) Fix GEN_GEN check (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
c401186762 intel: Track stencil aux usage on Gen12+
Enable stencil compression enable and control surface enable bit if
stencil buffer lossless compression is enabled.

v2: Remove unnecessary GEN_GEN check (Nanley Chery)

v3: (Nanley Chery)
- Change commit subject tag from intel/isl to intel
- Keep assignment order correct

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
53d472df24 intel/blorp: Add helper function for stencil buffer resolve
On Gen12+, Stencil buffer's lossless compression should be resolved
with WM_HZ_OP packet.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
ce208be2d8 intel/blorp: Assign correct view while clearing depth stencil
We never saw any failures regarding this typo but it's good to assign
correct stencil view while constructing blorp_params.

Fixes: 0cabf93b80 "intel/blorp: Add an entrypoint for clearing depth and stencil"

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
4287e0a4e4 genxml/gen12: Add Stencil Buffer Resolve Enable bit
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Nanley Chery
0a2a9a4a5b iris: Allocate main and aux surfaces together
On Gen12, the CCS buffer address doesn't have to be referenced in state
packets. In the case of a stencil buffer with CCS, the kernel won't know
the location of the CCS unless an extra call is made to pin its address.
To avoid this extra call, make the CCS part of the main surface.

v2. Update comment above bo_size. (Jordan)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 14:46:15 -07:00
Nanley Chery
ff5bc81b51 iris: Determine aux offsets within configure_aux
If a resource has a modifier, the main and aux surfaces will share a BO.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 14:46:15 -07:00
Nanley Chery
f0ed86c6c6 iris: Bail resource creation upon aux creation error
The functions used during aux buffer configuration and creation only
return false for exceptional errors. Don't proceed with surface creation
in those cases.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 14:46:15 -07:00
Nanley Chery
8b62e3d978 iris: Drop iris_resource::aux::extra_aux::bo
The primary and secondary aux buffers are always allocated in the same
BO.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 14:46:15 -07:00
Duncan Hopkins
bb8e6994cc zink: pass line width from rast_state to gfx_pipeline_state.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-10-29 20:38:26 +00:00
Jason Ekstrand
52aa7f3e05 anv: Reduce the minimum number of relocations
The original value of 256 was under the assumption that you're a batch
buffer which is likely going to have a large number of relocations.
However, pipeline objects on Gen7 will have at most 6 relocations (one
per shader stage and one for the workaround BO) so this is a lot of
per-pipeline wasted space.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-29 20:27:52 +00:00
Jason Ekstrand
a3153162a9 anv: Delay allocation of relocation lists
The old relocation list code always allocated 256 relocations and a hash
set up-front without knowing whether or not we really need them.  In
particular, in the softpin case, this is two fairly large allocations
that we don't need to be making.  Also, for pipeline objects on haswell
where we don't have softpin, we don't need relocations unless scratch is
used so this is extra data per-pipeline.  Instead, we should do it
on-demand.  This shaves 3.5% off of a cpu-limited example running with
the Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-29 20:27:52 +00:00
Plamena Manolova
4fe2317601 anv: Implement new way for setting streamout buffers.
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-29 19:21:20 +00:00
Plamena Manolova
0f610e17bc iris: Implement new way for setting streamout buffers.
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-29 19:20:25 +00:00
Plamena Manolova
665b81e29a genxml: Add 3DSTATE_SO_BUFFER_INDEX_* instructions
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-29 19:19:58 +00:00
Rob Clark
ff6e148a3d freedreno/a6xx: add a618 support
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-29 09:19:34 -07:00
Rob Clark
afd224fac3 freedreno/a6xx: cleanup magic registers
Extract out values for the handful of unknown registers which have
different values across different a6xx models, to simplify adding
support for new a6xx's.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-29 09:19:31 -07:00
Rob Clark
1fdc259bfc freedreno/a6xx: remove some left over dead code
These registers don't exist, just remnants of initial port from a5xx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-29 09:19:27 -07:00
Plamena Manolova
f9ad73cdfd anv: Set depthBounds to true in anv_GetPhysicalDeviceFeatures.
Add depth bounds testing to the list of supported
physical device features.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 16:05:33 +00:00
Plamena Manolova
e6c8750278 genxml: Change 3DSTATE_DEPTH_BOUNDS bias.
The bias for the 3DSTATE_DEPTH_BOUNDS instruction
should be 2 not 1.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 16:05:33 +00:00
Michel Dänzer
2a38fc1027 gitlab-ci: Only run the pipeline if any files affecting it have changed
E.g. documentation-only changes cannot affect the outcome of the
pipeline, so don't waste resources on running it.

The thing we need to be careful about here is that the container stage
jobs must always run if any later stage jobs using the corresponding
docker images run. We're currently using the same .ci-run-policy
template for all jobs, so this is trivially true.

v2:
* Add bin/ and common.py (Eric Engestrom)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> # v1
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-29 15:09:56 +00:00
Krzysztof Raszkowski
163d5fde06 gallium/swr: Enable GL_ARB_gpu_shader5: multiple streams
Added support for geometry shader multiple streams (part of
GL_ARB_gpu_shader5 extension).

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-10-29 14:50:02 +00:00
Alyssa Rosenzweig
44971b84b7 panfrost: Remove unused definitions in mali-job.h
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-29 13:02:53 +00:00
Alyssa Rosenzweig
fa14cdf6e4 panfrost: Cleanup _shader_upper -> shader
I don't believe this is actually a tagged pointer; warn if it is.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-29 13:02:53 +00:00
Eric Engestrom
b4f508ab59 meson: define _GNU_SOURCE on FreeBSD
_mesa_strtod() needs this to use strtod_l(), which behaves correctly
wrt `,` vs `.` decimal separator.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2008
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-29 12:12:58 +00:00
Lionel Landwerlin
1a2246a5e0 intel/perf: update ICL configurations
A few equations/programming changes for ICL.

v2: Fix a couple of issues in naming and floating/integer operations (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-29 13:00:26 +02:00
Alexandros Frantzis
1257d06ba7 gitlab-ci: Update required libdrm version
Commit 9edcce2a32 bumped the required libdrm-amdgpu version to
2.4.100. Update the version we use in our CI scripts to avoid CI
build failures.

Also bump the debian image name for this change to take effect.
Note that amdgpu is only built with the debian-buster image,
so only this image requires an update.

Fixes: 9edcce2a ("ac: get tcc_harvested from the kernel")
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-10-29 09:50:09 +00:00
Eric Engestrom
690d359b6f travis: fix scons build after deprecation warning
Fixes: 54053bc8d0 ("scons: Print a deprecation warning about using scons on not windows")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-29 09:25:40 +00:00
Caio Marcelo de Oliveira Filho
e2155158e9 anv: Fix output of INTEL_DEBUG=bat for chained batches
The anv_batch_bo contents are linked one to another, and when printing
we have to start with the first of those.  Since in `u_vector` new
elements are added to the head, to get the first element we need the
vector's tail.

Fixes: 32ffd90002 ("anv: add support for INTEL_DEBUG=bat")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 19:34:54 -07:00
Marek Olšák
f9fe86e02a winsys/amdgpu: use the new GPU reset query 2019-10-28 21:38:01 -04:00
Marek Olšák
9edcce2a32 ac: get tcc_harvested from the kernel
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-28 21:38:01 -04:00
Marek Olšák
4d1e43badb radeonsi: initialize shader compilers in threads on demand
It takes a noticable amount of time with piglit.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-28 21:36:18 -04:00
Marek Olšák
1380db9fa8 radeonsi: don't print diagnostic LLVM remarks and notes
We don't use them.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-28 21:36:18 -04:00
Timur Kristóf
c52ebbcea4 aco: Introduce vgpr_limit to keep track of available VGPRs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Timur Kristóf
d59f702e26 aco: Implement subgroup shuffle in GFX10 wave64 mode.
Previously subgroup shuffle was implemented using the bpermute
instruction, which only works accross half-waves, so by itself it's
not suitable for implementing subgroup shuffle when the shader is
running in wave64 mode.

This commit adds a trick using shared VGPRs that allows to implement
subgroup shuffle still relatively effectively in this mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Rhys Perry
c2eebfe3ea aco: Remove dead code in reduction lowering.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Rhys Perry
3865448012 aco: Fix reductions on GFX10.
Fixes p_reduce (all cluster sizes), p_inclusive_scan and p_exclusive_scan
with all reduction operations.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Eric Engestrom
cd04b63c00 loader: default to iris for all future PCI IDs
The existing "fallback" code didn't actually do anything, so this
removes it, and instead we just always fallback to `iris` for future
PCI IDs.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 23:21:39 +00:00
Eric Engestrom
ea8116908c anv: add a couple printflike() annotations
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-28 23:17:16 +00:00
Erik Faye-Lund
21b7f79a76 st/mesa: lower global vars to local after lowering clip
When this code was merged, this wasn't necessary because the
state-tracker would do it later anyway. But this recently got changed,
without changing the code that depended on this.

Arguably, this was a mistake in the lowering pass to begin with. Either
way, let's fix it by not assuming that the lowering code gets called
later when it's not needed.

This fixed user-defined clip-planes in Zink.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: eaffdad108 ("st/mesa: don't lower_global_vars_to_local for VS if there are no dead inputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-28 21:17:40 +00:00
Sagar Ghuge
3ac688b0c2 iris: Create resource with aux_usage MCS_CCS
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:02 -07:00
Sagar Ghuge
366fcbf2d8 intel/isl: Support lossless compression with multisamples
GEN12 adds the ability to losslessly compress each sample plane in a
multisampled buffer that uses MCS compression.

v2: Remove unnecessary assertion (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
758a6a3a00 iris: Get correct resource aux usage for copy
Add case for MCS_CCS so that we get the correct aux usage while copy
operation.

v2: Fix commit subject (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
e80bca6895 intel/blorp: Use isl_aux_usage_has_mcs instead of comparing
Depending on MCS_CSS or MCS we can emit blorp blit shaders.

As we support MCS_CSS and MCS, it makes sense to use
isl_aux_usage_has_mcs function.

v2: Fix commit message (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
d156632374 iris: Define MCS_CCS state transitions and usages
v2: 1) Fix assertion check (Nanley Chery)
    2) Correct commit subject (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
2cd849cf17 iris: Initialize CCS to fast clear while using with MCS
v2: Explain Bsepc quotes properly (Nanley Chery)

v3: 1) Fix comment format (Nanley Chery)
    2) Fix typo in comment (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
2f0fbe06e6 intel/isl: Don't reconfigure aux surfaces for MCS
If aux for MCS is already configured, don't configure again.

v2: Fix missing period in commit message (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Erik Faye-Lund
810fc75dab zink: emulate optional depth-formats
The Vulkan spec says that an implementation has to support one of
VK_FORMAT_X8_D24_UNORM_PACK32 and VK_FORMAT_D32_SFLOAT, as well of
one of VK_FORMAT_D24_UNORM_S8_UINT and VK_FORMAT_D32_SFLOAT_S8_UINT.

So let's keep track which one is supported of earch pair, and emulate
one on top of the other one.

This won't give the exact result for comparisons, or when mapping and
unmapping the resources. But it's better than flat out failing to create
the resource, and we can fix the map/unmap issue later if needed.

Tested-by: Duncan Hopkins <duncan@thefoundry.co.uk>
2019-10-28 17:57:49 +00:00
Erik Faye-Lund
e6ea350fb0 zink: error if VK_KHR_maintenance1 isn't supported
While we're at it, remove the VK_-prefix from the extension bool; all
extensions have this so it's kinda superfluous.
2019-10-28 17:57:49 +00:00
Nanley Chery
d298740a1c iris: Disallow incomplete resource creation
If a modifier specifies an aux, it must be created.

Fixes: 75a3947af4 ("iris/resource: Fall back to no aux if creation fails")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
f2fc5dece9 iris: Don't leak the resource for unsupported modifier
Make sure the res struct is free'd before returning.

Fixes: 2dce0e94a3 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
7a619b5c75 iris: Enable HIZ_CCS sampling
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
8e7644e48f intel/blorp: Satisfy clear color rules for HIZ_CCS
Store the converted depth value into two dwords. Avoids regressing the
piglit test "fbo-depth-array depth-clear", when HIZ_CCS sampling is
enabled in a later commit.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
0aa308f420 intel: Fix and use HIZ_CCS write through mode
Write through to the CCS if the surface is used as a texture and can be
sampled by the HW with CCS.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
fee4dbcb4d iris: Start using blorp_can_hiz_clear_depth()
Check that the alignment requirements for HIZ_CCS are satisfied by using
this function.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
5425fcf2cb intel/blorp: Satisfy HIZ_CCS fast-clear alignments
Prevent the piglit test,
amd_vertex_shader_layer-layered-depth-texture-render, from regressing in
in a future commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
6451008e8b intel: Refactor blorp_can_hiz_clear_depth()
Prepare this function to be used in iris and to handle new Gen12 behavior.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
cc99d0adc0 isl: Add isl_surf_supports_hiz_ccs_wt()
Add a helper to determine if an ISL surface supports the write-through
mode of HIZ_CCS.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
6020ebf799 iris: Enable HIZ_CCS in depth buffer instructions
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
af6ff48894 iris: Define initial HIZ_CCS state and transitions
Make it match those of HIZ.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
c991045d38 iris: Create an unusable secondary aux surface
The HIZ_CCS and MCS_CCS auxiliary surface modes require that drivers
store information about two aux buffers. We choose to represent this as
HiZ/MCS being the primary aux surface and the CCS as an secondary/extra
aux surface. This representation has the effect of placing most of the
code that will have to choose between the two aux surfaces around the
aux-map entry points.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
909030bca6 iris: Don't guess the aux_usage
Instead of guessing an aux_usage, then confirming it if the
isl_surf_get_*_surf functions are successful, just call the ISL
functions up-front. This will help us to more easily determine if a
depth buffer supports HIZ_CCS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
04e5f7e8a9 intel/blorp: Treat HIZ_CCS like HiZ
Allow it in depth buffer instructions but disable it for blits.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
cc415f911f intel/blorp: Assert against HiZ in surface states
Avoid unexpected behavior if the caller happens to pass in a HiZ aux
usage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
c50f8b2fc9 intel: Support HIZ_CCS in isl_surf_get_ccs_surf
Add an extra aux parameter which will be filled out with CCS if the
first two isl_surf parameters fit the requirements for HiZ_CCS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
e2e67b3f11 isl: Reduce assertions during aux surf creation
Return false more often to reduce the burden on the caller.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
6670e07a6e intel: Enable CCS_E for R24_UNORM_X8_TYPELESS on TGL+
While this format isn't listed in BSpec: 53911, other documentation and
empirical evidence suggest that it's fine to remap it to R32_FLOAT. I've
filed a bug for the BSpec page.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
f93bc14618 intel: Use 3DSTATE_DEPTH_BUFFER::ControlSurfaceEnable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Jason Ekstrand
ab994ecae6 intel/isl: Support HIZ_CCS in emit_depth_stencil_hiz
v2. Remove undocumented CCS_E-only mode for depth. (Nanley)

Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
6312328a61 intel: Use RENDER_SURFACE_STATE::DepthStencilResource
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Jordan Justen
5d34a9975f intel: Update alignment restrictions for HiZ surfaces.
v2 (Nanley):
* Maintain a chronological ordering for HiZ alignments. Suggested by
  Ken.

Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
6cd9731d96 iris: Clear ::has_hiz when disabling aux
Fixes: 2cddc953cd ("iris: some initial HiZ bits")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
d5fb9cccdc intel/blorp: Disable depth testing for slow depth clears
We'll start doing slow depth clears more often on HIZ_CCS buffers in a
future commit. Reduce the performance impact by making them use less
bandwidth.

From the Depth Test section of the BSpec:

   This function is enabled by the Depth Test Enable state variable. If
   enabled, the pixel's ("source") depth value is first computed. After
   computation the pixel's depth value is clamped to the range defined
   by Minimum Depth and Maximum Depth in the selected CC_VIEWPORT state.
   Then the current ("destination") depth buffer value for this pixel is
   read.

and from the Depth Buffer Updates section of the BSpec:

   If depth testing is disabled or the depth test passed, the incoming
   pixel's depth value is written to the Depth Buffer.

Taken together, it's clear that depth testing isn't necessary to perform
a depth buffer clear. Mark Janes and I analyzed this patch with
frameretrace and a depthrange piglit test. I disabled HiZ to ensure we'd
get slow depth clears. We've observed the bandwidth consumption by the
depth buffer access to be cut ~50% on BDW and SKL during depth clears.
On a more graphically intensive workload, the Shadowmapping Sascha
benchmark, I took the average of 3 runs on a BDW with a display
resolution of about 1920x1200 (minus some desktop environment
decorations). I measured a 22.61% FPS improvement when HiZ is disabled.

v2. The BSpec doesn't mandate this behavior, update comment accordingly.
    (Ken)

Fixes: bc4bb5a7e3 ("intel/blorp: Emit more complete DEPTH_STENCIL state")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
e655eed531 intel: Enable CCS_E for some formats on Gen12
In ISL:
   Update the format table to add CCS_E support for some 8BPP formats,
   some 16BPP formats, and R10G10B10A2_UNORM_SRGB.

   In the helper for determining CCS_E support, we return false for some
   16BPP formats because they aren't properly handled in blorp_copy().

In BLORP:
   Allow the new and non-problematic formats for CCS_E-enabled copies.

v2. Update other fields for A1B5G5R5_UNORM and A4B4G4R4_UNORM in table.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
2019-10-28 10:47:05 -07:00
Nanley Chery
126c9562d9 isl: Redefine the CCS layout for Gen12
The CCS could be described in a number of ways, but this format was
chosen to minimize churn in the drivers. We may decide on an different
direction in the future.

v2. Increase alignment for display surfaces. (Nanley)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
1e91280242 isl: Add and use isl_tiling_flag_to_enum()
Use a helper that will automatically handle Gen12's CCS tiling when
creating a CCS isl_surf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
82822bc549 iris: Allow for non-Y-tiled aux allocation
The Gen12 CCS is not Y-tiled.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
22be1447bb isl/drm: Map HiZ and CCS tilings to Y
In the function which translates ISL tilings to i915 tilings, map ISL's
HiZ and CCS tilings to Y instead of NONE (linear). The HW docs describe
HiZ and pre-Gen12 CCS surfaces as being Y-tiled in memory.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Jason Ekstrand
901bed5122 intel/isl: Update surf_fill_state for gen12
v2 (Nanley):
* Avoid driver churn for now.
* Include some media compression changes.

Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Jason Ekstrand
caf4cc548e intel/isl/fill_state: Separate aux_mode handling from aux_surf
v2. Avoid driver churn for now. (Nanley)

Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Jason Ekstrand
a1e0b21061 intel/isl: Add new aux modes available on gen12
v2. Add media compression. (Nanley)

Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
77f506382f i965/miptree: Avoid -Wswitch for the Gen12 aux modes
Avoid the compiler warnings for the new enums that will be introduced in
a future commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
8af1853331 anv/private: Modify aux slice helpers for Gen12 CCS
The isl_surf structs for Gen12's CCS won't describe how many slices in
the main surface can be compressed. All slices will be compressable if
CCS is enabled, so lookup the main surface's logical dimension.

v2. Add a space before a `?`. (Jordan)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
ba52cd7ab2 intel/blorp: Don't assert aux slices match main slices
This isn't accurate enough for HiZ which can have a discontiguous range
of supported aux slices. This also won't work with the plan to represent
Gen12 CCS as a single slice surface.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Jason Ekstrand
4021a3925c intel/blorp: Use surf instead of aux_surf for image dimensions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
d90bffaef8 intel/blorp: Halve the Gen12 fast-clear/resolve rectangle
Update their dimensions according to the Bspec.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Rafael Antognolli
43b48ee752 intel/blorp/gen12: Set FWCC when storing the clear color.
From "Render Target Fast Clear" description for Gen12:

   "SW must store clear color using MI_STORE_DATA_IMM with
   ForceWriteCompletionCheck bit set."

From Instruction_MI_STORE_DATA_IMM, bitfield 10 (when set to 1):

   "Following the last write from this command, Command Streamer
   will wait for all previous writes are completed and in global
   observable domain before moving to next command."

We use 4 SDIs to store the clear color (one per channel). From the
description, it looks to me that setting that flag only on the last SDI
should be enough.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
07e16221d9 isl: Round up some pitches to 512B for Gen12's CCS
Gen12's CCS requires that the main surface have a pitch aligned to 512B.

v2. Provide a BSpec citation. (Ken)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
f6aefa94cc iris: Don't assume CCS_E includes CCS_D
There's no longer a clear-only compression mode of CCS on Gen12+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
300d77c2fa anv/cmd_buffer: Don't assume CCS_E includes CCS_D
There's no longer a clear-only compression mode of CCS on Gen12+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
4f0b5f9732 anv/image: Disable CCS_D on Gen12+
Clear-only compression no longer exists on TGL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:04 -07:00
Nanley Chery
a94cb6503f isl: Disable CCS_D on Gen12+
Clear-only compression no longer exists on TGL.

v2. Add BSpec reference. (Sagar)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:04 -07:00
Nanley Chery
83fc15e5ba iris: Drop support for I915_FORMAT_MOD_Y_TILED_CCS on TGL+
The format of the CCS has changed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:04 -07:00
Nanley Chery
0eaf293b47 anv/formats: Disable I915_FORMAT_MOD_Y_TILED_CCS on TGL+
The format of the CCS has changed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:04 -07:00
Nanley Chery
d0fcc2dd50 anv: Properly allocate aux-tracking space for CCS_E
add_aux_state_tracking_buffer() actually checks the aux usage when
determining how many dwords to allocate for state tracking. Move the
function call to the point after the CCS_E aux usage is assigned.

Fixes: de3be61801 ("anv/cmd_buffer: Rework aux tracking")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:04 -07:00
Nanley Chery
698d723a6d anv/blorp: Use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR
Avoid failing the `info->use_clear_address` assertion in ISL on Gen12+.

Fixes: 6c9f9a82d7 ("intel/genxml,isl: Add gen12 render surface state changes")
Reported-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:04 -07:00
Plamena Manolova
939ddccb7a anv: Add support for depth bounds testing.
In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction
to enable depth bounds testing.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-28 14:13:04 +00:00
Plamena Manolova
1df871f8ff iris: Add support for depth bounds testing.
In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction
to enable depth bounds testing.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 13:46:06 +00:00
Plamena Manolova
1ecd37eac6 genxml: Add 3DSTATE_DEPTH_BOUNDS instruction.
In gen12 we add the 3DSTATE_DEPTH_BOUNDS instruction
which enables support for depth bounds testing.

Signed-off-by: Plamena Manolova <plamena.manolova@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 13:45:24 +00:00
Danylo Piliaiev
8818e0df74 glsl: Initialize all fields of ir_variable in constructor
Better be safe, even if we could technically avoid this for
some fields.

Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1999
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-28 12:49:15 +00:00
Timothy Arceri
1909bc526d util: remove LIST_IS_EMPTY macro
Just use the inlined function directly. The new function was introduced
in addcf410.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:39 +00:00
Timothy Arceri
7f106a2b5d util: rename list_empty() to list_is_empty()
This makes it clear that it's a boolean test and not an action
(eg. "empty the list").

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
c578600489 util: remove LIST_DEL macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
c976b427c4 util: remove LIST_DELINIT macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
d23d47c065 util: remove LIST_REPLACE macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
40258fb8b8 util: remove LIST_ADD macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
255de06c59 util: remove LIST_ADDTAIL macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
7ae1be1028 util: remove LIST_INITHEAD macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Erik Faye-Lund
15e7f94278 gitlab-ci: fixup debian tags
When resolving a merge-conflict, I accidentally only updated the
ARM64-tag tag. Let's correct this.

Fixes: 3d529c1739 ("gitlab-ci: also build Zink on CI")
2019-10-28 12:07:30 +01:00
Danylo Piliaiev
12a8f2616a intel/compiler: Fix C++ one definition rule violations
When building with "-flto" brw::block_data definitions
were colliding.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-10-28 12:02:40 +02:00
Erik Faye-Lund
3d529c1739 gitlab-ci: also build Zink on CI
This prevents accidentally breaking the driver-build while working on
other drivers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
86ed8132a5 zink: simplify gl-to-vulkan lowering
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
412e2aa23b zink/spirv: more complete sampler-dim handling
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
f26eab3175 zink: fixup scissoring
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Duncan Hopkins
c4446098cf zink: limited uniform buffer size so the limits is not exceeded.
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
4ef088f241 zink: do not set lineWidth to invalid value
Some implementations don't support the lineWidth-feature, so let's
avoid setting invalid state to them. But since we don't have a fallback
for this, inform the user.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
59f8ba05f5 zink: pass screen to zink_create_gfx_pipeline
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Duncan Hopkins
5cf93985a0 zink: respect ubo buffer alignment requirement
The driver can report a minimum alignment for UBOs, and that can be
larger than 64, which we've currently been using. Let's play ball, and
use the reported value instead.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Duncan Hopkins
108ba81c95 zink: fix line-width calculation
There's two things that goes wrong in this code on some drivers:
1. Rounding off the line-width to granularity can push it outside the
   legal range.
2. A granularity of 0.0 results in NaN, because we divide by zero.

So let's make this code a bit more robust.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
df11f3f2ab zink: fixup return-value
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
d5cbc05cde zink: refactor blitting
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
a7fbc8bc7f zink: implement resource_from_handle
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
65fbb1836a zink: use VK_FORMAT_B8G8R8A8_UNORM for PIPE_FORMAT_B8G8R8X8_UNORM
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
867d892d90 zink: do not set VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT for non-3D textures
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
d8f1cf4946 zink/spirv: alias var0 on tex0 etc instead
This fixes Quake3, and is more in line with directx semantics.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
c7bcb6e5dc zink: lower two-sided coloring
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
67a9749ada zink/spirv: alias generic varyings on non-generic ones
This gets rid of the nasty location-allocation hack.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
1f3d2b9f80 zink/spirv: implement load_front_face
We're now adding interface-types during code-emitting, so we need to
defer emitting the entry-point. No biggie, spirv_builder is prepares for
this.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:48 +00:00
Erik Faye-Lund
a046957a79 zink/spirv: fixup b2i32
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
b28156413f zink: do not lower bools to float
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
3ed41e3bb6 zink/spirv: prepare for 1-bit booleans
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
c24c3da00a zink/spirv: fixup b2i32 and implement b2f32
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
0a912269d4 zink/spirv: clean up get_[fu]vec_constant
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
3ceba2d312 zink/spirv: inline get_uvec_constant into emit_load_const
This is the only call-site that wants to specify unique values per
component for any of the get_*_constant functions. So let's give this
its own implementation instead, so we can ease the burden for the rest.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
20f6b19fdf zink/spirv: add emit_uint_const-helper
While we're at it, let's move emit_float_const to the same location as
this needs to be defined at.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
f048196f9e zink/spirv: add emit_bitcast-helper
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
0f697be76d zink/spirv: use bit_size instead of hard-coding
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
54c46db1c8 zink/spirv: implement emit_float_const helper
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
89591c895c zink/spirv: implement emit_select helper
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
2419022a0c zink/spirv: implement b2i32
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
f4ad93462c zink/spirv: implement bitwise ops
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
103776ab9c zink/spirv: implement bcsel
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
0947afaa8f zink/spirv: assert bit-size
This is going to make it easier to verify that 1-bit float sizes don't
leak into the rest of the code.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
bb895afaa0 zink/spirv: implement f2b1
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
04bb08ed35 zink/spirv: use ordered compares
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
3ef3ab2d54 zink: lower point-size
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
f24e14cc08 zink: add missing sRGB DXT-formats
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
d50ec9f798 zink: disable PIPE_CAP_QUERY_TIME_ELAPSED for now
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
b525348729 zink: support shadow-samplers
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
d9c068cba1 zink: fix rendering to 3D-textures
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
65e2cf98d5 zink: initialize nr_samples for pipe_surface
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
8575295c17 zink: use primconvert to get rid of 8-bit indices
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
2942becfe9 zink: also accept txl
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
2683619955 HACK: zink: suspend / resume queries on batch-boundaries
HACK because we assert that we don't overrun the pool. We need a
fallback here instead.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
67cde39c8c zink: move set_active_query_state-stub to zink_query.c
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
7ebdf5be15 zink: disable timestamp-queries
We don't implement the get_timestamp context-method, so this is just
going to crash if anyone tries to use it. Let's implement it later.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
e084211c08 zink: fixup boolean queries
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:47 +00:00
Erik Faye-Lund
69189417ae zink/spirv: support vec1 coordinates
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
499bf41487 zink: do not use both depth and stencil aspects for sampler-views
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
5f14168edf zink/spirv: always enable Sampled1D for fragment shaders
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
967e570511 zink: add note about enabling PIPE_CAP_CLIP_HALFZ
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
755037e09d zink: don't crash when setting rast-state to NULL
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
7004afcd24 zink: remove insecure comment
This isn't as inaccurate as the comment says, the Vulkan documentation
even seems to suggest this is the same. Let's drop the comment.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
a10d43d845 zink: avoid texelFetch until it's implemented
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
a9770e2bd2 zink: set ExecutionModeDepthReplacing when depth is written
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
10f26ef92d zink: fixup: save rasterizer
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
c96963a8d1 zink: ensure layout is reasonable before copying
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
c947aee63b zink/spirv: debug-print unknown varying slots
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
c2f52cf94f zink/spirv: be a bit more strict with fragment-results
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
97f4827e2e zink: wait for transfer when reading
TODO: this could really benefit from a separate transfer-queue, I think.
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
a005fae564 zink: support more texturing
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
44f374ced5 zink/spirv: correct opcode
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
baf34dbd75 zink: make sure imageExtent.depth is 1 for arrays
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
67d2e6258e zink: stub resource_from_handle
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
b8a9bbeb00 zink: abort on submit-failure
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
4a64ee192a zink: crash hard on unknown queries
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
86c0217ee9 zink: add more compares
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
06859b70b9 zink: more converts
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
b5bfb72fce zink: more comparison-ops
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
bcd12adce5 zink: implement ineg
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
d19f0b437b zink: add shift ops
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
6032fc65b0 zink: add division ops
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
60bfee1f31 zink: add some opcodes
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
4e60d4d52a zink: clean up opcode-emitting a bit
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
55bcf9b1e0 zink: process one aspect-mask bit at the time
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
cd59de1e3f zink: save all supported util_blitter states
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
4887ceb79e zink: save original scissor and viewport
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
d29cc33a9b zink: store sampler and image_view counts
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
8e5fe441bd zink: use pipe_stencil_ref instead of uint32_t-array
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
e14c29b9f2 zink: document end-of-frame hack
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Erik Faye-Lund
10439594ec zink: only consider format-desc if checking details
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:46 +00:00
Dave Airlie
4480aefc38 zink: attempt to get multisample resource creation right
Use the exposed vulkan limits to fill out supported formats.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Dave Airlie
e234116a96 zink: add samples to rasterizer
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Dave Airlie
0c5f3e50ae zink: add sample mask support
This isn't really used yet, but may as well just fill it in.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
dbf67e8a20 zink: refactor fence destruction
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
4c5ade8ca6 zink: drop unused argument
Because si.waitSemaphoreCount is 0, this won't even be looked at by the
driver, so let's just drop it.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
03efb6dd27 zink: cleanup zink_end_batch
This inlines submit_cmdbuf into zink_end_batch, the only place it's
used. This makes the code a bit more straight-forward to read.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
8edd357795 zink: request ucp-lowering
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
80673264cb zink: do not lower io
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
af0dc71d6f zink/spirv: rename vec_type
These aren't guaranteed to be vectors, they can also be scalars. The
var-part is the significant part here, not the vector-ness. So let's
rename these.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
62f7d9afe8 zink/spirv: var -> regs
These track nir-registers, so it's clearer if we refer to them by that
name instead. There's potentially more vars than these.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Dave Airlie
5dbfb02459 zink: add support for compressed formats
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
6ae8686bff zink: request alpha-test lowering
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
d9b7d7b051 zink: pool descriptors per batch
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
9913e5c40b zink: reuse constants
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
8e5d24fedf zink: fix off-by-one in assert
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
35c0ef8852 zink: squashme: trade cplusplus wrapper for header-guard
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
bb76a3f61d zink: squashme: forward declare hash_table
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
fe34a35333 zink: do not use hash-table for regs
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
ca074edc7f zink: clamp scissors
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
249cd3fc13 zink: kill dead code
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Duncan Hopkins
d850e2a3f2 zink: clamped limits to INT_MAX when stored as uint32_t.
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
737a2bba35 zink: prepare for shadow-samplers
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
5dfa6be36e zink: keep a reference to used render-passes
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
1927d11fc0 zink: pass screen instead of device to program-functions
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
f90ee9e33a zink: rename sampler-view destroy function
This name is more consistent with other functions.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
e4bbdcbf80 zink: clean up render-pass management
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
0296e8981d zink: remove hack-comment
This isn't a hack, it's how this *should* work.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
2a7302075d zink: ensure sampler-views survive a batch
we don't need to track the resources for the samplers any longer, as
the sampler view holds a reference instead.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
09e20d88e7 zink: fixup parameter name
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
795c0e95c5 zink: use helper
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
9e0ff0ffda zink: more batch-ism
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
33b2f914db zink: cache framebuffers
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
a872f46369 zink: cache render-passes
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
5f21637370 zink: simplify renderpass/framebuffer logic a tad
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
9cac63cae9 zink: implement batching
This reduces stalling quite a bit.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:45 +00:00
Erik Faye-Lund
56b1048bb0 zink: return after blitting
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
ef8750da3d zink: remove unusual alignment
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
94d3b9389e zink: tweak state handling
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
8f6449f296 zink: move primitive-topology stuff into program
The primitive topology is a bit of an odd-ball, as it's the only
truly draw-call specific state that needs to be passed to the program to
get a pipeline.

So let's make this a bit more explict, by passing it separately. This
makes the flow of data a bit easier to wrap your head around.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
e0a93ba351 zink: assign increasing locations to varyings
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
cedf3598b4 zink: ensure textures are transitioned properly
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
c471525fdc zink: ensure non-fragment shaders use lod-versions of texture
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
9cf6163ea1 zink: emit dedicated block for variables
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
93af00502e zink: use uvec for undefs
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
a8e63387f3 zink: do not destroy staging-resource, deref it
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
819f9fd2f2 zink: track used resources
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
5a9f235ac2 zink: implement fmod
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
22d080b3ac zink: store shader_info in zink_shader
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
ce6f19c4ec zink: texture-rects?
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
4ae362c0ef zink: delete samplers after the current cmdbuf
This makes them zombies for a little while.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
2e2ad61ef1 zink: add curr_cmdbuf-helper
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
806f040bb3 zink: reference blit/copy-region resources
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
b89eb298ff zink: whitespace cleanup
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
453d9f193a zink: wait for idle on context-destroy
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
8541b58e39 zink: reference ubos and textures
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
21cffebe4f zink: reference vertex and index buffers
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
a27b84dd2e zink: return old fence from zink_flush
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
0fcc9550b2 zink: reference renderpass and framebuffer from cmdbuf
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
ce66749e0b zink: cache those pipelines
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
8e56b828e4 zink: move renderpass inside gfx pipeline state
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
1cdbeefd2c zink: cache programs
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
fba0293bef zink: pass zink_render_pass to pipeline-creation
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
86d0e741ec zink: prepare for multiple cmdbufs
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
229cd042d3 zink: move cmdbuf-resetting into a helper
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
ac45bc2359 zink: do not leak image-views
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:44 +00:00
Erik Faye-Lund
e64cc463e3 zink: move render-pass begin to helper
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
7034422389 zink: prepare for caching of renderpases/framebuffers
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
b458863c1e zink/spirv: implement loops
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
acdd12dae3 zink/spirv: implement discard
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
11ad9bfc35 zink/spirv: implement if-statements
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
8bbf86e7bc zink/spirv: prepare for control-flow
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
32aea77cfe zink/spirv: handle reading registers
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
f317105dd9 zink/spirv: implement some integer ops
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Dave Airlie
d2abe0ac61 zink/spirv: store all values as uint.
This adds bitcasting to uint everywhere for now,
and stores all spir-v ssa values as uints.

It also casts bool to 0/0xffffffff for now
(nir 1-bit bools may be coming in the future).

This fixes a lot of piglit tests to pass now

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
ac530c1ce2 zink: remove discard_if
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Dave Airlie
6d96578912 zink: query support (v2)
This at least passes piglit occlusion_query test for me here now.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
b533de12a5 zink: transform z-range
In vulkan, the Z-range of clip-space goes from 0..W instead of -W..+W
as is the case in OpenGL. So we need to transform the Z-range to
account for this.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Dave Airlie
9fa7400564 zink: add dri loader
export MESA_LOADER_DRIVER_OVERRIDE=zink should now work without using
swrast paths

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
4249e4a598 zink/spirv: implement point-sprites
This passes glsl-fs-pointcoord_gles2 from piglit.

Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Dave Airlie
c3bd0274c6 zink: ask for flatshade lowering
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
48f1f20a9d zink: detect presence of VK_KHR_maintenance1
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Erik Faye-Lund
8d46e35d16 zink: introduce opengl over vulkan
Here's zink, a so far pretty simple vulkan-gallium driver that is able
to translate some applications from OpenGL to Vulkan.

The compiler is quite limited for now, this will be improved on later.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 08:51:43 +00:00
Samuel Pitoiset
5912792501 radv: fix OpQuantizeToF16 for NaN on GFX6-7
Do not flush NaN to 0.

Fixes
dEQP-VK.spirv_assembly.instruction.compute.opquantize.propagated_nans

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-28 09:31:52 +01:00
Samuel Pitoiset
d82dfca872 radv: enable fast depth/stencil clears with separate aspects on GFX8
It's similar to GFX9+. Shadow of Mordor (Vulkan beta) hits that
path and it works fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-28 07:54:11 +00:00
Jordan Justen
66796a1787 iris: Mark aux-map BO as used by all batches
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:14 -07:00
Jordan Justen
2e6a7ced4d iris/gen12: Write GFX_AUX_TABLE base address register
Rework:
 * Move last_aux_map_state to iris_batch. (Nanley, Ken)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:14 -07:00
Jordan Justen
f046c6d090 iris: Map each surf to it's aux-surf in the aux-map tables
Rework: Nanley Chery
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:14 -07:00
Jordan Justen
d09db2d7b2 isl/gen12: 64k surface alignment
Reworks:
 * Update size for aux map change (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:14 -07:00
Jordan Justen
f118ca2075 iris/bufmgr: Initialize aux map context for gen12
Reworks:
 * free gen_buffer in gen_aux_map_buffer_free. (Rafael)
 * lock around aux_map_bos accesses. (Ken)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:14 -07:00
Lionel Landwerlin
6af8a4acc4 anv: Add aux-map translation for gen12+
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 00:09:14 -07:00
Jordan Justen
7737f56544 anv/gen12: Write GFX_AUX_TABLE base address register
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 00:09:14 -07:00
Jordan Justen
109c96b322 genxml/gen12: Add AUX MAP register definitions
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:13 -07:00
Jordan Justen
d4a3299ba1 anv/gen12: Initialize aux map context
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 00:09:13 -07:00
Jordan Justen
0d0290bb3f intel/common: Add surface to aux map translation table support
Reworks:
 * Add ISL_FORMAT_B8G8R8X8_UNORM_SRGB to get_format_encoding (Nanley)
 * ralloc_free aux_map_buffer entries in gen_aux_map_finish. (Rafael)
 * verify_aligned_space => align_and_verify_space (Rafael)
 * Add mutex to aux-map code. (Rafael, Nanley)
 * Add gen_aux_map_fill_bos (Ken)
 * Make gen_aux_map_get_state_num lockless

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:13 -07:00
Jordan Justen
062022f2e4 anv: Implement aux-map allocator interface
This interface allows the aux-map code in the intel/common library to
allocate and free buffers.

Reworks:
 * free gen_buffer in gen_aux_map_buffer_free. (Rafael)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 00:09:13 -07:00
Jordan Justen
c848ab45f3 intel/common: Add interface to allocate device buffers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:13 -07:00
Lionel Landwerlin
830cdaf3f0 intel/dev: store whether the device uses an aux map tables on devinfo
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 00:09:10 -07:00
Tapani Pälli
412badd059 i965: setup sized internalformat for MESA_FORMAT_R10G10B10A2_UNORM
Commit d2b60e433e introduced restrictions (as per GLES spec) on the
internal format. We need to setup a sized format for the texture image
so framebuffers created with that are considered complete.

This change fixes following Android CTS test in AHardwareBufferNativeTests
category:

   SingleLayer_ColorTest_GpuColorOutputAndSampledImage_R10G10B10A2_UNORM

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Fixes: d2b60e433e ("mesa/main: R10G10B10_(A2) formats are not color renderable in ES")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 07:13:10 +02:00
Eric Engestrom
32cff3781a tu: fix empty-body instruction
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-27 22:10:31 +00:00
Eric Engestrom
0581a86753 v3d: fix empty-body instruction
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-27 22:10:31 +00:00
Eric Engestrom
c2430f3edc radv: fix empty-body instruction
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-27 22:10:31 +00:00
Eric Engestrom
493903199c anv: fix empty-body instruction
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-27 22:09:14 +00:00
Jonathan Marek
521cdde8fc freedreno/a2xx: use sysval for pointcoord
Fixes a problem with shaders using gl_PointCoord.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reported-by: Fabio Estevam <festevam@gmail.com>
Tested-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-10-27 16:53:32 +00:00
Alyssa Rosenzweig
a0c0030075 pan/midgard: Disable precise occlusion queries
I thought there was hardware support for this, but it seems to broken,
or at least more complex than I believed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-26 14:38:59 +00:00
Urja Rannikko
dff99ce7d5 panfrost: allocate bo for occlusion query results
This memory needs to still be available after all the drawing is done
and forgotten about, so cannot be transient.
Also clear the result so that no rendering returns a zero.

Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-26 14:38:25 +00:00
Alyssa Rosenzweig
728a975700 panfrost: Expose serialized NIR support
Serialized NIR is required for clover with the SPIR-V pipeline. With
this change and PAN_MESA_DEBUG=deqp, clinfo is able to successfully
probe panfrost.

Code from Nouveau (commit 7955fabcf8 by
Karol Herbst).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-26 13:17:42 +00:00
Alyssa Rosenzweig
afb0d08cb0 pipe-loader: Default to kmsro if probe fails
A device supported by kmsro will not automatically probe kmsro since the
driver name will be panfrost/lima/v3d/..., not "kmsro". Since kmsro is a
bit of a catch-all for generic (mostly embedded) GPUs, add a fallback on
kmsro for the dynamic loader.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
2019-10-26 13:17:42 +00:00
Alyssa Rosenzweig
4949876dd0 pipe-loader: Add kmsro pipe_loader target
kmsro is used by numerous embedded GPUs for a common winsys abstraction.
Let's add support for it for the dynamic pipe loader, so clover can
probe on these drivers.

We build the target with Panfrost. When other drivers need kmsro+clover,
we can revisit the build system part; my mesonfu is wanting.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
2019-10-26 13:17:42 +00:00
Jose Fonseca
ace5138548 scons: Fix force_scons parsing.
- Use parsed options instead of using ARGUMENTS directly.
- Handle the case of mingw cross compilation.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2003
2019-10-26 08:23:48 +01:00
Timothy Arceri
cff53da374 radv: enable secure compile support
Can be enabled via the environment variable which tells the
driver how many compilation threads are expected to be called,
and therefore how many forked processes the driver should
create.

For example we would expect to call fossilize replay with
something like this:

RADV_SECURE_COMPILE_THREADS=8 ./fossilize-replay --num-threads 8 \
--shader-cache-size 0 --ignore-derived-pipelines pipeline_cache.foz

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
57c95d2ce2 radv: a support for a secure compile fork at device creation
This added support for the fork, the installation of the seccomp
filter, and the main loop for the actual compilation to be called
from i.e. run_secure_compile_device().

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
3f2283b3e2 radv: add radv_secure_compile()
This function will be called by the parent process when doing a
secure compile. It first selects a free process to work with then
passes it all the information it needs to compile the pipeline.

Once the pipeline information has been passed to the secure
process, it then waits around to read/write any disk cache entries
required before exiting.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
07692f703f radv: for secure compile exit early from radv_shader_variant_create()
We don't have permission to be creating shared memory etc.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
5cd437b1ed radv: allow the secure process to read and write from disk cache
This allows the secure process to read and write to the disk cache
via the parent process. This commit just adds the functionality
needed for the secure process, the following commit will add the
functionality for the parent process.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
5d25aee005 radv: add radv_device_use_secure_compile() helper
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
d33f2165c9 radv: add some new members to radv device and instance for secure compile
These will be used by the following commits to hold information about
the forked secure compile processes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
e8cb13d499 radv: add radv_secure_compile_type enum
This will be used to identify information being passed between the
parent and secure process during a secure compile.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
2d2b113e86 radv: add radv_create_shaders() to radv_shader.h
In a follwing commit we want to be able to call this for secure
compiles from radv_device.c

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
6571000071 radv: add debug option to turn off in memory cache
This can be usefull for debugging the on disk cache, but is also
useful in the following patch for secure compiles which will be
used to compile huge pipeline collections. These pipeline
collections can be multiple GBs and the in memory cache grows to
multiple GBs very quickly when they are compiled so we want to
be able to turn off the in memory cache.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
637776629d radv: get topology from pipeline key rather than VkGraphicsPipelineCreateInfo
This is cleaner and avoids having to read/write an additional copy of
topology for use with secure compile.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Marek Olšák
83a346cd58 docs: document new feature EGL_EXT_image_flush_external 2019-10-25 19:59:04 -04:00
Marek Olšák
c1c574fdf1 egl: implement new functions from EGL_EXT_image_flush_external
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-10-25 19:59:04 -04:00
Marek Olšák
34b1aa957a egl: handle EGL_IMAGE_EXTERNAL_FLUSH_EXT
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-10-25 19:59:04 -04:00
Marek Olšák
1d122c104a st/dri: add support for EGL_EXT_image_flush_external
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-10-25 19:59:04 -04:00
Marek Olšák
1d1b457821 st/dri: assume external consumers of back buffers can write to the buffers
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-10-25 19:59:04 -04:00
Marek Olšák
7520478461 dri_interface: add interface for EGL_EXT_image_flush_external
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-10-25 19:59:04 -04:00
Marek Olšák
a0a8109fb6 include: add the definition of EGL_EXT_image_flush_external
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-10-25 19:59:04 -04:00
Dylan Baker
19851c9ad6 gitlab-ci: Add a job for meson on windows
This adds a new CI job that runs on windows with MSVC. It currently
builds softpipe and osmesa, and runs the related unit tests. It does
rely on meson's wraps for zlib, but I've set up caching of the wrap
dependencies so hopefully that wont be a problem.

I really wanted to user powershell for this, but there just isn't an
easy way to do that, it's much easier to use batch scripts, so thats
what I used.

The leading `/` for .gitlab-ci/lava... must be removed because windows
doesn't understand it, and when it reads the file the job ends in error.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-25 22:47:32 +00:00
Dylan Baker
06e4647cb0 gitlab-ci: refactor out some common stuff for Windows and Linux
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-25 22:47:32 +00:00
Dylan Baker
09ee11f5da nir: Fix invalid code for MSVC
Fixes: ee2050b111
       ("nir: Use BITSET for tracking varyings in lower_io_arrays")
2019-10-25 22:47:32 +00:00
Dylan Baker
ca0c1e69ca docs: update releasing process to use new scripts and gitlab
There were several out of date entries in this document, update them to
current practices.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:19 -07:00
Dylan Baker
8a4541aae2 bin/gen_release_notes.py: Add a warning if new features are introduced in a point release
Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:15 -07:00
Dylan Baker
b153785370 bin/gen_release_notes.py: html escape all external data
All of these (bug titles, patch titles, features, and people's names)
can contain characters that are not valid html. Just escape everything
for safety.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:13 -07:00
Dylan Baker
7e4b87f987 bin/post_release.py: Add .html to hrefs
oops.

Fixes: 3226b12a09
       ("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:11 -07:00
Dylan Baker
5eef803625 bin/post_version.py: white space fixes
Fixes: 3226b12a09
       ("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:08 -07:00
Dylan Baker
abf9e7ac7b bin/post_version.py: Pass version as an argument
I made a bad assumption; I assumed this would be run in the release
branch. But we don't do that, we run in the master branch. As a result
we need to pass the version as an argument.

Fixes: 3226b12a09
       ("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:06 -07:00
Dylan Baker
c6d41e7f0b bin/gen_release_notes.py: Return "None" if there are no new features
Which is very likely .Z > 0 releases.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:03 -07:00
Dylan Baker
df3d4ad82d bin/gen_release_notes.py: strip '#' from gitlab bugs
If they use the `Fixes: #1` form.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:46:00 -07:00
Dylan Baker
69f540c017 bin/gen_release_notes.py: fix conditional of bugfix
Previously this would result in the .0 warning be generated for .z > 0
and the .z == 0 would get the other message.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-25 15:45:53 -07:00
Illia Iorin
6b672e342a mesa/main: Ignore filter state for MS texture completeness
After the discussion in
https://github.com/KhronosGroup/OpenGL-API/issues/45
the section 8.17 (texture completeness) of the OpenGL 4.6 core profile
was changed to explicitly say that multisample texture completeness
ignores filter state of the texture.

"Using the preceding definitions, a texture is complete unless any of the
 following conditions hold true:
   ...
  - The minification filter requires a mipmap (is neither NEAREST nor LINEAR),
    the texture is not multisample, and the texture is not mipmap complete.
  - The texture is not multisample; either the magnification filter is not
    NEAREST, or the minification filter is neither NEAREST nor NEAREST_-
    MIPMAP_NEAREST; and any of
    – The internal format of the texture is integer (see table 8.12).
    – The internal format is STENCIL_INDEX.
    – The internal format is DEPTH_STENCIL, and the value of DEPTH_-
      STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-25 21:16:23 +00:00
Illia Iorin
71d4ece366 Revert "mesa/main: Fix multisample texture initialize"
This reverts commit a113a42e73.

Per https://github.com/KhronosGroup/OpenGL-API/issues/45 it
was a wrong way to fix the issue.

Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-25 21:16:23 +00:00
Marek Olšák
88e9042b6c glsl/serialize: optimize for equal offsets in uniform remap tables
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1416

This decreases the shader cache size in the ticket from 1.6 MB to 40 KB.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-25 17:01:26 -04:00
Marek Olšák
e90269d90a glsl/serialize: restructure remap table code
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-25 17:01:25 -04:00
Kenneth Graunke
f306d07932 nir: Use VARYING_SLOT_TESS_MAX to size indirect bitmasks
MAX_VARYINGS_INCL_PATCH subtracts VARYING_SLOT_VAR0 giving us a size
that's too small, so BITSET_SET writes words out of bounds, corrupting
the stack and causing all kinds of chaos.  VARYING_SLOT_TESS_MAX is
the right value to use here, as it's the largest location.

Closes: 2002
Fixes: ee2050b111 ("nir: Use BITSET for tracking varyings in lower_io_arrays")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-25 13:29:09 -07:00
Neil Armstrong
e919c44c3b Revert "ci: Disable lima until its farm can get fixed."
This reverts commit fb9362c6fb.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-10-25 20:52:03 +02:00
Jason Ekstrand
e2bb7fef94 Revert "mapi: Inline call x86_current_tls."
This reverts commit e137b3a9b7.  It
completely broke 32-bit EGL such that wflinfo can't even run without
crashing.
2019-10-25 11:31:51 -05:00
Jon Turney
2649609ac5 rbug: Fix use of alloca() without #include "c99_alloca.h"
[12/60] Compiling C object 'src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o'.
FAILED: src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o
[...]
../src/gallium/auxiliary/rbug/rbug_texture.c: In function 'rbug_send_texture_info_reply':
../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: error: implicit declaration of function 'alloca'; did you mean 'malloc'? [-Werror=implicit-function-declaration]
  uint32_t *height = alloca(sizeof(uint32_t) * height_len);
                     ^~~~~~
                     malloc
../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
../src/gallium/auxiliary/rbug/rbug_texture.c:303:20: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
  uint32_t *depth = alloca(sizeof(uint32_t) * height_len);
                    ^~~~~~
cc1: some warnings being treated as errors

Include c99_alloca.h to portably make the alloca() prototype available.

See also: 498d9d0f, adfb9c5c, fc8139b1

Fixes: 6174cba7 ("rbug: fix transmitted texture sizes")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-25 16:04:34 +01:00
Alyssa Rosenzweig
f98e9a2771 pan/midgard: Express allocated registers as offsets
Rather than supplying a mask/swizzle to compose with the original, just
supply the offset of the allocated register so we can directly offset
the mask/swizzle, without resorting to composition.

This is simpler, cleaner, and will generalize to non-32-bit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig
c1d36eb115 pan/midgard: Expose more typesize manipulation routines
These internal mir.c routines will help the RA.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig
9bba182840 pan/midgard: Add mir_set_bytemask helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Timur Kristóf
85cc40f7ce st/nine: Fix unused variable warnings in release build.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-25 12:44:44 +02:00
Timur Kristóf
f091b02825 st/nine: Fix build with -Werror=empty-body
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1995
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-25 12:44:44 +02:00
Timur Kristóf
c580f134ae aco: Refactor hazard mitigations, separate pass for GFX10.
GFX10 hazards require a different approach compared to previous
generations, for example it doesn't need s_nop, and most hazards
can't be solved by adding NOPs at all. Also, they are not
resolved by branch instructions.

This commit reorganizes aco_insert_NOPs so that there is now a
separate pass for GFX10. The new GFX10 pass also respects the
control flow of the shader.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf
b01847bd94 aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.
This commit refines the VMEMtoScalarWriteHazard mitigation, based
upon a closer look at what LLVM does. Also changes the code to
match the structure of the other hazard mitigations.

* The hazard is not only triggered by VMEM, FLAT and GLOBAL
  but also SCRATCH and DS instructions.
* The SMEM/SALU instructions only cause a hazard when they
  write a register that the VMEM/etc. are reading.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf
c037ba1bb7 aco/gfx10: Mitigate LdsBranchVmemWARHazard.
There is a hazard caused by there is a branch between a
VMEM/GLOBAL/SCRATCH instruction and a DS instruction.
This commit adds a workaround that avoids the problem.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf
09d676d81a aco/gfx10: Mitigate SMEMtoVectorWriteHazard.
There is a hazard that happens when an SMEM instruction
reads an SGPR and then a VALU instruction writes that same SGPR.
This commit adds a workaround that avoids the problem.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf
d6dfce02d0 aco/gfx10: Mitigate VcmpxExecWARHazard.
There is a hazard when a non-VALU instruction reads the EXEC mask
and then a VALU instruction writes the EXEC mask.
This commit adds a workaround that avoids the problem.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf
e5a8616973 aco/gfx10: Mitigate VcmpxPermlaneHazard.
Any permlane instruction that follows any VOPC instruction can cause a hazard,
this commit implements a workaround that avoids this causing a problem.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:42 +02:00
Timur Kristóf
99aed688d3 aco/gfx10: Add notes about some GFX10 hazards.
ACO currently mitigates VMEMtoScalarWriteHazard and Offset3fBug
(names from LLVM). There are some bugs that ACO needn't care about.
Just to be on the safe side, add an assertion that makes sure
that we aren't hit by FlatSegmentOffsetBug.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-25 10:10:41 +02:00
Samuel Pitoiset
2bf8a9b337 radv: fix VK_KHR_shader_float_controls dependency on GFX6-7
From the Vulkan spec 1.1.126 :
   "VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_32_BIT_ONLY_KHR specifies
    that shader float controls for 32-bit floating point can be set
    independently; other bit widths must be set identically to each
    other."

Forgot to update this when I enabled that extension recently.

Fixes dEQP-VK.spirv_assembly.instruction.compute.float_controls.independence_settings.independence_setting

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-25 07:49:20 +02:00
Lepton Wu
e137b3a9b7 mapi: Inline call x86_current_tls.
This saves one return and a simple benchmark which calls glGetString
repeatedly on my desktop shows it improves calls per second from 118M
to 128M.

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 23:37:18 +00:00
Lepton Wu
a4fec4dd6a virgl: Remove formats with unusual sample count.
Most GPU require the sample count is power of 2. Just remove those
formats with unusual sample count. This decreases dEQP EGL tests run
time a lot.

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-24 23:11:08 +00:00
Kristian H. Kristensen
ee2050b111 nir: Use BITSET for tracking varyings in lower_io_arrays
MAX_VARYINGS_INCL_PATCH is greater than 64, so we'll need more that 64
bits (per component) to track which vars have indirects. This pass was
trying to track patch varyings (which start at bit 63) in a separate
64 bit word, but failed to subtract VARYING_SLOT_PATCH0 and accessed
out of bounds.

Do away with the ad-hoc bit mask tracking and just use a BITSET.

Fixes: dEQP-GLES31.functional.tessellation.user_defined_io.per_patch_block.vertex_io_array_size_implicit.triangles
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 15:32:20 -07:00
Dylan Baker
5658b13fb6 docs: update calendar, add news item and link release notes for 19.2.2 2019-10-24 14:12:04 -07:00
Dylan Baker
89d94f5ecc docs: Add sha256 sum for 19.2.2 2019-10-24 14:08:25 -07:00
Dylan Baker
849415d615 docs: Add release notes for 19.2.2 2019-10-24 14:07:30 -07:00
Rob Clark
bc67b892d0 freedreno/ir3: handle the progress case
In some cases, in particular when you have things that can be src
modifiers ((abs)/(neg)), once eliminating one mov, there is a
possibility to remove another.  Handle this by re-visiting an
instruction after eliminating a copy on one of it's srcs.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark
97b24efd9f freedreno/ir3: remove restrictions on const + (abs)/(neg)
These date back to relatively early days of ir3, when a lot was still
not well understood.  But according to CI (and what I've seen blob
driver do), these are not actually real restrictions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark
e665e65f96 freedreno/ir3: allow copy-propagate out of fanout
Now that we fixed the sharp edges that this was papering over, we can
relax the restriction about eliminating a mov coming out of a fanout
(for example from result of texture fetch).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark
3ac328875e freedreno/ir3: treat high vs low reg as conversion
This avoids copy-propagating a high register into an instruction which
cannot consume it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark
9e211b57b8 freedreno/ir3: propagate dest flags for collect/fanin
We did this properly already for split/fanout.  But collect was missed.
Extract out a helper to share.

This way we avoid copy propagating a mov from high or half reg into an
instruction which cannot consume a high/half reg.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark
49ab94694d freedreno/ir3: make high regs easier to see in IR dumps
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Rob Clark
0f395f0933 freedreno/ir3: debug cleanup
1) deduplicate IR3_SHADER_DEBUG=disasm versus fs/vs/etc handling
2) standardize shader stage name prints, in particular VERT vs BVERT
3) don't mix stderr and stdout

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 13:08:56 -07:00
Caio Marcelo de Oliveira Filho
d31f415ba0 spirv: Add helper to find args of Image Operands
Avoid keeping track of the idx and all possible image operands for
each operation.  Note for convenience we split up the handling of
ImageOperandsOffsetMask and ImageOperandsConstOffsetMask.

Suggested by Jason Ekstrand.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
c7d8fe2f0d spirv: Check that only one offset is defined as Image Operand
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
d27b853c08 spirv: Add imageoperands_to_string helper
Change the information to also include the category, so that the
particulars of BitEnum enumeration can be handled in the template.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
06aecb14c0 anv: Implement VK_KHR_vulkan_memory_model
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
b8784fe652 spirv: Handle MakePointerAvailable/Visible
Emit barriers with semantics matching the access operand and the
storage class of the pointer.

v2: Fix order of visible / available emission relative to the
    operations.  (Bas)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
129c85c28b spirv: Handle MakeTexelAvailable/Visible
Set the memory semantics and scope for later emitting the barrier.
Note the barrier emission code already exist in vtn_handle_image for
the Image atomics.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
c649e64edc spirv: Add option to emit scoped memory barriers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
c022043102 spirv: Add SpvMemoryModelVulkan and related capabilities
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
1bb191a0d1 spirv: Emit memory barriers for atomic operations
Add a helper to split the memory semantics into before and after the
operation, and use that result to emit memory barriers.

v2: Be more explicit about which bits we are keeping around when
    splitting memory semantics into a before and after.  For now
    we are ignoring Volatile.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
d6992f996b spirv: Parse memory semantics for atomic operations
Including the right storage memory semantic based on the storage class
of the operation.  These will be used later to emit memory barriers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
e142061399 intel/fs: Implement scoped_memory_barrier
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
901071044e nir/tests: Add copy propagation tests with scoped_memory_barrier
Three groups of tests, effectively defining what cases the
optimization is allowed or prevented

- Redudant loads       (a load  generated the value)
- Propagate SSA values (a store generated the value)
- Propagate a var      (a copy  generated the value)

Change the shader type of the tests to be COMPUTE so
nir_var_mem_shared can also be used.  Doesn't affect the semantic of
the copy propagation.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho
73572abc2a nir: Add scoped_memory_barrier intrinsic
Add a NIR instrinsic that represent a memory barrier in SPIR-V /
Vulkan Memory Model, with extra attributes that describe the barrier:

- Ordering: whether is an Acquire or Release;
- "Cache control": availability ("ensure this gets written in the memory")
  and visibility ("ensure my cache is up to date when I'm reading");
- Variable modes: which memory types this barrier applies to;
- Scope: how far this barrier applies.

Note that unlike in SPIR-V, the "Storage Semantics" and the "Memory
Semantics" are split into two different attributes so we can use
variable modes for the former.

NIR passes that took barriers in consideration were also changed

- nir_opt_copy_prop_vars: clean up the values for the mode of an
  ACQUIRE barrier.  Copy propagation effect is to "pull up a load" (by
  not performing it), which is what ACQUIRE restricts.

- nir_opt_dead_write_vars and nir_opt_combine_writes: clean up the
  pending writes for the modes of an RELEASE barrier.  Dead writes
  effect is to "push down a store", which is what RELEASE restricts.

- nir_opt_access: treat the ACQUIRE and RELEASE as a full barrier for
  the modes.  This is conservative, but since this is a GL-specific
  pass, doesn't make a difference for now.

v2: Fix the scoped barrier handling in copy propagation.  (Jason)
    Add scoped barrier handling to nir_opt_access and
    nir_opt_combine_writes.  (Rhys)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:55 -07:00
Jason Ekstrand
0ebe89459c spirv/info: Add a memorymodel_to_string helper
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 11:39:55 -07:00
Dylan Baker
cdff9b00e1 docs: Add release not about scons deprecation 2019-10-24 18:33:50 +00:00
Dylan Baker
61ed9891c7 scons: Also print a deprecation warning on windows
This warning is different. Meson support for windows is less mature than
for other platforms, and the goal here is to alert people that
eventually we plan to drop scons and move to meson, and that they should
try out meson and report issues.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-24 18:33:50 +00:00
Dylan Baker
54053bc8d0 scons: Print a deprecation warning about using scons on not windows
At this point meson should be able to handle all of the non-windows
platforms just fine; we'd like to be able to stop maintaining scons for
those platforms sooner than later.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-24 18:33:50 +00:00
Dylan Baker
79e73887e7 scons: Use print_function ins SConstruct
This ensures that we get python3's print() function behavior even in
python2, instead of python2's print statement behavior. We'll be using
this in the next patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-24 18:33:50 +00:00
Adam Jackson
2e9aef4651 gallium: Fix a bunch of undefined left-shifts in u_format_*
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2019-10-24 14:21:51 -04:00
Samuel Pitoiset
4b17311e52 radv: compute the number of records correctly for vertex buffers
On GFX8 the number of records is in bytes while on other chips
it's in units of "stride".

Fixes dEQP-VK.robustness.vertex_access.*.draw.vertex_* on RAVEN.

Tested on GFX6, GFX8, GFX10 and RAVEN.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 17:14:43 +02:00
Michel Dänzer
75cc8c0b82 gitlab-ci: Enable UBSan for the meson-vulkan job
It doesn't report any errors now.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-24 16:21:48 +02:00
Michel Dänzer
9ffe477412 util/tests: Avoid int64_t overflow issues in fast_idiv_by_const test
Flagged by UBSan:

../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:233:14: runtime error: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
    #0 0x55b4c1a2a428 in rand_sint ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:233
    #1 0x55b4c1a2ad3a in random_sdiv_test ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:308
    #2 0x55b4c1a2b837 in fast_idiv_by_const_int32_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:410
    #3 0x55b4c1abc13f in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #4 0x55b4c1aa7a4d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #5 0x55b4c1a4ce57 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #6 0x55b4c1a4f530 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #7 0x55b4c1a51cbe in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #8 0x55b4c1a6d698 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #9 0x55b4c1abfd58 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #10 0x55b4c1aab425 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #11 0x55b4c1a64cba in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #12 0x55b4c1ae4b73 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #13 0x55b4c1ae4a33 in main ../src/gtest/src/gtest_main.cc:37
    #14 0x7ff172d1dbba in __libc_start_main ../csu/libc-start.c:308
    #15 0x55b4c1a28dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)

../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:309:52: runtime error: negation of -9223372036854775808 cannot be represented in type 'long int'; cast to an unsigned type to negate this value to itself
    #0 0x563b24dafd2d in random_sdiv_test ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:309
    #1 0x563b24db0f0f in fast_idiv_by_const_int64_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:473
    #2 0x563b24e41111 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #3 0x563b24e2ca1f in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #4 0x563b24dd1e29 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #5 0x563b24dd4502 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #6 0x563b24dd6c90 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #7 0x563b24df266a in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #8 0x563b24e44d2a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #9 0x563b24e303f7 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #10 0x563b24de9c8c in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #11 0x563b24e69b45 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #12 0x563b24e69a05 in main ../src/gtest/src/gtest_main.cc:37
    #13 0x7f9a90330bba in __libc_start_main ../csu/libc-start.c:308
    #14 0x563b24daddc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)

v2:
* Use INT64_MIN instead of LLONG_MIN (Jason Ekstrand)
* Simpler test for INT64_MIN result from rand_sint (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-24 16:21:27 +02:00
Michel Dänzer
69420c28bd util: Use uint64_t for shifting left in sign_extend and strunc
Shifting int64_t values left into the sign bit has undefined behaviour:

../src/util/fast_idiv_by_const.c:175:14: runtime error: left shift of 131 by 56 places cannot be represented in type 'long int'
    #0 0x561337ed10c1 in sign_extend ../src/util/fast_idiv_by_const.c:175
    #1 0x561337ed1335 in util_compute_fast_sdiv_info ../src/util/fast_idiv_by_const.c:239
    #2 0x561337e17519 in fast_idiv_by_const_int8_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:357
    #3 0x561337ea815d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #4 0x561337e93a6b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #5 0x561337e38e75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #6 0x561337e3b54e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #7 0x561337e3dcdc in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #8 0x561337e596b6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #9 0x561337eabd76 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #10 0x561337e97443 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #11 0x561337e50cd8 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #12 0x561337ed0b91 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #13 0x561337ed0a51 in main ../src/gtest/src/gtest_main.cc:37
    #14 0x7f85ba483bba in __libc_start_main ../csu/libc-start.c:308
    #15 0x561337e14dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)

../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:51:14: runtime error: left shift of negative value -63
    #0 0x55fc3c0e67cc in strunc ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:51
    #1 0x55fc3c0e6d93 in smul_high ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:140
    #2 0x55fc3c0e7067 in fast_sdiv ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:181
    #3 0x55fc3c0e858b in fast_idiv_by_const_int8_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:358
    #4 0x55fc3c17915d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #5 0x55fc3c164a6b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #6 0x55fc3c109e75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #7 0x55fc3c10c54e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #8 0x55fc3c10ecdc in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #9 0x55fc3c12a6b6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #10 0x55fc3c17cd76 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #11 0x55fc3c168443 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #12 0x55fc3c121cd8 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #13 0x55fc3c1a1b91 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #14 0x55fc3c1a1a51 in main ../src/gtest/src/gtest_main.cc:37
    #15 0x7fd224759bba in __libc_start_main ../csu/libc-start.c:308
    #16 0x55fc3c0e5dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9)

v2:
* Use two casts instead of changing the argument type (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-24 16:21:01 +02:00
Michel Dänzer
65e376a721 gallium/util: Cast to target type before shifting left
Otherwise a smaller type may be promoted to int, which can hit undefined
behaviour:

../src/gallium/auxiliary/util/u_half.h:126:29: runtime error: left shift of 32768 by 16 places cannot be represented in type 'int'
    #0 0x5646ff63d488 in util_half_to_float ../src/gallium/auxiliary/util/u_half.h:126
    #1 0x5646ff63d749 in _mesa_half_to_float ../src/util/half_float.c:145
    #2 0x5646ff54d557 in nir_const_value_negative_equal ../src/compiler/nir/nir_instr_set.c:372
    #3 0x5646ff44d29a in const_value_negative_equal_test_nir_type_float16_trivially_true_Test::TestBody() ../src/compiler/nir/tests/negative_equal_tests.cpp:121
    #4 0x5646ff505c05 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #5 0x5646ff4f1513 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #6 0x5646ff4979b5 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #7 0x5646ff49a08e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #8 0x5646ff49c81c in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #9 0x5646ff4b81f6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #10 0x5646ff50981e in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #11 0x5646ff4f4eeb in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #12 0x5646ff4af818 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #13 0x5646ff52e639 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #14 0x5646ff52e4f9 in main ../src/gtest/src/gtest_main.cc:37
    #15 0x7f6bacb78bba in __libc_start_main ../csu/libc-start.c:308
    #16 0x5646ff448019 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/compiler/nir/negative_equal+0x17c019)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-24 16:20:31 +02:00
Michel Dänzer
2b1b56cb3a intel/fs: Check for NULL key in fs_visitor constructor
Flagged by UBSan:

../src/intel/compiler/brw_fs_visitor.cpp:986:20: runtime error: member access within null pointer of type 'const struct brw_base_prog_key'
    #0 0x559fadb48556 in fs_visitor::init() ../src/intel/compiler/brw_fs_visitor.cpp:986
    #1 0x559fadb46db3 in fs_visitor::fs_visitor(brw_compiler const*, void*, void*, brw_base_prog_key const*, brw_stage_prog_data*, nir_shader const*, unsigned int, int, brw_vue_map const*) ../src/intel/compiler/brw_fs_visitor.cpp:962
    #2 0x559fad9c7cd8 in saturate_propagation_fs_visitor::saturate_propagation_fs_visitor(brw_compiler*, brw_wm_prog_data*, nir_shader*) (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x61bcd8)
    #3 0x559fad9960a1 in saturate_propagation_test::SetUp() ../src/intel/compiler/test_fs_saturate_propagation.cpp:65
    #4 0x559fadd7a32d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #5 0x559fadd65c3b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #6 0x559fadd0af75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2470
    #7 0x559fadd0d8a4 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #8 0x559fadd10032 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #9 0x559fadd2ba0c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #10 0x559fadd7df46 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #11 0x559fadd69613 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #12 0x559fadd2302e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #13 0x559fadda2d61 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #14 0x559fadda2c21 in main ../src/gtest/src/gtest_main.cc:37
    #15 0x7fe8f6748bba in __libc_start_main ../csu/libc-start.c:308
    #16 0x559fad9950f9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x5e90f9)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-24 16:20:04 +02:00
Michel Dänzer
41623be20e intel/compiler: Cast to target type before shifting left
Otherwise a smaller type may be promoted to int, which can hit undefined
behaviour:

../src/intel/compiler/brw_packed_float.c:66:17: runtime error: left shift of 128 by 24 places cannot be represented in type 'int'
    #0 0x5604a03969aa in brw_vf_to_float ../src/intel/compiler/brw_packed_float.c:66
    #1 0x5604a0391305 in vf_float_conversion_test_test_vf_to_float_Test::TestBody() ../src/intel/compiler/test_vf_float_conversions.cpp:70
    #2 0x5604a041a323 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #3 0x5604a0405c31 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #4 0x5604a03ab03b in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #5 0x5604a03ad714 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #6 0x5604a03afea2 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #7 0x5604a03cb87c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #8 0x5604a041df3c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #9 0x5604a0409609 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #10 0x5604a03c2e9e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #11 0x5604a0442d57 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #12 0x5604a0442c17 in main ../src/gtest/src/gtest_main.cc:37
    #13 0x7f9a1983dbba in __libc_start_main ../csu/libc-start.c:308
    #14 0x5604a0390d89 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/vf_float_conversions+0x8dd89)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-24 16:19:23 +02:00
Michel Dänzer
59b72bdfb4 intel/compiler: Don't left-shift by >= the number of bits of the type
To avoid it, use the modulo of the number of bits in the value being
shifted, which is presumably what ended up happening on x86.

Flagged by UBSan:

../src/intel/compiler/brw_eu_validate.c:974:33: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
    #0 0x561abb612ab3 in general_restrictions_on_region_parameters ../src/intel/compiler/brw_eu_validate.c:974
    #1 0x561abb617574 in brw_validate_instructions ../src/intel/compiler/brw_eu_validate.c:1851
    #2 0x561abb53bd31 in validate ../src/intel/compiler/test_eu_validate.cpp:106
    #3 0x561abb555369 in validation_test_source_cannot_span_more_than_2_registers_Test::TestBody() ../src/intel/compiler/test_eu_validate.cpp:486
    #4 0x561abb742651 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #5 0x561abb72e64d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #6 0x561abb6d5451 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474
    #7 0x561abb6d7b2a in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656
    #8 0x561abb6da2b8 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774
    #9 0x561abb6f5c92 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649
    #10 0x561abb74626a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402
    #11 0x561abb732025 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438
    #12 0x561abb6ed2b4 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257
    #13 0x561abb768b3b in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233
    #14 0x561abb7689fb in main ../src/gtest/src/gtest_main.cc:37
    #15 0x7f525e5a9bba in __libc_start_main ../csu/libc-start.c:308
    #16 0x561abb538ed9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/eu_validate+0x1b8ed9)

Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-24 16:16:49 +02:00
Eric Engestrom
47571a01ec anv: fix error message
`strerror()` takes an `errno`, not the negative value returned by the
`ioctl()`.
Instead of fixing this as `"%s", strerror(errno)`, let's just use the
`"%m"` shortcut for it.

Fixes: 2b5f30b1d9 ("anv: implement VK_INTEL_performance_query")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-24 13:57:40 +00:00
Eric Engestrom
8d43e2b2de meson: add -Werror=empty-body to disallow if(x);
This would have prevented a bug in MR 2058 [1]; with that MR fixed,
nothing else uses empty-body blocks, so let's just forbid them altogether.

[1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2058#note_237880

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 14:54:09 +01:00
Eric Engestrom
1177151b6d llvmpipe: avoid generating empty-body blocks
Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 14:54:09 +01:00
Eric Engestrom
abe32f56f5 llvmpipe: avoid compiling no-op block on release builds
Suggested-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-24 14:54:09 +01:00
Thomas Hellstrom
91146c0796 winsys/svga: Limit the maximum DMA hardware buffer size
The kernel total GMR/DMA size is limited, but it's definitely possible for the
kernel to allow a larger buffer allocation to succeed, but command
submission using that buffer as a GMR would fail typically causing an
application crash.

So have the winsys limit the size of GMR/DMA buffers. The pipe driver will
then resort to allocating smaller buffers and perform the DMA transfer in
multiple bands, also allowing for the pre-flush mechanism to kick in.

This avoids the related application crashes.

Fixes: e7843273fa ("winsys/svga: Update to vmwgfx kernel module 2.1")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-24 15:08:43 +02:00
Thomas Hellstrom
00db976905 svga: Fix banded DMA upload unmap
Even with banded DMA uploads, st->hwbuf is always non-NULL, but when we've
allocated a software buffer to hold the full upload, unmapping of the
hardware buffer has already been done before
svga_texture_transfer_unmap_dma(), and the code was performing an unmap of
an already mapped buffer.

Fix this by testing for software buffer not present.

Fixes: a9c4a861d5 ("svga: refactor svga_texture_transfer_map/unmap functions")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-24 15:08:43 +02:00
Tomeu Vizoso
3168b8defa gitlab-ci: Update kernel for LAVA jobs to 5.4-rc4
Update to 5.4-rc4 so we can test Panfrost on devices with Mali T720 and
T820.

A bug was found that prevented things working at all on RK3288 devices,
so we carry a patch for now in my personal fork.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2019-10-24 08:47:37 +02:00
Timothy Arceri
1961653c89 glsl: remove propagate_invariance() call from the linker
This was added in 586f4a42e7 and became redundant with 34ab9b0947

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-24 13:24:49 +11:00
Timothy Arceri
922801b77d nir: improve nir_variable packing
Before:

/* size: 136, cachelines: 3, members: 10 */

After:

/* size: 128, cachelines: 2, members: 10 */

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-10-24 13:24:40 +11:00
Timothy Arceri
c412ff426b nir: fix nir_variable_data packing
Before:

/* size: 60, cachelines: 1, members: 29 */

After:

/* size: 56, cachelines: 1, members: 29 */

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-10-24 13:22:59 +11:00
Marek Olšák
fff884e09d radeonsi/nir: implement pipe_screen::finalize_nir
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
92196fe74b st/mesa: use pipe_screen::finalize_nir
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
43efccb657 tgsi_to_nir: use pipe_screen::finalize_nir
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
fb04e5da97 gallium: add pipe_screen::finalize_nir
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
8a0dd0af3f st/mesa: update VS shader_info for NIR after lowering passes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
28199aeee5 st/mesa: assign driver locations for VS inputs for NIR before caching
fix up edge flags in the NIR pass, because st/mesa doesn't touch the inputs
after caching

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
eaffdad108 st/mesa: don't lower_global_vars_to_local for VS if there are no dead inputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
3634dca99a st/mesa: move some NIR lowering before shader caching
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:12:52 -04:00
Marek Olšák
c2efd2cbfb util/u_queue: skip util_queue_finish if num_threads is 0
This fixes a deadlock in pthread_barrier_destroy.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 21:11:17 -04:00
Marek Olšák
e096011def util/disk_cache: finish all queue jobs in destroy instead of killing them
If there are queued shaders to be written to disk, wait for that.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-23 20:22:50 -04:00
Kenneth Graunke
8dadef2ec5 iris: Rework edgeflag handling
We were relying on specific pass ordering in st to avoid setting
inputs_read/outputs_written for edge flags.  Instead, just assume
that it happens and throw out the results we don't want.

We should probably revisit this and try and add a vertex element
property like I originally wanted so we can avoid having it be
associated with the VS altogether.
2019-10-23 16:38:27 -07:00
Marek Olšák
6b166d6fb1 gallium/noop: implement get_disk_shader_cache and get_compiler_options
trivial
2019-10-23 18:11:19 -04:00
Rhys Perry
fc04a2fc31 aco: take LDS into account when calculating num_waves
pipeline-db (Vega):
SGPRS: 344 -> 344 (0.00 %)
VGPRS: 424 -> 524 (23.58 %)
Spilled SGPRs: 84 -> 80 (-4.76 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 52812 -> 52484 (-0.62 %) bytes
LDS: 135 -> 135 (0.00 %) blocks
Max Waves: 56 -> 53 (-5.36 %)

v2: consider WGP, rework to be clearer and apply the
    "maximum 16 workgroups per CU" limit properly
v2: use "SIMD" instead of "EU"
v2: fix spiller by introducing "Program::max_waves"
v2: rename "lds_size" to "lds_limit"
v3: make max_waves actually independant of register usage
v3: fix issue where max_waves was way too high
v3: use DIV_ROUND_UP(a, b) instead of max(a / b, 1)
v3: rename "workgroups_per_cu" to "workgroups_per_cu_wgp"
v4: fix typo from "workgroups_per_cu" rename

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)
2019-10-23 19:11:21 +01:00
Rhys Perry
08d510010b aco: increase accuracy of SGPR limits
SGPRs are allocated in groups of 16 on GFX8/GFX9. GFX10 allocates a fixed
number of SGPRs and has 106 addressable SGPRs.

pipeline-db (Vega):
SGPRS: 5912 -> 6232 (5.41 %)
VGPRS: 1772 -> 1780 (0.45 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 88228 -> 87904 (-0.37 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 559 -> 571 (2.15 %)

piepline-db (Navi):
SGPRS: 341256 -> 363384 (6.48 %)
VGPRS: 171536 -> 170960 (-0.34 %)
Spilled SGPRs: 832 -> 581 (-30.17 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 14207332 -> 14190872 (-0.12 %) bytes
LDS: 33 -> 33 (0.00 %) blocks
Max Waves: 18072 -> 18251 (0.99 %)

v2: unconditionally count vcc as an extra sgpr on GFX10+
v3: pass SGPRs rounded to 8

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-23 19:11:21 +01:00
Rhys Perry
7453c1adff radv: round vgprs/sgprs before calculating max_waves
Note that ACO doesn't correctly round SGPR counts on GFX8/GFX9.

pipeline-db (ACO/Vega):
SGPRS: 11000 -> 11000 (0.00 %)
VGPRS: 3120 -> 3120 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 164328 -> 164328 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1125 -> 1000 (-11.11 %)

v2: consider wave32

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-23 19:11:20 +01:00
Lionel Landwerlin
254d9976b6 docs: Add new Intel extension
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-23 19:07:34 +03:00
Erik Faye-Lund
8ae024d029 Revert "vc4: do not report alpha-test as supported"
This reverts commit a79b93269c.

Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-10-23 13:03:59 +02:00
Erik Faye-Lund
65328bd32d Revert "v3d: do not report alpha-test as supported"
This reverts commit 9d0523b569.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-10-23 13:03:55 +02:00
Erik Faye-Lund
acf1bf47cc Revert "nir: drop support for using load_alpha_ref_float"
This reverts commit 5af272b474.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-10-23 13:03:52 +02:00
Erik Faye-Lund
beb6639a9d Revert "nir: drop unused alpha_ref_float"
This reverts commit e8095f2af0.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-10-23 13:03:38 +02:00
Samuel Pitoiset
f11ea22666 radv: fix a performance regression with graphics depth/stencil clears
I recently changed the slow depth/stencil clear path to make sure
depth values are explicitly exported by the fragment shader. This
is actually only useful when VK_EXT_depth_range_unrestricted is
enabled.

While this path is correct, it introduced a performance regression
with Heroes of the Storm, Shadow of Mordor (Vulkan beta) and
probably more titles. This is because it prevents the hardware
to do some optimizations like discarding fragments.

This commit re-introduces the previous (a bit faster) slow
depth/stencil clear path and it selects the unrestricted path
only if VK_EXT_depth_range_unrestricted is enabled.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/863
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 10:23:47 +02:00
Samuel Pitoiset
7562a2cbe3 radv: fix vkUpdateDescriptorSets with inline uniform blocks
descriptorCount is the number of bytes into the descriptor, so
it shouldn't be used as an index. srcArrayElement/dstArrayElement
specify the starting byte offset within the binding to copy from/to.

This fixes new CTS tests:
dEQP-VK.binding_model.descriptor_copy.*.inline_uniform_block_*
dEQP-VK.binding_model.descriptor_copy.*.mix_3
dEQP-VK.binding_model.descriptor_copy.*.mix_array1

Fixes: 8d2654a419 ("radv: Support VK_EXT_inline_uniform_block.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 09:59:22 +02:00
Samuel Pitoiset
9c92a21fe5 radv/gfx10: fix 3D images
GFX10 does act like GFX9 actually.

This fixes
dEQP-VK.glsl.texture_functions.query.texturesize.*sampler3d_*.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 09:45:49 +02:00
Samuel Pitoiset
41ace1d939 radv/gfx10: re-enable fast depth/stencil clears with separate aspects
It used to cause weird issues on GFX10 in the past with vkmark and
Wreckfest, and they can't be reproduced now. Shadow Of Mordor
(Vulkan beta) hits that path and it works fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 09:18:06 +02:00
Samuel Pitoiset
956d825ed8 radv: do not emit rbplus if attachments are undefined
Fixes some crashes with dEQP-VK.geometry.layered.*.secondary_cmd_buffer
on Raven and other chips that allow rbplus.

This just prevents a crash and rbplus probaby needs more work.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 08:57:31 +02:00
Samuel Pitoiset
411ad8e7c5 radv: add an assertion in radv_gfx10_compute_bin_size()
To prevent out of bounds access.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 08:33:12 +02:00
Samuel Pitoiset
f4ab58c1a0 radv: do not create meta pipelines with 16 samples
The driver only supports up to 8 samples, so it's useless to
create more pipelines than needed.

This fixes a conditional jump reported by Valgrind on GFX10:

==194282== Conditional jump or move depends on uninitialised value(s)
==194282==    at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242)
==194282==    by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334)
==194282==    by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440)
==194282==    by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764)
==194282==    by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788)
==194282==    by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114)
==194282==    by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297)
==194282==    by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277)
==194282==    by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363)
==194282==    by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080

This is caused by an out of bound access of 'fmask_array' (ie. index
is 4 as for 16 samples).

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 08:33:08 +02:00
Lionel Landwerlin
2b5f30b1d9 anv: implement VK_INTEL_performance_query
v2: Introduce the appropriate pipe controls
    Properly deal with changes in metric sets (using execbuf parameter)
    Record marker at query end

v3: Fill out PerfCntr1&2

v4: Introduce vkUninitializePerformanceApiINTEL

v5: Use new execbuf extension mechanism

v6: Fix comments in genX_query.c (Rafael)
    Use PIPE_CONTROL workarounds (Rafael)
    Refactor on the last kernel series update (Lionel)

v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:15 +00:00
Lionel Landwerlin
5ba6d9941b intel/perf: add mdapi writes for register perf counters
Those are not part of the OA reports.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:15 +00:00
Lionel Landwerlin
a2a1873a82 intel/genxml: add RPSTAT register for core frequency
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:15 +00:00
Lionel Landwerlin
e0ab658acd intel/genxml: add generic perf counters registers
We have 2 of those we can configure to source programmable events.
Those are not part of the OA reports. Configuration happens in i915
through the metric set selected by the application. On the Mesa side
we'll just sample those and do a diff.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
11c4bf9417 intel/perf: add support for querying kernel loaded configurations
We use this as a communication mechanism between MDAPI & Anv.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
13f802291d drm-uapi: Update headers from drm-next
Pull new updates from drm-next as of the following commit:

commit f1b4a9217efd61d0b84c6dc404596c8519ff6f59
Merge: 400e91347e1d f3a36d469621
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Oct 22 15:04:00 2019 +1000

    Merge tag 'du-next-20191016' of git://linuxtv.org/pinchartl/media into drm-next

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
db7a6847dd intel/perf: move registers to their own header
Will conflict with the genxml RPSTAT register.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
e1d5d75257 intel/perf: extract register configuration
We want to query the content of register configurations from the
kernel. Let's pull this out of the query.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
a338b7d739 intel/perf: expose some utility functions
The Vulkan performance query extension is a bit lower level than the
GL one. Expose some of the functions to do the result accumulation
directly in the Anv driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
a0e0e75db1 intel/perf: add mdapi maker helper
A simple utility to put the marker at the right location.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Kenneth Graunke
c352cdf970 st/mesa: Silence chatty debug printf
Other debug_printf's in this file are in if (0) blocks.

Trivial.
2019-10-22 18:01:41 -07:00
Chris Wilson
0899bf55d4 st/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM
This is useful for PBO texture upload with GL_RGB and GL_UNSIGNED_BYTE.

v2: Vasily Khoruzhick provided an update for the Lima CI expectations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-22 22:13:14 +00:00
Lionel Landwerlin
0dfa643feb anv: fix unwind of vkCreateDevice fail
We're skipping the context destruction in some cases which is the
grand scheme of thing is not that important because closing device->fd
will destroy the associated context as well.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: b30e01aef5 ("anv: fix memory leak on device destroy")
2019-10-22 20:44:26 +00:00
Rhys Perry
118a32e5ba Revert "aco: only emit waitcnt on loop continues if we there was some load or export"
We don't properly pass on ctx.lgkm_cnt/ctx.barrier_imm/etc, so this
waitcnt was necessary for barriers and correctly waiting for SMEM before
s_dcache_wb on GFX10.

Totals from affected shaders:
SGPRS: 33200 -> 33200 (0.00 %)
VGPRS: 31376 -> 31376 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2431804 -> 2433956 (0.09 %) bytes
LDS: 316 -> 316 (0.00 %) blocks
Max Waves: 1609 -> 1609 (0.00 %)

This reverts commit 2c050b49b3.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
964ce47abc aco: add missing bld.scc()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
c96289a70e aco: keep can_reorder/barrier when combining addition into SMEM
Affects 30 shaders in the pipeline-db (all youngblood).

Totals from affected shaders:
SGPRS: 2656 -> 2456 (-7.53 %)
VGPRS: 2260 -> 2260 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 240680 -> 240944 (0.11 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 90 -> 90 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
57c2cfb608 aco: add a few missing checks in value numbering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
a8d0101d69 aco: use ds_read2_b64/ds_write2_b64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
bdf47a1273 aco: properly combine additions into ds_write2_b64/ds_read2_b64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
58d4aee5df aco: fix sparse store_lds()
p_extract_vector's second operand is in units of the definition size, not
dwords.

v2: move extract_subvector() to right before ds_write_helper

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
a856629e8f aco: create load_lds/store_lds helpers
We'll want these for GS, since VS->GS IO on Vega is done using LDS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
a400928f4a aco: fix 64-bit p_extract_vector on 32-bit p_create_vector
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Rhys Perry
f6f15859de aco: small stage corrections
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-22 18:52:29 +00:00
Marek Olšák
f764725b3e st/mesa: replace pipe_shader_state with tgsi_token* in st_vp_variant
we don't need more than that

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-22 14:41:25 -04:00
Marek Olšák
a0b711d8e9 nir: allow nir_lower_uniforms_to_ubo to be run repeatedly
for st/mesa

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-22 14:41:23 -04:00
Rob Clark
aa8515463e freedreno/ir3: fixup register footprint fixup
Small typo resulted in not converting footprint to vec4, meaning that we
could potentially ask for quite a few more registers than required

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-22 17:46:19 +00:00
Rob Clark
4c060235a2 freedreno/ir3: handle scalarized varying inputs
If the load_interpolated_input is scalarized, we would be too
conservative about deciding the tex instruction wasn't a candidate to
pre-fetch:

	vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */)
	vec2 32 ssa_1 = intrinsic load_barycentric_pixel () (0) /* interp_mode=0 */
	vec1 32 ssa_2 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 0) /* base=0 */ /* component=0 */	/* packed:v_uv,v_uv1 */
	vec1 32 ssa_3 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 1) /* base=0 */ /* component=1 */	/* packed:v_uv,v_uv1 */
	vec2 32 ssa_8 = vec2 ssa_2, ssa_3
	vec4 32 ssa_9 = tex ssa_8 (coord), 0 (texture), 0 (sampler)

Really we don't care that the texcoord components come from different
load_interpolated_input instructions, just that they have consecutive
varying offsets.

Reported-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-22 17:46:19 +00:00
Daniel Schürmann
3a20ef4a32 aco: refactor value numbering
Previously, we used one hashset per BB, so that we could
always initialize the current hashset from the immediate
dominator. This patch changes the behavior to a single
hashmap using the block index per instruction to resolve
dominance.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-22 17:18:59 +02:00
Erik Faye-Lund
3a71e1d27b mesa/st: assert that lowering is supported
Some of these lowerings aren't supported for drivers that supports
tesselation and geometry shaders. Let's add a couple of asserts to make
it obvious if these have been enabled when it's not possible.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-22 12:07:23 +00:00
Michel Dänzer
793f6b30d9 gitlab-ci: Enable llvmpipe in ARM build jobs
v2:
* Use LLVM 8 from buster-backports
v3:
* Use LLVM 7 again for armhf, llvmpipe is still broken there with LLVM 8

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-22 10:26:29 +00:00
Michel Dänzer
59e7f1413c gitlab-ci: Update the meson cross file for LLVM_VERSION as well
Cross builds don't use the llvm-config path from the native file.
2019-10-22 10:26:29 +00:00
Michel Dänzer
163ec5d808 gitlab-ci: Use native aarch64 runner for ARM build jobs
This allows running the regression tests.

One downside is that we can't easily build the Vulkan overlay layer,
because only x86 binaries of the glslang validator are available. If
that's important, we could either use those binaries via qemu, or build
it from source.

v2:
* Add :amd64 suffix to existing debian-9/10 job names (Eric Engestrom)

Acked-by: Eric Engestrom <eric.engestrom@intel.com> # v1
2019-10-22 10:26:29 +00:00
Michel Dänzer
c5aa2711a4 gitlab-ci: Explicitly list debian-10 in needs: for .deqp-test template
Apparently needs: in a definition overwrites inherited ones. So
.deqp-test effectively didn't declare needs: for debian-10, which means
any jobs based on .deqp-test could spuriously run after the debian-10
job failed or was cancelled.
2019-10-22 10:26:29 +00:00
Michel Dänzer
38d42cf1d5 gitlab-ci: Bring ARM docker image install script in line with x86_64
Use https:// URLs in the APT configuration.

Drop --no-install-recommends, the image generation template disables
installation of recommended packages in /etc/apt/apt.conf.

Run apt-get autoremove at the end, cleaning up packages which were
installed to satisfy dependencies but are no longer needed.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-22 10:26:29 +00:00
Michel Dänzer
e3c7e04dfa gitlab-ci: Sort ARM docker image packages in alphabetical order
No functional change.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-22 10:26:29 +00:00
Samuel Pitoiset
a13320370e radv: fix updating bound fast ds clear values with different aspects
On GFX9, the driver is able to do an optimized fast depth/stencil
clear with only one aspect (ie. clear the stencil part of a
depth/stencil image). When this happens, the driver should only
update the clear values of the given aspect.

Note that it's currently only supported on GFX9 but I have some
local patches that extend this optimized path for other gens.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-22 11:16:13 +02:00
Sagar Ghuge
97e6d34e66 intel/compiler: Refactor disassembly of sources in 3src instruction
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-10-21 20:32:43 -07:00
Sagar Ghuge
18b28b5654 intel/compiler: Don't move immediate in register
On Gen12, we support mixed mode HF/F operands, and also 3 source
instruction supports immediate value support, so keep immediate as it
is, if it fits properly in 16 bit field.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-10-21 20:32:43 -07:00
Sagar Ghuge
bf943bdf24 intel/compiler: Set bits according to source file
On Gen >= 12, if src0 or src2 holds immediate value, we need set
src[0/2]_is_imm bits instead of register file.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-10-21 20:32:43 -07:00
Sagar Ghuge
c018c5a339 intel/compiler: Add Immediate support for 3 source instruction
On Gen >= 10, Either src0 or src2 can use 16-bit immediate value, but
not both.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-10-21 20:32:43 -07:00
Eric Anholt
fb9362c6fb ci: Disable lima until its farm can get fixed.
It's been throwing the following error today:

"<Fault -32603: 'Internal Server Error (contact server administrator
for details): could not extend file "base/17952/18226": No space left
on device\nHINT: Check free disk space.\n'>"

Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-10-21 20:31:34 -07:00
Sagar Ghuge
7fb75ddfa7 intel: Add missing entry for brw_nir_lower_alpha_to_coverage in Makefile
Fixes: 7ecfbd4f6d ("nir: Add alpha_to_coverage lowering pass")

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-21 16:19:24 -07:00
Dave Airlie
bde08ce4d7 llvmpipe: handle compute shader launch with 0 threads
If you set LP_NUM_THREADS=0 compute shaders would hang,
just execute the workloads in sequence if we have no threads
in the pool.

Fixes: 1b24e3ba75 ("llvmpipe: add compute threadpool + mutex")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-21 22:51:23 +00:00
Marijn Suijten
0141a4cdc0 freedreno/ir3: Add missing ir3_nir_lower_tex_prefetch.c to Android.mk
This file is created in 2a0d45ae6c but
addition to android makefiles was omitted. It breaks the build with
missing references which are defined in this file.
List the file in ir3_SOURCES to make the build succeed.

Signed-off-by: Marijn Suijten <marijns95@gmail.com>
2019-10-21 22:43:00 +00:00
Samuel Pitoiset
39760793b5 ac/llvm: fix ac_to_integer_type() for 32-bit const addr space pointers
This fixes some crashes with dEQP-VK.descriptor_indexing.* when
read_first_invocation has its source from a descriptor.

Most of these tests still fail because of an LLVM bug (they work
with ACO).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 22:32:01 +02:00
Rhys Perry
73184e51d1 aco: run opt_algebraic in a loop
Totals from affected shaders:
SGPRS: 13920 -> 13656 (-1.90 %)
VGPRS: 12972 -> 12960 (-0.09 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1005680 -> 1000648 (-0.50 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Max Waves: 688 -> 688 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 19:18:30 +00:00
Rhys Perry
132ae89b19 aco: use nir_lower_idiv_precise
v7: rename _nv50/_llvm to _fast/_precise

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 18:49:46 +00:00
Rhys Perry
8b98d0954e nir/lower_idiv: add new llvm-based path
v2: make variable names snake_case
v2: minor cleanups in emit_udiv()
v2: fix Panfrost build failure
v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
v4: remove nir_op_urcp
v5: drop nv50 path
v5: rebase
v6: add back nv50 path
v6: add comment for nir_lower_idiv_path enum
v7: rename _nv50/_llvm to _fast/_precise
v8: fix etnaviv build failure

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 18:49:46 +00:00
Sagar Ghuge
f729ecefef intel/compiler: Remove emit_alpha_to_coverage workaround from backend
Remove emit_alpha_to_coverage workaround from backend compiler and start
using ported workaround from NIR.

v2: Copy comment from brw_fs_visitor (Caio Marcelo de Oliveira Filho)

Fixes piglit test on HSW:
- arb_sample_shading-builtin-gl-sample-mask-mrt-alpha-to-coverage-combinations

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-21 11:27:29 -07:00
Sagar Ghuge
7ecfbd4f6d nir: Add alpha_to_coverage lowering pass
Importing this pass from fs_visitor::emit_alpha_to_coverage_workaround()
in intel/compiler.

v2 (Caio Marcelo de Oliveira Filho):
- Track store output and sample mask instruction
- Nest math insturction for more readability
- Bail out early if no gl_SampleMask

v3: (Caio Marcelo de Oliveira Filho):
- Do math instructions after instruction block
- Restructure code
- Move pass under src/intel/compiler

v4: (Caio Marcelo de Oliveira Filho):
- Organize dither mask calculation

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-21 11:27:29 -07:00
Daniel Schürmann
0e4bd261b1 aco: ensure that uniform booleans are computed in WQM if their uses happen in WQM
This fixes graphical corruption in SC2.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-21 17:39:46 +00:00
Dylan Baker
a9a9249288 meson: Require meson >= 0.49.1 when using icc or icl
0.49.0 can compile most of mesa with ICC or ICL, but not SWR without
additional workarounds in our meson.build files. Bumping patch version
is easier and shouldn't be a big burden anyway, especially to cover a
niche compiler. The check originally only covered ICC, but now covers
ICL as well.

Fixes: 3740ffb59c
       ("meson: add switches for SWR with MSVC")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1937
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-21 17:21:57 +00:00
Juan A. Suarez Romero
d33fe2d5eb docs: update calendar, add news item and link release notes for 19.1.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-10-21 19:13:55 +02:00
Juan A. Suarez Romero
62a0e8421e docs: add release notes for 19.1.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cc88eeb6ff)
2019-10-21 19:10:52 +02:00
Juan A. Suarez Romero
7aa63ffe4f docs: add release notes for 19.1.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 5c6d266c59)
2019-10-21 19:10:49 +02:00
Timur Kristóf
7e5f87b533 aco/gfx10: Update constant addresses in fix_branches_gfx10.
Due to a bug in GFX10 hardware, s_nop instructions must be added
if a branch is at 0x3f. We already do this, but forgot to also update
the constant addresses that come after this instruction.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 14:33:54 +00:00
Timur Kristóf
f380398f8f aco/gfx10: Fix PS exports for SPI_SHADER_32_AR.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 14:33:54 +00:00
Timur Kristóf
1749953ea3 aco/gfx10: Wait for pending SMEM stores before loads
Currently if you have an SMEM store followed by an SMEM load that
loads the same location as was written, it won't work because the
store isn't finished before the load is executed. This is NOT
mitigated by an s_nop instruction on GFX10.

Since we currently don't have proper alias analysis, this commit adds
a workaround which will insert an s_waitcnt lgkmcnt(0) before each
SSBO load if they follow a store. We should further refine this in
the future when we can make sure to only add the wait when we load the
same thing as has been stored.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 14:33:54 +00:00
Boris Brezillon
7fa5cd3ee3 panfrost: Fix the DISCARD_WHOLE_RES case in transfer_map()
The current implementation does not synchronize on BO readiness when
DISCARD_WHOLE_RES flag is set, which can lead to misbehaviours when the
resource being updated is being used by one of the pending or already
flushed batches.

Adding unconditional BO synchronization would do the trick, but we can
sometimes optimize this path by re-allocating a new BO instead of
waiting for the existing one to be ready.

Reported-by: Daniel Stone <daniels@collabora.com>
Reported-by: Heinrich Fink <heinrich.fink@daqri.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-21 14:37:02 +02:00
Iago Toral Quiroga
2d5edf2558 st/mesa: only require ESSL 3.1 for geometry shaders
According to the OES_geometry_shader spec, section Dependencies:

   "OpenGL ES 3.1 and OpenGL ES Shading Language 3.10
    are required."

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-21 09:09:15 +00:00
Lepton Wu
f4ba31ff50 egl/android: Remove our own reference to buffers.
We currently doesn't maintain it correctly and the buffer gets leaked if
surface is destroyed before calling swapping buffers.

From Android frameworks/native/libs/nativewindow/include/system/window.h:

  The window holds a reference to the buffer between dequeueBuffer and
  either queueBuffer or cancelBuffer, so clients only need their own
  reference if they might use the buffer after queueing or canceling it.

v2: Remove our own reference.

Fixes: 0212db3504 ("egl/android: Cancel any outstanding ANativeBuffer in surface destructor")

Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1)
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-21 07:50:31 +00:00
Samuel Pitoiset
b72205a4c1 radv: advertise VK_KHR_spirv_1_4
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 09:21:40 +02:00
Samuel Pitoiset
b139198b06 radv: do not dump descriptors twice in hang reports
If a pipeline has both graphics and compute, descriptors are same.
While we are at it, use queue->device for simplicity.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
cf5e55558e radv: dump trace files earlier if a GPU hang is detected
To make sure a trace file is generated in case the driver crashes
during the hang report generation (which happens sometimes).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
bc2319deb2 radv: print which ring is dumped in hang reports
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
076f9dce7c radv: do not print useless descriptors info in hang reports
This information has never been useful. All descriptors are
already dumped with colors etc, and it's more useful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
9da94e510c radv: enable VK_KHR_shader_float_controls on GFX6-GFX7
Disable 16-bit features because fp16 isn't exposed on these chips.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:47:28 +02:00
Alyssa Rosenzweig
4c9b9ed5f9 panfrost/ci: Update expectations list
A bunch of blend tests fixed on T760. A single blend test regressed on
both T760/T860 but I am unable to reproduce locally so am just
documenting the regression and moving on.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
b8c4fb235e pan/midgard: Implement SIMD-aware dead code elimination
We would like to eliminate not just entire dead instructions, but also
dead components, which increases scheduler flexibility (since some
vector instructions can become scalar after eliminating dead
components). This also will allow better RA in the future.

Results are meh.

total instructions in shared programs: 3453 -> 3451 (-0.06%)
instructions in affected programs: 60 -> 58 (-3.33%)
helped: 2
HURT: 0

total bundles in shared programs: 1826 -> 1824 (-0.11%)
bundles in affected programs: 33 -> 31 (-6.06%)
helped: 2
HURT: 0

total quadwords in shared programs: 3144 -> 3144 (0.00%)
quadwords in affected programs: 0 -> 0
helped: 0
HURT: 0

total registers in shared programs: 321 -> 321 (0.00%)
registers in affected programs: 45 -> 45 (0.00%)
helped: 11
HURT: 11
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 16.67% max: 50.00% x̄: 39.70% x̃: 50.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for registers value: -0.45 0.45
95% mean confidence interval for registers %-change: -1.87% 62.18%
Inconclusive result (value mean confidence interval includes 0).

total threads in shared programs: 445 -> 447 (0.45%)
threads in affected programs: 2 -> 4 (100.00%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
6c4b97011b pan/midgard: Create dependency graph bytewise
This allows for vec16 dependencies in the scheduler, not that we have
any yet (thankfully).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
825f11e739 pan/midgard: Handle nontrivial masks in texture RA
The texture instruction has a mask we need to take into account.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
d1d3411ba5 pan/midgard: Implement per-byte liveness tracking
Now that we have notion of byte masks, liveness tracking can be updated
to reflect this extra granularity without loss of correctness.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
43fd730fc4 pan/midgard: Simplify mir_bytemask_of_read_components
There are easy ways to iterate sources!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
e9202ff3cb pan/midgard: Report byte masks for read components
Read component masks don't have a particular type associated, since the
type of the ALU operation may not match the type of the operands in
question. So let's generate byte masks instead, and update the rest of
the compiler to use byte masks when analyzing reads.

Preparation for mixed types.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
d079631248 pan/midgard: Add helpers for manipulating byte masks
There are essentially two formats of masks in play beginning with this
commit: masks per-channel and masks per-byte. The former make sense
within a given fixed-size instruction; the latter are
typesize-independent. It turns out you need the latter to meaningfully
manipulate instructions containing multiple sizes (which is quite
possible with ALU operations).

Similarly, we have mir_srcsize. We calculate the size of the source by
analyzing the size of the instruction itself and stepping down if there
is a half-modifier.

Finally, we have mir_round_bytemask_down, for when we want to take a
byte mask and "round it down" to a given component size, so that we can
use it as a component mask.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
e981b69484 pan/midgard: Implement OP_IS_STORE with table
..rather than open-coding.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
8e31b14858 pan/midgard: Tableize load/store ops
This will allow us to encode properties about the load/store ops like we
do for ALU ops. We include now properties about whether we have a store,
and if there are special cases on the load/store op. We also tag each
instruction by its natural size... this is probably not totally right,
but it's a start.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
5952add9a9 pan/midgard: Factor out mir_get_alu_src
This helper is used in a bunch of places ... might as well make that
common.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
f77ea9798d pan/midgard/disasm: Fix printing 8-bit/16-bit masks
The trick is realizing even with a destination override, the masks are encoded in the same mode as the
instruction itself, rather than stepping down. The override means that
the smaller type is used, but the mask is parsed as if it were the
higher type. Overriding down is down by printed by blinding doing this. Overriding up can be thought of as printing in the upper size, but shifting the alphabet to use the upper half, i.e. shifting xyzw to become abcd.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
d49fdca229 pan/midgard: Identify 64-bit atomic opcodes
They are symmetric to their 32-bit counterparts, just shifted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
6601570ead pan/midgard: Debug mir_insert_instruction_after_scheduled
Add some comments explaining what's going on in a more natural flow in
order to solve the actual bug.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 2d914ebe81 ("pan/midgard: Fix memory corruption in register spilling")
2019-10-20 12:02:31 +00:00
Christian Gmeiner
a6de05a968 etnaviv: keep track of buffer valid ranges for PIPE_BUFFER
This allows a write to proceed to an uninitialized part of a buffer
even when the GPU is using the previously-initialized portions.

Such a situation can be triggered with the following API usage example:

  glBufferSubData(..., offset, size, data1);
  glDrawArrays(...);
  // append new vertex data
  glBufferSubData(..., offset+size, size, data2);
  glDrawArrays(...);

Same is done for freedreno, nouveau and radeon.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-20 09:03:06 +00:00
Christian Gmeiner
eab6d75066 etnaviv: store updated usage in pipe_transfer object
Store the changed usage in the newly created transfer object.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-20 09:03:06 +00:00
Christian Gmeiner
cd4528563f etnaviv: fix code style
Fixes: 1194afdfe3 ("etnaviv: rework the stream flush to always go through the context flush")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-20 10:20:22 +02:00
Lionel Landwerlin
b30e01aef5 anv: fix memory leak on device destroy
v2: handle vma destruction if vkCreateDevice fails (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1959
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-20 08:02:22 +00:00
Christian Gmeiner
f834656a41 etnaviv: fix compile warnings
Fixes: e5cc66dfad ("etnaviv: Rework locking")
Fixes: 1456aa61cc ("etnaviv: Rework resource status tracking")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-20 08:28:18 +02:00
Eric Anholt
d8741ad251 mesa: Redefine the RG formats as array formats.
This is the layout used in the GL API, and maps directly to PIPE
formats with no endianness trickery.  As with the LA change, this
fixes big-endian fetching from texbos.  Also cleans up some endian
shenanigans in shader images.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-20 04:39:48 +00:00
Eric Anholt
4f384ddf5f gallium: Drop the unused PIPE_FORMAT_A*L* formats.
Now that Mesa is also using an array format for LA, nothing was using
these.  (And, clearly, no HW driver had exposed them).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-20 04:39:48 +00:00
Eric Anholt
6a819cabe8 mesa: Replace MESA_FORMAT_L8A8/A8L8 UNORM/SNORM/SRGB with an array format.
The array format is what the GL API wants (fixing texbos on
big-endian), and matches directly to gallium's corresponding array
format.  The only driver exposing A8L8 was radeon/r200 in big-endian,
where the HW's underlying format was trying to read as array and we
needed to flip things around to make our packed format come out right
(note that while the radeon format tables had both AL and LA,
ChooseTextureFormat would only pick one of them based on endianness).

v2: Don't make r200/radeon use endian swaps.
v3: Rebase on dropping the r200 _be/_le format table removal patch
v4: reword commit message to explain why we can drop both formats
    from radeon.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-10-20 04:39:48 +00:00
Eric Anholt
236b478b2e mesa: Replace the LA16_UNORM packed formats with one array format.
The array format is what the GL API wants (and we made a mistake in
the format returned for texbos on big-endian!), and it's exactly what
the gallium-side PIPE_FORMAT_L16A16 is.  The only downside is that
dri_util tries to fall back to sampling RG16 using LA16, which doesn't
have a match for big-endian any more.  No HW drivers supported A16L16
anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-20 04:39:48 +00:00
Eric Anholt
1165e3f360 radeon: Drop the unused first arg of OUT_BATCH_RELOC.
This was a trap when trying to figure out how to fit data bits into
the reloc.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-20 04:39:48 +00:00
Eric Anholt
2a548cf92f radeon: Fill in the TXOFFSET field containing the tile bits in our relocs.
The first arg to OUT_BATCH_RELOC is ignored, we actually wanted these
in the third arg.  They're always 0 so far, so it didn't matter.

v2: Reword commit message that I don't end up using the tile bits, but
    keep the commit as a cleanup anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-10-20 04:39:48 +00:00
Eric Anholt
ecddabfa76 r100/r200: factor out txformat/txfilter setup from the TFP path.
No matter what, we deref the texFormat from the table, except for a
mistake in cpp=4 where we pulled a 0 out of the table either way.

v2: Rebase on dropping r200 table deduplication patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2019-10-20 04:39:48 +00:00
Vasily Khoruzhick
7ceafa4b40 lima: fix PP stack size
PP stack size should be set to maximum PP stack size, not to stack size of
last shader.

Fixes: 27e7603c34 ("lima: fix ppir spill stack allocation")
Tested-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-10-19 18:15:18 -07:00
Marijn Suijten
224b267282 freedreno/a5xx: enable a510
Kernel support for this GPU is added by the following series:
https://patchwork.kernel.org/project/linux-arm-msm/list/?series=187609
In particular https://patchwork.kernel.org/patch/11189953/

Tested on Sony Xperia X and X Compact.

Signed-off-by: Marijn Suijten <marijns95@gmail.com>
Tested-by: AngeloGioacchino Del Regno <kholk11@gmail.com>
2019-10-19 16:48:24 +02:00
Prodea Alexandru-Liviu
48d617118a Appveyor/Meson: Add build test of osmesa gallium
Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-19 14:44:44 +00:00
Lionel Landwerlin
3f8f52b241 anv: fix vkUpdateDescriptorSets with inline uniform blocks
With inline uniform blocks descriptor, the meaning of descriptorCount
is a number of bytes to copy into the descriptor. Don't try to use
that size as an index into the descriptor table.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 43f40dc7cb ("anv: Implement VK_EXT_inline_uniform_block")
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1195
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-19 13:16:40 +03:00
Rob Clark
1cea76274e freedreno/ir3: handle imad24_ir3 case in UBO lowering
Similiar to iadd, we can fold an added constant value from an imad24_ir3
into the load_uniform's constant offset.  This avoids some cases where
the addition of imad24_ir3 could otherwise be a regression in instr
count.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
d9424e5821 freedreno/ir3: add imul24 opcode
This maps to mul.s24

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
c7b8f16bee freedreno/ir3: optimize immed 2nd src to mad
We can't encode immed sources for cat3 (mad) instructions, but we can
use const in first or third src.  We handled this case already, but we
weren't considering that we could lower immed to const.

For manhattan:

  total instructions in shared programs: 35202 -> 34718 (-1.37%)
  instructions in affected programs: 14931 -> 14447 (-3.24%)
  helped: 90
  HURT: 0
  total full in shared programs: 2451 -> 2359 (-3.75%)
  full in affected programs: 653 -> 561 (-14.09%)
  helped: 69
  HURT: 2

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 15:08:54 -07:00
Rob Clark
666b6236f7 freedreno/ir3: add rule to generate imad24
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
5e08f070f0 nir: add nir_lower_amul pass
Lower amul to either imul or imul24, depending on whether 24b is enough
bits to calculate an offset within the thing being dereferenced.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-18 15:08:54 -07:00
Rob Clark
1bdde31392 nir: add address calc related opt rules
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
6320e37d4b nir: add amul instruction
Used for address/offset calculation (ie. array derefs), where we can
potentially use less than 32b for the multiply of array idx by element
size.  For backends that support `imul24`, this gives a lowering pass
an easy way to find multiplies that potentially can be converted to
`imul24`.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
0568761f8e nir: Add a new ALU nir_op_imul24
Some hardware can do 24b multiply in a single instruction, but not 32b.
However in most cases 24b is sufficient for address/offset calculation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev
bc2ccdc45a freedreno/ir3: Handle newly added opcode nir_op_imad24_ir3
Simply emit an ir3_MAD_S24 instruction in the backend.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev
32e5fbf47c nir: Add a new ALU nir_op_imad24_ir3
ir3 compiler has a signed integer multiply-add instruction (MAD_S24)
that is used for different offset calculations in the backend.
Since we intend to move some of these calculations to NIR, we need
a new ALU op that can directly represent it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
6ad442acae freedreno/ir3: rename mul.s/mul.u
to mul.s24/mul.u24, to better reflect that these are 24b multiply.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
ad8167c1e0 nir/search: fix the PoT helpers
Otherwise, if the base type is (for example) uint32, we would
incorrectly think that PoT optimizations could not apply.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstsrand <jason@jleksrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark
f30c256ec0 freedreno/ir3: enable pre-fs texture fetch for a6xx
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
72048dd799 turnip: add support for pre-fs texture fetch
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
a5afcc76d5 freedreno/a6xx: add support for pre-fs texture fetch
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Hyunjun Ko
e9450ad27d freedreno/ir3: Add support for texture sampling pre-dispatch
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Eduardo Lima Mitev
2a0d45ae6c freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch
The pass should run once at the end of shader compilation, for a4xx
onwards. It iterates texture sampling instructions and mark those
eligibile for pre-dispatch by changing the tex op from 'tex' to
'tex_prefetch'. An instruction is eligibile if:

* The coordinate is a vector where all its components come from a
  shader input.
* The order of the components match exactly that of the input (no
  swizzles).
* The instruction is in the 'main' function, and in the outer
  most-block.

The first two restrictions were arrived to empirically, so more
testing could tighten or loosen it.

The 3rd restriction is there to allow moving the instructions
eligible for pre-dispatch to the beginning of the shader, so
that we don't block the registers holding the result for too
long.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
7d4213fe88 freedreno/ir3: force i/j pixel to r0.x
It seems that pre-fs texture fetch only works if ij_pix ends up in r0.x.
I've tried unknown zero bits, to no avail, and blob also seems to force
r0.x when this feature is used.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
07e9bf564f freedreno/ir3: add pre-dispatch tex fetch to disasm
Useful to see in disassembly listing texture fetches that were moved to
pre-dispatch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
2b93eb9c76 freedreno/ir3: add dummy bary.f(ei) for pre-fs-fetch
If the only use of varyings is a pre-shader texture-fetch, we still need
to issue a bary.f with the end-input flag, otherwise we'll block further
VS invocations, as the hw will think varying storage is still busy.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
392a309a55 freedreno/ir3: fixup register footprint to account for prefetch
It is possible that the result of a pre-fs texture fetch is an output
(or partially an output) of the FS.  Sine the meta:tex_prefetch
instructions are dropped before the assembler, we need to account for
this when we fixup the register footprint.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
482e1b9955 freedreno/ir3: add meta instruction for pre-fs texture fetch
Add a placeholder instruction to track texture fetches made prior to FS
shader dispatch.  These, like meta:input instructions are scheduled
before any real instructions, so that RA realizes their result values
are live before the first real instruction.  And to give legalize a way
to track usage of fetched sample requiring (sy) sync flags.

There is some related special handling for varying texcoord inputs used
for pre-fs-fetch, so that they are not DCE'd and remain in linkage
between FS and previous stage.  Note that we could almost avoid this
special handling by giving meta:tex_prefetch real src arguments, except
that in the FS stage, inputs are actual bary.f/ldlv instructions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
11e467c378 freedreno/ir3: don't DCE ij_pix if used for pre-fs-texture-fetch
When we enable pre-dispatch texture fetch, we could have a scenario
where the barycentric i/j coord sysval is not used in the shader, but
only used for the varying fetch for the pre-dispatch texture fetch.
In this case we need to take care not to DCE this sysval.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
af817a44c1 freedreno/ir3: track sysval slot for inputs
Will be needed for special handling of SYSTEM_VALUE_BARYCENTRIC_PIXEL
(ij_pix) when pre-fs texture fetch is enabled.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
35692fab86 freedreno/ir3: remove unused ir3_instruction::inout
Not sure I remember how long this has been unused for.  But it's unused
now.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Hyunjun Ko
fd14788e1f freedreno/ir3: Add data structures to support texture pre-fetch
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Rob Clark
766a68cdb9 freedreno: update registers
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Eduardo Lima Mitev
f1d4fadf1b nir: Add new texop nir_texop_tex_prefetch
This is like nir_texop_tex, but signals that the sampling coordinates
are immutable during the shader stage, in a way that allows the HW
that supports pre-dispatching sampling operations to pre-fetch
the result prior to scheduling the shader stage.

This is introduced to support the feature in Freedreno. Adreno HW
from a4xx supports it.

A NIR pass introduced later in this series will detect sampling
operations that are eligible for pre-dispatch, and replace
nir_texop_tex by this new op, to tell the backend to enable
pre-fetch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-18 21:11:54 +00:00
Eric Engestrom
27df3e015b osmesa: add missing #include <stdint.h>
Fixes: 281466332b ("gallium/osmesa: Introduce a test.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1947
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-18 22:07:21 +01:00
Dylan Baker
1ce23b5653 docs: Add new feature for compiling for windows with meson
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-18 13:02:58 -07:00
Dylan Baker
0b6b7ff3ca appveyor: Move appveyor script into .appveyor directory
This clears out the scripts directory completely

Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-18 13:02:58 -07:00
Dylan Baker
fbb969b98a appveyor: Add support for building llvmpipe with meson
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-18 13:02:58 -07:00
Dylan Baker
41b3eb08d9 docs: update meson docs for windows
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-18 13:02:58 -07:00
Dylan Baker
821cf6942a meson: Use cmake to find LLVM when building for windows
We don't use cmake normally because it always results in static linking.
This is very problematic for *nix OSes which expect shared linking by
default, but for windows this isn't a problem as LLVM doesn't support
shared linking on windows anyway.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-18 13:02:58 -07:00
Dylan Baker
b962c7c971 meson: Add support for wrapping llvm
For building on Windows (when not using cygwin), users may want to use a
binary wrap of LLVM, this provides a fallback to the LLVM dependency
which may be used in this case

Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-18 13:02:58 -07:00
Dylan Baker
dbd554ba05 meson/llvmpipe: Add dep_llvm to driver_swrast
This fixes build errors in gl-gdi on windows when using llvmpipe

Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-10-18 13:02:58 -07:00
Hal Gentz
fa611b07f1 Revert "egl: Add EGL_CONFIG_SELECT_GROUP_MESA ext."
This reverts commit 173bc9d684.
2019-10-18 18:41:51 +00:00
Hal Gentz
94386d476c Revert "egl: Fixes transparency with EGL and X11."
This reverts commit 90a19074b4.
2019-10-18 18:41:51 +00:00
Hal Gentz
9997693960 Revert "egl: Puts RGBA visuals in the second config selection group."
This reverts commit a800d16e4f.
2019-10-18 18:41:51 +00:00
Hal Gentz
4ef2c53755 Revert "egl: Configs w/o double buffering support have no EGL_WINDOW_BIT."
This reverts commit 075a96aa92.
2019-10-18 18:41:51 +00:00
Jonathan Marek
9a7a92c1ec etnaviv: check NO_ASTC feature bit
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 19:30:41 +02:00
Jonathan Marek
15c5ec0024 etnaviv: fix TS samplers on GC7000L
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 19:23:59 +02:00
Jonathan Marek
ad48411d72 etnaviv: fix linear_nearest / nearest_linear filters on GC7000Lite
MIN filter is only used when LOD MAX is at least 4 (I guess the 2 LSB don't
actually exist).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 19:06:44 +02:00
Lucas Stach
95adc393eb etnaviv: GC7000: flush TX descriptor and instruction cache
The etnaviv kernel driver will only ever flush write caches. As both
the TX descriptor and instruction cache are read caches they must be
flushed from the user cmdstream at an appropriate time.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-18 19:06:39 +02:00
Lucas Stach
54dd288317 etnaviv: add linear texture support on GC7000
It's just a matter of writing the addressing mode into the
texture descriptor.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-18 19:06:35 +02:00
Wladimir J. van der Laan
eda73d7127 etnaviv: GC7000: Texture descriptors
Create a separate implementation file with texture-descriptor-based
sampler views and sampler states. Initialize the one or the other
based on the GPU. There is so little in common that this seemed more
appropriate that keeping them as one type of state object would
only be confusing.

This commit is actually a combiation of the original commit by
Wladimir, fixes and TS implementation from Jonathan and changed to
use softpin by Lucas.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-18 19:06:20 +02:00
Lucas Stach
5bc3fcf620 etnaviv: check for softpin availability on Halti5 devices
Halti5 uses texture descriptors to control the samplers, and thus needs to
know the GPU virtual address for the texture buffers to fill into the
descriptor buffer. Without softpin userspace has no control over the GPU
VM and also no way to fix up the texture descriptor buffer, so there is
no point in creating a screen on a Halti5 device without softpin being
available.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-18 19:05:25 +02:00
Lucas Stach
0bdf5420f1 etnaviv: drm: add softpin interface
If softpin is available on the kernel side, we transparently replace the
relocs with self-managed GPU virtual addresses. This allows to skip some
work at the kernel side, as it doesn't need to touch the command stream
anymore before submitting it to the hardware.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-18 19:05:21 +02:00
Marek Vasut
e5cc66dfad etnaviv: Rework locking
Replace the per-screen locking of flushing with per-context one and
add per-context lock around command stream buffer accesses, to prevent
cross-context flushing from corrupting these command stream buffers.

Signed-off-by: Marek Vasut <marex@denx.de>
2019-10-18 17:03:25 +00:00
Marek Vasut
0c38c5454b etnaviv: Command buffer realloc
Reallocate the command stream buffer in case it is too small.
The older kernel versions are limited to 64 kiB buffer, so
limit the size to avoid oversized buffers.

Signed-off-by: Marek Vasut <marex@denx.de>
2019-10-18 17:03:25 +00:00
Marek Vasut
1456aa61cc etnaviv: Rework resource status tracking
Have each context track which resources it marked as pending read and
pending write. Have each resource track in which context it is pending.
This way, it is possible to identify when a resource is both pending
read and pending write at the same time. Moreover, the status field
can be correctly calculated and updated when necessary.

Signed-off-by: Marek Vasut <marex@denx.de>
2019-10-18 17:03:25 +00:00
Lucas Stach
1194afdfe3 etnaviv: rework the stream flush to always go through the context flush
This way we can ensure that the pipe driver tracking of pending resources
stays in sync with the actual command buffer state, even if a space
reservation triggers a forced flush.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-18 17:03:25 +00:00
Lucas Stach
1864fcd8c7 etnaviv: drm: remove unused etna_cmd_stream_finish
It's not used by anything and gets in the way for the refactoring of
the flush handling.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-18 17:03:25 +00:00
Lucas Stach
9e672e4d20 etnaviv: keep references to pending resources
As long as a resource is pending in any context we must not destroy
it, otherwise we'll hit a classical use-after-free with fireworks.
To avoid this take a reference when the resource is first added to
the pending set and put the reference when no longer pending.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-18 17:03:25 +00:00
Marek Vasut
90e223646b etnaviv: Make contexts track resources
Currently, the screen tracks all resources for all contexts, but this
is not correct. Each context should track the resources it uses. This
also allows a context to detect whether a resource is used by another
context and to notify another context using a resource that the current
context is done using the resource.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Guido Günther <guido.gunther@puri.sm>
Cc: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 17:03:25 +00:00
Brian Paul
2946bd6628 REVIEWERS: add VMware reviewers 2019-10-18 16:42:40 +00:00
Samuel Pitoiset
7c50214aab radv: implement VK_KHR_shader_float_controls
This exposes what's required for DX and this is what we already
configure. The driver flushes denorms for FP32 and preserves them
for FP16/FP64. Note that we can't allow both preserving and
flushing denorms because this won't work for merged shaders. This
will require LLVM to update the float mode register to make it work.

Only enabled on GFX8+ with the LLVM path because it's untested on
previous chips and ACO doesn't support it.

This extension is required for SPIRV 1.4.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:58 +02:00
Samuel Pitoiset
2c2aaf275c ac/llvm: force fneg/fabs to flush denorms to zero if requested
LLVM optimizes these instructions with XOR/AND and it loses
the sign bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:55 +02:00
Samuel Pitoiset
7dfb15fff1 ac/llvm: add AC_FLOAT_MODE_ROUND_TO_ZERO
Because some instructions will be optimized by the backend compiler,
the driver has to manually flush to zero to keep the result exact.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:51 +02:00
Samuel Pitoiset
d94bd4e512 ac/llvm: add ac_build_canonicalize() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:48 +02:00
Eric Engestrom
3ad6154f4e travis: test meson install as well
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-18 15:27:37 +01:00
Eric Engestrom
b0853a43da travis: don't (re)install python
The new Mac OS X images apparently already have python2 and python3,
and `brew` considers asking to install something already installed
as a fatal error...

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-18 15:27:37 +01:00
Lepton Wu
a651926884 gbm: Add GBM_MAX_PLANES definition
This removed hard coded "4".

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-18 13:18:28 +00:00
Jose Maria Casanova Crespo
f8da0f6198 v3d: Explicitly expose OpenGL ES Shading Language 3.1
This will expose GL_EXT_primitive_bounding_box and
GL_OES_primitive_bounding_box after previous commits
expose OpenGL ES 3.1 once Compute Shaders are available.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-18 14:08:52 +02:00
Iago Toral Quiroga
db87439232 v3d: request the kernel to flush caches when TMU is dirty
This adapts the v3d driver to the new CL submit ioctl interface that
allows the driver to request a flush of the caches after the render
job has completed. This seems to eliminate the kernel write violation
errors reported during CTS and Piglit excutions, fixing some CTS tests
and GPU resets along the way.

v2:
  - Adapt to changes in the kernel side.
  - Disable shader storage and shader images if the kernel doesn't
    implement cache flushing.

Fixes CTS tests:
KHR-GLES31.core.shader_image_size.basic-nonMS-fs-float
KHR-GLES31.core.shader_image_size.basic-nonMS-fs-int
KHR-GLES31.core.shader_image_size.basic-nonMS-fs-uint
KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-float
KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-int
KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-uint
KHR-GLES31.core.shader_atomic_counters.advanced-usage-many-draw-calls2
KHR-GLES31.core.shader_atomic_counters.advanced-usage-draw-update-draw
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-int
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std140-matR
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std140-struct
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std430-matC-pad
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std430-vec

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-18 14:08:52 +02:00
Eric Anholt
66e2d3b69f v3d: Add Compute Shader support
Now that the UAPI has landed, add the pipe_context function for
dispatching compute shaders.  This is the last major feature for GLES 3.1,
though it's not enabled quite yet.
2019-10-18 14:08:52 +02:00
Iago Toral Quiroga
2d8b51ea4d broadcom: document known hardware issues for L2T flush command
Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-18 14:08:52 +02:00
Iago Toral Quiroga
46182fc1da v3d: add new flag dirty TMU cache at v3d_compiler
That we set for any TMU write on spills and general tmu. It is then
used as part of v3d_emit_gl_shader_state later.

v2: add a new flag instead at v3d_compiler instead of dirty the flag
    at v3dx if there is any spill (change suggested by Eric, added by
    Alejandro)

v3: set this for anything that is not a load and do it also in
    v3d40_vir_emit_image_load_store (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-18 14:08:52 +02:00
Iago Toral Quiroga
d2203d74c6 v3d: trivial update to obsolete comment
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-18 14:08:52 +02:00
Bas Nieuwenhuizen
fd21ee8b52 radv: Fix single stage constant flush with merged shaders.
e.g. a VERTEX only flush with tess on Vega should look at the TCS
to see which bits are needed.

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1953
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-18 10:49:29 +00:00
Lucas Stach
1b65c49c58 rbug: remove superfluous NULL check
The SCR_INIT macro used to install the rbug resource_changed method
will only do so when the driver below rbug exposes this method, so
the check will always evaluate to true.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Lucas Stach
93d47932b8 rbug: implement resource creation with modifier
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Lucas Stach
5b3e57059c rbug: forward can_create_resource to pipe driver
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Lucas Stach
8eea8c9691 rbug: forward texture_barrier to pipe driver
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Lucas Stach
024eaa7fec rbug: implement missing explicit sync related fence functions
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Lucas Stach
5f76d3cce8 rbug: move flush_resource initialization
All the other context method initialzation follow the order of the pipe_context
structure definition making it easy to find unimplemented methods in rbug.
Move the flush_resource init to follow the same order.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Lucas Stach
a75eb888e0 rbug: unwrap index buffer resource
All resources passed to the drivers below rbug need to be unwrapped before
being passed down. We missed to do this for the index buffer resource when
this was made part of the draw_info structure.

Fixes: 330d0607ed (gallium: remove pipe_index_buffer and set_index_buffer)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Lucas Stach
6174cba748 rbug: fix transmitted texture sizes
The rbug wire format defines the texture size parameters to be uint32_t sized
and uses memcpy to move the function parameters to the message structure.
This caused totally wrong transmitted texture sizes since the height and depth
paramterds have been changed to uint16_t in the gallium API. Fix this by doing
an explicit conversion to the correct representation before packing into the
wire message.

Fixes: e6428092f5 (gallium: decrease the size of pipe_resource - 64 -> 48 bytes)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Lucas Stach
f6461df63a gallium/util: don't depend on implementation defined behavior in listen()
Using 0 as the backlog argument to listen() is exploiting implementation
defined behavior and will lead to no connections being accepted on some
libc implementations.

Quote of the listen manpage: "A backlog argument of 0 may allow the socket to
accept connections, in which case the length of the listen queue may be set to
an implementation-defined minimum value."

Fix this by using a more sensible backlog value.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-10-18 10:12:07 +00:00
Iago Toral Quiroga
5be5b53b6d mesa/main: GL_GEOMETRY_SHADER_INVOCATIONS exists in GL_OES_geometry_shader
It seems that for desktop GL this was included with ARB_gpu_shader5, but
for OpenGL ES this is already included with the base extension and there is
a CTS test that checks this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 09:03:21 +00:00
Pierre-Eric Pelloux Prayer
af60187153 mesa: implement glTextureStorageNDEXT functions
Implement the 3 functions using the texturestorage_error() helper.
_mesa_lookup_or_create_texture is always called to make sure that 'texture'
is initialized (even if the texturestorage_error() generates an error afterwards).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
50533d408d mesa: add EXT_dsa NamedCopyBufferSubDataEXT function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
da21435a7a mesa: add EXT_dsa NamedRenderbufferStorageMultisampleEXT function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
2e14749f8f mesa: add EXT_dsa Generate*MipmapEXT functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
fb804266a3 mesa: refactor GenerateTextureMipmap handling
Rework _mesa_GenerateTextureMipmap to allow code sharing with EXT_dsa functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
cfc0ebe7f1 mesa: add EXT_dsa glGetFloati_vEXT/glGetDoublei_vEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
a4e935f2d7 mesa: add EXT_dsa + EXT_gpu_program_parameters functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
78b65343e8 mesa: add EXT_dsa + EXT_gpu_shader4 functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
c2d6f61f26 mesa: add EXT_dsa + EXT_texture_integer functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
2bdf809e66 mesa: add EXT_dsa + EXT_texture_buffer_object functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
28cc07a876 mesa: add EXT_dsa glProgramUniform*EXT functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
1d1722e910 mesa: add EXT_dsa NamedProgram functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
eaeab0a998 mesa: add EXT_dsa glClientAttribDefaultEXT / glPushClientAttribDefaultEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer
01666ad206 mesa: add EXT_dsa glNamedRenderbufferStorageEXT and glGetNamedRenderbufferParameterivEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-18 10:26:26 +02:00
Daniel Stone
40689f5ac0 panfrost: Respect offset for imported resources
When we import a resource through Gallium, we need to take account of
the offset parameter passed.

Fixes a failure seen with the VIVID V4L2 driver, which would create NV12
resources within the same BO, with an offset. Sample pipeline to
reproduce (replace videoN with your actual VIVID device node):
    gst-launch-1.0 v4l2src device=/dev/videoN ! video/x-raw,format=NV12 ! glimagesink

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Tested-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
2019-10-18 09:38:52 +02:00
Jordan Justen
22859a18d5 iris/resource: Use isl surface alignment during bo allocation
Reworks:
 * Change subject from "iris: Align main surface allocation to 64k on gen12+"
 * Make use of isl surf alignment. (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:22:00 -07:00
Jason Ekstrand
48c153e21b intel/isl: Add isl_aux_usage_has_ccs
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:22:00 -07:00
Jordan Justen
d83fe059c2 intel/isl: Add R10G10B10_FLOAT_A2_UNORM format
Reworks:
 * Fill out the format's entry in the ISL format table. (Nanley)
 * Support CCS_E-enabled BLORP copies with the format. (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:21:56 -07:00
Kenneth Graunke
f192741ddd intel/compiler: Report the number of non-spill/fill SEND messages
This can be useful to measure whether memory access optimizations are
having the desired effect.  For example, we might see a reduction in
image loads/stores, or constant buffer loads.  We can already see this
in cycle estimates to some extent, but this is a more direct approach,
minus a lot of the noise of random scheduler shuffling.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-17 20:44:00 -07:00
Marek Olšák
cac5182992 st/mesa: don't call variables "tgsi" when they can reference NIR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák
48b4843c30 st/mesa: merge st_fragment_program into st_common_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák
e94da4ab80 st/mesa: remove redundant function st_reference_compprog
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák
614331738d st/mesa: remove unused st_xxx_program::sha1
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák
0c74e354d1 st/mesa: remove st_vp_variant_key in favor of st_common_variant_key
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák
6468df0533 st/mesa: remove num_tgsi_tokens from st_xx_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák
64dfc82340 st/mesa: rename basic -> common for st_common_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák
33d53f0614 st/mesa: rename st_xxx_program::tgsi to state
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Marek Olšák
dd4d791821 st/mesa: lower doubles for NIR after linking
This allows dropping 1 call to st_nir_opts, because shaders are always
optimized after linking.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:37 -04:00
Marek Olšák
7908e82f60 st/mesa: call st_nir_opts for linked shaders only once
The removed st_nir_opts calls are mostly redundant.

There is an improvement with shader-db on radeonsi:

Before:
    real	1m54.047s
    user	28m37.857s
    sys 	0m7.573s

After:
    real	1m52.012s
    user	28m3.412s
    sys 	0m7.808s

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-17 20:31:34 -04:00
Ian Romanick
92252219d3 intel/vec4: Don't try both sources as immediates for DPH
DPH isn't actually commutative, so this doesn't work.  If the immediate
in src0 would be a VF candidate, we could do better. *shrug*

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b04beaf41d ("intel/vec4: Try both sources as candidates for being immediates")
2019-10-17 15:07:01 -07:00
Ian Romanick
050e4e28bf nir/search: Fix possible NULL dereference in is_fsign
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
2019-10-17 15:07:01 -07:00
Jordan Justen
da10fa9d63 iris: Let isl decide the supported tiling in more situations
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:23 -07:00
Jordan Justen
be89fbd51e intel/isl: Add gen12 depth/stencil surface alignments
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:23 -07:00
Jason Ekstrand
d9565160b2 intel/isl: Select Y-tiling for stencil on gen12
Rework:
 * Disallow linear 1D stencil buffers (Nanley)
 * Force Y for gen12 stencil rather than ~W (Nanley)

Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:22 -07:00
Jason Ekstrand
9dd9c3363b intel/genxml: Remove W-tiling on gen12
It's no longer supported by the hardware

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:22 -07:00
Jordan Justen
523ba0a3e7 intel/genxml,isl: Add gen12 stencil buffer changes
Rework:
 * NULL stencil buffer path (Jason)
 * genxml fixes (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:22 -07:00
Jordan Justen
d2a490d1d9 intel/genxml,isl: Add gen12 depth buffer changes
Reworks:
 * Fix 3DSTATE_DEPTH_BUFFER "Surface Format" end in xml (Jason)
 * Remove WM_HZ_OP changes (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:22 -07:00
Jordan Justen
6c9f9a82d7 intel/genxml,isl: Add gen12 render surface state changes
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-17 14:47:17 -07:00
Eric Anholt
75c601b6cf mesa: Refactor the entirety of _mesa_format_matches_format_and_type().
This function was difficult to implement for new formats due to the
combination of endianness and swapbytes support.  Since it's mostly
used for fast paths, bugs in it were often missed during testing.

Just reimplement it on top of the recent
_mesa_format_from_format_and_type() which can give us a canonical
MESA_FORMAT for a format and type enum (while respecting endianness).

Fixes:
- R4G4B4A4_UNORM, B4G4R4_UINT, R4G4B4A4_UINT incorrectly matched with
  swapBytes (you can't just reverse the channels if the channels
  aren't bytes)
- A4R4G4B4_UNORM and A4R4G4B4_UINT missing BGRA/4444_REV matches
- failing to match RGB/BGR unorm8 array formats on BE
- 2101010 formats incorrectly matching with swapBytes set.
- UINT/SINT byte formats failed to match with swapBytes set.

This deletes the part of tests/mesa_formats.cpp that called
_mesa_format_matches_format_and_type() to make sure it didn't
assertion fail, as it now would assertion fail due to the fact that we
were passing an invalid format (GL_RG) for most types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Eric Anholt
d77c77936b mesa: Add support for array formats of depth and stencil.
In desktop GL, you can specify things like GL_DEPTH_COMPONENT/GL_BYTE as a
ReadPixels format, and we need to be able to represent that to see if we
have proper MESA_FORMATs for them.  That's exactly what the
mesa_array_format enum is for.

v2: Drop _mesa from static fn.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Eric Anholt
4f4fc75357 mesa: Add format/type matching for DEPTH/UINT_24_8.
We had missed this case where GLES3 allows glReadPixels(DEPTH, UINT_24_8),
and just got lucky by the readpixels path never asking for the matching
format from this function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Eric Anholt
7be72b24f5 mesa: Fix depth/stencil ordering in _mesa_format_from_format_and_type().
The GL spec says the 24-bit component is in the high bits, and
format_unpack.c looks at the high 24 bits in the S8Z24 case, not
Z24SS8.

Avoids a regression in the next commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Eric Anholt
df5fe86232 mesa: Add debug info to _mesa_format_from_format_and_type() error path.
The unreachable() that follows isn't very useful for debug, and by adding
this here we get a nice description of the failure in debug builds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-17 21:07:29 +00:00
Kristian H. Kristensen
0a4e6726ba freedreno/a6xx: Turn on geometry shaders
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:45:03 -07:00
Kristian H. Kristensen
d3945e3b9b freedreno/ci: Add failing tests to skip list
Some queries are still failing and layered rending needs more work.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:45:03 -07:00
Kristian H. Kristensen
622afc8dbd freedreno/a6xx: Implement PIPE_QUERY_PRIMITIVES_GENERATED for GS
When we don't have streamout enabled, we have to read this register to
get the number of primitives emitted.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
c8e1522a50 freedreno/blitter: Save GS state
We have GS state now.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
946a1e206f st/mesa: Also enable GS when ESSLVersion > 320
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
7cb672227b freedreno/a6xx: Support layered render targets
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
0eebedb619 freedreno/a6xx: Emit program state for GS
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
d6ed39e20e freedreno/ir3: End VS with CHMASK and CHSH in GS pipelines
When used in a GS pipeline, the VS doesn't end with the END
instruction. Instead it chains to the GS, which continues running with
the same register allocation.  The intended use cases seems to be that
you can compile a regular VS (ie outputs in registers and ending with
END) but then tack on link-time generated code past the END to write
the outputs using STLW, in case the VS is used with GS.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
4b7312b763 freedreno/ir3: Start GS with (ss) and (sy)
We don't know what kind of loads we might have to wait on when coming
in from chsh in the VS so set both sync flags.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
c347708bea freedreno/ir3: Pre-color GS header and primitive ID
These sysvals have to be unclobbered by VS and in the same registers
in both VS and GS, since the chsh from VS to GS doesn't reload the
values. We use the pre-color argument to ir3_ra() to always place
these values in r0.x and r0.y.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
ce08fddbbe freedreno/ir3: Setup ir3 inputs and outputs for GS
Inputs are the GS header, which contains vertex ID, local primitive ID
and thread ID as well as primitive ID. The setup is a little different
from other sysvals, since we always have to receive them in the VS so
that it can pass them on into the GS.

The vertex flag outputs from GS is set up as a proper nir output in
the lowering pass and doesn't need special handling here.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
0293d14719 freedreno/ir3: Implement primitive layout intrinsics
This implements the load_vs_primitive_stride_ir3,
load_vs_vertex_stride_ir3 and load_primitive_location_ir3 intrinsics,
used for getting the primitive layout strides and locations.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
8e16fb1528 freedreno/ir3: Implement lowering passes for VS and GS
This introduces two new lowering passes. One to lower VS to explicit
outputs using STLW and one to lower GS to load input using LDLW and
implement the GS specific functionality.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
8f39985b01 freedreno/ir3: Add has_gs flag to shader key
Since the presence of GS changes how the VS operates we need to track
that in the shader key.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
2703844cb3 freedreno/a6xx: Add missing adjacency primitives to table
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
0324706764 freedreno/ir3: Add intrinsics that map to LDLW/STLW
These intrinsics will let us do all the offset calculations in nir,
which is nicer to work with and lets nir_opt_algebraic eat it all up.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
436d125adf freedreno/ir3: Add new LDLW/STLW instructions
These access memory used for passing data between geometry stages.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
956d319446 freedreno/ir3: Extend RA with mechanism for pre-coloring registers
We'll need to pre-color certain input registers betwee VS and GS
shaders.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
0b6625d825 freedreno/ir3: Use third register for offset for LDL and LDLV
Before, offset held the offset, which can be either immediate or a
register.  Use a third register to hold the offset so that we can use
a register.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
3a93e60e7b freedreno/ir3: Add support for CHSH and CHMASK instructions
Just add the constructors for now and special case similar to END so
we don't remove them.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
f335a6663d freedreno/a6xx: Trim a few regs from fd6_emit_restore()
We know what these do an either write them in the program stateobj or
don't need to write them.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Kristian H. Kristensen
610c8c938e freedreno/registers: Update with GS, HS and DS registers
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:43:53 -07:00
Eric Anholt
628ed1bbd5 freedreno/ci: Ban texsubimage2d_pbo.r16ui_2d, due to two flakes reported.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-10-17 20:32:46 +00:00
Marek Olšák
12d92714e9 st/mesa: silence a warning in st_nir_lower_tex_src_plane
trivial
2019-10-17 16:07:26 -04:00
Marek Olšák
3ed1dd3d42 gallium/u_blitter: remove an unused variable
trivial
2019-10-17 16:07:02 -04:00
Marek Olšák
9aa5b348de radeonsi: recreate aux_context after a GPU reset
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-17 14:56:26 -04:00
Marek Olšák
438ede3ca3 radeonsi: call the reset callback if get_device_reset_status returns a failure
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-17 14:56:24 -04:00
Marek Olšák
93707457b6 st/mesa: call the reset callback if glGetGraphicsResetStatus returns a failure
so that we immediately set the no-op dispatch

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-17 14:56:23 -04:00
Caio Marcelo de Oliveira Filho
c847bfaaf5 intel/fs/gen12: Add tests for scoreboard pass
Tests the combinations of cases of RAW, WAW and WAR hazards involving
both inorder and outoforder instructions.  Also tests that
dependencies combine and propagate correctly through control
flow (loops and conditionals).

v2: Add an extra test illustrating that the non-logical CFG edge
    between then-block and else-block is being taking into
    account.  (Curro)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-10-17 10:02:35 -07:00
Daniel Schürmann
4b458b3e8f aco: don't combine minmax3 if there is a neg or abs modifier in between
This fixes a graphical corruption in HotS.
No pipelinedb changes other than that.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-17 16:21:19 +00:00
Roland Scheidegger
045f05a2f6 gallivm: Fix saturated signed psub/padd intrinsics on llvm 8
LLVM 8 did remove both the signed and unsigned sse2/avx intrinsics in
the end, and provide arch-independent llvm intrinsics instead.
Fixes a crash when using snorm framebuffers (tested with piglit
arb_color_buffer_float-render GL_RGBA8_SNORM -auto).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
2019-10-17 17:42:16 +02:00
Samuel Pitoiset
c644644c65 radv: fix DCC fast clear code for intensity formats (correctly)
Previous fix was pretty bogus.

This fixes a rendering regression with Nier (minimap too large).

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1943
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1952
Fixes: ea92273cea ("radv: fix DCC fast clear code for intensity formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-17 15:29:43 +02:00
Tomeu Vizoso
82f18b713a panfrost: Keep track of active BOs
If two jobs use the same GEM object at the same time, the job that
finishes first will (previous to this commit) close the GEM object, even
if there's a job still referencing it.

To prevent this, have all jobs use the same panfrost_bo for a given GEM
object, so it's only closed once the last job is done with it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-17 14:33:59 +02:00
Karol Herbst
730f06a44d nv50/ir: remove DUMMY edge type
it was never used

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-10-17 14:00:50 +02:00
James Xiong
023282a4f6 gallium: do not increase ref count of the new throttle fence
A new throttle fence was initialized to 1, and increased by 1
again when it's put in drawable->throttle_fence; the ref was
decreased by 1 when it's removed from drawable->throttle_fence,
and never reached to 0, caused leak.

Fixes: ff77bf5cbf7 ("gallium: simplify throttle implementation")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1949

Signed-off-by: James Xiong <james.xiong@intel.com>
Reported-by: Florian Wesch <fw@info-beamer.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2019-10-17 10:18:07 +00:00
Erik Faye-Lund
e8095f2af0 nir: drop unused alpha_ref_float
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
5af272b474 nir: drop support for using load_alpha_ref_float
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
9d0523b569 v3d: do not report alpha-test as supported
This triggers lowering in the state-tracker, which makes things a bit
simpler.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
a79b93269c vc4: do not report alpha-test as supported
This triggers lowering in the state-tracker, which makes things a bit
simpler.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
2da792d398 panfrost: do not report alpha-test as supported
This triggers lowering in the state-tracker, which makes things a bit
simpler.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
3298aedd6e mesa/st: support lowering user-clip-planes automatically
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
439f499591 mesa/program: support referencing the clip-space clip-plane state
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
71c0dcf266 nir: support feeding state to nir_lower_clip_[vg]s
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
eb3047c094 nir: support lowering clipdist to arrays
This allows us to make sure clipdist is emitted as a scalar array rather
than two vec4s. This matches SPIR-V semantics, and will be useful for
Zink.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
28543f1640 mesa/gallium: automatically lower two-sided lighting
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
011d692a52 nir: support derefs in two-sided lighting lowering
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
3b4fc2401b mesa/gallium: automatically lower point-size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
878c94288a nir: add lowering-pass for point-size mov
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
b786738454 st/mesa: move point_size_per_vertex-logic to helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
b1c4c4c7f5 mesa/gallium: automatically lower alpha-testing
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
6d7e02e37d nir: allow passing alpha-ref state to lowering-code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
fdc4450c28 mesa: expose alpha-ref as a state-variable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Dave Airlie
cce3ad166a st/mesa: handling lower flatshading for NIR drivers.
This uses the NIR pass to lower flatshading when the driver
requests it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Dave Airlie
731260de7d gallium: add flatshade lowering capability
This allows the driver to request flatshade lowering.
(NIR drivers only so far).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Dave Airlie
dc91a02a72 nir: add a pass to lower flat shading.
This takes any color or backcolor that has unspecified
shading and converts it to flat shading.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Erik Faye-Lund
26c6640835 gallium/u_blitter: set a more sane viewport-state
This actually corresponds to legal GL depth-ranges, because depth-clear
values are always in the 0..1 range in OpenGL.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 09:26:25 +02:00
Marek Olšák
4857d695f5 st/mesa: reorder and document code in st_translate_vertex_program
move the TGSI code after the ARB_vp code

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
5d0630e504 st/mesa: call prog_to_nir sooner for ARB_fp
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
f54dcaf232 st/mesa: don't call translate_*_program functions for NIR
move the initializaton to st_link_nir

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
f86b28dfdc st/mesa: finalize NIR after shader variant passes for TCS/TES/GS/CS
Same as VS and FS.

This might fix vertex color clamping.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
45378689e0 st/mesa: unify transform feedback info translation code
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
b23967a5e1 st/mesa: move vertex program preparation code into st_prepare_vertex_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
8dfcec405a st/mesa: clean up more after the removal of st_compute_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
196fc59c40 st/mesa: deduplicate st_common_program code in st_program_string_notify
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
05f59bb777 st/mesa: sink TCS/TES/GS/CS translate code into st_translate_common_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
74c007ba7f st/mesa: deduplicate cases in st_deserialise_ir_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
1cc866c264 st/mesa: remove st_compute_program in favor of st_common_program
The conversion from pipe_shader_state to pipe_compute_state is done
at the end.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
691240cdbe st/mesa: don't store stream output info to shader cache for tess ctrl shaders
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
33de483d55 st/mesa: simplify the signature of st_release_basic_variants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
ab843a3702 st/mesa: deduplicate code for ATI fs in st_program_string_notify
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Marek Olšák
b596bb5b66 st/mesa: use *prog at the end of st_link_nir
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-16 20:10:47 -04:00
Dylan Baker
4e07869a06 appveyor: Cache meson's wrap downloads
This makes us less reliant on wrap-db (and reduces the amount of
downloading that goes on during the build).

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1936
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 23:26:09 +00:00
Dylan Baker
c65f907ce9 gitlab-ci: Set the meson wrapmode to disabled
This will prevent us from accidentally falling back to the wrap-db
instead of using locally installed versions.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 23:26:09 +00:00
Dylan Baker
449f831088 Revert "gitlab-ci: Disable meson-mingw32-x86_64 job again for now"
This reverts commit d60b8679a4.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 23:26:09 +00:00
Dylan Baker
6e375ff1aa gitlab-ci: Add a pkg-config for mingw
The one debian provides is broken in buster+, so I've just written my
own. This allows meson to find the installed zlib and prevents it from
falling back to wraps.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 23:26:09 +00:00
Dylan Baker
4441da0044 meson: Don't use expat on windows
It's not really needed, and there's no debian package for it so we're
forced to fall back to wraps in mesa's CI. This can be problematic in
itself.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 23:26:09 +00:00
Karol Herbst
656c038d01 st/mesa: fix crash for drivers supporting nir defaulting to tgsi
nvc0 and I assume radeonsi as well hit an assert inside glsl_to_tgsi as atan
instructions get inserted into the shader.

Fixes: cece947a8d ("glsl/builtin: Add alternate versions of atan using new ops")
Cc: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-10-16 21:53:46 +00:00
Eric Engestrom
aaab70035a util/u_atomic: fix return type of p_atomic_{inc,dec}_return() and p_atomic_{cmp,}xchg()
We're trying to cast the return type to the type of the var, but instead
we were casting `sizeof(*v)`.

Fixes: 6df72e970c ("util: Make u_atomic.h typeless.")
Fixes: 0a7f17cf5b ("util/u_atomic: add p_atomic_xchg")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-16 19:41:47 +01:00
Eric Engestrom
d3b06a199e mesa/math: delete duplicate extern symbol
It's already defined in `m_debug_util.h`, along with an explanation of
what it is and how to use it.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-16 19:31:24 +01:00
Eric Engestrom
38427da02b mesa/math: delete leftover... from 18 years ago (!)
Left over from 0a79baf1bf ("remove dead vertex assembly").

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-16 19:31:20 +01:00
Andreas Baierl
0ee931c1de lima: Fix crash when there are no vertex shader attributes
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-10-16 16:45:05 +00:00
Andreas Baierl
f906f5f053 lima: Fix compiler warning in standalone compiler
'struct lima_context' has to be declared before usage in lima_program.h

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-10-16 15:13:13 +00:00
Rhys Perry
88f1c0a360 aco: emit_split_vector() s_memtime results
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-16 15:31:19 +01:00
Rhys Perry
ded51b13da aco: don't CSE s_memtime
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-16 15:31:19 +01:00
Rhys Perry
d7838152f5 aco: fix scheduling with s_memtime/s_memrealtime
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-16 15:31:19 +01:00
Alan Coopersmith
6804b8e1ff intel/common: include unistd.h for ioctl() prototype on Solaris
Fixes build errors of:
In file included from ../src/intel/vulkan/anv_private.h:48,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
../src/intel/common/gen_gem.h: In function ‘gen_ioctl’:
../src/intel/common/gen_gem.h:68:15: error: implicit declaration of function ‘ioctl’ [-Werror=implicit-function-declaration]
   68 |         ret = ioctl(fd, request, arg);
      |               ^~~~~
In file included from ../include/c11/threads_posix.h:35,
                 from ../include/c11/threads.h:66,
                 from ../src/mesa/main/mtypes.h:39,
                 from ../src/intel/compiler/brw_compiler.h:30,
                 from ../src/intel/vulkan/anv_private.h:51,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
/usr/include/unistd.h: At top level:
/usr/include/unistd.h:471:12: error: conflicting types for ‘ioctl’
  471 | extern int ioctl(int, int, ...);
      |            ^~~~~
/usr/include/unistd.h:471:1: note: a parameter list with an ellipsis can’t match an empty parameter name list declaration
  471 | extern int ioctl(int, int, ...);
      | ^~~~~~
In file included from ../src/intel/vulkan/anv_private.h:48,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
../src/intel/common/gen_gem.h:68:15: note: previous implicit declaration of ‘ioctl’ was here
   68 |         ret = ioctl(fd, request, arg);
      |               ^~~~~

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 13:45:57 +01:00
Alan Coopersmith
d8a9420f6f meson: recognize "sunos" as the system name for Solaris
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-16 13:45:57 +01:00
Alan Coopersmith
7040795a69 util: Solaris has linux-style pthread_setname_np
Fixes: dcf9d91a ("util: Handle differences in pthread_setname_np")

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 13:45:57 +01:00
Alan Coopersmith
b3028a9fb8 util: Workaround lack of flock on Solaris
v2: Replace autoconf check for flock() with meson check

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 13:45:57 +01:00
Alan Coopersmith
a56c3e3a47 util: Make Solaris implemention of p_atomic_add work with gcc
gcc is very particular about where you place the (void) cast
The previous placement made it error out with:

In file included from disk_cache.c:40:0:
../../src/util/u_atomic.h:203:29: error: void value not ignored as it ought to be
 #define p_atomic_add(v, i) ((void)         \
                              ^
disk_cache.c:658:4: note: in expansion of macro ‘p_atomic_add’
    p_atomic_add(cache->size, size);
    ^

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 13:45:57 +01:00
Alan Coopersmith
ddde652e70 c99_compat.h: Don't try to use 'restrict' in C++ code
Fixes build failures on Solaris in C++ files using gcc:

../src/util/u_math.h:628:41: error: expected ‘,’ or ‘...’ before ‘dest’
  628 | util_memcpy_cpu_to_le32(void * restrict dest, const void * restrict src, size_t n)
      |                                         ^~~~
../src/util/u_math.h: In function ‘void* util_memcpy_cpu_to_le32(void*)’:
../src/util/u_math.h:641:18: error: ‘dest’ was not declared in this scope
  641 |    return memcpy(dest, src, n);
      |                  ^~~~
../src/util/u_math.h:641:24: error: ‘src’ was not declared in this scope
  641 |    return memcpy(dest, src, n);
      |                        ^~~
../src/util/u_math.h:641:29: error: ‘n’ was not declared in this scope; did you mean ‘yn’?
  641 |    return memcpy(dest, src, n);
      |                             ^
      |                             yn

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-16 13:45:57 +01:00
Alyssa Rosenzweig
c94ccbf201 pan/midgard: Do not repeatedly spill same value
It doesn't make sense. You already spilled it once, and it didn't help.
Don't try again, or you'll end up in a loop.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig
2d914ebe81 pan/midgard: Fix memory corruption in register spilling
Essentially an off-by-one error ... bit of an edge case, but seems to
occur in some glamor shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig
01a78dbbab pan/midgard: Allow COMPUTE jobs in panfrost_bo_access_for_stage
Fixes: ada752afe4 ("panfrost: Extend the panfrost_batch_add_bo() API to pass access flags")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig
fd2216e1fd pan/midgard: Use 16-bit liveness masks
We'll want liveness per-byte, so we need to accomodate up to 16 bytes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig
4fee7b30c0 panfrost: Disable frame throttling
The new frame throttling implemention interacts unfortunately with
pipelining, leading to fence fds leaking like crazy and ultimately apps
crashing quickly.

With this patch, apps still crash but not as quickly. We need to either
figure out the real cause or revert the core changes.

Nevertheless, we don't want frame throttling in the first place, so.

Fixes: a65e29ccb2 ("gallium: simplify throttle implementation")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-16 08:13:38 -04:00
Pierre-Eric Pelloux-Prayer
16233797f4 mesa: fix invalid target error handling for teximage
This commit moves the target check before using _mesa_get_current_tex_object
to fix a "Mesa implementation error: bad target in _mesa_get_current_tex_object()"
error.

Fixes: 9dd1f7cec0 ("mesa: pass gl_texture_object as arg to not depend on state")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-16 10:41:31 +02:00
Marek Olšák
268e0e01f3 radeonsi/nir: simplify si_lower_nir signature
just a cleanup
2019-10-15 21:52:09 -04:00
Alyssa Rosenzweig
923aa3918c pan/midgard: Fix mir_mask_of_read_components with dot products
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig
47b58199f0 pan/midgard: Add perspective ops to mir_get_swizzle
I really need to just make this a table..

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig
7db36d94af pan/midgard: Don't try to propagate swizzles to branches
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig
9c0915ba4a pan/midgard: Allow non-contiguous masks in UBO lowering
We don't really need to impose this condition, but we do need to cope
with the slightly more general case.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:11 -04:00
Alyssa Rosenzweig
a6867fb3fd pan/midgard: Report read mask for branch arguments
Conditionals in particular read values.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:11 -04:00
James Xiong
fd235484fe iris: finish aux import on get_param
A buffer and its aux are imported separately, if the aux import is
not completed yet when resource_get_param is called, merge the
separate aux a.k.a the 2nd image into the main image.

Fixes: 246eebba4a ("iris: Export and import surfaces with modifiers that have aux data")

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-15 23:19:04 +00:00
Kenneth Graunke
e6ca6e587e mesa: Handle pbuffers in desktop GL framebuffer attachment queries
Once again, we were handling back-to-front in the GLES3 case, but not
the desktop GL case.

Fixes GTF-GL46.gtf30.GL3Tests.framebuffer_srgb.framebuffer_srgb_default_encoding when run with --deqp-surface-type=pbuffer --deqp-gl-context-type=egl.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-15 15:44:27 -07:00
Kenneth Graunke
c512eca4da mesa: Make back_to_front_if_single_buffered non-static
So I can use it in fbobject.c in the next commit.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-15 15:44:25 -07:00
Kenneth Graunke
d947276b4a mesa: Use ctx->ReadBuffer in glReadBuffer back-to-front tests
We were looking at ctx->DrawBuffer when asking about the read buffer,
which was good enough for CTS purposes, but definitely not right.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-15 15:44:16 -07:00
Lionel Landwerlin
701e0ac077 etnaviv: remove variable from global namespace
Found out by accident this was clashing with another driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: <mesa-stable@lists.freedesktop.org>
2019-10-15 21:07:25 +00:00
Marek Olšák
7f6b9baee2 st/mesa: always allocate pack/unpack buffers as staging
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-15 14:24:23 -04:00
Adam Jackson
3f840e5ccd gallium/xlib: Fix xmesa drawable creation
The first time you call glXMakeCurrent, current != ctx. As a result we
would never look up whether the drawable already had an XMesaDrawable,
and would instead always create one. Then XMesaBufferList would have two
different buffers for the same XID, and you'd be reading and drawing to
different places, and that's not what you want at all.

Instead just always look up the drawable.

Fixes: db8be355 (gallium/xlib: Remove drawable caching from the MakeCurrent path)
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1196
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-15 17:24:41 +00:00
Eric Engestrom
3bcd54f3fc gitlab-ci: set a common job parent for test stage
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-15 17:42:39 +01:00
Eric Engestrom
aba78c2d38 gitlab-ci: set a common job parent for build stage
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-15 17:42:39 +01:00
Eric Engestrom
81b98e99cd gitlab-ci: set a common job parent for container stage
While at it, rename to singular "container" for consistency.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-15 17:42:39 +01:00
Samuel Pitoiset
4a3bdc6d22 Revert "radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+"
This reverts commit 2ca8629fa9.

This was initially ported from RadeonSI, but in the meantime it has
been reverted because it might hang. Be conservative and re-introduce
this packet emission.

Unfortunately this doesn't fix anything known.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-15 15:58:34 +02:00
Jonathan Marek
39d7cb36ff spirv: set correct dest_type for texture query ops
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-15 08:42:22 -04:00
Jonathan Marek
37dec33676 turnip: more descriptor sets
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
ac9f0d2dd4 turnip: push constants
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
5b7fbcbdde turnip: depth/stencil
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
f1efc9a1c8 turnip: basic msaa working
Not perfect but gets through some tests.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
d3c9914152 turnip: improve CmdCopyImage and implement CmdBlitImage
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
571b2611b3 turnip: use nir_assign_io_var_locations instead of nir_assign_var_locations
Variables with same location should use the same driver_location.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
a5635a8a50 turnip: add missing nir passes
Avoids assert fails in ir3.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
d930be9f4c turnip: add code to lower indirect samplers
Taken from nir_lower_samplers. Sampler arrays don't work though, this is
just to avoid an assert fail in ir3.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
e336076838 turnip: fixup consts
Fix some mistakes in previous series.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Jonathan Marek
29464712ce turnip: update some shader state bits from GL driver
Notably includes centroid varying bits that were missing.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Eric Anholt
9a5f3594ee turnip: Emit clears of gmem using linear.
This is what we do in freedreno.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:20 -04:00
Eric Anholt
776a9ce36b turnip: Set up the correct tiling mode for small attachments.
Noticed while debugging a tiling-looking issue by comparing our gmem
blit setup to freedreno's.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Eric Anholt
8193c2b08b turnip: Tell spirv_to_nir that we want fragcoord as a sysval.
Fixes ir3 compiler failure failure in
dEQP-VK.renderpass.dedicated_allocation.formats.r8g8b8a8_unorm.clear.clear_draw
(now just a rendering failure where the subpass clear isn't happening)

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Eric Anholt
0ce1672a2c turnip: Fill in clear color packing for r10g11b11 and rgb9e5.
Fixes assertion failures in
dEQP-VK.api.image_clearing.core.clear_color_image.2d.* for these
formats, though the test set as a whole is stil failing.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Eric Anholt
1b16c5c98e turnip: Drop unused tu_pack_clear_value() return.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
8626d33986 turnip: add anisotropy and compressed formats to device features
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
f4154e7d3e turnip: disable tiling as necessary
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
057c0f5caa turnip: update setup_slices
Deal with tiled r8g8 having different alignment and other updates taken
from fd6_resource. Additionally track image samples/cpp.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
c47f58bd4d turnip: add VK_KHR_sampler_mirror_clamp_to_edge
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
2f939ef889 turnip: add black border color
Avoids hangs and some texture tests are happy with just this.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
ffbffe19f9 turnip: improve sampler descriptor
Fixes anisotropy and shadow texture

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
68b8d0b70e turnip: improve view descriptor
Changes to make compressed, tiled, 3d, etc textures work

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
31351a0281 turnip: add more 2d_ifmt translations
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
acdc75301f turnip: format table fixes
* Fix R16G16 SCALED and R16G16B16A16 SCALED having texture format
* Fix B5G6R5 swap value
* Use R8_UINT instead of R8_UNORM for S8_UINT rb format
* Disable 96-bit texture formats instead having a check for NPOT formats
* Don't fail assert on D24X8 format

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
eb67d9f0f3 turnip: add format_is_uint/format_is_sint
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
12ede7565f turnip: add astc format layout
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
b6e1544852 turnip: fix assert failing for 0 color attachments
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
467f9982df turnip: fix segmentation fault with compute pipeline
Not supported, so always set pointer to NULL

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
eef195c9cc turnip: fix segmentation fault in events
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
03772df450 turnip: fix 32 vertex attributes case
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
8580726f90 turnip: fix triangle strip
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:19 -04:00
Jonathan Marek
b7093882eb freedreno/regs: update a6xx 2d blit bits
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-15 07:56:18 -04:00
Samuel Pitoiset
50c8c4144b radv: rename VK_KHR_shader_float16_int8 structs/constants
Trivial change.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-15 12:13:53 +02:00
Iago Toral
e353656f3d v3d: drop unused shader_rec_count member from context
Looks like this was copied from the vc4 driver where it is actually
included in the submit CL ioctl.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-15 06:56:45 +00:00
Jonathan Marek
278c9b5cc7 freedreno/ir3: implement fquantize2f16
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robclark@gmail.com>
2019-10-14 17:48:22 -04:00
Jonathan Marek
92d756f22d freedreno/ir3: implement texop_texture_samples
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robclark@gmail.com>
2019-10-14 17:48:22 -04:00
Jonathan Marek
3cfd5ffb8c freedreno/ir3: fix GETLOD for negative LODs
Note: for output type U32, negative LOD is not sign extended from 16 bits

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robclark@gmail.com>
2019-10-14 17:48:22 -04:00
Jonathan Marek
cfc6a3e394 freedreno/ir3: implement fdd{x,y}_coarse opcodes
Same as regular fddx/fddy.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robclark@gmail.com>
2019-10-14 17:48:22 -04:00
Jonathan Marek
b094b384e2 freedreno/ir3: increase size of inputs/outputs arrays
Makes it possible to support 32 varyings.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robclark@gmail.com>
2019-10-14 17:48:22 -04:00
Jonathan Marek
08003c37b9 freedreno/ir3: remove input ncomp field
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robclark@gmail.com>
2019-10-14 17:48:22 -04:00
Lucas Stach
ce23bc9283 etnaviv: fix vertex buffer state emission for single stream GPUs
GPUs with a single supported vertex stream must use the single state
address to program the stream.

Fixes: 3d09bb390a (etnaviv: GC7000: State changes for HALTI3..5)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-14 19:18:37 +00:00
Dave Airlie
c2efc7c637 gallivm/draw/swr: make the gs_iface not depend on tgsi.
This gs_iface doesn't seem to require a dependence on the tgsi
context, except for the swr end prim code.

This refactors the API to include all the info that the swr
code needs in the interface rather than having to dig it out of
the struct inheritance.

This is a precursor to adding NIR support to llvmpipe.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-10-15 04:43:30 +10:00
Kenneth Graunke
ac7af7c500 iris: Implement the Gen < 9 tessellation quads workaround
Fixes several CTS tests:
- KHR-GL46.tessellation_shader.vertex.vertex_spacing
- KHR-GL46.tessellation_shader.tessellation_shader_point_mode.points_verification

Fixes: 823609b1a3 ("iris/WIP: add broadwell support")
2019-10-14 09:48:36 -07:00
Caio Marcelo de Oliveira Filho
58286c7969 anv: Advertise VK_KHR_spirv_1_4
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-14 08:25:42 -07:00
Caio Marcelo de Oliveira Filho
90a7893ca8 vulkan: Update the XML and headers to 1.1.125
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-14 08:23:27 -07:00
Mauro Rossi
072c94f724 android: amd/common: export amd/llvm headers
Fixes the following building error:

external/mesa/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:42:10:
fatal error: 'ac_llvm_util.h' file not found
         ^~~~~~~~~~~~~~~~
1 error generated.

Fixes: 3a08110 ("amd: Move all amd/common code that depends on LLVM to amd/llvm.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-14 10:46:45 +02:00
James Xiong
4f963b03a1 gallium: rename PIPE_CAP_MAX_FRAMES_IN_FLIGHT to PIPE_CAP_THROTTLE
v2: [ Michel Dänzer ]
* Update src/gallium/docs/source/screen.rst accordingly

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> # v1
Reviewed-by: Marek Olšák <marek.olsak@amd.com> # v1
2019-10-14 10:05:46 +02:00
James Xiong
a65e29ccb2 gallium: simplify throttle implementation
All gallium drivers currently set MAX_FRAME_IN_FLIGHT to either 1
or 0, which means that the drivers either throttle on the previous
render or don't throttle, the current implementation is more
complicated than necessary and can be simplified.

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-14 10:05:40 +02:00
Samuel Pitoiset
ea92273cea radv: fix DCC fast clear code for intensity formats
This fixes a rendering issue with DiRT 4 on GFX10. Only GFX10 was
affected because intensity formats are different.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1923
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-14 08:36:14 +02:00
Eric Engestrom
ebe176eeff gbm: use size_t for array indexes
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-13 18:10:47 +01:00
Eric Engestrom
ad7e410893 gbm: replace NULL sentinel with explicit ARRAY_SIZE()
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-13 18:10:47 +01:00
Eric Engestrom
0d74f4bb16 gbm: replace 1/0 bool with true/false
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-13 18:10:47 +01:00
Eric Engestrom
e9d8081135 gbm: turn 0/-1 bool into true/false
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-13 18:10:47 +01:00
Eric Engestrom
48289d8853 radv: add exported symbols check
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-13 17:40:54 +01:00
Eric Engestrom
960038d550 anv: add exported symbols check
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-13 17:40:47 +01:00
Eric Engestrom
f1c22390f7 symbols-check: ignore exported C++ symbols
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-13 17:40:43 +01:00
Boris Brezillon
35e92a11dd panfrost: Fix support for packed 24-bit formats
pan_pack_color() color was missing the 24-bit packed format case.
Looks like putting the clear color in a 32-bit slot does the trick.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-13 14:44:25 +02:00
Timothy Arceri
1294f01e06 glsl: fix crash compiling bindless samplers inside unnamed UBOs
The check to see if we were dealing with a buffer block was
too late and only worked for named UBOs.

Fixes: f32b01ca43 "glsl/linker: remove ubo explicit binding handling"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1900
2019-10-12 22:04:23 +11:00
Neil Roberts
cece947a8d glsl/builtin: Add alternate versions of atan using new ops
Adds alternate versions of the atan builtin functions that use
ir_unop_atan and ir_binop_atan2 instead of inlining to the IR
implementation of the function. These alternatives are selected if the
IR is going to be consumed by NIR. In that case the IR ops will be
translated to the appropriate NIR op.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-12 09:43:18 +02:00
Neil Roberts
77f3fbb4aa glsl: Add opcodes for atan and atan2
Adds ir_binop_atan2 and ir_unop_atan. When converting to NIR these are
expanded out using the appropriate builtin generator. If they are used
with anything else then it will just hit an assert.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-12 09:43:18 +02:00
Neil Roberts
0832845dc6 nir/builtin: Add extern "C" guards to nir_builtin_builder.h
That way it can also be included from a C++ source.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-12 09:43:18 +02:00
Neil Roberts
9eaeedd54b nir/builtin: Add #include u_math.h to the header
The inline functions use M_PI so they should include a header to make
sure it is defined.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-12 09:43:18 +02:00
Neil Roberts
2098ae16c8 nir/builder: Move nir_atan and nir_atan2 from SPIR-V translator
Moves build_atan and build_atan2 into nir_builtin_builder. The goal is
to be able to use this from the GLSL translator too.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-12 09:43:17 +02:00
Hal Gentz
075a96aa92 egl: Configs w/o double buffering support have no EGL_WINDOW_BIT.
When users pass a config to `eglCreateWindowSurface` it requests double
buffering, but if the config doesn't have the appropriate `__DRIconfig`,
`eglCreateWindowSurface` fails with a `EGL_BAD_MATCH`.

Given that such behaviour is completely unacceptable, we drop the
`EGL_WINDOW_BIT` if we don't have at least one `__DRIconfig` supporting double
buffering, otherwise dropping the `EGL_PIXMAP_BIT`.

Fixes: 049f343e8a "egl: Allow 24-bit visuals for 32-bit RGBA8888 configs"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67676
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
2019-10-11 21:57:21 +00:00
Hal Gentz
a800d16e4f egl: Puts RGBA visuals in the second config selection group.
That way applications don't get windows that are compositor alpha-blended
accidentally.

In the ideal world, this would be done by the xserver, as it does for
GLX, however, an appropriate place could not be found, so it's being
placed here instead.

Fixes: 049f343e8a "egl: Allow 24-bit visuals for 32-bit RGBA8888 configs"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67676
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
2019-10-11 21:57:21 +00:00
Hal Gentz
90a19074b4 egl: Fixes transparency with EGL and X11.
This commit does this by allowing both RGB and RGBA visuals to match with
EGL configs. We also expose the `EGL_MESA_config_select_group` egl
extension, which is similar to GLX's visual select group extension, to
allow the RGBA visuals to get less priority.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67676
Fixes: 049f343e8a "egl: Allow 24-bit visuals for 32-bit RGBA8888 configs"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
2019-10-11 21:57:21 +00:00
Hal Gentz
173bc9d684 egl: Add EGL_CONFIG_SELECT_GROUP_MESA ext.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67676
Fixes: 049f343e8a "egl: Allow 24-bit visuals for 32-bit RGBA8888 configs"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
2019-10-11 21:57:20 +00:00
Kenneth Graunke
44754279ac intel/fs/gen12: Use TCS 8_PATCH mode.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-10-11 12:24:16 -07:00
Jason Ekstrand
c92fb60007 intel/fs/gen12: Implement gl_FrontFacing on gen12+.
The bit moved on gen12 in order to prepare for dual-SIMD8 dispatch.
This implementation isn't an entirely complete as it only works on SIMD8
and SIMD16 and not dual-SIMD8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
ceb123befa intel/fs/gen11+: Fix CS_OPCODE_CS_TERMINATE codegen.
Apparently the ts_request_type and ts_resource_select thread spawner
message descriptor bits were removed from the hardware at least since
ICL.  Drop them in order to avoid assertion failures on Gen12+
platforms which don't have any encoding for this.  On Gen9+ these are
probably just ignored by the hardware, so this is unlikely to have had
any functional implications prior to Gen12.

v2: Mark TS message fields as non-existing in brw_inst.h on ICL. (Caio)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
a5efb0eae8 intel/fs/gen12: Fix barrier codegen.
The WAIT instruction has been removed, but SYNC.bar can be used
instead to wait for a notification on n0.0.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6b52f81395 intel/eu: Don't set notify descriptor field of gateway barrier message.
Apparently this field was removed on SKL, and according to the
hardware docs for previous platforms "This field is only valid for a
ForwardMsg message. It is ignored for other messages. The BarrierMsg
message always increments the N0 notification counter".

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
b0e69d115e intel/ir/gen12: Update assert in brw_stage_has_packed_dispatch().
Confirmed no regressions after a full Piglit run on TGL with the
brw_fs_test_dispatch_packing() test enabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Jason Ekstrand
ca7b6fd392 intel/eu/validate/gen12: Don't blow up on indirect src0.
They look like a NULL source if you don't look at the address mode.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
ab5aa01689 intel/eu/validate/gen12: Validation fixes for SEND instruction.
The following fix-up by Jordan Justen is squashed in:

 intel/eu/validate: gen12 send instruction doesn't have a dst type field

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
a81f9b5e3e intel/eu/validate/gen12: Fix validation of SYNC instruction.
src0 will typically be null for this instruction.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
45768e6b3c intel/eu/validate/gen12: Implement integer multiply restrictions in EU validator.
Due to hardware bug filed as HSDES#1604601757.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Jordan Justen
f9ec4ac5a1 intel/ir: Lower fpow on Gen12.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
cb6db5bfb3 intel/fs/gen12: Don't support source mods for 32x16 integer multiply.
Due to hardware bug filed as HSDES#1604601757.

v2: Only return if result of fs_inst::can_do_source_mods() is known to
    be false for the case new orthogonal restrictions are implemented
    below in the future. (Caio)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
de5d106ccf intel/disasm: Disassemble register file of split SEND sources.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c03869323b intel/disasm: Don't disassemble saturate control on SEND instructions.
The field is gone on Gen12+ and it was illegal on previous
generations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
f15e0b3439 intel/disasm/gen12: Disassemble Gen12 SEND instructions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
fd7e21dd90 intel/disasm/gen12: Disassemble Gen12 SYNC instruction.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
606d823b42 intel/disasm/gen12: Disassemble three-source instruction source and destination regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
8263d300c2 intel/disasm/gen12: Fix disassembly of some common instruction controls.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
83612c0127 intel/disasm/gen12: Disassemble software scoreboard information.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
396f6b27a7 intel/fs/gen12: Demodernize software scoreboard lowering pass.
Kept as a separate commit in order to avoid distracting reviewers of
the software scoreboard pass with memory management boilerplate.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
265c7c8971 intel/fs/gen12: Introduce software scoreboard lowering pass.
Gen12+ hardware lacks the register scoreboard logic that used to
guarantee data coherency between register reads and writes in previous
generations.  This lowering pass runs after register allocation in
order to make up for it.

It works by performing global dataflow analysis in order to determine
the set of potential dependencies of every instruction in the shader,
and then inserts any required SWSB annotations and additional SYNC
instructions in order to guarantee data coherency.

v2: Drop unnecessary _safe list iteration (Caio).

v3: Temporarily workaround potential WaR hazard between FPU
    instruction and subsequent out-of-order write, pending
    clarification from the hardware team.  Drop redundant tracking of
    implicit access of acc0-1, since the hardware guarantees coherency
    of these (but not the other accumulators...).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
e0b8d7953e intel/fs/gen12: Add scheduling information to the IR.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
15e3a0d9d2 intel/eu/gen12: Set SWSB annotations in hand-crafted assembly.
Reviewers are encouraged to audit the code generation pass
independently for the case I missed some potential data hazard or new
code has been added in the meantime.

v2: Add SYNC instruction to cr0 workaround in brw_float_controls_mode().

v3: Drop likely redundant (and potentially harmful) RegDist SWSB
    annotation from ce0 read in brw_find_live_channel() (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
d3f3bdcd18 intel/eu/gen12: Add tracking of default SWSB state to the current brw_codegen instruction.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6154cdf924 intel/eu/gen12: Add auxiliary type to represent SWSB information during codegen.
v2: Introduce extra tgl_swsb_sbid() constructor (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c22db5e188 intel/fs/gen12: Add codegen support for the SYNC instruction.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
0e57dbc55c intel/ir/gen12: Add SYNC hardware instruction.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
7499e10383 intel/eu/gen12: Don't set thread control, it's gone.
An effect similar to the one formerly provided by setting thread
control to "switch" can be achieved now by setting a RegDist of 1 on
the SWSB field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
a66ea33991 intel/eu/gen12: Don't set DD control, it's gone.
A future lowering pass will simulate the same behavior originally
provided by NoDDChk/NoDDClr at the IR level by using appropriate SWSB
annotations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
8a5fad0d92 intel/eu/gen12: Use SEND instruction for split sends.
The new SEND instruction behaves like the former SENDS instruction.
The original single-payload SEND instruction is gone.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6634ede7aa intel/eu/gen12: Codegen SEND descriptor regions correctly.
The SEND instruction is now four-source.  The descriptor is no longer
part of source 1, so avoid touching it to avoid corruption while
initializing the descriptor.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
2c4c9aba30 intel/eu/gen12: Codegen pathological SEND source and destination regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
bafc9515db intel/eu/gen12: Codegen control flow instructions correctly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6e1daba3b4 intel/eu/gen12: Codegen three-source instruction source and destination regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
9fdb67aa09 intel/eu/gen12: Fix codegen of immediate source regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6cb764ae9c intel/eu/gen12: Add Gen12 opcode descriptions to the table.
Quite a lot of churn because the encoding of most hardware opcodes has
changed unfortunately.

v2: Split dot-product description fixes to separate patch (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
31182e7aa9 intel/eu/gen11+: Mark dot product opcodes as unsupported on opcode_descs table.
These instructions have been removed from the hardware.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c742be1437 intel/eu/gen12: Implement datatype binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Sagar Ghuge
a12533f2ce intel/eu/gen12: Implement immediate 64 bit constant encoding.
On Gen12, 64 bit immediate constants are loaded in reverse order. Lower
32 bit gets loaded from bit 96-127 and higher 32 bits from 64-95 in
instruction encoding.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Co-authored-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
5291283af0 intel/eu/gen12: Implement compact instruction binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
77d09d0d50 intel/eu/gen12: Implement indirect region binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
81400470be intel/eu/gen12: Implement SEND instruction binary encoding.
v2: Fix off-by-one upper GET_BITS() bound, combine 25-29 and 30-31
    descriptor fields (Ken).  Shorten name of GEN12_MD() macro, drop
    some removed TS message descriptor fields (Jordan).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
d24b8af23d intel/eu/gen12: Implement control flow instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
956c156dc4 intel/eu/gen12: Implement three-source instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
fa48281795 intel/eu/gen12: Implement basic instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
143176163d intel/eu/gen12: Add sanity-check asserts to brw_inst_bits() and brw_inst_set_bits().
These caught a few bugs during the development of this series.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
7e5a8638d3 intel/eu/gen12: Extend brw_inst.h macros for Gen12 support.
The encoding of almost every instruction field has changed in Gen12,
so this involves adding a Gen12+ bitfield spec to every brw_inst
macro.  In addition some new macros are required to handle certain
discontiguous and variable-length fields.

This commit doesn't actually include the Gen12 updated bitfield specs,
only the macros are extended here for reviewability.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Rename FDC() to FFDC() and FDC1() to FDC() for consistency with
    the existing F() and FF() macros.
2019-10-11 12:24:16 -07:00
Francisco Jerez
6965a02e09 intel/ir: Represent physical edge of unconditional CONTINUE instruction.
This edge doesn't exist in the original scalar program, but it
represents a potential control flow path the EU will take in cases
where control flow isn't uniform across channels of the same SIMD
thread.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
eeaad2992c intel/ir: Represent physical edge of ELSE instruction.
This edge doesn't exist in the original scalar program, but it
represents a potential control flow path the EU will take in cases
where the condition isn't uniform across channels of the same SIMD
thread.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
152754665a intel/ir: Represent logical edge of BREAK instruction.
Currently only the physical back-edge is represented, which
incidentally also leads to the exit block of the loop, but we need the
direct logical edge in addition for our logical CFG representation to
be complete.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c344c92b31 intel/ir: Add helper function to push block onto CFG analysis stack.
Requested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
d6a9731d8f intel/ir: Represent physical and logical subsets of the CFG.
This represents two control flow graphs in the same cfg_t data
structure: The physical CFG that will include all possible control
flow paths the EU can physically take, and the logical CFG restricted
to the control flow paths that exist in the original scalar program.
The latter is a subset of the former because in case of divergence the
SIMD vectorized program will take control flow paths that aren't part
of the original scalar program.

The bblock_link constructor and bblock_t::add_successor() now take a
"kind" parameter that specifies whether the edge is purely physical or
whether it's part of both the logical and physical CFGs (a logical
edge is of course always guaranteed to be in the physical CFG as
well).  bblock_t::is_predecessor_of() and ::is_successor_of() also
take a kind parameter specifying which CFG is being queried.  The '~>'
notation will be used now in order to represent purely physical edges
in IR dumps.

This commit doesn't actually add nor remove any edges from the CFG
(the only edges marked as purely physical here are the two WHILE loop
ones that already existed).  Optimization passes should continue using
the same (incomplete) physical CFG they were using before until
they're fixed to do something smarter in a later commit, so this
shouldn't lead to any functional changes.

v2: Remove tabs from lines changed in this file (Caio).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
1b570456ca intel/ir: Drop hard-coded correspondence between IR and HW opcodes.
Having the IR opcodes locked to their hardware representation is risky
because it causes opcodes as different as BRC and IFF to compare equal
at the IR level (luckily the back-end only ever uses one opcode from
each group, right now), and it prevents us from supporting
instructions that change their hardware representation across
generations, which will become a problem on Gen12+ platforms.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
057902dcf8 intel/eu: Encode and decode native instruction opcodes from/to IR opcodes.
Change brw_inst_set_opcode() and brw_inst_opcode() to call
brw_opcode_encode/decode() transparently in order to translate between
hardware and IR opcodes, and update the EU compaction code in order to
do the same as needed, so we can eventually drop the one-to-one
correspondence between hardware and IR opcodes.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
25dd67099d intel/eu: Rework opcode description tables to allow efficient look-up by either HW or IR opcode.
This rewrites the current opcode description tables as a more compact
flat data structure.  The purpose is to allow efficient constant-time
look-up by either HW or IR opcode, which will allow us to drop the
hard-coded correspondence between HW and IR opcodes -- See the next
commits for the rationale.

brw_eu.c is now built as C++ source so we can take advantage of
pointers to member in order to make the look-up function work
regardless of the opcode_desc member used as look-up key.

v2: Optimize devinfo struct comparison (Caio)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
51dc40cefb intel/eu: Fix up various type conversions in brw_eu.c that are illegal C++.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
35bcd08d61 intel/eu: Split brw_inst ex_desc accessors for SEND(C) vs. SENDS(C).
The brw_inst opcode accessors are going away in one of the following
commits.  We could potentially replace them with the new helpers that
do opcode remapping, but that would lead to a circular dependency
between brw_inst.h and brw_eu.h.  This way we also avoid ordering
issues that can cause the semantics of the ex_desc accessors to change
depending on whether the ex_desc field is set after or before the
opcode instruction field.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
b2ae65c7d9 intel/fs: Fix constness of implied_mrf_writes() argument.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6f275a863d intel/fs: Define is_send() convenience IR helper.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
f326d9d218 intel/fs: Define is_payload() method of the IR instruction class.
This is required because SEND message payload sources are fetched
asynchronously by the hardware, which can lead to WaR data corruption
on Gen12+ platforms if not handled specially by the compiler to
guarantee proper synchronization.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
a42581fa8f intel/fs: Teach fs_inst::is_send_from_grf() about some missing send-like instructions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Bas Nieuwenhuizen
6da3bf2600 nir/dead_cf: Remove dead control flow after infinite loops.
And after discard-only loops. Otherwise we end up with dead code
which confuses nir_repair_ssa into adding a whole bunch of uses
of undefined. However, for derefs, we sometimes always expect to
get a variable instead of undefined.

Fixes dEQP-VK.graphicsfuzz.write-red-in-loop-nest on radv.

Fixes: c832820ce9 "nir/dead_cf: Repair SSA if the pass makes progress"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1928
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-11 17:24:26 +02:00
Rhys Perry
f13ad839f1 aco: don't use p_as_uniform for vgpr sampler/image indices
p_as_uniform can get CSE'd, which can be incorrect and break some
dEQP-VK.descriptor_indexing.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
0c3fe323b6 aco: implement divergent vulkan_resource_index
Fixes the UBO/SSBO dEQP-VK.descriptor_indexing.* tests

v2: remove bld.copy() usage

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
5526a557ee aco: readfirstlane vgpr pointers in convert_pointer_to_64_bit()
This can happen when bcsel is used between the results of two
vulkan_resource_index. It's also probably needed for non-uniform
descriptor indexing

Fixes dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.reads_opselect_two_buffers

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
45d6c69b99 aco: use can_accept_constant in valu_can_accept_literal
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
b37857bcea aco: don't apply sgprs/constants to read/write lane instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
599d634c2c nir/lower_input_attachments: pass on non-uniform access flag
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-11 14:26:58 +00:00
Rhys Perry
5ef04d7982 nir/lower_non_uniform: lower image/texture instructions taking derefs
v2: always assert on the texture/sampler handle's num_components
v3: replicate the deref inside the loop
v4: remove a case of useless line wrapping

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-11 14:26:58 +00:00
Jonathan Marek
7e3b900c80 etnaviv: rework etna_resource_create tiling choice
Now that the base resource is allowed to be incompatible with PE, we can
make a smarter choice of tiling mode to avoid allocating a PE compatible
base that is never used for regular textures. This affects GPUs like GC2000
where there is no tiling compatible with both PE and TE.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-11 07:26:52 -04:00
Jonathan Marek
b962776530 etnaviv: rework compatible render base
For PE-incompatible layouts, use a mechanism similar to what texture does
to create a compatible base resource.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-11 07:26:52 -04:00
Jonathan Marek
e7e02435a8 etnaviv: get addressing mode from tiling layout
Remove the "addressing_mode" state, which is currently set incorrectly, and
instead deduce the addressing mode from the tiling layout.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-11 07:26:52 -04:00
Jonathan Marek
5403b36653 etnaviv: clear texture cache and flush ts when texture is modified
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-11 07:26:52 -04:00
Christian Gmeiner
6dc650fe71 etnaviv: output the same shader-db format as freedreno, v3d and intel
This lets us reuse their report.py.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-11 12:35:15 +02:00
Christian Gmeiner
140bc0f040 etnaviv: nir: start to make use of compile_error(..)
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-10-11 11:37:03 +02:00
Michel Dänzer
d60b8679a4 gitlab-ci: Disable meson-mingw32-x86_64 job again for now
The wrapdb.mesonbuild.com SSL certificate expired, causing the job to
fail: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/731864

Switching to http:// doesn't avoid it:
https://gitlab.freedesktop.org/daenzer/mesa/-/jobs/732043
2019-10-11 11:10:01 +02:00
Michel Dänzer
eb86cbabe6 gitlab-ci: Add .use-debian-10 template
It simplifies the definitions of jobs using the Debian 10 image.

The needs: was previously missing from the llvmpipe/softpipe test jobs,
so they could spuriously run if the debian-10 job failed or was
cancelled.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-11 10:05:21 +02:00
Michel Dänzer
9691329727 gitlab-ci: Remove redundant .meson-cross template script
It was identical to the one inherited from the .meson-build template.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-11 10:04:45 +02:00
Dave Airlie
f59ff014b1 gallivm: fix coroutines on aarch64 with llvm 8
The coroutine split pass is missing a dependency before LLVM 9.0,
and fails to initialise properly if the CallGraphWrapperPass hasn't
be initialised earlier (x86 does it due to some of it's passes
requiring it).

This is a workaround for llvm 8 (coroutines are only supported in 8
and higher). It adds another pass that has a dependency on the pass
the coroutines split requires. This pass shouldn't have any raal
effects.

Fixes: d32690b43c (gallivm: add coroutine pass manager support)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-11 12:15:45 +10:00
Dave Airlie
05b008c961 llvmpipe: add support for tg4 component selection.
This is needed as part of GLES3.1 and helps for ARB_gpu_shader5.

Fixes: KHR-GLES31.core.texture_gather.* cases
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-11 00:32:15 +00:00
Dave Airlie
a70f0a8841 st/glsl: add support for alternate TG4 encoding.
This will encode the component selection value (0, 1, 2, 3) into
the X swizzle of the sampler, if the driver requests it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-11 00:32:15 +00:00
Dave Airlie
0c09df52e1 gallium: add a a new cap for changing the TGSI TG4 instruction encoding
Accessing the TG4 component via immediates in the llvmpipe backend is quite
messy (like really messy). Roland suggested we change the instruction encoding,
so introduce a cap to allow the component to be selected to be store in the
sampler swizzle, which should be otherwise unused.

I could probably switch all drivers over, but virgl would need some work that
I'd prefer not to rush it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-11 00:32:15 +00:00
Dave Airlie
1e65757f4e gallivm/sample: add gather component selection to the key.
This allows for component selection to work as per ARB_gpu_shader5/GLES3.1

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-11 00:32:15 +00:00
Roland Scheidegger
5084e9785b llvmpipe: increase max texture size to 2GB
The 1GB limit was arbitrary, increase this to 2GB (which is the max
possible without code changes).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-10-11 01:41:08 +02:00
Dylan Baker
d905d9b600 gitlab-ci: Add a mingw x86_64 job
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:05 -07:00
Dylan Baker
f066c96078 appveyor: Add support for meson as well as scons on windows
This job uses the vs2017 backend of meson (msbuild) as opposed to the
ninja backend used on MacOS and Linux.

v7: - rebase on master
    - remove llvm (we'll add that back later)
    - remove cygwin (we'll add that back later too)
v6: - rebase on master, including the addition of cygwin
    - consolidate 3 appveyor patches into this one patch
v5  - use the new b_vscrt option instead of manually specifying the crt
v4: - rebase on python3 generators
    - cache meson wraps
    - Build x86 instead of x86_64, since that's what the pre-built LLVM
      is
    - update to vs2017 from vs2015
    - set the default-library to static
    - use the new vscrt override
    - add the /m switch to msbuild to make the build somewhat faster

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:05 -07:00
Dylan Baker
44c5e634a5 docs: update meson docs for windows
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:05 -07:00
Dylan Baker
638868bbff glsl/tests: Handle no-exec errors
Currently meson doesn't correctly handle passing compiled binaries to
scripts in tests. This patch looks to the future (0.53) when meson will
have this functionality, but also immediately it fixes these tests in
cross compiles by causing them to return 77, which meson interprets as
skip.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:05 -07:00
Dylan Baker
1bf5e5a011 meson/util: Don't run string_buffer tests on mingw
They succeed with MSVC but not with MinGW. I don't understand why they
fail.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
09d21b554a meson: glcpp tests are expected to fail on windows
v2: - Exclude the tests rather than xfail them

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
8f363ce5b5 meson: only build timspec test if timespec is available
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
fe8f8981d0 meson: don't error on formaters with mingw
MSVC is generally happy, but mingw errors. I've spent as much time
(several days) trying to squash all of these warnings and I'm done with
it, just leave them as warnings with MinGW.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
1e2c05b82a meson: add msvc compat args to swr
This has always been present in the scons build, so it should be in
the meson build as well.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
63f5aee694 meson: maintain names of shared API libraries
Mesa uses the lib prefix, and doesn't use a version for it's dynamic
libraries, which meson defaults to.

v2: - this patch

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
e1dbf10749 meson: don't build or run mesa-sha1 test on windows
It crashes hard (pop-up window and all).

v2: - Change comment to FIXME

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
b6b59813c3 meson: disable graw tests on mingw
I can't figure out why symbols are being exposed that shouldn't.

v2: - change comment to FIXME

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
56db696875 meson: don't build gallium trivial tests on windows
They require the pipe-loaders, which require xmlconfig, which doesn't
build with msvc.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
880ca3c964 meson: Set visibility and compat args for graw
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
095bdbda2b meson: Add msvc compat args to util/tests
To keep this building with msvc

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
00fca07c3b meson: Add idep_getopt for tests
There are quite a few tests that require getopt, when using MSVC we need
to use the bundled version of getopt since there isn't a system version.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
8eee724b73 meson: don't define USE_ELF_TLS for windows
Because the macros for exporting dll symbols and using TLS are mutually
exclusive.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
3740ffb59c meson: add switches for SWR with MSVC
This makes two changes for SWR,

The first is that it reorders the arguments to try to put the ICL ones
first. This is required to support older versions of meson that don't
add enough "error in this case" switches to ICL, which causes it to
happy accept -mavx (for example) even though it doesn't support them,
resulting in compilation failures.

The second is to fix the names of the libraries, setting the soversion
to '' will result in <lib>.dll, instead of <lib>-0.dll. Since these are
not versioned dll's, but implement an internal API we should communicate
that. It's also what scons does.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
d2cb0a59ce meson: disable sse4.1 optimizations with msvc
There isn't an obvious command line switch here, /arch:AVX *might* be
the right thing, but meson doesn't know what to do here either and
leaves the -msse4.1 and -mstackrealign.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
150aec5d1f meson: force inclusion of inttypes.h for glcpp with msvc
Because we provide a copy if MSVC doesn't, and we need it to make flex
do what we want.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
8b19c5b145 meson: Add support for using win_flex and win_bison on windows
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
81d44c01ee meson: don't look for rt on windows
v6: - Minor refactor

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
e3f5c3232c meson: fix pipe-loader compilation for windows
v2: - Add missing D to pound define
    - Simply define the variable rather than set it to 1 (mirrors
      android.mk not scons)

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
474d6f8e08 util/xmlconfig: include strndup.h for windows
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
7ef85a0d92 meson: Don't check for posix_memalign on windows
There's a mingw bug for this, it exports __builtin_posix_memalign but
not posix_memalign, so the check will succeed, but compiling will fail.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
597a063551 meson: fix gallium-osmesa to build for windows
v2: - set so_version to '' (only affects windows)
    - always set lib prefix to 'lib', even on msvc
v5: - key NO_EXPORTS on shared glapi instead of gles.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
b97a341017 meson: build graw-gdi target
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
a2c79cc3cb meson: build libgl-gdi target
v4: - Fix check for broken mingw (should be for x86 not x86_64)
    - Add comment about why check is needed

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
3c8934343b meson: build wgl state tracker
v4: - Handle enable gles properly
    - Add comments about what various #defines do
v5: - key NO_EXPORTS on shared glapi instead of gles.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
560cdabebe meson: build gallium gdi winsys
v6: - use null_dep instead of []

Reviewed-by: Eric Anholt <eric@anholt.net> (v5)
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
2a53a06793 meson: Add necessary defines for mesa_gallium on windows
v4: - Retain scons comments for windows specific defines
v5: - key GLAPI_NO_EXPORTS off of shared-glapi instead of gles

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
2e17348600 meson: Add windows defines to glapi
These are needed to control the export or symbols due to differences
between the way windows and *nix handle symbol exports.

Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>

v5: - key NO_EXPORT off of shared-glapi instead of gles
2019-10-10 16:33:04 -07:00
Dylan Baker
3aee462781 meson: add windows compiler checks and libraries
v4: - Fix typo in warning code (4246 -> 4267)
    - Copy comments from scons for what MSVC warnings codes do
    - Merge linker argument changes into this commit
v5: - Add /GR- on windows if LLVM is build without rtti (equivalent to
      GCc's -fno-rtti')
    - Add /wd4291, which is catching the same hting that
    -Wno-non-virtual-dtor is on GCC/Clang

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Dylan Baker
16bc3073cb util: use _WIN32 instead of WIN32
MinGW defines only _WIN32, but doesn't have fcntl, so we need to use the
windows path.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-10 16:33:04 -07:00
Rob Clark
f1fe656a92 freedreno/ir3: handle multi component alu src when propagating shifts
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-10 23:12:05 +00:00
Rob Clark
61a0a86d28 freedreno/ir3: drop unused param
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-10 23:12:05 +00:00
Marek Olšák
c38c8d012e clover: fix the nir_serialize build failure
Fixes: dd4cc56ebd "nir: add a strip parameter to nir_serialize"
2019-10-10 18:44:40 -04:00
Dave Airlie
1b221f4e7b llvmpipe/draw: handle UBOs that are < 16 bytes.
Not sure if this is a bug in the user or not, but some CTS
tests fail due to using an 8 byte constant buffer.

Fixes: KHR-GLES31.core.layout_binding.block_layout_binding_block_VertexShader

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-10 21:52:20 +00:00
Dave Airlie
744b8936df llvmpipe/draw: fix image sizes for vertex/geometry shaders.
since images are a single level, minify before passing the w/h
to draw.

Fixes: KHR-GLES31.core.shader_image_size.basic-nonMS-vs-*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-10 21:52:20 +00:00
Dave Airlie
7cac880831 llvmpipe: make texture buffer offset alignment == 16
Due to use vmovdqa instructions in the asm, which require 16-byte
aligned buffers.

This fixes a crash in
KHR-GLES31.core.texture_buffer.texture_buffer_texture_buffer_range

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-10 21:52:20 +00:00
Eric Engestrom
34ba363ab0 meson: skip installation of GLVND-provided headers
Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1846
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-10 22:33:09 +01:00
Eric Engestrom
1a7e9652c4 meson: split Mesa headers as a separate installation
Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-10 22:33:05 +01:00
Eric Engestrom
daae003f47 meson: split headers one per line
Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-10 22:21:00 +01:00
Eric Engestrom
b9a5fb1f05 meson: move a couple of include installs around
Preparation for a later commit.

Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-10 22:18:04 +01:00
Eric Engestrom
b57fa7ca49 meson: rename glvnd_missing_pc_files to not glvnd_has_headers_and_pc_files
This reflects better what is provided by glvnd or not.

Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-10 22:18:04 +01:00
Eric Engestrom
a0829cf23b GL: drop symbols mangling support
SCons and Meson have never supported that feature, and Autotools was
deleted over 6 months ago and no-one complained yet, so it's pretty
obvious nobody cares about it.

Fixes: 95aefc94a9 ("Delete autotools")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-10 21:40:48 +01:00
Rhys Perry
2026ff5165 aco: update print_ir
Mostly adds GFX10 stuff.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Rhys Perry
283eda71cf aco: rework scratch resource code
Fix compute, cleanup and add GFX10 support.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Rhys Perry
f64b1a3454 aco/gfx10: disable GFX9 1D texture workarounds
Navi added back support for 1D textures.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Rhys Perry
de0748c42e aco/gfx10: fix inline uniform blocks
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Rhys Perry
ba71be228f radv/aco: disable NGG when ACO is used
Note that radv_device.c still has to be modified to use ACO with Navi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Marek Olšák
b7fc082b28 ac/nir: add back nir_op_fmod
radeonsi doesn't lower it for doubles.

This partially reverts commit d861401554.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:57:50 -04:00
Marek Olšák
09e0e4c93c gallium: remove PIPE_SHADER_CAP_SCALAR_ISA
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:49:19 -04:00
Marek Olšák
1f718bfc78 tgsi_to_nir: use nir_shader_compiler_options::lower_to_scalar
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:49:19 -04:00
Marek Olšák
e4f7d2576e st/mesa: use nir_shader_compiler_options::lower_to_scalar
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:49:19 -04:00
Marek Olšák
cebc38ff60 nir: add nir_shader_compiler_options::lower_to_scalar
This will replace PIPE_SHADER_CAP_SCALAR_ISA.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:49:18 -04:00
Marek Olšák
7fc5919793 tgsi_to_nir: add #ifdef header guards
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:49:18 -04:00
Marek Olšák
e5209e6a95 nir/drawpixels: fix what appears to be a copy-paste bug in get_texcoord_const
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:49:18 -04:00
Marek Olšák
e621b30787 nir/drawpixels: handle load_color0, load_input, load_interpolated_input
for radeonsi

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-10 15:49:18 -04:00
Marek Olšák
3340c066a1 nir: move gl_nir_opt_access from glsl directory
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:49:18 -04:00
Marek Olšák
dd4cc56ebd nir: add a strip parameter to nir_serialize
so that drivers don't have to call nir_strip manually.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-10-10 15:47:07 -04:00
Bas Nieuwenhuizen
e6986bcb73 radv: Enable VK_ANDROID_external_memory_android_hardware_buffer.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
e92b9c5f4f radv: Check the size of the imported buffer.
This is a security feature to disallow malicious apps from passing
a buffer that is too small.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
dad047a56a radv: Expose image handle compat types for Android handles.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
1b0ceba925 radv: Allow Android image binding.
Using delayed layout of images.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
83a012b603 radv/android: Add android hardware buffer import/export.
Support does not include images yet.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
adad61239c radv: Deal with Android external formats.
To abstract things a bit, this adds a helper function in radv_android.c.
However, this means we have to link in radv_android.c on non-android as
well, which means some scaffolding changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
041fc7beb8 radv: Derive android usage from create flags.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
53b1372571 radv: Disallow sparse shared images.
Since we really cannot share them ever.

Also remove an unused switch.

Fixes: b70829708a "radv: Implement VK_KHR_external_memory"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
f36b52740a radv/android: Add android hardware buffer queries.
Derived from the Intel code.

For the internal format we just use the internal Vulkan format,
as we have Vulkan formats for all android formats we care about.

For the ycbcr properties we just do something. I do not have a real
clue what would be recommended.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
a34e4dd0d2 radv/android: Add android hardware buffer field to device memory.
You cannot go from BO to Android hardware buffer, so for export we
have to remember it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
9ea72b5337 radv: Add VK_ANDROID_external_memory_android_hardware_buffer.
Still disabled but now we can add entrypoints.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
4a495e1a85 radv: Unset vk_info in radv_image_create_layout.
For better test coverage of this corner case.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
64768111c3 radv: Handle slightly different image dimensions.
The minigbm comment really says it all. We should
fix minigbm as well, but for now this is the more
robust solution.

Note that this only changes width and height for
the surface creation, not for the image and hence
also not for the sampler, where it would wreak
havoc due to the normalized coords.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
852c64ca65 radv: Delay patching for imported images until layout time.
We want this flexibility because in GFX10 we lose any stride fields,
so we have to make sure our width/height are in alignment with
the external image we import.

Furthermore, we need the ability to inject tiling modifiers on import
time which is strictly after create time for Android. So, with the
layout & patch functions being fully independent of pCreateInfo, we
can delay it until import/bind time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
2ab4d418f9 radv: Split out layout code from image creation.
So we can delay the layout until later in some import cases.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
825ddfee59 radv: Handle device memory alloc failure with normal free.
Less duplication/complexity.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
e1469c02cf radv: Cleanup buffer_from_fd.
Unused stride/offset args.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Tomeu Vizoso
6397dff6d7 gitlab-ci/lava: Test Lima driver with dEQP
Run dEQP on boards with Mali 400 and 450 in Baylibre's lab.

There's lots of skipped tests because of crashes and undetermined
behavior. May be a good idea to run the tests with valgrind and fix any
issues found.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
2019-10-10 14:50:14 +00:00
Tomeu Vizoso
8a168683d0 gitlab-ci/lava: Use files to list tests to skip
As the non-LAVA runner script does, have per-GPU version files listing
the tests that are to be skipped, due to being very slow, unstable, etc.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
2019-10-10 14:50:14 +00:00
Rafael Antognolli
01122a78b3 intel/tools: Support multiple contexts in intel_dump_gpu.
Create basic aub_context on GEM_CONTEXT_CREATE.

Set it up and submit a context + ring + pphwsp during execbuf
submission, if it has not been initialized yet.

v2: Write the HWSP only once per engine (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-10 14:08:50 +00:00
Rafael Antognolli
12feafc28e intel/tools: Add basic aub_context code and helpers.
v2:
 - Only dump context if there were no erros (Lionel).
 - Store counter for context handles in aub_file (Lionel).
v3:
 - Add a comment about aub_context -> GEM context (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-10 14:08:50 +00:00
Rafael Antognolli
472de61187 intel/tools: Use common code for GGTT address allocation.
We want to be able to create contexts on demand, and increase the GGTT
as needed for that. Use the aub_map_ggtt() function for that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-10 14:08:50 +00:00
Rafael Antognolli
9968316ed0 intel/tools: Factor out GGTT allocation.
We want to reuse it in execlists_setup().

v2: Rename it to write_ggtt_ptes() (Lionel).
v3: Rename it to aub_map_ggtt() (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-10 14:08:50 +00:00
Bas Nieuwenhuizen
a9687c4e05 radv: Implement & enable VK_EXT_texel_buffer_alignment.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 13:24:16 +00:00
Samuel Pitoiset
9d17d97ee4 radv: use a compute shader for copying timestamp query results
When the timestamp is not ready (ie. UINT64_MAX), the availabily bit
should be zero. The previous code used to copy the timestamp value
as the availabily bit and that's completely wrong.

Because it's not that simple to emit a conditional with the CP, the
driver now uses a compute shader for copying timestamp query results.

Fixes dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-10 13:23:22 +02:00
Samuel Pitoiset
dad80eadb2 radv: sync before resetting query pools if timestamps have been written
Otherwise, the GPU might write timestamp queries after the reset
operation. This is similar to other query operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-10 13:23:20 +02:00
Timur Kristóf
aa75be05af aco: Clean up usages of PhysReg::reg from aco_assembler.
These are not needed anymore, since PhyReg has an implicit
conversion operator that can convert it to unsigned int,
which is equivalent to accessing this field.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
d729d8f1dc aco: Add extra assertion for number of FS input VGPRs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
a89153d038 aco: Fix s_dcache_wb on GFX10.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
68c9554732 aco: Have s_waitcnt_vscnt write to NULL.
Not sure if this instruction actually writes anything, but LLVM
disassembles a destination and sets it to NULL.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
619f0a71cc aco: Use the VOP3-only add/sub GFX10 instructions if needed.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
6a6bef59b0 aco: Initial work to avoid GFX10 hazards.
Currently just breaks up SMEM groups and fixes
FeatureVMEMtoScalarWriteHazard (name from LLVM).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
d63c175897 aco: pad code with s_code_end on GFX10
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
83993f535e aco: workaround GFX10 0x3f branch bug
According to LLVM, branches with an offset of 0x3f are buggy.

v2: (by Timur Kristóf)
- extract the GFX10 specific part to its own function

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
0be1dd8564 aco: Fix VS input VGPRs on GFX10.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
c24cd97515 aco: Assemble opsel in VOP3 instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
818bdab796 aco: Allow literals on VOP3 instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 09:57:53 +02:00
Timur Kristóf
7cf1dcf22d aco: Support subvector loops in aco_assembler.
These are currently not used, but could be useful later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
21f1953383 aco: Set GFX10 dimensionality on the instructions that need it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
eaa2a7cdf6 aco: Use ac_get_sampler_dim, delete duplicate code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
1de9ef9c96 aco: Set GFX10 DLC bit properly.
The DLC bit is now set to 1 for all loads when GLC is also set,
but cleared to 0 for all stores (otherwise it causes issues),
and also cleared to 0 for atomics.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
89b074be86 aco: Support GFX10 VOP3 and VOP1 as VOP3 in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
d3a48c272f aco: Support GFX10 EXP in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
e6330d71b5 aco: Fix GFX9 FLAT, SCRATCH, GLOBAL instructions, add GFX10 support.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
64d74ca816 aco: Support GFX10 MIMG and GFX9 D16 in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
c0df15e645 aco: Support GFX10 MTBUF in aco_assembler.
Also remove img_format from aco_ir, since it can be calculated
from dfmt and nfmt. So only the assember needs to deal with it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
e96124bd65 aco: Link ACO with amd/common.
We'd like to use some functions, for example some
ac_shader_util functions in ACO, so we need to link
ACO to AC.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
c57503b932 amd/common: Add extern "C" to some headers that were missing it.
We'd like to include some of these in C++ code later.
Specifically, ACO is written in C++ and we would like to use
some of this code in ACO in order to avoid code duplication.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
9e27816252 aco: Support GFX10 MUBUF in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
6106d4bce9 aco: Support GFX10 DS in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
bbe87eb6c3 aco: Support GFX10 VINTRP in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
b6235651b9 aco: Support GFX10 SMEM in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
fd1d947457 aco: Add missing GFX10 specific fields and some README notes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
a01d796de4 aco: Set +wavefrontsize64 for LLVM disassembler in GFX10 wave64 mode.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Alejandro Piñeiro
fa41a51891 v3d: take into account prim_counts_offset
Specifically when reading the primitive counters.

This fixed ~700 CTS tests using this pattern:
dEQP-GLES3.functional.transform_feedback.*

when run after tests like
dEQP-GLES3.functional.prerequisite.read_pixels on the same
caselist. When run individually those tests were passing because
prim_counts_offset was zero.

Fixes: 0f2d1dfe65 ("v3d: use the GPU to
       record primitives written to transform feedback")

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-10-10 09:51:50 +02:00
Samuel Pitoiset
42b2d1119a radv: get the device name from radeon_info::name
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-10 08:15:41 +02:00
Dave Airlie
b1f3173d0f st/mesa: fix R8 bitmap texture for TGSI paths.
The initial patch only fixed up the NIR path, but forgot
the TGSI path needed fixing as well.

Fixes: f92226931b ("st/mesa: Prefer R8 for bitmap textures")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-10 10:22:37 +10:00
Jason Ekstrand
c7e5d24d8f anv/pipeline: Capture serialized NIR
This allows the serialized NIR to be displayed in RenderDoc and similar
tools.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-09 22:28:01 +00:00
Matt Turner
b2f6fda542 clover: Remove unused code
Fixes: 96b592696f ("gallium: Require LLVM >= 3.9")
Bug: https://bugs.gentoo.org/685678
2019-10-09 14:54:07 -07:00
Greg V
6da865bcfe clover: use iterator_range in get_kernel_nodes
With libc++ (LLVM's STL implementation), the original code does not compile because an
appropriate vector constructor cannot be found (for the _ForwardIterator one, requirement
is_constructible is not satisfied).
2019-10-09 14:54:07 -07:00
Marek Olšák
aed1f7ad34 radeonsi: enable MSAA shader images
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:38 -04:00
Marek Olšák
095a58204d radeonsi: expand FMASK before MSAA image stores are used
Image stores don't use FMASK, so we have to turn it into identity.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:36 -04:00
Marek Olšák
98b88cc1f6 radeonsi: apply FMASK to MSAA image loads
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:34 -04:00
Marek Olšák
c0575a6241 radeonsi: clean up image_fetch_rsrc
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:33 -04:00
Marek Olšák
743a9d85e2 radeonsi: add FMASK slots for shader images (for MSAA images)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:31 -04:00
Marek Olšák
1881b35bf6 radeonsi: set the sample index for shader images correctly
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:30 -04:00
Marek Olšák
0a0def7317 radeonsi: fix GLSL imageSamples()
We haven't supported MSAA images, so it doesn't matter much.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:28 -04:00
Marek Olšák
279da8a201 tgsi/scan: add tgsi_shader_info::msaa_images_declared
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:27 -04:00
Marek Olšák
e26bd397a8 nir: add shader_info::last_msaa_image
for radeonsi

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:19 -04:00
Marek Olšák
e4f4bb8abd radeonsi: don't set BO metadata for non-zero planes
pointed out by Bas
2019-10-09 17:06:54 -04:00
Marek Olšák
28da990bed radeonsi: ignore metadata for non-zero planes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
86e60bc265 radeonsi: remove si_vid_join_surfaces and use combined planar allocations
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
0f7c9dad44 radeonsi: allocate planar multimedia formats in 1 buffer
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
35680bfea1 vl: use u_format in vl_video_buffer_formats
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
a122e70858 gallium/u_tests: test NV12 allocation and export
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
20f132e5ef gallium/util: add planar format layouts and helpers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
3d06b9952c gallium/util: remove enum numbering from util_format_layout
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Caio Marcelo de Oliveira Filho
9b58863f87 i965: Disable fast clears when running with INTEL_DEBUG=nofc
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho
bb9af8abbd iris: Disable fast clears when running with INTEL_DEBUG=nofc
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho
44978baece anv: Disable fast clears when running with INTEL_DEBUG=nofc
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho
d438261e05 intel: Add INTEL_DEBUG=nofc for disabling fast clears
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-09 13:29:26 -07:00
Maya Rashish
e0d89b90d4 llvmpipe: avoid left-shifting a negative number.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Maya Rashish <coypu@sdf.org>
2019-10-09 20:20:40 +00:00
Danilo Spinella
962aca1910 egl: Include stddef.h in generated source
Required for NULL macro used throughout the generated file.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-10-09 13:16:38 -07:00
OBATA Akio
1ee4258383 util: fix to detect NetBSD properly
<sys/param.h> is required for NetBSD version detection,
and __NetBSD__ must be used to detect even on older releases.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-10-09 13:01:17 -07:00
Jan Beich
6ea0a918bb util: simplify BSD includes
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
2019-10-09 12:55:15 -07:00
Jan Beich
e892d9337f util: detect AltiVec at runtime on BSDs
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
2019-10-09 12:55:13 -07:00
Jan Beich
8d2dd1f4f3 util: skip AltiVec detection if built with -maltivec
Helps platforms where runtime detection isn't implemented.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
2019-10-09 12:55:11 -07:00
Jan Beich
601a098338 util: detect NEON at runtime on FreeBSD
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
2019-10-09 12:55:10 -07:00
Jan Beich
7d5ad8e77e util: skip NEON detection if built with -mfpu=neon
Helps platforms where runtime detection isn't implemented.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
2019-10-09 12:55:00 -07:00
Adam Jackson
5218c3b27e egl: Make native display detection work more than once
eglGetDisplay is awful because you have to inspect the pointer you're
given and guess what type of native display it corresponds to. We make
it worse by caching the type of the first such display we detect, so if
the second call to eglGetDisplay is to a different display type, kaboom.

Fortunately this is a problem that can be solved with the delete key.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/156
2019-10-09 18:12:29 +00:00
Rhys Perry
3f6e91a8d8 aco: enable nir_opt_sink
SGPRS: 880272 -> 838936 (-4.70 %)
VGPRS: 705316 -> 680988 (-3.45 %)
Spilled SGPRs: 1032 -> 832 (-19.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 252 -> 252 (0.00 %) dwords per thread
Code Size: 55150788 -> 55172436 (0.04 %) bytes
LDS: 451 -> 451 (0.00 %) blocks
Max Waves: 66178 -> 68706 (3.82 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-09 17:55:25 +00:00
Connor Abbott
5ac32b2954 nir/sink: Don't sink load_ubo to outside of its defining loop
Previously, this could have made the resource divergent in code like
that which is genereated by nir_lower_non_uniform_access.

Fixes: da8ed68a ('nir: replace nir_move_load_const() with nir_opt_sink()')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-09 17:55:25 +00:00
Connor Abbott
af9296b8c0 nir/sink: Rewrite loop handling logic
Previously, for code like:
loop {
    loop {
        a = load_ubo()
    }
    use(a)
}
adjust_block_for_loops() would return the block before the first loop.
Now we compute the range of allowed blocks and then walk the dominance
tree directly, guaranteeing directly that we always choose a block that
dominates all the uses and is dominated by the definition.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-09 17:55:25 +00:00
Marek Olšák
b049ebcf90 amd: don't use AMD_FAMILY definitions from amdgpu_drm.h
use the ones from addrlib

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-09 13:27:13 -04:00
Dylan Baker
770ab4db82 docs: update calendar, add news item, and link release notes for 19.2.1 2019-10-09 10:26:23 -07:00
Dylan Baker
4c302ba941 docs: Add SHA256 sum for 19.2.1 2019-10-09 10:26:23 -07:00
Dylan Baker
970a83ef34 docs: Add relnotes for 19.2.1 2019-10-09 10:26:23 -07:00
Rhys Perry
2ea9e59e8d aco: move s_andn2_b64 instructions out of the p_discard_if
And use a new p_discard_early_exit instruction. This fixes some cases
where a definition having the same register as an operand causes issues.

v2: rename instruction to p_exit_early_if
v2: modify the existing instruction instead of creating a new one
v3: merge the "i == num - 1" IFs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-09 16:19:02 +00:00
Daniel Schürmann
f584c42707 aco: don't reorder instructions in order to lower boolean phis
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-09 17:50:23 +02:00
Daniel Schürmann
10be90671f aco: re-use existing phi instruction when lowering boolean phis
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-09 17:50:23 +02:00
Michael Schellenberger Costa
a607ea51a7 aco: Cleanup insert_before_logical_end
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-09 17:50:23 +02:00
Vasily Khoruzhick
c8554f849e lima/ppir: don't clone texture loads
Cloning texture loads isn't a good idea since we may move it into
a block that is not shared between all the invocations of the shader.
We'd like to avoid that since it may result in undefined behavior.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-10-09 08:24:27 -07:00
Michel Dänzer
94cfe59070 gitlab-ci/lava: Add needs: for container image to test jobs
Without this, the test jobs could spuriously run after the container
job failed or was cancelled, even if the build job didn't run at all.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-10-09 16:19:56 +02:00
Samuel Pitoiset
030e67fac3 radv: bump minTexelBufferOffsetAlignment to 4
The spec has probably been misinterpreted during RADV bringup.

This fixes GPU hangs with dEQP-VK.binding_model.*offset_nonzero*.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 11:22:58 +00:00
Sergii Romantsov
1b21b97511 meta: leak of shader program when decompressing tex-images
CC: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
2019-10-09 10:49:08 +00:00
Erik Faye-Lund
bbdbb02a5f mesa/main: prefer R8-textures instead of A8 for glBitmap in display lists
This allows drivers to communicate that they prefer R8 textures rather
than A8 for glBitmap usage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-09 09:56:00 +02:00
Dave Airlie
f92226931b st/mesa: Prefer R8 for bitmap textures
If it's not available, we fall back to A8. This should work on all drivers,
because we depend on it in the display-list code already.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-09 09:56:00 +02:00
Samuel Pitoiset
ad96c4987c drirc: enable vk_x11_override_min_image_count for DOOM
DOOM fails to handle more images than expected when the adaptative
sync mode is enabled.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1902
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 08:38:38 +02:00
Samuel Pitoiset
cbd6f0a0c2 radv: implement VK_KHR_shader_clock
NIR->LLVM and ACO already support nir_intrinsic_shader_clock.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-09 08:43:14 +02:00
Kenneth Graunke
0b7ecfdda5 iris: Implement the Broadwell NP Z PMA Stall Fix
This should help avoid stalls in the pixel mask array in certain
non-promoted depth cases.  It especially helps for Z16, as each bit
in the PMA corresponds to two pixels when using Z16, as opposed to
the usual one pixel.

Improves performance in GFXBench5 TRex by 22% (n=1).
2019-10-08 21:53:12 -07:00
Caio Marcelo de Oliveira Filho
4327837be9 docs: Update recently enabled VK extensions on Intel 2019-10-08 16:34:00 -07:00
Caio Marcelo de Oliveira Filho
9560c9b498 anv: Enable VK_EXT_shader_subgroup_{ballot,vote}
Anvil now supports and passes Vulkan CTS tests matching

    dEQP-VK.subgroups.*.ext_shader_subgroup_ballot.*
    dEQP-VK.subgroups.*.ext_shader_subgroup_vote.*

and crucible tests matching

    func.shader-ballot.*
    func.shader-subgroup-vote.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-08 16:34:00 -07:00
Kenneth Graunke
b453b29fc7 st/mesa: Fix inverted polygon stipple condition
Fixes Piglit's gl-2.1-polygon-stipple-fs on iris.

Fixes: 63f24c3c01 ("gallium: Enable MESA_framebuffer_flip_y")
Reviewed-by: Fritz Koenig <frkoenig@google.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-08 16:18:13 -07:00
Fritz Koenig
63f24c3c01 gallium: Enable MESA_framebuffer_flip_y
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-08 13:53:01 -07:00
Fritz Koenig
66937abe2b mesa: Allow MESA_framebuffer_flip_y for GLES 3
Implement glFramebufferParameteriMESA on GLES 3 so
that the extension is not dependant on GLES 3.1

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-08 13:53:01 -07:00
Fritz Koenig
9fb76392de mesa: GetFramebufferParameteriv spelling
GetFramebufferParameteriv was incorrectly spelled as
GetFramebufferParameteri.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-08 13:53:01 -07:00
Fritz Koenig
ab8e5a1539 include/GLES2: Sync GLES2 headers with Khronos
Bring in glFramebufferParameteriMESA/glGetFramebufferParameterivMESA

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-08 13:53:01 -07:00
Clément Guérin
5afbe87d21 radeonsi: enable zerovram for Rocket League
Fixes corruption on game startup.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1888

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-08 16:07:30 -04:00
Kenneth Graunke
face221283 iris: Properly unreference extra VBOs for draw parameters
bound_vertex_buffers doesn't include extra draw parameters buffers.
Tracking this correctly is kind of complicated, and iris_destroy_state
isn't exactly in a hot path, so just loop over all VBO bindings.

Fixes: 4122665dd9 (iris: Enable ARB_shader_draw_parameters support)
Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
2019-10-08 11:14:21 -07:00
Eric Engestrom
6f26eae077 meson: fix sys/mkdev.h detection on Solaris
On Solaris, sys/sysmacros.h has long-deprecated copies of major() & minor()
but not makedev().
sys/mkdev.h has all three and is the preferred choice.

Let's make sure we check for all 3 major(), minor() and makedev().

Reported-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Tested-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2019-10-08 16:26:50 +01:00
Eric Engestrom
02b3aa3cf3 include: update drm-uapi
`drm.h` was missing a `#include <stdint.h>`, which was completely
breaking the non-linux builds after 272f9cfe6a ("dri: Use DRM_FORMAT_*
instead of defining our own copy.") started making use of it.

Fixes: 272f9cfe6a ("dri: Use DRM_FORMAT_* instead of defining our own copy.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/950
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-08 14:23:51 +01:00
Michel Dänzer
3b8aeb0906 loader: Simplify handling of the radeonsi driver
The list of AMD/ATI devices supported by radeon/r200/r300/r600 is
complete, so anything else must use radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-08 09:02:34 +00:00
Bas Nieuwenhuizen
a0c930d284 amd/llvm: Fix warning due to asserted-only variable.
[212/893] Compiling C object 'src/amd/llvm/ce8261c@@amd_common_llvm@sta/ac_nir_to_llvm.c.o'.
../mesa/src/amd/llvm/ac_nir_to_llvm.c: In function ‘visit_image_atomic’:
../mesa/src/amd/llvm/ac_nir_to_llvm.c:2636:17: warning: unused variable ‘format’ [-Wunused-variable]
 2636 |    const GLenum format = nir_intrinsic_format(instr);
      |                 ^~~~~~

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-08 10:22:56 +02:00
Boris Brezillon
71eda74f7c panfrost: Draw the wallpaper when only depth/stencil bufs are cleared
When only the depth/stencil bufs are cleared, we should make sure the
color content is reloaded into the tile buffers if we want to preserve
their content.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-08 10:07:54 +02:00
Boris Brezillon
c138ca80d2 panfrost: Make sure a clear does not re-use a pre-existing batch
glClear()s are expected to be the first thing GL apps do before drawing
new things. If there's already an existing batch targetting the same
FBO that has draws attached to it, we should make sure the new clear
gets a new batch assigned to guaranteed that the FB content is actually
cleared with the requested color/depth/stencil values.

We create a panfrost_get_fresh_batch_for_fbo() helper for that and
call it from panfrost_clear().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-08 10:07:54 +02:00
Kenneth Graunke
016c19bc89 iris: Update comment about 3-component formats and buffer textures
You can't render to PIPE_BUFFER so there's no reason to prefer RGBX.
PBO upload would like to use proper RGB textures as source data.
2019-10-07 23:11:45 -07:00
Chris Wilson
64207ebe66 iris: Allow packed RGB pbo uploads
Hitting any fallback path on Broxton as we require clflushing the whole
buffer even for an upload of a subtexture. However, since gallium
provides a pbo upload path, allow it to sample packed RGB if supported.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-07 23:11:38 -07:00
Tapani Pälli
e4a826b2c8 anv/android: fix images created with external format support
This fixes a case where user first creates image and then later binds it
with memory created from AHW buffer.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-08 07:19:05 +03:00
Bas Nieuwenhuizen
72665a0f1f meson: Always add LLVM coroutines module.
It gets used by the gallium auxiliary draw module, which gets used
pretty much always when LLVM is used as JIT.

At the same time most builds don't hit the issue here because the
shared library of LLVM contains all modules.

Fixes: d32690b43c ("gallivm: add coroutine pass manager support")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/951
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-10-08 03:24:49 +02:00
Timur Kristóf
3a08110d43 amd: Move all amd/common code that depends on LLVM to amd/llvm.
This commit is a step towards the goal of being able to build RADV
without LLVM. In the future we would like to offer the option to
use RADV solely with ACO. There is still a need for the common AMD
code located in amd/common but the LLVM specific parts need to be
separated.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-08 00:44:08 +00:00
Ilia Mirkin
738bbee603 nvc0: add support for GL_EXT_demote_to_helper_invocation
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-10-07 20:42:11 -04:00
Ilia Mirkin
71c34a51c3 gallium/tgsi: add support for DEMOTE and READ_HELPER opcodes
This mirrors the intrinsics in the GLSL IR. One could imagine an
alternate definition where reading the semantic would account for the
READ_HELPER functionality, but that feels potentially dodgy and could be
subject to CSE unpleasantness.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-07 20:41:59 -04:00
Marek Olšák
eec7b0a865 radeonsi: use simple_mtx_t instead of mtx_t
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-07 20:05:07 -04:00
Marek Olšák
5498a8d23c st/mesa: use simple_mtx_t instead of mtx_t
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-07 20:05:04 -04:00
Marek Olšák
732ea0b213 gallium: add PIPE_RESOURCE_FLAG_SINGLE_THREAD_USE to skip util_range lock
u_upload_mgr sets it, so that util_range_add can skip the lock.

The time spent in tc_transfer_flush_region decreases from 0.8% to 0.2%
in torcs on radeonsi.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-07 20:05:00 -04:00
Marek Olšák
59dd4dafb5 util: use simple_mtx_t for util_range
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-07 20:04:49 -04:00
Marek Olšák
3b2b83924e winsys/radeon: initialize SIMD properties in radeon_info
This was missed when I added them.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1839
Fixes: 0692ae34e9 ("ac: move ac_get_num_physical_sgprs into radeon_info")

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-07 18:44:19 -04:00
Kenneth Graunke
6d9c1f30e4 iris: Drop vtbl usage for some load_register calls
We can just call the actual functions directly.
2019-10-07 14:10:33 -07:00
Jordan Justen
ae9c311b9a iris/state: Move reg/mem load/store functions earlier in file
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-07 14:10:33 -07:00
Eric Engestrom
c84bd2b095 meson: drop unused inc_nir
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
1234505bd6 meson: drop duplicate inc_nir from spirv2nir
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
f5808e6088 meson: drop duplicate inc_nir from libglsl
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
326be1774c meson: drop duplicate inc_nir from libiris
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
7a1dc6ab44 meson: rename libnir to _libnir to make it clear it's not meant to be used anywhere else
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
3e95b2773f meson: use idep_nir instead of libnir in pipe-loader
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
612e70c594 meson: use idep_nir instead of libnir in haiku softpipe
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
1975c5a59d meson: use idep_nir instead of libnir in gallium nine
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
140d7e8b3a meson: use idep_nir instead of libnir in libclnir
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
a0a8b24078 meson: use idep_nir instead of libnir in libnouveau
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
731097c747 meson: add missing idep_nir_headers in iris_gen_libs
Fixes: 4929f020c3 ("iris: better SBE")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:49:40 +01:00
Eric Engestrom
721b880e4c script: drop get_reviewer.pl
This script doesn't make sense anymore in the age of GitLab.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-07 21:33:38 +01:00
Eric Engestrom
b91ae0379b meson/loader: drop unneeded *.h file
Meson automatically tracks any file included by a file it already tracks,
and `pci_id_driver_map.h` & `loader.h` are included by `loader.c`, while
`loader_dri3_helper.h` is included by `loader_dri3_helper.c`.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 21:30:16 +01:00
Eric Engestrom
b9157ea415 loader: use ARRAY_SIZE instead of NULL sentinel
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 21:30:16 +01:00
Eric Engestrom
5be6c8959c loader: s/int/bool/ for predicate result
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 21:30:16 +01:00
Eric Engestrom
26149d119b loader: replace int/1/0 with bool/true/false
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 21:30:16 +01:00
Eric Engestrom
6202a13b71 egl: replace MESA_EGL_NO_X11_HEADERS hack with upstream EGL_NO_X11
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-10-07 20:28:59 +00:00
Kenneth Graunke
90a35752b4 iris: Drop bonus parameters from iris_init_*_context()
Nothing uses vtbl or dbg, and screen is available from the batch.
2019-10-07 13:15:56 -07:00
Rhys Perry
2d78e55a8c nir/constant_folding: fold load_constant intrinsics
These can appear after loop unrolling.

v2: stylistic changes
v2: replace state->mem_ctx with state->shader
v2: add bounds checking
v3: use nir_intrinsic_range() for bounds checking
v3: fix issue where partially out-of-bounds reads are replaced with undefs
v4: fix merge conflicts during rebase
v5: split into two commits
v6: set constant_data to NULL after freeing (fixes nir_sweep()/Iris)
v7: don't remove the constant data if there are no constant loads

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v6)
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2019-10-07 19:49:53 +01:00
Rhys Perry
ec054a67da nir/constant_folding: add back and use constant_fold_state
Useful for load_constant folding.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-07 19:49:53 +01:00
Caio Marcelo de Oliveira Filho
f7ca072ab2 anv: Implement VK_KHR_shader_clock
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 09:12:12 -07:00
Caio Marcelo de Oliveira Filho
f20cea0162 spirv: Implement SPV_KHR_shader_clock
We only have the subgroup variant in NIR (equivalent to clockARB), so
only support that for now.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 09:12:12 -07:00
Caio Marcelo de Oliveira Filho
3f304617cb vulkan: Update the XML and headers to 1.1.124
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 09:12:12 -07:00
Kenneth Graunke
bd46dfa889 Revert "iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround"
This reverts commit 4f857423b3.

It caused GPU hangs on all affected platforms, in e.g.
Piglit bin/stencil-twoside -auto -fbo.
2019-10-07 09:08:41 -07:00
Tomeu Vizoso
c00f017e65 gitlab-ci/lava: Fix image to use in test jobs
In the test stage, we can use any of the two container images as we
arent going to do anything architecture-dependent when submitting the
jobs to LAVA.

But if we are in a pipeline in which the images need to be rebuilt and
one finishes much earlier than the other, it could happen that the test
job that executes first fails to find the container image.

To avoid that, have each job in the test stage to use the image that has
been already implicitly built by depending on the build job for the
given arch.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-07 07:31:55 -07:00
Boris Brezillon
8d0830de05 Revert "Revert "st/dri2: Implement DRI2bufferDamageExtension""
This reverts commit 19546108d3.
This commit breaks the build because lima implements
->set_damage_region(). I guess we'll need more discussion before
removing the ->set_damage_region() hook.
2019-10-07 12:24:51 +02:00
Boris Brezillon
19546108d3 Revert "st/dri2: Implement DRI2bufferDamageExtension"
This reverts commit 492ffbed63.

BACK_LEFT attachment can be outdated when the user calls
KHR_partial_update(), leading to a damage region update on the
wrong pipe_resource object.
Let's not expose the ->set_damage_region() method until the core is
fixed to handle that properly.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2019-10-07 11:38:26 +02:00
Tomeu Vizoso
555c0de8c6 gitlab-ci: Move LAVA-related files into top-level ci dir
In preparation for testing drivers other than Panfrost in LAVA labs.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-06 07:47:41 -07:00
Tomeu Vizoso
7b01f725dd gitlab-ci: Run dEQP on devices with Panfrost
Include Panfrost's gitlab.ci.yml file from Mesa's main .gitlab-ci.yml so
we test on devices with Panfrost.

This uses LAVA to schedule jobs in the devices and will be the base for
testing Etnaviv, Lima, etc.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-06 07:47:21 -07:00
Kenneth Graunke
4f857423b3 iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround
This is a port of Nanley's 904c2a617d
from i965 to iris.

One concern is that iris uses larger batches, and also emits far fewer
commands, so we may come closer to the 500 limit within a batch, and
could need to supplement this with actual counting.  Manhattan 3.0 had
239 3DSTATE_CONSTANT_PS packets in a batch,  Unigine Valley had 155.
So it seems like we're still in the realm of safety.
2019-10-05 17:18:45 -04:00
Kenneth Graunke
f1bba22f69 iris: Refactor push constant allocation so we can reuse it
We'll need this for a workaround shortly.  While refactoring, also
improve the comment slightly.
2019-10-05 17:18:44 -04:00
Lionel Landwerlin
12bf1308c4 intel/isl: set vertical surface alignment on null surfaces
Just following the spec. Somewhat unclear whether this applies to NULL
surfaces.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-05 20:54:33 +00:00
Lionel Landwerlin
ff1a5aadbf intel/isl: set surface array appropriately
This doesn't seem to affect anything.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-05 20:54:33 +00:00
Lionel Landwerlin
c445d6f66e intel/isl: Set null surface format to R32_UINT
It appears we never had a test in piglit or deqp sampling from a null
surface...

It turns out this triggers a hang on IVB only. Updating the null
surface format to R32_UINT fixes the hang on ivb and doesn't affect
other platforms, so set it by default for all platforms.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1872
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-05 20:54:33 +00:00
Jonathan Marek
1249cf19b0 etnaviv: set texture INT_FILTER bit
This should improve texture sampling performance on GC3000.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-05 20:31:36 +00:00
Jonathan Marek
c877142fca etnaviv: implement texture comparator
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-05 20:31:36 +00:00
Jonathan Marek
686e9fa0fb etnaviv: update headers from rnndb
Update to etna_viv commit 7ff8029.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-05 20:31:36 +00:00
Lionel Landwerlin
d36763b2a4 intel: fix subslice computation from topology data
We're missing the offset of the slice in the subslice mask...

This worked for most platforms that don't have first slice fused off
because we would reread the same mask from slice0 again and again...

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c1900f5b0f ("intel: devinfo: add helper functions to fill fusing masks values")
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1869
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2019-10-05 23:05:03 +03:00
Kenneth Graunke
396b410959 dri: Avoid swapbuffer throttling in glXCopySubBufferMESA
We were supplying __DRI2_THROTTLE_SWAPBUFFER, rather than the obvious
choice of __DRI2_THROTTLE_COPYSUBBUFFER.  This meant that we hit the
swap-based frame throttling.  glXCopySubBuffer doesn't seem like it's
intended to be a frame boundary, so we'd like to avoid this throttling.

Tested-by: Michel Dänzer <mdaenzer@redhat.com> # DRI3 only
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-05 13:19:37 +00:00
Kenneth Graunke
72beda4fb4 st/dri: Perform MSAA downsampling for __DRI2_THROTTLE_COPYSUBBUFFER
glXCopySubBufferMESA copies data from the back buffer to the front,
so it needs to perform a MSAA downsampling operation just like
glXSwapBuffers would.

Currently, the CopySubBuffer implementations supply a throttle reason
of __DRI2_THROTTLE_SWAPBUFFERS, so they hit this path and work today.
But we'd like to avoid swapbuffer throttling in this case, so the next
patch will change that reason.

Tested-by: Michel Dänzer <mdaenzer@redhat.com> # DRI3 only
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-10-05 13:19:37 +00:00
Prodea Alexandru-Liviu
6309c31fd8 scons/MSYS2-MinGW-W64: Fix build options defaults
Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>

When building in a MSYS2 Mingw-w64 environment Mesa3D sets wrong default build options which inevitably lead to build failure.
2019-10-05 08:43:13 +00:00
Lionel Landwerlin
907c2397f0 intel/error2aub: add support for platforms without PPGTT
Not much to do to enable this, just make sure to always write to the
GGTT :)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-04 22:31:15 +00:00
Rhys Perry
77ebb030ed aco: fix load_constant with multiple arrays
I thought I fixed this, but I guess I must have broken it again.

Fixes various dEQP-VK.draw.* tests

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-04 22:43:11 +01:00
Eric Anholt
ce76be9933 nir: Fix some wonky whitespace in nir_search.h.
Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt
3cc914921e nir: Factor out most of the algebraic passes C code to .c/.h.
Working on the algebraic implementation, I was being driven nuts by my
editor not highlighting and handling indentation for the C code.  It turns
out that it's basically not pass-specific code, and we can move it over to
the relevant .c file.  Replaces 30KB of code with 34KB of data on my i965
build.  No perf diff on shader-db (n=3)

Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt
c23db0df18 nir: Keep the range analysis HT around intra-pass until we make a change.
This lets us memoize range analysis work across instructions.  Reduces
runtime of shader-db on Intel by -30.0288% +/- 2.1693% (n=3).

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt
7025dbe794 nir: Skip emitting no-op movs from the builder.
Having passes generate these is just making more work for copy
propagation (and thus probably calling more optimization passes)
later.  Noticed while trying to debug nir_opt_algebraic()
top-to-bottom having O(n^2) behavior due to not finding new matches in
replacement code.

Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt
e7b754a05c nir: Make nir_search's dumping go to stderr.
Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Adam Jackson
3746ee912f surfaceless: Support EGL_WL_bind_wayland_display
Feature parity with the drm, x11, and wayland platforms.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1870
Tested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
2019-10-04 15:49:10 +00:00
Rhys Perry
1264acdf4b nir/print: always use the right FILE *
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-04 15:24:10 +00:00
Erik Faye-Lund
49b32233a0 nir: initialize needs_helper_invocations as well
Similar to the previous commit, we should also initialize
needs_helper_invocations here.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-04 14:55:40 +00:00
Erik Faye-Lund
1d6d2ca9f1 nir: initialize uses_discard to false
This matches what we do for uses_sample_qualifier, and what we
do in ir_set_program_inouts.cpp as well.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-04 14:55:40 +00:00
Rhys Perry
a87b0f5141 radv/aco,aco: set lower_fmod
This simplifies ACO and allows the lowered code to be optimized (in
particular, constant folded).

Totals from affected shaders:
SGPRS: 1776 -> 1776 (0.00 %)
VGPRS: 1436 -> 1436 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 203452 -> 203564 (0.06 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 103 -> 103 (0.00 %)

At least some of the code size increase seems to be from literals being
applied to instructions as a result of constant folding.

v2: remove fmod/frem handling in init_context()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-04 14:00:46 +00:00
Prodea Alexandru-Liviu
0fe2e04f2d scons/windows: Fix build with LLVM>=8
Fixes eebe091d29
("scons/windows: Enable compute shaders when possible.")
Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-10-04 13:48:08 +00:00
Michel Dänzer
b012f06d66 dri3: Pass __DRI2_THROTTLE_COPYSUBBUFFER from loader_dri3_copy_drawable
0 is __DRI2_THROTTLE_SWAPBUFFER, which doesn't really make sense here.

Avoids dri_flush() throttling twice for the same glFlush call with front
buffer rendering, as described in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2057 .

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-04 10:55:43 +02:00
Gert Wollny
7cbb44aa6a r600: Fix interpolateAtCentroid
If the instruction interpolateAtCentroid is used the extra interpolator
must also be enabled in the state.

Fixes: fs-interpolateatcentroid-block

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-10-04 10:09:01 +02:00
Dylan Baker
1481d05409 meson: Only error building gallium video without libdrm when the platform is drm
Fixes: 3b265f61f5
       ("meson: gallium media state trackers require libdrm with x11")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1878
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-10-03 22:14:20 -07:00
Alyssa Rosenzweig
dcd2f26b98 pan/midgard: Replace mir_is_live_after with new pass
Now that we have live_out calculated per block as metadata, calculating
liveness of an instruction at a given point in the program becomes O(n)
to the size of the block worst-case, rather than O(n) the program.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig
39a4b3ebe9 pan/midgard: Calculate temp_count for liveness
This needs to be correct or the analysis fails.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig
ad5fcac005 pan/midgard: Invalidate liveness for mir_is_live_after
Callers should have liveness info ready. Ideally we'd have a nice
metadata tracking framework like NIR to handle this automatically, but
for now this will allow us to make forward progress... when we're about
to do something with liveness, invalidate everything ahead to force a
clean calculation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig
3450c013c5 pan/midgard: Begin tracking liveness metadata
This will allow us to explicitly invalidate liveness analysis results so
we can cache liveness results.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig
846e5d5ba8 pan/midgard: Don't try to OR live_in of successors
By definition, once liveness analysis has occurred:

   live_out = OR {succ} succ->live_in

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig
013cd6bed2 pan/midgard: Move RA's liveness analysis into midgard_liveness.c
There are unfortunately two distinct liveness analysis passes in the
compiler right now -- one good (but complex) pass used by RA based on
solving data flow equations, and one awful (but simple) pass used for
dead code elimination and bundling based on an abstract walk of the AST.

Let's move RA's pass into shared code so we can work on unifying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig
76a76de7af pan/midgard: Add mir_calculate_temp_count helper
This allows us to fill in ctx->temp_count explicitly, even if we haven't
squished down the MIR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig
c59fae0fef pan/midgard: Remove mir_has_multiple_writes
We already enforce this with the SSA/register distinction in the
backend. There is no need to duplicate this logic merely for an assert.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Erik Faye-Lund
3f4be0d199 .mailmap: add a couple of aliases for Jakob Bornecrantz
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
2019-10-03 17:11:20 -04:00
Erik Faye-Lund
2eb916a58d .mailmap: add an alias for Tomeu Vizoso
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-10-03 17:11:10 -04:00
Erik Faye-Lund
27ae5c81f7 .mailmap: add an alias for Gert Wollny
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
2019-10-03 17:10:59 -04:00
Erik Faye-Lund
28b64049d0 .mailmap: add an alias for Alexandros Frantzis
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-10-03 17:10:28 -04:00
Erik Faye-Lund
b7baf70778 .mailmap: specify spelling for Elie Tournier
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-10-03 17:09:42 -04:00
Boris Brezillon
1ac33aae49 panfrost: Get rid of the flush in panfrost_set_framebuffer_state()
Now that we have track inter-batch dependencies, the flush done in
panfrost_set_framebuffer_state() is no longer needed. Let's get rid of
it.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
70cf93c4d7 panfrost: Kill the explicit serialization in panfrost_batch_submit()
Now that we have all the pieces in place to support pipelining batches
we can get rid of the drmSyncobjWait() at the end of
panfrost_batch_submit().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
0a12a16bae panfrost: Do fine-grained flushing when preparing BO for CPU accesses
We don't have to flush all batches when we're only interested in
reading/writing a specific BO. Thanks to the
panfrost_flush_batches_accessing_bo() and panfrost_bo_wait() helpers
we can now flush only the batches touching the BO we want to access
from the CPU.

This fixes the dEQP-GLES2.functional.fbo.render.texsubimage.* tests.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
2225383af8 panfrost: Make sure the BO is 'ready' when picked from the cache
This is needed if we want to free the panfrost_batch object at submit
time in order to not have to GC the batch on the next job submission.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
22190bc27b panfrost: Add flags to reflect the BO imported/exported state
Will be useful to make the ioctl(WAIT_BO) call conditional on BOs that
are not exported/imported (meaning that all GPU accesses are known
by the context).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
82399b58d3 panfrost: Add a panfrost_flush_batches_accessing_bo() helper
This will allow us to only flush batches touching a specific resource,
which is particularly useful when the CPU needs to access a BO.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
a45984b244 panfrost: Add a panfrost_flush_all_batches() helper
And use it in panfrost_flush() to flush all batches, and not only the
one currently bound to the context.

We also replace all internal calls to panfrost_flush() by
panfrost_flush_all_batches() ones.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
b5d8f9bbbf panfrost: Prepare panfrost_fence for batch pipelining
The panfrost_fence logic currently waits on the last submitted batch,
but the batch serialization that was enforced in
panfrost_batch_submit() is about to go away, allowing for several
batches to be pipelined, and the last submitted one is not necessarily
the one that will finish last.

We need to make sure the fence logic waits on all flushed batches, not
only the last one.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
2dad9fde50 panfrost: Start tracking inter-batch dependencies
The idea is to track which BO are being accessed and the type of access
to determine when a dependency exists. Thanks to that we can build a
dependency graph that will allow us to flush batches in the correct
order.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
40a07bfbd7 panfrost: Add a panfrost_freeze_batch() helper
We'll soon need to freeze a batch not only when it's flushed, but also
when another batch depends on us, so let's add a helper to avoid
duplicating the logic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
819738e4af panfrost: Use the per-batch fences to wait on the last submitted batch
We just replace the per-context out_sync object by a pointer to the
the fence of the last last submitted batch. Pipelining of batches will
come later.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
6936b7f319 panfrost: Add a batch fence
So we can implement fine-grained dependency tracking between batches.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
a8bd265cef panfrost: Make panfrost_batch->bos a hash table
So we can store the flags as data and keep the BO as a key. This way
we keep track of the type of access done on BOs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
ada752afe4 panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
The type of access being done on a BO has impacts on job scheduling
(shared resources being written enforce serialization while those
being read only allow for job parallelization) and BO lifetime (the
fragment job might last longer than the vertex/tiler ones, if we can,
it's good to release BOs earlier so that others can re-use them
through the BO re-use cache).

Let's pass extra access flags to panfrost_batch_add_bo() and
panfrost_batch_create_bo() so the batch submission logic can take the
appropriate when submitting batches. Note that this information is not
used yet, we're just patching callers to pass the correct flags here.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
12f790f7da panfrost: Add the shader BO to the batch in patch_shader_state()
We know a shader will be used by a batch when
panfrost_patch_shader_state() is called, so let's add the shader BO at
that time.

Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Andres Gomez
02c265be9d egl: Remove the 565 pbuffer-only EGL config under X11.
The CTS finally has agreed to drop the requirement for a
565-no-depth-no-stencil config for ES 3.0. Hence we can now remove the
code to satisfy this requirement using a pbuffer-only visual with
whatever other buffers the driver happens to have given us.

This reverts commit 82607f8a90,
commit 6ad31c4ff3 and
commit dacb11a585.

v2:
  - Reference the VK-GL-CTS issue (Eric E.).

v3:
  - Don't revert
    fc21394bc4 ("egl: Quiet warning about front buffer rendering for pixmaps/pbuffers")
    (Kenneth).

References: VK-GL-CTS issue 1601.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-03 23:51:46 +03:00
Dylan Baker
974e3ad004 bin: delete unused releasing scripts
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-03 20:15:19 +00:00
Dylan Baker
3226b12a09 release: Add an update_release_calendar.py script
This script is for updating post version bump.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-03 20:15:19 +00:00
Dylan Baker
86079447da scripts: Add a gen_release_notes.py script
This script is responsible for generating an entire page in the
docs/relnotes/ directory. It includes a template for the page, and uses
mako to fill in the necessary bits. It is designed to be purely fire and
forget, calculating previous versions, shortlogs, bug fixes, and dates.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-03 20:15:19 +00:00
Dylan Baker
7ff49c25ed docs: add a new_features.text file and remove 19.3.0 release notes
The next patch is going to introduce a tool that creates the entire
release html page for us, without any user intervention. As such we
can't be editing it. To that end the script will read the
new_features.txt file to get a list of new features.

This is a flat text file, one entry per line.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-10-03 20:15:19 +00:00
Rafael Antognolli
cdc331c6f9 anv/block_pool: Align anv_block_pool state to 64 bits.
On 64 bits platforms, some atomic operations like __sync_fetch_and_add()
have constant time, but on 32 bits platforms they are implemented with a
loop and might take much longer.

Additionally, it seems like if their operands are not aligned to 64
bits, they also require extra memory accesses. From the Intel
Architecture's Developer Manual Vol. 1, 4.1.1:

 "A word or doubleword operand that crosses a 4-byte boundary or a
 quadword operand that crosses an 8-byte boundary is considered
 unaligned and requires two separate memory bus cycles for access."

Forcing the u64 field to be aligned to 64 bits seems to make the unit
tests that are stressing this finish much faster.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-03 12:40:33 -07:00
Erik Faye-Lund
0103d4747a loader/dri3: do not blit outside old/new buffers
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-03 18:58:34 +00:00
Dylan Baker
9af6c38def docs: Add use of Closes: tag for closing gitlab issues
This replaces to old Bugzilla: tag, which no longer makes sense because
we don't use bugzilla anymore.

Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-03 17:45:51 +00:00
Anuj Phogat
0d60621101 intel/isl/icl: Use halign 8 instead of 4 hw workaround
v1 by Topi Pohjolainen
v2,v3 by Anuj Phogat:
- Apply for gen >= 11
- Remove wa_bug_xxx function
- Use helper functions

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-03 17:18:41 +00:00
Samuel Pitoiset
d861401554 ac/nir: remove unused code for nir_op_{fmod,frem}
RADV and RadeonSI both lower these two NIR instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-03 18:15:17 +02:00
Samuel Pitoiset
5ebe1a17e9 radv: enable lower_fmod for the LLVM path
This lowers fmod and frem at NIR level like RadeonSI. fmod is
already lowered directly in NIR->LLVM, and frem will be lowered by
LLVM anyways.

This fixes a LLVM crash with:
dEQP-VK.glsl.builtin.precision_fp16_storage32b.frem.compute.scalar.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-03 18:15:14 +02:00
Adam Jackson
1b87f4058d egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure
... because it's wrong to do so. The error path out of
dri2_initialize_drm ends with dri2_display_destroy, which calls
functions in the vtable we're trying to set up, so if we dlclose the
driver then those function pointers will point off into space and things
crash.

Noticed this because after !1923 eglinfo would crash when setting up the
GBM platform. This was something of a cascade failure, because my kernel
is too old for DRM_IOCTL_I915_GETPARAM to work without DRM_AUTH, so i965
wouldn't load. platform_drm.c then got very confused when it tries to
load swrast as a dri2 driver.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-03 09:39:51 -04:00
Bas Nieuwenhuizen
c837872fba radv: Fix warning in 32-bit build.
uintptr_t is 32 bits in a 32-bits build, resulting in shifting out
of bounds.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-03 13:06:08 +00:00
Bas Nieuwenhuizen
8ad3d8b178 radv: Fix condition for skipping the continue CS.
We need the continue CS for referencing the tess/GDS/sample position BOs.

Fixes: 46e52df34d "radv: add tessellation ring allocation support. (v2)"
Fixes: e1dc3ab753 "radv/gfx10: allocate GDS/OA buffer objects for NGG streamout"
Fixes: 1171b304f3 "radv: overhaul fragment shader sample positions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-03 13:02:07 +00:00
Michel Dänzer
4712fdf7ae gitlab-ci: Use per-job ccache
Instead of a single cache shared between all jobs, but reduce the
maximum cache size to 1.5G (from 5G).

Rationale for smaller cache:

Pulling & pushing a 5G cache could take a long time. Consider
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/684010 (click the "Show
complete raw" button to see timestamps): Pulling the cache took
1569927241-1569927194 = 47 seconds, pushing it 1569927671-1569927519
= 152, for a total of 199 seconds. The actual build took comparable
1569927518-1569927243 = 275 seconds, despite no cache hits from ccache.
In other words, the cache transfers almost doubled the job duration,
and they would have negated any build time benefits from ccache even
with a high cache hit rate.

Also, the smaller caches avoid blowing up storage requirements for them
too much.

Rationale for per-job caches:

Making a single cache significantly smaller might result in cached
build products from one job getting evicted by another job, reducing
the likelihood of cache hits from previous pipelines.

v2:
* Move up "ccache --max-size=1500M" call (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-03 09:26:11 +02:00
Gurchetan Singh
a1a5672118 virgl: honor winsys supplied metadata
To truly to do this correctly, we'll have to fix the discrepancy between
drm_virtgpu_3d_transfer_to_host and virtio_gpu_transfer_host_3d. However,
this is a good starting point.

Since virtio-gpu only supports self-import and export, this should be fine.
Let's only do WINSYS_HANDLE_TYPE_FD for this currently.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>
2019-10-02 17:57:59 -07:00
Gurchetan Singh
9bde8f3a8f virgl: modify internal structures to track winsys-supplied data
The winsys might supply dimensions that are different than
those we calculate.  In additional, it may supply virtualized
modifiers.

In practice, a stride != bpp * width and virtualized modifiers don't
happen yet, but the plan is to move in that direction.

Also make virgl_resource_layout static.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>
2019-10-02 17:57:53 -07:00
Gurchetan Singh
aad4127c41 virgl: modify resource_create_from_handle(..) callback
This commit makes no functional changes, just adds the revelant
plumbing.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>
2019-10-02 17:57:47 -07:00
Gurchetan Singh
2899bbe37a virgl: remove stride from virgl_hw_res
It's not used anywhere, and stride isn't really an intrinsic
property of a GEM buffer.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>
2019-10-02 17:57:40 -07:00
Lionel Landwerlin
1c6fdbc83c intel: fix topology query
i915 will report ENODEV on generations prior to Haswell because there
is no point in reporting values on those. This is prior any fusing
could happen on parts with identical PCI ids.

This query call was previously only triggered on generations that
support performance queries, which happens to match generation for
which i915 reports topology, but the commit pointed below started
using it on all generations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1860
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: 96e1c945f2 ("i965: Move device info initialization to common code")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2019-10-02 22:25:44 +00:00
Caio Marcelo de Oliveira Filho
faf98be290 docs: Fix GL_EXT_demote_to_helper_invocation name
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-02 14:33:42 -07:00
Samuel Pitoiset
a2a68d551c radv/gfx10: fix the ESGS ring size symbol
Random hangs no longer happen, I'm actually not sure if they were
related to this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 21:50:40 +02:00
Samuel Pitoiset
34be977f80 radv: fix build
Forgot to amend the commit before updating the MR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-02 20:37:43 +02:00
Samuel Pitoiset
4304162744 Revert "radv: disable viewport clamping even if FS doesn't write Z"
This was actually the wrong fix.

This reverts commit 0a313cc285.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 19:40:39 +02:00
Samuel Pitoiset
b8fe6189a9 radv: rework the slow depthstencil clear to write depth from PS
Make sure to export the expected clear values to the depth
stencil attachment.

This fixes dEQP-VK.pipeline.depth_range_unrestricted.* on GFX10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 19:31:51 +02:00
Samuel Pitoiset
e19d1ee2d1 radv/gfx10: fix NGG streamout with triangle strips for VS
The number of vertices has to be adjusted with the output primitive
type.

This fixes dEQP-VK.transform_feedback.simple.triangle_strip_*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:35 +02:00
Samuel Pitoiset
08ab13d340 radv/gfx10: fix storing/loading NGG stream outputs for GS
The GS outputs are stored differently in the LDS storage, they
are indexed by out_idx which is incremented for each stored DWORD.
Thus, we need a different path for exporting the stream outputs.

This fixes a bunch of CTS failures when NGG GS is force enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:32 +02:00
Samuel Pitoiset
3be21b5ab1 radv/gfx10: use the component mask when storing/loading NGG stream outputs
It's unnecessary to store/load more components that needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:30 +02:00
Samuel Pitoiset
60f8224171 radv/gfx10: fix storing/loading NGG stream outputs for VS and TES
The LDS storage allocated for stream outputs is 4 * N, where N
is the number of outputs. So, we have to store/load with N as index
and not with the output location as index.

This doesn't fix anything known but it should fix out-of-bounds
access and it also reduces the number of outputs written to the
LDS storage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:27 +02:00
Samuel Pitoiset
56e1b1ff0c radv/gfx10: add missing counter buffer to the BO list
The buffer isn't necessarily used before.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:25 +02:00
Samuel Pitoiset
683c5e27c7 radv/gfx10: add radv_device::use_ngg
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:06:01 +02:00
Eric Engestrom
2236cf24a7 git: delete .gitattributes
The last of these was deleted in 44a8e51354 ("d3d1x: Remove.")
over 6 years ago.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-10-02 13:29:55 +01:00
Gert Wollny
c5da8230de etnaviv: enable triangle strips only when the hardware supports it
Some hardware has a bug with triangle strips and it is signalled by the
flag BUG_FIXED8 whether this bug has been fixed. So only enable triangle
strips when this flag is set.

Thanks: Jonathan Marek and Christian Gmeiner for the pointers

v2: Add TODO to indicate that the handling should be refined
    (Jonathan & Christian)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-02 07:34:36 +00:00
Dylan Baker
d855e19b87 meson: remove -DGALLIUM_SOFTPIPE from st/osmesa
It's unused here, and undefined in scons. It is used in targets/osmesa,
but it's properly defined there already.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-01 12:34:27 -07:00
Lionel Landwerlin
2208d79dde mesa: don't forget to clear _Layer field on texture unit
On the Android Antutu benchmark we ran into an assert in ISL where the
(base layer + num layers) > total layers. It turns out the core of
mesa forgot to clear the _Layer variable, potentially leaving an
inconsistent value.

v2: Pull setting u->_Layer out of the conditional blocks (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-01 21:49:13 +03:00
Robin Murphy
563f8974d8 egl/gbm: Fix config validation
In converting to shift/size-based validation, we lost a condition from
the ARGB/XRGB equivalence check, which left it working one way round
but not the other, and broke applications like glmark2-es2-drm on some
platforms. Restore the equivalent check that *both* configs actually
have an alpha channel before considering a mismatch.

Fixes: 7b4ed2b513 ("egl: Convert configs to use shifts and sizes instead of masks")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-01 14:45:15 +01:00
Ken Mays
4943c89d6d haiku: fix Mesa build
1. The hgl.c file is a read-only file versus read-write.
Ref: src/gallium/state_trackers/hgl/hgl.c

2.  I've included the Haiku-specific patches I used to get a successful
build of Mesa 19.1.7 on Haiku using the meson/ninja build procedure.
Shows "[764/764] linking target ... libswpipe.so" at build completion.

v2:
Remove autotools files (Eric)

v3:
Update the patch

Reported-by: Ken Mays <kmays2000@gmail.com>
Tested-by: Ken Mays <kmays2000@gmail.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Alexander von Gluck IV <kallisti5@unixzen.com>
2019-10-01 10:31:02 +00:00
Michel Dänzer
e55df4c859 gitlab-ci: Set ccache path for cross compilers in meson cross file
Without this, meson didn't pick up ccache for cross builds.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-10-01 11:16:33 +02:00
Andres Gomez
f83874a405 docs/relnotes: add support for GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL 4.6 on i965 and iris
After 41549a18e6 ("i965: Enable OpenGL 4.6 for Gen8+"), i965
implements GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL 4.6.

After 15e439071d ("iris: Enable ARB_gl_spirv and ARB_spirv_extensions"),
iris implements GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL
4.6.

v2:
  - Explicit the support is for i965 and iris.

v3:
  - Add also GL_ARB_spirv_extensions to the release notes (Alejandro).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-01 12:09:48 +03:00
Kevin Strasser
641320ce02 egl: Fix implicit declaration of ffs
Found when building for Android in C99 mode. Include bitscan.h to ensure ffs is
available.

Fixes: 7b4ed2b5 ("egl: Convert configs to use shifts and sizes instead of masks")

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-30 14:33:43 -07:00
Rafael Antognolli
b9994cb8d5 intel/tools: Fix aubinator usage of rb_tree.
The order of comparison has changed, so we need to invert the logic of
"insert_left" when using rb_tree_insert_at().

Fixes: dae33052db (util/rb_tree: Reverse the order of comparison
                    functions).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-30 13:43:23 -07:00
Caio Marcelo de Oliveira Filho
089da33c4d docs/relnotes: Add EXT_demote_to_helper_invocation support on iris, i965
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
54f1de1c5c i965: Enable EXT_demote_to_helper_invocation
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
a3776df7b1 iris: Enable EXT_demote_to_helper_invocation
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
008de52305 gallium: Add PIPE_CAP_DEMOTE_TO_HELPER_INVOCATION
To enable EXT_demote_to_helper_invocation:

    This extension adds a "demote" keyword that is similar to "discard" but
    only suppresses subsequent writes and outputs to the framebuffer, and
    does not terminate the execution of the invocation. For the remainder
    of the execution, the invocation is "demoted" to act like a helper
    invocation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
61fa4b5707 glsl: Add helperInvocationEXT() builtin
From EXT_demote_to_helper_invocation, implemented with the existing
nir_intrinsic_is_helper_invocation.

Such builtin is necessary when using `demote` because we can't
redefine the value of gl_HelperInvocation (since it is an input
variable).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
3439956377 glsl: Parse demote statement
When the EXT_demote_to_helper_invocation extension is enabled,
`demote` is treated as a keyword, and produces an ir_demote.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
af1a6f0f77 glsl: Add ir_demote
To represent the new `demote` keyword when using
EXT_demote_to_helper_invocation extension.  Most of the changes are to
include it in the visitors.

Demote is not considered a control flow, so also include an empty
visit member function in ir_control_flow_visitor.

Only NIR actually supports `demote`, so assert the translations for
TGSI and Mesa's gl_program -- since the demote is not expected to
appear for those.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
c81b912eb7 mesa: Extension boilerplate for EXT_demote_to_helper_invocation
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Kenneth Graunke
309924c3c9 iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.
We can't just check for the BO base address, we need to check for the
full address including any offset we may have applied.  When updating
the address, we need to include the offset again.

Fixes: 5ad0c88dbe ("iris: Replace buffer backing storage and rebind to update addresses.")
2019-09-30 12:41:03 -07:00
Eric Engestrom
fa0dcaaae0 docs/install: drop autotools references
19.3 will be the 3rd release without autotools, people know it's gone by now.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-30 19:45:15 +01:00
Maya Rashish
c0330461c9 meson: Test for -Wl,--build-id=sha1
instead of hard-coding OS list. Helps Solaris ld builds.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Maya Rashish <coypu@sdf.org>
2019-09-30 18:38:14 +00:00
Dylan Baker
4913ad9a37 docs: remove stray newline
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-30 18:27:52 +00:00
Dylan Baker
bc2d73c36b docs: use https for mesonbuild.com
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-30 18:27:52 +00:00
Dylan Baker
5d11a828e1 docs: update install docs for meson
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-30 18:27:52 +00:00
Marek Olšák
a1545af079 ac/nir: fix GLSL imageSamples()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-30 14:21:42 -04:00
Marek Olšák
0cc233e3dc ac: add ac_build_image_get_sample_count from radeonsi
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-30 14:21:42 -04:00
Marek Olšák
39e638c14e ac/surface: don't allocate FMASK if there is no graphics
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-30 14:21:42 -04:00
Marek Olšák
f704fb7f0b tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes
radeonsi doesn't use the format and internal shaders don't set it.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-09-30 14:20:48 -04:00
Dylan Baker
3b265f61f5 meson: gallium media state trackers require libdrm with x11
v2: - update copyright year in all changed files
    - rebase on master

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-30 18:06:56 +00:00
Kenneth Graunke
a0a93763fb iris: Disable CCS_E for 32-bit floating point textures.
A while back, Michael Larabel noticed that Paraview's Wavelet Volume
case runs significantly slower on iris than i965.  It turns out this
is because we enable CCS_E for 32-bit floating point formats, while
i965 disables it, with an oblique comment saying that we benchmarked
it (on what exactly?) and determined that it was a loss.

Paraview uses both R32_FLOAT and R32G32B32A32_FLOAT, and I observed
large framerate drops when enabling CCS_E for either format.  However,
several other benchmarks (Aztec Ruins, many Synmark cases) use 16-bit
floating point formats, with no apparent ill effects.

So, disable compression for 32-bit float formats for now, but leave it
enabled for 16-bit float formats as they seem to be working fine.

Improves performance in Paraview's Wavelet Volume test by 62% on a
Skylake GT4e.

Fixes: 3cfc6a207b ("iris: Fill out res->aux.possible_usages")
2019-09-30 10:44:52 -07:00
Marek Olšák
4a0d2e2880 ac: reorder and print all radeon_info fields
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:21 -04:00
Marek Olšák
e8b1538587 ac: set the number of SDPs same as the number of TCCs
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:21 -04:00
Marek Olšák
b7c2f7c5a6 ac: fix num_good_cu_per_sh for harvested chips
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:20 -04:00
Marek Olšák
235ebe9163 radeonsi/gfx10: fix corruption for chips with harvested TCCs
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:20 -04:00
Marek Olšák
8cbe83445b ac: add radeon_info::tcc_harvested
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:20 -04:00
Marek Olšák
7d97013294 ac: fix incorrect vram_size reported by the kernel
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:20 -04:00
Marek Olšák
3c0938bece radeonsi/gfx10: fix L2 cache rinse programming
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:20 -04:00
Eric Engestrom
0efc253f02 etnaviv: fix bitmask typo
Fixes: d92689c46f ("etnaviv: nir: add native integers (HALTI2+)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-09-30 17:54:33 +01:00
Adam Jackson
855dc17fcf glx: Log the filename of the drm device if we fail to open it
Helps point the user to the specific device that's having issues, since
you're increasingly likely to have more than one.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/107
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-30 15:30:16 +00:00
pal1000
eebe091d29 scons/windows: Enable compute shaders when possible.
Tests done with llvm-config indicate that there are only 2 libraries in
irreader and not in engine, LLVMAsmParser and LLVMIRReader and both of them
are part of coroutines so I replaced irreader with coroutines and added
libraries unique to coroutines.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-09-30 15:49:46 +01:00
Alyssa Rosenzweig
7be00b2a06 pan/midgard: Allow scheduling conditions with constants
Now that we have constant adjustment logic abstracted, we can do this
safely. Along with the csel inversion patch, this allows many more
common csel ops to inline their condition in the bundle.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
c20063aa4a pan/midgard: Add csel invert optimization
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
f0f4b39548 pan/midgard: Add mir_flip helper
Useful for various operations on both commutative and anticommutative
ops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
10037ce523 pan/midgard: Tightly pack 32-bit constants
If we can reuse constant slots from other instructions, we would like to
do so to include more instructions per bundle.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
a3ca283bc1 pan/midgard: Allow writeout to see into the future
If an instruction could be scheduled to vmul to satisfy the writeout
conditions, let's do that and save an instruction+cycle per fragment
shader.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
12a70ccd9e pan/midgard: Allow 6 instructions per bundle
We never had a scheduler good enough to hit this case before! :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
34ff50cadd pan/midgard: Only one conditional per bundle allowed
There's no r32 to save ya after you use up r31 :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
2715bd02ee pan/midgard: Schedule to smul/sadd
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
57bac68fff pan/midgard: Extend choose_instruction for scalar units
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
e9edae3ecb pan/midgard: Don't double check SCALAR units
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
d3b3daa9d3 pan/midgard: Use new scheduler
We still emit in-order but we switch to using the bundles created from
the new scheduler, which will allow greater flexibility and room for
out-of-order optimization.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
1409af9fc7 pan/midgard: Add distance metric to choose_instruction
We require chosen instructions to be "close", to avoid ballooning
register pressure. This is a kludge that will go away once we have
proper liveness tracking in the scheduler, but for now it prevents a lot
of needless spilling.

v2: Lower threshold to 6 (from 8). Schedule is hurt, but a few shaders
that spilled excessively are fixed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Derp
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
e9571b53e1 pan/midgard: Add mir_choose_alu helper
Based on a given unit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
8462e82467 pan/midgard: Implement load/store pairing
We can bundle two load/store together. This eliminates the need for
explicit load/store pairing in a prepass, as well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
7cf4932410 pan/midgard: Extend csel_swizzle to branches
Conditions for branches don't have a swizzle explicitly in the emitted
binary, but they do implicitly get swizzled in whatever instruction
wrote r31, so we need to handle that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
c9ce5a92a0 pan/midgard: Add helpers for scheduling conditionals
Conditional instructions (csel and conditional branches) require their
condition to be written to a special condition pipeline register (r31.w
for scalar, r31.xyzw for vector). However, pipeline registers are live
only for the duration of a single bundle. As such, the logic to schedule
conditionals correct is surprisingly complex. Essentially, we see if we
could stuff the conditional within the same bundle as the csel/branch
without breaking anything; if we can, we do that. If we can't, we add a
dummy move to make room.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
6f92288e85 pan/midgard: Implement predicate->unit
This allows ALUs to select for each unit of the bundle separately.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
5a9a48b81a pan/midgard: Add predicate->exclude
A bit of a kludge but allows setting an implicit dependency of synthetic
conditional moves on the actual condition, fixing code generated like:

   vmul.feq r0, ..
   sadd.imov r31, .., r0
   vadd.fcsel [...]

The imov runs simultaneous with feq so it gets garbage results, but it's
too late to add an actual dependency practically speaking, since the new
synthetic imov doesn't have a node associated.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
6284f3ec25 pan/midgard: Add constant intersection filters
In the future, we will want to keep track of which components of
constants of various sizes correspond to which parts of the bundle
constants, like in the old scheduler. For now, let's just stub it out
for a simple rule of one instruction with embedded constants per bundle.
We can eventually do better, of course.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
941bdd2088 pan/midgard: Remove csel constant unit force
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
da18525b6f pan/midgard: Add mir_schedule_texture/ldst/alu helpers
We don't actually do any scheduling here yet, but add per-tag helpers to
consume an instruction, print it, pop it off the worklist.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
72a03bcafa pan/midgard: Add mir_choose_bundle helper
It's not always obvious what the optimal bundle type should be. Let's
break out the logic to decide.

Currently set for purely in-order operation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
b5396369d2 pan/midgard: Add mir_update_worklist helper
After we've chosen an instruction, popped it off, and processed it, it's
time to update the worklist, removing that instruction from the
dependency graph to allow its dependents to be put onto the worklist.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
826fd7308b pan/midgard: Add mir_choose_instruction stub
In the future, this routine will implement the core scheduling logic to
decide which instruction out of the worklist will be scheduled next, in
a way that minimizes cycle count and register pressure.

In the present, we are more interested in replicating in-order
scheduling with the much-more-powerful out-of-order model. So rather
than discriminating by a register pressure estimate, we simply choose
the latest possible instruction in the worklist.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
f48038b588 pan/midgard: Initialize worklist
This flows naturally from the dependency graph

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
a3b46c0db6 pan/midgard: Calculate dependency graph
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
adda411263 pan/midgard: Add flatten_mir helper
We would like to flatten a linked list of midgard_instructions into an
array of midgard_instruction pointers on the heap.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
0ecfcbf462 pan/midgard: Squeeze indices before scheduling
This allows node_count to be correct while scheduling.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
ad05e8a52c pan/midgard: Fix component count handling for ldst
It's not based on the writemask and it can't be inferred; it's just
intrinsic to the op itself.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
cc0544a0f5 pan/midgard: Add missing parans in SWIZZLE definition
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:11 -04:00
Daniel Schürmann
b3c1f601aa nouveau: set lower_sub = true
Subtractions are already implemented as additions anyway.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Eric Anholt
ca1aa5d225 v3d: Enable the late algebraic optimizations to get real subs.
This worked better than my original v3d-local pass for just subs, and is a
huge win over not producing subs.

total instructions in shared programs: 6408469 -> 6167932 (-3.75%)
total threads in shared programs: 153784 -> 154104 (0.21%)
total uniforms in shared programs: 2157078 -> 1905823 (-11.65%)
total max-temps in shared programs: 904546 -> 895796 (-0.97%)
total spills in shared programs: 4959 -> 4993 (0.69%)
total fills in shared programs: 6558 -> 6670 (1.71%)
total sfu-stalls in shared programs: 25845 -> 25175 (-2.59%)
total inst-and-stalls in shared programs: 6434314 -> 6193107 (-3.75%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Daniel Schürmann
1d29895e5b aco: call nir_opt_algebraic_late() exhaustively
57559 shaders in 28980 tests
Totals:
SGPRS: 2963407 -> 2959935 (-0.12 %)
VGPRS: 2014812 -> 2016328 (0.08 %)
Spilled SGPRs: 1077 -> 1077 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 10348 -> 10348 (0.00 %) dwords per thread
Code Size: 114545436 -> 114498084 (-0.04 %) bytes
LDS: 933 -> 933 (0.00 %) blocks
Max Waves: 375997 -> 375866 (-0.03 %)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Daniel Schürmann
0fb27f1e5a radv/aco: Don't lower subtractions
40228 shaders in 20236 tests
Totals:
SGPRS: 2045512 -> 2046496 (0.05 %)
VGPRS: 1430856 -> 1430464 (-0.03 %)
Spilled SGPRs: 1077 -> 1077 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 10348 -> 10348 (0.00 %) dwords per thread
Code Size: 77202840 -> 77151832 (-0.07 %) bytes
LDS: 863 -> 863 (0.00 %) blocks
Max Waves: 260729 -> 260754 (0.01 %)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Daniel Schürmann
239423d234 nir: Remove unnecessary subtraction optimizations
These optimizations are already covered after lowering.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Daniel Schürmann
99848a57b7 nir: recombine nir_op_*sub when lower_sub = false
There are some optimizations which are only implemented for additions
and some optimizations which assume that subtractions have been lowered.
By lowering all subtractions first and later recombine for backends
which prefer this option, we don't have to implement them twice.

This patch also moves lower_negate to nir_opt_algebraic_late() to enable
these optimizations for backends which make use of it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Daniel Schürmann
10e508c815 freedreno: Enable the nir_opt_algebraic_late() pass.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Eric Anholt
d54ae70ee7 vc4: Enable the nir_opt_algebraic_late() pass.
Upcoming changes to sub optimization will make this pass required.  Over
the course of that series, we see uniforms +.46%, instructions -.24%
(seems like a fine tradeoff -- uniforms are 1/2 the size of instructions
as far as cache occupancy)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Michel Dänzer
f2b8051d69 gitlab-ci: Add test-container:arm64 to needs: for arm64 test jobs
Without this, it was theoretically possible for the jobs to run before
the docker image was ready.

v2:
* Use - list syntax instead of [] (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-30 09:17:44 +02:00
Michel Dänzer
42a18280e4 gitlab-ci: Add needs: for x86 buster docker image
This allows most build jobs to run before the stretch or arm64 docker
images are ready.

v2:
* Use - list syntax instead of [] (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-30 09:17:38 +02:00
Michel Dänzer
88319f2678 gitlab-ci: Declare needs: for stretch docker image
This allows the *-old-llvm jobs to run before the buster docker images
are ready.

v2:
* Use - list syntax instead of [] (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-30 09:17:00 +02:00
pal1000
ffb0d3a25c scons: Fix MSYS2 Mingw-w64 build.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

This patch is based on 28e3f85e09/mingw-w64-mesa/link-ole32.patch but with tweaks to avoid MSVC build break when applied.

v2: Create Mingw platform alias pointing to windows host platform define to avoid spurious crosscompilation;

v3: Fix obviously wrong compiler flags for swr driver;

v4: Update original patch URL because it has been relocated;

v5: Don't bother patching autools stuff as it's not used by MSYS2 Mingw-w64 build and it's days are numbered anyway;

v6: After Mingw posix flag fix in 295851eb things are far simpler as we don't need more linking of uuid, ole32, version and shell32 than what is already in place.
2019-09-29 10:57:16 +01:00
pal1000
bcb4dfb14b scons/windows: Support build with LLVM 9.
As X86AsmPrinter component is gone, LLVMX86AsmPrinter got replaced
with LLVMRemarks, LLVMBitstreamReader and LLVMDebugInfoDWARF.

Tests done with llvm-config on both LLVM 8 and 9 indicate that
mcjit, bitwriter and x86asmprinter fully fit inside engine component.

On other platforms and with meson build mcdisassembler was used to replace
X86AsmPrinter but mcdisassembler also fully fits inside engine component
for LLVM>=8 according to same tests.

v2: Avoid duplicating code related to Mingw pthreads.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>

On 19.1 this patch does not apply cleanly without 88eb2a1f
2019-09-29 10:51:34 +01:00
Vasily Khoruzhick
336b021d36 lima: set uniforms_address lower bits properly
Looks like blob uses following values for uniforms buffer:

0 for 8 bytes
1 for 16 bytes
2 for 24 bytes
2 for 32 bytes
3 for 40 bytes
3 for 48 bytes
3 for 56 bytes
3 for 64 bytes
4 for 72 bytes

It all looks like log2(size / 8) rounded up, so let's do the same.

Fixes: 931fc2a7b3f9("lima: do not set the PP uniforms address lowest bits")
Reviewed-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-28 10:34:19 -07:00
Michel Zou
3f92d17894 scons: add py3 support
SCons 3.1 has moved to python 3, requiring this fix
to continue supporting scons builds.

Closes: #944
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Eric Engestrom <eric@engestrom.ch>
2019-09-28 16:53:08 +00:00
Mauro Rossi
411e50a8fd android: aco: add support for libmesa_aco
Android building rules are added in src/amd/Android.compiler.mk
libmesa_aco static library is built conditionally to radeonsi
as done for vulkan.radv module

This will prevent Android build errors for non x86 systems

filter-out compiler/aco_instruction_selection_setup.cpp source,
as already included by compiler/aco_instruction_selection.cpp
and would cause several multiple definition linker errors

NOTE: libLLVM requires AMDGPU Disassembler to build radv with aco

Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Fixes: a70a998 ("radv/aco: Setup alternate path in RADV to support the experimental ACO compiler")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-09-28 15:56:34 +02:00
Mauro Rossi
268fb10e9c android: compiler/nir: build nir_divergence_analysis.c
Prerequisite to avoid following radv linking error happening with aco

FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so
...
external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:178:
error: undefined reference to 'nir_divergence_analysis'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: df86c5f ("nir: add divergence analysis pass.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-09-28 15:56:28 +02:00
Mauro Rossi
c24ad565ae android: aco: fix undefined template 'std::__1::array' build errors
Fixes a few building errors similar to the following:

In file included from external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26:
In file included from external/libcxx/include/algorithm:639:
external/libcxx/include/utility:321:9:
error: implicit instantiation of undefined template 'std::__1::array<aco::Temp, 4>'
    _T2 second;
        ^

Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-09-28 15:56:23 +02:00
Jonathan Marek
b38fcaa221 etnaviv: nir: fix gl_FragDepth
Fixes the following piglit test: fragdepth_gles2 (for ETNA_MESA_DEBUG=nir)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:34:44 -04:00
Jonathan Marek
d4e35e62d2 etnaviv: disable earlyZ when shader writes fragment depth
Fixes the following piglit test: fragdepth_gles2

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:34:43 -04:00
Jonathan Marek
dc3656c9c4 etnaviv: nir: make lower_alu easier to follow
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:34:43 -04:00
Jonathan Marek
c4f63be5a6 etnaviv: remove extra allocation for shader code
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:34:43 -04:00
Jonathan Marek
0b3957331d etnaviv: nir: remove "options" struct
It just makes thing more complicated for no reason.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:34:43 -04:00
Jonathan Marek
8f1b2ea7a9 etnaviv: nir: use store_deref instead of store_output
Allows some simplification.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:34:43 -04:00
Jonathan Marek
d92689c46f etnaviv: nir: add native integers (HALTI2+)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:34:35 -04:00
Jonathan Marek
d446134d2a qetnaviv: nir: use new immediates when possible
Note it can still be improved a bit:
* Use alu swizzle to determine if src is scalar
* Take into account new immediates in the multiple uniform src lowering

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:33:42 -04:00
Jonathan Marek
95fa799c86 etnaviv: nir: set num_components for inputs/outputs
This can improve performance by allowing the LAST_VARYING_2X bit to be
set when possible (and possibility more benefits on HALTI5 where the
number of components is set for each varying).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:33:42 -04:00
Jonathan Marek
0036e078e3 etnaviv: nir: allocate contiguous components for LOAD destination
LOAD starts reading into the first enabled destination component, and
doesn't skip disabled components, so we need to allocate a destination with
contiguous components.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:33:42 -04:00
Jonathan Marek
7da15bdd2d etnaviv: nir: fix gl_FrontFacing
Only invert front facing when glFrontFace is GL_CW.

Fixes following deqp test:
dEQP-GLES2.functional.shaders.builtin_variable.frontfacing

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-28 00:33:33 -04:00
Icenowy Zheng
931fc2a7b3 lima: do not set the PP uniforms address lowest bits
The PP uniforms address register in render state is not a direct pointer
to the uniforms storage -- instead, it points to an one-item array, and
the array item is the real pointer to the uniforms storage.

This register reuses some of its LSBs as a size field. Currently the
size is set according to the length of the real uniforms storage.
However, as the register itself contains only a pointer to the one-item
array, the size field should be set to the length of the one-item array
and subtract it by 1, which means a fixed value of 0. That means we can
just omit it now.

Test shows this should be the correct approach to set this register.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-28 08:49:20 +08:00
Andrii Simiklit
b32bb888c7 glsl: disallow incompatible matrices multiplication
glsl 4.4 spec section '5.9 expressions':
"The operator is multiply (*), where both operands are matrices or one operand is a vector and the
 other a matrix. A right vector operand is treated as a column vector and a left vector operand as a
 row vector. In all these cases, it is required that the number of columns of the left operand is equal
 to the number of rows of the right operand. Then, the multiply (*) operation does a linear
 algebraic multiply, yielding an object that has the same number of rows as the left operand and the
 same number of columns as the right operand. Section 5.10 “Vector and Matrix Operations”
 explains in more detail how vectors and matrices are operated on."

This fix disallows a multiplication of incompatible matrices like:
mat4x3(..) * mat4x3(..)
mat4x2(..) * mat4x2(..)
mat3x2(..) * mat3x2(..)
....

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111664
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-09-27 21:42:09 +00:00
Eric Anholt
67e8977290 turnip: Fix failure behavior of vkCreateGraphicsPipelines.
According to the 1.1.123 spec:

    "The implementation will attempt to create all pipelines, and only
     return VK_NULL_HANDLE values for those that actually failed."

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-27 13:34:28 -07:00
Eric Anholt
ab3cf128a6 turnip: Silence compiler warning about uninit pipeline.
The code was fine as far as I see, but the warning was irritating.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-27 13:34:28 -07:00
Eric Anholt
a6cc68106c turnip: Add a .editorconfig and .dir-locals.el
I was inheriting the one from src/freedreno with funny tabs, while
this driver is written with normal Mesa 3-space indents.
Unfortunately I have to add both files, because I use emacs and emacs
prefers .dir-locals to .editorconfig :(

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-27 13:34:28 -07:00
Eric Anholt
7a4647ee39 shader_enums: Move MAX_DRAW_BUFFERS to this file.
We include shader_enums.h from freedreno's compiler for both GL and
Vulkan, and the main/config.h include resulted in polluting the
namespace with things like MAX_VIEWPORTS that other Vulkan drivers use
as their driver-specific maximums.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-27 13:34:28 -07:00
Jason Ekstrand
6c858b9a91 intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates
Without this, we were DCEing flag writes because we didn't think their
results were used because we didn't understand that an ANY32 predicate
actually read all the flags.

Fixes: df1aec763e "i965/fs: Define methods to calculate the flag..."
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-27 19:31:43 +00:00
Christian Gmeiner
2391ef7785 etnaviv: support ARB_framebuffer_object
Passes most of piglit's tests regarding arb_framebuffer_object
and unlocks some more piglit tests.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-09-27 18:22:08 +00:00
Christian Gmeiner
fd1ed6f4f8 etnaviv: etna_resource_copy_region(..): drop assert
We are using util_resource_copy_region(..) as fallback which supports
different formats for src and dst. Improves the experience when running
deqp or piglit with a debug build.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-09-27 18:22:08 +00:00
Dylan Baker
e456a053c3 meson: Link xvmc with libxv
Prior to xvmc 1.0.12 libxvmc incorrectly required libxv, but that was
fixed. This results in compilation failures for the gallium xvmc tracker
and tools. This patch fixes that by explicitly linking to libxv.

Fixes: 22a817af8a
       ("meson: build gallium xvmc state tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1844
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-09-27 16:39:01 +00:00
Dylan Baker
8c5c21d7e3 meson: Try finding libxvmcw via pkg-config before using find_library
This fixes cross compiling issues, because pkg-config is less likely to
get the wrong libs.

v2: - Fix typo in comment

Fixes: 22a817af8a
       ("meson: build gallium xvmc state tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/939
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-09-27 16:39:01 +00:00
Andreas Gottschling
c5a2ccec5e drisw: Fix shared memory leak on drawable resize
XDestroyImage will mark the segment as to-be-destroyed, but it will
persist until we detach it, and we weren't doing so.

Cc: mesa-stable@lists.freedesktop.org
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/121
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-09-27 16:06:05 +00:00
Adam Jackson
90d58286cc drisw: Fix and simplify drawable setup
We don't want to require a visual for the drawable, because there exist
fbconfigs that don't correspond to any visual (say a 565 pixmap|pbuffer
config on a depth-24 display). Fortunately, we don't need one either.
Passing the visual to XCreateImage serves only to fill in the XImage's
{red,green,blue}_mask fields, which libX11 itself never uses, they exist
only for the client's convenience, and we don't care. And we already
have the drawable depth in glx_config::rgbBits. So replace the
XVisualInfo field in the drawable private with a pointer to the
glx_config.

Having done that driswCreateGCs becomes trivial, so inline it into its
caller.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1194
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-27 11:18:15 -04:00
Adam Jackson
3c0eb762e2 drisw: Simplify GC setup
There's no reason to have two GCs here. The only difference between
them is that swapgc would generate graphics exposures, except we only
ever use this GC for PutImage, and PutImage doesn't generate graphics
exposures. We also don't need to explicitly ChangeGC to GXCopy, because
that's the default.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-27 11:18:10 -04:00
Bas Nieuwenhuizen
e4a52bd653 turnip: Add todo for d24_s8 copies
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-27 15:05:21 +02:00
Bas Nieuwenhuizen
fa522b8a47 turnip: Disallow NPoT formats.
Copying is a mess for these formats for now.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-27 15:05:21 +02:00
Bas Nieuwenhuizen
9e822957cd turnip: Always use UINT formats for copies.
Looks like r16_unorm might have precision issues.

dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.color.r16_unorm.r16_unorm.general_general

fails, but the dumped images in the xml are the same so
I'd guess the low bits are the issue.

r8_unorm and r16_uint work.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-27 15:05:21 +02:00
Bas Nieuwenhuizen
b48fe29e3c turnip: Add image->image blitting.
3D blits & format reinterpretation are still TBD.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-27 15:05:21 +02:00
Rhys Perry
1f2813e103 aco: don't remove the loop exec mask in transition_to_Exact()
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-27 10:57:03 +01:00
Rhys Perry
b711e62e61 aco: set loop_info::has_discard for demotes
We need the loop header phis for the outer exec masks. Needed for
dEQP-VK.glsl.demote.dynamic_loop_texture

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-27 10:57:03 +01:00
Kenneth Graunke
237c7636ca iris: Only resolve for image levels/layers which are actually in use.
There's no need to resolve everything.
2019-09-26 22:49:10 -07:00
Vasily Khoruzhick
6dd0ad66de lima/ppir: add NIR pass to split varying loads
NIR may emit a single instrinsic to load several packed varyings,
but that's suboptimal for Utgard PP for several reasons:
- varyings that are used as sampler inputs can be passed using
  pipeline register with increased precision
- we have small number of regs, so using a vec4 regs for storing
  two vec2 varyings increases reg pressure.

Add NIR pass to split a single load into several loads and utilize
it in lima.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-26 18:51:10 -07:00
Timur Kristóf
c372dc762d radv: Fix L2 cache rinse programming.
According to radeonsi, GLM doesn't support WB alone, so
we have to set INV too when WB is set.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 22:18:16 +00:00
Jonathan Marek
8727253329 turnip: emit texture and uniform state
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
cb14f56b4f turnip: add some shader information in pipeline state
This information is needed by texture/uniform descriptors.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
ee4fa15a86 turnip: use nir_opt_copy_prop_vars
Avoids getting a "load_output" in a case like this:

   gl_Position = ubuf.MVP * ubuf.position[gl_VertexIndex];
   frag_pos = gl_Position.xyz;

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
b54f9e9e9e turnip: lower samplers and uniform buffer indices
Lower these to something compatible with ir3, and save the descriptor set
and binding information.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
c39afe68f0 turnip: basic descriptor sets (uniform buffer and samplers)
Mostly copy-paste from radv, with a few modifications.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
386f46ea82 turnip: enable linear filtering
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
02ca326a04 turnip: align layer_size
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
195abadd2c turnip: use linear tiling for scanout image
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
54c80d080a turnip: implement image view descriptor
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
5f2fb904a1 turnip: implement sampler state
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
53277757aa turnip: fix vertex_id
ir3 uses non-zero based vertex id for a6xx

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jonathan Marek
1e8aff9ff3 turnip: emit shader immediates
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-09-26 17:18:13 -04:00
Jason Ekstrand
5ca4f57469 util/rb_tree: Stop relying on &iter->field != NULL
The old version of the iterators relies on a &iter->field != NULL check
which works fine on older GCC but newer GCC versions and clang have
optimizations that break if you do pointer math on a null pointer.  The
correct solution to this is to do the null comparisons before we do any
sort of &iter->field or use rb_node_data to do the reverse operation.

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-26 20:36:41 +00:00
Jason Ekstrand
f18aad6dc0 util/rb_tree: Also test _safe iterators
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-26 20:36:41 +00:00
Eric Anholt
3338d6e5f8 freedreno/a3xx: Mostly fix min-vs-mag filtering decisions on non-mipmap tex.
This is based on the fix I used for the same problem on V3D.  In this
case, it fixes all but the the
dEQP-GLES2.functional.texture.filtering.2d.*_npot cases of
dEQP-GLES2.functional.texture.filtering.2d.*'s failures.

Acked-by: Rob Clark <robdclark@chromium.org>
2019-09-26 11:27:31 -07:00
Maya Rashish
e16fadd545 intel/compiler: avoid truncating int64_t to int
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Maya Rashish <maya@netbsd.org>
2019-09-26 17:46:26 +00:00
Icenowy Zheng
a1ff8dbb1e lima: support rectangle texture
As Vasily discovered, the bit 7 of the word 1 of the texture descriptor
is set when reloading the framebuffer, to use framebuffer-based offset
rather than normalized one. This bit also works for regular textures to
enable accessing with non-normalized offset.

Add support for rectangle texture by setting this bit for
PIPE_TEXTURE_RECT.

Suggested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-26 16:52:22 +00:00
Michel Dänzer
eb03141f52 loader: Avoid use-after-free / use of uninitialized local variables
Per the valgrind output below, we were returning the pointer to freed
memory if none of the later conditional pointer assignments were
executed. This caused dEQP CI jobs to crash on certain runners,
presumably due to a double-free down the line.

Also, we were skipping to the out: label before the vendor_id & chip_id
variables used by it were initialized, resulting in broken
LIBGL_DEBUG=verbose output such as

libGL: pci id for fd 4: 51108f00:51108f00, driver radeonsi

Fixes: 5a545e355b "loader: always map the "amdgpu" kernel driver name to radeonsi (v2)"

==403== Invalid read of size 1
==403==    at 0x4AFD576: surfaceless_probe_device (platform_surfaceless.c:316)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DE07A: deqp::gles2::Context::Context(tcu::TestContext&) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DB5EF: deqp::gles2::TestPackage::init() (in /deqp/modules/gles2/deqp-gles2)
==403==  Address 0x56bd340 is 0 bytes inside a block of size 4 free'd
==403==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==403==    by 0x4B01767: loader_get_driver_for_fd (loader.c:464)
==403==    by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2)
==403==  Block was alloc'd at
==403==    at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==403==    by 0x4EE5E09: strndup (strndup.c:43)
==403==    by 0x4B010B1: loader_get_kernel_driver_name (loader.c:101)
==403==    by 0x4B016AF: loader_get_driver_for_fd (loader.c:462)
==403==    by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-26 18:00:34 +02:00
Adam Jackson
8b6d3f2c78 Revert "glx: Lift sending the MakeCurrent request to top-level code"
Apparently this provokes crashes elsewhere in code unrelated to
MakeCurrent. I hate GLX so very very much.

This reverts commit 999c2aed88.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207
2019-09-26 11:07:42 -04:00
Adam Jackson
a14e3b43be Revert "glx: Implement GLX_EXT_no_config_context"
This reverts commit 0d635ccc91.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207
2019-09-26 11:07:13 -04:00
Timur Kristóf
30f0c0ea7d radv: Add debug option to dump meta shaders.
This new option can help debug shader compiler problems when
there are issues with the meta shaders.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 13:36:49 +00:00
Timur Kristóf
a4fd8ba7e3 amd/common: Introduce ac_get_fs_input_vgpr_cnt.
Add a function called ac_get_fs_input_vgpr_cnt which will return
the number of input VGPRs used by an AMD shader. Previously,
radv and radeonsi had the same code duplicated, but this commit also
allows them to share this code.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 13:36:49 +00:00
Timur Kristóf
83eebdb507 radv: Set shared VGPR count in radv_postprocess_config.
This commit allows RADV to set the shared VGPR count according to
the shader config.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 13:36:49 +00:00
Timur Kristóf
7bde4ddaf7 amd/common: Add num_shared_vgprs to ac_shader_config for GFX10.
In GFX10 wave64 mode, shared VGPRs allow the two wave halves to
share some data with each other.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 13:36:49 +00:00
Timur Kristóf
db1fddcf0f amd/common: Extract some helper functions to ac_shader_util.
This commit moves ac_get_tbuffer_format, ac_get_sampler_dim and
ac_get_image_dim into ac_shader_util, thus enabling them to be used
by compilers other than LLVM.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 13:36:49 +00:00
Timur Kristóf
d8b46f8964 amd/common: Move ac_export_mrt_z to ac_llvm_build.
The aim of this commit is to keep ac_shader_util LLVM-free,
since we would like to use it in ACO later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 13:36:49 +00:00
Rhys Perry
06ea3325c3 aco: CSE readlane/readfirstlane/permute/reduce with the same exec mask
v2: rename pass_temp to pass_flags
v2: also CSE reductions
v3: add ds_swizzle_b32 support
v3: check gds/offset0/offset1 fields

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-26 13:19:51 +01:00
Rhys Perry
86ecf92c23 aco: don't CSE v_readlane_b32/v_readfirstlane_b32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-26 13:19:51 +01:00
Rhys Perry
3c966fd688 aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:47 +01:00
Rhys Perry
ec8ced9123 radv/aco: return a correct name and description for the backend IR
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:43 +01:00
Rhys Perry
15ea1c5cff aco: store printed backend IR in binary
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:31 +01:00
Rhys Perry
6613b81327 aco,radv/aco: get dissassembly for release builds if requested
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:09 +01:00
Rhys Perry
0aef1a230e radv/aco: actually disable ACO when unsupported
We were setting this twice. The second time, we weren't later disabling
it if unsupported.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:04:45 +01:00
Tapani Pälli
031752798b mesa/st: calculate texture size based on EGLImage miplevel
Fixes issues with 'egl-gl_oes_egl_image' Piglit test.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 07:55:24 +03:00
Dylan Baker
fafd20f67d meson: fix logic for generating .pc files with old glvnd
We want to generate PC files for non-glvnd builds and for builds with
old glvnd, but the current logic doesn't do that, it builds them
unconditionally, and for GLES it builds the shared libraries, which is
also not what we want. This does not generate .pc files for gles1 or
gles2. Which it we weren't doing before either, making this not a
regression but a return to status-quo.o

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1838
Fixes: 93df862b6a
       ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-25 23:25:27 +00:00
Ian Romanick
7e53bebcb5 nir/range-analysis: Use types to provide better ranges from bcsel and mov
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16328255 -> 16315391 (-0.08%)
instructions in affected programs: 218318 -> 205454 (-5.89%)
helped: 988
HURT: 0
helped stats (abs) min: 1 max: 72 x̄: 13.02 x̃: 10
helped stats (rel) min: 0.33% max: 16.04% x̄: 6.27% x̃: 4.88%
95% mean confidence interval for instructions value: -13.69 -12.35
95% mean confidence interval for instructions %-change: -6.55% -5.99%
Instructions are helped.

total cycles in shared programs: 363683977 -> 363615417 (-0.02%)
cycles in affected programs: 1475193 -> 1406633 (-4.65%)
helped: 923
HURT: 36
helped stats (abs) min: 1 max: 624 x̄: 75.78 x̃: 48
helped stats (rel) min: 0.08% max: 13.89% x̄: 5.20% x̃: 5.08%
HURT stats (abs)   min: 1 max: 179 x̄: 38.58 x̃: 4
HURT stats (rel)   min: 0.06% max: 16.56% x̄: 3.33% x̃: 0.29%
95% mean confidence interval for cycles value: -75.88 -67.10
95% mean confidence interval for cycles %-change: -5.10% -4.66%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10785779 -> 10785654 (<.01%)
instructions in affected programs: 13855 -> 13730 (-0.90%)
helped: 67
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 1.87 x̃: 1
helped stats (rel) min: 0.20% max: 3.45% x̄: 0.97% x̃: 0.78%
95% mean confidence interval for instructions value: -2.47 -1.26
95% mean confidence interval for instructions %-change: -1.13% -0.81%
Instructions are helped.

total cycles in shared programs: 153704799 -> 153704481 (<.01%)
cycles in affected programs: 101509 -> 101191 (-0.31%)
helped: 38
HURT: 13
helped stats (abs) min: 1 max: 38 x̄: 12.53 x̃: 16
helped stats (rel) min: 0.07% max: 2.69% x̄: 0.87% x̃: 0.53%
HURT stats (abs)   min: 1 max: 36 x̄: 12.15 x̃: 7
HURT stats (rel)   min: 0.06% max: 2.53% x̄: 0.73% x̃: 0.44%
95% mean confidence interval for cycles value: -10.24 -2.24
95% mean confidence interval for cycles %-change: -0.75% -0.17%
Cycles are helped.

LOST:   2
GAINED: 0

No shader-db change on Iron Lake or GM45.
2019-09-25 15:37:05 -07:00
Ian Romanick
99ddb41e2d nir/range-analysis: Use types in the hash key
This allows the reslut of mov and bcsel to be separately interpreted as
float or int depending on the use.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-25 15:37:01 -07:00
Ian Romanick
018d2b524a nir/range-analysis: Bail if the types don't match
Some shaders are hurt by this change because now a
load_const(0x00000000) is not recognized as eq_zero when loaded as a
float.  This behavior is restored in a later patch (nir/range-analysis:
Use types to provide better ranges from bcsel and mov).

v2: Add a comment about reinterpretation of int/uint/bool.  Suggested by
Caio.  Rewrite condition the check for types being float versus checking
for types not being all the things that aren't float.

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16327543 -> 16328255 (<.01%)
instructions in affected programs: 55928 -> 56640 (1.27%)
helped: 0
HURT: 208
HURT stats (abs)   min: 1 max: 16 x̄: 3.42 x̃: 3
HURT stats (rel)   min: 0.33% max: 6.74% x̄: 1.31% x̃: 1.12%
95% mean confidence interval for instructions value: 3.06 3.79
95% mean confidence interval for instructions %-change: 1.17% 1.46%
Instructions are HURT.

total cycles in shared programs: 363682759 -> 363683977 (<.01%)
cycles in affected programs: 325758 -> 326976 (0.37%)
helped: 44
HURT: 133
helped stats (abs) min: 1 max: 179 x̄: 33.61 x̃: 5
helped stats (rel) min: 0.06% max: 14.21% x̄: 2.47% x̃: 0.29%
HURT stats (abs)   min: 1 max: 157 x̄: 20.28 x̃: 14
HURT stats (rel)   min: 0.07% max: 14.44% x̄: 1.42% x̃: 0.73%
95% mean confidence interval for cycles value: 0.38 13.39
95% mean confidence interval for cycles %-change: -0.06% 0.96%
Inconclusive result (%-change mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10787433 -> 10787443 (<.01%)
instructions in affected programs: 1842 -> 1852 (0.54%)
helped: 0
HURT: 10
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.33% max: 1.85% x̄: 0.73% x̃: 0.49%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.36% 1.10%
Instructions are HURT.

total cycles in shared programs: 153724543 -> 153724563 (<.01%)
cycles in affected programs: 8407 -> 8427 (0.24%)
helped: 1
HURT: 3
helped stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18
helped stats (rel) min: 0.98% max: 0.98% x̄: 0.98% x̃: 0.98%
HURT stats (abs)   min: 4 max: 18 x̄: 12.67 x̃: 16
HURT stats (rel)   min: 0.21% max: 0.75% x̄: 0.56% x̃: 0.72%
95% mean confidence interval for cycles value: -21.31 31.31
95% mean confidence interval for cycles %-change: -1.11% 1.46%
Inconclusive result (value mean confidence interval includes 0).

No shader-db changes on Iron Lake or GM45.
2019-09-25 15:37:01 -07:00
Lionel Landwerlin
e5ddbd7a3c intel: Add new Comet Lake PCI-ids
Commit bfc4c359b282 ("drm/i915/cml: Add Missing PCI IDs") in i915
added 3 new CML PCI ids.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-26 01:13:28 +03:00
Lionel Landwerlin
813f3460e7 intel: use proper label for Comet Lake skus
Fixes: 82f6a746e8 ("intel: Add support for Comet Lake")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-26 01:13:28 +03:00
Kristian H. Kristensen
06d207a2fa freedreno/a6xx: Move instrlen and obj_start writes to fd6_emit_shader
Consolidate a few more generic shaders setup regs in fd6_emit_shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-25 21:39:08 +00:00
Kristian H. Kristensen
cf695ad2ec freedreno/a6xx: Emit const and texture state for HS/DS/GS
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-25 21:39:08 +00:00
Kristian H. Kristensen
87d234d968 freedreno/ir3: Add HS/DS/GS to shader key and cache
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-25 21:39:08 +00:00
Kristian H. Kristensen
d9c2ceddd2 freedreno/a6xx: Add generic program stateobj support for HS/DS/GS
This add generic stage state setup for HS/DS/GS to the program state
object.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-25 21:39:08 +00:00
Kristian H. Kristensen
64bc833f32 freedreno: Move fs functions after geometry pipeline stages
Let's try to always order the stages in the pipeline order.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-25 21:39:08 +00:00
Kristian H. Kristensen
00cbb6db09 freedreno: Add state binding functions for HS/DS/GS
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-25 21:39:08 +00:00
Kristian H. Kristensen
2dc4d6c692 freedreno: Rename vp and fp to vs and fs in fd_program_stateobj
We're using vs and fs now, and adding hs, ds and gs soon.  It's
confusing enough that we have both DS/TCS and HS/TES. At least for VS
and FS there doesn't have to be multiple names.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-25 21:39:08 +00:00
Kristian H. Kristensen
c99ecf7f96 freedreno/a6xx: Factor out const state setup
We'll be sharing this logic for new shader stages soon.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-25 21:39:08 +00:00
Eric Engestrom
b3e3af0e37 glsl: turn runtime asserts of compile-time value into compile-time asserts
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-09-25 21:14:52 +00:00
Eric Engestrom
ae8a7d5c8f docs/release-calendar: add missing <td> and </td>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-25 22:13:07 +01:00
Eric Engestrom
f9bb5cd105 docs/release-calendar: fix bugfix release numbers
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-25 22:13:07 +01:00
Lionel Landwerlin
da2d67fc3b anv: gem-stubs: return a valid fd got anv_gem_userptr()
Fixes invalid close(-1) in the unit tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-25 22:02:51 +03:00
Danylo Piliaiev
2d8f77db83 st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader
RET as a last instruction could be safely ignored.
Remove it to prevent crashes/warnings in case underlying driver
doesn't implement arbitrary returns.

A better way would be to remove the RET after the whole shader
is parsed which will handle a possible case when the last RET is
followed by a comment.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
2019-09-25 18:24:01 +00:00
Dylan Baker
60861388e7 bin/get-pick-list: use --oneline=pretty instead of --oneline
--oneline shortens hashes, while --oneline=pretty doesn't, otherwise
they are the same. Having full hashes is convenient as that is the
format that the bin/.cherry-ignore script requires to work correctly.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2019-09-25 17:50:19 +00:00
Dylan Baker
c8fa996dcf release: Push 19.3 back two weeks
The main reason to do this is that 19.2 has slipped by two weeks, and
such the 19.3 branch is due to happen extremely close to the release of
19.2.0. I think it would be better to have a little more time between
releases for developers and for packagers.

This would still have the 19.3 release out before December, even if it
slips by 1 week.

Acked-By: Karol Herbst <kherbst@redhat.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
2019-09-25 10:46:59 -07:00
Dylan Baker
666a5a2230 docs: update calendar, add news item, and link release notes for 19.2.0 2019-09-25 10:42:17 -07:00
Dylan Baker
582421285b docs: add SHA256 sum for 19.2.0 2019-09-25 10:42:17 -07:00
Dylan Baker
8302eb7a8f docs: Add release notes for 19.2.0 2019-09-25 10:42:17 -07:00
Andreas Baierl
0c199808bc lima/ppir: Add various varying fetch sources to disassembler
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-25 16:57:31 +00:00
Eric Engestrom
93df862b6a meson: re-add incorrect pkg-config files with GLVND for backward compatibility
This is a bit counter-intuitive, but the issue is that GLVND is broken
in versions <= 1.1.1, so we need to keep wrongly providing these files
to cover up their mistake, otherwise the rest of the world ends up
broken.

Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-25 17:27:54 +01:00
Rhys Perry
db2ca45102 aco: check for duplicate opcode numbers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
101f47fdd7 aco: fix opcode for s_mul_hi_i32
Fixes dEQP-VK.glsl.builtin.function.integer.imulextended.*_compute

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
2faaf04c62 aco: fix v_subrev_co_u32_e64 opcode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
00aa413bae aco: fix GFX9 opcode for v_xad_u32
Fixes various dEQP-VK.image.store.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
b125dc4839 aco: implement 64-bit ineg
We currently lower them, but nir_opt_algebraic() can add new ones because
lower_sub=true.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-25 15:27:48 +00:00
Rhys Perry
641eac953c aco: run nir_lower_int64() before nir_lower_idiv()
nir_lower_idiv() asserts on 64-bit integers.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-25 15:27:48 +00:00
Connor Abbott
36e000d832 nir: Fix overlapping vars in nir_assign_io_var_locations()
When handling two variables with overlapping locations, we process the
one with lower location first, and then extend the location ->
driver_location map to guarantee that it's contiguous for the second
variable too. But the loop had the wrong bound, so we weren't extending
the map 100%, which could lead to problems later such as an incorrect
num_inputs. The loop index i is an index into the slots of the variable,
so we need to stop at the final slot of the variable (var_size) instead
of the number of unassigned slots.

This fixes
spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-interleave-range
on radeonsi NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-25 15:53:50 +02:00
Karol Herbst
66456b8d49 clover: eliminate "ignoring attributes on template argument" warning
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
2019-09-25 10:39:58 +00:00
Karol Herbst
4f044c38e2 clover/codegen: remove unused get_symbol_offsets function
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
2019-09-25 10:39:58 +00:00
Karol Herbst
2859c49f7b clover/llvm: remove harmful std::move call
both clang and gcc warn with:
"moving a local object in a return statement prevents copy elision"

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
2019-09-25 10:39:58 +00:00
Tapani Pälli
f4d9169204 iris: disable aux on first get_param if not created with aux
This moves the fix from commit 361f3d19f1 to happen in get_param
(used now instead of get_handle by st/dri). This fixes artifacts
seen with Xorg and CCS_E.

Fixes: fc12fd05f5 "iris: Implement pipe_screen::resource_get_param"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-25 08:28:45 +03:00
Erik Faye-Lund
88f909eb37 glsl: correct bitcast-helpers
Without this, we'll incorrectly round off huge values to the nearest
representable double instead of keeping it at the exact value  as
we're supposed to.

Found by inspecting compiler-warnings.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 85faf5082f ("glsl: Add 64-bit integer support for constant expressions")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-25 04:52:54 +00:00
Vasily Khoruzhick
678ebda8b7 lima/ppir: add support for indirect load of uniforms and varyings
Utgard PP supports indirect load of uniforms and varyings, so let's
enable it.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 20:33:27 -07:00
Vasily Khoruzhick
780985d1b8 lima/ppir: add node dependency types
Currently we add dependecies in 3 cases:
1) One node consumes value produced by another node
2) Sequency dependencies
3) Write after read dependencies

2) and 3) only affect scheduler decisions since we still can use pipeline
register if we have only 1 dependency of type 1).

Add 3 dependency types and mark dependencies as we add them.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 20:33:13 -07:00
Vasily Khoruzhick
4fcfed426a lima/ppir: don't attempt to clone tex coords if it's not varying
It makes no sense to clone texture coords if it's not varying, moreover
we don't support cloning ALU nodes.

Fixes: 1c1890fa70 ("lima/ppir: clone uniforms and load_coords into each successor")
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-25 03:07:16 +00:00
Timothy Arceri
0e1310e59f radeonsi/nir: lower load constants to scalar
We call nir_lower_load_const_to_scalar in the state trackers linker
however some later passes can reintroduce constant vectors. Here
we lower these to scalar and perform optimisations. The Intel
drivers do a similar call in their backend..

shader-db results VEGA 64:

Totals from affected shaders:
SGPRS: 152168 -> 151976 (-0.13 %)
VGPRS: 135224 -> 135112 (-0.08 %)
Spilled SGPRs: 4027 -> 4163 (3.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 10670028 -> 10654776 (-0.14 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13122 -> 13135 (0.10 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-25 02:42:55 +00:00
Jonathan Marek
e353fd096d turnip: use image tile_mode for gmem configuration
Fixes at least this deqp test:
dEQP-VK.api.smoke.triangle

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-24 22:32:09 -04:00
Jonathan Marek
f510901dc2 turnip: fix binning shader compilation
ir3 segfaults if nonbinning is NULL for the bininng pass shader.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-24 22:32:09 -04:00
Rhys Perry
12372d60ff nir/opt_remove_phis: handle phis with no sources
This can happen with loops with unreachable exits which are later
optimized away.

Fixes assertion in dEQP-VK.graphicsfuzz.unreachable-loops with RADV.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-25 00:58:30 +00:00
Michel Dänzer
67d930d64b radeonsi: fix VAAPI segfault due to various bugs
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111236
2019-09-24 19:23:30 -04:00
Marek Olšák
f52afdf672 gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH
because vl doesn't call flush_resource and I wasn't able to find
all places where flush_resource needs to be called.

This fixes corrupted / unflushed surfaces with fullscreen videos on Raven.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-24 19:23:30 -04:00
Marek Olšák
783fae2a1f radeonsi: initialize displayable DCC using the retile blit to prevent hangs
Cc 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-24 19:23:30 -04:00
Connor Abbott
270fe55256 nir/opt_large_constants: Handle store writemasks
This fixes some piglit tests on radeonsi NIR where a varying is
initialized to a constant array in the vertex shader. Varying packing
after nir_lower_io_to_temporaries creates writemasked stores which
persist after pulling the constant initialization down into the fragment
shader.

While we're here, rewrite handle_constant_store() to do the loop over
components outside the switch, so that we don't have to duplicate the
writemask checking for every bitsize.

Fixes: 1235850522 ("nir: Add a large constants optimization pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-24 20:59:58 +00:00
Eric Engestrom
da496d4e30 meson: split more compiler options to their own line
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-24 19:39:24 +01:00
Eric Engestrom
3fd0afd5e3 meson: drop -Wno-foo bug workaround for Meson < 0.46
This was a workaround for a bug in Meson that was fixed in 0.46 [1].

[1] https://github.com/mesonbuild/meson/pull/2284

Fixes: f7b6a8d12f ("meson: bump required version to 0.46")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-24 19:39:24 +01:00
Eric Engestrom
30f639c181 radv: fix s/load/store/ copy-paste typo
Fixes: cdc6efddf9 ("radv: implement all depth/stencil resolve modes using graphics")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-24 19:18:54 +01:00
Stephen Barber
8c3ace6991 nouveau: add idep_nir_headers as dep for libnouveau
Fixes a compilation error when building libnouveau:

In file included from ../src/gallium/drivers/nouveau/nv50/nv50_program.c:25:
../src/compiler/nir/nir.h:1115:10: fatal error: nir_intrinsics.h: No such file or directory
 #include "nir_intrinsics.h"
           ^~~~~~~~~~~~~~~~~~
           compilation terminated.

Fixes: f014ae3c7c ("nouveau: add support for nir")
Signed-off-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-09-24 17:27:20 +00:00
Bas Nieuwenhuizen
780182f0a0 radv: Add workaround for hang in The Surge 2.
Released today and hangs on RADV. We don't have the root cause yet,
but this should unblock people playing the game.

No drirc because the radv debugflags are not usable from drirc and
I want this backported.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-09-24 09:51:40 +00:00
Andres Gomez
5e87f48f1d i965/fs: set rounding mode when emitting the flrp instruction
flrp was forgotten when already adding the rounding mode for other
instructions.

Fixes: ba1e25e1aa ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions")
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2019-09-24 12:06:59 +03:00
Andres Gomez
6f1468c371 i965/fs: add a comment about how the rounding mode in fmul is set
After
1711bf6cf2 ("intel/fs: Generate better code for fsign multiplied by a value"),
the conflicts resolution for setting the rounding mode after the
fused fmul and fsign optimization is non obvious.

Basically, the optimization doesn't really result in a MUL, or any
other operation which would need to have the rounding mode set. Hence,
we set it just before the actual MUL in the treatment of fmul.

Fixes: ba1e25e1aa ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions")
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2019-09-24 11:24:15 +03:00
Juan A. Suarez Romero
b3c25e6f99 bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars
The script only handles commits with "Fixes: <sha1>" where <sha1> is
equal or great than 8 chars. But <sha1> can be smaller, like 7 chars.

This commit relax the restriction to handle <sha1> 4 or more chars.

Fixes: 533fead423 ("bin/get-pick-list.sh: tweak the commit sha matching pattern")

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-24 07:36:45 +00:00
Connor Abbott
fed5b605f0 lima/gpir: Fix 64-bit shift in scheduler spilling
There are 64 physical registers so the shift must be 64 bits.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 08:44:54 +02:00
Connor Abbott
ef38a659fb lima/gpir: Don't emit movs when translating from NIR
The scheduler doesn't expect them. To do this, I had to refactor the
registration part of gpir_node_create_dest() to be separate from
creating and inserting the node, since the last two now aren't done when
handling moves. This adds more code but creates the possibility of
automatically inserting input dependencies when inserting nodes, similar
to what's done in NIR with the use-def lists (this isn't done yet).

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 08:43:48 +02:00
Connor Abbott
96c31d9a55 lima/gpir: Fix postlog2 fixup handling
We guarantee that a complex1 op is always used by postlog2 directly by
rewriting the postlog2 op to be a move when there would be a move
inserted between them. But we weren't doing this in all circumstances
where there might be a move. Move the logic to place_move() so that it
always happens. Fixes a few log tests that happened to start failing due
to changes in the register allocator leading to a different scheduling
order.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 08:43:06 +02:00
Connor Abbott
1cd1cce035 lima/gpir: Use registers for values live in multiple blocks
This commit adds the framework for cross-basic-block register
allocation. Like ARM's compiler, we assume that the value registers
aren't usable across branches, which means we have to use physical
registers to store any value that crosses a basic block. There are three
parts to this:

1. When translating from NIR, we rely on the NIR out-of-ssa pass to
coalesce values into registers. We insert store_reg instructions for
values used in more than one basic block, and load_reg instructions for
values not defined in the same basic block (or defined after their use,
for loops). So by the time we've translated out of NIR we've already
split things into values (which are only used in the same basic block)
and registers (which are only used in different basic blocks than where
they're defined).

2. We allocate the registers at the same time that we allocate the
values, before the final scheduler. Unlike the values, where the
assigned color is fake, we assign the actual physical index & component
to physregs at this stage. load_reg and store_reg are treated as moves
in the allocator and when creating write-after-read dependencies.

3. Finally, in the main scheduler we have to avoid overwriting existing
live physregs when spilling. First, we have to tell the scheduler which
physical registers are live at the end of each block, to avoid
overwriting those. If a register is only live at the beginning, we can
reuse it for spilling after the last original use in the final program
happens, i.e. before any original use is scheduled, but we have to be
careful to add the proper dependencies so that the spill write is
scheduled before the original reads. To handle this we repurpose
reg_link for uses to be used by the scheduler.

A few register-related things copied over from NIR or from other
drivers can be dropped.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 08:37:37 +02:00
Connor Abbott
7594ef6eb0 lima/gpir: Support branch instructions
Because branch conditions have to be in the pass slot, there is no
unconditional branch, and realistically the pass slot has to contain a
move when branching (there's nothing it does that would be useful for
operating on booleans, so we can't use it for anything when computing
the branch condition), we put the branch instruction in the pass slot
and at codegen time turn it into a move of the branch condition. This
means that it doesn't have to be special-cased like store instructions
are in the scheduler. Because of this decision we can remove the
half-implemented BRANCH codegen slot. Finally, we (ab)use the existing
schedule_first mechanism to make sure that branches are always last in
the basic block.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 08:35:47 +02:00
Connor Abbott
2df2e081fd lima/gpir: Only try to place actual children
When picking a node to be scheduled, we try to schedule its children as
well. But we shouldn't try to schedule nodes which only have a fake
dependency on the original node, since this isn't the point of
scheduling children at the same time and can break some expectations of
the rest of the code.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 08:35:26 +02:00
Connor Abbott
f989a024b4 lima/gpir: Fix compiler warning
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-24 08:33:56 +02:00
Adam Jackson
0d635ccc91 glx: Implement GLX_EXT_no_config_context
This is the GLX counterpart to EGL_KHR_no_config_context. Contexts may
now be created without reference to an fbconfig, in which case it is
treated as compatible with any fbconfig (and thus any GLX drawable).

Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-09-23 20:39:01 -04:00
Adam Jackson
999c2aed88 glx: Lift sending the MakeCurrent request to top-level code
Somewhat terrifyingly, we never sent this for direct contexts, which
means the server never knew the context/drawable bindings. To handle
this sanely, pull the request code up out of the indirect backend, and
rewrite the context switch path to call it as appropriate.  This
attempts to preserve the existing behavior of not calling unbind() on
the context if its refcount would not drop to zero.

Of course, you can't just do this indiscriminately, because this is GLX
and extant X servers have bugs and everything is terrible. To wit:

- For 1.20.x prior to 1.20.6, you can bind a direct context once, but
the second time you try to modify the context's binding you will get
GLXBadContextTag. This includes unbinding the context. And "deleting"
the context will leak memory, because it will still appear to be
current.

- For 1.19 and earlier, glXMakeCurrent(dpy, None, ctx) should be legal
for GL 3.0+ contexts, but the server will throw BadMatch.

To guard against this, we only send the request for indirect contexts
unless the server is known good, and only mention one context at a time
in such a request; if switching between contexts, we first unbind the
old, and then bind the new. Note that the second VendorRelease() version
is to catch XFree86 4.x and Xorg [67].x, which almost certainly have the
above bugs. Other servers might report different version numbers here,
but we can't do direct rendering against them, so this should be safe.

Fixes glx-make-context, glx-multi-window-single-context and
glx-query-drawable-glx_fbconfig_id-window. Sufficiently old piglit will
regress on glx-make-glxdrawable-current (throwing BadMatch), which is
fixed by mesa/piglit!116.
2019-09-23 20:39:01 -04:00
Adam Jackson
01e437988d glx: Move vertex array protocol state into the indirect backend
Only relevant for indirect contexts, so let's get that code out of the
common path.
2019-09-23 20:21:01 -04:00
Kenneth Graunke
b9e93db208 intel: Increase Gen11 compute shader scratch IDs to 64.
From the MEDIA_VFE_STATE docs:

   "Starting with this configuration, the Maximum Number of Threads must
    be set to (#EU * 8) for GPGPU dispatches.

    Although there are only 7 threads per EU in the configuration, the
    FFTID is calculated as if there are 8 threads per EU, which in turn
    requires a larger amount of Scratch Space to be allocated by the
    driver."

It's pretty clear that we need to increase this for scratch address
calculations, because the FFTID has a certain bit-pattern.  The quote
above seems to indicate that we should increase the actual thread count
programmed in MEDIA_VFE_STATE as well, but we think the intention is to
only bump the scratch space.

Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.

Fixes: 5ac804bd9a ("intel: Add a preliminary device for Ice Lake")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-23 16:59:40 -07:00
Kenneth Graunke
50c0dd8621 Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM"
This reverts commit 729de1488f.

It turns out that, although the register is in the logical context,
it isn't whitelisted, so we can't actually write it from userspace
batch buffers.  The write just becomes a noop, which is why we saw
no performance changes.

I manually whitelisted it, and still observed no performance gains, but
it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments
on the iris driver.  So we might need to fix something before enabling
this.  To prevent it randomly getting turned on should the kernel ever
whitelist this register, we revert the patch for now.
2019-09-23 16:31:23 -07:00
Jason Ekstrand
03911195a3 util/rb_tree: Replace useless ifs with asserts
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-09-23 22:38:30 +00:00
Kenneth Graunke
a733423da5 broadcom/genxml: Stop manually scrubbing 'α' -> "alpha"
'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-23 20:24:54 +00:00
Kenneth Graunke
8489206e9d intel/genxml: Stop manually scrubbing 'α' -> "alpha"
'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-09-23 20:24:54 +00:00
Rob Clark
d8cbf1adc1 freedreno/a6xx: do streamout only in binning pass
Use VPC_SO_OVERRIDE to control whether we do streamout in binning or
draw pass.  Normally we want to do streamout in binning pass, except
when there is a single tile and binning passed is skipped.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-23 20:02:34 +00:00
Rob Clark
b9bf374512 freedreno/a6xx: fix binning pass vs. xfb
We could bit doing streamout from binning pass.  In this case we want to
use the full VS which doesn't have (potentially streamed out) varyings
stripped out.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-23 20:02:34 +00:00
Rob Clark
331f89a971 freedreno/a6xx: un-open-code PC_PRIMITIVE_CNTL_1.PSIZE
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-23 20:02:34 +00:00
Marek Olšák
05d32850ff ac/nir: force unnormalized coordinates for RECT
This fixes VAAPI.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-23 15:34:54 -04:00
Marek Olšák
500181b2ba ac/nir: port Z compare value clamping from radeonsi
This fixes some dEQP tests.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-23 15:34:54 -04:00
Marek Olšák
09447ccc78 tgsi_to_nir: fix 2-component system values like tess_level_inner_default
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-23 15:34:56 -04:00
Marek Olšák
3906fce88b tgsi_to_nir: fix masked out image loads
This caused a failure in NIR validation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-23 15:34:54 -04:00
Marek Olšák
780eeaf2f1 nir: define 8-byte size and alignment for bindless variables
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-23 15:34:22 -04:00
Marek Olšák
f5c103ce1d nir: don't add bindless variables to num_textures and num_images
It confuses radeonsi.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-23 15:34:05 -04:00
Marek Olšák
150f6ffb4c amd: remove all PCI IDs supported by amdgpu
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-23 15:15:35 -04:00
Jiang, Sonny
5a545e355b loader: always map the "amdgpu" kernel driver name to radeonsi (v2)
v2: cleanup

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-23 15:14:11 -04:00
Marek Olšák
9429714233 ac: stop using PCI IDs for chip identification
PCI IDs for amdgpu will be removed from Mesa.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-23 15:14:11 -04:00
Marek Olšák
48742de601 ac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-23 15:14:11 -04:00
Marek Olšák
65b698136c amd: add more PCI IDs for Navi14
trivial and urgent

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-23 15:12:34 -04:00
Eric Engestrom
c29c410182 meson: split compiler warnings one per line
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-09-23 17:56:22 +01:00
Jason Ekstrand
d63162cff0 nir/repair_ssa: Replace the unreachable check with the phi builder
In a3268599f3, I attempted to fix nir_repair_ssa for unreachable
blocks.  However, that commit missed the possibility that the use is in
a block which, itself, is unreachable.  In this case, we can end up in
an infinite loop trying to replace a def with itself.  Even though a
no-op replacement is a fine operation, it keeps extending the end of the
uses list as we're walking it.  Instead of explicitly checking for the
group of conditions, just check if the phi builder gives us a different
def.  That's guaranteed to be 100% reliable and, while it lacks symmetry
with the is_valid checks, should be more reliable.

Fixes: a3268599 "nir/repair_ssa: Repair dominance for unreachable..."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-09-23 16:19:24 +00:00
Daniel Schürmann
2c050b49b3 aco: only emit waitcnt on loop continues if we there was some load or export
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-09-23 13:39:33 +02:00
Karol Herbst
70e39294d7 nv50/ir/nir: comparison of integer expressions of different signedness warning
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2019-09-23 13:27:32 +02:00
Karol Herbst
61ccca12f5 nv50/ir: fix unnecessary parentheses warning
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2019-09-23 13:27:32 +02:00
Erico Nunes
ab49a0e746 lima: remove partial clear support from pipe->clear()
pipe->clear() is not called for partial clears, which mesa emulates by
drawing a quad.
Furthermore, drivers should not use rasterizer state information for
scissor information (which was being used to handle the partial clears).
So, remove the partial clear support since it was not supposed to be
handled by pipe->clear() anyway.
This fixes issues with clearing after switching to different sized
framebuffers.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-09-23 12:19:10 +02:00
Boris Brezillon
0c6ca0a647 dEQP-GLES2.functional.buffer.write.use.index_array.* are passing now.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-09-23 09:48:38 +02:00
Boris Brezillon
055497fa84 panfrost: Fix indexed draws
->padded_count should be large enough to cover all vertices pointed by
the index array. Use the local vertex_count variable that contains the
updated vertex_count value for the indexed draw case.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-23 09:47:41 +02:00
Karol Herbst
697eb8f973 clover/nir: fix compilation with g++-5.5 and maybe earlier
fixes "sorry, unimplemented: non-trivial designated initializers not supported"

Fixes: deb04adf2a ("clover: add support for passing kernels as nir to the driver")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-09-23 07:09:41 +00:00
Kenneth Graunke
ec81f19b44 st/mesa: Bail on incomplete attachments in discard_framebuffer
Incomplete attachments don't have an associated pipe_surface, so
this would crash.

Fixes a WebGL conformance test that uses incomplete attachments:
https://www.khronos.org/registry/webgl/sdk/tests/conformance2/renderbuffers/invalidate-framebuffer.html?webglVersion=2&quiet=0&quick=1

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111756
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
2019-09-22 21:03:16 -07:00
Vasily Khoruzhick
d214778753 lima: implement BO cache
Allocating BOs is expensive, so we should avoid doing that by caching
freed BOs.

BO cache is modelled after one in v3d driver and works as follows:

- in lima_bo_create() check if we have matching BO in cache and return
  it if there's one, allocate new BO otherwise.
- in lima_bo_unreference() (renamed from lima_bo_free()): put BO in
  cache instead of freeing it and remove all stale BOs from cache

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-22 19:20:59 -07:00
Vasily Khoruzhick
9f897a2b4c lima: use 0 to poll if BO is busy in lima_bo_wait()
os_time_get_absolute_timeout(0) returns current time, while kernel
driver expects 0 as value to poll BO status and return immediately.
Fix it by setting abs_timeout to 0 if timeout_ns is 0

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-22 19:20:59 -07:00
Qiang Yu
7f7ac21088 lima: move damage bound build to resource
Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
2019-09-23 09:48:55 +08:00
Qiang Yu
4ed569eed7 lima: don't use damage system when full damage
Some time weston set full damage region. It is
more effient to use the cached pp stream instead
of dynamically create one.

Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
2019-09-23 09:48:50 +08:00
Qiang Yu
afbaed906d lima: implement EGL_KHR_partial_update
This extension set a damage region for each
buffer swap which can be used to reduce buffer
reload cost by only feed damage region's tile
buffer address for PP.

Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
2019-09-23 09:48:15 +08:00
Icenowy Zheng
8278b236b0 lima: fix PLBU viewport configuration
The PLBU expects the viewport's 4 borders' coordinates, however
currently we're feeding the coordinate of the left-bottom point and the
size to it, which leads to misrendering when the left-bottom point is
not (0,0).

Change the macros for the viewport PLBU command, and the data feed to
it. The code to calculate the 4 borders is ported from Panfrost.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-09-22 15:22:38 +08:00
Bas Nieuwenhuizen
40087ffc5b amd: Build aco only if radv is enabled
ACO depends on C++14, but radeonsi/radv with LLVM 8,9 do not. Let us
only require it for RADV, since that is the only user.

Fixes: a70a998718 "radv/aco: Setup alternate path in RADV to support the experimental ACO compiler"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-21 14:06:41 +02:00
Karol Herbst
7955fabcf8 nvc0: expose spirv support
required for OpenCL

v2: adjust to changes in previous commits
v3: properly convert to NIR in nvc0_cp_state_create

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> (v1)
2019-09-21 08:28:32 +00:00
Karol Herbst
deb04adf2a clover: add support for passing kernels as nir to the driver
v2: minor formatting fixes
v3: call glsl_type_singleton_init_or_ref and glsl_type_singleton_decref
v4: capitalize and punctuate comments
    fix text_executable -> text_intermediate in TODO
    make glsl_type_singleton wrapper static
v5: rewrite how we run the nir passes
v6: fix unhandled case switch warning in st/mesa

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v4)
2019-09-21 08:28:32 +00:00
Karol Herbst
1befaf4417 clover: prepare supporting multiple IRs
v2: rework arguments to compiler::compile_program
    add assert to device::ir_format
v3: remove PIPE_SHADER_IR_SPIRV
    change title

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v2)
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2019-09-21 08:28:32 +00:00
Karol Herbst
c8cd8e279d clover: add support for drivers having no proper binary format
Most drivers have actually no binary format and just store the IR directly
as a single entry point blob.

v2: add a cap to switch between single or multi entry point binaries
v3: remove the entry_point field
v4: remove PIPE_CAP_MULTI_ENTRY_POINT_BINARIES
v5: remove supports_multiple_entry_points

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2019-09-21 08:28:32 +00:00
Karol Herbst
1982ac6d6b clover/functional: add id_equals helper
v2: pass argument by value

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2019-09-21 08:28:32 +00:00
Karol Herbst
f3ba98cb18 rename pipe_llvm_program_header to pipe_binary_program_header
We want to use it for other formats as well, so give it a more generic name

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2019-09-21 08:28:32 +00:00
Karol Herbst
b6c47abe3e gallium: add blob field to pipe_llvm_program_header
makes it easier to consume a IR_NATIVE binary

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2019-09-21 08:28:32 +00:00
Pierre Moreau
2043c5f37c clover/llvm: Add functions for compiling from source to SPIR-V
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-09-21 08:28:32 +00:00
Pierre Moreau
975a3c6ad3 clover/llvm: Add options for dumping SPIR-V binaries
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2019-09-21 08:28:32 +00:00
Pierre Moreau
2147386505 clover/spirv: Add functions for parsing arguments, linking programs, etc.
v2 (Karol Herbst):
  silence warnings about unhandled enum values
v3 (Karol Herbst):
  added back array size parsing (needed for structs passed by value)

Acked-by: Francisco Jerez <currojerez@riseup.net> (v2)
2019-09-21 08:28:32 +00:00
Pierre Moreau
939a7e9a5c clover/spirv: Add functions for validating SPIR-V binaries
Changes since:
* v12:
  - remove autotools (Karol Herbst)
  - Remove the callback in format_validation_msg. (Francisco Jerez)
  - Removed is_binary_spirv. (Francisco Jerez)
  - Pass a string reference to is_valid_spirv instead of the
    notification callback. (Francisco Jerez)
* v11: Fix compilation error introduced in v11.
* v10:
  - Reuse format_validation_msg in is_valid_spirv.
  - Remove LVL2STR macro in format_validation_msg.
* v9: Add `clover_cpp_std` to the overrides of the `libclspirv` target
      in Meson.
* v7: Add DEFINES to libclspirv and libclover, in autotools, as they
      would otherwise never know whether CLOVER_ALLOW_SPIRV has been
      defined (Dave Airlie)
* v6: Update the dependency name (meson) and the libs variable
      (Makefile) due to the replacement of llvm-spirv to the new
      official SPIRV-LLVM-Translator.
* v5: Changed to match the updated “clover/llvm: Allow translating from
      SPIR-V to LLVM IR” in the v6.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-09-21 08:28:32 +00:00
Pierre Moreau
866f6f11d9 meson: Check for SPIRV-Tools and llvm-spirv
Changes since:
* v12 (Karol Herbst):
  - rename CLOVER_ALLOW_SPIRV to HAVE_CLOVER_SPIRV
* v11 (Karol Herbst):
  - only set new defines for clover to speed up recompilation
  - remove autotools
* v10:
  - Add a new flag (`--enable-opencl-spirv` for autotools, and
    `-Dopencl-spirv=true` for meson) for enabling SPIR-V support in
    clover, and never automagically enable it without that flag. (Dylan Baker)
  - When enabling the SPIR-V support, the SPIRV-Tools and
    SPIRV-LLVM-Translator libraries are now required dependencies.
* v7:
  - Properly align LLVMSPIRVLib comment (Dylan Baker)
  - Only define CLOVER_ALLOW_SPIRV when **both** dependencies are found:
    autotools was only requiring one or the other.
* v6: Replace the llvm-spirv repository by the new official
      SPIRV-LLVM-Translator.
* v4: Add a comment saying where to find llvm-spirv (Karol Herbst).
* v3:
  - make SPIRV-Tools and llvm-spirv optional (Francisco Jerez);
  - bump requirement for llvm-spirv to version 0.2
* v2:
  - Bump the required version of SPIRV-Tools to the latest release;
  - Add a dependency on llvm-spirv.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v10)
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-09-21 08:28:32 +00:00
Kenneth Graunke
aa7ac32976 isl: Drop WaDisableSamplerL2BypassForTextureCompressedFormats on Gen11
Gen11 doesn't require us to bypass the L2 cache for BC* images anymore.

The documentation is a bit hard to follow on this point, but the Windows
driver clearly only applies this workaround on Gen9, and their commit
history indicates that this was an intentional change to drop the
workaround for Gen11+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-20 15:35:17 -07:00
Hal Gentz
57c894334e gallium/osmesa: Fix the inability to set no context as current.
Currently there is no way to make no context current w/gallium + osmesa.
The non-gallium version of osmesa does this if the context and buffer
passed to `OSMesaMakeCurrent` are both null. This small change makes it
so that this is also the case with the gallium version.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-20 14:04:12 -06:00
Adam Jackson
6e4fd14b0f libgbm: Wire up getCapability for the image loader 2019-09-20 19:10:31 +00:00
Adam Jackson
55a1b583d9 egl/surfaceless: Add FP16 format support
Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>
2019-09-20 19:10:31 +00:00
Adam Jackson
d01406133d egl/wayland: Implement getCapability for the dri2 and image loaders
Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>
2019-09-20 19:10:31 +00:00
Adam Jackson
e74c947359 egl/wayland: Add FP16 format support
Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>
2019-09-20 19:10:31 +00:00
Adam Jackson
cb8bbbef31 egl/wayland: Reindent the format table
No idea how these ended up with 3-then-2-space indents.

Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>
2019-09-20 19:10:31 +00:00
Jason Ekstrand
7d861ab812 anv: Advertise VK_KHR_shader_subgroup_extended_types
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-20 18:02:15 +00:00
Jason Ekstrand
03255da225 intel/fs: Do 8-bit subgroup scan operations in 16 bits
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-20 18:02:15 +00:00
Jason Ekstrand
651725f7a1 intel/fs: Allow CLUSTER_BROADCAST to do type conversion
We can't really handle it in the little-core 64-bit case but it's not
really needed there.  Where we really want this is for when we need to
do 16 -> 8-bit conversions.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-20 18:02:15 +00:00
Jason Ekstrand
3515c0e9cf intel/fs: Allow UB, B, and HF types in brw_nir_reduction_op_identity
Because byte immediates aren't a thing on GEN hardware, we return a
signed or unsigned word immediate in the byte case.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-20 18:02:15 +00:00
Paulo Zanoni
10532c6831 intel/fs: don't forget the stride at generate_shuffle
During generate_shuffle(), when we use byte sized registers we end up
with a destination stride of 2. We don't take the stride into
consideration when selecting the group offset for the last MOV
operation, which means we end up moving things to the wrong place,
leaving the last few channels untouched. Take the destination stride
in consideration so we don't miss the last channels.

v2: Assert this is not necessary for the IVB special case (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-20 10:57:05 -07:00
Jason Ekstrand
dae33052db util/rb_tree: Reverse the order of comparison functions
The new order matches that of the comparison functions accepted by the C
standard library qsort() functions.  Being consistent with qsort will
hopefully help avoid developer confusion.

The only current user of the red-black tree is aub_mem.c which is pretty
easy to fix up.

Reviewed-by: Lionel Landwerlin <lionel.g.lndwerlin@intel.com>
2019-09-20 17:37:25 +00:00
Jason Ekstrand
d35d7346d2 util/rb_tree: Add the unit tests
When I wrote the red-black tree implementation, I wrote tests for it but
they never got imported into mesa.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-20 17:37:25 +00:00
Eric Engestrom
3c1a24de07 anv: implement ICD interface v4
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-20 08:31:58 +00:00
Eric Engestrom
19db95e78e anv: split instance dispatch table
This effectively breaks the instance dispatch table in 2 with entry
points using a physical device as first argument getting their own
dispatch table.

As a result we now have to check instance & physical device dispatch
table instead of just the instance dispatch table before.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-20 08:31:58 +00:00
Adam Jackson
88b8922f57 glx: Fix drawable lookup bugs in glXUseXFont
We were using the current drawable of the context to name the
appropriate screen for creating the bitmaps. But one, the current
drawable can be None, and two, it can be a GLXDrawable. Passing either
one as the second argument to XCreatePixmap will throw BadDrawable. Use
the root window of the context's screen instead.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/89
LOLed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-19 21:06:01 -04:00
Adam Jackson
b4fe0b3ffd glx: Avoid atof() when computing the server's GLX version
atof() is locale-dependent (sigh), which means 1.3 becomes 1.0 if the
locale's decimal separator isn't a full-stop. Just use the protocol
major/minor instead. This would be slightly broken if the server
generically implements 1.3+ but a particular screen is only capable of
less, but in practice no such servers exist.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/74
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-19 20:50:01 -04:00
Ian Romanick
317a88b920 nir/algebraic: Additional D3D Boolean optimization
I observed this pattern in several shaders in Hand of Fate 2 while
investigating bugzilla #111490.  This also led to the related
bugzilla #111578.  The shaders from HoF2 are *not* in shader-db.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Skylake and Ice Lake had similar results. (Ice Lake shown)
total instructions in shared programs: 16222621 -> 16205419 (-0.11%)
instructions in affected programs: 798418 -> 781216 (-2.15%)
helped: 548
HURT: 0
helped stats (abs) min: 2 max: 158 x̄: 31.39 x̃: 35
helped stats (rel) min: 0.45% max: 28.64% x̄: 2.83% x̃: 2.09%
95% mean confidence interval for instructions value: -33.22 -29.56
95% mean confidence interval for instructions %-change: -3.11% -2.56%
Instructions are helped.

total cycles in shared programs: 364676209 -> 363345763 (-0.36%)
cycles in affected programs: 112810504 -> 111480058 (-1.18%)
helped: 546
HURT: 7
helped stats (abs) min: 2 max: 118913 x̄: 2439.77 x̃: 2340
helped stats (rel) min: 0.08% max: 37.56% x̄: 1.46% x̃: 1.08%
HURT stats (abs)   min: 2 max: 770 x̄: 238.00 x̃: 43
HURT stats (rel)   min: 0.02% max: 11.24% x̄: 3.71% x̃: 0.35%
95% mean confidence interval for cycles value: -2884.33 -1927.41
95% mean confidence interval for cycles %-change: -1.59% -1.21%
Cycles are helped.

total spills in shared programs: 8870 -> 8514 (-4.01%)
spills in affected programs: 1230 -> 874 (-28.94%)
helped: 161
HURT: 0

total fills in shared programs: 21901 -> 21348 (-2.52%)
fills in affected programs: 2120 -> 1567 (-26.08%)
helped: 155
HURT: 5

Broadwell and Haswell had similar results. (Broadwell shown)
total instructions in shared programs: 14994910 -> 14975495 (-0.13%)
instructions in affected programs: 839033 -> 819618 (-2.31%)
helped: 548
HURT: 0
helped stats (abs) min: 2 max: 299 x̄: 35.43 x̃: 49
helped stats (rel) min: 0.39% max: 19.89% x̄: 2.91% x̃: 2.22%
95% mean confidence interval for instructions value: -37.46 -33.40
95% mean confidence interval for instructions %-change: -3.12% -2.70%
Instructions are helped.

total cycles in shared programs: 386032453 -> 384450722 (-0.41%)
cycles in affected programs: 117807357 -> 116225626 (-1.34%)
helped: 547
HURT: 6
helped stats (abs) min: 2 max: 22096 x̄: 2892.01 x̃: 3926
helped stats (rel) min: 0.17% max: 10.34% x̄: 1.56% x̃: 1.31%
HURT stats (abs)   min: 4 max: 60 x̄: 32.83 x̃: 29
HURT stats (rel)   min: 0.38% max: 12.79% x̄: 5.86% x̃: 4.65%
95% mean confidence interval for cycles value: -3060.28 -2660.27
95% mean confidence interval for cycles %-change: -1.59% -1.37%
Cycles are helped.

total spills in shared programs: 23372 -> 21869 (-6.43%)
spills in affected programs: 11730 -> 10227 (-12.81%)
helped: 352
HURT: 0

total fills in shared programs: 34747 -> 35351 (1.74%)
fills in affected programs: 11013 -> 11617 (5.48%)
helped: 3
HURT: 347

Ivy Bridge and Sandybridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11956420 -> 11956126 (<.01%)
instructions in affected programs: 14898 -> 14604 (-1.97%)
helped: 98
HURT: 0
helped stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3
helped stats (rel) min: 1.30% max: 3.57% x̄: 2.08% x̃: 2.00%
95% mean confidence interval for instructions value: -3.00 -3.00
95% mean confidence interval for instructions %-change: -2.18% -1.98%
Instructions are helped.

total cycles in shared programs: 178791217 -> 178790792 (<.01%)
cycles in affected programs: 149763 -> 149338 (-0.28%)
helped: 91
HURT: 7
helped stats (abs) min: 3 max: 107 x̄: 20.63 x̃: 16
helped stats (rel) min: 0.13% max: 6.91% x̄: 1.40% x̃: 1.18%
HURT stats (abs)   min: 3 max: 322 x̄: 207.43 x̃: 322
HURT stats (rel)   min: 0.14% max: 19.85% x̄: 12.73% x̃: 17.41%
95% mean confidence interval for cycles value: -18.94 10.27
95% mean confidence interval for cycles %-change: -1.28% 0.49%
Inconclusive result (value mean confidence interval includes 0).
2019-09-19 14:22:22 -07:00
Ian Romanick
92f70df8c3 nir/algebraic: Do not apply late DPH optimization in vertex processing stages
Some shaders do not use 'invariant' in vertex and (possibly) geometry
shader stages on some outputs that are intended to be invariant.  For
various reasons, this optimization may not be fully applied in all
shaders used for different rendering passes of the same geometry.  This
can result in Z-fighting artifacts (at best).  For now, disable this
optimization in these stages.

In tessellation stages applications seem to use 'precise' when
necessary, so allow the optimization in those stages.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111490
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")

All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16194726 -> 16344745 (0.93%)
instructions in affected programs: 2855172 -> 3005191 (5.25%)
helped: 6
HURT: 20279
helped stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1
helped stats (rel) min: 0.44% max: 1.00% x̄: 0.54% x̃: 0.44%
HURT stats (abs)   min: 1 max: 32 x̄: 7.40 x̃: 7
HURT stats (rel)   min: 0.14% max: 42.86% x̄: 8.58% x̃: 6.56%
95% mean confidence interval for instructions value: 7.34 7.45
95% mean confidence interval for instructions %-change: 8.48% 8.67%
Instructions are HURT.

total cycles in shared programs: 364471296 -> 365014683 (0.15%)
cycles in affected programs: 32421530 -> 32964917 (1.68%)
helped: 2925
HURT: 16144
helped stats (abs) min: 1 max: 403 x̄: 18.39 x̃: 5
helped stats (rel) min: <.01% max: 22.61% x̄: 1.97% x̃: 1.15%
HURT stats (abs)   min: 1 max: 18471 x̄: 36.99 x̃: 15
HURT stats (rel)   min: 0.02% max: 52.58% x̄: 5.60% x̃: 3.87%
95% mean confidence interval for cycles value: 21.58 35.41
95% mean confidence interval for cycles %-change: 4.36% 4.52%
Cycles are HURT.
2019-09-19 14:21:31 -07:00
Andres Gomez
bcd9224728 docs/features: Update VK_KHR_display_swapchain status
It was set as done by mistake.

Fixes: bc15d74529 ("docs/features: Mark some Vulkan extensions as done")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 23:45:17 +03:00
Andres Gomez
53c24cfd8a docs/features: Update status list of Vulkan extensions
To get the extension list:

$ git grep -hE "extension name=\"VK_KHR" src/vulkan/registry/vk.xml | \
grep -v disabled | awk '{print $2}' | sed -E 's/(name=)?"//g' | sort

To find anv(il) and radv supported extensions:

$ git grep -hE "'VK_([A-Z]+)_[a-z,0-9]" src/intel/

$ git grep -hE "'VK_([A-Z]+)_[a-z,0-9]" src/amd/

v2:
  - Keep VK_KHR_device_group and VK_KHR_device_group_creation as not
    started (Jason).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 23:39:26 +03:00
Jason Ekstrand
0c4e89ad5b Move blob from compiler/ to util/
There's nothing whatsoever compiler-specific about it other than that's
currently where it's used.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-19 19:56:22 +00:00
Boris Brezillon
fc5a87715a Revert "panfrost: Rework midgard_pair_load_store() to kill the nested foreach loop"
There's a missing prev_ldst = NULL; assignment in the new logic,
but even with this fixed it seems to regress some applications,
so let's revert the change until we find the real problem.

This reverts commit c9bebae287.
2019-09-19 21:01:27 +02:00
Caio Marcelo de Oliveira Filho
fa080f03d3 intel/fs: Add Fall-through comment
Reviewed-by: Andres Gomez <agomez@igalia.com>
2019-09-19 10:02:16 -07:00
Samuel Iglesias Gonsálvez
5ed5e76741 nir/algebraic: refactor inexact opcode restrictions
Refactor the code to avoid calling a lot of time to auxiliary functions
when it is not really needed.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-09-19 18:57:27 +02:00
Adam Jackson
5b5c5bf833 docs: Update bug report URLs for the gitlab migration
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-19 16:37:36 +00:00
Bas Nieuwenhuizen
ec76232785 glx: Remove redundant null check.
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/64
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-09-19 15:11:10 +00:00
Kenneth Graunke
706c9f2d60 iris: Skip double-disabling TCS/TES/GS after BLORP operations
BLORP always turns off TCS/TES/GS.  If regular drawing also has them
disabled (the overwhelmingly common case), then leaving them disabled
is just fine by us and we can skip dirtying them, as that would just
re-disable them a second time on the next draw.

If they are actually enabled, however, we do need to flag them.

Cuts 52% of the 3DSTATE_HS packets in an Aztec Ruins trace.
2019-09-19 07:56:15 -07:00
Erik Faye-Lund
7f7060dc73 .mailmap: add an alias for Frank Binns
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
2019-09-19 16:41:10 +02:00
Erik Faye-Lund
c1b1e0e875 .mailmap: add an alias for Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 16:39:10 +02:00
Arcady Goldmints-Orlov
5ec5fecc26 anv: fix descriptor limits on gen8
Later generations support bindless for samplers, images, and buffers and
thus per-stage descriptors are not limited by the binding table size.
However, gen8 doesn't support bindless images and thus needs to report a
lower per-stage limit so that all combinations of descriptors that fit
within the advertised limits are reported as supported by
vkGetDescriptorSetLayoutSupport.

Fixes test dEQP-VK.api.maintenance3_check.descriptor_set
Fixes: 79fb0d27f3 ("anv: Implement SSBOs bindings with GPU addresses in the descriptor BO")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-19 09:10:40 -05:00
Daniel Schürmann
8b78cce433 radv: remove dead shared variables
LLVM does this anyway, but for ACO we need to do it in NIR.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Daniel Schürmann
281262281b radv/aco: enable VK_EXT_shader_demote_to_helper_invocation
For now, this extension will only be enabled for ACO.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Daniel Schürmann
e01b522a72 radv: enable clustered reductions
These work with both, LLVM and ACO.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Daniel Schürmann
a70a998718 radv/aco: Setup alternate path in RADV to support the experimental ACO compiler
LLVM remains default and ACO can be enabled with RADV_PERFTEST=aco.

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Daniel Schürmann
93c8ebfa78 aco: Initial commit of independent AMD compiler
ACO (short for AMD Compiler) is a new compiler backend with the goal to replace
LLVM for Radeon hardware for the RADV driver.

ACO currently supports only VS, PS and CS on VI and Vega.
There are some optimizations missing because of unmerged NIR changes
which may decrease performance.

Full commit history can be found at
https://github.com/daniel-schuermann/mesa/commits/backend

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
Co-authored-by: Timur Kristóf <timur.kristof@gmail.com>

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Tapani Pälli
99cbec0a5f egl: check for NULL value like eglGetSyncAttribKHR does
Commit d1e1563bb6 added a NULL check for eglGetSyncAttribKHR
but eglGetSyncAttrib does not do this. Patch adds same check to
happen with eglGetSyncAttrib.

Fixes crashes in (when exposing EGL 1.5):
   dEQP-EGL.functional.fence_sync.invalid.get_invalid_value

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2019-09-19 06:39:33 +00:00
Kenneth Graunke
a16975e615 iris: Rework iris_update_draw_parameters to be more efficient
This improves a couple of things:

1. We now only update anything if the shader actually cares.

   Previously, is_indexed_draw was causing us to flag dirty vertex
   buffers, elements, and SGVs every time the shader switched between
   indexed and non-indexed draws.  This is a very common situation,
   but we only need that information if the shader uses gl_BaseVertex.

   We were also flagging things when switching between indirect/direct
   draws as well, and now we only bother if it matters.

2. We upload new draw parameters only when necessary.

   When we detect that the draw parameters have changed, we upload a
   new copy, and use that.  Previously we were uploading it every time
   the vertex buffers were dirty (for possibly unrelated reasons) and
   the shader needed that info.  Tying these together also makes the
   code a bit easier to follow.

In Civilization VI's benchmark, this code was flagging dirty state
many times per frame (49 average, 16 median, 614 maximum).  Now it
occurs exactly once for the entire run.
2019-09-18 22:50:52 -07:00
Kenneth Graunke
6841f11d14 iris: Use state_refs for draw parameters.
iris_state_ref is a <resource, offset> tuple, which is exactly what we
need here.
2019-09-18 22:50:52 -07:00
Timothy Arceri
ddd314f0ce util/disk_cache: make use of the total job size limiting feature
This makes use of the total job size limiting feature added in the
previous patch.

The idea is to avoid an excessive build up in memory use due to the
use of both the UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY flags.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-19 15:03:27 +10:00
Timothy Arceri
896885025f util/u_queue: track job size and limit the size of queue growth
When both UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY are set, we can get into a
situation where the queue never executes and grows to a huge size
due to all other threads being busy.

This is the case with the shader cache when attempting to compile a
huge number of shaders up front. If all threads are busy compiling
shaders the cache queues memory use can climb into the many GBs
very fast.

The use of these two flags with the shader cache is intended to
allow shaders compiled at runtime to be compiled as fast as possible.
To avoid huge memory use but still allow the queue to perform
optimally in the run time compilation case, we now add the ability
to track memory consumed by the jobs in the queue and limit it to
a hardcoded 256MB which should be more than enough.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-19 15:03:27 +10:00
Timothy Arceri
a2ee29c3da util/disk_cache: bump thread count assigned to disk cache queue
Since we set the UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY flag this should
have little impact on low core systems. However just about all modern
CPUs currently available that run Mesa have *at least* 4 cores. For
these CPUs allowing more threads can result in the queue being
processed faster and avoid excessive memory use due to a backlog of
cache entrys building up in the queue.

This change helps avoid a huge build up of cache entrys in the queue
due to using both the UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY and
UTIL_QUEUE_INIT_RESIZE_IF_FULL flags.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-19 15:03:27 +10:00
Paulo Zanoni
8e614c7a29 intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32
The current code can create functions with a width of 32, which is not
supported by our hardware. Add some code to simplify how we express
what we want and prevent such cases.

For some unknown reason, all the tests I could run seem to work even
with these unsupported MOVs.

Fixes: b0858c1cc6 "intel/fs: Add a couple of simple helper opcodes"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-19 02:48:27 +00:00
Paulo Zanoni
c99df52873 intel/fs: the maximum supported stride width is 16
There are cases where we try to generate registers with a stride of
32, while the hardware maximum is just 16. This happens, for example,
when using 8 bit integers on SIMD32. This results in a crash because
the variable 'width' has a value of 32:

../../src/intel/compiler/brw_reg.h:550: brw_reg brw_vecn_reg(unsigned
int, brw_reg_file, unsigned int, unsigned int): Assertion `!"Invalid
register width"' failed.

This change prevents the crash and makes the tests pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-19 02:48:27 +00:00
Paulo Zanoni
cebf447d16 intel/fs: roll the loop with the <0,1,0> additions in emit_scan()
IMHO the code is easier to understand this way, being explicit that
we're doing exactly the same thing every time.

No functional changes.

v2: Adjust the loop breaking condition (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-19 02:47:17 +00:00
Paulo Zanoni
d9ddf5076d intel/fs: make scan/reduce work with SIMD32 when it fits 2 registers
When dealing with uint16_t and uint8_t on SIMD32 we can do all the
operations using just 2 registers, so we don't hit the recursion at
the beginning of emit_scan(). Because of that, we need to actually
compute scan/reduce for channels 31:16.

v2: Still missed instructions (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-19 02:47:17 +00:00
Kristian H. Kristensen
7f07046dbc freedreno/regs: A couple of tess updates
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 16:59:10 -07:00
Kristian H. Kristensen
a2031a117c freedreno/regs: Fix CP_DRAW_INDX_OFFSET command
On A5xx+ the INDX_BASE pointer is 64 bit.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 16:59:10 -07:00
Kristian H. Kristensen
2251a4345b freedreno/a6xx: Write multiple regs for SP_VS_OUT_REG and SP_VS_VPC_DST_REG
Compute the number of writes up front.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 16:59:10 -07:00
Kristian H. Kristensen
cc4fe81145 freedreno/a6xx: Turn on vectorize_io
We want this for tessellation eventually, but we can turn it on now.

Shader-db results:

total instructions in shared programs: 8612905 -> 8611387 (-0.02%)
instructions in affected programs: 164952 -> 163434 (-0.92%)

total dwords in shared programs: 11952000 -> 11950560 (-0.01%)
dwords in affected programs: 68096 -> 66656 (-2.11%)

total full in shared programs: 315019 -> 315009 (<.01%)
full in affected programs: 1642 -> 1632 (-0.61%)

total constlen in shared programs: 2463654 -> 2463654 (0.00%)
constlen in affected programs: 0 -> 0

total (ss) in shared programs: 152379 -> 152409 (0.02%)
(ss) in affected programs: 1503 -> 1533 (2.00%)

total (sy) in shared programs: 96473 -> 96525 (0.05%)
(sy) in affected programs: 654 -> 706 (7.95%)

total max_sun in shared programs: 1172454 -> 1172472 (<.01%)
max_sun in affected programs: 104 -> 122 (17.31%)

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 16:59:10 -07:00
Kristian H. Kristensen
1cb9534434 freedreno/a6xx: Share shader state constructor and destructor
Also, swap vs and fs constructor or so fs comes first.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 16:59:10 -07:00
Kristian H. Kristensen
be38480064 freedreno/a6xx: Track location of gl_Position out as we link it
When using xfb and rasterizing, the fragment shader may have fewer
inputs than the vertex shader outputs. We can't rely on gl_Position to
be placed at fs->total_in, but have to instead remember where we add
it in the link map and use that location.

Fixes 100+ tesselation dEQPs under

  dEQP-GLES31.functional.tessellation.primitive_discard.*
  dEQP-GLES31.functional.tessellation.user_defined_io.*

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 16:59:10 -07:00
Caio Marcelo de Oliveira Filho
d38e0a6326 spirv: Add missing break for capability handling
New added cases "stole" the previous break.

Fixes: 420ad0a1a3 ("spirv: check support for SPV_KHR_float_controls capabilities")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 15:49:14 -07:00
Kenneth Graunke
3da8a8a3d6 iris: Avoid uploading SURFACE_STATE descriptors for UBOs if possible
If we can entirely push uniform data, we don't need a SURFACE_STATE
descriptor for pulling data.  Since constant uploads are a very common
operation, and being able to push all data is also very common, we would
like to avoid the overhead in this case.

This patch defers uploading new descriptors.  Instead of handling that
at iris_set_constant_buffer, we do it at iris_update_compiled_shaders,
where we can see the currently bound shader variants.  If any need pull
descriptors, and descriptors are missing, we update them and flag that
the binding table also needs to be refreshed.

Improves performance in GFXBench5 gl_driver2 on an i7-6770HQ by
31.9774% +/- 1.12947% (n=15).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 15:44:22 -07:00
Kenneth Graunke
0e4a75f917 intel/compiler: Record whether any pull constant loads occur
I would like for iris to be able to avoid setting up SURFACE_STATE
for UBOs in the common case where all constants are pushed.

Unfortunately, we don't know up front whether everything will be
pushed: the backend is allowed to demote pushed UBOs to pull loads
fairly late in the process.  This is probably desirable though, as
we'd like the backend to be able to re-pull pushed data to break up
long live ranges in response to register pressure.

Here we simply add a "are there any pull loads at all" boolean to
prog_data, which is a bit crude but at least allows us to skip work
in the common "everything pushed" case.  We could skip more work by
tracking exactly which UBO surfaces are pulled in a bitmask, but I
wanted to avoid bringing back the old mark_surface_used() mechanism.

Finer-grained tracking could allow us to skip a bit more work when
multiple UBOs are in use and /some/ are 100% pushed, but others are
accessed via pulls.  However, I'm not sure how common this is and
it would save at most 4 pull descriptors, so we defer that for now.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 15:44:22 -07:00
Kenneth Graunke
dd83ef0d1a iris: Track per-stage bind history, reduce work accordingly
We now track per-stage bind history for constant and shader buffers,
shader images, and sampler views by adding an extra res->bind_stages
field to go with res->bind_history.

This lets us flag IRIS_DIRTY_CONSTANTS for only the specific stages
involved, and also skip some CPU overhead in iris_rebind_buffer.

Cuts 4% of 3DSTATE_CONSTANT_XS packets in a Shadow of Mordor trace
on Icelake.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 15:44:22 -07:00
Kenneth Graunke
1e7daaa6c9 iris: Don't flag IRIS_DIRTY_BINDINGS for constant usage history
The underlying buffer isn't changing - so we don't need to update any
SURFACE_STATE descriptors - we just might have new constants, meaning
we need to re-emit 3DSTATE_CONSTANT_XS.  On Gen9, this means we need
to update 3DSTATE_BINDING_TABLE_POINTERS_XS too, but that's now handled
by the explicit check in the previous patch.

On Gen9, this should cause us to re-emit the binding table /pointer/ on
writing to a buffer with PIPE_BIND_CONSTANT_BUFFER, rather than emitting
a whole new /table/.

On Gen8 and Gen11, this avoids binding table churn altogether.

Cuts 61% of 3DSTATE_BINDING_TABLE_POINTERS_XS packets in a Shadow of
Mordor trace on Icelake.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 15:44:22 -07:00
Kenneth Graunke
e7db3577f8 iris: Explicitly emit 3DSTATE_BTP_XS on Gen9 with DIRTY_CONSTANTS_XS
Right now, we usually flag both IRIS_DIRTY_{CONSTANTS,BINDINGS}_XS,
because we have SURFACE_STATE for constant buffers in case the shaders
access them via pull mode.

But this flagging is overkill in many cases.  Gen8 and Gen11 don't need
it at all.  Gen9 doesn't need that large of a hammer in all cases.

Just handle it explicitly so the right thing happens.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 15:44:22 -07:00
Kenneth Graunke
caa0aebd01 iris: Flag IRIS_DIRTY_BINDINGS_XS on constant buffer rebinds
We upload a new SURFACE_STATE for the UBO/SSBO in question, which
means that we need new binding tables as well.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 15:44:22 -07:00
Bas Nieuwenhuizen
4b7e7956f0 radv: Add DFSM support.
Apparently we already enabled it without having support ...

Not sure if we also need to set disable_start_of_prim when the PS
has memory writes, but this mirrors radeonsi.

Doubles fillrate in my dual_quad_bench from ~16 pixels/cycles to
~32 pixels/cycle on a Raven.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Bas Nieuwenhuizen
0fa2740059 radv: Disable dfsm by default even on Raven.
When actually implementing it, Talos on low is still 3% slower.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Bas Nieuwenhuizen
f2dffb395f radv: Only break batch on framebuffer change with dfsm.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Connor Abbott
57e0bb8ccc nir/opt_if: Fix undef handling in opt_split_alu_of_phi()
The pass assumed that "Most ALU ops produce an undefined result if any
source is undef" which is completely untrue. Due to how we lower if
statements to selects and then optimize on those selects later, we
simply cannot make that assumption. In particular this pass tried to
replace an ior of undef and true, which had been generated by
optimizing a select which itself came from flattening an if statement,
to undef causing a miscompilation for a CTS test with radeonsi NIR.

We fix this by always doing what the non-undef path did, i.e. duplicate
the instruction twice. If there are cases where the instruction before
the loop can be folded away due to having an undef source, we should add
these to opt_undef instead.

The comment above the pass says that if the phi source from before the
loop is undef, and we can fold the instruction before the loop to undef,
then we can ignore sources of the original instruction that don't
dominate the block before the loop because we don't need them to create
the instruction before the loop. This is incorrect, because the
instruction at the bottom of the loop would get those sources from the
wrong loop iteration. The code never actually did what the comment said,
so we only have to update the comment to match what the pass actually
does. We also update the example to more closely match what most actual
loops look like after vtn and peephole_select.

There are no shader-db changes with i965, radeonsi NIR, or radv. With
anv and my vkpipeline-db there's only one change:

total instructions in shared programs: 14125290 -> 14125300 (<.01%)
instructions in affected programs: 2598 -> 2608 (0.38%)
helped: 0
HURT: 1

total cycles in shared programs: 2051473437 -> 2051473397 (<.01%)
cycles in affected programs: 36697 -> 36657 (-0.11%)
helped: 1
HURT: 0

Fixes
KHR-GL45.shader_subroutine.control_flow_and_returned_subroutine_values_used_as_subroutine_input
with radeonsi NIR.
2019-09-18 17:18:34 -04:00
Eric Engestrom
a1de3011f3 gl: drop incorrect pkg-config file for glvnd
Akin to 1a25980c46 ("egl: drop incorrect pkg-config file for
glvnd") and b01524fff0 ("meson: don't build libGLES*.so with
GLVND") , removes a pkg-config file that shouldn't have been there in
the first place, but was needed because of that GLVND bug.

Now that the glvnd bug has been fixed, it was apparent that this gl.pc
pkg-config file was forgotten to be removed, so let's do just that :)

Suggested-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-18 22:16:51 +01:00
Andres Gomez
66f2aa6ccd docs: Add the maximum implemented Vulkan API version in 19.3 rel notes
Currently, Vulkan 1.1.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-19 00:03:55 +03:00
Andres Gomez
41b0e0d7e0 docs: Add the maximum implemented Vulkan API version in 19.2 rel notes
Currently, Vulkan 1.1.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-19 00:03:50 +03:00
Andres Gomez
d2db43fcad docs: Add the maximum implemented Vulkan API version in 19.1 rel notes
Currently, Vulkan 1.1.

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-19 00:03:39 +03:00
Andres Gomez
d9760f8935 nir/opcodes: Clear variable names confusion
Having Python and C variables sharing name in the same block of code
makes its understanding a bit confusing. Make it explicit that the
Python bit_size variable refers to the destination bit size.

Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 23:59:07 +03:00
Rhys Perry
b3f71685d9 radv: never kill a NGG GS shader
Seems to fix a hang with excessive vertex emissions when NGG is used for
GS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 19:26:58 +00:00
Samuel Pitoiset
99c186fbbe radv/gfx10: fix VK_KHR_pipeline_executable_properties with NGG GS
No GS copy shader if a pipeline enables NGG GS.

This fixes
dEQP-VK.pipeline.executable_properties.graphics.*geometry_stage*.

Fixes: 86864eedd2 ("radv: Implement radv_GetPipelineExecutablePropertiesKHR.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 21:19:28 +02:00
Marek Olšák
fe7aa271a9 radeonsi: include drm_fourcc.h to fix the build 2019-09-18 14:52:25 -04:00
Marek Olšák
00e29816e7 radeonsi: implement pipe_screen::resource_get_param
v2: return DRM_FORMAT_MOD_INVALID from the function

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2019-09-18 14:43:01 -04:00
Marek Olšák
d307aa56f9 gallium: extend resource_get_param to be as capable as resource_get_handle
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-18 14:41:30 -04:00
Marek Olšák
aae35fbd3a ac: move ac_get_num_physical_vgprs into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
0692ae34e9 ac: move ac_get_num_physical_sgprs into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
ca43006fd2 ac: move ac_get_max_wave64_per_simd into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
deab3a23f6 ac: move num_sdp_interfaces into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
2c62b461e9 ac: move PBB MAX_ALLOC_COUNT into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Jonathan Marek
05da025f35 etnaviv: fix two-sided stencil
* Set missing STENCIL_CONFIG_EXT2 bits
* Swap stencil sides when rendering CCW

Fixes following deqp tests (which were 99% failing):
dEQP-GLES2.functional.fragment_ops.depth_stencil.*

Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-18 12:49:10 -04:00
Samuel Pitoiset
68820007fd radv: fix loading 64-bit GS inputs
We have to load 2 32-bit integer and to cast correctly.

This fixes crashes with gs-double-interpolator.vk_shader_test.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111734
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 17:16:36 +02:00
Bas Nieuwenhuizen
7999e10cab tu: Set up glsl types.
Addresses this assert:

deqp-vk: ../mesa-freedreno-9999/src/compiler/glsl_types.cpp:1244: static const glsl_type *glsl_type::get_interface_instance(const glsl_struct_field *, unsigned int, enum glsl_interface_packing, bool, const char *): Assertion `glsl_type_users > 0' failed.

running dEQP-VK.api.smoke.triangle .

Fixes: 624789e370 "compiler/glsl: handle case where we have multiple users for types"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-18 16:51:18 +02:00
Andres Gomez
f833b4cada docs: Update to OpenGL 4.6 in the release notes
After 41549a18e6 ("i965: Enable OpenGL 4.6 for Gen8+"), Mesa
implements the OpenGL 4.6 API.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-09-18 12:28:05 +00:00
Erik Faye-Lund
ea74b1b9aa .mailmap: add an alias for Eric Engestrom
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-09-18 14:05:05 +02:00
Erik Faye-Lund
ed91eacf71 .mailmap: add an alias for Michel Dänzer
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-18 14:04:40 +02:00
Samuel Pitoiset
46b7512b0a radv: fix writing depth/stencil clear values to image
Use the fastest way only if both aspects are used. Oops.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111728
Fixes: 218ce34962 ("radv: add mipmap support for the clear depth/stencil values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 13:27:46 +02:00
Michel Dänzer
88e5796daa gitlab-ci: Merge scons-nollvm and scons-llvm jobs
The new job tests scons without LLVM and with all LLVM versions >= 6.0.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer
baa5024e24 gitlab-ci: Test scons with all LLVM versions
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer
0374aacac0 gitlab-ci: Move scons build/test commands to a separate shell script
Preparatory, no functional change intended.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer
8a8388ca67 gitlab-ci: Use crossbuild-essential-* packages
They are convenience packages which pull in everything needed for
cross-building via dependencies.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer
a01230e73a gitlab-ci: Use newer packages from backports by default
This is needed in particular to get a recent enough version of meson in
the stretch image, but should be generally beneficial.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer
8a19992869 gitlab-ci: Create separate docker images for Debian stretch & buster
Pros:
* Less fragile due to not mixing packages from stretch and buster
* No longer need to use third-party LLVM packages
* The buster image now uses GCC 8 for C++ as well (previously 6 for C++,
  8 for C), allowing to drop some hacks

Con:
* The stretch image now only uses GCC 6 for C as well as C++
* Need separate jobs for testing old LLVM versions

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer
26fcc8baba gitlab-ci: Pass --no-remove to apt-get where possible
If installing new packages would require removing previously installed
ones, this flag causes apt-get to abort with an error instead,
preventing later obscure failures due to the missing packages.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Michel Dänzer
2259b45174 gitlab-ci: Reference full ci-templates commit hash
8 digits might become ambiguous at some point.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-18 10:36:48 +00:00
Haihao Xiang
8a9b81ab9d i965: support AYUV/XYUV for external import only
Fixes: 89785e2d56 ("i965: add support for sampling from AYUV")
Fixes: 7cab8d3661 ("i965: Add support for sampling from XYUV images")
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-18 12:07:23 +03:00
Boris Brezillon
1e483a87bc panfrost: Allocate tiler and scratchpad BOs per-batch
If we want to execute several batches in parallel they need to have
their own tiler and scratchpad BOs. Let move those objects to
panfrost_batch and allocate them on a per-batch basis.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:40:17 +02:00
Boris Brezillon
0eec73a800 panfrost: Add FBO BOs to batch->bos earlier
If we want the batch dependency tracking to work correctly we must
make sure all BOs are added to the batch->bos set early enough. Adding
FBO BOs when generating the fragment job is clearly to late. Add a
panfrost_batch_add_fbo_bos helper and call it in the clear/draw path.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:37:56 +02:00
Boris Brezillon
5a4d095f9b panfrost: Add the panfrost_batch_create_bo() helper
This helper automates the panfrost_bo_create()+panfrost_batch_add_bo()+
panfrost_bo_unreference() sequence that's done for all per-batch BOs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:37:31 +02:00
Boris Brezillon
9af4aeaaf7 panfrost: Don't return imported/exported BOs to the cache
We don't know who else is using the BO in that case, and thus shouldn't
re-use it for something else.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:52 +02:00
Boris Brezillon
90b8934547 panfrost: Add panfrost_bo_{alloc,free}()
Thanks to that we avoid the recursive call into panfrost_bo_create()
and we can get rid of panfrost_bo_release() by inlining the code in
panfrost_bo_unreference().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:29 +02:00
Boris Brezillon
cb71ae5572 panfrost: Stop using panfrost_bo_release() outside of pan_bo.c
panfrost_bo_unreference() should be used instead.

The only difference caused by this change is that the scratchpad,
tiler_heap and tiler_dummy BOs are now returned to the cache instead
of being freed when a context is destroyed. This is only a problem if
we care about context isolation, which apparently is not the case since
transient BOs are already returned to the per-FD cache (and all contexts
share the same address space anyway, so enforcing context isolation
is almost impossible).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:06 +02:00
Boris Brezillon
e15ab939fd panfrost: Stop passing screen around for BO operations
Store a screen pointer in panfrost_bo so we don't have to pass a screen
object to all functions manipulating the BO.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:34:27 +02:00
Boris Brezillon
10ce751726 panfrost: Don't check if BO is mmaped before calling panfrost_bo_mmap()
panfrost_bo_mmap() already takes care of that.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:34:08 +02:00
Boris Brezillon
a06e08def9 panfrost: Stop exposing panfrost_bo_cache_{fetch,put}()
They are not expected to be called directly, users should use
panfrost_bo_{create,release}() instead.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:33:51 +02:00
Boris Brezillon
154cb725d4 panfrost: Move the BO API to its own header
Right now, the BO API is spread over pan_{allocate,resource,screen}.h.
Let's move all BO related definitions to a separate header file.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:29:13 +02:00
Boris Brezillon
34efaafc93 panfrost: s/PAN_ALLOCATE_/PAN_BO_/
Change the prefix for BO allocation flags to make it consistent with
the rest of the BO API.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:55 +02:00
Boris Brezillon
29d0e5c177 panfrost: Move panfrost_bo_{reference,unreference}() to pan_bo.c
This way we have all BO related functions placed in the same source
file.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:39 +02:00
Boris Brezillon
0500c9e514 panfrost: Get rid of pan_drm.c
pan_drm.c was only meaningful when we were supporting 2 kernel drivers
(mali_kbase, and the drm one). Now that there's now kernel-driver
abstraction we're better off moving those functions were they belong:

* BO related functions in pan_bo.c
* fence related functions + query_gpu_version() in pan_screen.c
* submit related functions in pan_job.c

While at it, we rename the functions according to the place they're
being moved to.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:22 +02:00
Boris Brezillon
1e47c3ee7b panfrost: Stop passing has_draws to panfrost_drm_submit_vs_fs_batch()
has_draws can be inferred directly from the batch->last_job value, no
need to pass it around.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:03 +02:00
Boris Brezillon
07085fe8a4 panfrost: Kill a useless memset(0) in panfrost_create_context()
ctx is allocated with rzalloc() which takes care of zero-ing the memory
region. No need to call memset(0) on top.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:47 +02:00
Boris Brezillon
4eac1b2008 panfrost: Add polygon_list to the batch BO set at allocation time
That's what we do for other per-batch BOs, and we'll soon add an helper
to automate this create_bo()+add_bo()+bo_unreference() sequence, so
let's prepare the code to ease this transition.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:30 +02:00
Boris Brezillon
c16fb1f48d panfrost: Add missing panfrost_batch_add_bo() calls
Some BOs are used by batches but never explicitly added to the BO set.
This is currently not a problem because we wait for the execution of
a batch to be finished before releasing a BO, but we will soon relax
this rule.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:09 +02:00
Boris Brezillon
a94d028065 panfrost: Use the correct type for the bo_handle array
The DRM driver expects an array of u32, let's use the correct type, even
if using an int works in practice because it's still a 32-bit integer.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:26:49 +02:00
Boris Brezillon
2b771b8424 panfrost: Stop exposing internal panfrost_*_batch() functions
panfrost_{create,free,get}_batch() are only called inside pan_job.c.
Let's make them static.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:26:21 +02:00
Christian Gmeiner
8d5f905faa etnaviv: disable ARB_shadow
Looks like only HALT2 GPUs have support for it but that is not yet
implemented so disable ARB_shadow for now.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 06:47:26 +02:00
Christian Gmeiner
dcc0e23438 Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"
There are GPUs that do not support this feature.

This reverts commit e871abe452

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 06:47:21 +02:00
Lepton Wu
417d602fda virgl: Remove wrong EAGAIN handling for drmIoctl
drmIoctl handles EAGAIN itself and actually it always return -1 on errors.
Remove the wrong handling of its return value. Also, print a warning when
it fails.

v2: - use _debug_printf instead of fprintf (Gurchetan Singh)

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2019-09-18 03:36:10 +00:00
Kenneth Graunke
f8c44e4ed7 iris: Skip allocating a null surface when there are 0 color regions.
The compiler now sets the "Null Render Target" bit in the RT write
extended message descriptor, causing it to write to an implicit null
surface without us needing to set one up in the binding table.

Together with the last patch, this improves performance in Car Chase on
an Icelake 8x8 (locked to 700Mhz) by 0.0445526% +/- 0.0132736% (n=832).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 14:27:51 -07:00
Kenneth Graunke
f76a724e06 intel/compiler: Set "Null Render Target" ex_desc bit on Gen11
When there are no color regions (i.e. a depth only pass), we can set
the "Null Render Target" bit in the Gen11 RT write extended message
descriptor to indicate that it should behave as if it's writing to a
null render target, without the need for a binding table entry.

This lets drivers avoid setting up that null RT binding table entry,
but more importantly means the HW doesn't actually have to bother
looking up the surface state.

Together with the next patch, this improves performance in Car Chase on
an Icelake 8x8 (locked to 700Mhz) by 0.0445526% +/- 0.0132736% (n=832).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 14:27:51 -07:00
Samuel Iglesias Gonsálvez
f55f7b199d docs/relnotes: add support for VK_KHR_shader_float_controls on Intel
v2:
- Move to 19.2.0 release notes (Andres).

v3:
- Move to 19.3.0 release notes (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
f5dd6dfe01 anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls
This adds support for
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR and
enables de Vulkan and SPIR-V extensions.

Also, notice that this includes the updates applied to the
VkPhysicalDeviceFloatControlsPropertiesKHR structure in the extension
VK_KHR_shader_float_controls v4 and Vulkan 1.1.116.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
9b07020a4f i965/fs: add support for shader float control to remove_extra_rounding_modes()
The remove_extra_rounding_modes() optimization will remove duplicated
rounding mode changes.

v2:
- Fix bug in the rounding mode change (Alejandro).

v3:
- Fix rounding modes.

v4:
- Updated to renamed shader info member and enum values (Andres).

v5:
- Simplify flags logic operations (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
9bd88d10d8 i965/fs: set rounding mode when emitting nir_op_f2f32 or nir_op_f2f16
v2:
- Consider nir_op_f2f16 case too (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
ba1e25e1aa i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions
v2:
- Updated to renamed shader info member (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
9da56ffc52 i965/fs: add emit_shader_float_controls_execution_mode() and aux functions
We need this function to emit code that setups the control register
later with the defined execution mode for the shader. Therefore, we
emit it as the first instruction.

v2:
- Fix bug in setting the default mode mask in brw_rnd_mode_from_nir().
- Fix support for rounding modes in brw_rnd_mode_from_nir().

v3:
- Updated to renamed shader info member and enum values (Andres).

v4:
- Add actual emission as first instruction of emit_nir_code (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
8a6507b6fe i965/fs/generator: add new opcode to set float controls modes in control register
Before this commit, we had only FPRoundingMode decoration (the per
instruction one) that is applied during the SPIR-V handling. In
vtn_alu we find out the rounding mode, and generate the code
accordingly that later will be used to look for the respective
nir_op_f2f16_{rtz,rtne}.

Per-instruction gets prioritized because we make them explicit
conversions (with RTZ or RTNE nir opcodes) and they will override the
default execution mode defined with float controls. However, we need
to come back to the mode defined by float controls after the execution
of the FP Rounding instruction.

Therefore, the new SHADER_OPCODE_FLOAT_CONTROL_MODE opcode will be
used to set the default rounding mode and denorms treatment in the
whole shader while the pre-existent SHADER_OPCODE_RND_MODE, will be
used as prioritized rounding mode in a per-instruction basis.

v2:
- Fix bug in defining BRW_CR0_FP_MODE_MASK.

v3:
- Update comment (Caio).

v4:
- Split the patch into the helper and the new opcode (this
  one) (Caio).

v5:
- Add an explanation on the actual purpose and priority of the newly
  introduced opcode in the commit log (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
28da9558f5 i965/fs/generator: refactor rounding mode helper in preparation for float controls
v2:
- Fix bug in defining BRW_CR0_FP_MODE_MASK.

v3:
- Update comment (Caio).

v4:
- Split the patch into the helper (this one) and the new
  opcode (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
cdace5b0c6 i965/fs/nir: add nir_op_unpack_half_2x16_split_*_flush_to_zero
The denorm mode is set in the control register, no need to do
something else.

v2:
- Add an assert to make sure that we realize if this assumption is
  broken in the future (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
3c474f8513 intel/nir: do not apply the fsin and fcos trig workarounds for consts
If we have fsin or fcos trigonometric operations with constant values
as inputs, we will multiply the result by 0.99997 in
brw_nir_apply_trig_workarounds, making the result wrong.

Adjusting the rules so they do not apply to const values we let a
later constant fold to deal with it.

v2:
- Do not early constant fold but only apply the trig workaround for
  non constants (Caio).
- Add fixes tag to commit log (Caio).

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
dba4d0a319 nir: fix fmin/fmax support for doubles
Until now, it was using the floating point version of fmin/fmax,
instead of the double version.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
2abc6299bf nir: fix denorm flush-to-zero in sqrt's lowering at nir_lower_double_ops
v2:
- Replace hard coded value with DBL_MIN (Connor).

v3:
- Have into account the FLOAT_CONTROLS_DENORM_PRESERVE_FP64
  flag (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
1e0e3ed15a nir: fix denorms in unpack_half_1x16()
According to VK_KHR_shader_float_controls:

"Denormalized values obtained via unpacking an integer into a vector
 of values with smaller bit width and interpreting those values as
 floating-point numbers must: be flushed to zero, unless the entry
 point is declared with the code:DenormPreserve execution mode."

v2:
- Add nir_op_unpack_half_2x16_flush_to_zero opcode (Connor).

v3:
- Adapt to use the new NIR lowering framework (Andres).

v4:
- Updated to renamed shader info member and enum values (Andres).

v5:
- Simplify flags logic operations (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
f097247dd8 nir/algebraic: disable inexact optimizations depending on float controls execution mode
If FLOAT_CONTROLS_SIGNED_ZERO_INF_NAN_PRESERVE or
FLOAT_CONTROLS_DENORM_FLUSH_TO_ZERO are enabled, do not apply the
inexact optimizations so the VK_KHR_shader_float_controls execution
mode is respected.

v2:
- Do not apply inexact optimizations if SHADER_DENORM_FLUSH_TO_ZERO is
  enabled (Andres).

v3:
- Updated to renamed shader info member (Andres).

v4:
- Directly access execution mode instead of dragging it by parameter (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1]
2019-09-17 23:39:18 +03:00
Andres Gomez
3f782cdd25 nir/algebraic: mark float optimizations returning one parameter as inexact
With the arrival of VK_KHR_shader_float_controls algebraic
optimizations for float types of the form (('fop', a, b), a) become
inexact depending on the execution mode.

For example, if we have activated SHADER_DENORM_FLUSH_TO_ZERO, in case
of a denorm value for the "a" parameter, we cannot return it still as
a denorm, it needs to be flushed to zero. Therefore, we mark now all
those operations as inexact.

Suggested-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
5e22f3e29a nir/constant_expressions: mind rounding mode converting from float to float16 destinations
v2:
- Move the op-code specific knowledge to nir_opcodes.py even if it
  means a rount trip conversion (Connor).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
ef681cf971 nir/opcodes: make sure f2f16_rtz and f2f16_rtne behavior is not overriden by the float controls execution mode
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
7580707345 nir: mind rounding mode on fadd, fsub, fmul and fma opcodes
According to Vulkan spec, the new execution modes affect only
correctly rounded SPIR-V instructions, which includes fadd, fsub and
fmul.

v2:
- Fix fmul, fsub and fadd round-to-zero definitions, they should use
  auxiliary functions to calculate the proper value because Mesa uses
  round-to-nearest-even rounding mode by default (Connor).

v3:
- Do an actual fused multiply-add at ffma (Connor).

v4:
- Simplify fadd and fmul for bit sizes < 64 (Connor).
- Do not use double ffma for 32 bits float (Connor).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3]
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
0ac07c7ca7 nir: add support for round to zero rounding mode to nir_op_f2f32
f2f16's rounding modes are already handled and f2f64 don't need it
as there is not a floating point type with higher bit size than 64 for
now.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
5308333e78 util: add fp64 -> fp32 conversion support for RTNE and RTZ rounding modes
In order to be coherent with the pre-existent API for half floats,
this new API for double is the one meant to be used when doing double
to float conversions. It is no more than a wrapper for the softfloat.h
API but we meant to keep that one private.

v2:
- Fix bug in _mesa_double_to_float_rtz() in the inf/nan detection
  using the exponent value.

v3:
- Replace custom f64 -> f32 implementations with the softfloat
  one (Andres).

v4:
- Added API usage clarifying comments (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
733ede8ff6 util: add float to float16 conversions with RTZ and RTNE
In order to be coherent with the pre-existent functions, this new API
is the one meant to be used when doing half float to float
conversions. It is no more than a wrapper for the softfloat.h API but
we meant to keep that one private.

v2:
- Replace custom f32 -> f16 RTZ implementation with the softfloat
  one (Andres).

v3:
- Added API usage clarifying comments (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
153c714f2a util: add softfloat functions to operate with doubles and floats
Implemented fadd, fsub, fmul and ffma for doubles and ffma for floats,
rounding to zero, using a modified implementation from Berkely
Softfloat 3e Library.

Their implementation correctness has been checked with the Berkeley
TestFloat Release 3e tool for x86_64.

v2:
- Reuse util_last_bit64() in _mesa_count_leading_zeros64()
  implementation (Connor).

v3:
- Add a specific ffma for floats version (Connor).
- Implement the ffma for doubles version (Andres).
- Lots of fixes in fadd, fsub and fmul (Andres).
- Improved documentation (Andres).

v4:
- Added f64 -> f32 conversion function (Andres).
- Added f32 -> f16 RTZ conversion function (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Tested-by: Andres Gomez <agomez@igalia.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
f7d73db353 nir: add support for flushing to zero denorm constants
v2:
- Refactor conditions and shared function (Connor).
- Move code to nir_eval_const_opcode() (Connor).
- Don't flush to zero on fquantize2f16
  From Vulkan spec, VK_KHR_shader_float_controls section:

  "3) Do denorm and rounding mode controls apply to OpSpecConstantOp?

  RESOLVED: Yes, except when the opcode is OpQuantizeToF16."

v3:
- Fix bit size (Connor).
- Fix execution mode on nir_loop_analize (Connor).

v4:
- Adapt after API changes to nir_eval_const_opcode (Andres).

v5:
- Simplify constant_denorm_flush_to_zero (Caio).

v6:
- Adapt after API changes and to use the new constant
  constructors (Andres).
- Replace MAYBE_UNUSED with UNUSED as the first is going
  away (Andres).

v7:
- Adapt to newly added calls (Andres).
- Simplified the auxiliary to flush denorms to zero (Caio).
- Updated to renamed supported capabilities member (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v4]
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
45668a8be1 nir: add auxiliary functions to detect if a mode is enabled
v2:
- Added more functions.

v3:
- Simplify most of the functions (Caio).

v4:
- Updated to renamed enum values (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
84781e1f1d spirv/nir: keep track of SPV_KHR_float_controls execution modes
v2:
- Add support for rounding modes for each floating point bit size.

v3:
- Commit e68871f6a4 ("spirv: Handle constants and types before
  execution modes") changed when the execution modes are handled,
  which affects the result of the floating point constants when the
  rounding mode is set in the execution mode. Moved the handling of
  the rounding modes before we handle the constants.

v4:
- Rename vtn_decoration "literals" to "operands" (Andres).
- Simplify execution mode parsing util function (Caio).
- Extend the comment about the timing of the handling of the rounding
  modes (Caio).

v5:
- Correct extension name (Caio).
- Rename shader info member (Andres).
- Rename float controls enum (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3]
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez
420ad0a1a3 spirv: check support for SPV_KHR_float_controls capabilities
v2:
- Correct extension name (Caio).
- Rename supported capabilities member (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1]
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:18 +03:00
Adam Jackson
320c36ed3a gallium/xlib: Fix glXMakeCurrent(dpy, None, None, ctx)
This is entirely legal in GL 3.0+. I wonder how many more times I'll
need to fix this specific bug.
2019-09-17 20:16:00 +00:00
Adam Jackson
a693f98e17 gallium/xlib: Remove MakeCurrent_PrevContext
As the comment notes, this is not thread-safe. You can just as easily
use GetCurrentContext instead, so, do that.
2019-09-17 20:16:00 +00:00
Adam Jackson
db8be355d1 gallium/xlib: Remove drawable caching from the MakeCurrent path
AFAICT this only exists to avoid hitting XMesaFindBuffer, which is a
linear search. But you don't have that many GLX drawables, so whatever.
2019-09-17 20:16:00 +00:00
Marek Olšák
83f195414a radeonsi: add Navi12 PCI ID
trivial and urgent

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-17 16:00:33 -04:00
Adam Jackson
6ec1259423 ci: Run tests on i386 cross builds
Yes, some tests fail, but we can turn those into XFAILs at meson time.
Better to keep the things that work working than not cover them at all.
Unfortunately XPASS results will not cause the build to fail until we
update CI to meson 0.51 or newer.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-09-17 14:53:57 -04:00
Jon Turney
dd1dba80b9 Fix timespec_from_nsec test for 32-bit time_t
Since struct timespec's tv_sec member is of type time_t, adjust the
expected value to allow for the truncation which will occur with 32-bit
time_t.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-09-17 12:17:53 -04:00
Tapani Pälli
631255387f iris: close screen fd on iris_destroy_screen
Otherwise it never gets closed, this fixes errors seen with deqp-egl
where we end up opening 1024 files.

Fixes: 2dce0e94 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-17 14:46:45 +03:00
Juan A. Suarez Romero
34d51f931b docs: update calendar, add news item and link release notes for 19.1.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-17 12:55:41 +02:00
Juan A. Suarez Romero
a216395831 docs: add sha256 checksums for 19.1.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit b9d7244035)
2019-09-17 12:53:24 +02:00
Juan A. Suarez Romero
d9d4c1be62 docs: add release notes for 19.1.7
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit f632aac938)
2019-09-17 12:53:23 +02:00
Michel Dänzer
aed3babef7 ac: Remove DEBUG workaround
As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-17 10:24:29 +00:00
Michel Dänzer
2c278602d8 swr: Limit DEBUG workaround to LLVM < 7
As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-17 10:24:29 +00:00
Michel Dänzer
8218f6e22d gallivm: Limit DEBUG workaround to LLVM < 7
As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-17 10:24:29 +00:00
Erik Faye-Lund
26351f1ee3 st/mesa: remove always-true expression
In case the GLSL version is 130 or higher, we've already enabled
ARB_shader_bit_encoding a bit earlier in this same function. So this
condition will always be true.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-17 07:26:42 +00:00
Christian Gmeiner
1c34d19f90 etnaviv: a bit of micro-optimization
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-09-17 05:50:37 +00:00
Icenowy Zheng
d61b67b41d lima: reset scissor state if scissor test is disabled
The PLBU seems to preserve scissor state between draws, and since lima doesn't
emit PLBU_CMD_SCISSORS() if scissor test is disabled, it uses state from previous draw.

Fix it by emitting PLBU_CMD_SCISSORS() for full fb if scissor test is disabled.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-09-17 04:13:24 +00:00
Jason Ekstrand
533987b5f4 vulkan: Update the XML and headers to 1.1.123
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 04:11:05 +00:00
Caio Marcelo de Oliveira Filho
9cf1bfcdd7 spirv: Handle ShaderLayer and ShaderViewportIndex capabilities
SPIR-V 1.5 incorported the SPV_EXT_shader_viewport_index_layer but
splitting into the two capabilities above.  Just handle them as we
support the extension already.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-16 19:18:01 -07:00
Caio Marcelo de Oliveira Filho
f6392e38d8 spirv: Update JSON and headers to 1.5
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-16 19:17:26 -07:00
Eric Anholt
eea6f21cbd freedreno: Fix invalid read when a block has no instructions.
We can't deref list_(first/last)_entries unless we know we have at least
one.  Instead, just use our IP we've been tracking as we go to set up the
start ip, and fill in the end IP as we walk instructions.

Fixes a complaint in valgrind on
dEQP-GLES3.functional.transform_feedback.* which sometimes has an
empty main (non-END) block when the VS inputs are just directly mapped
to outputs without any ALU ops.

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-16 22:02:43 +00:00
Kenneth Graunke
d9d6305b80 st/mesa: Increase GL_POINT_SIZE_RANGE minimum to 1.0
Table 23.54 of the OpenGL 4.5 spec lists the minimum values for
GL_POINT_SIZE_RANGE as [1, 1].  So zero is not allowed (even though
arguably this could be useful for MSAA rendering, where a sub-1px
point might cover only some samples...)

This fixes the WebGL 2.0 conformance suite's state.gl-get-calls test
on Chromium on Linux, which uses desktop OpenGL.  The test checks that
the minimum value of GL_ALIASED_POINT_SIZE_RANGE is 1.  Unfortunately,
that query doesn't exist in desktop GL, so it checks POINT_SIZE_RANGE,
which is the anti-aliased value.  There's not really anything better
for Chromium to do here, unfortunately.  When running Chromium with
--api=es3, it maps it to the correct query and the test already works.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-16 13:40:41 -07:00
Kenneth Graunke
fbad42bbb9 st/mesa: Prefer 5551 formats for GL_UNSIGNED_SHORT_5_5_5_1.
Previously, internalformat GL_RGBA and type GL_UNSIGNED_SHORT_5_5_5_1
was promoted to RGBA8888 as the table entry with the 5551 formats
is listed below the 8888 entry, and it also doesn't have GL_RGBA as
a possible internalformat.

Using actual 5551 fixes the following dEQP-EGL test:
- dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-16 13:17:23 -07:00
Rhys Perry
ffabcbba60 radv: always emit a position export in gs copy shaders
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8d0337299 ('radv: add multiple streams support for the GS copy shader')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-16 19:42:30 +00:00
Rhys Perry
0f29c9df31 radv: keep GS threads with excessive emissions which could write to memory
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-16 19:42:30 +00:00
Lionel Landwerlin
dcf13fbac9 drirc: include unreal engine version 0 to 23
This was meant to include up to version 23.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0616b7ac90 ("vulkan: add vk_x11_strict_image_count option")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-16 21:47:21 +03:00
Lionel Landwerlin
10206ba17b util/xmlconfig: fix regexp compile failure check
This is embarrasing...

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 04dc6074cf ("driconfig: add a new engine name/version parameter")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-16 21:47:21 +03:00
Erik Faye-Lund
9c57b54994 gallium/gdi: use GALLIUM_FOO rather than HAVE_FOO
This matches what other targets do, and makes it easier to port to
meson.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-16 17:54:00 +00:00
Dylan Baker
9e1f49aae1 scons: Make scons and meson agree about path to glapi generated headers
Currently scons puts them in src/mapi/glapi, meosn puts them in
src/mapi/glapi/gen. This results in some things being compilable only by
one or the other, put them in the same places so that everyone is happy.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-16 17:54:00 +00:00
Vasily Khoruzhick
ca5782f0ee lima: add standalone disassembler with primitive MBS parser
It's useful for analyzing shader binaries produced by ARM mali offline
compiler which outputs files in MBS format. MBS is mali binary shader,
currently parser just extracts shader binary and ignores everything else.

Reviewed-and-tested-by: Connor Abbott<cwabbott0@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-16 09:29:55 -07:00
Heinrich Fink
df8602f4b5 mesa/gl: Sync with Khronos registry
Update GL headers and xml API from upstream Khronos registry (commit
3d0c3eb). Keep `BUILDING_MESA` quirk in glext.h.

mesa/extensions: Expose EXT_EGL_sync instead of MESA_EGL_sync to reflect
Khronos request of changing this extension's scope from MESA to EXT.
EGL_EGL_sync is also the name of the extension that has been merged into
the upstream Khronos GL registry.

Remove MESA_EGL_sync spec txt from Mesa tree as it is now published as
EXT by Khronos.

v1: Remove MESA_EGL_sync spec and squash commits (Eric E)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-09-16 16:50:43 +01:00
Sergii Romantsov
2bfcf04345 nir/large_constants: pass after lowering copy_deref
v2: by J.Ekstrand suggestion moved lowering of large
    constants after lowering of copy_deref is done.

CC: Jason Ekstrand <jason@jlekstrand.net>
CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111450
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
2019-09-16 11:23:48 +00:00
Michel Dänzer
e536446b60 gitlab-ci: Move up meson-arm64 job definition
This might allow the arm64 tests to start running earlier.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-16 12:51:35 +02:00
Michel Dänzer
cccb68b407 gitlab-ci: Move dependencies/needs for meson-main job to .deqp-test
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-16 12:51:35 +02:00
Michel Dänzer
128581d0d8 gitlab-ci: Simplify some job definitions by extending more similar jobs
v2:
* Preserve setting NIR_VALIDATE=0 for all arm64_* jobs
* Preserve setting DEQP_SKIPS=deqp-default-skips.txt for
  arm64_a306_gles2 jobs

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-16 12:51:34 +02:00
Michel Dänzer
e426f40097 gitlab-ci: Use multiple inheritance instead of YAML references
Support for multiple inheritance was added to GitLab recently.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-16 12:51:34 +02:00
Michel Dänzer
0173e9b1ca gitlab-ci: Add needs stanza to arm64_a306_gles2 job definition
This allows the arm64_a306_gles2 jobs to run as soon as the meson-arm64
job has finished.

Fixes: 6f0dc087b7 "freedreno: Introduce gitlab-based CI."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-16 12:51:34 +02:00
Timothy Arceri
741cff91d3 radeonsi/nir: fix number of used samplers
Commit f3e978db incorrectly assumed the maximum number of
samplers was equal to the max number of defined samplers
e.g. where bindings skip slots.

This fixes an assert in si_nir_load_sampler_desc() for an
enemy territory quake wars shader. And fixes potential bugs with
incorrect bounds limiting in the same code for production builds
of mesa.

Fixes: f3e978db ("radeonsi/nir: Remove uniform variable scanning")

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-16 10:14:48 +00:00
Samuel Pitoiset
c5010e72b6 radv/gfx10: disable unsupported transform feedback features for NGG
Mostly multiple streams and queries which have to be fixed/implemented.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
d0fd82b502 radv/gfx10: implement NGG streamout
It's still disabled by default because transform feedback randomly
hangs and it seems like it's related to GDS (cf. RadeonSI).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
63b20fb0cf radv/gfx10: make sure to wait for idle before clearing GDS
Otherwise the next streamout operation will overwrite GDS. This
can be improved by tracking if there is a streamout operation in
flight. Currently the driver unconditionally flushes but that
doesn't matter much as NGG streamout is disabled by default.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
7314f6ef97 radv/gfx10: make GDS idle when leaving the IB
NGG streamout uses GDS and we have to make sure that another
process isn't going to overwrite GDS while our shaders are busy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
2d89d8f333 radv/gfx10: enable NGG_WAVE_ID_EN for NGG streamout
Otherwise the wave IDs are probably 0 and it hangs. NGG_WAVE_ID_EN
generates wave IDs for GDS OA.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
a72344efa3 radv/gfx10: gather GS output for VS as NGG
For streamout we have to the number of streamout outputs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
b617156621 radv/gfx10: compute the correct buffer size for NGG streamout
It's used to determined the max emit per buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
d81100d307 radv/gfx10: fix unnecessary LDS overallocation for NGG GS
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
c415c58b4a radv/gfx10: adjust the LDS size for VS/TES NGG streamout
It should account for the number of streamout outputs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
67093ed3a3 radv/gfx10: unconditionally declare scratch space for NGG streamout without GS
Streamout outputs are stored in the ESGS ring.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
5ebc76471c radv/gfx10: adjust the GS NGG scratch size for streamout
It needs more space for multiple streams.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
e1dc3ab753 radv/gfx10: allocate GDS/OA buffer objects for NGG streamout
This allocates two BOs for GFX10 NGG streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
957c3436fa radv/gfx10: implement NGG streamout begin/end functions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
a15b3bcf1a radv/gfx10: add an option to switch from legacy to NGG streamout
This internal option is turned off by default because NGG streamout
still hangs. It seems like it's related to GDS as RadeonSI.

That option will be turned on once all issues are resolved.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset
c5a00c3068 radv/winsys: add support for GS and OA domains
For NGG streamout which uses GDS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Danylo Piliaiev
6f5a8617b4 iris: Fix fence leak in iris_fence_flush
Documentation for pipe_context::flush states:
 "NOTE: use screen->fence_reference() (or equivalent) to transfer
  new fence ref to **fence, to ensure that previous fence is unref'd"

Hence we need to unref previous out_fence.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-16 08:47:37 +00:00
Sergii Romantsov
c7b2a2fd36 nir/large_constants: more careful data copying
A filed of nir_variable.location may be equel to -1.
That may cause copying to invalid address of list-node,
making some internal fields corrupted.

Patch fixes segfault during freeing context due to
corrupted address of ralloc_header.destructor.

v2: copy data if var is constant (Connor Abbott)

CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b6d4753568 (nir/large_constants: De-duplicate constants)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111676
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-16 07:58:49 +00:00
Juan A. Suarez Romero
237e6f4fed docs: extend 19.1.x releases
As 19.2 got some delays, let's extend 19.1 at least in one extra
release.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-16 06:51:20 +00:00
Lionel Landwerlin
0616b7ac90 vulkan: add vk_x11_strict_image_count option
This option strictly allocate the minImageCount given by the
application at swapchain creation.

This works around application that do not deal with the fact that the
implementation allocates more images than the minimum specified.

v2: Add values in default drirc (Bas)

v3: specify engine name/version (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-15 15:37:02 +03:00
Lionel Landwerlin
04dc6074cf driconfig: add a new engine name/version parameter
Vulkan applications can register with the following structure :

typedef struct VkApplicationInfo {
    VkStructureType    sType;
    const void*        pNext;
    const char*        pApplicationName;
    uint32_t           applicationVersion;
    const char*        pEngineName;
    uint32_t           engineVersion;
    uint32_t           apiVersion;
} VkApplicationInfo;

This enables the Vulkan implementations to apply workarounds based off
matching this description.

Here we add a new parameter for matching the driconfig options with
the following :

    <device driver="anv">
        <application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42">
            <option name="blaaah" value="true" />
        </application>
    </device>

v2: switch engine name match to use regexps

v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric)

v4: Add missing bit that went to the following commit (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-15 15:37:02 +03:00
Lionel Landwerlin
6d5f11ab34 radv: store engine name
We'll use this later for a new driconfig matching parameter.

v2: Avoid leak in device creation error case (Bas)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-15 15:37:02 +03:00
Christian Gmeiner
9466e4cfab gallium: util_set_vertex_buffers_mask(..): make use of u_bit_consecutive(..)
Also move the clearing of the bits out of if/else.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-14 17:45:47 +00:00
Rob Clark
53a38e3015 gitlab-ci/a630: skip dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8
Seen a couple flakes on this one so far.  Not sure if it is a real
driver problem or not, but skip it to unblock things.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-09-14 10:22:55 -07:00
Lepton Wu
ac175fb168 virgl: replace fprintf with _debug_printf
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-09-14 00:14:41 +00:00
Kenneth Graunke
c9fb704f72 iris: Initialize ice->state.prim_mode to an invalid value
It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we
fail to notice an initial primitive of points being new, and fail at
updating the "primitive is points or lines" field.

We do not need to reset this on device loss because we're tracking
the last primitive mode sent to us on the CPU via draw_vbo, not the
last primitive mode sent to the GPU.

Fixes several tests:
- dEQP-GLES3.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner

Fixes: dcfca0af7c ("iris: Set XY Clipping correctly.")
2019-09-13 16:31:29 -07:00
Eric Anholt
7859eb1390 gitlab-ci: Make the test job fail when bugs are unexpectedly fixed.
If people fix bugs without updating the expected-fails list, then we
end up with a lack of coverage of those failures in the future.  Also,
some day down the line another developer ends up trying to figure out
if the bug was actually fixed or their environment is just failing to
reproduce it.

Suggested-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-13 13:50:56 -07:00
Eric Anholt
1190c81f8e gitlab-ci/a630: Drop the MSAA expected failure.
This hasn't failed for me in ~5 minutes of looping over
dEQP-GLES3.functional.fbo.msaa.*

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-13 13:50:54 -07:00
Eric Anholt
7a0fd10ffc gitlab-ci/a630: Drop remaining dEQP-GLES3.functional.draw.random.* xfails.
These haven't failed for me in ~10 minutes of looping over
draw.random.*.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-13 13:50:01 -07:00
Andreas Baierl
4b1a14fd47 lima/ppir: Add undef handling
Add a ppir dummy node for nir_ssa_undef_instr, create a reg for it and mark
it as undefined, so that regalloc can set it non-interfering to avoid
register pressure.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khozuzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-09-13 19:41:32 +00:00
Andreas Baierl
4ddadd6370 lima/ppir: Rename ppir_op_dummy to ppir_op_undef
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-09-13 19:41:32 +00:00
John Stultz
3976c86e70 Android.mk: Fix missing \ from recent llvm change
Building w/ AOSP, I was hitting the following error:
external/mesa3d/src/amd/Android.common.mk:95: error: missing separator.

Which was due to the changes to mesa-build-with-llvm  missing
a line continuation.

Fixes: 96b592696f
Signed-off-by: John Stultz <john.stultz@linaro.org>
2019-09-13 19:11:10 +00:00
Boris Brezillon
6ddfd37c7e panfrost: Move the batch submission logic to panfrost_batch_submit()
We are about to patch panfrost_flush() to flush all pending batches,
not only the current one. In order to do that, we need to move the
'flush single batch' code to panfrost_batch_submit().

While at it, we get rid of the existing pipelining logic, which is
currently unused and replace it by an unconditional wait at the end of
panfrost_batch_submit(). A new pipeline logic will be introduced later
on.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
2fc91a16ab panfrost: Move the fence creation in panfrost_flush()
panfrost_flush() is about to be reworked to flush all pending batches,
but we want the fence to block on the last one. Let's move the fence
creation logic in panfrost_flush() to prepare for this situation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
835439b84f panfrost: Delay payloads[].offset_start initialization
panfrost_draw_vbo() Might call the primeconvert/without_prim_restart
helpers which will enter the ->draw_vbo() again. Let's delay
payloads[].offset_start initialization so we don't initialize them
twice.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
4166ca92e2 panfrost: Prepare things to avoid flushes on FB switch
panfrost_attach_vt_xxx() functions are now passed a batch, and the
generated FB desc is kept in panfrost_batch so we can switch FBs
without forcing a flush. The postfix->framebuffer field is restored
on the next attach_vt_framebuffer() call if the batch already has an
FB desc.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
e5c7701a0a panfrost: Pass a batch to panfrost_set_value_job()
So we can emit SET_VALUE jobs for a batch that's not currently bound
to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
bc0f6c0b15 panfrost: Use ctx->wallpaper_batch in panfrost_blit_wallpaper()
We'll soon be able to flush a batch that's not currently bound to the
context, which means ctx->pipe_framebuffer will not necessarily be the
FBO targeted by the wallpaper draw. Let's prepare for this case and
use ctx->wallpaper_batch in panfrost_blit_wallpaper().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
aa851a62b9 panfrost: Pass a batch to functions emitting FB descs
So we can emit such jobs to a batch that's not currently bound to the
context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
07a68835a1 panfrost: Pass a batch to panfrost_{allocate,upload}_transient()
We need that if we want to upload transient buffers to a batch that's
not currently bound to the context, which in turn will be needed if we
want to relax the batch serialization we have right now (only flush
batches when we need to: on a flush request, or when one batch depends
on the result of other batches).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
e46d95d51b panfrost: Allow testing if a specific batch is targeting a scanout FB
Rename panfrost_is_scanout() into panfrost_batch_is_scanout(), pass it
a batch instead of a context and move the code to pan_job.c.

With this in place, we can now test if a batch is targeting a scanout
FB even if this batch is not bound to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
40e20324e0 panfrost: Get rid of the unused 'flush jobs accessing res' infra
Will be replaced by something similar but using a BOs as keys instead
of resources.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
1b5873b73c panfrost: Use a pipe_framebuffer_state as the batch key
This way we have all the fb_state information directly attached to a
batch and can pass only the batch to functions emitting CMDs, which is
needed if we want to be able to queue CMDs to a batch that's not
currently bound to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Indrajit Das
92765f85e1 radeon/vcn: exclude raven2 from vcn 2.0 encode initialization
Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2019-09-13 09:18:43 -04:00
Eric Engestrom
a0f8a07308 gitlab-ci: rename stages to something simpler
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-13 13:26:09 +01:00
Boris Brezillon
c9bebae287 panfrost: Rework midgard_pair_load_store() to kill the nested foreach loop
mir_foreach_instr_in_block_safe() is based on list_for_each_entry_safe()
which is designed to protect against removal of the current entry, but
removing the entry placed just after the current one will lead to a
use-after-free situation.

Luckily, the midgard_pair_load_store() logic guarantees that the
instruction being removed (if any) is never placed just after ins which
in turn guarantees that the hidden __next variable always points to a
valid object.
Took me a bit of time to realize that this code was safe, so I'm
suggesting to get rid of the inner mir_foreach_instr_in_block_from()
loop and rework the code so that the removed instruction is always the
current one (which is what the list_for_each_entry_safe() API was
initially designed for).

While at it, we also get rid of the unecessary insert(ins)/remove(ins)
dance by simply moving the instruction around.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-13 12:03:47 +02:00
Boris Brezillon
0e513ccca4 panfrost: Fix a list_assert() in schedule_block()
list_for_each_entry() does not allow modifying the current item pointer.
Let's rework the skip-instructions logic in schedule_block() to not
break this rule.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-13 11:01:40 +02:00
Iago Toral Quiroga
2eace10c62 v3d: fix TF primitive counts for resume without draw
The V3D documentation states that primitive counters are reset when
we emit Tile Binning Mode Configuration items, which we do at the start
of each draw call, however, in the actual hardware this doesn't seem to
take effect when transform feedback is not active (this doesn't happen in
the simulator). This causes a problem in the following scenario:

glBeginTransformFeedback()
   glDrawArrays()
   glPauseTransformFeedback()
   glDrawArrays()
   glResumeTransformFeedback()
glEndTransformFeedback()

The TF pause will trigger a flush of the primitive counters, which results
in a correct number of primitives up to that point. In theory, the counter
should then be reset when we execute the draw after pausing TF, but that
doesn't happen, and since TF is enabled again by the resume command before
we end recording, by the time we end the transform feedback recording we
again check the counters, but instead of reading 0, we read again the same
value we read at the time we paused, incorrectly accumulating that value
again.

In theory, we should be able to avoid this by using the other method to
reset the primitive counters: using operation 1 instead of 0 when we
flush the counts to the buffer at the time we pause, but again, this
doesn't seem to be work and we still see obsolete counts by the time we
end transform feedback.

This patch fixes the problem by not accumulating TF primitive counts
unless we know we have actually queued draw calls during transform
feedback, since that seems to effectively reset the counters. This should
also be more performant, since it saves unnecessary stalls for the
primitive counters to be updated when we know there haven't been any
new primitives drawn.

Fixes CTS tests:
dEQP-GLES3.functional.transform_feedback.*

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-13 06:53:26 +00:00
Iago Toral Quiroga
ded6ea9209 v3d: remove redundant update of queued draw calls
This was updating the counter for the indexed draw path only, but we are
already updating the counter for all paths a bit later, so this is only
duplicating counts for indexed paths.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-13 06:53:26 +00:00
Iago Toral Quiroga
b9a07eed00 v3d: make sure we have enough space in the CL for the primitive counts packet
Fixes: 0f2d1dfe65 ("v3d: use the GPU to record primitives written to transform feedback")

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-13 06:53:26 +00:00
Iago Toral Quiroga
b69f51a5ef v3d: add missing line break for performance debug message
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-13 06:53:26 +00:00
Tomeu Vizoso
bc79e5c437 panfrost/ci: Use releases for Volt dEQP
So we can better correlate different results to versions of the runner.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-13 08:35:36 +02:00
Tomeu Vizoso
c301fc027a panfrost/ci: Update kernel to 5.3-rc8
We haven't updated in a long time, so better do it now and again when
5.3 is released.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-13 08:35:36 +02:00
Tomeu Vizoso
ca4e6637d0 panfrost/ci: Run dEQP with the surfaceless platform
Instead of running it with the Wayland platform, which introduces
unwanted dependencies and complexity.

Makes tests run 30% faster, as well.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-13 08:35:36 +02:00
Samuel Pitoiset
8137df3a46 radv: fix allocating number of user sgprs if streamout is used
streamout_buffers is assigned after that function, so the previous
fix was completely wrong. This probably fix something when streamout
buffers and push constants are used/inlined in the same shader.

Fixes: 378e2d2414 ("radv: fix computing number of user SGPRs for streamout buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-13 07:55:30 +02:00
Jason Ekstrand
acfa2340e6 intel/fs: Handle UNDEF in split_virtual_grfs
When the UNDEF instruction was added, we didn't do anything special in
split_virtual_grfs.  This mean that anything with an UNDEF wasn't
getting split which causes problems for the compiler.  Among other
things, it makes RA harder because things are in bigger chunks.  It also
meant that dvec4s weren't getting split which means that they are larger
than the maximum register size.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 14959202 -> 14960035 (<.01%)
    instructions in affected programs: 96197 -> 97030 (0.87%)
    helped: 140
    HURT: 128
    helped stats (abs) min: 1 max: 17 x̄: 1.62 x̃: 1
    helped stats (rel) min: 0.09% max: 6.15% x̄: 0.65% x̃: 0.45%
    HURT stats (abs)   min: 1 max: 825 x̄: 8.28 x̃: 1
    HURT stats (rel)   min: 0.13% max: 139.83% x̄: 1.70% x̃: 0.50%
    95% mean confidence interval for instructions value: -2.96 9.18
    95% mean confidence interval for instructions %-change: -0.56% 1.51%
    Inconclusive result (value mean confidence interval includes 0).

    total loops in shared programs: 4372 -> 4372 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 352646771 -> 352840997 (0.06%)
    cycles in affected programs: 218600800 -> 218795026 (0.09%)
    helped: 21167
    HURT: 21411
    helped stats (abs) min: 1 max: 2924 x̄: 36.89 x̃: 10
    helped stats (rel) min: <.01% max: 41.90% x̄: 2.97% x̃: 0.98%
    HURT stats (abs)   min: 1 max: 26027 x̄: 45.54 x̃: 10
    HURT stats (rel)   min: <.01% max: 324.46% x̄: 3.88% x̃: 1.06%
    95% mean confidence interval for cycles value: 2.87 6.26
    95% mean confidence interval for cycles %-change: 0.40% 0.55%
    Cycles are HURT.

    total spills in shared programs: 8840 -> 8953 (1.28%)
    spills in affected programs: 126 -> 239 (89.68%)
    helped: 1
    HURT: 2

    total fills in shared programs: 21782 -> 21914 (0.61%)
    fills in affected programs: 431 -> 563 (30.63%)
    helped: 1
    HURT: 3

    LOST:   0
    GAINED: 5

Shader-db results on Haswell:

    total instructions in shared programs: 13320918 -> 13320769 (<.01%)
    instructions in affected programs: 40998 -> 40849 (-0.36%)
    helped: 146
    HURT: 56
    helped stats (abs) min: 1 max: 8 x̄: 2.73 x̃: 2
    helped stats (rel) min: 0.16% max: 8.60% x̄: 2.52% x̃: 2.22%
    HURT stats (abs)   min: 2 max: 23 x̄: 4.45 x̃: 4
    HURT stats (rel)   min: 0.21% max: 10.26% x̄: 6.83% x̃: 10.26%
    95% mean confidence interval for instructions value: -1.26 -0.21
    95% mean confidence interval for instructions %-change: -0.62% 0.77%
    Inconclusive result (%-change mean confidence interval includes 0).

    total loops in shared programs: 4373 -> 4373 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 374518258 -> 374384193 (-0.04%)
    cycles in affected programs: 231101954 -> 230967889 (-0.06%)
    helped: 21427
    HURT: 19438
    helped stats (abs) min: 1 max: 2035 x̄: 31.09 x̃: 8
    helped stats (rel) min: <.01% max: 40.95% x̄: 2.42% x̃: 0.86%
    HURT stats (abs)   min: 1 max: 20875 x̄: 27.38 x̃: 8
    HURT stats (rel)   min: <.01% max: 59.09% x̄: 2.49% x̃: 0.80%
    95% mean confidence interval for cycles value: -4.49 -2.07
    95% mean confidence interval for cycles %-change: -0.14% -0.04%
    Cycles are helped.

    total spills in shared programs: 23406 -> 23411 (0.02%)
    spills in affected programs: 3 -> 8 (166.67%)
    helped: 0
    HURT: 2

    total fills in shared programs: 34845 -> 34850 (0.01%)
    fills in affected programs: 3 -> 8 (166.67%)
    helped: 0
    HURT: 2

    LOST:   0
    GAINED: 0

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111566
Fixes: f4ef34f207 "intel/fs: Add an UNDEF instruction to avoid..."
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-09-13 04:12:24 +00:00
Jiadong Zhu
33aa039acf mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINT
_mesa_texstore_z32f_x24s8 calculates source rowStride at a
pace of 64-bit, this will make inaccuracy offset if the width
of src image is an odd number. Modify src pointer to int_32* as
source image format is gl_float which is 32-bit per pixel.

Reviewed by Ilia Mirkin

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-09-12 23:28:28 -04:00
Rob Clark
b4df115d3f freedreno/a6xx: pre-calculate userconst stateobj size
The AnTuTu "garden" benchmark overflows the fixed size constbuffer
stateobject, so lets be more clever and calculate (a potentially
slightly pessimistic) actual size.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-09-12 18:07:20 -07:00
Adam Jackson
5a9dec7534 gallium: Restore VSX for llvm >= 4
Accidentally dropped in 4fdd455eeb.

Fixes: 4fdd455e ("gallium: Require LLVM >= 3.4)
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-12 20:09:12 -04:00
Eric Anholt
2efc804892 egl/android: Fix build since the DRI fourcc removal.
Fixes: 272f9cfe6a ("dri: Use DRM_FORMAT_* instead of defining our own copy.")

Reviewed-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-12 21:54:30 +00:00
Eric Anholt
89e840ec59 gitlab-ci/a630: Disable flappy layout_binding.ssbo.fragment_binding_array
It started showing up as unreliable post-merge.  There's a valgrind
complaint, but even fixing that doesn't make it stable.
2019-09-12 14:16:21 -07:00
Rob Clark
966b7c3ed2 freedreno: fix compiler warning
fd6_blitter.c:724:31: warning: passing argument 1 of ‘fd_resource_level_linear’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-12 12:52:04 -07:00
Eric Anholt
6f0dc087b7 freedreno: Introduce gitlab-based CI.
Since freedreno's kernel and GPU reset seem to be totally solid, we
don't need to have the complexity of the LAVA setup that panfrost has.
Instead, we can register some boards as shared gitlab runners and have
the jobs run out of a docker container just like we do for llvmpipe.
Just make sure that the DRI device node is passed through to the
containers in the gitlab config ('devices = ["/dev/dri"]' under
runners.docker).

If a runner fails (networking dies, kernel panic, etc.) it'll take out
one build but the rest can keep going since gitlab-runner is what
pulls jobs.  Since the runner pulls jobs, it also means that they can
live behind firewalls instead of needing some public address to be
accessed by gitlab.fd.o.

For now, enable it just on db410c (A307) and cheza (A630) as those are
the hardware that I have plenty of.  A307 is only testing GLES2 since
running all of GLES3 takes too long for the number of boards I've
brought up.

Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-12 10:55:42 -07:00
Eric Anholt
0b6b0c09f4 gitlab-ci: Log the driver version that got tested.
Sometimes you just want confirmation that dEQP really picked up the
driver we built you thought.  This is not as good as one might like,
because git isn't present in the cross-build image.

Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-12 10:55:42 -07:00
Eric Anholt
8d4742fe49 gitlab-ci: Disable dEQP's watchdog timer.
A handful of tests on freedreno have been close to the watchdog
timeout, and now sporadically fail since range analysis has slowed
down the compiler for them.

Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-12 10:55:42 -07:00
Caio Marcelo de Oliveira Filho
f479878ce6 mesa/st: Fallback to name lookup when the variable have no Parameter
This brings back the fallback previously present in
st_nir_lookup_parameter_index(): if there's no parameter associated
with the variable, use a parameter from a variable with the same
prefix.

We'll have to sort out something for SPIR-V, but in the meantime let's
fix GLSL.

Fixes: b6384e57f5 ("mesa/st: Lookup parameters without using names")
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Eric Anholt <eric@anholt.net>
2019-09-12 17:53:54 +00:00
Adam Jackson
ad9c1838e0 glx: Remove unused indirection for glx_context->fillImage
This slot is always filled in with __glFillImage.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-12 13:23:32 -04:00
Eric Engestrom
f812cbfd88 meson/v3d: replace partial list of nir dep files with idep_nir_headers
"partial" because `nir_intrinsics_h` was missing.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-12 13:18:36 +01:00
Eric Engestrom
f418de5490 meson/iris: replace partial list of nir dep files with idep_nir_headers
"partial" because `nir_intrinsics_h` was missing.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-12 13:18:36 +01:00
Jose Maria Casanova Crespo
068c8889dd v3d: flag dirty state when binding compute states
As introduced in "v3d: flag dirty state when binding new sampler states"
we need to add support for compute states. New flag VC5_DIRTY_COMPTEX and
VC5_DIRTY_UNCOMPILED_CS are introduced.

Reaching 33 flags at the dirty field forces us to change the type to
uint_64. Flags are reordered and empty continuous bits are available
for future pipeline stages.

v2: Update flag conditions to compile cs shader. (Eric Antholt)
    Now dirty flags use uint_64t and flags are reordered.
    Added VC5_DIRTY_UNCOMPILED_CS flag.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-12 12:20:17 +01:00
Danylo Piliaiev
175c32e9bd tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE
Translating TGSI_INTERPOLATE_COLOR as INTERP_MODE_SMOOTH made
it for drivers impossible to have flatshaded color inputs.

Translate it to INTERP_MODE_NONE which drivers interpret as
smooth or flat depending on flatshading state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111467

Fixes: 770faf54 ("tgsi_to_nir: Improve interpolation modes.")

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-12 10:17:39 +00:00
Iago Toral Quiroga
544b156968 nir/lower_point_size: assume scalar PSIZ
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-12 06:40:04 +00:00
Iago Toral Quiroga
7f0b4a803c gallium/ttn: VARYING_SLOT_PSIZ and VARYING_SLOT_FOGC are scalar
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-12 06:40:04 +00:00
Iago Toral Quiroga
ab341e61f0 prog_to_nir: VARYING_SLOT_PSIZ is a scalar
v2: remove stray change (Erik Faye-Lund)

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-12 06:40:04 +00:00
Lepton Wu
8b1912c20b egl/android: Only keep BGRA EGL configs as fallback
Stock Android code actually doesn't support BGRA format EGL
configs. It's hard coded to use RGBA_8888 as window format
for BGRA EGL configs here:
https://android.googlesource.com/platform/frameworks/native/+/1eb32e2/opengl/libs/EGL/eglApi.cpp#608
So just remove it from EGL configs if RGBA is supported.

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-12 06:38:59 +00:00
renchenglei
e2485bb023 egl/android: Enable HAL_PIXEL_FORMAT_RGBA_1010102 format
The patch adds support for HAL_PIXEL_FORMAT_RGBA_1010102 on
Android platform.

Fixes android.media.cts.DecoderTest#testVp9HdrStaticMetadata
which failed in egl due to "Unsupported native buffer format 0x2b"
on Android.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Chenglei Ren <chenglei.ren@intel.com>
2019-09-12 05:59:56 +00:00
Kenneth Graunke
6a82a374b4 iris: trivial whitespace fixes 2019-09-11 21:33:41 -07:00
Jonathan Marek
3690a53608 u_format: float type for R11G11B10_FLOAT/R9G9B9E5_FLOAT
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-11 22:39:19 -04:00
Jonathan Marek
8829f9ccb0 u_format: add ETC2 to util_format_srgb/util_format_linear
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-11 22:39:07 -04:00
Vinson Lee
8d286776b6 meson: Add coroutines component to llvmpipe build.
Fixes: d32690b43c ("gallivm: add coroutine pass manager support")
Suggested-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2019-09-11 15:14:17 -07:00
Eric Anholt
272f9cfe6a dri: Use DRM_FORMAT_* instead of defining our own copy.
We have only two defines that aren't from DRM_FORMAT_*: SARGB and
SABGR.  Keep only those as __DRI_IMAGE_FOURCC and garbage collect the
rest.

While this header is also used from the X server, the X server doesn't
use any __DRI_IMAGE enums.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-11 13:05:10 -07:00
Eric Anholt
c18b1f0e71 uapi: Update drm_fourcc.h
Taken from drm-misc-next 268de6530aa1 ("drm: mst: Fix query_payload
ack reply struct")

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-11 13:04:53 -07:00
Kenneth Graunke
73e4f974b8 st/mesa: Only pause queries if there are any active queries to pause.
Previously, ReadPixels, PBO upload/download, and clears would call
cso_save_state with CSO_PAUSE_QUERIES, causing cso_context to call
pipe->set_active_query_state() twice for each operation.  This can
potentially cause driver work to enable/disable statistics counters.

But often, there are no queries happening which need to be paused.
By keeping a simple tally of active queries, we can skip this work.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-11 19:47:57 +00:00
Jean Hertel
2c1983f757 Fix missing dri2_load_driver on platform_drm
Signed-off-by: Jean Hertel <jean.hertel@hotmail.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-09-11 19:28:09 +00:00
Anuj Phogat
729de1488f intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM
Initial benchmarking didn't show any performance benefits. But it might eventually.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-11 11:29:37 -07:00
Anuj Phogat
ee2bde5232 genxml/gen11+: Add COMMON_SLICE_CHICKEN4 register
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-11 11:29:37 -07:00
Adam Jackson
7e0e53a077 egl/dri2: Refuse to add EGLConfigs with no supported surface types
For example, the surfaceless platform only supports pbuffers. If the
driver supports MSAA, we would still create a config, but it would have
no supported surface types. That's meaningless, so don't do it.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-11 14:11:40 -04:00
Adam Jackson
96b592696f gallium: Require LLVM >= 3.9
To go any further than this would be to break the current version of
Android.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-09-11 17:00:43 +00:00
Adam Jackson
585d095610 gallium: Require LLVM >= 3.8
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-09-11 17:00:43 +00:00
Adam Jackson
59f18f2159 gallium: Require LLVM >= 3.7
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-09-11 17:00:43 +00:00
Adam Jackson
9abf7d5755 gallium: Require LLVM >= 3.6
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-09-11 17:00:43 +00:00
Adam Jackson
3c553d9cff gallium: Require LLVM >= 3.5
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

[ Michel Dänzer: Dropped jessie line from debian-install.sh again ]
2019-09-11 17:00:43 +00:00
Michel Dänzer
57855ff8aa gitlab-ci: Keep g++ from stretch when installing foreign toolchains
Upgrading to a newer g++ causes older LLVM/clang packages to be
removed.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-09-11 17:00:43 +00:00
Michel Dänzer
3be7c67bbe gitlab-ci: Explicitly install linux-libc-dev for foreign architectures
Something seems to have changed in Debian buster causing installation
of the other foreign packages to fail without this.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-09-11 17:00:43 +00:00
Adam Jackson
4fdd455eeb gallium: Require LLVM >= 3.4
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-09-11 17:00:43 +00:00
Dylan Baker
a1ebbc3225 Docs: mark that 19.2.0-rc3 has been released
Also update -rc4 to me.
2019-09-11 09:47:45 -07:00
Brian Paul
d714415208 st/nir: fix illegal designated initializer in st_glsl_to_nir.cpp
IIRC, designated initializers are not legal C++.
Fixes the MSVC build.

Fixes: 83fd1e58 ("glsl/nir: Add and use a gl_nir_link() function")

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2019-09-11 09:38:07 -06:00
Dylan Baker
52cf2d05a7 meson: don't generate file into subdirs
This is unsupported by meson and may become a hard error in the future.

Fixes: 5adfc8602c
       ("lima/ppir: move sin/cos input scaling into NIR")
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-11 08:35:05 -07:00
Kenneth Graunke
73b70b4952 iris: Set bo->reusable = false in iris_bo_make_external_locked
This fixes a missing bo->reusable = false in iris_bo_export_gem_handle.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-09-11 08:10:47 -07:00
Kenneth Graunke
06370c3167 iris: Finish initializing the BO before stuffing it in the hash table
Other threads may pick it up once it's in the hash table.  Not known
to fix anything currently.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-09-11 08:10:47 -07:00
Marek Olšák
9a59ad87df radeonsi/gfx9: honor user stride for imported buffers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-11 11:03:31 -04:00
Marek Olšák
b97c5edd7a prog_to_nir, tgsi_to_nir: make sure kill doesn't discard NaNs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-11 10:59:27 -04:00
Marek Olšák
1bb2656276 ac: replace HAVE_LLVM with LLVM_VERSION_MAJOR for atomic-optimizations
trivial
2019-09-11 10:56:46 -04:00
Vasily Khoruzhick
32ea4c2c5e lima: set .out_sync field of req in lima_submit_start()
Looks like .out_sync wasn't set in lima_submit_start(), as result
submit completion fence was never signalled.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-10 21:49:53 -07:00
Anuj Phogat
cb18046073 intel: Add few Ice Lake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-10 15:59:30 -07:00
Kenneth Graunke
c6d40b5182 gallium: Fix util_format_get_depth_only
This is a pipe format, not a boolean.

Fixes: 5849e0612c ("gallium/auxiliary: Add util_format_get_depth_only() helper.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-10 15:49:29 -07:00
Rob Clark
6c19d37331 freedreno/a6xx: fix 3d tex layout
Fixes dEQP-GLES3.functional.texture.specification.texstorage3d.size.3d_2x2x2_2_levels

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-10 22:08:33 +00:00
Rob Clark
85a23a8991 freedreno/a6xx: don't tile things that are too small
If the lowest (largest) mipmap level is too small to tile, then don't
bother pretending.

Note that this requires initializing pipe->screen before
fd_resource_level_linear() is called.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-09-10 22:08:33 +00:00
Caio Marcelo de Oliveira Filho
15e439071d iris: Enable ARB_gl_spirv and ARB_spirv_extensions
This will also "unlock" OpenGL 4.6 for Iris!

v2: Also enable PIPE_CAP_GL_SPIRV_VARIABLE_POINTERS.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
83fd1e58d8 glsl/nir: Add and use a gl_nir_link() function
Perform all the NIR linking steps in order.  Change iris and i965 to
use it.  Suggested by Alejandro.

v2: Add gl_nir_linker_options struct.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
eca8032f20 gallium: Add ARB_gl_spirv support
The PIPE_CAP_GL_SPIRV capability enables ARB_gl_spirv and
ARB_spirv_extensions, and will make sure the corresponding SPIR-V
capabilities and extensions lists are initialized.

The additional PIPE_CAP_GL_SPIRV_VARIABLE_POINTERS capability enables
the support for Variable Pointers in SPIR-V shaders.  This depends on
the driver and is not mandatory for ARB_gl_spirv support.

v2: Add a PIPE_CAP for Variable Pointers.  (Marek)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
dccd179ba1 mesa/spirv: Set a few more extensions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
1a12b0fe36 mesa/st: Don't expect prog->nir to already exist
There's no such case, if we load prog->nir from the shader cache, we
shouldn't hit this path.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
b4b39d9859 mesa/st: Add support for SPIR-V shaders
The SPIR-V codepath uses NIR linking, so we have to preprocess after
the linking steps, which makes things slightly different than GLSL.
To make more clear when the preprocess is happening, I've ended up
inlining st_nir_get_mesa_program() into its caller.

The goal was to make both GLSL and SPIR-V to use the same preprocess
function, the exceptions are:

- SPIR-V codepath don't support NIR state slots yet;
- GLSL lowers shared memory early, so we don't do the deref lowering
  for those.

For now I didn't bother to rename other functions and files (now that
many of them apply to both GLSL and SPIR-V), but we should do this in
further patches.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
18e79e97e5 mesa/st: Extract preprocessing NIR steps
Refactor to split the glsl_to_nir conversion from the preprocessing
NIR passes into separate functions, so we can use them in SPIR-V.
Unlike in GLSL, there we'll need to perform a few passes with the NIR
linker before doing the individual preprocess calls.

No behavior should change with this patch.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
b6384e57f5 mesa/st: Lookup parameters without using names
Use the new MainUniformStorageIndex field in Parameter instead.  It
was added so we could match those in the SPIR-V case, where names are
optional.

v2: Use MainUniformStorageIndex for all cases.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
d40978f396 mesa/program: Associate uniform storage without using names
Use the new UniformStorageIndex field in Parameter instead.  This
mechanism was added so we could match those in the SPIR-V case, where
names are optional.

v2: Use UniformStorageIndex for all cases.  (Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
4dd1ef9d0a mesa: Fill Parameter storage indices even when not using SPIR-V
When creating Parameters, fill in the associated uniform storage
indices, like it is done with the NIR linker used for SPIR-V.  This
will allow later code to not rely on names (which would never work for
SPIR-V where names are optional).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
664e4a610d glsl/nir: Fill in the Parameters in NIR linker
The parameter lists were not being created nor filled since i965
doesn't use them.  In Gallium they are used for uniform handling, so
add a way to fill them.

The gl_uniform_storage struct got two new fields that let us go

- from a Parameter to the matching UniformStorage and,
- from the variable to the *first* UniformStorage

without relying on names -- since they are optional for ARB_gl_spirv.
Later patches will make use of them.

v2: Do not fill parameters for i965.  (Timothy)
    Use uint32_t for the new attributes.  (Marek)

v3: Serialize the new fields.  (Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
eea3aa25aa mesa: Pack gl_program_parameter struct
The gl_register_file doesn't need 16 bits, so shorten it and use the
extra room for 'Padded' (also mark it as a single bit).  This shrinks
the struct size from 32 bytes to 24 bytes.

See also 4794fbc86e ("mesa: reduce the size of gl_program_parameter")
that shrinked from 40 to 24 and later 7536af670b ("glsl: fix shader
cache for packed param list") that added `Padded`.

v2: Use just 5 bits for gl_register_file.  (Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
eda596d64b compiler: Add glsl_contains_opaque() helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
1a96811fe1 mesa/st: Do not rely on name to identify special uniforms
Every uniform that have the "gl_" name also have some state slots.  So
use the state_slots like we did in 57b6184931 ("i965: account for NIR
uniforms without name").

This removes the dependency on names, which are optional when using
ARB_gl_spirv.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho
4f33f96c45 glsl/nir: Avoid overflow when setting max_uniform_location
Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned
max_uniform_location.  Those unmapped uniforms don't have to be
accounted at this point.

Fixes: 7a9e5cdfbb ("nir/linker: Add gl_nir_link_uniforms()")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-09-10 14:36:46 -07:00
Dylan Baker
3047199931 meson: don't allow glvnd on windows
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
e1e2388f06 meson: don't build glx or dri by default on windows
v5: - Move is windows check down to make code more robust

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
70cac06bbf meson: Add a platform for windows
This mirrors the haiku build which uses a platform.

v2: - Fix some rebase problems

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
f680cc62f8 meson: build getopt when using msvc
v4: - Don't wrap a single file in a list to match mesa style
    - Use null_dep instead of empty list

Reviewed-by: Eric Anholt <eric@anholt.net> (v3)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
0caa229dcb meson: fix dl detection on non cygwin windows
v4: - Don't run checks on Windows that will always fail

Reviewed-by: Eric Anholt <eric@anholt.net> (v3)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
2595b7c997 glapi: export glapi_destroy_multithread when building shared-glapi on windows
Which will allow meson to build a shared glapi build with mingw.

v2: - Add symbol to symbol check test

Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
b9fa7ec4fa meson: add a expat subproject
For Windows

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
8ba86ad55c meson: add a zlib subproject
To help windows build

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
49fe44fe5a add a git ignore for subprojects
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
af444d84a3 meson: don't build glapi_static_check_table on windows
It doesn't compile due to undefined symbols, which are in
libglapi_static, so I don't understand the problem.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
8424209a42 meson: Make shared-glapi a combo
So it can auto off for windows, but on elsewhere.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
a1a8703199 meson: don't try to generate i18n translations on windows
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:47 +00:00
Dylan Baker
26961e2cb5 glsl/tests: Handle windows \r\n new lines
Currently the praser for s expressions assumes that newlines will be \n,
resulting in incorrect parsing on windows, where the newline is \r\n.
This patch just adds \r? to the regular expression used to parse the s
expressions, which fixes at 1 test on windows.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-10 20:36:46 +00:00
Kenneth Graunke
077a1952cc iris: Fix constant buffer sizes for non-UBOs
Since the system value refactor, we've accidentally only been setting
cbuf->buffer_size in the UBO case, and not in the uploaded-constants
case.  We use cbuf->buffer_size to fill out the SURFACE_STATE entry,
so it needs to be initialized in both cases.

Fixes: 3b6d787e40 ("iris: move sysvals to their own constant buffer")
2019-09-10 10:53:15 -07:00
Lionel Landwerlin
341034a73d intel: update product names for WHL
Documentation list all of those as "UHD".

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111629
BSpec: 33266
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-10 19:21:38 +03:00
Samuel Pitoiset
538766792d radv/gfx10: declare a LDS symbol for the NGG emit space
This fixes some interactions when NGG GS is enabled. It fixes:

- dEQP-VK.clipping.user_defined.clip_cull_distance_dynamic_index.*geom*
- dEQP-VK.tessellation.geometry_interaction.passthrough.*

For some reasons, using the computed ESGS ring size randomly hangs
with CTS. For now, just use the maximum LDS size for ESGS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:27:01 +02:00
Samuel Pitoiset
168f8dbafa radv: calculate GFX9 GS and GFX10 NGG states before compiling shader variants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:58 +02:00
Samuel Pitoiset
e7ee9a6387 radv: store the ESGS ring size as part of gfx10_ngg_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:53 +02:00
Samuel Pitoiset
7eba5666fa radv: store GFX10 NGG state as part of the shader info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:51 +02:00
Samuel Pitoiset
349caedee0 radv: store GFX9 GS state as part of the shader info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:47 +02:00
Samuel Pitoiset
a9af11f1fa radv: fill shader info for all stages in the pipeline
This shouldn't be in NIR->LLVM because ACO also needs the shader
info. This will also help for computing some NGG values that are
necessary for declaring LDS symbols.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:45 +02:00
Samuel Pitoiset
8cf297c7b1 radv: do not pass all compiler options to the shader info pass
Only the pipeline layout and the shader keys are needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:42 +02:00
Marek Olšák
ef919d8dcb radeonsi: remove redundant si_texture offset and size fields
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
e4c84d8678 radeonsi: move texture storage allocation outside of radeonsi
possible code sharing with radv

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
58ccadfc5c radeonsi: move HTILE allocation outside of radeonsi
ac_surface computes it for amdgpu.
radeon_drm_surface computes it for radeon.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
30a1dd0ee6 radeonsi: handle NO_DCC early
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
7d4a10a29f ac/surface: add RADEON_SURF_NO_FMASK
This controls FMASK and CMASK computation for MSAA.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
6633863150 r300,r600,radeonsi: set winsys_handle::stride,offset in drivers, not winsyses
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
5ac6908263 r300,r600,radeonsi: read winsys_handle::stride,offset in drivers, not winsyses
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
d95afd8b9e radeonsi/gfx10: fix wave occupancy computations
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
42ea0b7b52 radeonsi: only support at most 1024 threads per block
LLVM 10 won't support 2048.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
c1e08cb6d5 radeonsi: disable DCC when importing a texture from an incompatible driver
and unify the code.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
28adf0d00c radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts
This fixes a crash.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
2f42d4cacc radeonsi/gfx10: use fma for TGSI_OPCODE_FMA
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
d64593e3c4 ac: use fma on gfx10
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
d979e5bfab ac: enable LLVM atomic optimizations 2019-09-09 23:43:03 -04:00
Lepton Wu
263136fb5d virgl: Fix pipe_resource leaks under multi-sample.
Fixes: 900a80f9e4 ("virgl: virgl_transfer should own its virgl_resource")

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2019-09-10 03:42:55 +00:00
Kenneth Graunke
410894c643 iris: Avoid flushing for cache history on transfer range flushes
The VBO module maps a buffer with GL_MAP_FLUSH_EXPLICIT, and keeps
appending data, and calling glFlushMappedBufferRange().  We were
invalidating the VF cache each time it flushed a new range, which
results in a ton of VF flushes.

If the contents of the destination in the target range are undefined
(never even possibly written), this patch makes us assume that it's
likely not in the cache and so cache invalidations are required.  If
the destination range is defined, we continue cache flushing as we may
need to expunge stale data.

This eliminates 88% of the VF cache invalidates on Manhattan 3.0.
Improves performance in Manhattan 3.0 on my Icelake 8x8 with the GPU
frequency locked to 700Mhz by 0.376724% +/- 0.0989183% (n=10).
2019-09-09 15:08:22 -07:00
Kenneth Graunke
7d28e9ddd6 iris: Optimize out redundant sampler state binds
This cuts roughly 85% of the 3DSTATE_SAMPLER_STATE_POINTERS_PS calls in
the J2DBench images test.  For some reason, the state tracker is calling
bind_sampler_state with the same sampler state in a bunch of cases.
2019-09-09 11:55:27 -07:00
Kenneth Graunke
325e25d689 iris: Add support for the always_flush_cache=true debug option.
This can be useful for debugging missing flushes.
2019-09-09 11:55:27 -07:00
Adam Jackson
366b2e5c19 mesa: Eliminate gl_config::rgbMode
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-09 14:12:57 -04:00
Adam Jackson
78e0fa6bb2 mesa: Eliminate gl_config::have{Accum,Depth,Stencil}Buffer
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-09 14:12:57 -04:00
Adam Jackson
c4990b7b19 mesa: Remove unused gl_config::indexBits
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-09 14:12:57 -04:00
Adam Jackson
04bef9a0a6 gallium/xlib: Fix an obvious thinko
x == !GLX_DIRECT_COLOR is a fancy way of writing x == 0, which is
clearly not what was meant.
2019-09-09 14:12:57 -04:00
Kenneth Graunke
9173459b95 iris: Ignore line stipple information if it's disabled
The line stipple pattern and factor only matter if line stippling is
actually enabled.  Otherwise, we can safely ignore it.

PBO upload may give us zero for line stipple information, while normal
drawing tends to give us an actual stipple pattern such as 0xffff.  This
was causing us to flag IRIS_DIRTY_LINE_STIPPLE way too often, leading to
useless 3DSTATE_LINE_STIPPLE commands, which are non-pipelined and thus
very expensive.

Improves performance in Manhattan 3.0 on Skylake GT4e by
0.149261% +/- 0.0380796% (n=210).  On an Icelake 8x8 with the GPU
frequency locked at 700Mhz, improves by 0.423756% +/- 0.222843% (n=3).
2019-09-09 10:55:20 -07:00
Vasily Khoruzhick
fbd5d9ebb5 lima/ppir: drop fge/flt/feq/fne options
These are supposed to be lowered into sge/slt/seq/sne equivalents.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 10:25:30 -07:00
Vasily Khoruzhick
576341324d lima: run opt_algebraic between int_to_float and boot_to_float for vs
int_to_float emits ftrunc and ftrunc lowering generates bool ops.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 10:25:30 -07:00
Vasily Khoruzhick
996f1b6174 lima/gpir: fix warning in gpir disassembler
Fixes following warning:

../src/gallium/drivers/lima/ir/gp/disasm.c: In function ‘print_src’:
../src/gallium/drivers/lima/ir/gp/disasm.c:241:20: warning: array subscript 28 is above array bounds of ‘char[5]’ [-Warray-bounds]
  241 |              "xyzw"[src - gpir_codegen_src_attrib_x]);

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 10:25:30 -07:00
Vasily Khoruzhick
e6dbf6d948 lima/gpir: lower fceil
GP doesn't support fceil so we need to lower it.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 10:25:30 -07:00
Connor Abbott
c64f30546d lima/gpir: Disallow moves for schedule_first nodes
The entire point of schedule_first is that the node has to be scheduled
as soon as possible without any moves because it doesn't produce a
proper floating-point value, or its value changes depending on where you
read it. We were still introducing a move for preexp2 in some cases
though, even if it got scheduled as soon as possible, which broke some
exp() tests. Fix that.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:42:19 +07:00
Connor Abbott
8c7ad22adb lima/gpir: Fix fake dep handling for schedule_first nodes
The whole point of schedule_first nodes is that they need to be
scheduled as soon as possible, so if a schedule_first node is the
successor in a fake dependency that prevents it from being scheduled
after its parent, that can cause problems. We need to add these fake
dependencies to the parent as well, and we need to guarantee that the
pre-RA scheduler puts schedule_first nodes right before their parents in
order to prevent this from adding cycles to the dependency graph.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:42:00 +07:00
Connor Abbott
2955875381 lima/gpir: Fix schedule_first insertion logic
The idea was to make sure schedule_first nodes were always first in the
ready list. I made sure they were inserted first, but not that other
nodes wouldn't later be scheduled ahead of them. Fixes
spec@glsl-1.10@execution@built-in-functions@vs-exp-float and probably
others.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:41:35 +07:00
Connor Abbott
63acdb5ce6 lima/gpir: Ignore unscheduled successors in can_use_complex()
The point of the function is to avoid creating a complex move which is
used by certain slots in the next instruction, but unscheduled
successors will never be in the next instruction. Found while debugging
a crash that the previous commit fixed.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:40:58 +07:00
Connor Abbott
ee8cc90e55 lima/gpir: Do all lowerings before rsched
The scheduler assumes that load nodes are always duplicated so that they
can always be scheduled eventually and therefore they never need to be
spilled. But some lowerings were running after the pre-RA scheduler,
whereas duplication has to happen before then since it's needed for the
scheduler to do a better job reducing register pressure. This meant
that lowerings were introducing multiple uses of a load instruction,
which broke the scheduler's expectation and resulted in infinite loops
in situations where the only nodes available to spill were load nodes.
Spilling load nodes would be silly, so we want to fix the lowerings
rather than the scheduler. Just do all lowerings before the pre-RA
scheduler, which also helps with reducing pressure since the scheduler
can more accurately compute the pressure.

Fixes lima/mesa#104.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:39:20 +07:00
Mauro Rossi
ae5ac26dfa android: anv: libmesa_vulkan_common: add libmesa_util static dependency
Change needed to fix the following building error:

In file included from external/mesa/src/intel/vulkan/anv_device.c:43:
external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 4dcb1ff ("anv: add support for driconf")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-09-08 20:07:56 +02:00
Boris Brezillon
3ce03374b3 panfrost: Rename pan_bo_cache.c into pan_bo.c
So we can move all the BO logic into this file instead of having it
spread over pan_resource.c, pan_drm.c and pan_bo_cache.c.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:24:54 +02:00
Boris Brezillon
14bfb0cb67 panfrost: Get rid of the now unused SLAB allocator
The last users have been converted to use plain BOs. Let's get rid of
this abstraction. We can always consider adding it back if we need it
at some point.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:24:19 +02:00
Boris Brezillon
2c90045cf2 panfrost: Get rid of unused panfrost_context fields
Some fields in panfrost_context are unused (probably leftovers from
previous refactor). Let's get rid of them.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:23:34 +02:00
Boris Brezillon
76274bcb5e panfrost: Convert ctx->{scratchpad, tiler_heap, tiler_dummy} to plain BOs
ctx->{scratchpad,tiler_heap,tiler_dummy} are allocated using
panfrost_drm_allocate_slab() but they never any of the SLAB-based
allocation logic. Let's convert those fields to plain BOs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:22:59 +02:00
Boris Brezillon
a2bba567ae panfrost: Make transient allocation rely on the BO cache
Right now, the transient memory allocator implements its own BO caching
mechanism, which is not really needed since we already have a generic
BO cache. Let's simplify things a bit.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:22:26 +02:00
Boris Brezillon
12d8a17957 panfrost: Stop passing a ctx to functions being passed a batch
The context can be retrieved from batch->ctx.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:21:44 +02:00
Boris Brezillon
beb18c6172 panfrost: Pass a batch to panfrost_drm_submit_vs_fs_batch()
Given the function name it makes more sense to pass it a job batch
directly.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:20:59 +02:00
Boris Brezillon
2c526993bc panfrost: s/job/batch/
What we currently call a job is actually a batch containing several jobs
all attached to a rendering operation targeting a specific FBO.

Let's rename structs, functions, variables and fields to reflect this
fact.

Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:19:56 +02:00
Heinrich Fink
3aa4f3a442 egl: Add GL_MESA_EGL_sync support
This commit follow OES_EGL_sync to universially enable use of EGL sync
objects with desktop OpenGL contexts.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-08 08:01:55 +00:00
Heinrich Fink
8c933c9d96 headers: Add GL_MESA_EGL_sync token to GL
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-08 08:01:55 +00:00
Heinrich Fink
17470c4aaa registry: update gl.xml with GL_MESA_EGL_sync token
As added by upstream GL registry changes

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-08 08:01:55 +00:00
Heinrich Fink
f4327ce06e specs: Add GL_MESA_EGL_sync
Adds GL_MESA_EGL_sync as defined in upstream OpenGL registry

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-08 08:01:55 +00:00
Tapani Pälli
f83f9d7daa android: fix linking issues with liblog
Fixes Android build errors observed in Intel CI.

Fixes: f9f7cbc1aa "util: android logging support"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-09-07 13:16:29 +03:00
Kenneth Graunke
dfb86405cf iris: Support the disable_throttling=true driconf option. 2019-09-06 18:35:24 -07:00
Jason Ekstrand
c832820ce9 nir/dead_cf: Repair SSA if the pass makes progress
The dead_cf pass calls into the CF manipulation helpers which attempt to
keep NIR's SSA form sane.  However, when the only break is removed from
a loop, dominance gets messed up anyway because the CF SSA clean-up code
only looks at phis and doesn't consider the case of code becoming
unreachable.  One solution to this would be to put the loop into LCSSA
form before we modify any of its contents.  Another (and the approach
taken by this pass) is to just run the repair_ssa pass afterwards
because the CF manipulation helpers are smart enough to keep all the
use/def stuff sane; they just don't always preserve dominance
properties.

While we're here, we clean up some bogus indentation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-06 23:39:01 +00:00
Jason Ekstrand
1005272a2b nir/repair_ssa: Insert deref casts when needed
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-06 23:39:01 +00:00
Jason Ekstrand
a3268599f3 nir/repair_ssa: Repair dominance for unreachable blocks
NIR currently assumes that unreachable blocks are trivially dominated by
everything.  However, when considering well-formed SSA, there is no path
from any block to an unreachable block.  Therefore, we can break any
use-def chains where the use is in an unreachable block.  This removes
any dependencies on code created by uses in unreachable blocks and lets
DCE do a better job of cleaning it up.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-06 23:39:01 +00:00
Jason Ekstrand
f81a2623d8 nir: Add a block_is_unreachable helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-06 23:39:01 +00:00
Jason Ekstrand
517142252f nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-06 23:39:01 +00:00
Jason Ekstrand
37cdb7fc44 nir: Handle complex derefs in nir_split_array_vars
We already bail and don't split the vars but we were passing a NULL to
_mesa_hash_table_search which is not allowed.

Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-06 23:39:01 +00:00
Jason Ekstrand
34541be7b0 intel/blorp: Use wide formats for nicely aligned stencil clears
In the case where the stencil clear is nicely aligned, we can clear
stencil much more efficiently by mapping it as a wide format (say
RGBA32_UINT) and blasting out the stencil clear value with a repclear.
On Unigine Heaven, this makes one stencil clear go from non-trivial to
unnoticeable when looking at per-draw timings.

In order for this change to work properly, ANV needs to do a bit more
flushing around depth and stencil clears.  i965 and iris already have
the cache tracking logic to handle this so no changes are required
there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:35:09 +00:00
Jason Ekstrand
d62ca48c31 intel/blorp: Expose surf_fake_interleaved_msaa internally 2019-09-06 23:35:09 +00:00
Jason Ekstrand
caa786e029 intel/blorp: Expose surf_retile_w_to_y internally
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:35:09 +00:00
Jason Ekstrand
a90b1cbe73 blorp: Memset surface info to zero when initializing it
This isn't known to fix any current bugs but it does prevent a
regression in a subsequent commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:35:09 +00:00
Jason Ekstrand
c15b197d74 intel/tools: Decode PS kernels on SNB
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:35:09 +00:00
Jason Ekstrand
7f5cb5fd6d intel/tools: Decode 3DSTATE_BINDING_TABLE_POINTERS on SNB
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:35:09 +00:00
Rhys Perry
6b8cb08756 nir/lower_io_to_vector: don't merge compact varyings
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 02bc4aabb48 ('nir/lower_io_to_vector: allow FS outputs to be vectorized')
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-06 15:38:10 -07:00
Eric Engestrom
27339fe9a7 drirc: override minImageCount=2 for gfxbench
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110765
Fixes: 4689e98fe8 ("vulkan/wsi: Set X11 minImageCount to 3.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
5eb7d48b58 radv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
4ad99ee961 amd: move adaptive sync to performance section, as it is defined in xmlpool
Fixes: 3844ed8d44 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2a ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
037b5b567f anv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
a72cdd00ab wsi: add minImageCount override
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
4dcb1fff19 anv: add support for driconf
No option is supported yet, this is just the boilerplate.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
ba73564b52 gallivm: drop LLVM<3.3 code paths as no build system allows that
Suggested-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
1b8764638a meson/scons/android: drop now-unused HAVE_LLVM
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
2406b35151 llvmpipe: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
ba1e085587 clover: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
1c1c477470 gallivm: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
7527144383 clover: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
08890068c5 gallivm: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
6120c442ee swr: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
19d9e57f2c amd: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
bce9c05ca8 svga: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
Eric Engestrom
cf7d186be6 r600: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
Eric Engestrom
28cb16b6f8 aux/draw: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
Eric Engestrom
ef434fbc25 meson/scons/android: add LLVM_AVAILABLE binary flag
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
Eric Engestrom
5aebe37b53 gallivm: replace 0x version print with actual version string
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
Jordan Justen
9790cfcefa anv,iris: L3ALLOC register replaces L3CNTLREG for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-06 13:11:25 -07:00
Anuj Phogat
414cae0fd6 intel/gen12: Add L3 configurations
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-06 13:11:22 -07:00
Rhys Perry
5a7fe0ae99 util: include u_endian.h in u_math.h
u_endian.h needs to be included, otherwise PIPE_ARCH_BIG_ENDIAN might not
be defined on big-endian architectures and the endian conversion macros
will be incorrect.

I don't think anything is broken because of this, I just noticed this when
looking at the file.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 19:52:50 +00:00
Jason Ekstrand
3b1a7e5333 anv: Bump maxComputeWorkgroupSize
Fixes: 9a129510f5 "anv: Bump maxComputeWorkgroupInvocations"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 18:26:55 +00:00
Kenneth Graunke
0d0ae16e8f intel: Stop redirecting state cache to command streamer cache section
This bit redirects the state cache from the unified/RO sections of the
L3 cache to the "CS command buffer" section of the cache, which would
be set up via TCCNTLREG.  The documentation says:

   "Additionaly, this redirection should be enabled only if there is a
    non-zero allocation for the CS command buffer section."

We don't allocate any cache to the CS command buffer section, so
enabling this redirection effectively disabled the state cache.
The Windows driver only sets up that section when using POSH, which
we do not currently use.  So, leave it unallocated and disable the
redirection to get a functional state cache again.

Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%,
and Car Chase by 2%.
2019-09-06 10:57:55 -07:00
Kenneth Graunke
68be5ff8d0 iris: Invalidate state/texture/constant caches after STATE_BASE_ADDRESS
Jason pointed out that the caches likely refer to offsets from dynamic
and surface state base addresses, so when we change those, we need to
invalidate the caches.

Comment borrowed from src/intel/vulkan/genX_cmd_buffer.c.
2019-09-06 10:57:55 -07:00
Kristian H. Kristensen
30ab3e39fd freedreno/a6xx: Implement primitive count queries on GPU
The driver can't determine PIPE_QUERY_PRIMITIVES_GENERATED or
PIPE_QUERY_PRIMITIVES_EMITTED once we support geometry or
tessellation, since these stages add primitives at runtime.  Use the
WRITE_PRIMITIVE_COUNTS event to write back the primitive counts and
implement a hw query for this.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-09-06 09:53:28 -07:00
Kristian H. Kristensen
1acf8d2354 freedreno/a6xx: Let the GPU track streamout offsets
The GPU writes out streamout offsets as it goes to the FLUSH_BASE
pointer.  We use that value with CP_MEM_TO_REG when appending to the
stream so that we don't have to track the offsets with the CPU in the
driver.  This ensures that streamout continues to work once we enable
geometry and tessellation shader stages that add geometry.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-09-06 09:53:28 -07:00
Roland Scheidegger
de1c89fd93 llvmpipe: fix CALLOC vs. free mismatches
Should fix some issues we're seeing. And use REALLOC instead of realloc.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-09-06 18:31:34 +02:00
Samuel Pitoiset
0bf51b6941 radv/gfx10: determine the number of vertices per primitive for TES
This doesn't fix anything known but it's correct now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:36:49 +02:00
Rhys Perry
bcd14756ee nir/lower_io_to_vector: add flat mode
This has lower_io_to_vector try to turn variables into arrays of 4-sized
vectors when possible and fall back to the old approach when that isn't
possible.

This is so that lower_io_to_vector can guarantee that only one variable is
used for each fragment shader output.

v2: handle dual-source blending
v3: don't try to merge structs and non-32-bit types in get_flat_type()
v3: fix per-vertex inputs
v3: fix and cleanup location advancement in get_flat_type() and it's
    calling code
v4: prioritize the original mode over the flat mode
v4: don't create flat variables to merge only one variable
v5: don't skip an entire slot when encountering structs in the old mode

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-06 15:38:04 +00:00
Rhys Perry
300e758b7c nir/lower_io_to_vector: allow FS outputs to be vectorized
v2: handle dual-source blending
v3: use a higher MAX_SLOTS

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-06 15:38:04 +00:00
Samuel Pitoiset
c6be5cefba radv/gfx10: make use the output usage mask when exporting NGG GS params
It shouldn't matter much because output varyings should have been
compacted during NIR shader linking but it mirrors what the driver
does when emitting NGG GS vertex parameters.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:25:28 +02:00
Samuel Pitoiset
b1a872f0c0 radv/gfx10: account for the subpass view for the NGG GS storage
If the fragment shader needs the layer index, we have to allocate
one more dword in the NGG GS storage. Found by inspection. This
doesn't fix anything known.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:25:28 +02:00
Tomeu Vizoso
0efc0f8edc panfrost/ci: Increase timeouts
Sometimes LAVA jobs will timeout due to transient issues, and the Gitlab
job will fail in that case. Increase the timeouts to reduce the
likeliness of that happening and reduce false positives.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-06 16:35:16 +02:00
Tomeu Vizoso
8a5dd61828 panfrost/ci: Use special runner for LAVA jobs
So repositories don't need to be specially configured with a token to
access LAVA, store this token in a bind volume for a special runner.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-06 16:35:16 +02:00
Tomeu Vizoso
10b60dbd2c panfrost/ci: Re-add support for armhf
Now that Volt supports armhf, build again images and submit to LAVA for
RK3288.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-06 16:35:16 +02:00
Samuel Pitoiset
f31fb33432 radv: calculate esgs_itemsize in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:24 +02:00
Samuel Pitoiset
7fa00e178f radv: calculate the GSVS vertex size in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:22 +02:00
Samuel Pitoiset
3e8bda66ae radv: gather primitive ID in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:20 +02:00
Samuel Pitoiset
1877e87f1e radv: gather layer in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:19 +02:00
Samuel Pitoiset
84b346eda9 radv: gather viewport in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:17 +02:00
Samuel Pitoiset
d21489d415 radv: gather pointsize in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:09 +02:00
Samuel Pitoiset
a99d2d5564 radv: gather clip/cull distances in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:07 +02:00
Samuel Pitoiset
b16cf6c4c6 radv: move ac_fill_shader_info() to radv_nir_shader_info_pass()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:05 +02:00
Samuel Pitoiset
83499ac765 radv: merge radv_shader_variant_info into radv_shader_info
Having two different structs is useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:03 +02:00
Zhu, James
878439bba3 radeon: Fix mjpeg issue for ARCTURUS
ARCTURUS mjpeg is using direct register access.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2019-09-06 08:53:52 -04:00
Leo Liu
a3074370d9 radeon/vcn: add RENOIR VCN decode support
It has same VCN2.x block as Navi1x

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2019-09-06 08:53:52 -04:00
Danylo Piliaiev
aabde02f2f glsl: Fix unroll of do{} while(false) like loops
For loops which condition is false on the first iteration
iteration count was falsely calculated under the assumption
that loop's condition is true until it becomes false, meaning
it's true at least one time.
Now such loops are reported as having 0 iteration.

Similar to the fix e71fc7f2 done in NIR.

Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-06 10:27:33 +00:00
Timur Kristóf
3debd0ef15 tgsi_to_nir: Remove dependency on libglsl.
This commit removes the GLSL dependency in TTN by manually recording
the textures used and calling nir_lower_samplers
instead of its GL counterpart.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-06 12:20:53 +03:00
Timur Kristóf
610cc3089c nir: Carve out nir_lower_samplers from GLSL code.
Lowering samplers is needed to produce NIR that can actually be
consumed by some gallium drivers, so it doesn't make sense to
to keep it only in the GLSL code.

This commit introduces nir_lower_samplers to compiler/nir,
while maintains the GL-specific function too.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-06 12:20:20 +03:00
Gert Wollny
9b9e1de90e radeonsi: Release storage for smda_uploads when the context is destroyed
This fixes a memory leak in the flush code:

Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 in __interceptor_realloc .../gcc-8.3.0/libsanitizer/asan/asan_malloc_linux.cc:105
    #1 in si_buffer_do_flush_region src/gallium/drivers/radeonsi/si_buffer.c:573
    #2 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:608
    #3 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:597

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-06 09:44:24 +02:00
Mauro Rossi
7a6e7803a7 android: mesa: revert "Enable asm unconditionally"
This patch partially reverts 20294dc ("mesa: Enable asm unconditionally, ...")

Android makefile build logic needs to disable assembler optimization
in 32bit builds to avoid text relocations for libglapi.so shared

Fixes the following build error with Android x86 32bit target:

[  0% 4/477] target SharedLib: libglapi (out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so)
FAILED: out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so
...
prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: warning: shared library text segment is not shareable
prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: error: treating warnings as errors
clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 20294dc ("mesa: Enable asm unconditionally, now that gen_matypes is gone.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-09-06 08:48:28 +02:00
Samuel Pitoiset
fa13b2f002 radv/gfx10: always set ballot_mask_bits to 64
The codegen handles it and it adds the correct casts. This fixes
a bunch of LLVM validation errors when enabling Wave32 for compute.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 08:11:43 +02:00
Caio Marcelo de Oliveira Filho
c0c55bd84f nir/lower_explicit_io: Handle 1 bit loads and stores
Load a 32-bit value then convert to 1-bit.  Convert 1-bit to 32-bit
value, then Store it.

These cases started to appear when we changed Anvil to use derefs for
shared memory.

v2: Use `bit_size` in a couple of places we were missing.  (Jason)
    Reassign `value` instead of `src[0]`.  (Jason)

Fixes: 024a46a407 ("anv: use derefs for shared memory access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-05 22:24:09 -07:00
Jason Ekstrand
d15fe8ca82 Revert "intel/fs: Move the scalar-region conversion to the generator."
This reverts commit c0504569ea.  Now that
we're doing interpolation lowering in NIR, we can continue to stride the
FS input registers directly in the brw_fs_nir code like we did before.
This fixes SIMD32 fragment shaders which broke because lower_simd_width
depended on the 0 stride to split PLN instructions correctly.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-09-06 03:58:09 +00:00
Jason Ekstrand
47e9743547 intel/fs: Fix FB write inst groups
This commit does two things.  First, it simplifies the way we compute
the FB write group bit.  There's no reason to use a ternary because
inst->group / 16 can only be 0 or 1.  Second, it fixes an order-of-
operations bug where the ternary wasn't selecting between (1 << 11) and
0 but between (1 << 11) and 0 | brw_dp_write_desc(...).

Fixes: 0d9648416 "intel/compiler: Use generic SEND for Gen7+ FB writes"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-06 03:58:09 +00:00
Vasily Khoruzhick
aa77fc309a lima/ppir: don't lower phis to scalar
Utgard PP is vec4 architecture, so lowering phis to scalars
increases instruction count and potentially interferes with
spilling.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-05 19:29:16 -07:00
Jonathan Marek
feea5986a9 freedreno/a2xx: formats update
For render formats, update fd2_pipe2color to only work with HW supported
render formats, and remove the format whitelist is_format_supported. This
patch enables float render formats (which work).

For vertex/texture formats, use a generic function which translates using
the bitsize of the channels. Since we fake support for some vertex formats,
check for these in is_format_supported to avoid enabling them as sampler
formats.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-06 02:24:29 +00:00
Jonathan Marek
21dfa8e486 freedreno/a2xx: fix depth gmem restore
Use fd_gmem_restore_format() to avoid trying to use unsupported Z24S8/Z16
render formats for gmem restore.

Also apply this change to gmem2mem so it doesn't depend on fd2_pipe2color
working with depth formats.

gmem2mem/mem2gmem also doesn't need to use the swap/swizzle, since dst/src
formats are the same.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-06 02:24:29 +00:00
Jonathan Marek
88ca73bcd0 freedreno/a2xx: implement polygon offset
Fixes failures in the following deqp tests:
dEQP-GLES2.functional.polygon_offset.*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
ac4ca24c32 freedreno/a2xx: fix SRC_ALPHA_SATURATE for alpha blend function
Fixes failures in the following deqp tests:
dEQP-GLES2.functional.fragment_ops.*src_alpha_saturate*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
80906a12d9 freedreno/a2xx: ir2: update register state in scalar insert
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-06 02:24:29 +00:00
Jonathan Marek
588cfe4a2b freedreno/a2xx: ir2: fix incorrect instruction reordering
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-06 02:24:29 +00:00
Jonathan Marek
a6ebd4ab08 freedreno/a2xx: ir2: check opcode on the right instruction in export cp
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
19e62fec60 freedreno/a2xx: ir2: fix saturate in cp
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
c5e6961a58 freedreno/a2xx: ir2: set lower_fdph
The fdph opcode is not supported.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
22799787b5 freedreno/a2xx: ir2: remove pointcoord y invert
Fixes the following deqp test:
dEQP-GLES2.functional.shaders.builtin_variable.pointcoord

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
3516a90ab4 freedreno/a2xx: ir2: fix lowering of instructions after float lowering
Some instructions generated by int/bool float lowering need to be lowered
by opt_algebraic.

Fixes: 43dbd7d6

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Vasily Khoruzhick
517b60dc13 lima/ppir: don't lower vector {b,f}csel to scalar if condition is scalar
Utgard PP has vector fcsel operation, but its condition is scalar. Add
filtering callback that checks whether {b,f}csel condition is not scalar
to lower {b,f}csel to scalar only in this case.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-06 01:51:28 +00:00
Vasily Khoruzhick
9367d2ca37 nir: allow specifying filter callback in lower_alu_to_scalar
Set of opcodes doesn't have enough flexibility in certain cases. E.g.
Utgard PP has vector conditional select operation, but condition is always
scalar. Lowering all the vector selects to scalar increases instruction
number, so we need a way to filter only those ops that can't be handled
in hardware.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-06 01:51:28 +00:00
Rob Clark
f9f7cbc1aa util: android logging support
In particular, it would be nice for failed debug_assert() msgs to show
up in logcat.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-06 00:45:11 +00:00
Rob Clark
9baa72b7fc freedreno/ir3: allow copy propagation for relative
This appears to work fine (with the additional constraint of keeping the
indirect load in the same block that a0.x was loaded).

We can probably lift this restriction on earlier gens after testing.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 00:13:44 +00:00
Rob Clark
d9ad6f54dc freedreno/ir3: fix cp cmps.s opt
Need to use ir3_instr_set_address(), otherwise the instruction might not
get added to the indirects table.  This becomes a problem when we turn
on copy propagation for relative accesses, as check_instr() in the sched
pass won't realize there is an indirect consumer of address register
load that is ready to be scheduled.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 00:13:44 +00:00
Rob Clark
e59bfc820b freedreno/ir3: assert that only single address
An instruction can reference only a single address register value.
Add an assert to catch bugs.

Also, address value should also be local to the same block as the
instruction.

(The one spot where changing the instruction address is actually legit
needs to clear the address first.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 00:13:44 +00:00
Rob Clark
f94f22e87a freedreno/ir3: fix mad copy propagation special case
After the next patch enabling copy propagation for relative sources,
we'll need to dereference the n'th src in valid_flags(), so we actually
need to swap the sources before calling valid_flags().

But the logic was already a bit cumbersome, so move it into a helper
function.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 00:13:44 +00:00
Rob Clark
1fd6a91d4a freedreno/ir3: fix addr/pred spilling
The live_values and use_count was not being properly updated.  This
starts triggering problems with the next patch, where we allow copy
propagation for RELATIV access.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 00:13:44 +00:00
Rob Clark
50a91fbf87 freedreno/ir3: cleanup "partially const" ubo srcs
Move the constant part of the indirect offset into nir intrinsic base.
When we have multiple indirect accesses with different constant offsets,
this lets other opt passes clean up things to use a single address
register value.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 00:13:44 +00:00
Erico Nunes
17bb437ac2 lima/ppir: improve regalloc spill cost calculation
Now that spilling ops can be inserted into existing instructions, it
makes sense to increase cost to spill registers that would cause the
creation of a new instruction.
Experimental results showed that penalizing too much due to this caused
worse results, however it is beneficial as a tie resolver between
registers with the same number of components.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-05 23:29:24 +00:00
Erico Nunes
7b2f195d0b lima/ppir: optimizations in regalloc spilling code
Avoid creating unnecessary instructions for the load/store temp nodes
when not required, to further reduce register pressure.

The store_temp operation seems to be unable to do any spilling.
At least the offline shader seems to never output instructions accessing
swizzled components, and attempting to output that in ppir results in
errors. So, force spilled registers to allocate a full vec4 register.
This seems to be the optimal way as it is possible to always keep stores
and temps in a single instruction that can be pipelined.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-05 23:29:24 +00:00
Erico Nunes
f9bf1a95ec lima/ppir: mark regalloc created ssa unspillable
One ssa created in the spillinc code in ppir_update_spilled_src was not
properly being marked 'spilled', which made it a candidate for future
spilling attempts.
Since it was being inserted by the spilling code itself, let's mark it
unspillable to avoid an infinite spilling loop.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-05 23:29:24 +00:00
Jose Maria Casanova Crespo
a5df0fa0b1 v3d: writes to magic registers aren't RF writes after THREND
Shaders must not attempt to write to the register files in the last
three instructions, but that doesn't include the magic registers:

nop                  ; nop               ; thrsw; ldtmu.- *** ERROR ***
nop                  ; nop
nop                  ; nop

v2: Simplify validation rules. (Eric Anholt)
v3: Adjust validation even more. (Eric Anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-05 22:54:13 +01:00
Sergii Romantsov
1dce75c183 intel/dri: finish proper glthread
KWin was able to get NULL-context in the call
intelUnbindContext. But a call _mesa_glthread_finish
is not resistent to such case.
Case can be catched with steps:
	1. Create both glx and egl contexts
	2. Make glx as current
	3. Make egl as current
	4. Reset glx context
	5. Make egl as current

Solution adds proper finishing of glthread-context
(context will be taken from the requested dri-context
for unbinding, but not from the saved current context).

Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271
Fixes: dca36d5516 (i965: Implement threaded GL support)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-05 09:04:12 -07:00
Connor Abbott
3f5b541fc8 radv: Call nir_propagate_invariant()
Without this, invariant qualifiers don't do anything. Together with a
fix to the game, this fixes flickering in No Man's Sky.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-05 14:05:46 +02:00
Connor Abbott
2f5783bc2b radeonsi/nir: Don't lower constant arrays to uniforms
shader-db results:

Totals:
SGPRS: 3955968 -> 3954960 (-0.03 %)
VGPRS: 2220220 -> 2220092 (-0.01 %)
Spilled SGPRs: 11387 -> 11325 (-0.54 %)
Spilled VGPRs: 97 -> 97 (0.00 %)
Private memory VGPRs: 2528 -> 2528 (0.00 %)
Scratch size: 2656 -> 2656 (0.00 %) dwords per thread
Code Size: 76002204 -> 75994988 (-0.01 %) bytes
LDS: 740 -> 740 (0.00 %) blocks
Max Waves: 772776 -> 772787 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 16840 -> 15832 (-5.99 %)
VGPRS: 16452 -> 16324 (-0.78 %)
Spilled SGPRs: 1416 -> 1354 (-4.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 2016 -> 2016 (0.00 %)
Scratch size: 2040 -> 2040 (0.00 %) dwords per thread
Code Size: 953624 -> 946408 (-0.76 %) bytes
LDS: 303 -> 303 (0.00 %) blocks
Max Waves: 1622 -> 1633 (0.68 %)
Wait states: 0 -> 0 (0.00 %)

There were a large number of regressions in code size, but they seem to
be because NIR unrolls some loop which results in the table being
replaced by a bunch of immediates on multiplies etc. -- this bloats code
size since the table size is now included, but means that there are less
loads so it's still a net positive.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-05 12:39:26 +02:00
Connor Abbott
2af431cf7f gallium: Plumb through a way to disable GLSL const lowering
For radeonsi, we will prefer the NIR pass as it'll generate better code
(some index calculation and a single load vs. a load, then index
calculation, then another load) and oftentimes NIR optimization can kick
in and make all the access indices constant.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-05 12:38:46 +02:00
Connor Abbott
49503ae74e st/nir: Don't lower indirects when linking
I believe this was stuck here early because otherwise
nir_opt_copy_prop_vars could undo what lower_io_to_temporaries does.
However that has since been fixed. Also, we now use scratch for large
variables so the comment is stale.

On radeonsi these are the shader-db results:

Totals:
SGPRS: 3955968 -> 3955968 (0.00 %)
VGPRS: 2220208 -> 2220220 (0.00 %)
Spilled SGPRs: 11387 -> 11387 (0.00 %)
Spilled VGPRs: 97 -> 97 (0.00 %)
Private memory VGPRs: 2528 -> 2528 (0.00 %)
Scratch size: 2656 -> 2656 (0.00 %) dwords per thread
Code Size: 76002108 -> 76002204 (0.00 %) bytes
LDS: 740 -> 740 (0.00 %) blocks
Max Waves: 772779 -> 772776 (-0.00 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 176 -> 176 (0.00 %)
VGPRS: 144 -> 156 (8.33 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 12104 -> 12200 (0.79 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 28 -> 25 (-10.71 %)
Wait states: 0 -> 0 (0.00 %)

The few small regressions are due to nir_opt_large_constants kicking in
when indirect lowering happens to result in smaller code after
optimization since the array is very simple.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-05 12:38:22 +02:00
Connor Abbott
7d2d7b5d5f st/nir: Call nir_remove_unused_variables() in the opt loop
This prevents regressions when disabling indirect lowering. Sometimes
the only use of an input array was copying it to the array created by
nir_lower_io_to_temporaries, and without lowering indirects we wouldn't
have eliminated the temporary array until after linking, which was too
late to remove unused code in the producer.

No shader-db changes with radeonsi NIR.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-05 12:37:28 +02:00
Connor Abbott
71a6794200 ac/nir: Enable nir_opt_large_constants
vkpipeline-db numbers:

Totals:
SGPRS: 1740306 -> 1741322 (0.06 %)
VGPRS: 1331124 -> 1331712 (0.04 %)
Spilled SGPRs: 21201 -> 21316 (0.54 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 256 -> 256 (0.00 %) dwords per thread
Code Size: 79022628 -> 78694788 (-0.41 %) bytes
LDS: 6500 -> 6500 (0.00 %) blocks
Max Waves: 301413 -> 301302 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 53633 -> 54649 (1.89 %)
VGPRS: 53000 -> 53588 (1.11 %)
Spilled SGPRs: 3454 -> 3569 (3.33 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 5284232 -> 4956392 (-6.20 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 4239 -> 4128 (-2.62 %)
Wait states: 0 -> 0 (0.00 %)

(The biggest VGPR and max wave regression is due to unrolling a loop,
which made the scheduler more aggressive, but in this case it's able to
effectively hide latency so it's actually probably a win.)

shader-db numbers with radeonsi NIR:

Totals:
SGPRS: 3526496 -> 3526512 (0.00 %)
VGPRS: 2198576 -> 2198576 (0.00 %)
Spilled SGPRs: 10463 -> 10463 (0.00 %)
Spilled VGPRs: 86 -> 86 (0.00 %)
Private memory VGPRs: 3182 -> 2528 (-20.55 %)
Scratch size: 3308 -> 2640 (-20.19 %) dwords per thread
Code Size: 74117280 -> 74106140 (-0.02 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 775846 -> 775844 (-0.00 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 856 -> 872 (1.87 %)
VGPRS: 680 -> 680 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 654 -> 0 (-100.00 %)
Scratch size: 668 -> 0 (-100.00 %) dwords per thread
Code Size: 49652 -> 38512 (-22.44 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 182 -> 180 (-1.10 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:46 +02:00
Connor Abbott
91626d0865 ac/nir: Support load_constant intrinsics
Setup a constant global variable that LLVM will stick in a .rodata
section and generate PC-relative loads for.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:42 +02:00
Connor Abbott
5dadbabb47 radv/radeonsi: Don't count read-only data when reporting code size
We usually use these counts as a simple way to figure out if a change
reduces the number of instructions or shrinks an instruction. However,
since .rodata sections aren't executed, we shouldn't be counting their
size for this analysis. Make the linker return the total executable
size, and use it to report the more useful size in both drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:35 +02:00
Heinrich Fink
5cc7cc5f17 headers: remove redundant GL token from GL wrapper
Removing GL_FRAMEBUFFER_FLIP_Y_MESA token from glheader.h as it is now
provided by glext.h

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-05 09:26:35 +02:00
Heinrich Fink
e2c88b7cd6 specs: Sync framebuffer_flip_y text with GL registry
Sync extension spec of MESA_framebuffer_flip_y to what has been merged
upstream in the GL registry. Update now carries the accepted GL
extension no.

v2: split GL headers update off to separate commit

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-05 09:26:30 +02:00
Heinrich Fink
c9a3f4fe40 include: sync GL headers with registry
Integrating headers from upstream registry [0] master branch. Effective
GL registry commit integrated:

9d534f9312e56c72df763207e449c6719576fd54

Keeping the following quirks local to Mesa:

- glext.h: BUILDING_MESA guard (see !1492)

- glxext.h: glXQueryGLXPbufferSGIX: 'int' return type (Mesa) vs while
'void' (GL registry)

- glxext.h: GLX_RENDERER_ID_MESA is still expected by some mesa tests,
even though its token has been removed from the spec (see
docs/specs/MESA_query_renderer.spec)

- glxext.h: glXGetTransparentIndexSUN / PFNGLXGETTRANSPARENTINDEXSUNPROC
argument pTransparentIndex has type 'unsigned long *' (Mesa) vs. 'long
*' (GL registry)

[0] https://github.com/KhronosGroup/OpenGL-Registry

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-05 09:26:15 +02:00
Hal Gentz
55c912883c clover: Fix build after clang r370122.
../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp: In function ‘std::unique_ptr<clang::CompilerInstance> {anonymous}::create_compiler_instance(const clover::device&, const std::vector<std::__cxx11::basic_string<char> >&, std::string&)’:
../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp:203:81: error: no matching function for call to ‘clang::CompilerInvocation::CreateFromArgs(clang::CompilerInvocation&, const char* const*, const char* const*, clang::DiagnosticsEngine&)’
  203 |              c->getInvocation(), copts.data(), copts.data() + copts.size(), diag))
      |                                                                                 ^
In file included from /opt/llvm64/include/clang/Frontend/CompilerInstance.h:15,
                 from ../mesa/src/gallium/state_trackers/clover/llvm/codegen.hpp:37,
                 from ../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp:49:
/opt/llvm64/include/clang/Frontend/CompilerInvocation.h:157:15: note: candidate: ‘static bool clang::CompilerInvocation::CreateFromArgs(clang::CompilerInvocation&, llvm::ArrayRef<const char*>, clang::DiagnosticsEngine&)’
  157 |   static bool CreateFromArgs(CompilerInvocation &Res,
      |               ^~~~~~~~~~~~~~
/opt/llvm64/include/clang/Frontend/CompilerInvocation.h:157:15: note:   candidate expects 3 arguments, 4 provided

Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2019-09-04 22:29:52 -05:00
Vinson Lee
e716a9e213 scons: Add coroutines component to build.
Fixes: d32690b43c ("gallivm: add coroutine pass manager support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-09-04 20:05:43 -07:00
Eric Anholt
cc3c217ce0 gallium/osmesa: Move 565 format selection checks where the rest are.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-04 16:43:36 -07:00
Eric Anholt
9e7eb9780a gallium/osmesa: Fix a race in creating the stmgr.
Noticed while looking at other OSMesa bugs.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-04 16:43:36 -07:00
Eric Anholt
281466332b gallium/osmesa: Introduce a test.
Given that we occasionally touch this code and probably nobody really
wants to think about it, introduce a minimal test so that we know we
haven't completely broken OSMesa.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-04 16:43:36 -07:00
Dylan Baker
d89d075589 docs: Mark 19.2.0-rc2 as done and push back rc3 and rc4/final 2019-09-04 16:00:02 -07:00
Hal Gentz
1591d1fee5 glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX.
When run in optirun, applications that linked to `libGLX.so` and then
proceeded to querying Mesa for extension strings caused a SEGV in Mesa.

`glXQueryExtensionsString` was calling a chain of functions that
eventually led to `__glXQueryServerString`. This function would call
`xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`.
The latter for some unknown reason returned `NULL`. Passing this `NULL`
to `xcb_glx_query_server_string_string_length` would cause a SEGV as the
function tried to dereference it.

The reason behind the function returning `NULL` is yet to be determined,
however, simply checking that the ptr is not `NULL` resolves this. A
similar check has been added to `__glXGetString` for completeness sake,
although not immediately necessary.

In addition to that, we stumbled into a similar problem in
`AllocAndFetchScreenConfigs` which tries to access the configs to free
them if `__glXQueryServerString` fails. This, of course, SEGVs, because the
configs are yet to have been allocated. Simply continuing past the configs
if their config ptrs are `NULL` resolves this. We also switch to `calloc`
to make sure that the config ptrs are `NULL` by default, and not some
uninitialized value.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 24b8a8cfe8 "glx: implement __glXGetString, hide __glXGetStringFromServer"
Fixes: cb3610e37c "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it "
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
2019-09-04 16:00:10 +00:00
Adam Jackson
9acb94b623 egl: Enable 10bpc EGLConfigs for platform_{device,surfaceless}
It's somewhat annoying that these are so similar for so little benefit.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-04 11:39:57 -04:00
Neil Roberts
95927c414f glsl: Store the precision for a function return type
The precision for a function return type is now stored in
ir_function_signature. This will later be useful to implement mediump
to float16 lowering. In the meantime it is also useful to catch errors
where a function is redeclared with a different precision.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-04 12:41:20 +02:00
Dave Airlie
3a7e92dac5 docs: add llvmpipe features for fb_no_attach and compute shaders
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
c0521ecffb llvmpipe: enable compute shaders if LLVM has coroutines
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
6453a22612 llvmpipe: add local memory allocation path
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
4e70970507 llvmpipe: add compute shader parameter fetching support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
0b51e73de2 llvmpipe: add compute shader images support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
45a8cf95f2 llvmpipe: add ssbo support to compute shaders
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
6ea8e9b415 llvmpipe: add compute sampler + sampler view support.
This is ported from the fragment shader code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
4ca40cc3dc llvmpipe: add support for compute constant buffers.
This is mostly ported from the fragment shader code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
775fa81d7b llvmpipe: add compute pipeline statistics support.
This just adds the CS invocations counter.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
50fde5b208 llvmpipe: add grid launch
This adds the dispatch code. It creates a job for the number
of blocks in the grid, and dispatches them to the threadpool
implementation. The threadpool then calls the JIT code to
execute the coroutines.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
b320830bbd llvmpipe: add compute shader generation.
This creates the coroutine execution environment and the
main compute shaders that get executed inside it.

Each compute shader block is executed in it's own coroutine
execution shader, which each "thread" being a coroutine executed
inside it in sequence.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
6ea41df94c llvmpipe: introduce variant building infrastrucutre.
This doesn't actually build any of the shaders yet, but just
builds up the framework necessary to start building the shaders
and variants.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
fc01fafdbc llvmpipe: introduce new state dirty tracking for compute.
Compute doesn't share dirty state with the fragment pipeline
so create a separate path for it.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
a6f6ca37c8 llvmpipe: add initial shader create/bind/destroy variants framework.
This is mostly a port of the fragment shader framework

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
a792c5ae3e llvmpipe: add compute debug option
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
25f46ae9aa gallivm: add compute jit interface.
This adds the jit interface for compute shaders, it's based
on the fragment shader one.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
3879f69b50 llvmpipe: add initial compute state structs
These mirror the fragment shader structs, this is just a framework.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
add0b151f5 llvmpipe: introduce compute shader context
The compute shader will need it's own context like the frag shader
has, this just introduces the framework struct and allocates/frees
for it in the right places.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
83597ad3f2 gallivm: add barrier support for compute shaders.
When the code is executing an hits a barrier, it will suspend
the coroutine and return control to the coroutine dispatcher.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
1b24e3ba75 llvmpipe: add compute threadpool + mutex
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
In order to efficiently run a number of compute blocks, use
a threadpool that just allows for jobs with unique sequential
ids to be dispatched.
2019-09-04 15:22:20 +10:00
Dave Airlie
e5bf6b7013 gallivm: add support for compute shared memory
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
db6c78f9c8 gallivm: add new compute related intrinsics
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
3312bed7b0 llvmpipe: reogranise jit pointer ordering
In order to share the texture/image/sampler code with compute
shaders we need to reorg them to be at the front of context
same as draw does for vs/gs sharing.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
d32690b43c gallivm: add coroutine pass manager support
coroutines require a proper pass manager, so add the passes
to the correct places

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
9cf1340e4f gallivm: add coroutine support files to gallivm.
These wrap the coroutine intrinsics and also add some higher
level wrappers around coroutine begin, end and suspend procedures

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
f3f0cbf4f4 gallivm/flow: add counter reset for loops
This allows the counter value to be forced to a certain value

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Dave Airlie
6b3c6b91a8 llvmpipe: enable fb no attach
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-09-04 15:22:20 +10:00
Kenneth Graunke
f8887909c6 iris: Report correct number of planes for planar images
We were only handling the modifiers case and not counting the number of
planes in actual planar images.

Fixes Piglit's ext_image_dma_buf_import-export.

Fixes: fc12fd05f5 ("iris: Implement pipe_screen::resource_get_param")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-09-03 21:55:23 -07:00
Ilia Mirkin
32d458fdff teximage: ensure that Tex*SubImage* checks format
We were previously not doing at least some of the checks. This uses the
same logic that is used in glTexImage*.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-04 00:35:45 -04:00
Jan Beich
8e92ce9ba5 gallium/hud: add CPU usage support for DragonFly/NetBSD/OpenBSD
Each BSD has slightly different sysctl for retrieving per-CPU times.
FreeBSD returns long while NetBSD returns uint64_t. On OpenBSD return
type differs between summation and per-CPU times. DragonFly is
compatible with FreeBSD.

Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
2019-09-03 22:53:15 -04:00
Roman Stratiienko
ef621a73f7 lima: Return fence unconditionally
Based on the vc4 implementation.
Fixes Android RenderEngine::flush() routine:
android.googlesource.com/platform/frameworks/native/+/refs/tags/android-o-mr1-iot-release-smart-clock-fcs/services/surfaceflinger/RenderEngine/RenderEngine.cpp#225

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-09-04 00:32:04 +00:00
Vasily Khoruzhick
1c1890fa70 lima/ppir: clone uniforms and load_coords into each successor
Try more aggressive approach with cloning uniform and coord loads.

Uniform load can be inserted into any instruction, so let's do that. ARM site
claim that penalty for cache miss is one clock, so we don't lose anything if
we merge it into instruction that uses the result. As side effect we can also
pipeline it and thus decrease reg pressure.

Do the same for varyings that hold texture coords, but for different reason:
looks like there's a special path for coords that increases precision if
varying that holds it is pipelined. If we don't pipeline it and load coords
from a register its precision is fp16 and thus only 10 bits which is not enough
to accurately sample textures of size 1024 or larger.

Since instruction can hold only one uniform load and one varying load,
node_to_instr now creates a move using helper introduced in previous commit if
slot is already taken. As side effect of this change we can also try to
pipeline texture loads and create a move if attempt fails.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-04 00:02:13 +00:00
Vasily Khoruzhick
e23fd2c375 lima/ppir: don't assume that load coords gets value from register
It can load value from varying directly as well. Also load_regs is the
only op that has a source, so add src_num field to load node and set it
accordingly.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-04 00:02:13 +00:00
Vasily Khoruzhick
bd77d19300 lima/ppir: add common helper for creating movs
Introduce common helper for creating movs to avoid code duplication

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-04 00:02:13 +00:00
Eric Engestrom
7659c6197f nir: fix memleak in error path
Fixes: 2cf59861a8 ("nir: Add partial redundancy elimination for compares")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-09-04 00:31:53 +01:00
Eric Engestrom
c4969b0a25 freedreno/drm-shim: fix mem leak
Fixes: 494ecef6b4 ("freedreno: Add support for drm-shim.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-04 00:18:37 +01:00
Eric Engestrom
7abf65aedc anv: fix format string in error message
Fixes: 9775894f10 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-04 00:13:20 +01:00
Eric Engestrom
1667360f7d util/os_file: fix double-close()
Fixes: 955c63d364 ("util/os_file: resize buffer to what was actually needed")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-04 00:11:51 +01:00
Eric Engestrom
43d470404c egl: fix deadlock in malloc error path
Fixes: cb0980e69a ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-04 00:10:18 +01:00
Eric Engestrom
3afe9d798a ttn: fix 64-bit shift on 32-bit 1
Fixes: 4d0b2c7aaa ("ttn: Update shader->info as we generate code.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-09-04 00:01:08 +01:00
Rob Clark
1ef459297c freedreno/ir3: use uniform base
When lowering from ubo, use the constant base field in the load_uniform
instruction for the constant part of the offset.  Doesn't change much
for constant indexing, but this will help for indirect indexing because
constant-folding can't completely clean up the result.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-03 14:10:57 -07:00
Rob Clark
305bcdf992 freedreno/drm: fix 64b iova shifts
Should shift before splitting 64b iova into dwords

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-03 14:10:57 -07:00
Rob Clark
5ccd5871ed nir: remove unused constant_fold_state
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-03 14:10:57 -07:00
Eric Anholt
79a5ebe045 freedreno: Fix the type of single-component scaled vertex attrs.
This looks like clear copy-and-pasteos, and fixes:

dEQP-GLES2.functional.draw.random.40

(on A307 and A630, both tested in the new CI farm)

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-03 19:34:09 +00:00
Connor Abbott
f3e978db4d radeonsi/nir: Remove uniform variable scanning
We can get all the information we need from NIR. It's slightly less
accurate, but radeonsi doesn't use the extra information. The old code
also overcounted atomic counters, which led to problems when everything
was used at once.

Fixes KHR-GL45.compute_shader.resources-max.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-03 15:55:02 +02:00
Connor Abbott
96c2a2832f ttn: Fill out more info fields
We'll use these in radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-03 15:54:57 +02:00
Connor Abbott
dcc64fcfed nir: Fix num_ssbos when lowering atomic counters
Otherwise it's impossible to know the maximum SSBO index for both
internal TGSI shaders from TTN (which don't have any notion of atomic
counters and no offset) as well as shaders from GLSL.

I fixed everything I could find while grepping for num_ssbos and
num_abos, which hopefully is everything (iris was the only user I could
find that uses it in a meaningful way).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-03 15:54:54 +02:00
Connor Abbott
2abf62d348 ac/nir: Fix gather4 integer wa with unnormalized coordinates
This adds a bit of unneccesary code on radeonsi, since whether
unnormalized coordinates are used is known at compile time with GL, but
I wasn't sure if it was worth the few instructions to plumb everything
through, especially for something so rare -- my shader-db doesn't have
any instances where this changes anything.

Fixes CTS tests I created at
https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-03 13:50:54 +00:00
Connor Abbott
c63ccf90df ac/nir: Rewrite gather4 integer workaround based on radeonsi
The workaround was originally written based on amdgpu-pro traces, but
since then radeonsi has got its own slightly different version. Use the
radeonsi version instead, to be consistent and because it'll be slightly
more convenient for handling unnormalized coordinates.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-03 13:50:54 +00:00
Eric Engestrom
5f7d90f2ff egl: warn user if they set an invalid EGL_PLATFORM
Technically, the user might have set EGL_DISPLAY instead of
EGL_PLATFORM, but since the former is deprecated let's just mention the
latter in the warning message.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-03 14:41:43 +01:00
Alyssa Rosenzweig
5cdfccf8a6 panfrost: Remove panfrost_upload
This routine was made obsolete over a series of reworks of memory
allocation; Tomeu's changes to shader memory allocation finally made
this unused as cppcheck noted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig
42f0aae874 panfrost: Fix misc. issues flagged by cppcheck
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig
6bd18bb264 panfrost: Mark (1 << 31) as unsigned
I was not aware this incurred undefined behaviour; thank you cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig
a058e90138 pan/midgard: Remove mir_rewrite_index_*_tag
These helpers are unused, as flagged by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig
41ebac638a pan/midgard: Remove mir_print_bundle
In practice, the new post-schedule print is just as useful.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig
d34e3f7e0a pan/midgard: Remove cppwrap.cpp
It has not been used in a long time; I forgot this file even existed.
Flagged by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:21 +02:00
Alyssa Rosenzweig
1a4153b24c pan/midgard: Fix cppcheck issues
Miscellaneous minor issues flagged by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:54:21 +02:00
Alyssa Rosenzweig
032e21b33e pan/midgard: Correct issues in disassemble.c
cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:54:05 +02:00
Alyssa Rosenzweig
23376c2d35 pan/decode: Add missing format specifier
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:42:08 +02:00
Alyssa Rosenzweig
dc342aaac3 pan/decode: Use portable format specifier for 64-bit
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:42:04 +02:00
Alyssa Rosenzweig
bcfcb7e624 pan/decode: Use %zu instead of %d
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:41:59 +02:00
Alyssa Rosenzweig
d6d6d6327a pan/decode: Fix uninitialized variables
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:41:34 +02:00
Juan A. Suarez Romero
c1c0386676 docs: update calendar, add news item and link release notes for 19.1.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-03 13:06:56 +02:00
Juan A. Suarez Romero
b3763dab18 docs: add sha256 checksums for 19.1.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 4ec2325dd0)
2019-09-03 13:04:49 +02:00
Juan A. Suarez Romero
4151947583 docs: add release notes for 19.1.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 85c8f88a49)
2019-09-03 13:04:45 +02:00
Lionel Landwerlin
320b0f66c2 vulkan/overlay: bounce image back to present layout
Once we write the overlay to an image to be presented, we must not
forget to put it back into present layout.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-03 07:11:58 +00:00
Zhaowei Yuan
9db06a5350 broadcom/vc4: Expand width of dst surface
Four bytes of src_surf will be compressed into a 32-bits data and
stored into dst_surf, and dst_surf is read as z-order, so its width
must be aligned to multiples of 8(4x2) before divided by 2.

Signed-off-by: Zhaowei Yuan <zhaowei.yuan@samsung.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111266

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-09-03 08:47:43 +02:00
Vinson Lee
538820ff5f swr: Fix make_unique build error.
swr_shader.cpp: In function ‘void (* swr_compile_gs(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
swr_shader.cpp:732:44: error: ‘make_unique’ was not declared in this scope
    ctx->gs->map.insert(std::make_pair(key, make_unique<VariantGS>(builder.gallivm, func)));
                                            ^~~~~~~~~~~

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-09-02 14:52:23 -07:00
nia
1900b82dbf loader: include limits.h for PATH_MAX
This is needed to build on illumos.

The location of the PATH_MAX definition in limits.h seems to be fairly standard:
https://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-09-02 15:49:34 +00:00
Erik Faye-Lund
2f82d972ab util: only allow _BitScanReverse64 on 64-bit cpus
While the documentation for _BitScanReverse64 on MSDN says that it's
available on ARM, this isn't true. It's only available on ARM64. So
let's match reality.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2019-09-02 12:45:45 +00:00
Erik Faye-Lund
1de9ba33a2 mesa/x86: improve SSE-checks for MSVC
This enables some more SSE optimizations on MSVC builds.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-02 12:45:45 +00:00
Erik Faye-Lund
06099d0e0c util: do not assume MSVC implies SSE
This is not true for MSVC on ARM.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-02 12:45:45 +00:00
Erik Faye-Lund
2ade1c5cf7 util: fix SSE-version needed for double opcodes
This code generates CVTSD2SI, which requires SSE2. So let's fix the
required SSE-version.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5de29ae (util: try to use SSE instructions with MSVC and 32-bit gcc)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-02 12:45:45 +00:00
Erik Faye-Lund
ee2bc11cc7 mesa/main: remove unused include
This has been unused since 183db3a645 ("glsl: move half<->float
convertion to util"), Oct 10 2015. Let's drop needlessly including it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-02 12:45:45 +00:00
Samuel Pitoiset
966a455bb9 nir: do not assume that the result of fexp2(a) is always an integral
It's only correct when 'a' is an integral greater or equal to 0.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493
Fixes: 5544b2cbbd ("nir/algebraic: Use value range analysis to eliminate useless unary ops")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-09-02 09:00:37 +02:00
Lionel Landwerlin
6775a52400 egl: fix platform selection
Add missing "device" platform

v2: Add the missing platform (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jean Hertel <jean.hertel@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111529
Fixes: d6edccee8d ("egl: add EGL_platform_device support")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-09-02 06:28:06 +03:00
Kenneth Graunke
87fa8d9ebc iris: Lessen texture cache hack flush for blits/copies on Icelake.
Lionel found actual documentation for this at long last.  Apparently
it actually is a sampler cache limitation that was mostly fixed on
Icelake.  Unfortunately, it seems there are still issues with ASTC
and non-ASTC sampler views.  Still, we can lessen the flush condition
from "format mismatch" to "ASTC mismatch", which eliminates most of
the flushing here.

We also update the documentation to refer to the workaround name.
2019-08-31 20:17:55 -07:00
Vinson Lee
4771f6bccc util: Define strchrnul on macOS.
strchrnul is not available on macOS.

pipe_loader.c:141:14: error: implicit declaration of function 'strchrnul' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
      next = strchrnul(library_paths, ':');
             ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-31 13:26:10 -07:00
Erik Faye-Lund
52af1427c6 gallium/auxiliary/indices: consistently apply start only to input
The majority of these only apply the start argument to the input, but a
few of them also does for the output-array. util_primconvert, the only
user of this argument expects this pass a non-zero start-argument does
not expect this to be applied to the output; if it is, it will write
outside of allocated memory, leading to VRAM corruption.

The reason this doesn't seem to have been noticed before, is that no
driver currently use util_primconvert to convert a primitive-type to
itself, which is the cases where this was broken. But for Zink, this
will no longer be true, because we need to eliminate the use of 8-bit
index-buffers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 28f3f8d413 ("gallium/auxiliary/indices: add start param")
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-31 19:45:52 +00:00
Vinson Lee
029b07b2ad travis: Fail build if any command in if statement fails.
Travis is checking the exit code of the entire if statement.

Fixes: 64ffc289be ("travis: add MacOS Scons build")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-08-31 12:20:04 -07:00
Vinson Lee
3664a6600e swr: Fix build with llvm-9.0 again.
Commit 6f7306c029 ("swr/rast: Refactor memory API between rasterizer
core and swr") unintentionally removed changes for llvm-9.0.

Fixes: 6f7306c029 ("swr/rast: Refactor memory API between rasterizer core and swr")
Fixes: 5dd9ad1570 ("swr/rasterizer: Better implementation of scatter")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-08-31 00:20:40 -07:00
Alyssa Rosenzweig
20237166b6 pan/midgard: Use shared psiz clamp pass
We already had a perfectly cromulent pass for this, but one landed in
common NIR code so let's switch and lighten our tree.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 16:06:09 -07:00
Alyssa Rosenzweig
0b225f1892 pan/midgard: Remove mir_opt_post_move_eliminate
This optimization depended on RA running before scheduling. It therefore
no longer applies and is now unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:28 -07:00
Alyssa Rosenzweig
d699a17475 pan/midgard: Schedule before RA
This is a tradeoff.

Scheduling before RA means we don't do RA on what-will-become pipeline
registers. Importantly, it means the scheduler is able to reorder
instructions, as registers have not been decided yet.

Unfortunately, it also complicates register spilling, since the spills
themselves won't get bundled optimally and we can only spill twice per
ALU bundle (only one spill per bundle allowed here). It also prevents us
from eliminating dead moves introduced by register allocation, as they
are not dead before RA. The shader-db regressions are from poor spilling
choices introduced by the new bundling requirements. These could be
solved by the combination of a post-scheduler (to combine adjacent
spills into bundles) with a VLIW-aware spill cost calculation.
Nevertheless, the change is small enough that I feel it's worth it to
eat a tiny shader-db regression for the sake of flexibility.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:28 -07:00
Alyssa Rosenzweig
5e06d90c45 pan/midgard: Handle fragment writeout in RA
Rather than using a pile of hacks and awkward constructs in MIR to
ensure the writeout parameter gets written into r0, let's add a
dedicated shadow register class for writeout (interfering with work
register r0) so we can express the writeout condition succintly and
directly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:27 -07:00
Alyssa Rosenzweig
116b17d2d1 pan/midgard: Do not propagate swizzles into writeout
There's no slot for it; you'll end up writing into the void and
clobbering stuff. Don't. do it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:27 -07:00
Alyssa Rosenzweig
eb3cc20f42 pan/midgard: Fix misc. RA issues
When running the register allocator after scheduling, the MIR looks a
little different, so we need to extend the RA to handle a few of these
extra cases correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:27 -07:00
Alyssa Rosenzweig
e5ba016d3a pan/midgard: Print MIR by the bundle
After scheduling, we still have valid MIR, but we have additional
bundling annotations which we would like to keep debug, so print these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:27 -07:00
Alyssa Rosenzweig
f42cebdd84 pan/midgard: Print branches in MIR
Rather than a vague "br.??" line, annotate the branch with its target
type (useful for disambiguating discards) and whether it was inverted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig
59f2cfcbc7 pan/midgard: Remove texture_index
This is deadcode.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig
76529836ec pan/midgard: Cleanup fragment writeout branch
I'm not sure if this is strictly necessary but it makes debugging easier
and minimizes the diff with the experimental scheduler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig
cc2ba8efe9 pan/midgard: Add scheduling barriers
Scheduling occurs on a per-block basis, strongly assuming that a given
block contains at most a single branch. This does not always map to the
source NIR control flow, particularly when discard intrinsics are
involved. The solution is to allow scheduling barriers, which will
terminate a block early in code generation and open a new block.

To facilitate this, we need to move some post-block processing to a new
pass, rather than relying hackily on the current_block pointer.

This allows us to cleanup some logic analyzing branches in other parts
of the driver us well, now that the MIR is much more well-formed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig
19bceb5812 pan/midgard: Track shader quadword count while scheduling
This allow multiblock blend shaders to compute constant colour offsets
correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig
72cbd2d4e7 pan/midgard: Allow NULL argument in mir_has_arg
It's sometimes convenient to call this with no instruction specified. By
definition, a missing instruction cannot reference any argument, so
let's check for NULL and shortciruit to false.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig
bcc59ff04d pan/midgard: Improve mir_mask_of_read_components
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig
5377d70292 pan/midgard: Extend mir_special_index to writeout
The branch has the writeout specified in its source list, making this
special even if it's not explicitly part of r0.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig
b56399fcd2 pan/midgard: csel_swizzle with mir get swizzle
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:25 -07:00
Alyssa Rosenzweig
28622f9088 pan/midgard: Add mir_insert_instruction*scheduled helpers
In order to run register allocation after scheduling, it is sometimes
necessary to be able to insert instructions into an already-scheduled
program. This is suboptimal, since it forces us to do a worst-case
scheduling, but it is nevertheless required for correct handling of
spills/fills. Let's add helpers to insert instructions as standalone
bundles for use in spilling code.

These helpers are minimal -- they *only* work on load/store ops or
moves. They should not be used for anything but register spilling; any
other instructions should be added prior to the schedule.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:25 -07:00
Alyssa Rosenzweig
8e369966d7 pan/midgard: Track csel swizzle
While it doesn't matter with an unconditional move to the conditional
register (r31), when we try to elide that move we'll need to track the
swizzle explicitly, and there is no slot for that yet since ALU ops are
normally binary.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig
a8eafb0b74 pan/midgard: Ensure fragment writeout is in the final block
This ensures the block only has exactly one branch, which makes
scheduling happy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig
cfd5bd2c7d pan/midgard: Document Midgard scheduling requirements
Oh boy. Midgard scheduling is crazy... These are all just the
requirements, not even the algorithm yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig
d6e4e36566 pan/midgard: Include condition in branch->src[0]
This will allow us to reference the condition while scheduling.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig
bd79cddafa pan/midgard: Add post-schedule iteration helpers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig
f29d03a1f9 pan/midgard: Fix corner case in RA
It doesn't really matter but... meh.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig
d722b60191 pan/midgard: Add OP_IS_CSEL_V helper
..to distinguish from scalar csel.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig
01316719cf pan/midgard: Expose mir_get/set_swizzle
The scheduler would like to use these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig
3f757425a4 pan/midgard: Extract instruction sizing helper
The scheduler shouldn't need to worry about this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:23 -07:00
Alyssa Rosenzweig
bbe2914967 pan/midgard: Factor out mir_is_scalar
This helper doesn't need to be in the giant loop.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:23 -07:00
Alyssa Rosenzweig
67909c8ff2 pan/midgard: Count shader-db stats by bundled instructions
This does not affect shaders in any way. Rather, it makes the shader-db
instruction count recorded in the compiler accurate with the in-order
scheduler, matching up with what we calculate from pandecode.

Though shaders are the same, instruction counts cannot be compared
across this commit for this reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:22 -07:00
Alyssa Rosenzweig
3f9dc97124 freedreno/ir3: Link directly to Sethi-Ullman paper
Allow a direct link to the PDF itself from the authors themselves,
rather than a paywall splash page.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Rob Clark <robdclark@chromium.org>
2019-08-30 15:50:22 -07:00
Adam Jackson
da5ebe3010 Revert "glx: Unset the direct_support bit for GLX_EXT_import_context"
The GLX extension strings are independent of any context, so abusing the
direct_support bit to control this extension's visibility is wrong.

This reverts commit 079d0717fc896bc8086b037d0ed22642274986c7.

Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
2019-08-30 17:50:45 -04:00
Boris Brezillon
9087cf7015 panfrost: Add transient BOs to job batches
Memory allocated through panfrost_allocate_transient() is likely to
come from the transient pool. Let's add the BO backing the allocated
memory region to the job batch so the kernel can retain this BO while
jobs are executed.

In practice that has never been a problem because the transient pool
is never shrinked, and even if it was, we still control the lifetime of
the job, so there's no reason for this BO to be freed before the GPU is
done executing the batch. But it still make sense to add the BO for
debugging purpose.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 22:13:41 +02:00
Rohan Garg
b2ff2dfc2a panfrost: protect access to shared bo cache and transient pool
Both the BO cache and the transient pool are shared across
context's. Protect access to these with mutexes.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-30 22:10:49 +02:00
Rohan Garg
6b0dc3d530 panfrost: Jobs must be per context, not per screen
Jobs _must_ only be shared across the same context, having
the last_job tracked in a screen causes use-after-free issues
and memory corruptions.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-30 22:06:54 +02:00
Lepton Wu
bd98470a46 st/mesa: Allow zero as [level|layer]_override
This fix two dEQP tests for virgl:

dEQP-EGL.functional.image.create.gles2_cubemap_positive_x_rgba_texture
dEQP-EGL.functional.image.render_multiple_contexts.gles2_cubemap_positive_x_rgba8_texture

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 17:30:53 +00:00
Khaled Emara
6926f56d5b freedreno/a3xx: fix sysmem <-> gmem tiles transfer
Tiling mode was missing from fd3_emit_gmem_restore_tex().
emit_gmem2mem_surf() used LINEAR exclusiveley.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-08-30 08:54:30 -07:00
Khaled Emara
ed1954ced3 freedreno/a3xx: fix texture tiling parameters
* Fix 2D/2DArray/3D tiling parameters:
  There is a bottom threshold for width and height.
* Renable tiling for Cubemap, after setting the right parameters.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-08-30 08:54:30 -07:00
Michel Dänzer
8de25ecd6b gitlab-ci: Use new needs: keyword
This way, the test jobs can start running before all build+test jobs
have finished, once the meson-main job has.

Idea suggested by Daniel Stone on IRC.

See https://docs.gitlab.com/ce/ci/directed_acyclic_graph/ and
https://docs.gitlab.com/ce/ci/yaml/README.html#needs for details.

v2:
* Improve commit log (Daniel Stone, Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-30 11:27:00 +02:00
Michel Dänzer
42f8d5a531 gitlab-ci: Move up meson-main job definition
In order to increase the chance of it running early.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-30 11:25:26 +02:00
Dave Stevenson
873b092e91 broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride.
Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride." for v3d.

Allows YUV buffers with a single buffer and plane offsets to be
passed in.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-30 10:53:05 +02:00
Jan Zielinski
2263e6a895 swr/rasterizer: Fix GS attributes processing
Input to GS is just a set of attributes, so remove explicit setup of
'position' which is meaningless for GS input processing.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-30 07:31:45 +00:00
Samuel Pitoiset
6b96c94b5a radv: keep a pointer to a NIR shader into radv_shader_context
This avoids multiple copies for nothing and it's more elegant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:30 +02:00
Samuel Pitoiset
7b1655ccf3 radv: move setting can_discard to ac_fill_shader_info()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:27 +02:00
Samuel Pitoiset
081561de16 radv: replace ac_nir_build_if by ac_build_ifcc
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:25 +02:00
Samuel Pitoiset
cc3d36b5dd radv: remove radv_init_llvm_target() helper
RADV no longer uses specific LLVM options compared to the common code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:21 +02:00
Samuel Pitoiset
dc27a54c84 radv: remove useless ac_llvm_util.h include from the WSI code
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:19 +02:00
Samuel Pitoiset
6cb455c418 radv: remove unused shader_info parameter in ac_compile_llvm_module()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:17 +02:00
Samuel Pitoiset
9aaca90123 radv: remove some unused fields from radv_shader_context
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:15 +02:00
Samuel Pitoiset
8d44f83844 radv: move lowering PS inputs/outputs at the right place
At shaders creation, just after NIR linking.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:29:31 +02:00
Samuel Pitoiset
151d6990ec radv: gather info about PS inputs in the shader info pass
It's the right place to do that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:29:29 +02:00
Samuel Pitoiset
9f2fd23f99 ac: drop now useless lookup_interp_param from ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 08:23:56 +02:00
Samuel Pitoiset
a63719db6a ac: import linear/perspective PS input parameters from radv/radeonsi
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 08:23:54 +02:00
Krzysztof Raszkowski
8be51061ec util: Add unreachable() definition for clang compiler.
Without unreachable() definition clang throw return-type error
in many places in mesa code.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-30 05:50:21 +00:00
Nataraj Deshpande
e3f54cb0c1 egl/android: Enable HAL_PIXEL_FORMAT_RGBA_FP16 format
The patch adds support for 64 bit HAL_PIXEL_FORMAT_RGBA_FP16
for android platform.

Fixes android.graphics.cts.BitmapColorSpaceTest#test16bitHardware
which failed in egl due to "Unsupported native buffer format 0x16"
on chromebooks.

Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-29 23:16:08 +00:00
Dave Airlie
a69ae76cc8 gallivm: disable accurate cube corner for integer textures.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111511
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-30 08:27:16 +10:00
Pierre-Eric Pelloux-Prayer
47cc660d9c glsl: replace 'x + (-x)' with constant 0
This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:

   value -= value

with value being undefined.
In this case LLVM replace the operation with an assignment to NaN.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-29 17:48:49 -04:00
Thong Thai
8d03a6b700 radeonsi: add JPEG decode support for VCN 2.0 devices
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2019-08-29 17:27:35 -04:00
Thong Thai
2a3a560407 Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu"
This reverts commit 5a2e65be89.

Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL
still needs to be emitted by the UMD, or else the driver will hang

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-29 17:27:15 -04:00
Ian Romanick
9ad4a2eac5 nir/range-analysis: Add a lot more assertions about the contents of tables
v2: Update several of the comments.  Drop some redundant uses of
ASSERT_UNION_OF_OTHERS_MATCHES_UNKNOWN_*_SOURCE source.  Suggested by
Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-29 13:15:53 -07:00
Ian Romanick
636da12433 nir/range-analysis: Range tracking for fpow
One shader from Metro Last Light and the rest from Rochard.  In the
Rochard cases, something like:

    min(1.0, max(pow(saturate(x), y), z))

was transformed to

    saturate(max(pow(saturate(x), y), z))

because the result of the pow must be >= 0.

The Metro Last Light case was similar.  An instance of

    min(pow(abs(x), y), 1.0)

became

    saturate(pow(abs(x), y))

v2: Fix some comments.  Suggested by Caio.

v3: Fix setting is_intgral when the exponent might be negative.  See
also Mesa MR !1778.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280670 -> 16280659 (<.01%)
instructions in affected programs: 1130 -> 1119 (-0.97%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.72% max: 1.43% x̄: 1.03% x̃: 0.97%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.19% -0.86%
Instructions are helped.

total cycles in shared programs: 367168430 -> 367168270 (<.01%)
cycles in affected programs: 10281 -> 10121 (-1.56%)
helped: 10
HURT: 1
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 1.31% max: 2.43% x̄: 1.79% x̃: 1.70%
HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel)   min: 3.10% max: 3.10% x̄: 3.10% x̃: 3.10%
95% mean confidence interval for cycles value: -20.06 -9.04
95% mean confidence interval for cycles %-change: -2.36% -0.32%
Cycles are helped.
2019-08-29 13:15:53 -07:00
Ian Romanick
7dba7df5e5 nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel
I discovered this while looking at a shader that was hurt by some other
work I'm doing.  When I examined the changes, I was confused that one
instance of a comparison that was used in a discard_if was (incorrectly)
eliminated, while another instance used by a bcsel was (correctly) not
eliminated.  I had to use NIR_PRINT=true to see exactly where things
when wrong.

A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and
Strike Suit Zero were impacted.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280659 -> 16281075 (<.01%)
instructions in affected programs: 21042 -> 21458 (1.98%)
helped: 0
HURT: 136
HURT stats (abs)   min: 1 max: 9 x̄: 3.06 x̃: 3
HURT stats (rel)   min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03%
95% mean confidence interval for instructions value: 2.93 3.19
95% mean confidence interval for instructions %-change: 2.08% 2.37%
Instructions are HURT.

total cycles in shared programs: 367168270 -> 367170313 (<.01%)
cycles in affected programs: 172020 -> 174063 (1.19%)
helped: 14
HURT: 111
helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9
helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79%
HURT stats (abs)   min: 2 max: 584 x̄: 21.08 x̃: 5
HURT stats (rel)   min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40%
95% mean confidence interval for cycles value: 5.41 27.28
95% mean confidence interval for cycles %-change: 0.64% 1.81%
Cycles are HURT.
2019-08-29 13:15:53 -07:00
Ian Romanick
0b4782fccd nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)
Found by inspection.  I tried really, really hard to make a test case
that would trigger this problem, but I was unsuccesful.  It's very hard
to get an instruction to produce a ne_zero result without ne_zero
sources.  The most plausible way is using bcsel.  That proves
problematic because bcsel interprets its sources as integers, so it
cannot currently be used to "clean" values for floating point
instructions.

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
2019-08-29 13:15:53 -07:00
Ian Romanick
ef2e235252 nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-fma-compare-zero.shader_test
    - fs-underflow-mul-compare-zero.shader_test

v2: Add back part of comment accidentally deleted.  Noticed by
Caio. Remove is_not_zero function as it is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: fa116ce357 ("nir/range-analysis: Range tracking for ffma and flrp")
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms** had similar results. (Ice Lake shown)
total instructions in shared programs: 16278465 -> 16279492 (<.01%)
instructions in affected programs: 16765 -> 17792 (6.13%)
helped: 0
HURT: 23
HURT stats (abs)   min: 7 max: 275 x̄: 44.65 x̃: 8
HURT stats (rel)   min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62%
95% mean confidence interval for instructions value: 9.57 79.74
95% mean confidence interval for instructions %-change: 1.85% 6.61%
Instructions are HURT.

total cycles in shared programs: 367135159 -> 367154270 (<.01%)
cycles in affected programs: 279306 -> 298417 (6.84%)
helped: 0
HURT: 23
HURT stats (abs)   min: 13 max: 6029 x̄: 830.91 x̃: 54
HURT stats (rel)   min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49%
95% mean confidence interval for cycles value: 100.89 1560.94
95% mean confidence interval for cycles %-change: 0.94% 13.71%
Cycles are HURT.

total spills in shared programs: 8870 -> 8869 (-0.01%)
spills in affected programs: 19 -> 18 (-5.26%)
helped: 1
HURT: 0

total fills in shared programs: 21904 -> 21901 (-0.01%)
fills in affected programs: 81 -> 78 (-3.70%)
helped: 1
HURT: 0

LOST:   0
GAINED: 1

** On Broadwell, a shader was hurt for spills / fills instead of
   helped.

No changes on any earlier platforms.
2019-08-29 13:15:53 -07:00
Ian Romanick
33ad2bab4b nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-exp2-compare-zero.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

Most of the shaders affected are, unsurprisingly, in Unigine Heaven.

All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16278207 -> 16278465 (<.01%)
instructions in affected programs: 11374 -> 11632 (2.27%)
helped: 0
HURT: 58
HURT stats (abs)   min: 2 max: 13 x̄: 4.45 x̃: 4
HURT stats (rel)   min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82%
95% mean confidence interval for instructions value: 3.77 5.13
95% mean confidence interval for instructions %-change: 2.19% 2.64%
Instructions are HURT.

total cycles in shared programs: 367134284 -> 367135159 (<.01%)
cycles in affected programs: 81207 -> 82082 (1.08%)
helped: 17
HURT: 36
helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6
helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78%
HURT stats (abs)   min: 4 max: 235 x̄: 66.97 x̃: 16
HURT stats (rel)   min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09%
95% mean confidence interval for cycles value: -20.36 53.38
95% mean confidence interval for cycles %-change: -1.08% 4.67%
Inconclusive result (value mean confidence interval includes 0).

No changes on any earlier platforms.
2019-08-29 13:15:53 -07:00
Ian Romanick
e07248d2a8 nir/algebraic: Clean up value range analysis-based optimizations
Fix the a / b ordering in some compares.  Delete duplicate patterns.
Add a table explaining things.  While I was cleaning this up, I managed
to confuse myself.  The table helped sort that out.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-29 13:15:52 -07:00
Ian Romanick
ccb236d1bc nir/algebraic: Mark some value range analysis-based optimizations imprecise
This didn't fix bug #111308, but it was found will trying to find the
actual cause of that bug.

Fixes piglit tests (new in piglit!110):

    - fs-fract-of-NaN.shader_test
    - fs-lt-nan-tautology.shader_test
    - fs-ge-nan-tautology.shader_test

No shader-db changes on any Intel platform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: b77070e293 ("nir/algebraic: Use value range analysis to eliminate tautological compares")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-29 13:15:52 -07:00
Kenneth Graunke
30b9ed92ea iris: Fix partial fast clear checks to account for miplevel.
We enabled fast clears at level > 0, but didn't minify the dimensions
when comparing the box size, so we always thought it was a partial
clear and as a result never actually enabled any.

This eliminates some slow clears in Civilization VI, but they are mostly
during initialization and not the main rendering.

Thanks to Dan Walsh for noticing we had too many slow clears.

Fixes: 393f659ed8 ("iris: Enable fast clears on other miplevels and layers than 0.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-29 11:27:16 -07:00
Rohan Garg
394192fcee panfrost: Remove unused argument from panfrost_drm_submit_vs_fs_job()
is_scanout is not used anywhere and can be inferred within
panfrost_drm_submit_vs_fs_job() if required.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-29 19:03:17 +02:00
Kenneth Graunke
fda9fb8dcd iris: Actually describe bo_reuse driconf option
Otherwise it doesn't exist and can't be parsed, so everything dies at
screen init time.

Fixes: 6dc4ddc5f8 ("iris: use driconf for 'bo_reuse' parameter")
2019-08-29 09:40:34 -07:00
Tomeu Vizoso
aace7d3500 panfrost/ci: Print only regressions
Some functionality has been added to deqp-volt to only print
regressions, so update our version of it and use the new options.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-29 17:12:04 +02:00
Roland Scheidegger
332b21db55 gallivm: use fallback code for mul_hi with llvm >= 7.0
LLVM 7.0 ditched the pmulu intrinsics.
This is only a trivial patch to use the fallback code instead.
It'll likely produce atrocious code since the pattern doesn't match what
llvm itself uses in its autoupgrade paths, hence the pattern won't be
recognized.

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=111496

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-29 16:55:49 +02:00
Samuel Pitoiset
b650ecfe31 radv/gfx10: compute the LDS size for exporting PrimID for VS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-29 16:08:37 +02:00
Jan Zielinski
e64091ebd4 swr/rasterizer: Enable ARB_fragment_layer_viewport
Added loading gl_Layer and gl_ViewportIndex variables
to Pixel Shader context.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-29 12:09:05 +02:00
Tapani Pälli
6dc4ddc5f8 iris: use driconf for 'bo_reuse' parameter
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-29 09:33:52 +03:00
Tapani Pälli
b65de51dcf i965: initialize bo_reuse when creating brw_bufmgr
Fixes a possible data race spotted while debugging on other EGL
related failures where glFinish and eglCreateContext are going on at
the same time:

  ==11558== Possible data race during read of size 1 at 0x5E78CD0 by thread #23
  ==11558== Locks held: 1, at address 0x5E77CA8
  ==11558==    at 0x61B71D4: bo_alloc_internal (brw_bufmgr.c:639)
  ==11558==    by 0x61B7328: brw_bo_alloc (brw_bufmgr.c:669)
  ==11558==    by 0x61EF975: recreate_growing_buffer (intel_batchbuffer.c:231)
  ==11558==    by 0x61EFAAE: intel_batchbuffer_reset (intel_batchbuffer.c:255)
  ==11558==    by 0x61EFB85: intel_batchbuffer_reset_and_clear_render_cache (intel_batchbuffer.c:280)
  ==11558==    by 0x61F0507: brw_new_batch (intel_batchbuffer.c:551)
  ==11558==    by 0x61F12C1: _intel_batchbuffer_flush_fence (intel_batchbuffer.c:888)
  ==11558==    by 0x61BDD6B: intel_glFlush (brw_context.c:296)
  ==11558==    by 0x61BDDB9: intel_finish (brw_context.c:307)
  ==11558==    by 0x623831B: _mesa_Finish (context.c:1906)
  ==11558==    by 0x46D556: deqp::egl::GLES2ThreadTest::Operation::execute(tcu::ThreadUtil::Thread&)
  ==11558==    by 0x721502: tcu::ThreadUtil::Thread::run()
  ==11558==
  ==11558== This conflicts with a previous write of size 1 by thread #26
  ==11558== Locks held: 1, at address 0x5D09878
  ==11558==    at 0x61B98A9: brw_bufmgr_enable_reuse (brw_bufmgr.c:1541)
  ==11558==    by 0x61BF09D: brw_process_driconf_options (brw_context.c:854)
  ==11558==    by 0x61BF6CA: brwCreateContext (brw_context.c:993)
  ==11558==    by 0x621181F: driCreateContextAttribs (dri_util.c:473)
  ==11558==    by 0x53FE87B: dri2_create_context (egl_dri2.c:1388)
  ==11558==    by 0x53EE7BE: eglCreateContext (eglapi.c:807)
  ==11558==    by 0x5C8AB9: eglw::FuncPtrLibrary::createContext(void*, void*, void*, int const*) const
  ==11558==    by 0x46E027: deqp::egl::GLES2ThreadTest::CreateContext::exec(tcu::ThreadUtil::Thread&)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-29 09:33:13 +03:00
Kenneth Graunke
90ca709f6d iris: Don't auto-flush/dirty on transfer unmap for coherent buffers
When u_upload_mgr fills up a buffer, it unmaps and destroys it.  Our
unmap function was automatically performing the equivalent of a
FlushMappedBufferRange call in this case.  Because the buffer mapping
is persistent and coherent, we don't actually do any flushing when we
do the rest of the writes to the buffer - we were just doing one final
one at the end.  But we would be using the uploaded contents on the
GPU the whole time.

This certainly shouldn't be necessary for streaming buffers, and if
such flushing and dirtying is necessary for coherent buffers, this is
wildly insufficient.

Drops a small number of constant packets and PIPE_CONTROL flushes from
most benchmarks that I've looked at.  Doesn't seem to make much of an
impact on performance, however.

Thanks to Felix Degrood for noticing that we were emitting more
3DSTATE_CONSTANT_* packets than we needed to.
2019-08-28 22:11:05 -07:00
Timur Kristóf
5f3eb6ef29 st/nine: Properly initialize GLSL types for NIR shaders.
NIR shaders use GLSL types (note: these live outside libglsl), and
nine needs to properly initialize these just like the other state
trackers. This fixes an assertion failure when TTN is used.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
2019-08-28 23:31:34 +00:00
Rob Clark
6167a63839 freedreno/ir3: do better job of marking convergence points
Fixes:
dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_vertex
dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_fragment

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-28 15:25:27 -07:00
Rob Clark
6af70aa2b4 freedreno/ir3: maintain predecessors/successors
While resolving jumps to skip intermediate jumps from the structured
CFG, maintain the successors and predecessors correctly.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-28 15:25:25 -07:00
Rob Clark
06bc4875ff freedreno/ir3: convert block->predecessors to set
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-28 15:25:19 -07:00
Jordan Justen
cfbde3282d pci_id_driver_map: Support preferring iris over i965
This adds the ability for intel devices that:

 * Only load on i965
 * Only load on iris
 * First attempt i965, and try iris next
 * First attempt iris, and try i965 next

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:34 -07:00
Jordan Justen
107c22945f i965: Exit with error if gen12+ is detected
For OpenGL support on gen12, the iris driver should be used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:34 -07:00
Tapani Pälli
d8dd9a245e anv: build libanv for gen12 in android build
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:34 -07:00
Jordan Justen
181be14d43 anv: Build for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:34 -07:00
Tapani Pälli
da603c066e iris: build android libmesa_iris for gen12
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:34 -07:00
Jordan Justen
44ab7c265f iris: Build for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:33 -07:00
Jordan Justen
4d2e390a65 intel/l3: Don't assert on gen12 (use gen11 config temporarily)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Jordan Justen
bdeb498070 intel/compiler: Disable compaction on gen12 for now
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:33 -07:00
Tapani Pälli
d7a1140c45 intel/isl: build android libmesa_isl for gen12
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Jordan Justen
6d63fd8a69 intel/isl: Build gen12 using gen11 code paths
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Tapani Pälli
7319003a74 intel/genxml: generate pack files for gen12 on android builds
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Jordan Justen
b42a05b436 intel/genxml: Build gen12 genxml
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Jordan Justen
531563b64b intel/genxml: Add gen12.xml as a copy of gen11.xml
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:32 -07:00
Jordan Justen
2323536ee7 intel/genxml: Run sort_xml.sh to tidy gen9.xml and gen11.xml
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:32 -07:00
Jordan Justen
70566a87eb intel/genxml/gen11: Add spaces in EnableUnormPathInColorPipe
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:32 -07:00
Jordan Justen
acce7d3460 intel/genxml: Handle field names with different spacing/hyphen
If a field name differs slightly between two generations then this
change will still add the fields into the same group.

For example, these will be treated as equal:
* "Software Exception" and "Software  Exception"
* "Per Thread" and "Per-Thread"

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:28 -07:00
Eric Anholt
973b49386c freedreno/a6xx: Fix non-mipmap filtering selection.
We were clamping the LOD to force non-mipmap filtering, but that means
that the HW doesn't get to select between the min and mag filters.
Setting MIPFILTER_LINEAR_FAR appears to force non-mipmap filtering.

Fixes all failures in dEQP-GLES2.functional.texture.filtering.2d.*

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-28 13:14:41 -07:00
Ian Romanick
b418269d7d intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware
See the previous commit for the explanation of the Fixes tag.

Hurts 21 shaders in shader-db.  All of the hurt shaders are in Unreal
Engine 4 tech demos.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
2019-08-28 11:39:29 -07:00
Ian Romanick
d3fd1c761a nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled
This caused a problem on Sandybridge where an open-coded
bitfieldReverse() function could be optimized to a
nir_op_bitfield_reverse that would generate an unsupported BFREV
instruction in the backend.  This was encountered in some Unreal4 tech
demos in shader-db.  The bug was not previously noticed because we don't
actually try to run those demos on Sandybridge.

The fixes tag is a bit a lie.  The actual bug was introduced about
26,000 commits earlier in 371c4b3c48 ("nir: Recognize open-coded
bitfield_reverse.").  Without the NIR lowering pass, the flag needed to
avoid the optimization does not exist.  Hopefully nobody will care to
fix this on an earlier Mesa release.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
2019-08-28 11:38:51 -07:00
Eric Anholt
4662b70d23 gallium: Don't emit identical endian-dependent pack/unpack code.
Reduces the size of the u_format_table.c file by 140k (out of 1.64M)
and makes me less confused about endianness in gallium.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
d17ff2f7f1 gallium: Fix big-endian addressing of non-bitmask array formats.
The formats affected are:

- LA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)
- R8G8B8 x (UNORM, SNORM, SRGB, USCALED, SSCALED, UINT, SINT)
- RG/RGB/RGBA x (64_FLOAT, 32_FLOAT, 16_FLOAT, 32_UNORM, 32_SNORM,
                 32_USCALED, 32_SSCALED, 32_FIXED, 32_UINT, 32_SINT)
- RGB/RGBA x (16_UNORM, 16_SNORM, 16_USCALED, 16_SSCALED,
              16_UINT, 16_SINT)
- RGBx16 x (UNORM, SNORM, FLOAT, UINT, SINT)
- RGBx32 x (FLOAT, UINT, SINT)
- RA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)

The updated st_formats.c unit test checks that the formats affected by
this change are all array formats in the equivalent Mesa format (if
any).  Mesa's array format definition is clear: the value stored is an
array (increasing memory address) of values of the channel's type.
It's also the only thing that makes sense for the RGB types, or very
large types like RGBA64_FLOAT (A should not move to the low address
because the cpu is BE).

Acked-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Tested-by: Matt Turner <mattst88@gmail.com> (unit tests on BE)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
0547fdd7ee gallium: Drop a bit of dead code from the pack/unpack python.
Nothing used this var.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
309ef968cd gallium: Drop the useless union wrapper on pack/unpack.
Nothing accessed the .value field, just the .chan.  Unwrap all the
code from the union, for clarity (and 13k less generated code).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
174240c5e4 gallium: Skip generating the pack/unpack union if we don't use it.
Shaves 30k off of the 1.6M .c file, and makes for less noise for me
trying to understand how gallium formats actually work.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
7c8cdee0b2 gallium: Fix mesa format name in unit test failure path.
We clearly wanted the mesa format here.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Boris Brezillon
8709b865ce panfrost: Reset the damage area on imported resources
Reset the damage area in the resource_from_handle() path (as done in
panfrost_resource_create()).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-28 17:50:44 +02:00
Boris Brezillon
938c5b0148 panfrost: Use ralloc() to allocate instructions to avoid leaking those objs
Instructions attached to blocks are never explicitly freed. Let's
use ralloc() to attach those objects to the compiler context so that
they are automatically freed when the ctx object is freed.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-28 17:50:01 +02:00
Jose Fonseca
6e01575b68 scons: Make GCC builds stricter.
Uses some of the same -Werror options used by Meson, as suggested by
Michel Dänzer.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-28 15:52:07 +01:00
Jose Fonseca
6b2bc8f25e util: Prevent strcasecmp macro redefinion.
MinGW headers already define it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-28 15:52:07 +01:00
Jose Fonseca
46f7b3662f util: Prevent implicit declaration of function getenv.
With MinGW cross compilation.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-28 15:52:07 +01:00
Jose Fonseca
7029556398 glx: Fix incompatible function pointer types.
I don't know how Meson didn't hit this issue, when it too already uses
-Werror=incompatible-pointer-types

Fixes: 3dd299c3d5 ("glx: Sync <GL/glxext.h> with Khronos")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-28 15:52:07 +01:00
Vasily Khoruzhick
200859f45c lima: fix texture descriptor issues
Looks like initial RE was wrong and some fields have different purpose.
I.e. there's no "disable_mipmap" field, it's actually part of another field
that selects mipmap filtering.

Also fix layout position.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-28 00:28:38 +00:00
Kenneth Graunke
7e095a4fbf iris: Drop swizzling parameter from s8_offset.
This is always false on Gen8+, no need for dead code and parameters.
2019-08-27 17:11:32 -07:00
Kenneth Graunke
e18cd5452a mesa: Fix _mesa_float_to_unorm() on 32-bit systems.
This fixes the following CTS test on 32-bit systems:
GTF-GL46.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_init

It does glGetTexImage of a 16-bit SNORM image, requesting 32-bit UNORM
data.  In get_tex_rgba_uncompressed, we round trip through float to
handle image transfer ops for clamping.  _mesa_format_convert does:

   _mesa_float_to_unorm(0.571428597f, 32)

which translated to:

   _mesa_lroundevenf(0.571428597f * 0xffffffffu)

which produced different results on 64-bit and 32-bit systems:

   64-bit: result = 0x92492500
   32-bit: result = 0x80000000

This is because the size of "long" varies between the two systems, and
0x92492500 is too large to fit in a signed 32-bit integer.  To fix this,
we switch to the new _mesa_i64roundevenf function which always does the
64-bit operation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104395
Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-27 23:57:02 +00:00
Kenneth Graunke
b59914e179 util: Add a _mesa_i64roundevenf() helper.
This always returns a int64_t, translating to _mesa_lroundevenf on
systems where long is 64-bit, and llrintf where "long long" is needed.

Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-27 23:57:02 +00:00
Adam Jackson
163fc11f27 glx: Unset the direct_support bit for GLX_EXT_import_context
GLX_EXT_import_context operates only on indirect contexts, a direct
context cannot possibly support it. Without this change the extension
will appear in the combined GLX extension string even if it is missing
from the server string, indicating a lack of required server support.
2019-08-27 22:34:46 +00:00
Daniel Kolesa
1b9fce56c4 util: add auxv based PowerPC AltiVec/VSX detection
At least on Linux, we can use the ELF auxiliary vector to
detect the presence of AltiVec, VSX and other CPU features
without having to go through handling SIGILL, which has
various problems of its own.

A similar thing is already being done for ARM to detect NEON.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
2019-08-27 14:55:37 -07:00
Kenneth Graunke
23f42f8dcf intel/compiler: Use new Gen11 headerless RT writes for MRT cases
Gen11 adds support for specifying the render target index and src0
alpha present bits in the extended message descriptor.  Previously,
we had to use a message header for this, requiring extra instructions
to write the fields, and two registers of extra payload.

Improves performance on my ICL 8x8 frequency locked to 700Mhz, on iris:

   GfxBench5 Manhattan 3.0: 2.13635% +/- 0.159859% (n=5)
   GfxBench5 Aztec Ruins:   1.57173% +/- 0.128749% (n=5)
   Synmark2 OglDeferred:    2.86914% +/- 0.191211% (n=10)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-27 14:20:07 -07:00
Kenneth Graunke
0d96484165 intel/compiler: Use generic SEND for Gen7+ FB writes
This takes care of generate_fb_write/fire_fb_write/brw_fb_WRITE's stuff
earlier in the visitor.  It will also make it easier to generate SENDSC
messages with indirect extended descriptors in a few patches.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-27 14:20:07 -07:00
Kenneth Graunke
86a63b1098 intel/compiler: Refactor FB write message control setup into a helper.
This will be used by visitor code to convert directly to SEND in a bit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-27 14:20:07 -07:00
Kenneth Graunke
b6fe25c7f5 intel/compiler: Handle bits 15:12 in brw_send_indirect_split_message()
Annoyingly, these bits exist in some extended message descriptors
(in particular render target writes), but they don't have any
corresponding bits in the ISA encoding.  So we can't use an immediate
and have to fall back to an indirect extended descriptor.

Thanks to Jason Ekstrand for reminding me that you can still set these
bits via an indirect descriptor, even if they don't exist in the ISA.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-27 14:20:07 -07:00
Kenneth Graunke
c8c9c48684 intel/compiler: Fix src0/desc setter ordering
src0 vstride and type overlap with bits of the extended descriptor.
brw_set_desc() also sets the extended descriptor to 0.  So by setting
the descriptor, then setting src0, we were accidentally setting a bunch
of extended descriptor bits unintentionally.

When using this infrastructure for framebuffer writes (in a future
patch), this ended up setting the extended descriptor bit 20, which is
"Null Render Target" on Icelake, causing nothing to be written to the
framebuffer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-27 14:20:07 -07:00
Marek Olšák
360cf3c4b0 radeonsi: fix scratch buffer WAVESIZE setting leading to corruption
Cc: 19.2 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:52:32 -04:00
Marek Olšák
f95a28d361 radeonsi: unbind blend/DSA/rasterizer state correctly in delete functions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111414

Fixes: b758eed9c3 ("radeonsi: make sure that blend state != NULL and remove all NULL checking")

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:52:30 -04:00
Marek Olšák
40e5ac45ae radeonsi: align scratch and ring buffer allocations for faster memory access
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:52:28 -04:00
Marek Olšák
d8f27552f4 radeonsi: consolidate determining VGPR_COMP_CNT for API VS
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
4dde40908f radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags
We need two different values of the register, one for NGG and one for
legacy, in order to fix edge flags for the legacy pipeline.

Passing the ngg flag to emit_clip_regs would be too complicated,
so CONTEXT_REG_RMW is used for partial register updates.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
1426acf9e7 radeonsi/gfx10: remove incorrect ngg/pos_writes_edgeflag variables
It varies depending on si_shader_key::as_ngg.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
2e94cb6693 radeonsi: add PKT3_CONTEXT_REG_RMW
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
d9a453c747 winsys/amdgpu+radeon: process AMD_DEBUG in addition to R600_DEBUG
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
467df4b90a radeonsi/gfx10: add AMD_DEBUG=nongg
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
6229b5a058 radeonsi/gfx10: finish up Navi14, add PCI ID
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
73bde2b029 radeonsi/gfx10: always use the legacy pipeline for streamout
The best way to prevent GDS hangs is not to use GDS.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
f251fd7bf5 radeonsi/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0
Only gfx9 and older use it to get InstanceID in VGPR1.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
28f44ee533 radeonsi/gfx10: fix InstanceID for legacy VS+GS
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
e121d75de9 radeonsi/gfx10: add as_ngg variant for VS as ES to select Wave32/64
Legacy GS only works with Wave64.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
f34d023f1a radeonsi/gfx10: create the GS copy shader if using legacy streamout
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
776f05a307 radeonsi/gfx10: fix the PRIMITIVES_GENERATED query if using legacy streamout
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
cab5b3861d radeonsi/gfx10: fix tessellation for the legacy pipeline
ported from PAL

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
a9bb566955 radeonsi: move some global shader cache flags to per-binary flags
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
810846e157 radeonsi/gfx10: fix the legacy pipeline by storing as_ngg in the shader cache
It could load an NGG shader when we want a legacy shader and vice versa.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Kenneth Graunke
6342d43ae9 iris: Delete dead prototype 2019-08-27 13:15:02 -07:00
Boris Brezillon
2734a4951e Revert "panfrost: Free all block/instruction objects before leaving midgard_compile_shader_nir()"
This reverts commit 5882e0def9.

This commit causes a segfault.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-27 20:07:28 +02:00
Boris Brezillon
0142dcb990 panfrost: Make sure bundle.instructions[] contains valid instructions
Add an assert() in schedule_bundle() to make sure all instruction
pointers in bundle.instructions[] are valid.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-27 16:50:52 +02:00
Boris Brezillon
5882e0def9 panfrost: Free all block/instruction objects before leaving midgard_compile_shader_nir()
Right now we're leaking all block and instruction objects allocated by
the compiler. Let's clean things up before leaving
midgard_compile_shader_nir().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-27 16:50:52 +02:00
Boris Brezillon
3ac49f135a panfrost: Free the instruction object in mir_remove_instruction()
To avoid memory leaks.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-27 16:50:52 +02:00
Eric Engestrom
239f7f1c0a scons: add support for MAJOR_IN_{MKDEV,SYSMACROS}
src/gallium/winsys/svga/drm/vmw_screen.c: In function ‘vmw_dev_compare’:
src/gallium/winsys/svga/drm/vmw_screen.c:48:12: warning: implicit declaration of function ‘major’ [-Wimplicit-function-declaration]
   48 |    return (major(*(dev_t *)key1) == major(*(dev_t *)key2) &&
      |            ^~~~~
src/gallium/winsys/svga/drm/vmw_screen.c:49:12: warning: implicit declaration of function ‘minor’ [-Wimplicit-function-declaration]
   49 |            minor(*(dev_t *)key1) == minor(*(dev_t *)key2)) ? 0 : 1;
      |            ^~~~~

That file (and many others) already has the proper #include with their
respective guards, but scons wasn't defining them, resulting in implicit
functions being used instead (and an always-true check that's probably
breaking something down the line).

Note that I'm cheating a bit here because Scons doesn't seem to have
a clean way to detect the existence of major() et al. as functions or
macros, so I'm taking the shortcut of just detecting the presence of the
header and assuming its contents is what we expect.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
2019-08-27 14:03:46 +01:00
Samuel Pitoiset
49f5ddd3ae radv: make use of has_ls_vgpr_init_bug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-27 08:04:51 +02:00
Samuel Pitoiset
fd54fc85aa ac: add has_ls_vgpr_init_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:47 +02:00
Samuel Pitoiset
1bf2572dff ac: add has_msaa_sample_loc_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:44 +02:00
Samuel Pitoiset
021feb1bf6 ac: add rbplus_allowed to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:41 +02:00
Samuel Pitoiset
20c5db02b5 ac: add has_tc_compat_zrange_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:36 +02:00
Samuel Pitoiset
b55919cf2a ac: add has_gfx9_scissor_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:32 +02:00
Samuel Pitoiset
2b9c371575 ac: add cpdma_prefetch_writes_memory to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:29 +02:00
Samuel Pitoiset
b027ad66d7 ac: add has_out_of_order_rast to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:26 +02:00
Samuel Pitoiset
ed720af46d ac: add has_load_ctx_reg_pkt to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:22 +02:00
Samuel Pitoiset
63c0b89b8f ac: add has_rbplus to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:19 +02:00
Samuel Pitoiset
44a46c09de ac: add has_dcc_constant_encode to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:16 +02:00
Samuel Pitoiset
c08401f035 ac: add has_distributed_tess to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:11 +02:00
Samuel Pitoiset
d62d2840c4 ac: add has_clear_state to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:05 +02:00
Samuel Pitoiset
af65f9431e ac: drop llvm8 from some load/store helpers
Cleanup.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:00 +02:00
Dave Airlie
e6eb444554 gallivm: fix appveyor build after images changes 2019-08-27 13:36:03 +10:00
Dave Airlie
c501c2cef6 docs: add shader image extensions for llvmpipe
v1.1: fix typo in llvmpipe name (ajax)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:30:25 +10:00
Dave Airlie
b7468f7831 llvmpipe: enable ARB_shader_image_load_store
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:30:22 +10:00
Dave Airlie
6c2fa01b9c llvmpipe: flush on api memorybarrier.
Until we have somewhere we can do better, just hit it with a hammer.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:30:16 +10:00
Dave Airlie
b9bf236c71 gallivm: add memory barrier support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:30:13 +10:00
Dave Airlie
abfb633968 gallivm: add support for fences api on older llvm
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:30:10 +10:00
Dave Airlie
8b7295f281 llvmpipe: bind vertex/geometry shader images
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:30:06 +10:00
Dave Airlie
2909c654b0 llvmpipe: add fragment shader image support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:30:04 +10:00
Dave Airlie
dc2357070c draw: add vs/gs images support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:30:01 +10:00
Dave Airlie
ceb8d0ac5a gallivm: add image load/store/atomic support
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:58 +10:00
Dave Airlie
15f7688ac9 gallivm/tgsi: add image interface to tgsi builder
This adds the callbacks for the driver/gallium binding for
image operations.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:55 +10:00
Dave Airlie
b2be174be2 llvmpipe: introduce image jit type to fragment shader jit.
This adds the image type to the fragment shader jit context

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:51 +10:00
Dave Airlie
039a2e3630 draw: add jit image type for vs/gs images.
This introduces the jit image type into the jit interface
for vertex/geom shaders

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:49 +10:00
Dave Airlie
3c2c232059 llvmpipe: move the fragment shader variant key to dynamic length.
This mirrors the vs/gs keys, and will be needed when adding images
support.

The const changes also mirror how the draw code work (as is needed
when we add images)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:42 +10:00
Dave Airlie
d0381ea149 gallivm: add a basic image limit
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:39 +10:00
Dave Airlie
cf84b46a1c llvmpipe: handle early test property.
Also handle setting late for shaders that use stores

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:33 +10:00
Dave Airlie
a1e8fcef47 gallivm: move first/last level jit texture members.
This lets us create an image structure with the same basic
types as the texture one.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:31 +10:00
Dave Airlie
e8a445d8b5 gallivm: handle helper invocation (v2)
Just invert the exec_mask to get if this is a helper or not.

v2: get the bld mask (Roland)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:28 +10:00
Dave Airlie
fb34369eb5 gallivm: make lp_build_float_to_r11g11b10 take a const src
This allows using it with a const src later.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:25 +10:00
Dave Airlie
a8ef6b5755 llvmpipe: refactor jit type creation
This just cleans the code up so the texture/sampler type
creation can be reused.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:29:21 +10:00
Dave Airlie
1eda49cc3d gallivm: fix atomic compare-and-swap
Not sure how I missed this before, but compswap was hitting an
assert here as it is it's own special case.

Fixes: b5ac381d8f ("gallivm: add buffer operations to the tgsi->llvm conversion.")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-27 12:28:17 +10:00
Paulo Zanoni
848d5e444a intel/fs: grab fail_msg from v32 instead of v16 when v32->run_cs fails
Looks like a copy/paste error. This patch prevents a segfault when
running the following on BDW:

    INTEL_DEBUG=no8,no16,do32 ./deqp-vk -n \
        dEQP-VK.subgroups.arithmetic.compute.subgroupmin_dvec4

For the curious, the message we're getting is:

    CS compile failed: Failure to register allocate.  Reduce number
    of live scalar values to avoid this.

Fixes: 864737ce6c ("i965/fs: Build 32-wide compute shader when needed.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-08-26 14:54:16 -07:00
Alyssa Rosenzweig
c30116a2fa pan/midgard: Fix invert fusing with r26
The invert wasn't applying (correctly) due to the issues addressed here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 13:43:04 -07:00
Alyssa Rosenzweig
75b6be2435 pan/midgard: Fold ssa_args into midgard_instruction
This is just a bit of refactoring to simplify MIR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 13:43:04 -07:00
Eric Anholt
0309fb82ec gallium: Add the ASTC 3D formats.
No driver implements them yet, but this is a long way toward gallium
having matching format enums for Mesa formats.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-26 19:44:00 +00:00
Eric Anholt
9d988f9291 gallium: Add block depth to the format utils.
I decided not to update nblocks() with a depth arg as the callers
wouldn't be doing ASTC 3D.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-26 19:44:00 +00:00
Eric Anholt
530f424735 gallium: Add a block depth field to the u_formats table.
To add ASTC 3D compression formats, we need to be able to express the
block depth.  While I'm touching every line, line up the columns of
the CSV again as they've drifted over time.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-26 19:44:00 +00:00
Alyssa Rosenzweig
9c328ea66e pan/midgard: Add imov->fmov optimization
When moving constants, if switching to a floating-point representation
doesn't break anything, we'd rather have an fmov than an imov,
permitting inlining the constant in many circumstances.

total quadwords in shared programs: 3408 -> 3366 (-1.23%)
quadwords in affected programs: 1188 -> 1146 (-3.54%)
helped: 41
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 9.65% x̃: 11.11%
95% mean confidence interval for quadwords value: -1.07 -0.98
95% mean confidence interval for quadwords %-change: -11.38% -7.93%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 11:42:33 -07:00
Alyssa Rosenzweig
0acb5c1774 pan/midgard: Switch constants to uint32
Storing constants as float doesn't make sense when we have integer
instructions; better to switch to be integer natively and coerce to/from
float rather than the opposite.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 11:42:32 -07:00
Kenneth Graunke
2e1be771e4 isl: Don't set UnormPathInColorPipe for integer surfaces.
This fixes dEQP-GLES3.functional.texture.specification subtests on iris:

- texsubimage3d_depth.depth24_stencil8_2d_array
- texsubimage3d_depth.depth32f_stencil8_2d_array
- texsubimage3d_depth.depth_component32f_2d_array
- texsubimage3d_depth.depth_component24_2d_array
- texstorage2d.format.depth24_stencil8_2d
- texstorage2d.format.depth32f_stencil8_2d
- texstorage2d.format.depth_component24_2d
- texstorage2d.format.depth_component32f_2d
- texstorage3d.format.depth24_stencil8_2d_array
- texstorage3d.format.depth32f_stencil8_2d_array
- texstorage3d.format.depth_component24_2d_array
- texstorage3d.format.depth_component32f_2d_array

Here, something appears to be going wrong with having this bit set
during blorp_copy operations for texture upload, which override the
format to R8G8B8A8_UINT.

AFAICT this bit should have no effect for integer surfaces, as it has
to do with blending, and integer blending is not a thing.  So it should
be harmless to disable it.

The Windows driver appears to be setting this bit universally, so
I am unclear why we would need to.  Perhaps they simply haven't run
into this issue.

Fixes: f741de236b ("isl: Enable Unorm Path in Color Pipe")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-26 16:54:20 +00:00
Kenneth Graunke
1b090f065e isl: Drop UnormPathInColorPipe for buffer surfaces.
Jason suggested I remove this in review, and he's right.  AFAICT this
affects blending, and that just isn't going to happen on buffers.

Fixes: f741de236b ("isl: Enable Unorm Path in Color Pipe")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-26 16:54:20 +00:00
Alyssa Rosenzweig
85cc78a624 pan/midgard, bifrost: Set lower_fdph = true
fdph instructions show up in some desktop GL shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 07:47:01 -07:00
Samuel Pitoiset
218ce34962 radv: add mipmap support for the clear depth/stencil values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:59 +02:00
Samuel Pitoiset
e36e260c42 radv: add mipmap support for the TC-compat zrange bug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:55 +02:00
Samuel Pitoiset
9db0dc6b8e radv: allocate metadata space for mipmapped depth/stencil images
For each mipmaps, the driver will store the clear values (8-bytes)
and the TC-compat zrange value (4-bytes).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:51 +02:00
Samuel Pitoiset
76812339f7 radv: decompress mipmapped depth/stencil images during transitions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:48 +02:00
Samuel Pitoiset
81c6473b7f radv: add mipmaps support for decompress/resummarize
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:45 +02:00
Samuel Pitoiset
18ccde4d68 radv: add radv_process_depth_image_layer() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:42 +02:00
Connor Abbott
b7acf38073 ac/nir: Remove gfx9_stride_size_workaround_for_atomic
The workaround was entirely in common code, and it's needed in radeonsi
too so just always do it when necessary. Fixes
KHR-GL45.shader_image_load_store.advanced-allStages-oneImage on gfx9
with LLVM 8.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-26 11:00:49 +02:00
Connor Abbott
4849276ea8 ac/nir: add a workaround for viewing a slice of 3D as a 2D image
GL and Vulkan allow you to bind a single layer of a 3D texture to a 2D
image, and we weren't implementing a workaround for that on gfx9 that
TGSI was. Copy it over.

Fixes KHR-GL45.shader_image_load_store.non-layered_binding with radeonsi
NIR.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 11:00:44 +02:00
Samuel Pitoiset
89671ef205 radv: fix getting the index type size for uint8_t
16-bit and 32-bit values match hardware values but 8-bit doesn't.

This fixes dEQP-VK.pipeline.input_assembly.* with 8-bit index.

Fixes: 372c3dcfdb ("radv: implement VK_EXT_index_type_uint8")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
2019-08-26 09:23:23 +02:00
Dave Airlie
bba4d2f442 virgl: fix format conversion for recent gallium changes.
The virgl formats are fixed in time snapshots of the gallium ones,
we just need to provide a translation table between them when
we enter the hardware.

This fixes a regression since Eric renumbered the gallium table.

Fixes: c45c33a5a2 (gallium: Remove manual defining of PIPE_FORMAT enum values.)
Bugzilla: https://bugs.freedesktop.org/111454

v1 by Dave Airlie <airlied@redhat.com>
v2: virgl: Add a number of formats to the table that are used, e.g. for vertex
    attributes
v3: cover some more missing formats from a piglit run

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
2019-08-26 06:35:00 +00:00
Dave Airlie
035cd6cdf9 virgl: drop unused format field 2019-08-26 06:35:00 +00:00
Erico Nunes
4379dcc12d lima/ppir: enable vectorize optimization
pp has vector units and some operations can be optimized when bundled
together.
Benchmarking this with piglit shaders shows that the instruction count
can be greatly reduced on many examples with vectorize.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-25 18:29:12 +00:00
Erico Nunes
2a8a81d109 lima/ppir: lower selects to scalars
nir vec4 fcsel assumes that each component of the condition will be used
to select the same component from the options, but pp can't implement
that since it only has 1 component for the condition.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-25 18:29:12 +00:00
Erico Nunes
27e7603c34 lima: fix ppir spill stack allocation
The previous spill stack was fixed and too small, and caused instability
in programs requiring spilling for roughly more than one value.
This patch adds a dynamic calculation of the buffer size based on stack
utilization and switches it to a separate allocation at flush time that
will fit the shader that requires the largest buffer.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-25 20:08:59 +02:00
Jason Ekstrand
f58e0405b6 intel/fs: Drop the gl_program from fs_visitor
It's not used by anything anymore now that so much lowering has been
moved into NIR.  Sadly, we still need on in brw_compile_gs() for
geometry shaders on Sandy Bridge.  Short of a lot of pointless work,
that one's probably not going away.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-25 01:02:52 -05:00
Qiang Yu
5ff41b9fc5 lima: move format handling to unified place
Create a unified table to handle pipe format to texture
and render target format lookup.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
2019-08-25 11:52:29 +08:00
Alex Smith
fe0ec41c4d radv: Change memory type order for GPUs without dedicated VRAM
Put the uncached GTT type at a higher index than the visible VRAM type,
rather than having GTT first.

When we don't have dedicated VRAM, we don't have a non-visible VRAM
type, and the property flags for GTT and visible VRAM are identical.
According to the spec, for types with identical flags, we should give
the one with better performance a lower index.

Previously, apps which follow the spec guidance for choosing a memory
type would have picked the GTT type in preference to visible VRAM (all
Feral games will do this), and end up with lower performance.

On a Ryzen 5 2500U laptop (Raven Ridge), this improves average FPS in
the Rise of the Tomb Raider benchmark by up to ~30%. Tested a couple of
other (Feral) games and saw similar improvement on those as well.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(Bas: CCing this to 19.2-rc due to high impact and limited complexity)
2019-08-24 17:37:47 +02:00
Vasily Khoruzhick
681e99d11c lima/ppir: print register index and components number for spilled register
It can be useful for debugging purposes

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-24 08:17:31 -07:00
Vasily Khoruzhick
28d4b456a5 lima/ppir: add control flow support
This commit adds support for nir_jump_instr, if and loop
nir_cf_nodes.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-24 08:17:31 -07:00
Vasily Khoruzhick
1cdf585613 lima/ppir: add better liveness analysis
Add better liveness analysis that was modelled after one in vc4.
It uses live ranges and is aware of multiple blocks which is prerequisite
for adding CF support

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-24 08:17:31 -07:00
Vasily Khoruzhick
d30a98c896 lima/ppir: validate shader outputs
Mali4x0 supports only gl_FragColor. gl_FragDepth is not supported.
Check that we don't get anything but gl_FragColor in shader outputs.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-24 08:17:25 -07:00
Vasily Khoruzhick
8dd195e865 lima/ppir: turn store_color into ALU node
We don't have a special OP to store color in PP, all we need to do is to
store gl_FragColor into reg0, thus it's just a mov and therefore ALU node.

Yet we still need to indicate that it's store_color op so regalloc ignores
its destination.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:47 -07:00
Vasily Khoruzhick
7f814d2b46 lima/ppir: create ppir block for each corresponding NIR block
Create ppir block for each corresponding NIR block and populate
its successors. It will be used later in liveness analysis and
in CF support

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:47 -07:00
Vasily Khoruzhick
4e695489df lima/ppir: add dummy op
We can get following from NIR:

(1) r1 = r2
(2) r2 = ssa1

Note that r2 is read before it's assigned, so there's no node for
it in comp->var_nodes. We need to create a dummy node in this case
which sole purpose is to hold ppir_dest with reg in it.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:47 -07:00
Vasily Khoruzhick
d11e1b7909 lima/ppir: add write after read deps for registers
For cases like:

(1) r1 = r2
(2) r2 = ssa1

We need to add (1) as dependency of (2), otherwise scheduler may
reorder them.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:47 -07:00
Vasily Khoruzhick
cd8c569ced lima/ppir: fix ordering deps
There can be several root nodes, i.e.:

(1) r0 = r1
(2) r2 = r3
(3) branch if (ssa1)

We need to make (3) depend on (1) and (2), old code added
dependency only for (2), and (1) was kept as root node since there
is no branch/discard or store color between two movs.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:47 -07:00
Vasily Khoruzhick
bf2872eeb2 lima/ppir: set write mask for texture loads if dest is reg
Destination for texture load can be a reg, so we need to
set write mask in this case

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:47 -07:00
Vasily Khoruzhick
fd129817f0 lima/ppir: add support for unconditional branches and condition negation
We need 'negate' modifier for branch condition to minimize branching. Idea
is to generate following:

current_block: { ...; if (!statement) branch else_block; }
then_block: { ...; branch after_block; }
else_block: { ... }
after_block: { ... }

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:46 -07:00
Vasily Khoruzhick
e15af23b73 lima/ppir: clone ld_{uni,tex,var} into each block
ppir_lower_load() and ppir_lower_load_texture() assume that node
is in the same block as its successors, fix it by cloning each
ld_uni and ld_tex to every block.

It also reduces register pressure since values never cross block
boundaries and thus never appear in live_in or live_out of any block,
so do it for varyings as well.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:46 -07:00
Vasily Khoruzhick
172f2ad805 lima/ppir: refactor const lowering
Const nodes are now cloned for each user, i.e. const is guaranteed to have
exactly one successor, so we can use ppir_do_one_node_to_instr() and
drop insert_to_each_succ_instr()

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-23 18:19:46 -07:00
Rafael Antognolli
2b7ba9f239 anv: Only re-emit non-dynamic state that has changed.
On commit f6e7de41d7, we started emitting 3DSTATE_LINE_STIPPLE as part
of the non-dynamic state. That gets re-emitted every time we bind a new
VkPipeline. But that instruction is non-pipelined, and it caused a perf
regression of about 9-10% on Dota2.

This commit makes anv_dynamic_state_copy() return a mask with only the
state that has changed when copying it. 3DSTATE_LINE_STIPPLE won't be
emitted anymore unless it has changed, fixing the problem above.

v2: Improve commit message and add documentation about skipped checks
(Jason)

Fixes: f6e7de41d7 ("anv: Implement VK_EXT_line_rasterization")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-23 15:55:18 -07:00
Alyssa Rosenzweig
21a85fd7d8 pan/decode: Validate and quiet helper invocation flag
We can statically determine from the disassembly if helper invocations
will be needed, so we can validate the corresponding bit in the
cmdstream and thus avoid printing the bit itself in the decode.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-23 15:51:25 -07:00
Alyssa Rosenzweig
20ac0b8e4e pan/midgard: Analyze helper invocations
We check for texture ops which calculate derivatives (either explicitly
via dFd* or implicitly) and mark the shader as requiring helper
invocations.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-23 15:51:25 -07:00
Lionel Landwerlin
9d3fc737af util: fix compilation on macos
timespec_get() is not available on macos, we need to pull in the
include/c11/threads_posix.h helper.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103674
Fixes: e2d761de03 ("util: drop final reference to p_compiler.h")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-23 23:45:25 +03:00
Caio Marcelo de Oliveira Filho
bfac462d92 i965: Silence brw_blorp uninitialized warning
The variables level and start_layer are not initialized, then
initialized if we have a BUFFER_BIT_DEPTH set.  We assert on them
later using the same check.  This should be enough but GCC 9.1.1 is
not convinced, so let's initialize the variables.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho
5fac7c55f7 tgsi: Remove unused local
Code that used it was removed in 4ebe6b2e72 ("tgsi: Drop the SSE2
constants setup that's been dead code since 2011.")

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho
63f0259aeb iris: Guard GEN9-only function in Iris state to avoid warning
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho
412ed1338f intel/decoders: Avoid uninitialized variable warnings
Initialize `next_batch_addr` and `second_level`.  If the batch is well
formed, those values will be overriden, if not, they are as good as
uninitialized garbage.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho
0661480029 compiler/glsl: Fix warning about unused function
The helper check_node_type() is only used when DEBUG is set (in the
function below), but ASSERTED macro uses NDEBUG.  So just guard the
helper with #ifdef.  If we see more such cases we might consider a
ASSERTED-like macro for the DEBUG case.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho
eac8a3b9af anv: Drop unused local variable
Leftover from 021fa28163 ("xintel/nir: Add a helper for getting
BRW_AOP from an intrinsic").

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho
f7d90c67c7 intel/compiler: Silence maybe-uninitialized warning in GCC 9.1.1
Compiler can't see that d is initialized.

    ../src/intel/compiler/brw_vec4_nir.cpp: In function ‘int brw::try_immediate_source(const nir_alu_instr*, brw::src_reg*, bool, const gen_device_info*)’:
    ../src/intel/compiler/brw_vec4_nir.cpp:984:12: warning: ‘d’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      984 |          d = MAX2(-d, d);

Assert that we expect at least one component -- hence d going to be
set.  That by itself is not enough, so also zero initialize the
variable.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-23 13:25:27 -07:00
Andres Rodriguez
a410823b3e radv: additional query fixes
Make sure we read the updated data from the gpu in cases where WAIT_BIT
is not set.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-17 05:53:51 -04:00
Kenneth Graunke
7ee7b0ecbc iris: Fix large timeout handling in rel2abs()
...by copying the implementation of anv_get_absolute_timeout().

Appears to fix a CTS test with 32-bit builds:
GTF-GL46.gtf32.GL3Tests.sync.sync_functionality_clientwaitsync_flush

Fixes: f459c56be6 ("iris: Add fence support using drm_syncobj")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-08-23 10:32:01 -07:00
Kenneth Graunke
9310ae6f68 iris: Set MOCS in all STATE_BASE_ADDRESS commands
Rafael Antognolli tracked down a performance gap between i965 and iris
in Synmark2's OglCSDof microbenchmark, noting that iris was performing
substantially more memory reads and writes, with substantially fewer
L3 hits.  He suggested that something might be wrong with MOCS, or L3
configs, at which point I came up with a theory...

It would appear that the STATE_BASE_ADDRESS command updates the MOCS
settings for various base addresses even if you don't specify the
"Modify Enable" bit for that address.  Until now, we had been setting
only the MOCS for bases we intended to change, leaving the others
"blank" which is MOCS table entry 0, which is uncached.

Most data access has a more specific MOCS (e.g. in SURFACE_STATE),
but scratch access uses the Stateless Data Port Access MOCS from
STATE_BASE_ADDRESS.  So this meant all scratch access was uncached.

Improves performance in Synmark2's OglCSDof by 2x, bringing iris
on par with the existing i965 driver.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-23 10:21:48 -07:00
Vinson Lee
b05166e3d2 glx: Fix up glXQueryGLXPbufferSGIX on macOS.
Fix this build error on macOS.

../src/glx/apple/glx_empty.c:158:4: error: void function 'glXQueryGLXPbufferSGIX' should not return a value [-Wreturn-type]
   return 0;
   ^      ~

Fixes: 3dd299c3d5 ("glx: Sync <GL/glxext.h> with Khronos")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-08-23 11:05:23 -04:00
Juan A. Suarez Romero
6f137ed901 docs: update calendar, add news item and link release notes for 19.1.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-23 12:40:40 +02:00
Juan A. Suarez Romero
23f1741996 docs: add sha256 checksums for 19.1.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ae2a676cd1)
2019-08-23 12:38:28 +02:00
Juan A. Suarez Romero
152dd6ed19 docs: add release notes for 19.1.5
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit a384fe0ceb)
2019-08-23 12:38:27 +02:00
Connor Abbott
f59076f8a7 radeonsi/nir: Rewrite output scanning
Similarly to before, this didn't properly handle varying structs with
doubles in them.

This doesn't fix any tests, but was noticed while looking at the code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 11:05:31 +02:00
Connor Abbott
9395277972 radeonsi/nir: Rewrite store intrinsic gathering
The old version wasn't as accurate as it could be, and didn't handle
double variables inside structs correctly. Walk the path to compute the
actual components affected.

In combination with the previous commit fixes
KHR-GL45.enhanced_layouts.varying_structure_locations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 11:05:31 +02:00
Connor Abbott
87cca891c3 radeonsi/nir: Add const_index when loading GS inputs
This fixes loading GS inputs in structures or arrays.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 11:05:31 +02:00
Connor Abbott
82589d3ffd radeonsi/nir: Don't add const offset to indirect
This is already done in get_deref_offset() in the common code. We were
adding it twice accidentally.

Fixes KHR-GL45.enhanced_layouts.varying_array_locations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 11:05:31 +02:00
Connor Abbott
400db1852b ac/nir: Assert GS input index is constant
If it's not we silently ignore indir_index which is definitely a bug.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 11:05:31 +02:00
Connor Abbott
bb42c896fe ac/nir: Handle const array offsets in get_deref_offset()
Some users of this function (e.g. GS inputs) currently only work with
constant offsets. We got lucky since all the tests used an array index
of 0, so the non-constant part was always 0. But we still need to handle
this.

This doesn't fix any CTS test, but was noticed while debugging one.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 11:05:31 +02:00
Connor Abbott
97d592c855 radeonsi/nir: Don't recompute num_inputs and num_outputs
Don't repeat what mesa/st already does.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 11:05:31 +02:00
Connor Abbott
3eb4aeed60 st/nir: Fix num_inputs for VS inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 11:05:31 +02:00
Samuel Pitoiset
a4e6e59db8 radv/gfx10: do not use NGG with NAVI14
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-23 09:54:08 +02:00
Samuel Pitoiset
0813c27d8d radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0
Only gfx9 and older use it to get InstanceID in VGPR1.
Ported from RadeonSI.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-23 09:54:06 +02:00
Samuel Pitoiset
7d1c091143 gitlab-ci: bump LLVM to 8 for meson-vulkan and meson-clover
To fix pipeline builds.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-23 08:13:31 +02:00
Samuel Pitoiset
1fd60db4a1 ac,radv,radeonsi: remove LLVM 7 support
Now that LLVM 9 will be released soon, we will only support
LLVM 8, 9 and master (10).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-23 08:12:34 +02:00
Tapani Pälli
3e03a3fc53 egl: reset blob cache set/get functions on terminate
Fixes errors seen with eglSetBlobCacheFuncsANDROID on Android when
running dEQP that terminates and reinitializes a display.

Fixes: 6f5b57093b "egl: add support for EGL_ANDROID_blob_cache"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-23 08:14:08 +03:00
Kenneth Graunke
2d79925034 iris: Avoid unnecessary resolves on transfer maps
We were always resolving the buffer as if we were accessing it via
CPU maps, which don't understand any auxiliary surfaces.  But we often
copy to a temporary using BLORP, which understands compression just
fine.  So we can avoid the resolve, and accelerate the copy as well.

Fixes: 9d1334d2a0 ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-22 18:31:17 -07:00
Kenneth Graunke
136629a1e3 iris: Drop copy format hacks from copy region based transfer path.
This doesn't work for compressed formats, as the source texture and
temporary texture would have different block sizes.  (Forcing the driver
to always take the GPU path would expose the bug.)  Instead, just use
the source format for the temporary, and let blorp_copy deal with
overrides.

The one case where we can't do this is ASTC, because isl won't let us
create a linear ASTC surface.  Fall back to the CPU paths there for now.

Fixes: 9d1334d2a0 ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-22 18:31:17 -07:00
Kenneth Graunke
1cd13ccee7 iris: Update fast clear colors on Gen9 with direct immediate writes.
Gen11 stores the fast clear color in an "indirect clear buffer", as
a packed pixel value.  Gen9 hardware stores it as a float or integer
value, which is interpreted via the format.  We were trying to store
that in a buffer, for similarity with Icelake, and MI_COPY_MEM_MEM
it from there to the actual SURFACE_STATE bytes where it's stored.

This unfortunately doesn't work for blorp_copy(), which does bit-for-bit
copies, and overrides the format to a CCS-compatible UINT format.  This
causes the clear color to be interpreted in the overridden format.

Normally, we provide the clear color on the CPU, and blorp_blit.c:2611
converts it to a packed pixel value in the original format, then unpacks
it in the overridden format, so the clear color we use expands to the
bits we originally desired.

However, BLORP doesn't support this pack/unpack with an indirect clear
buffer, as it would need to do the math on the GPU.  On Gen11+, it isn't
necessary, as the hardware does the right thing.

This patch changes Gen9 to stop using an indirect clear buffer and
simply do PIPE_CONTROLs with post-sync write immediate operations
to store the new color over the surface states for regular drawing.
BLORP continues streaming out surface states, and handles fast clear
colors on the CPU.

Fixes: 53c484ba8a ("iris: blorp using resolve hooks")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-22 18:31:14 -07:00
Kenneth Graunke
117a0368b0 iris: Fix broken aux.possible/sampler_usages bitmask handling
For renderable surfaces, we allocate SURFACE_STATEs for each bit in
res->aux.possible_usages.  Sampler views use res->aux.sampler_usages.

When pinning buffers, we call surf_state_offset_for_aux() to calculate
the offset to the desired surface state.  surf_state_offset_for_aux()
took an aux_modes parameter, which should be one of those two fields.
However...it was not using that parameter.  It always used the broader
res->aux.possible_usages field directly.

One of the callers, update_clear_value(), was passing incorrect masks
for this parameter.  It iterated through the bits in order, using
u_bit_scan(), which destructively modifies the mask.  So each time we
called it, the count of bits before our selected mode was 0, which would
cause us to always update the SURFACE_STATE for ISL_AUX_USAGE_NONE,
rather than updating each in turn.  This was hidden by the earlier bug
where surf_state_offset_for_aux() ignored the parameter.

Fixes: 7339660e80 ("iris: Add aux.sampler_usages.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-22 18:31:14 -07:00
Kenneth Graunke
f6c44549ee iris: Replace devinfo->gen with GEN_GEN
This is genxml, we can compile out this code.

Fixes: 2660667284 ("iris/gen8: Re-emit the SURFACE_STATE if the clear color changed.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-22 18:31:14 -07:00
Alyssa Rosenzweig
272ce6f5a7 pan/midgard: Fix writeout combining
shader-db regression in the scheduler.

Fixes: dff4986b1a ("pan/midgard: Emit store_output branch just-in-time")

total bundles in shared programs: 2055 -> 2019 (-1.75%)
bundles in affected programs: 1055 -> 1019 (-3.41%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 20.00% x̄: 6.71% x̃: 5.16%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -8.45% -4.97%
Bundles are helped.

total quadwords in shared programs: 3444 -> 3408 (-1.05%)
quadwords in affected programs: 1897 -> 1861 (-1.90%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 14.29% x̄: 3.97% x̃: 2.99%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -5.08% -2.86%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 14:03:23 -07:00
Alyssa Rosenzweig
2c5ba2ee6e panfrost: Implement gl_FragCoord correctly
Rather than passing through the transformed gl_Position, we can use the
hardware-level varying for this, which will correctly handle
gl_FragCoord.w

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig
eeebf5c2df panfrost: Remove vertex buffer offset from its size
The offset is added to the base address, so we need to subtract it from
the size to maintain the same end address and thus prevent a buffer
overflow:

   end_address = start_address + size

   start_address' = start_address + offset
   size' = size - offset

   end_address' = start_address' + size'
                = (start_address + offset) + (size - offset)
                = (start_address + size) + (offset - offset)
                = start_address + size
                = end_address

   QED.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig
f4678f3c62 pan/decode: Handle special varyings
We need a special path for special varyings so we parse them correctly
instead of throwing an error when they inevitably point to bad memory.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig
caec0b3232 pan/decode: Remove size/stride divisibility check
The hardware doesn't care, and a lot of Panfrost code relies on an
oversized buffer. The important part is that (stride *
padded_num_vertices) is no greater than size, which we'll need to check
once we validate instancing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig
ed464e05c8 pan/decode: Decouple attribute/meta printing
They are independent fields, so the parser should reflect that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig
ae84f16786 pan/decode: Print stub for uniforms
We don't need to dump the contents necessary, but having the stub with
the address is useful.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 13:31:06 -07:00
Alyssa Rosenzweig
26ed431ea9 pan/decode: Decode actual varying_meta address
I don't know who thought this mask was a good idea but unfortunately it
must have been me.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:56:49 -07:00
Alyssa Rosenzweig
f48136e9c5 pan/decode: Downgrade shader property mismatch to warning
If we permit more $whatever through than the shader needs, that's a bit
of a waste, but it isn't an error.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:56:35 -07:00
Alyssa Rosenzweig
f38ce6ea8c pan/decode: Validate, but do not print, index buffer
We don't actually care about the *contents* of the index buffer, but we
would rather like to ensure it is present and of the correct size.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:56:04 -07:00
Alyssa Rosenzweig
cbbf75424a pan/decode: Validate mali_shader_meta stats
We can infer these stats in many cases from the disassembly, so we
should try to sanity check where we can. We may need to be fuzzy about
analysis, since analysis gives us a bound but we don't mind if it's not
used fully by the shader.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:55:49 -07:00
Alyssa Rosenzweig
9b067d96f7 pan/decode: Disassemble before printing shader descriptor
This allows the shader descriptor to access the disassembled stats.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:55:27 -07:00
Alyssa Rosenzweig
5f9a1c74ae pan/decode: Promote <no shader> to an error
There is no reason this should happen to an in-spec program, as far as I
know.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:55:00 -07:00
Alyssa Rosenzweig
d7473e2e01 pan/decode: Fix uniform printing
Lazypasting from UBOs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:54:35 -07:00
Alyssa Rosenzweig
139708bbab pan/decode: Validate blend shaders don't access I/O
We could do better by forcing the checks to *equal* zero (right now, an
indeterminate answer will pass the checks), but this is a start to guard
against some egregious cases.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:54:16 -07:00
Alyssa Rosenzweig
ded9a68d8f pan/decode: Validate and simplify FRAGMENT payloads
There are a number of conditions we need to test for to statically check
for TILE_RANGE_FAULTs, but once these checks are in order, we can print
as-is.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:53:44 -07:00
Alyssa Rosenzweig
f06e8f7fe9 pan/decode: Validate MFBD tags
These tags need to match up with what's actually described by the MFBD,
so check this. Once this is checked, since the type and contents of the
FBD are obvious from printing above, there's no need to explicitly mark
off the framebuffer line.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:53:10 -07:00
Alyssa Rosenzweig
0c313419a0 pan/decode: Eliminate non-FBD dumped case
We don't need *more* cases to deal with.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:52:52 -07:00
Alyssa Rosenzweig
6ec33b4f34 pan/decode: Removing uniform buffer framing
We can do single line prints:

   ubuf_0[192] = memory_161f5000 + 896;

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:52:37 -07:00
Alyssa Rosenzweig
a68fe4baec pan/decode: Remove mali_attr(_meta) framing
It doesn't give any real added value.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:52:18 -07:00
Alyssa Rosenzweig
f162adc32b pan/midgard: Disassemble integer constants in hex
It's usually easier to parse mentally.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:51:55 -07:00
Alyssa Rosenzweig
b89cb0dba6 pan/midgard: Explain ffma
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:51:39 -07:00
Alyssa Rosenzweig
19d58a299b pan/midgard: Analyze simple loads/store
For shaders using exclusively direct attribute/varyings, we can work
this out statically. For shaders with indirect access, we just set an
upper bound of 16 (the max attributes/varyings we support) and the
actual count will be reported regardless.

We proceed similarly for textures/samplers, as well as for UBOs. While
UBOs can be *indexed* indirectly, the *UBO itself* -- which is what we
count in the shader descriptor (rather than the UBO descriptors) -- is
statically determinable.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:51:21 -07:00
Alyssa Rosenzweig
a89e368c7f pan/midgard: Compute work_count via writes
This is exact.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:50:57 -07:00
Alyssa Rosenzweig
b9fb63859e pan/midgard: Sketch static analysis to uniform count
This one is a little tricky, but the idea is that:

   r16-r23 are always uniforms

   r8-r15 are sometimes work, sometimes uniforms...
      ...but as work, they are always written before use
      ...and as uniforms, they are never written before use

So we use that heuristic to determine the count to feed the machine.
We'll record work register use in the next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:50:40 -07:00
Alyssa Rosenzweig
58fc260312 pan/decode: Hoist shader-db stats to shared decode
We'll want all this information to validate the shader descriptor.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:50:14 -07:00
Alyssa Rosenzweig
a8f86fcb51 nir: Remove nir_const_load_to_arr
There are no remaining users in-tree.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-22 12:24:13 -07:00
Alyssa Rosenzweig
3c01a6928a pan/midgard,bifrost: Expand nir_const_load_to_arr
Panfrost is the only user of the macro; we are better off expanding than
having random stuff in nir.h.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-22 12:24:13 -07:00
Adam Jackson
a7a9d958bc glx: Make __glXGetDrawableAttribute return true sometimes
Right now it always returns zero, but as of:

    commit a48a6b8a40
    Author: Adam Jackson <ajax@redhat.com>
    Date:   Tue Nov 14 15:13:05 2017 -0500

        glx: Prepare driFetchDrawable for no-config contexts

We were hoping it would return true if the drawable could actually be
looked up. It wasn't, so that didn't go very well. With the most recent
update to <GL/glxext.h> glXQueryGLXPbufferSGIX (correctly) returns void,
so there's no longer anything else besides driFetchDrawable that depends
on the return value from __glXGetDrawableAttribute.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-22 13:29:06 -04:00
Adam Jackson
3dd299c3d5 glx: Sync <GL/glxext.h> with Khronos
Minor fixups required to keep the prototypes matching and to remove
mention of retired enums.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-22 13:29:04 -04:00
Adam Jackson
5ebd333c6c glx: Whitespace cleanups
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-22 13:28:39 -04:00
Eric Engestrom
6db1dfe347 swr: use LLVM version string instead of re-computing it
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-22 16:08:09 +01:00
Eric Engestrom
7f5ef97a07 llvmpipe: use LLVM version string instead of re-computing it
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-22 16:08:09 +01:00
Eric Engestrom
3ea83f4c9b scons: define MESA_LLVM_VERSION_STRING like the other build systems do
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-22 16:08:09 +01:00
Bas Nieuwenhuizen
c037fe5ad1 radv: Disable NGG for geometry shaders.
A bunch of remaining issues including some that affect users.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111248
Fixes: ee21bd7440 "radv/gfx10: implement NGG support (VS only)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-22 12:47:32 +02:00
Lionel Landwerlin
5833f43305 util/timespec: use unsigned 64 bit integers for nsec values
We added this utility for vulkan where all timeouts are given as
uint64_t values. We can switch from signed to unsigned as this is the
only user and if we ever deal with signed integers somewhere else
we'll have to be careful to use the corresponding
timespec_(add|sub)_msec and always pass absolute values.

v2: Forgot to drop the test calling add_nsec() with a negative number

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Fixes: d2d70c3bb5 ("util: add a timespec helper")
Acked-by: Daniel Stone <daniels@collabora.com>
2019-08-22 09:35:57 +02:00
Tapani Pälli
728ebcdec2 iris/android: fix build and link with libmesa_intel_perf
Fixes: 0fd4359733 "iris/perf: implement routines to return counter info"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-22 10:01:14 +03:00
Samuel Pitoiset
2d9f401a83 ac: fix exclusive scans on GFX8-GFX9
This fixes a regression introduced with scan&reduce operations
on GFX10. Note that some subgroups CTS still fail on GFX10 but
I assume it's a different issue.

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.

Fixes: 227c29a80d "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-22 08:43:15 +02:00
Tapani Pälli
ce8fd042a5 util: fix os_create_anonymous_file on android
Commit fixes current crashes with Vulkan applications on Android.

Fixes: c0376a1234 "util: add anon_file.h for all memfd/temp file usage"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-08-22 08:27:43 +03:00
Lionel Landwerlin
ac5bda374a i965: honor scanout requirement from DRI
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 23:52:07 +00:00
Kenneth Graunke
bc844d92ce gallium/noop: Implement resource_get_param
v2: Pass through to oscreen rather than faking it (review from Marek).

Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-21 22:18:22 +00:00
Kenneth Graunke
f02d1a0b75 gallium/rbug: Wrap resource_get_param if available
Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-21 22:18:22 +00:00
Kenneth Graunke
c43a44791b gallium/trace: Wrap resource_get_param if available
Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-21 22:18:22 +00:00
Kenneth Graunke
0e6b573ae5 gallium/ddebug: Wrap resource_get_param if available
Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-21 22:18:22 +00:00
Jose Maria Casanova Crespo
74a7e3ed3b mesa: recover target_check before get_current_tex_objects
At compressed_tex_sub_image we only can obtain the tex_object after
compressed_subtexture_target_check is validated for TEX_MODE_CURRENT.
So if the target is wrong the error is raised to the user.

This completes the fix for the regression introduced on "mesa: refactor
compressed_tex_sub_image function" of the pending failing tests:

dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d
dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.compressedtexsubimage3d

v2: Fix warning that texObj might be used uninitialized (Gert Wollny)

Fixes: 7df233d68d ("mesa: refactor compressed_tex_sub_image function")
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-08-21 21:26:48 +01:00
Kevin Strasser
5baff5dd3c gallium: Add buffer and configs handling or fp16 formats
Expose configs when allow_fp16_configs has been enabled and
DRI_LOADER_CAP_FP16 is set in the loader.

Also, make kms_swrast_dri respect format bpp, to allow for allocating
buffers wider than 32 bpp.

Make fp16 opt-in for gallium.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
f4703f1c10 i965: Add handling for fp16 configs
Expose configs when allow_fp16_configs has been enabled and
DRI_LOADER_CAP_FP16 is set in the loader.

Also, define a new dri configuration option so users can disable exposure of
fp16 formats. Make fp16 opt-in for i965.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
4861d2a395 gbm: Add buffer handling and visuals for fp16 formats
Define and set a new loader cap DRI_LOADER_CAP_FP16, indicating that gbm can
handle fp16 formats.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
a427c20080 dri: Add fp16 formats
Add dri formats for RGBA ordered 64 bpp IEEE 754 half precision floating
point. Leverage existing offscreen render support for
MESA_FORMAT_RGBA_FLOAT16 and MESA_FORMAT_RGBX_FLOAT16.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
482ed4347d egl: Handle dri configs with floating point pixel data
In the case that __DRI_ATTRIB_FLOAT_BIT is set in the dri config, set
EGL_COLOR_COMPONENT_TYPE_FLOAT_EXT in the egl config. Add a field to the
platform driver visual to indicate if it has components that are in floating
point form.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
86d31c2c12 dri: Handle configs with floating point pixel data
In order to handle pixel formats that consist of floating point data, enable
floatMode field in the dri config, and set __DRI_ATTRIB_FLOAT_BIT in the
render type attribute.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
d4a9010338 glx: Add fields for color shifts
glx doesn't read the masks from the dri config directly, but for consistency
add shifts to the glxconfig.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
7b4ed2b513 egl: Convert configs to use shifts and sizes instead of masks
Change dri2_add_config to take arrays of shifts and sizes, and compare with
those set in the dri config. Convert all platform driver masks
to shifts and sizes.

In order to handle older drivers, where shift attributes aren't available,
we fall back to the mask attributes and compute the shifts with ffs.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
3562f48c9d util: move bitcount to bitscan.h
bitcount is free from the pipe header dependencies that make u_math.h hard
to include by non-gallium specific code, so move it to bitscan.h. bitscan.h
is included by u_math.h so existing references will continue working.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
5a747306ce dri: Add config attributes for color channel shift
The existing mask attributes can only support up to 32 bpp. Introduce
per-channel SHIFT attributes that indicate how many bits, from lsb towards
msb, the bit field is offset. A shift of -1 will indicate that there is no
bit field set for the channel.

As old loaders will still be looking for masks, we set the masks to 0 for
any formats wider than 32 bpp.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
9328e7c04c gallium: Use consistent approach for config format filtering
rgb10 uses an 'if(allowed) continue' approach, do the same for rgba_ordering.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
4fb71604b7 i965: Add helper function for allowed config formats
The driver checks dri config options and loader caps to filter out certain
formats during config creation. Fold 4 call sites under a single helper
function.

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-21 18:36:57 +00:00
Kevin Strasser
d07a56dbc0 drm-uapi: Update headers for fp16 formats
From drm-next commit 88ab9c76d191ad8645b483f31e2b394b0f3e280e

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-08-21 18:36:57 +00:00
Andres Rodriguez
bd960390bb radv: add RADV_DEBUG=allentrypoints
This debug option allows vkGet[Instance/Device]ProcAddr() to succeed
even if the extension associated with the requested entrypoint was not
enabled.

This has come in handy in a few instances when debugging VR
applications, so I thought it would be good to have a cleaned up version
upstreamed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 17:47:35 +00:00
Alyssa Rosenzweig
0ae72df013 panfrost: Fix PIPE_BUFFER spacing
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:44:45 -07:00
Alyssa Rosenzweig
d4542f8cb5 panfrost: Implement depth range clipping
This should fix glDepthRangef issues. Eventually, something similar
should allow implementing the depth bounds test.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:44:45 -07:00
Alyssa Rosenzweig
5e268a01d2 panfrost: Don't bail on PIPE_BUFFER
We can handle some of it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:43:02 -07:00
Alyssa Rosenzweig
7f14916372 pan/midgard: Identify and disassemble indirect texture/sampler
A pair of special flags can turn the texture/sampler handle fields into
register selects. This means code like:

   texture(uTextures[hr28.w], ...)

can be compiled to something like:

   texture ..., fsampler[hr28.w], texture[hr28.w]

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:41:15 -07:00
Alyssa Rosenzweig
8c1bc3c000 pan/midgard: Breakout texture reg select printer
This data structure is shared in other parts of the texture word, so
let's streamline printing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:41:15 -07:00
Alyssa Rosenzweig
aa404120e1 panfrost: Pass stream_output_info by reference
It's a large structure, apparently.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
27b6264630 panfrost: Guard against NULL rasterizer explicitly
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
5ebdd10eaf pan/bifrost: Correct file size signedness
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
87afc2e2da panfrost: Fix missing ret assignment in DRM code
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
c43fa6b320 panfrost: Hoist bo != NULL check before dereference
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
a3c1ab2e9a panfrost: Hoist job != NULL check
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
9cee21f0c9 panfrost: Prevent potential integer overflow in instancing
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
5bdc9096b7 panfrost: Clarify intention with PIPE_SWIZZLE_X check
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
8fba6ab03d panfrost: Pay attention to framebuffer dimension sign
These are unsigned so the clamp-positive is redundant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
14a2032f0f pan/midgard: Mark fallthrough explicitly
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
ed58fd63b4 panfrost: Don't check reads_point_coord
Useless check.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
d0b9f094fd pan/midgard: Simplify contradictory check.
Coverity.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
91a5b2657d pan/midgard: Reorder bits check to fix 8-bit masks
Coverity.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
6189274f57 pan/midgard: Represent unused nodes by ~0
This allows nodes to be unsigned and prevents a class of weird
signedness bugs identified by Coverity.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
cda0ec67e6 pan/bifrost: Avoid buffer overflow in disassembler
This path shouldn't be possible for in-spec shaders, but let's be
defensive. (Because security, right? Mostly because Coverity.)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
9ce45ac808 pan/decode: Remove all_zero
The checks confuse Coverity, so let's make it explicit what's going on.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig
1060c48d46 pan/decode: Don't leak FBD pointer
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:30 -07:00
Alyssa Rosenzweig
52ac7dc5d0 pan/midgard: Allocate dependencies on stack
It's small; this way we don't leak memory.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:30 -07:00
Alyssa Rosenzweig
bf036e127f pan/midgard: Free liveness info
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:30 -07:00
Jason Ekstrand
c9a4793de8 v3d: Use the correct opcodes for signed image min/max
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-21 17:19:55 +00:00
Jason Ekstrand
021fa28163 intel/nir: Add a helper for getting BRW_AOP from an intrinsic
So many duplicated switch statements....

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-21 17:19:55 +00:00
Jason Ekstrand
951cf94521 nir: Add explicit signs to image min/max intrinsics
This better matches all the other atomic intrinsics such as those for
SSBOs and shared variables where the sign is part of the intrinsic
opcode.  Both generators (GLSL and SPIR-V) know the sign from the type
of the image variable or handle.  In SPIR-V, signed min/max are separate
opcodes from unsigned.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-21 17:19:55 +00:00
Alyssa Rosenzweig
fc69a5cf73 pan/decode: Cleanup mali_attr printing
We can smush this into one-line per record as per usual. We still need
more validation and cleaning this up, especially around instancing. But
for LINEAR records, it works okay already.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:56 -07:00
Alyssa Rosenzweig
62e6673908 pan/decode: Validate attribute/varying buffer pointer
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
be5e30c46b pan/decode: Include address in union mali_attr
No need to break it out into extra lines.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
68b9030db7 pan/decode: Use concise texture printing
This consolidates texture format and dimensionality into something simple:

    tiled rgba8_unorm.rgb1: 512x512

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
9f15f4d8e9 panfrost: Break up usage2 field
This is another bit field describing layout.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
9b203950ec pan/decode: Pretty-print sRGB format
We can just stick an "s" in if it's sRGB.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
47af32b15e panfrost: Remove ancient TODO
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
96f6b8a707 panfrost: nr_mipmap_levels -> levels
No need to be so verbose.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
024f9cf24f pan/decode: Validate texture dimensionality
Textures of a smaller dimension don't need higher dimensions printed.
This allows us to be more compact, while enforcing verification that
higher dimensions must be zero.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
8fc4ca82e3 pan/decode: Break out pandecode_texture function
It's massive and hugely nested indentation -- break it out so it's
legible.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
fa536ece04 pan/decode: Guard texture unknowns as zero trips
unknown3A I think I've actually seen on T6xx but.. we'll see what
happens in traces going forward. We don't want the zero noise normally,
and if they show up in the wild, we want to draw attention to them.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig
e09392fc27 pan/decode: Use GLSL style formats/swizzles
This dramatically reduces visual clutter: now an entire
attribute/varying record looks something like:

    rgba32f attribute_0[16].bgra;

which is equivalent to the raw structure:

{
   .index = 0,
   .format = MALI_FORMAT_RGBA32F,
   .swizzle = (MALI_CHANNEL_BLUE << 9) | ....,
   .src_offset = 16,
}

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig
ac090b365f pan/decode: Don't print the default swizzle
It's just noise.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig
2208eb9b72 pan/decode: Validate swizzles against format
We want to make sure we don't access a component in the swizzle that
doesn't exist in the format, since that is (as far as I know) undefined.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig
b233012d44 pan/decode: Treat RESERVED swizzles as errors
We've never seen them, so if they come up in trace, we want to draw
attention to that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig
a94fb781c2 pan/decode: Handle VARYING_DISCARD
Varying discard is not used by Panfrost, but the blob uses it sometimes
to have some padding in the varyings table, probably to minimize
per-draw overhead. (...We should maybe consider this ourselves!)

Let's check for this and ensure the rest of the record is consistent
with a discarded varying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig
c0642ebca1 panfrost: Don't trip the prefix magic field
What *is* this?

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig
00be5d7b82 pan/decode: Guard attribute unknowns
One should be zero. The other has always been seen as set, so check
this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig
9836c26ac1 panfrost: Don't crash on GL_CLAMP
It's a legacy GL thing... we don't really need to handle it *right* now,
but we shouldn't crash..

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig
e4eaa730dd panfrost: Do not expose PIPE_CAP_TEXTURE_MIRROR_CLAMP
This CAP controls a desktop-only extension. If the corresponding support
exists in the hardware, we don't know how to use it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
be81711c0a panfrost: Fix scoreboarding with dependency on job #0
Subtle issue masked by how we emitted SET_VALUE jobs, but this case can
and does occur, so let's fix it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
3752f76715 pan/decode: Normalize final instances of XXX
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
dcde5bd157 pan/decode: Normalize case matching XXX format
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
89c5370118 pan/decode: Mark tripped zeroes with XXX
This normalizes the printed format. It also makes it easier for the
future when we may introduce semantic _warn and _error handlers.

A tripped zero is essentially a hazard to check for.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
e49204c878 pan/decode: Check for MFBD preload chicken bit
If this bit is clear, MFBD preload will be enabled, and you.. don't want
that. (At least, when the bit is clear, the old contents of the
framebuffer will be preserved. I'm assuming this is what "MFBD preload"
refers to in kbase.)

Validate that this bit is always set.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
c9b6233558 pan/decode: Validate AFBC fields are zero when AFBC is disabled
There is no "chunknown" structure; that part of the union is an artefact
from falsely believing vertex/tiler MFBDs could have render targets
attached (they can't). These are just plain old AFBC fields, and if
there is no AFBC, it's error to set these field.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
4aeb694462 pan/decode: Do not print uniform/buffers explicitly
For our purposes of driver debugging, the contents of uniform buffers
are rarely interesting; we're more concerned about the metadata setting
them up.

We do need to be careful to validate the sizes of both uniforms and
uniform buffers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
4391c65f10 pan/decode: Add static bounds checking utility
Many structures in the command stream have a GPU address and size
determined statically. We should check that the pointers we are passed
are valid and the buffers they point to are big enough for the given
size. If they're not, an MMU fault would be raised.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
9dfbc8dc03 pan/decode: Don't print unreferenced attribute memory
This is a source of uninitialized memory leaking into the traces.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
897110a566 pan/decode: Check for a number of potential issues
Verify sizes / masks / etc against something logical to cull down the
trace space and automatically guard against a number of potential
hazards.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
f5c293425f panfrost: Correct polygon size computations
While the algorithm for computing the header size has been correct for a
while, we used a major hack to conservatively guess the body size. Let's
scrap that and figure out the algorithm we actually need to use to be
bit-identical with what the hardware expects.

We do have to be careful to add the header size to total comptued BO
size.

It's not clear how big the polygon list needs to be in practice -- but
it has to be somewhat bigger than the polygon list itself. This needs
more investigation. If we size the polygon list exactly based on the
polygon_list_size field, we get faults like:

[ 1224.219886] panfrost ff9a0000.gpu: Unhandled Page fault in AS0 at VA 0x000000001BDE8000
               Reason: TODO
               raw fault status: 0x660003C3
               decoded fault status: SLAVE FAULT
               exception type 0xC3: TRANSLATION_FAULT_LEVEL3
               access type 0x3: WRITE
               source id 0x6600

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
f6e41f30d0 panfrost: Remove DRY_RUN
Nobody uses this anymore anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
b4a214207c pan/decode: Print "just right" count of texture pointers
The other commented lines just add noise/entropy we don't want, and can
in fact crash the trace due to asserts failing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
a8bd3ad470 pan/decode: Verify and omit polygon size
The polygon sizes are computed from the width/height/flags, so we can
reverse the computation and use our computation to verify the two
computation algorithms are bit-identical. If they are, we can omit the
computed fields.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
b45eb2775e panfrost: Move pan_tiler.c outside of Gallium
The routines in this file may be shared with Vulkan.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig
13d07978ff pan/decode: Bounds check polygon list and tiler heap
We have the BOs available; ensure that the bounds specified in the
command stream are actually the correct bounds.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig
b072d0357b pan/decode: Allow updating mmaps
This allows the caller to call track_mmap multiple times for the same
gpu_va for the purpose of updating the mmap. This is used to trace
invisible BOs with kbase and doesn't apply to native traces.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig
52101e48f8 pan/decode: Express tiler structures as offsets
This allows us to catch a class of errors (for negative offsets, etc)
automatically.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig
e918dd8a6c pan/decode: Don't print zero exception_status
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig
2a8d776884 pan/decode: Fix missing NULL terminator
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig
6c67bd05a6 pan/decode: Silence workgroups_x_shift_2
Since we're bit-identical we can compare the computed value.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig
3752566584 panfrost: Implement workgroups_x_shift_2 quirk
I'm not sure why this is done this way, but let's follow the blob.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig
25ed930c4a pan/decode: Don't print canonical workgroup encoding
The on-the-wire representation of workgroups is not 1:1 to the decoded
Gallium-level workgroups (there are multiple valid encodings; see the
previous commit). Nevertheless, since we're now bit-identical in packing
vs the blob, we can check for a canonical form and only print the
verbose trace if we fail the canonical form.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig
fb56a162a9 panfrost: Set workgroups z to 32 for non-instanced graphics
This is a blob quirk; in so much as I know, the hardware doesn't care.
But we're trying to be bit-identical to take as much entropy out of
traces as possible, so let's introduce the quirk.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig
39b226cfb3 panfrost: Move pan_invocation to shared panfrost/
The routines in this file have no dependency on Gallium. Let's share
them so they can be used for a theoretical future Vulkan driver or, more
immediately, consulted when tracing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig
d9f33951df pan/decode: Don't print MALI_DRAW_NONE
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig
740f86c9ee pan/decode: Eliminate DYN_MEMORY_PROP
It's obvious that it's linked by virtue of us printing the struct it
links against. No need to repeat ourselves.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alejandro Piñeiro
41549a18e6 i965: Enable OpenGL 4.6 for Gen8+
The last remaining stuff was ARB_gl_spirv and ARB_spirv_extensions.

Note that it is really likely that we can enable it for some Gen7 (as
4.5 was), but it was not tested yet.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-21 17:29:42 +02:00
Alejandro Piñeiro
7dab76014a mesa/version: uncomment SPIR-V extensions
As they are implemented on i965, so we can expose 4.6.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-21 17:29:42 +02:00
Alejandro Piñeiro
2e8565bead i965: enable ARB_gl_spirv extension and ARB_spirv_extensions for gen7+
v2: squashed the two enable patches with the docs one (Jason)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-21 17:29:42 +02:00
Tomeu Vizoso
4109a2f612 panfrost/ci: Print load stats
To help make sure we are running tests in the ideal number of threads,
print load stats to make obvious when there's a problem with
utilization.

This will be specially useful when we run tests on a wider variety of
devices.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso
3794652385 panfrost/ci: Install qemu-arm-static into chroot
Some runners may be configured such that the qemu binary might not be
available by the time we need to start running commands within the
chroot.

So make sure that it's there to avoid suprising problems in that case.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso
8496045adc panfrost/ci: Build kernel with CONFIG_DETECT_HUNG_TASK
There's lots of locking changes going into the Panfrost kernel driver,
so better be prepared.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso
a074513dc2 panfrost/ci: Print bootstrap log
A number of things can go wrong when building the rootfs from within a
non-native chroot, so make sure to print the bootstrap.log so we can
tell what's going on.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso
76af465e57 panfrost/ci: Use Volt-based runner for dEQP tests
It's able to run tests in parallel, fully utilizing the HW and
shortening considerable the time it takes.

Needed to disable tests in RK3288 for now because Volt doesn't support
armhf yet, though this should be fixed soon.

Tests are now run with --deqp-gl-config-name=rgba8888d24s8ms0, so we are
hitting a few more failures in tests that previously were being skipped.

The time to run the tests decreases from around 8 minutes to 1:45
minutes, allowing for extending coverage without increasing CI times too
much.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Samuel Pitoiset
29834fe8a2 radv: implement VK_AMD_shader_core_properties2
Trivial extension that matches PAL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 15:14:29 +02:00
Samuel Pitoiset
a6ad9e8ccf radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood
This gives a nice boost, +20% at this time on my Vega 56. Shader
ballot should be enabled by default at some point but it reduces
performance a bit (-6%) with Wolfeinstein II. Enable it only for
Youngblood at the moment, like what we did for Talos in the past.

As a bonus point, it gets rid of some minor artifacts that only
happens when ballot is disabled for some reasons.

Cc: 19.2 <mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 15:14:29 +02:00
Samuel Pitoiset
f202ac27a9 radv: add a new debug option called RADV_DEBUG=noshaderballot
Shader ballot will be enabled by default for Wolfenstein
Youngblood. This follows what we did for sisched.

Cc: 19.2 <mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 15:14:29 +02:00
Samuel Pitoiset
e73d863a66 radv: allow to enable VK_AMD_shader_ballot only on GFX8+
Scans aren't implemented on SI/CIK.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 15:14:29 +02:00
Danylo Piliaiev
e71fc7f238 nir/loop_analyze: Treat do{}while(false) loops as 0 iterations
Loops like:

block block_0:
vec1 32 ssa_2 = load_const (0x00000020)
vec1 32 ssa_3 = load_const (0x00000001)
loop {
    vec1 32 ssa_7 = phi block_0: ssa_3, block_4: ssa_9
    vec1 1 ssa_8 = ige ssa_2, ssa_7
    if ssa_8 {
        break
    } else {
    }
    vec1 32 ssa_9 = iadd ssa_7, ssa_1
}

Were treated as having more than 1 iteration and after unrolling
produced wrong results, however such loop will exit during
the first iteration if not unrolled.

So we check if loop will actually loop.

Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 11:01:15 +00:00
Danylo Piliaiev
84b3ef6a96 nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll
Without loop_prepare_for_unroll loops are losing phis.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411
Fixes: 5db98195 "nir: add loop unroll support for wrapper loops"
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 10:43:27 +00:00
Danylo Piliaiev
8869f44e9a nir/loop_unroll: Update the comments for loop_prepare_for_unroll
The comments say that we should remove continue if it is the last
intruction in a loop however we remove any kind of jump.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 10:43:27 +00:00
Bas Nieuwenhuizen
e04761d0f9 radv: Emit VGT_GS_ONCHIP_CNTL for tess on GFX10.
Otherwise hangs are possible. This register was already set for
GS and NGG.

Fixes: 5eaed7ecfc "radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-21 09:51:47 +00:00
Bas Nieuwenhuizen
2e763f7c87 radv: Use correct vgpr_comp_cnt for VS if both prim_id and instance_id are needed.
Should take the max of the 2.

Fixes: ea337c8b7e "radv/gfx10: fix VS input VGPRs with the legacy path"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-21 09:38:46 +00:00
Daniel Schürmann
7fa1740035 nir/algebraic: some subtraction optimizations
Changes with RADV/ACO:
Totals from affected shaders:
SGPRS: 444087 -> 455543 (2.58 %)
VGPRS: 436468 -> 436768 (0.07 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 13448928 -> 13353520 (-0.71 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 68060 -> 67979 (-0.12 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-21 08:51:49 +00:00
Lionel Landwerlin
8a2465e3f3 radeonsi: take reference glsl types for compile threads
An application quitting before the destroying its GL context and
binding a NULL context might still have a radeonsi compiler thread
running and potentially still accessing the types.

Therefore take a reference for the duration of the threads' lifetime.

v2: Only ref the glsl types, the builtins should be used by the time
    shader data gets to a gallium driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin
e4da8b9c33 mesa/compiler: rework tear down of builtin/types
The issue we're running into when running CTS is that glsl types are
deleted while builtins depending on them are not.

This happens because on one hand we have glsl types ref counted, but
builtins are not. Instead builtins are destroyed when unloading libGL
or explicitly calling glReleaseShaderCompiler().

This change removes almost entirely any dealing with glsl types
ref/unref by letting the builtins deal with it instead. In turn we
introduce a builtin ref count mechanism. Each GL context takes a
reference on the builtins when compiling a shader for the first time.
It releases the reference when the context is destroyed. It can also
explicitly release those when glReleaseShaderCompiler() is called.

Finally we also take a reference on the glsl types when loading libGL
to avoid recreating glsl types too often.

v2: Ensure we take a reference if we don't have one in link step (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin
9f37bc419c compiler: ensure glsl types are not created without a reference
We want to detect invalid refcounting so assert we have at least one
use before creating types.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin
8b913bd1ce nir/tests: take reference on glsl types
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin
3ade8f0040 glsl/tests: take refs on glsl types
Much like each driver, tests as standalone entities must take
references on the glsl types.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Samuel Pitoiset
41d9873459 radv/gfx10: hardcode some depth+stencil formats in the format table
The script doesn't handle them correctly and D16_UNORM_S8_UINT
isn't supported by the hardware, mark it as invalid.

This fixes warning when generating gfx10_format_table.h.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111393
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 08:17:40 +02:00
Samuel Pitoiset
1650e747c6 radv/gfx10: tidy up gfx10_format_table.py
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-21 08:17:38 +02:00
Ilia Mirkin
958390a9bf gallium/vl: use compute preference for all multimedia, not just blit
The compute paths in vl are a bit AMD-specific. For example, they (on
nouveau), try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader render)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-20 23:51:39 -04:00
Emil Velikov
cca442f3ba docs: update calendar for 19.2.x
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-20 23:14:53 +01:00
Emil Velikov
a3d42ad248 docs: add 19.3.0-devel release notes template
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-20 22:39:25 +01:00
Emil Velikov
e6c0b493d2 mesa: bump version to 19.3.0-devel
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-20 22:33:49 +01:00
2371 changed files with 314721 additions and 91370 deletions

View File

@@ -0,0 +1,66 @@
goto %1
:install
rem Check pip
if "%buildsystem%" == "scons" (
python --version
python -m pip --version
rem Install Mako
python -m pip install Mako==1.0.7
rem Install pywin32 extensions, needed by SCons
python -m pip install pypiwin32
rem Install python wheels, necessary to install SCons via pip
python -m pip install wheel
rem Install SCons
python -m pip install scons==3.0.1
call scons --version
) else (
python --version
python -m pip install Mako meson
meson --version
rem Install pkg-config, which meson requires even on windows
cinst -y pkgconfiglite
)
rem Install flex/bison
set WINFLEXBISON_ARCHIVE=win_flex_bison-%WINFLEXBISON_VERSION%.zip
if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://github.com/lexxmark/winflexbison/releases/download/v%WINFLEXBISON_VERSION%/%WINFLEXBISON_ARCHIVE%"
7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul
set Path=%CD%\winflexbison;%Path%
win_flex --version
win_bison --version
rem Download and extract LLVM
if not exist "%LLVM_ARCHIVE%" appveyor DownloadFile "https://people.freedesktop.org/~jrfonseca/llvm/%LLVM_ARCHIVE%"
7z x -y "%LLVM_ARCHIVE%" > nul
if "%buildsystem%" == "scons" (
mkdir llvm\bin
set LLVM=%CD%\llvm
) else (
move llvm subprojects\
copy .appveyor\llvm-wrap.meson subprojects\llvm\meson.build
)
goto :eof
:build_script
if "%buildsystem%" == "scons" (
call scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1
) else (
call "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools\VsDevCmd.bat" -arch=x86
rem We use default-library as static to affect any wraps (such as expat and zlib)
rem it would be better if we could set subprojects buildtype independently,
rem but I haven't written that patch yet :)
call meson builddir --backend=vs2017 --default-library=static -Dbuild-tests=true -Db_vscrt=mtd --buildtype=release -Dllvm=true -Dgallium-drivers=swrast -Dosmesa=gallium
pushd builddir
call msbuild mesa.sln /m
popd
)
goto :eof
:test_script
if "%buildsystem%" == "scons" (
call scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1 check
) else (
call meson test -C builddir
)
goto :eof

36
.appveyor/llvm-wrap.meson Normal file
View File

@@ -0,0 +1,36 @@
# A meson.build file for binary wrapping the LLVM used in the appvyeor CI
project('llvm', ['cpp'])
cpp = meson.get_compiler('cpp')
_deps = []
_search = join_paths(meson.current_source_dir(), 'lib')
foreach d : ['LLVMAnalysis', 'LLVMAsmParser', 'LLVMAsmPrinter',
'LLVMBinaryFormat', 'LLVMBitReader', 'LLVMBitWriter',
'LLVMCodeGen', 'LLVMCore', 'LLVMCoroutines', 'LLVMCoverage',
'LLVMDebugInfoCodeView', 'LLVMDebugInfoDWARF',
'LLVMDebugInfoMSF', 'LLVMDebugInfoPDB', 'LLVMDemangle',
'LLVMDlltoolDriver', 'LLVMExecutionEngine', 'LLVMGlobalISel',
'LLVMInstCombine', 'LLVMInstrumentation', 'LLVMInterpreter',
'LLVMipo', 'LLVMIRReader', 'LLVMLibDriver', 'LLVMLineEditor',
'LLVMLinker', 'LLVMLTO', 'LLVMMCDisassembler', 'LLVMMCJIT',
'LLVMMC', 'LLVMMCParser', 'LLVMMIRParser', 'LLVMObjCARCOpts',
'LLVMObject', 'LLVMObjectYAML', 'LLVMOption', 'LLVMOrcJIT',
'LLVMPasses', 'LLVMProfileData', 'LLVMRuntimeDyld',
'LLVMScalarOpts', 'LLVMSelectionDAG', 'LLVMSupport',
'LLVMSymbolize', 'LLVMTableGen', 'LLVMTarget',
'LLVMTransformUtils', 'LLVMVectorize', 'LLVMX86AsmParser',
'LLVMX86AsmPrinter', 'LLVMX86CodeGen', 'LLVMX86Desc',
'LLVMX86Disassembler', 'LLVMX86Info', 'LLVMX86Utils',
'LLVMXRay']
_deps += cpp.find_library(d, dirs : _search)
endforeach
dep_llvm = declare_dependency(
include_directories : include_directories('include'),
dependencies : _deps,
version : '5.0.1',
)
has_rtti = false
irbuilder_h = files('include/llvm/IR/IRBuilder.h')

View File

@@ -32,6 +32,10 @@ indent_size = 2
indent_style = space
indent_size = 2
[*.html]
indent_style = space
indent_size = 2
[*.patch]
trim_trailing_whitespace = false

4
.gitattributes vendored
View File

@@ -1,4 +0,0 @@
*.dsp -crlf
*.dsw -crlf
*.sln -crlf
*.vcproj -crlf

View File

@@ -1,5 +1,88 @@
# This is the tag of the docker image used for the build jobs. If the
# image doesn't exist yet, the containers-build stage generates it.
variables:
UPSTREAM_REPO: mesa/mesa
include:
- project: 'wayland/ci-templates'
# Must be the same as in .gitlab-ci/lava-gitlab-ci.yml
ref: 0a9bdd33a98f05af6761ab118b5074952242aab0
file: '/templates/debian.yml'
- local: '.gitlab-ci/lava-gitlab-ci.yml'
stages:
- container
- build
- test
- success
# When to automatically run the CI
.ci-run-policy:
rules:
# Run pipeline by default for merge requests changing files affecting it
- if: '$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME == $CI_COMMIT_REF_NAME'
changes: &paths
- VERSION
- bin/**/*
# GitLab CI
- .gitlab-ci.yml
- .gitlab-ci/**/*
# Meson
- meson*
- build-support/**/*
- subprojects/**/*
# SCons
- SConstruct
- scons/**/*
- common.py
# Source code
- include/**/*
- src/**/*
when: on_success
# Run pipeline by default in the main project if files affecting it were
# changed
- if: '$CI_PROJECT_PATH == "mesa/mesa"'
changes:
*paths
when: on_success
# Allow triggering jobs manually on branches of forked projects
- if: '$CI_PROJECT_PATH != "mesa/mesa" && $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME != $CI_COMMIT_REF_NAME'
when: manual
# Otherwise, most jobs won't run
- when: never
retry:
max: 2
when:
- runner_system_failure
# Cancel CI run if a newer commit is pushed to the same branch
interruptible: true
success:
stage: success
image: debian:stable-slim
only:
- merge_requests
except:
changes:
*paths
variables:
GIT_STRATEGY: none
script:
- echo "Dummy job to make sure every merge request pipeline runs at least one job"
.ci-deqp-artifacts:
artifacts:
when: always
untracked: false
paths:
# Watch out! Artifacts are relative to the build dir.
# https://gitlab.com/gitlab-org/gitlab-ce/commit/8788fb925706cad594adf6917a6c5f6587dd1521
- artifacts
# Build the CI docker images.
#
# DEBIAN_TAG is the tag of the docker image used by later stage jobs. If the
# image doesn't exist yet, the container stage job generates it.
#
# In order to generate a new image, one should generally change the tag.
# While removing the image from the registry would also work, that's not
@@ -12,63 +95,107 @@
# main repository, it's recommended to remove the image from the source
# repository's container registry, so that the image from the main
# repository's registry will be used there as well.
variables:
UPSTREAM_REPO: mesa/mesa
DEBIAN_TAG: "2019-08-09"
DEBIAN_VERSION: stretch-slim
DEBIAN_IMAGE: "$CI_REGISTRY_IMAGE/debian/$DEBIAN_VERSION:$DEBIAN_TAG"
include:
- project: 'wayland/ci-templates'
ref: c73dae8b84697ef18e2dbbf4fed7386d9652b0cd
file: '/templates/debian.yml'
stages:
- containers-build
- build+test
- test
# When to automatically run the CI
.ci-run-policy: &ci-run-policy
only:
- branches@mesa/mesa
- merge_requests
- /^ci([-/].*)?$/
retry:
max: 2
when:
- runner_system_failure
.ci-deqp-artifacts: &ci-deqp-artifacts
artifacts:
when: always
untracked: false
paths:
# Watch out! Artifacts are relative to the build dir.
# https://gitlab.com/gitlab-org/gitlab-ce/commit/8788fb925706cad594adf6917a6c5f6587dd1521
- artifacts
# CONTAINERS
debian:
extends: .debian@container-ifnot-exists
stage: containers-build
<<: *ci-run-policy
.container:
stage: container
extends:
- .ci-run-policy
variables:
GIT_STRATEGY: none # no need to pull the whole tree for rebuilding the image
DEBIAN_EXEC: 'bash .gitlab-ci/debian-install.sh'
DEBIAN_VERSION: buster-slim
REPO_SUFFIX: $CI_JOB_NAME
DEBIAN_EXEC: 'bash .gitlab-ci/container/${CI_JOB_NAME}.sh'
# no need to pull the whole repo to build the container image
GIT_STRATEGY: none
# Debian 10 based x86 build image
x86_build:
extends:
- .debian@container-ifnot-exists
- .container
variables:
DEBIAN_TAG: &x86_build "2020-01-14"
.use-x86_build:
variables:
TAG: *x86_build
image: "$CI_REGISTRY_IMAGE/debian/x86_build:$TAG"
needs:
- x86_build
# Debian 10 based x86 test image for GL
x86_test-gl:
extends: x86_build
variables:
DEBIAN_TAG: &x86_test-gl "2020-01-14"
# Debian 10 based x86 test image for VK
x86_test-vk:
extends: x86_build
variables:
DEBIAN_TAG: &x86_test-vk "2020-01-14"
# Can only be triggered manually on personal branches because RADV is the only
# driver that does Vulkan testing at the moment.
rules:
# Never build the test image for VK by default in the main project.
- if: '$CI_PROJECT_PATH == "mesa/mesa"'
when: never
# Never build the test image for VK by default for merge requests.
- if: '$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME == $CI_COMMIT_REF_NAME'
when: never
# Otherwise, allow building it manually for personal branches.
- when: manual
# Debian 9 based x86 build image (old LLVM)
x86_build_old:
extends: x86_build
variables:
DEBIAN_TAG: &x86_build_old "2019-09-18"
DEBIAN_VERSION: stretch-slim
.use-x86_build_old:
variables:
TAG: *x86_build_old
image: "$CI_REGISTRY_IMAGE/debian/x86_build_old:$TAG"
needs:
- x86_build_old
# Debian 10 based ARM build image
arm_build:
extends:
- .debian@container-ifnot-exists@arm64v8
- .container
variables:
DEBIAN_TAG: &arm_build "2020-04-06"
.use-arm_build:
variables:
TAG: *arm_build
image: "$CI_REGISTRY_IMAGE/debian/arm_build:$TAG"
needs:
- arm_build
# Debian 10 based ARM test image
arm_test:
extends: arm_build
variables:
DEBIAN_TAG: &arm_test "2019-12-18"
.use-arm_test:
variables:
TAG: *arm_test
image: "$CI_REGISTRY_IMAGE/debian/arm_test:$TAG"
needs:
- meson-arm64
- arm_test
# BUILD
.build:
<<: *ci-run-policy
image: $DEBIAN_IMAGE
stage: build+test
cache:
paths:
- ccache
# Shared between windows and Linux
.build-common:
extends: .ci-run-policy
stage: build
artifacts:
when: always
paths:
@@ -76,93 +203,68 @@ debian:
# scons:
- build/*/config.log
- shader-db
# Just Linux
.build-linux:
extends: .build-common
variables:
CCACHE_COMPILERCHECK: "content"
CCACHE_COMPRESS: "true"
CCACHE_DIR: /cache/mesa/ccache
# Use ccache transparently, and print stats before/after
before_script:
- export PATH="/usr/lib/ccache:$PATH"
- export CCACHE_BASEDIR="$PWD"
- export CCACHE_DIR="$PWD/ccache"
- ccache --zero-stats || true
- ccache --show-stats || true
- ccache --show-stats
after_script:
# In case the install dir is being saved as artifacts, tar it up
# so that symlinks and hardlinks aren't each packed separately in
# the zip file.
- if [ -d install ]; then
tar -cf artifacts/install.tar install;
fi
- export CCACHE_DIR="$PWD/ccache"
- ccache --show-stats
.build-windows:
extends: .build-common
tags:
- mesa-windows
cache:
key: ${CI_JOB_NAME}
paths:
- subprojects/packagecache
.meson-build:
extends: .build
extends:
- .build-linux
- .use-x86_build
variables:
LLVM_VERSION: 9
script:
- .gitlab-ci/meson-build.sh
.scons-build:
extends: .build
extends:
- .build-linux
- .use-x86_build
variables:
SCONSFLAGS: "-j4"
script:
- if test -n "$LLVM_VERSION"; then
export LLVM_CONFIG="llvm-config-${LLVM_VERSION}";
fi
- scons $SCONS_TARGET
- eval $SCONS_CHECK_COMMAND
- .gitlab-ci/scons-build.sh
# NOTE: Building SWR is 2x (yes two) times slower than all the other
# gallium drivers combined.
# Start this early so that it doesn't limit the total run time.
#
# We also stick the glvnd build here, since we want non-glvnd in
# meson-main for actual driver CI.
meson-swr-glvnd:
extends: .meson-build
meson-testing:
extends:
- .meson-build
- .ci-deqp-artifacts
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glvnd=true
-D glx=dri
-D gbm=true
-D egl=true
-D platforms=x11,drm,surfaceless
GALLIUM_ST: >
-D dri3=true
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=disabled
GALLIUM_DRIVERS: "swr,iris"
LLVM_VERSION: "6.0"
meson-clang:
extends: .meson-build
variables:
UNWIND: "true"
DRI_DRIVERS: "auto"
GALLIUM_DRIVERS: "auto"
VULKAN_DRIVERS: intel,amd,freedreno
CC: "ccache clang-8"
CXX: "ccache clang++-8"
before_script:
- export CCACHE_BASEDIR="$PWD" CCACHE_DIR="$PWD/ccache"
- ccache --zero-stats --show-stats || true
# clang++ breaks if it picks up the GCC 8 directory without libstdc++.so
- apt-get remove -y libgcc-8-dev
scons-swr:
extends: .scons-build
variables:
SCONS_TARGET: "swr=1"
SCONS_CHECK_COMMAND: "true"
LLVM_VERSION: "6.0"
scons-win64:
extends: .scons-build
variables:
SCONS_TARGET: platform=windows machine=x86_64
SCONS_CHECK_COMMAND: "true"
GALLIUM_DRIVERS: "swrast"
VULKAN_DRIVERS: amd
BUILDTYPE: "debugoptimized"
script:
- .gitlab-ci/meson-build.sh
- .gitlab-ci/prepare-artifacts.sh
meson-main:
extends: .meson-build
@@ -184,14 +286,116 @@ meson-main:
-D gallium-xa=true
-D gallium-nine=true
-D gallium-opencl=disabled
GALLIUM_DRIVERS: "iris,nouveau,kmsro,r300,r600,freedreno,swrast,svga,v3d,vc4,virgl,etnaviv,panfrost,lima"
LLVM_VERSION: "7"
GALLIUM_DRIVERS: "iris,nouveau,kmsro,r300,r600,freedreno,swr,swrast,svga,v3d,vc4,virgl,etnaviv,panfrost,lima,zink"
EXTRA_OPTION: >
-D osmesa=gallium
-D tools=all
MESON_SHADERDB: "true"
script:
- .gitlab-ci/meson-build.sh
- .gitlab-ci/run-shader-db.sh
.meson-cross:
extends:
- .meson-build
variables:
UNWIND: "false"
DRI_LOADERS: >
-D glx=disabled
-D gbm=false
-D egl=true
-D platforms=surfaceless
-D osmesa=none
GALLIUM_ST: >
-D dri3=false
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
.meson-arm:
extends:
- .meson-cross
- .use-arm_build
variables:
VULKAN_DRIVERS: freedreno
GALLIUM_DRIVERS: "etnaviv,freedreno,kmsro,lima,nouveau,panfrost,swrast,tegra,v3d,vc4"
BUILDTYPE: "debugoptimized"
<<: *ci-deqp-artifacts
EXTRA_OPTION: >
-D I-love-half-baked-turnips=true
tags:
- aarch64
meson-armhf:
extends:
- .meson-arm
- .ci-deqp-artifacts
variables:
CROSS: armhf
LLVM_VERSION: "7"
EXTRA_OPTION: >
-D llvm=false
script:
- .gitlab-ci/meson-build.sh
- .gitlab-ci/prepare-artifacts.sh
meson-arm64:
extends:
- .meson-arm
- .ci-deqp-artifacts
variables:
LLVM_VERSION: "8"
VULKAN_DRIVERS: "freedreno"
EXTRA_OPTION: >
-D llvm=false
script:
- .gitlab-ci/meson-build.sh
- .gitlab-ci/prepare-artifacts.sh
meson-arm64-build-test:
extends:
- .meson-arm
- .ci-deqp-artifacts
variables:
LLVM_VERSION: "8"
VULKAN_DRIVERS: "amd"
script:
- .gitlab-ci/meson-build.sh
meson-clang:
extends: .meson-build
variables:
UNWIND: "true"
DRI_LOADERS: >
-D glvnd=true
DRI_DRIVERS: "auto"
GALLIUM_DRIVERS: "auto"
VULKAN_DRIVERS: intel,amd,freedreno
CC: "ccache clang-9"
CXX: "ccache clang++-9"
.meson-windows:
extends:
- .build-windows
before_script:
- $ENV:ARCH = "x86"
- $ENV:VERSION = "2019\Community"
script:
- cmd /C .gitlab-ci\meson-build.bat
scons-swr:
extends: .scons-build
variables:
SCONS_TARGET: "swr=1"
SCONS_CHECK_COMMAND: "true"
LLVM_VERSION: "6.0"
scons-win64:
extends: .scons-build
variables:
SCONS_TARGET: platform=windows machine=x86_64
SCONS_CHECK_COMMAND: "true"
meson-clover:
extends: .meson-build
@@ -213,12 +417,27 @@ meson-clover:
script:
- export GALLIUM_DRIVERS="r600,radeonsi"
- .gitlab-ci/meson-build.sh
- LLVM_VERSION=7 .gitlab-ci/meson-build.sh
- LLVM_VERSION=8 .gitlab-ci/meson-build.sh
- export GALLIUM_DRIVERS="i915,r600"
- LLVM_VERSION=6.0 .gitlab-ci/meson-build.sh
- LLVM_VERSION=7 .gitlab-ci/meson-build.sh
meson-clover-old-llvm:
extends:
- meson-clover
- .use-x86_build_old
variables:
UNWIND: "false"
DRI_LOADERS: >
-D glx=disabled
-D egl=false
-D gbm=false
-D platforms=drm,surfaceless
GALLIUM_DRIVERS: "i915,r600"
script:
- LLVM_VERSION=3.9 .gitlab-ci/meson-build.sh
- LLVM_VERSION=4.0 .gitlab-ci/meson-build.sh
- LLVM_VERSION=5.0 .gitlab-ci/meson-build.sh
- LLVM_VERSION=6.0 .gitlab-ci/meson-build.sh
meson-vulkan:
extends: .meson-build
@@ -239,58 +458,14 @@ meson-vulkan:
-D gallium-xa=false
-D gallium-nine=false
-D gallium-opencl=disabled
-D b_sanitize=undefined
-D c_args=-fno-sanitize-recover=all
-D cpp_args=-fno-sanitize-recover=all
UBSAN_OPTIONS: "print_stacktrace=1"
VULKAN_DRIVERS: intel,amd,freedreno
LLVM_VERSION: "7"
EXTRA_OPTION: >
-D vulkan-overlay-layer=true
.meson-cross:
extends: .meson-build
variables:
UNWIND: "false"
DRI_LOADERS: >
-D glx=disabled
-D gbm=false
-D egl=false
-D platforms=surfaceless
-D osmesa=none
GALLIUM_ST: >
-D dri3=false
-D gallium-vdpau=false
-D gallium-xvmc=false
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=false
-D llvm=false
<<: *ci-deqp-artifacts
script:
- .gitlab-ci/meson-build.sh
meson-armhf:
extends: .meson-cross
variables:
CROSS: armhf
VULKAN_DRIVERS: freedreno
GALLIUM_DRIVERS: "etnaviv,freedreno,kmsro,lima,nouveau,panfrost,tegra,v3d,vc4"
# Disable the tests since we're cross compiling.
EXTRA_OPTION: >
-D build-tests=false
-D I-love-half-baked-turnips=true
-D vulkan-overlay-layer=true
meson-arm64:
extends: .meson-cross
variables:
CROSS: arm64
VULKAN_DRIVERS: freedreno
GALLIUM_DRIVERS: "etnaviv,freedreno,kmsro,lima,nouveau,panfrost,tegra,v3d,vc4"
# Disable the tests since we're cross compiling.
EXTRA_OPTION: >
-D build-tests=false
-D I-love-half-baked-turnips=true
-D vulkan-overlay-layer=true
# While the main point of this build is testing the i386 cross build,
# we also use this one to test some other options that are exclusive
# with meson-main's choices (classic swrast and osmesa)
@@ -301,82 +476,229 @@ meson-i386:
VULKAN_DRIVERS: intel
DRI_DRIVERS: "swrast"
GALLIUM_DRIVERS: "iris"
# Disable i386 tests, because u_format_tests gets precision
# failures in dxtn unpacking
EXTRA_OPTION: >
-D build-tests=false
-D vulkan-overlay-layer=true
-D llvm=false
-D osmesa=classic
-D werror=true
scons-nollvm:
extends: .scons-build
meson-mingw32-x86_64:
extends: .meson-build
variables:
SCONS_TARGET: "llvm=0"
SCONS_CHECK_COMMAND: "scons llvm=0 check"
UNWIND: "false"
DRI_DRIVERS: ""
GALLIUM_DRIVERS: "swrast"
EXTRA_OPTION: >
-Dllvm=false
-Dosmesa=gallium
--cross-file=.gitlab-ci/x86_64-w64-mingw32
scons-llvm:
scons:
extends: .scons-build
variables:
SCONS_TARGET: "llvm=1"
SCONS_CHECK_COMMAND: "scons llvm=1 check"
LLVM_VERSION: "3.4"
# LLVM 3.4 packages were built with an old libstdc++ ABI
CXX: "g++ -D_GLIBCXX_USE_CXX11_ABI=0"
SCONS_CHECK_COMMAND: "scons llvm=1 force_scons=1 check"
script:
- SCONS_TARGET="" SCONS_CHECK_COMMAND="scons check force_scons=1" .gitlab-ci/scons-build.sh
- LLVM_VERSION=9 .gitlab-ci/scons-build.sh
.deqp-test:
<<: *ci-run-policy
scons-old-llvm:
extends:
- scons
- .use-x86_build_old
script:
- LLVM_VERSION=3.9 .gitlab-ci/scons-build.sh
.test:
extends:
- .ci-run-policy
stage: test
image: $DEBIAN_IMAGE
variables:
GIT_STRATEGY: none # testing doesn't build anything from source
DEQP_SKIPS: deqp-default-skips.txt
script:
before_script:
# Note: Build dir (and thus install) may be dirty due to GIT_STRATEGY
- rm -rf install
- tar -xf artifacts/install.tar
- ./artifacts/deqp-runner.sh
- LD_LIBRARY_PATH=install/lib find install/lib -name "*.so" -print -exec ldd {} \;
artifacts:
when: always
name: "$CI_JOB_NAME-$CI_COMMIT_REF_NAME"
paths:
- results/
dependencies:
- meson-testing
.test-gl:
extends:
- .test
variables:
TAG: *x86_test-gl
image: "$CI_REGISTRY_IMAGE/debian/x86_test-gl:$TAG"
needs:
- meson-testing
- x86_test-gl
.test-vk:
extends:
- .test
variables:
TAG: *x86_test-vk
image: "$CI_REGISTRY_IMAGE/debian/x86_test-vk:$TAG"
needs:
- meson-testing
- x86_test-vk
.piglit-test:
extends: .test-gl
artifacts:
when: on_failure
name: "$CI_JOB_NAME-$CI_COMMIT_REF_NAME"
paths:
- results/
- summary/
variables:
LIBGL_ALWAYS_SOFTWARE: 1
PIGLIT_NO_WINDOW: 1
script:
- install/piglit/run.sh
piglit-quick_gl:
extends: .piglit-test
variables:
LP_NUM_THREADS: 0
NIR_VALIDATE: 0
PIGLIT_OPTIONS: >
--process-isolation false
-x arb_gpu_shader5
-x egl_ext_device_
-x egl_ext_platform_device
-x ext_timer_query@time-elapsed
-x glx-multithread-clearbuffer
-x glx-multithread-shader-compile
-x max-texture-size
-x maxsize
PIGLIT_PROFILES: quick_gl
piglit-glslparser:
extends: .piglit-test
variables:
LP_NUM_THREADS: 0
NIR_VALIDATE: 0
PIGLIT_PROFILES: glslparser
piglit-quick_shader:
extends: .piglit-test
variables:
LP_NUM_THREADS: 1
NIR_VALIDATE: 0
PIGLIT_PROFILES: quick_shader
.deqp-test:
variables:
DEQP_SKIPS: deqp-default-skips.txt
script:
- ./install/deqp-runner.sh
.deqp-test-gl:
extends:
- .test-gl
- .deqp-test
.deqp-test-vk:
extends:
- .test-vk
- .deqp-test
variables:
DEQP_VER: vk
test-llvmpipe-gles2:
parallel: 4
variables:
DEQP_VER: gles2
DEQP_PARALLEL: 4
NIR_VALIDATE: 0
# Don't use threads inside llvmpipe, we've already got all 4 cores
# busy with DEQP_PARALLEL.
LP_NUM_THREADS: 0
DEQP_EXPECTED_FAILS: deqp-llvmpipe-fails.txt
LIBGL_ALWAYS_SOFTWARE: "true"
DEQP_RENDERER_MATCH: "llvmpipe"
extends: .deqp-test
dependencies:
- meson-main
extends: .deqp-test-gl
test-softpipe-gles2:
parallel: 4
extends: test-llvmpipe-gles2
variables:
DEQP_VER: gles2
DEQP_EXPECTED_FAILS: deqp-softpipe-fails.txt
LIBGL_ALWAYS_SOFTWARE: "true"
DEQP_RENDERER_MATCH: "softpipe"
DEQP_SKIPS: deqp-softpipe-skips.txt
GALLIUM_DRIVER: "softpipe"
extends: .deqp-test
dependencies:
- meson-main
# The GLES2 CTS run takes about 8 minutes of CPU time, while GLES3 is
# 25 minutes. Until we can get its runtime down, just do a partial
# (every 10 tests) run.
test-softpipe-gles3-limited:
test-softpipe-gles3:
parallel: 2
variables:
DEQP_VER: gles3
DEQP_EXPECTED_FAILS: deqp-softpipe-fails.txt
LIBGL_ALWAYS_SOFTWARE: "true"
DEQP_RENDERER_MATCH: "softpipe"
GALLIUM_DRIVER: "softpipe"
CI_NODE_INDEX: 1
CI_NODE_TOTAL: 10
extends: .deqp-test
extends: test-softpipe-gles2
test-softpipe-gles31:
parallel: 4
variables:
DEQP_VER: gles31
extends: test-softpipe-gles2
arm64_a630_gles2:
extends:
- .deqp-test-gl
- .use-arm_test
variables:
DEQP_VER: gles2
DEQP_EXPECTED_FAILS: deqp-freedreno-a630-fails.txt
DEQP_SKIPS: deqp-freedreno-a630-skips.txt
NIR_VALIDATE: 0
DEQP_PARALLEL: 4
FLAKES_CHANNEL: "#freedreno-ci"
tags:
- mesa-cheza
dependencies:
- meson-main
- meson-arm64
arm64_a630_gles31:
extends: arm64_a630_gles2
variables:
DEQP_VER: gles31
arm64_a630_gles3:
extends: arm64_a630_gles2
variables:
DEQP_VER: gles3
arm64_a306_gles2:
extends: arm64_a630_gles2
variables:
DEQP_EXPECTED_FAILS: deqp-freedreno-a307-fails.txt
DEQP_SKIPS: deqp-default-skips.txt
tags:
- db410c
# RADV CI
.test-radv:
variables:
VK_DRIVER: radeon
RADV_DEBUG: checkir
# Can only be triggered manually on personal branches because RADV is the only
# driver that does Vulkan testing at the moment.
rules:
# Never test RADV by default in the main project.
- if: '$CI_PROJECT_PATH == "mesa/mesa"'
when: never
# Never test RADV by default for merge requests.
- if: '$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME == $CI_COMMIT_REF_NAME'
when: never
# Otherwise, allow testing RADV if the test image for VK has been manually
# started.
- when: on_success
radv_polaris10_vkcts:
extends:
- .deqp-test-vk
- .test-radv
variables:
DEQP_PARALLEL: 4
DEQP_SKIPS: deqp-radv-polaris10-skips.txt
tags:
- polaris10

122
.gitlab-ci/README.md Normal file
View File

@@ -0,0 +1,122 @@
## Mesa testing using gitlab-runner
The goal of the "test" stage of the .gitlab-ci.yml is to do pre-merge
testing of Mesa drivers on various platforms, so that we can ensure no
regressions are merged, as long as developers are merging code using
the "Merge when pipeline completes" button.
This document only covers the CI from .gitlab-ci.yml and this
directory. For other CI systems, see Intel's [Mesa
CI](https://gitlab.freedesktop.org/Mesa_CI) or panfrost's LAVA-based
CI (`src/gallium/drivers/panfrost/ci/`)
### Software architecture
For freedreno and llvmpipe CI, we're using gitlab-runner on the test
devices (DUTs), cached docker containers with VK-GL-CTS, and the
normal shared x86_64 runners to build the Mesa drivers to be run
inside of those containers on the DUTs.
The docker containers are rebuilt from the debian-install.sh script
when DEBIAN\_TAG is changed in .gitlab-ci.yml, and
debian-test-install.sh when DEBIAN\_ARM64\_TAG is changed in
.gitlab-ci.yml. The resulting images are around 500MB, and are
expected to change approximately weekly (though an individual
developer working on them may produce many more images while trying to
come up with a working MR!).
gitlab-runner is a client that polls gitlab.freedesktop.org for
available jobs, with no inbound networking requirements. Jobs can
have tags, so we can have DUT-specific jobs that only run on runners
with that tag marked in the gitlab UI.
Since dEQP takes a long time to run, we mark the job as "parallel" at
some level, which spawns multiple jobs from one definition, and then
deqp-runner.sh takes the corresponding fraction of the test list for
that job.
To reduce dEQP runtime (or avoid tests with unreliable results), a
deqp-runner.sh invocation can provide a list of tests to skip. If
your driver is not yet conformant, you can pass a list of expected
failures, and the job will only fail on tests that aren't listed (look
at the job's log for which specific tests failed).
### DUT requirements
#### DUTs must have a stable kernel and GPU reset.
If the system goes down during a test run, that job will eventually
time out and fail (default 1 hour). However, if the kernel can't
reliably reset the GPU on failure, bugs in one MR may leak into
spurious failures in another MR. This would be an unacceptable impact
on Mesa developers working on other drivers.
#### DUTs must be able to run docker
The Mesa gitlab-runner based test architecture is built around docker,
so that we can cache the debian package installation and CTS build
step across multiple test runs. Since the images are large and change
approximately weekly, the DUTs also need to be running some script to
prune stale docker images periodically in order to not run out of disk
space as we rev those containers (perhaps [this
script](https://gitlab.com/gitlab-org/gitlab-runner/issues/2980#note_169233611)).
Note that docker doesn't allow containers to be stored on NFS, and
doesn't allow multiple docker daemons to interact with the same
network block device, so you will probably need some sort of physical
storage on your DUTs.
#### DUTs must be public
By including your device in .gitlab-ci.yml, you're effectively letting
anyone on the internet run code on your device. docker containers may
provide some limited protection, but how much you trust that and what
you do to mitigate hostile access is up to you.
#### DUTs must expose the dri device nodes to the containers.
Obviously, to get access to the HW, we need to pass the render node
through. This is done by adding `devices = ["/dev/dri"]` to the
`runners.docker` section of /etc/gitlab-runner/config.toml.
### HW CI farm expectations
To make sure that testing of one vendor's drivers doesn't block
unrelated work by other vendors, we require that a given driver's test
farm produces a spurious failure no more than once a week. If every
driver had CI and failed once a week, we would be seeing someone's
code getting blocked on a spurious failure daily, which is an
unacceptable cost to the project.
Additionally, the test farm needs to be able to provide a short enough
turnaround time that people can regularly use the "Merge when pipeline
succeeds" button successfully (until we get
[marge-bot](https://github.com/smarkets/marge-bot) in place on
freedesktop.org). As a result, we require that the test farm be able
to handle a whole pipeline's worth of jobs in less than 5 minutes (to
compare, the build stage is about 10 minutes, if you could get all
your jobs scheduled on the shared runners in time.).
If a test farm is short the HW to provide these guarantees, consider
dropping tests to reduce runtime.
`VK-GL-CTS/scripts/log/bottleneck_report.py` can help you find what
tests were slow in a `results.qpa` file. Or, you can have a job with
no `parallel` field set and:
```
variables:
CI_NODE_INDEX: 1
CI_NODE_TOTAL: 10
```
to just run 1/10th of the test list.
If a HW CI farm goes offline (network dies and all CI pipelines end up
stalled) or its runners are consistenly spuriously failing (disk
full?), and the maintainer is not immediately available to fix the
issue, please push through an MR disabling that farm's jobs by adding
'.' to the front of the jobs names until the maintainer can bring
things back up. If this happens, the farm maintainer should provide a
report to mesa-dev@lists.freedesktop.org after the fact explaining
what happened and what the mitigation plan is for that failure next
time.

View File

@@ -11,6 +11,7 @@ CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=y
CONFIG_DRM=y
CONFIG_DRM_ROCKCHIP=y
CONFIG_DRM_PANFROST=y
CONFIG_DRM_LIMA=y
CONFIG_DRM_PANEL_SIMPLE=y
CONFIG_PWM_CROS_EC=y
CONFIG_BACKLIGHT_PWM=y

View File

@@ -10,6 +10,7 @@ CONFIG_DEVFREQ_GOV_PASSIVE=y
CONFIG_DRM=y
CONFIG_DRM_ROCKCHIP=y
CONFIG_DRM_PANFROST=y
CONFIG_DRM_LIMA=y
CONFIG_DRM_PANEL_SIMPLE=y
CONFIG_PWM_CROS_EC=y
CONFIG_BACKLIGHT_PWM=y
@@ -25,7 +26,6 @@ CONFIG_TYPEC_FUSB302=y
CONFIG_TYPEC=y
CONFIG_TYPEC_TCPM=y
CONFIG_ARCH_SUNXI=n
CONFIG_ARCH_ALPINE=n
CONFIG_ARCH_BCM2835=n
CONFIG_ARCH_BCM_IPROC=n
@@ -37,7 +37,6 @@ CONFIG_ARCH_LAYERSCAPE=n
CONFIG_ARCH_LG1K=n
CONFIG_ARCH_HISI=n
CONFIG_ARCH_MEDIATEK=n
CONFIG_ARCH_MESON=n
CONFIG_ARCH_MVEBU=n
CONFIG_ARCH_QCOM=n
CONFIG_ARCH_SEATTLE=n
@@ -78,5 +77,7 @@ CONFIG_TMPFS=y
CONFIG_PROVE_LOCKING=n
CONFIG_DEBUG_LOCKDEP=n
CONFIG_SOFTLOCKUP_DETECTOR=n
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=n
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
CONFIG_DETECT_HUNG_TASK=y

View File

@@ -0,0 +1,10 @@
#!/bin/bash
set -ex
git clone https://gitlab.freedesktop.org/mesa/parallel-deqp-runner.git --depth 1 -b mesa-ci-2019-12-17
cd parallel-deqp-runner
meson build/ $EXTRA_MESON_ARGS
ninja -C build -j4 install
cd ..
rm -rf parallel-deqp-runner

View File

@@ -0,0 +1,61 @@
git config --global user.email "mesa@example.com"
git config --global user.name "Mesa CI"
# XXX: Use --depth 1 once we can drop the cherry-picks.
git clone \
https://github.com/KhronosGroup/VK-GL-CTS.git \
-b opengl-es-cts-3.2.5.1 \
/VK-GL-CTS
pushd /VK-GL-CTS
# Fix surfaceless build
git cherry-pick -x 22f41e5e321c6dcd8569c4dad91bce89f06b3670
git cherry-pick -x 1daa8dff73161ea60ead965bd6c9f2a0a2165648
# surfaceless links against libkms and such despite not using it.
sed -i '/gbm/d' targets/surfaceless/surfaceless.cmake
sed -i '/libkms/d' targets/surfaceless/surfaceless.cmake
sed -i '/libgbm/d' targets/surfaceless/surfaceless.cmake
# --insecure is due to SSL cert failures hitting sourceforge for zlib and
# libpng (sigh). The archives get their checksums checked anyway, and git
# always goes through ssh or https.
python3 external/fetch_sources.py --insecure
mkdir -p /deqp
# Save the testlog stylesheets:
cp doc/testlog-stylesheet/testlog.{css,xsl} /deqp
popd
pushd /deqp
cmake -G Ninja \
-DDEQP_TARGET=surfaceless \
-DCMAKE_BUILD_TYPE=Release \
$EXTRA_CMAKE_ARGS \
/VK-GL-CTS
ninja
# Copy out the mustpass lists we want from a bunch of other junk.
mkdir /deqp/mustpass
for gles in gles2 gles3 gles31; do
cp \
/deqp/external/openglcts/modules/gl_cts/data/mustpass/gles/aosp_mustpass/3.2.5.x/$gles-master.txt \
/deqp/mustpass/$gles-master.txt
done
# Save *some* executor utils, but otherwise strip things down
# to reduct deqp build size:
mkdir /deqp/executor.save
cp /deqp/executor/testlog-to-* /deqp/executor.save
rm -rf /deqp/executor
mv /deqp/executor.save /deqp/executor
rm -rf /deqp/external
rm -rf /deqp/modules/internal
rm -rf /deqp/execserver
rm -rf /deqp/modules/egl
rm -rf /deqp/framework
find -iname '*cmake*' -o -name '*ninja*' -o -name '*.o' -o -name '*.a' | xargs rm -rf
${STRIP_CMD:-strip} modules/*/deqp-*
du -sh *
rm -rf /VK-GL-CTS
popd

View File

@@ -0,0 +1,33 @@
git clone --depth 1 \
https://github.com/KhronosGroup/VK-GL-CTS.git \
-b vulkan-cts-1.1.6.0 \
/VK-GL-CTS
cd /VK-GL-CTS
# --insecure is due to SSL cert failures hitting sourceforge for zlib and
# libpng (sigh). The archives get their checksums checked anyway, and git
# always goes through ssh or https.
python3 external/fetch_sources.py --insecure
mkdir -p /deqp
cd /deqp
cmake -G Ninja \
-DDEQP_TARGET=x11_glx \
-DCMAKE_BUILD_TYPE=Release \
/VK-GL-CTS
ninja -j4
# Copy out the mustpass list we want.
mkdir /deqp/mustpass
cp /VK-GL-CTS/external/vulkancts/mustpass/master/vk-default.txt \
/deqp/mustpass/vk-master.txt
rm -rf /deqp/modules/internal
rm -rf /deqp/executor
rm -rf /deqp/execserver
rm -rf /deqp/modules/egl
rm -rf /deqp/framework
find -iname '*cmake*' -o -name '*ninja*' -o -name '*.o' -o -name '*.a' | xargs rm -rf
strip external/vulkancts/modules/vulkan/deqp-vk
du -sh *
rm -rf /VK-GL-CTS

View File

@@ -0,0 +1,13 @@
#!/bin/bash
set -ex
git clone https://gitlab.freedesktop.org/mesa/piglit.git --single-branch --no-checkout /piglit
pushd /piglit
git checkout 8771c3860505db2bcf4877216221d774bf90af6b
patch -p1 <$OLDPWD/.gitlab-ci/piglit/disable-vs_in.diff
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release
ninja -j4
find -name .git -o -name '*ninja*' -o -iname '*cmake*' -o -name '*.[chao]' | xargs rm -rf
rm -rf target_api
popd

View File

@@ -0,0 +1,74 @@
#!/bin/bash
set -e
set -o xtrace
############### Install packages for building
apt-get -y install ca-certificates
sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list
echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list
dpkg --add-architecture armhf
apt-get update
apt-get -y install \
bc \
bison \
ccache \
cmake \
cpio \
crossbuild-essential-armhf \
debootstrap \
flex \
g++ \
gettext \
git \
lavacli \
libdrm-dev:armhf \
libegl1-mesa-dev \
libegl1-mesa-dev:armhf \
libelf-dev \
libelf-dev:armhf \
libexpat1-dev \
libexpat1-dev:armhf \
libgles2-mesa-dev \
libgles2-mesa-dev:armhf \
libpng-dev \
libpng-dev:armhf \
libssl-dev \
libvulkan-dev \
libvulkan-dev:armhf \
llvm-7-dev:armhf \
llvm-8-dev \
meson \
pkg-config \
python \
python3-mako \
unzip \
wget \
zlib1g-dev
# dependencies where we want a specific version
export LIBDRM_VERSION=libdrm-2.4.100
wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2
cd $LIBDRM_VERSION; meson build -D vc4=true -D freedreno=true -D etnaviv=true; ninja -j4 -C build install; cd ..
rm -rf $LIBDRM_VERSION
############### Generate cross build file for Meson
cross_file="/cross_file-armhf.txt"
/usr/share/meson/debcrossgen --arch armhf -o "$cross_file"
# Explicitly set ccache path for cross compilers
sed -i "s|/usr/bin/\([^-]*\)-linux-gnu\([^-]*\)-g|/usr/lib/ccache/\\1-linux-gnu\\2-g|g" "$cross_file"
# Don't need wrapper for armhf executables
sed -i -e '/\[properties\]/a\' -e "needs_exe_wrapper = False" "$cross_file"
############### Generate kernel, ramdisk, test suites, etc for LAVA jobs
DEBIAN_ARCH=arm64 . .gitlab-ci/container/lava_arm.sh
DEBIAN_ARCH=armhf . .gitlab-ci/container/lava_arm.sh
apt-get purge -y \
wget
apt-get autoremove -y --purge

View File

@@ -0,0 +1,64 @@
#!/bin/bash
set -e
set -o xtrace
############### Install packages for building
apt-get -y install ca-certificates
sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list
echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list
apt-get update
apt-get -y install \
bzip2 \
cmake \
g++ \
gcc \
git \
libc6-dev \
libdrm-nouveau2 \
libexpat1 \
libgbm-dev \
libgbm-dev \
libgles2-mesa-dev \
libllvm8 \
libpng16-16 \
libpng-dev \
libvulkan-dev \
libvulkan1 \
meson \
netcat \
pkg-config \
procps \
python \
waffle-utils \
wget \
zlib1g
############### Build dEQP runner
. .gitlab-ci/build-cts-runner.sh
############### Build dEQP GL
. .gitlab-ci/build-deqp-gl.sh
############### Uninstall the build software
apt-get purge -y \
bzip2 \
cmake \
g++ \
gcc \
git \
libc6-dev \
libgbm-dev \
libgles2-mesa-dev \
libpng-dev \
libvulkan-dev \
meson \
pkg-config \
python \
wget
apt-get autoremove -y --purge

View File

@@ -0,0 +1,63 @@
#!/bin/bash
set -e
set -o xtrace
if [[ "$DEBIAN_ARCH" = "arm64" ]]; then
GCC_ARCH="aarch64-linux-gnu"
KERNEL_ARCH="arm64"
DEFCONFIG="arch/arm64/configs/defconfig"
DEVICE_TREES="arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dtb arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dtb arch/arm64/boot/dts/allwinner/sun50i-h6-pine-h64.dtb arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dtb"
KERNEL_IMAGE_NAME="Image"
else
GCC_ARCH="arm-linux-gnueabihf"
KERNEL_ARCH="arm"
DEFCONFIG="arch/arm/configs/multi_v7_defconfig"
DEVICE_TREES="arch/arm/boot/dts/rk3288-veyron-jaq.dtb arch/arm/boot/dts/sun8i-h3-libretech-all-h3-cc.dtb"
KERNEL_IMAGE_NAME="zImage"
fi
############### Build dEQP runner
if [[ "$DEBIAN_ARCH" = "armhf" ]]; then
EXTRA_MESON_ARGS="--cross-file /cross_file-armhf.txt"
fi
. .gitlab-ci/build-cts-runner.sh
mkdir -p /lava-files/rootfs-${DEBIAN_ARCH}/usr/bin
mv /usr/local/bin/deqp-runner /lava-files/rootfs-${DEBIAN_ARCH}/usr/bin/.
############### Build dEQP
EXTRA_CMAKE_ARGS="-DCMAKE_C_COMPILER=${GCC_ARCH}-gcc -DCMAKE_CXX_COMPILER=${GCC_ARCH}-g++"
STRIP_CMD="${GCC_ARCH}-strip"
. .gitlab-ci/build-deqp-gl.sh
mv /deqp /lava-files/rootfs-${DEBIAN_ARCH}/.
############### Cross-build kernel
KERNEL_URL="https://gitlab.freedesktop.org/tomeu/linux/-/archive/v5.5-rc5-panfrost-fixes/linux-v5.5-rc5-panfrost-fixes.tar.gz"
if [[ "$DEBIAN_ARCH" = "armhf" ]]; then
export ARCH=${KERNEL_ARCH}
export CROSS_COMPILE="${GCC_ARCH}-"
fi
mkdir -p kernel
wget -qO- ${KERNEL_URL} | tar -xz --strip-components=1 -C kernel
pushd kernel
./scripts/kconfig/merge_config.sh ${DEFCONFIG} ../.gitlab-ci/${KERNEL_ARCH}.config
make -j12 ${KERNEL_IMAGE_NAME} dtbs
cp arch/${KERNEL_ARCH}/boot/${KERNEL_IMAGE_NAME} /lava-files/.
cp ${DEVICE_TREES} /lava-files/.
popd
rm -rf kernel
############### Create rootfs
set +e
debootstrap --variant=minbase --arch=${DEBIAN_ARCH} testing /lava-files/rootfs-${DEBIAN_ARCH}/ http://deb.debian.org/debian
cat /lava-files/rootfs-${DEBIAN_ARCH}/debootstrap/debootstrap.log
set -e
cp .gitlab-ci/create-rootfs.sh /lava-files/rootfs-${DEBIAN_ARCH}/.
chroot /lava-files/rootfs-${DEBIAN_ARCH} sh /create-rootfs.sh
rm /lava-files/rootfs-${DEBIAN_ARCH}/create-rootfs.sh

View File

@@ -0,0 +1,52 @@
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.4.12 (GNU/Linux)
mQINBFE9lCwBEADi0WUAApM/mgHJRU8lVkkw0CHsZNpqaQDNaHefD6Rw3S4LxNmM
EZaOTkhP200XZM8lVdbfUW9xSjA3oPldc1HG26NjbqqCmWpdo2fb+r7VmU2dq3NM
R18ZlKixiLDE6OUfaXWKamZsXb6ITTYmgTO6orQWYrnW6ckYHSeaAkW0wkDAryl2
B5v8aoFnQ1rFiVEMo4NGzw4UX+MelF7rxaaregmKVTPiqCOSPJ1McC1dHFN533FY
Wh/RVLKWo6npu+owtwYFQW+zyQhKzSIMvNujFRzhIxzxR9Gn87MoLAyfgKEzrbbT
DhqqNXTxS4UMUKCQaO93TzetX/EBrRpJj+vP640yio80h4Dr5pAd7+LnKwgpTDk1
G88bBXJAcPZnTSKu9I2c6KY4iRNbvRz4i+ZdwwZtdW4nSdl2792L7Sl7Nc44uLL/
ZqkKDXEBF6lsX5XpABwyK89S/SbHOytXv9o4puv+65Ac5/UShspQTMSKGZgvDauU
cs8kE1U9dPOqVNCYq9Nfwinkf6RxV1k1+gwtclxQuY7UpKXP0hNAXjAiA5KS5Crq
7aaJg9q2F4bub0mNU6n7UI6vXguF2n4SEtzPRk6RP+4TiT3bZUsmr+1ktogyOJCc
Ha8G5VdL+NBIYQthOcieYCBnTeIH7D3Sp6FYQTYtVbKFzmMK+36ERreL/wARAQAB
tD1TeWx2ZXN0cmUgTGVkcnUgLSBEZWJpYW4gTExWTSBwYWNrYWdlcyA8c3lsdmVz
dHJlQGRlYmlhbi5vcmc+iQI4BBMBAgAiBQJRPZQsAhsDBgsJCAcDAgYVCAIJCgsE
FgIDAQIeAQIXgAAKCRAVz00Yr090Ibx+EADArS/hvkDF8juWMXxh17CgR0WZlHCC
9CTBWkg5a0bNN/3bb97cPQt/vIKWjQtkQpav6/5JTVCSx2riL4FHYhH0iuo4iAPR
udC7Cvg8g7bSPrKO6tenQZNvQm+tUmBHgFiMBJi92AjZ/Qn1Shg7p9ITivFxpLyX
wpmnF1OKyI2Kof2rm4BFwfSWuf8Fvh7kDMRLHv+MlnK/7j/BNpKdozXxLcwoFBmn
l0WjpAH3OFF7Pvm1LJdf1DjWKH0Dc3sc6zxtmBR/KHHg6kK4BGQNnFKujcP7TVdv
gMYv84kun14pnwjZcqOtN3UJtcx22880DOQzinoMs3Q4w4o05oIF+sSgHViFpc3W
R0v+RllnH05vKZo+LDzc83DQVrdwliV12eHxrMQ8UYg88zCbF/cHHnlzZWAJgftg
hB08v1BKPgYRUzwJ6VdVqXYcZWEaUJmQAPuAALyZESw94hSo28FAn0/gzEc5uOYx
K+xG/lFwgAGYNb3uGM5m0P6LVTfdg6vDwwOeTNIExVk3KVFXeSQef2ZMkhwA7wya
KJptkb62wBHFE+o9TUdtMCY6qONxMMdwioRE5BYNwAsS1PnRD2+jtlI0DzvKHt7B
MWd8hnoUKhMeZ9TNmo+8CpsAtXZcBho0zPGz/R8NlJhAWpdAZ1CmcPo83EW86Yq7
BxQUKnNHcwj2ebkCDQRRPZQsARAA4jxYmbTHwmMjqSizlMJYNuGOpIidEdx9zQ5g
zOr431/VfWq4S+VhMDhs15j9lyml0y4ok215VRFwrAREDg6UPMr7ajLmBQGau0Fc
bvZJ90l4NjXp5p0NEE/qOb9UEHT7EGkEhaZ1ekkWFTWCgsy7rRXfZLxB6sk7pzLC
DshyW3zjIakWAnpQ5j5obiDy708pReAuGB94NSyb1HoW/xGsGgvvCw4r0w3xPStw
F1PhmScE6NTBIfLliea3pl8vhKPlCh54Hk7I8QGjo1ETlRP4Qll1ZxHJ8u25f/ta
RES2Aw8Hi7j0EVcZ6MT9JWTI83yUcnUlZPZS2HyeWcUj+8nUC8W4N8An+aNps9l/
21inIl2TbGo3Yn1JQLnA1YCoGwC34g8QZTJhElEQBN0X29ayWW6OdFx8MDvllbBV
ymmKq2lK1U55mQTfDli7S3vfGz9Gp/oQwZ8bQpOeUkc5hbZszYwP4RX+68xDPfn+
M9udl+qW9wu+LyePbW6HX90LmkhNkkY2ZzUPRPDHZANU5btaPXc2H7edX4y4maQa
xenqD0lGh9LGz/mps4HEZtCI5CY8o0uCMF3lT0XfXhuLksr7Pxv57yue8LLTItOJ
d9Hmzp9G97SRYYeqU+8lyNXtU2PdrLLq7QHkzrsloG78lCpQcalHGACJzrlUWVP/
fN3Ht3kAEQEAAYkCHwQYAQIACQUCUT2ULAIbDAAKCRAVz00Yr090IbhWEADbr50X
OEXMIMGRLe+YMjeMX9NG4jxs0jZaWHc/WrGR+CCSUb9r6aPXeLo+45949uEfdSsB
pbaEdNWxF5Vr1CSjuO5siIlgDjmT655voXo67xVpEN4HhMrxugDJfCa6z97P0+ML
PdDxim57uNqkam9XIq9hKQaurxMAECDPmlEXI4QT3eu5qw5/knMzDMZj4Vi6hovL
wvvAeLHO/jsyfIdNmhBGU2RWCEZ9uo/MeerPHtRPfg74g+9PPfP6nyHD2Wes6yGd
oVQwtPNAQD6Cj7EaA2xdZYLJ7/jW6yiPu98FFWP74FN2dlyEA2uVziLsfBrgpS4l
tVOlrO2YzkkqUGrybzbLpj6eeHx+Cd7wcjI8CalsqtL6cG8cUEjtWQUHyTbQWAgG
5VPEgIAVhJ6RTZ26i/G+4J8neKyRs4vz+57UGwY6zI4AB1ZcWGEE3Bf+CDEDgmnP
LSwbnHefK9IljT9XU98PelSryUO/5UPw7leE0akXKB4DtekToO226px1VnGp3Bov
1GBGvpHvL2WizEwdk+nfk8LtrLzej+9FtIcq3uIrYnsac47Pf7p0otcFeTJTjSq3
krCaoG4Hx0zGQG2ZFpHrSrZTVy6lxvIdfi0beMgY6h78p6M9eYZHQHc02DjFkQXN
bXb5c6gCHESH5PXwPU4jQEE7Ib9J6sbk7ZT2Mw==
=j+4q
-----END PGP PUBLIC KEY BLOCK-----

View File

@@ -0,0 +1,220 @@
#!/bin/bash
set -e
set -o xtrace
export DEBIAN_FRONTEND=noninteractive
CROSS_ARCHITECTURES="i386"
for arch in $CROSS_ARCHITECTURES; do
dpkg --add-architecture $arch
done
apt-get install -y \
ca-certificates \
gnupg \
unzip \
wget
# Upstream LLVM package repository
apt-key add .gitlab-ci/container/llvm-snapshot.gpg.key
echo "deb https://apt.llvm.org/buster/ llvm-toolchain-buster-9 main" >/etc/apt/sources.list.d/llvm9.list
sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list
echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list
apt-get update
# Use newer packages from backports by default
cat >/etc/apt/preferences <<EOF
Package: *
Pin: release a=buster-backports
Pin-Priority: 500
EOF
apt-get dist-upgrade -y
apt-get install -y --no-remove \
autoconf \
automake \
autotools-dev \
bison \
clang-9 \
cmake \
flex \
g++ \
gcc \
gettext \
git \
libclang-6.0-dev \
libclang-7-dev \
libclang-8-dev \
libclang-9-dev \
libclc-dev \
libelf-dev \
libepoxy-dev \
libexpat1-dev \
libgbm-dev \
libgtk-3-dev \
libomxil-bellagio-dev \
libpciaccess-dev \
libtool \
libunwind-dev \
libva-dev \
libvdpau-dev \
libvulkan-dev \
libx11-dev \
libx11-xcb-dev \
libxdamage-dev \
libxext-dev \
libxrandr-dev \
libxrender-dev \
libxshmfence-dev \
libxvmc-dev \
libxxf86vm-dev \
llvm-6.0-dev \
llvm-7-dev \
llvm-8-dev \
llvm-9-dev \
meson \
pkg-config \
python-mako \
python3-mako \
scons \
x11proto-dri2-dev \
x11proto-gl-dev \
x11proto-randr-dev \
xz-utils \
zlib1g-dev
# Cross-build Mesa deps
for arch in $CROSS_ARCHITECTURES; do
apt-get install -y --no-remove \
crossbuild-essential-${arch} \
libdrm-dev:${arch} \
libelf-dev:${arch} \
libexpat1-dev:${arch}
done
# for 64bit windows cross-builds
apt-get install -y --no-remove \
libz-mingw-w64-dev \
mingw-w64 \
wine \
wine32 \
wine64
# Debian's pkg-config wrapers for mingw are broken, and there's no sign that
# they're going to be fixed, so we'll just have to fix it ourselves
# https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=930492
cat >/usr/local/bin/x86_64-w64-mingw32-pkg-config <<EOF
#!/bin/sh
PKG_CONFIG_LIBDIR=/usr/x86_64-w64-mingw32/lib/pkgconfig pkg-config \$@
EOF
chmod +x /usr/local/bin/x86_64-w64-mingw32-pkg-config
# for the vulkan overlay layer
wget https://github.com/KhronosGroup/glslang/releases/download/master-tot/glslang-master-linux-Release.zip
unzip glslang-master-linux-Release.zip bin/glslangValidator
install -m755 bin/glslangValidator /usr/local/bin/
rm bin/glslangValidator glslang-master-linux-Release.zip
# dependencies where we want a specific version
export XORG_RELEASES=https://xorg.freedesktop.org/releases/individual
export XCB_RELEASES=https://xcb.freedesktop.org/dist
export WAYLAND_RELEASES=https://wayland.freedesktop.org/releases
export XORGMACROS_VERSION=util-macros-1.19.0
export LIBDRM_VERSION=libdrm-2.4.100
export XCBPROTO_VERSION=xcb-proto-1.13
export LIBXCB_VERSION=libxcb-1.13
export LIBWAYLAND_VERSION=wayland-1.15.0
export WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.12
wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
tar -xvf $XORGMACROS_VERSION.tar.bz2 && rm $XORGMACROS_VERSION.tar.bz2
cd $XORGMACROS_VERSION; ./configure; make install; cd ..
rm -rf $XORGMACROS_VERSION
wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
tar -xvf $XCBPROTO_VERSION.tar.bz2 && rm $XCBPROTO_VERSION.tar.bz2
cd $XCBPROTO_VERSION; ./configure; make install; cd ..
rm -rf $XCBPROTO_VERSION
wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
tar -xvf $LIBXCB_VERSION.tar.bz2 && rm $LIBXCB_VERSION.tar.bz2
cd $LIBXCB_VERSION; ./configure; make install; cd ..
rm -rf $LIBXCB_VERSION
wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2
cd $LIBDRM_VERSION; meson build -D vc4=true -D freedreno=true -D etnaviv=true; ninja -j4 -C build install; cd ..
rm -rf $LIBDRM_VERSION
wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
tar -xvf $LIBWAYLAND_VERSION.tar.xz && rm $LIBWAYLAND_VERSION.tar.xz
cd $LIBWAYLAND_VERSION; ./configure --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation; make install; cd ..
rm -rf $LIBWAYLAND_VERSION
wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
tar -xvf $WAYLAND_PROTOCOLS_VERSION.tar.xz && rm $WAYLAND_PROTOCOLS_VERSION.tar.xz
cd $WAYLAND_PROTOCOLS_VERSION; ./configure; make install; cd ..
rm -rf $WAYLAND_PROTOCOLS_VERSION
# The version of libglvnd-dev in debian is too old
# Check this page to see when this local compilation can be dropped in favour of the package:
# https://packages.debian.org/libglvnd-dev
GLVND_VERSION=1.2.0
wget https://gitlab.freedesktop.org/glvnd/libglvnd/-/archive/v$GLVND_VERSION/libglvnd-v$GLVND_VERSION.tar.gz
tar -xvf libglvnd-v$GLVND_VERSION.tar.gz && rm libglvnd-v$GLVND_VERSION.tar.gz
pushd libglvnd-v$GLVND_VERSION; ./autogen.sh; ./configure; make install; popd
rm -rf libglvnd-v$GLVND_VERSION
pushd /usr/local
git clone https://gitlab.freedesktop.org/mesa/shader-db.git --depth 1
rm -rf shader-db/.git
cd shader-db
make
popd
# Use ccache to speed up builds
apt-get install -y --no-remove ccache
# We need xmllint to validate the XML files in Mesa
apt-get install -y --no-remove libxml2-utils
# Generate cross build files for Meson
for arch in $CROSS_ARCHITECTURES; do
cross_file="/cross_file-$arch.txt"
/usr/share/meson/debcrossgen --arch "$arch" -o "$cross_file"
# Explicitly set ccache path for cross compilers
sed -i "s|/usr/bin/\([^-]*\)-linux-gnu\([^-]*\)-g|/usr/lib/ccache/\\1-linux-gnu\\2-g|g" "$cross_file"
if [ "$arch" = "i386" ]; then
# Work around a bug in debcrossgen that should be fixed in the next release
sed -i "s|cpu_family = 'i686'|cpu_family = 'x86'|g" "$cross_file"
# Don't need wrapper for i386 executables
sed -i -e '/\[properties\]/a\' -e "needs_exe_wrapper = False" "$cross_file"
fi
done
############### Uninstall the build software
apt-get purge -y \
autoconf \
automake \
autotools-dev \
cmake \
git \
gnupg \
libgbm-dev \
libtool \
unzip \
wget
apt-get autoremove -y --purge

View File

@@ -0,0 +1,59 @@
#!/bin/bash
set -e
set -o xtrace
export DEBIAN_FRONTEND=noninteractive
apt-get install -y \
apt-transport-https \
ca-certificates
sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list
echo 'deb https://deb.debian.org/debian stretch-backports main' >/etc/apt/sources.list.d/backports.list
apt-get update
# Use newer packages from backports by default
cat >/etc/apt/preferences <<EOF
Package: *
Pin: release a=stretch-backports
Pin-Priority: 500
EOF
apt-get dist-upgrade -y
apt-get install -y --no-remove \
llvm-3.9-dev \
libclang-3.9-dev \
llvm-4.0-dev \
libclang-4.0-dev \
llvm-5.0-dev \
libclang-5.0-dev \
g++ \
bzip2 \
ccache \
zlib1g-dev \
pkg-config \
gcc \
git \
libepoxy-dev \
libclc-dev \
xz-utils \
libdrm-dev \
libexpat1-dev \
libelf-dev \
libunwind-dev \
libpng-dev \
python-mako \
python3-mako \
bison \
flex \
gettext \
scons \
meson
############### Uninstall unused packages
apt-get autoremove -y --purge

View File

@@ -0,0 +1,96 @@
#!/bin/bash
set -e
set -o xtrace
export DEBIAN_FRONTEND=noninteractive
apt-get install -y \
ca-certificates \
gnupg \
# Upstream LLVM package repository
apt-key add .gitlab-ci/container/llvm-snapshot.gpg.key
echo "deb https://apt.llvm.org/buster/ llvm-toolchain-buster-9 main" >/etc/apt/sources.list.d/llvm9.list
sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list
echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list
apt-get update
# Use newer packages from backports by default
cat >/etc/apt/preferences <<EOF
Package: *
Pin: release a=buster-backports
Pin-Priority: 500
EOF
apt-get dist-upgrade -y
apt-get install -y --no-remove \
cmake \
g++ \
git \
gcc \
libexpat1 \
libgbm-dev \
libgles2-mesa-dev \
libpng16-16 \
libpng-dev \
libvulkan1 \
libvulkan-dev \
libwaffle-dev \
libwayland-server0 \
libxcb-xfixes0 \
libxkbcommon0 \
libxkbcommon-dev \
libxrender1 \
libxrender-dev \
libllvm9 \
meson \
patch \
pkg-config \
python3-mako \
python3-numpy \
python3-six \
python \
waffle-utils \
xauth \
xvfb \
zlib1g
############### Build piglit
. .gitlab-ci/build-piglit.sh
############### Build dEQP runner
. .gitlab-ci/build-cts-runner.sh
############### Build dEQP GL
. .gitlab-ci/build-deqp-gl.sh
############### Uninstall the build software
apt-get purge -y \
cmake \
g++ \
gcc \
git \
gnupg \
libc6-dev \
libgbm-dev \
libgles2-mesa-dev \
libpng-dev \
libwaffle-dev \
libxkbcommon-dev \
libxrender-dev \
meson \
patch \
pkg-config \
python
apt-get autoremove -y --purge

View File

@@ -0,0 +1,87 @@
#!/bin/bash
set -e
set -o xtrace
export DEBIAN_FRONTEND=noninteractive
apt-get install -y \
ca-certificates \
gnupg \
# Upstream LLVM package repository
apt-key add .gitlab-ci/container/llvm-snapshot.gpg.key
echo "deb https://apt.llvm.org/buster/ llvm-toolchain-buster-9 main" >/etc/apt/sources.list.d/llvm9.list
sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list
echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list
apt-get update
# Use newer packages from backports by default
cat >/etc/apt/preferences <<EOF
Package: *
Pin: release a=buster-backports
Pin-Priority: 500
EOF
apt-get dist-upgrade -y
apt-get install -y --no-remove \
cmake \
g++ \
git \
gcc \
libexpat1 \
libgbm-dev \
libgles2-mesa-dev \
libpng16-16 \
libpng-dev \
libvulkan1 \
libvulkan-dev \
libwayland-server0 \
libxcb-randr0 \
libxcb-xfixes0 \
libxkbcommon0 \
libxkbcommon-dev \
libxrender1 \
libxrender-dev \
libllvm9 \
meson \
patch \
pkg-config \
python3-distutils \
python \
xauth \
xvfb
############### Build dEQP runner
. .gitlab-ci/build-cts-runner.sh
############### Build dEQP VK
. .gitlab-ci/build-deqp-vk.sh
############### Uninstall the build software
apt-get purge -y \
cmake \
g++ \
gcc \
git \
gnupg \
libgbm-dev \
libgles2-mesa-dev \
libpng-dev \
libvulkan-dev \
libxkbcommon-dev \
libxrender-dev \
meson \
patch \
pkg-config \
python
apt-get autoremove -y --purge

View File

@@ -1,8 +1,15 @@
#!/bin/sh
#!/bin/bash
set -ex
apt-get -y install --no-install-recommends initramfs-tools libpng16-16 weston strace libsensors5
apt-get -y install --no-install-recommends \
initramfs-tools \
libpng16-16 \
strace \
libsensors5 \
libexpat1 \
libdrm2 \
libdrm-nouveau2
passwd root -d
chsh -s /bin/sh
ln -s /bin/sh /init
@@ -15,9 +22,9 @@ ln -s /bin/sh /init
rm -rf /etc/localtime
cp /usr/share/zoneinfo/Etc/UTC /etc/localtime
UNNEEDED_PACKAGES=" libfdisk1"\
" tzdata"\
UNNEEDED_PACKAGES="libfdisk1
tzdata
diffutils"
export DEBIAN_FRONTEND=noninteractive
@@ -82,15 +89,10 @@ UNNEEDED_PACKAGES="apt libapt-pkg5.0 "\
"libsemanage1 libsemanage-common "\
"libsepol1 "\
"gzip "\
"gnupg "\
"gpgv "\
"hostname "\
"adduser "\
"debian-archive-keyring "\
"libgl1 libgl1-mesa-dri libglapi-mesa libglvnd0 libglx-mesa0 libegl-mesa0 libgles2 "\
"libllvm7 "\
"libx11-data libthai-data "\
"systemd dbus "\
# Removing unneeded packages
for PACKAGE in ${UNNEEDED_PACKAGES}
@@ -182,4 +184,4 @@ rm usr/lib/*/libdb-5.3.so
rm usr/lib/*/libnss_hesiod*
rm usr/lib/*/libnss_nis*
rm usr/bin/tar
rm bin/tar

View File

@@ -0,0 +1 @@
u_format_test

View File

@@ -1,285 +0,0 @@
#!/bin/bash
set -e
set -o xtrace
export DEBIAN_FRONTEND=noninteractive
CROSS_ARCHITECTURES="armhf arm64 i386"
for arch in $CROSS_ARCHITECTURES; do
dpkg --add-architecture $arch
done
apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
wget \
unzip \
gnupg
curl -fsSL https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
echo "deb [trusted=yes] https://apt.llvm.org/stretch/ llvm-toolchain-stretch-7 main" >/etc/apt/sources.list.d/llvm7.list
echo "deb [trusted=yes] https://apt.llvm.org/stretch/ llvm-toolchain-stretch-8 main" >/etc/apt/sources.list.d/llvm8.list
sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list
echo 'deb https://deb.debian.org/debian stretch-backports main' >/etc/apt/sources.list.d/backports.list
echo 'deb https://deb.debian.org/debian jessie main' >/etc/apt/sources.list.d/jessie.list
apt-get update
apt-get install -y -t stretch-backports \
llvm-3.4-dev \
llvm-3.9-dev \
libclang-3.9-dev \
llvm-4.0-dev \
libclang-4.0-dev \
llvm-5.0-dev \
libclang-5.0-dev \
llvm-6.0-dev \
libclang-6.0-dev \
llvm-7-dev \
libclang-7-dev \
llvm-8-dev \
libclang-8-dev \
g++ \
clang-8
# Install remaining packages from Debian buster to get newer versions
echo "deb https://deb.debian.org/debian/ buster main" >/etc/apt/sources.list.d/buster.list
echo "deb https://deb.debian.org/debian/ buster-updates main" >/etc/apt/sources.list.d/buster-updates.list
apt-get update
apt-get install -y \
git \
bzip2 \
zlib1g-dev \
pkg-config \
libxrender-dev \
libxdamage-dev \
libxxf86vm-dev \
gcc \
git \
libepoxy-dev \
libegl1-mesa-dev \
libgbm-dev \
libclc-dev \
libxvmc-dev \
libomxil-bellagio-dev \
xz-utils \
libexpat1-dev \
libx11-xcb-dev \
libelf-dev \
libunwind-dev \
libglvnd-dev \
libgtk-3-dev \
libpng-dev \
libgbm-dev \
libgles2-mesa-dev \
python-mako \
python3-mako \
bison \
flex \
gettext \
cmake \
meson \
scons
# Cross-build Mesa deps
for arch in $CROSS_ARCHITECTURES; do
apt-get install -y \
libdrm-dev:${arch} \
libexpat1-dev:${arch} \
libelf-dev:${arch}
done
apt-get install -y \
dpkg-dev \
gcc-aarch64-linux-gnu \
g++-aarch64-linux-gnu \
gcc-arm-linux-gnueabihf \
g++-arm-linux-gnueabihf \
gcc-i686-linux-gnu \
g++-i686-linux-gnu
# for 64bit windows cross-builds
apt-get install -y mingw-w64
# for the vulkan overlay layer
wget https://github.com/KhronosGroup/glslang/releases/download/master-tot/glslang-master-linux-Release.zip
unzip glslang-master-linux-Release.zip bin/glslangValidator
install -m755 bin/glslangValidator /usr/local/bin/
rm bin/glslangValidator glslang-master-linux-Release.zip
# dependencies where we want a specific version
export XORG_RELEASES=https://xorg.freedesktop.org/releases/individual
export XCB_RELEASES=https://xcb.freedesktop.org/dist
export WAYLAND_RELEASES=https://wayland.freedesktop.org/releases
export XORGMACROS_VERSION=util-macros-1.19.0
export GLPROTO_VERSION=glproto-1.4.17
export DRI2PROTO_VERSION=dri2proto-2.8
export LIBPCIACCESS_VERSION=libpciaccess-0.13.4
export LIBDRM_VERSION=libdrm-2.4.99
export XCBPROTO_VERSION=xcb-proto-1.13
export RANDRPROTO_VERSION=randrproto-1.5.0
export LIBXRANDR_VERSION=libXrandr-1.5.0
export LIBXCB_VERSION=libxcb-1.13
export LIBXSHMFENCE_VERSION=libxshmfence-1.3
export LIBVDPAU_VERSION=libvdpau-1.1
export LIBVA_VERSION=libva-1.7.0
export LIBWAYLAND_VERSION=wayland-1.15.0
export WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.12
wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
tar -xvf $XORGMACROS_VERSION.tar.bz2 && rm $XORGMACROS_VERSION.tar.bz2
cd $XORGMACROS_VERSION; ./configure; make install; cd ..
rm -rf $XORGMACROS_VERSION
wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2
tar -xvf $GLPROTO_VERSION.tar.bz2 && rm $GLPROTO_VERSION.tar.bz2
cd $GLPROTO_VERSION; ./configure; make install; cd ..
rm -rf $GLPROTO_VERSION
wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2
tar -xvf $DRI2PROTO_VERSION.tar.bz2 && rm $DRI2PROTO_VERSION.tar.bz2
cd $DRI2PROTO_VERSION; ./configure; make install; cd ..
rm -rf $DRI2PROTO_VERSION
wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
tar -xvf $XCBPROTO_VERSION.tar.bz2 && rm $XCBPROTO_VERSION.tar.bz2
cd $XCBPROTO_VERSION; ./configure; make install; cd ..
rm -rf $XCBPROTO_VERSION
wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2
tar -xvf $LIBXCB_VERSION.tar.bz2 && rm $LIBXCB_VERSION.tar.bz2
cd $LIBXCB_VERSION; ./configure; make install; cd ..
rm -rf $LIBXCB_VERSION
wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2
tar -xvf $LIBPCIACCESS_VERSION.tar.bz2 && rm $LIBPCIACCESS_VERSION.tar.bz2
cd $LIBPCIACCESS_VERSION; ./configure; make install; cd ..
rm -rf $LIBPCIACCESS_VERSION
wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2
cd $LIBDRM_VERSION; ./configure --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api; make install; cd ..
rm -rf $LIBDRM_VERSION
wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2
tar -xvf $RANDRPROTO_VERSION.tar.bz2 && rm $RANDRPROTO_VERSION.tar.bz2
cd $RANDRPROTO_VERSION; ./configure; make install; cd ..
rm -rf $RANDRPROTO_VERSION
wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2
tar -xvf $LIBXRANDR_VERSION.tar.bz2 && rm $LIBXRANDR_VERSION.tar.bz2
cd $LIBXRANDR_VERSION; ./configure; make install; cd ..
rm -rf $LIBXRANDR_VERSION
wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
tar -xvf $LIBXSHMFENCE_VERSION.tar.bz2 && rm $LIBXSHMFENCE_VERSION.tar.bz2
cd $LIBXSHMFENCE_VERSION; ./configure; make install; cd ..
rm -rf $LIBXSHMFENCE_VERSION
wget https://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
tar -xvf $LIBVDPAU_VERSION.tar.bz2 && rm $LIBVDPAU_VERSION.tar.bz2
cd $LIBVDPAU_VERSION; ./configure; make install; cd ..
rm -rf $LIBVDPAU_VERSION
wget https://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
tar -xvf $LIBVA_VERSION.tar.bz2 && rm $LIBVA_VERSION.tar.bz2
cd $LIBVA_VERSION; ./configure --disable-wayland --disable-dummy-driver; make install; cd ..
rm -rf $LIBVA_VERSION
wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
tar -xvf $LIBWAYLAND_VERSION.tar.xz && rm $LIBWAYLAND_VERSION.tar.xz
cd $LIBWAYLAND_VERSION; ./configure --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation; make install; cd ..
rm -rf $LIBWAYLAND_VERSION
wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
tar -xvf $WAYLAND_PROTOCOLS_VERSION.tar.xz && rm $WAYLAND_PROTOCOLS_VERSION.tar.xz
cd $WAYLAND_PROTOCOLS_VERSION; ./configure; make install; cd ..
rm -rf $WAYLAND_PROTOCOLS_VERSION
pushd /usr/local
git clone https://gitlab.freedesktop.org/mesa/shader-db.git --depth 1
rm -rf shader-db/.git
cd shader-db
make
popd
# Use ccache to speed up builds
apt-get install -y ccache
# We need xmllint to validate the XML files in Mesa
apt-get install -y libxml2-utils
# Generate cross build files for Meson
for arch in $CROSS_ARCHITECTURES; do
cross_file="/cross_file-$arch.txt"
/usr/share/meson/debcrossgen --arch "$arch" -o "$cross_file"
# Work around a bug in debcrossgen that should be fixed in the next release
if [ "$arch" = "i386" ]; then
sed -i "s|cpu_family = 'i686'|cpu_family = 'x86'|g" "$cross_file"
fi
done
############### Build dEQP
git config --global user.email "mesa@example.com"
git config --global user.name "Mesa CI"
# XXX: Use --depth 1 once we can drop the cherry-picks.
git clone \
https://github.com/KhronosGroup/VK-GL-CTS.git \
-b opengl-es-cts-3.2.5.1 \
/VK-GL-CTS
cd /VK-GL-CTS
# Fix surfaceless build
git cherry-pick -x 22f41e5e321c6dcd8569c4dad91bce89f06b3670
git cherry-pick -x 1daa8dff73161ea60ead965bd6c9f2a0a2165648
# surfaceless links against libkms and such despite not using it.
sed -i '/gbm/d' targets/surfaceless/surfaceless.cmake
sed -i '/libkms/d' targets/surfaceless/surfaceless.cmake
sed -i '/libgbm/d' targets/surfaceless/surfaceless.cmake
python3 external/fetch_sources.py
mkdir -p /deqp
cd /deqp
cmake -G Ninja \
-DDEQP_TARGET=surfaceless \
-DCMAKE_BUILD_TYPE=Release \
/VK-GL-CTS
ninja
# Copy out the mustpass lists we want from a bunch of other junk.
mkdir /deqp/mustpass
for gles in gles2 gles3 gles31; do
cp \
/deqp/external/openglcts/modules/gl_cts/data/mustpass/gles/aosp_mustpass/3.2.5.x/$gles-master.txt \
/deqp/mustpass/$gles-master.txt
done
# Remove the rest of the build products that we don't need.
rm -rf /deqp/external
rm -rf /deqp/modules/internal
rm -rf /deqp/executor
rm -rf /deqp/execserver
rm -rf /deqp/modules/egl
rm -rf /deqp/framework
du -sh *
rm -rf /VK-GL-CTS
############### Uninstall the build software
apt-get purge -y \
git \
curl \
unzip \
gnupg \
cmake \
git \
libgles2-mesa-dev \
libgbm-dev
apt-get autoremove -y --purge

View File

@@ -3,8 +3,8 @@
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance
dEQP-GLES[0-9]*.stress
dEQP-GLES[0-9]*.performance.*
dEQP-GLES[0-9]*.stress.*
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish
dEQP-GLES[0-9]*.functional.flush_finish.*

View File

@@ -0,0 +1,33 @@
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
dEQP-GLES2.functional.clipping.point.wide_point_clip
dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_center
dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_corner
dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.polygon_offset.fixed16_displacement_with_units
dEQP-GLES2.functional.texture.filtering.2d.linear_nearest_clamp_l8_npot
dEQP-GLES2.functional.texture.filtering.2d.linear_nearest_clamp_rgb888_npot
dEQP-GLES2.functional.texture.filtering.2d.linear_nearest_clamp_rgba4444_npot
dEQP-GLES2.functional.texture.filtering.2d.linear_nearest_clamp_rgba8888_npot
dEQP-GLES2.functional.texture.filtering.2d.nearest_linear_clamp_l8_npot
dEQP-GLES2.functional.texture.filtering.2d.nearest_linear_clamp_rgb888_npot
dEQP-GLES2.functional.texture.filtering.2d.nearest_linear_clamp_rgba4444_npot
dEQP-GLES2.functional.texture.filtering.2d.nearest_linear_clamp_rgba8888_npot
dEQP-GLES2.functional.texture.filtering.cube.linear_nearest_clamp_l8_npot
dEQP-GLES2.functional.texture.filtering.cube.linear_nearest_clamp_rgb888_npot
dEQP-GLES2.functional.texture.filtering.cube.linear_nearest_clamp_rgba4444_npot
dEQP-GLES2.functional.texture.filtering.cube.linear_nearest_clamp_rgba8888_npot
dEQP-GLES2.functional.texture.filtering.cube.nearest_linear_clamp_l8_npot
dEQP-GLES2.functional.texture.filtering.cube.nearest_linear_clamp_rgb888_npot
dEQP-GLES2.functional.texture.filtering.cube.nearest_linear_clamp_rgba4444_npot
dEQP-GLES2.functional.texture.filtering.cube.nearest_linear_clamp_rgba8888_npot

View File

@@ -0,0 +1,3 @@
dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
dEQP-GLES31.functional.stencil_texturing.render.depth24_stencil8_clear
dEQP-GLES31.functional.stencil_texturing.render.depth24_stencil8_draw

View File

@@ -0,0 +1,21 @@
# Note: skips lists for CI are just a list of lines that, when
# non-zero-length and not starting with '#', will regex match to
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance.*
dEQP-GLES[0-9]*.stress.*
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish.*
# Unstable test results
#dEQP-GLES3.functional.fragment_out.random.*
dEQP-GLES3.functional.transform_feedback.*points.*
dEQP-GLES3.functional.transform_feedback.*lines.*
dEQP-GLES31.functional.primitive_bounding_box.*
#dEQP-GLES31.functional.layout_binding.ssbo.fragment_binding_array.*
# Intermittent timeout
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23

View File

@@ -0,0 +1,203 @@
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_neg_x_neg_y_pos_z_and_pos_x_pos_y_neg_z
dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_neg_x_pos_y_pos_z_and_pos_x_neg_y_neg_z
dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_pos_x_neg_y_pos_z_and_neg_x_pos_y_neg_z
dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_pos_x_pos_y_pos_z_and_neg_x_neg_y_neg_z
dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.0
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.1
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.10
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.11
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.12
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.13
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.14
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.15
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.16
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.17
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.18
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.19
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.2
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.20
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.21
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.22
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.23
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.24
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.3
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.4
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.5
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.6
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.7
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.8
dEQP-GLES2.functional.fragment_ops.depth_stencil.random.9
dEQP-GLES2.functional.fragment_ops.depth_stencil.write_mask.stencil
dEQP-GLES2.functional.shaders.algorithm.hsl_to_rgb_vertex
dEQP-GLES2.functional.shaders.functions.array_arguments.global_in_int_vertex
dEQP-GLES2.functional.shaders.functions.array_arguments.local_in_int_vertex
dEQP-GLES2.functional.shaders.functions.datatypes.int_int_vertex
dEQP-GLES2.functional.shaders.functions.overloading.builtin_sin_vertex
dEQP-GLES2.functional.shaders.functions.overloading.builtin_step_vertex
dEQP-GLES2.functional.shaders.functions.overloading.user_func_arg_int_types_vertex
dEQP-GLES2.functional.shaders.functions.qualifiers.inout_highp_int_vertex
dEQP-GLES2.functional.shaders.functions.qualifiers.inout_int_vertex
dEQP-GLES2.functional.shaders.functions.qualifiers.inout_lowp_int_vertex
dEQP-GLES2.functional.shaders.functions.qualifiers.out_highp_int_vertex
dEQP-GLES2.functional.shaders.functions.qualifiers.out_int_vertex
dEQP-GLES2.functional.shaders.functions.qualifiers.out_lowp_int_vertex
dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat3_dynamic_loop_write_static_loop_read_vertex
dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat3_dynamic_loop_write_static_read_vertex
dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat3_dynamic_write_dynamic_loop_read_vertex
dEQP-GLES2.functional.shaders.loops.do_while_constant_iterations.conditional_body_vertex
dEQP-GLES2.functional.shaders.loops.do_while_dynamic_iterations.vector_counter_fragment
dEQP-GLES2.functional.shaders.loops.do_while_uniform_iterations.conditional_body_vertex
dEQP-GLES2.functional.shaders.loops.do_while_uniform_iterations.nested_tricky_dataflow_2_vertex
dEQP-GLES2.functional.shaders.loops.for_dynamic_iterations.vector_counter_fragment
dEQP-GLES2.functional.shaders.loops.while_constant_iterations.compound_statement_vertex
dEQP-GLES2.functional.shaders.loops.while_constant_iterations.sequence_statement_vertex
dEQP-GLES2.functional.shaders.loops.while_dynamic_iterations.nested_sequence_vertex
dEQP-GLES2.functional.shaders.loops.while_dynamic_iterations.vector_counter_fragment
dEQP-GLES2.functional.shaders.loops.while_uniform_iterations.nested_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_int_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_int_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_int_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_int_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_int_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_int_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_int_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_int_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_int_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec2_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec3_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec4_int_vertex
dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.highp_int_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.highp_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.highp_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.highp_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.lowp_int_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.lowp_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.lowp_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.lowp_ivec4_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.mediump_int_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.mediump_ivec2_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.mediump_ivec3_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.mediump_ivec4_vertex
dEQP-GLES2.functional.shaders.random.all_features.fragment.37
dEQP-GLES2.functional.shaders.random.exponential.fragment.11
dEQP-GLES2.functional.shaders.random.exponential.fragment.12
dEQP-GLES2.functional.shaders.random.exponential.fragment.14
dEQP-GLES2.functional.shaders.random.exponential.fragment.37
dEQP-GLES2.functional.shaders.random.exponential.fragment.5
dEQP-GLES2.functional.shaders.random.exponential.fragment.74
dEQP-GLES2.functional.shaders.random.texture.fragment.28
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.65
dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2d_bias
dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2dproj_vec4_bias
dEQP-GLES2.functional.shaders.texture_functions.fragment.texturecube_bias
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_clamp_etc1
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_clamp_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_mirror_etc1
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_mirror_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_clamp_etc1
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_clamp_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_mirror_etc1
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_mirror_rgba8888
dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_linear
dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_nearest
dEQP-GLES2.functional.texture.mipmap.cube.bias.linear_linear
dEQP-GLES2.functional.texture.mipmap.cube.bias.linear_nearest
dEQP-GLES2.functional.texture.mipmap.cube.projected.linear_linear
dEQP-GLES2.functional.texture.mipmap.cube.projected.linear_nearest
dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgb
dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgba
dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgb
dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgba

View File

@@ -0,0 +1,50 @@
# Note: skips lists for CI are just a list of lines that, when
# non-zero-length and not starting with '#', will regex match to
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance
dEQP-GLES[0-9]*.stress
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish
# Crashes
dEQP-GLES2.functional.shaders.invariance.highp.common_subexpression_1
dEQP-GLES2.functional.shaders.invariance.mediump.common_subexpression_1
dEQP-GLES2.functional.shaders.invariance.lowp.common_subexpression_1
# Flaky
dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
dEQP-GLES2.functional.default_vertex_attrib.*
dEQP-GLES2.functional.fbo.completeness.size.distinct
dEQP-GLES2.functional.negative_api.shader.uniform_matrixfv_invalid_transpose
dEQP-GLES2.functional.negative_api.texture.generatemipmap_zero_level_array_compressed
dEQP-GLES2.functional.shaders.builtin_variable.frontfacing
dEQP-GLES2.functional.shaders.random.exponential.fragment.94
dEQP-GLES2.functional.shaders.random.all_features.fragment.55
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.1
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.69
# Driver bugs causing GPU errors
dEQP-GLES2.functional.shaders.loops.while_constant_iterations.nested_sequence_vertex
dEQP-GLES2.functional.shaders.loops.while_constant_iterations.conditional_body_vertex
dEQP-GLES2.functional.shaders.loops.while_uniform_iterations.conditional_continue_vertex
dEQP-GLES2.functional.shaders.loops.while_uniform_iterations.double_continue_vertex
# Hangs / OOM
dEQP-GLES2.functional.shaders.indexing.varying_array.vec4_dynamic_loop_write_static_read
dEQP-GLES2.functional.shaders.indexing.varying_array.vec4_dynamic_loop_write_dynamic_read
dEQP-GLES2.functional.shaders.indexing.varying_array.vec4_dynamic_loop_write_static_loop_read
dEQP-GLES2.functional.shaders.indexing.varying_array.vec4_dynamic_loop_write_dynamic_loop_read
dEQP-GLES2.functional.shaders.indexing.tmp_array.vec3_dynamic_loop_write_dynamic_read_vertex
dEQP-GLES2.functional.shaders.indexing.tmp_array.vec4_dynamic_loop_write_static_read_vertex
dEQP-GLES2.functional.shaders.indexing.tmp_array.vec4_dynamic_loop_write_dynamic_read_vertex
dEQP-GLES2.functional.shaders.indexing.tmp_array.vec4_dynamic_loop_write_static_loop_read_vertex
dEQP-GLES2.functional.shaders.indexing.tmp_array.vec4_dynamic_loop_write_dynamic_loop_read_vertex
dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat4_dynamic_loop_write_static_read_vertex
dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat4_dynamic_loop_write_dynamic_read_vertex
dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat4_dynamic_loop_write_static_loop_read_vertex
dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat4_dynamic_loop_write_dynamic_loop_read_vertex

View File

@@ -0,0 +1,31 @@
dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16

View File

@@ -0,0 +1,14 @@
# Note: skips lists for CI are just a list of lines that, when
# non-zero-length and not starting with '#', will regex match to
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance.*
dEQP-GLES[0-9]*.stress.*
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish.*
# XXX: Why does this flake?
dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z

View File

@@ -0,0 +1,31 @@
dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16

View File

@@ -0,0 +1,10 @@
# Note: skips lists for CI are just a list of lines that, when
# non-zero-length and not starting with '#', will regex match to
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance.*
dEQP-GLES[0-9]*.stress.*
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish.*

View File

@@ -0,0 +1,31 @@
dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16

View File

@@ -0,0 +1,13 @@
# Note: skips lists for CI are just a list of lines that, when
# non-zero-length and not starting with '#', will regex match to
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance.*
dEQP-GLES[0-9]*.stress.*
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish.*
# XXX: Why does this flake?
dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z

View File

@@ -0,0 +1,31 @@
dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16

View File

@@ -0,0 +1,13 @@
# Note: skips lists for CI are just a list of lines that, when
# non-zero-length and not starting with '#', will regex match to
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance.*
dEQP-GLES[0-9]*.stress.*
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish.*
# XXX: Why does this flake?
dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z

View File

@@ -0,0 +1,31 @@
# Disable a TON of tests to keep the run around 5-10 minutes because my runner is
# slow.
dEQP-VK.api.*
dEQP-VK.binding_model.*
dEQP-VK.clipping.*
dEQP-VK.compute.*
dEQP-VK.conditional_rendering.*
dEQP-VK.descriptor_indexing.*
dEQP-VK.device_group.*
dEQP-VK.fragment_operations.*
dEQP-VK.fragment_shader_interlock.*
dEQP-VK.graphicsfuzz.*
dEQP-VK.image.*
dEQP-VK.imageless_framebuffer.*
dEQP-VK.info.*
dEQP-VK.memory.*
dEQP-VK.memory_model.*
dEQP-VK.multiview.*
dEQP-VK.pipeline.*
dEQP-VK.protected_memory.*
dEQP-VK.query_pool.*
dEQP-VK.robustness.*
dEQP-VK.sparse_resources.*
dEQP-VK.spirv_assembly.*
dEQP-VK.subgroups.*
dEQP-VK.synchronization.*
dEQP-VK.texture.*
dEQP-VK.transform_feedback.*
dEQP-VK.ubo.*
dEQP-VK.wsi.*
dEQP-VK.ycbcr.*

View File

@@ -1,43 +1,42 @@
#!/bin/bash
#!/bin/sh
set -ex
DEQP_OPTIONS=(--deqp-surface-width=256 --deqp-surface-height=256)
DEQP_OPTIONS+=(--deqp-surface-type=pbuffer)
DEQP_OPTIONS+=(--deqp-gl-config-name=rgba8888d24s8ms0)
DEQP_OPTIONS+=(--deqp-visibility=hidden)
DEQP_OPTIONS+=(--deqp-log-images=disable)
DEQP_OPTIONS+=(--deqp-watchdog=enable)
DEQP_OPTIONS+=(--deqp-crashhandler=enable)
DEQP_OPTIONS="--deqp-surface-width=256 --deqp-surface-height=256"
DEQP_OPTIONS="$DEQP_OPTIONS --deqp-surface-type=pbuffer"
DEQP_OPTIONS="$DEQP_OPTIONS --deqp-gl-config-name=rgba8888d24s8ms0"
DEQP_OPTIONS="$DEQP_OPTIONS --deqp-visibility=hidden"
# It would be nice to be able to enable the watchdog, so that hangs in a test
# don't need to wait the full hour for the run to time out. However, some
# shaders end up taking long enough to compile
# (dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20 for example)
# that they'll sporadically trigger the watchdog.
#DEQP_OPTIONS="$DEQP_OPTIONS --deqp-watchdog=enable"
if [ -z "$DEQP_VER" ]; then
echo 'DEQP_VER must be set to something like "gles2" or "gles31" for the test run'
echo 'DEQP_VER must be set to something like "gles2", "gles31" or "vk" for the test run'
exit 1
fi
if [ "$DEQP_VER" = "vk" ]; then
if [ -z "$VK_DRIVER" ]; then
echo 'VK_DRIVER must be to something like "radeon" or "intel" for the test run'
exit 1
fi
fi
if [ -z "$DEQP_SKIPS" ]; then
echo 'DEQP_SKIPS must be set to something like "deqp-default-skips.txt"'
exit 1
fi
# Prep the expected failure list
if [ -n "$DEQP_EXPECTED_FAILS" ]; then
export DEQP_EXPECTED_FAILS=`pwd`/artifacts/$DEQP_EXPECTED_FAILS
else
export DEQP_EXPECTED_FAILS=/tmp/expect-no-failures.txt
touch $DEQP_EXPECTED_FAILS
fi
sort < $DEQP_EXPECTED_FAILS > /tmp/expected-fails.txt
# Fix relative paths on inputs.
export DEQP_SKIPS=`pwd`/artifacts/$DEQP_SKIPS
# Be a good citizen on the shared runners.
export LP_NUM_THREADS=4
INSTALL=`pwd`/install
# Set up the driver environment.
export LD_LIBRARY_PATH=`pwd`/install/lib/
export EGL_PLATFORM=surfaceless
export VK_ICD_FILENAMES=`pwd`/install/share/vulkan/icd.d/"$VK_DRIVER"_icd.x86_64.json
# the runner was failing to look for libkms in /usr/local/lib for some reason
# I never figured out.
@@ -46,18 +45,14 @@ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
RESULTS=`pwd`/results
mkdir -p $RESULTS
cd /deqp/modules/$DEQP_VER
# Generate test case list file
cp /deqp/mustpass/$DEQP_VER-master.txt /tmp/case-list.txt
# Note: not using sorted input and comm, becuase I want to run the tests in
# the same order that dEQP would.
while read -r line; do
if echo "$line" | grep -q '^[^#]'; then
sed -i "/$line/d" /tmp/case-list.txt
fi
done < $DEQP_SKIPS
# Generate test case list file.
if [ "$DEQP_VER" = "vk" ]; then
cp /deqp/mustpass/vk-master.txt /tmp/case-list.txt
DEQP=/deqp/external/vulkancts/modules/vulkan/deqp-vk
else
cp /deqp/mustpass/$DEQP_VER-master.txt /tmp/case-list.txt
DEQP=/deqp/modules/$DEQP_VER/deqp-$DEQP_VER
fi
# If the job is parallel, take the corresponding fraction of the caselist.
# Note: N~M is a gnu sed extension to match every nth line (first line is #1).
@@ -70,43 +65,173 @@ if [ ! -s /tmp/case-list.txt ]; then
exit 1
fi
# Cannot use tee because dash doesn't have pipefail
touch /tmp/result.txt
tail -f /tmp/result.txt &
if [ -n "$DEQP_EXPECTED_FAILS" ]; then
XFAIL="--xfail-list $INSTALL/$DEQP_EXPECTED_FAILS"
fi
./deqp-$DEQP_VER "${DEQP_OPTIONS[@]}" --deqp-log-filename=$RESULTS/results.qpa --deqp-caselist-file=/tmp/case-list.txt >> /tmp/result.txt
set +e
run_cts() {
deqp=$1
caselist=$2
output=$3
deqp-runner \
--deqp $deqp \
--output $output \
--caselist $caselist \
--exclude-list $INSTALL/$DEQP_SKIPS \
$XFAIL \
--job ${DEQP_PARALLEL:-1} \
--allow-flakes true \
$DEQP_RUNNER_OPTIONS \
-- \
$DEQP_OPTIONS
}
report_flakes() {
if [ -z "$FLAKES_CHANNEL" ]; then
return 0
fi
flakes=$1
bot="$CI_RUNNER_DESCRIPTION-$CI_PIPELINE_ID"
channel="$FLAKES_CHANNEL"
(
echo NICK $bot
echo USER $bot unused unused :Gitlab CI Notifier
sleep 10
echo "JOIN $channel"
sleep 1
desc="Flakes detected in job: $CI_JOB_URL on $CI_RUNNER_DESCRIPTION"
if [ -n "CI_MERGE_REQUEST_SOURCE_BRANCH_NAME" ]; then
desc="$desc on branch $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME ($CI_MERGE_REQUEST_TITLE)"
fi
echo "PRIVMSG $channel :$desc"
for flake in `cat $flakes`; do
echo "PRIVMSG $channel :$flake"
done
echo "PRIVMSG $channel :See $CI_JOB_URL/artifacts/browse/results/"
echo "QUIT"
) | nc irc.freenode.net 6667 > /dev/null
}
extract_xml_result() {
testcase=$1
shift 1
qpas=$*
start="#beginTestCaseResult $testcase"
for qpa in $qpas; do
while IFS= read -r line; do
if [ "$line" = "$start" ]; then
dst="$testcase.qpa"
echo "#beginSession" > $dst
echo $line >> $dst
while IFS= read -r line; do
if [ "$line" = "#endTestCaseResult" ]; then
echo $line >> $dst
echo "#endSession" >> $dst
/deqp/executor/testlog-to-xml $dst "$RESULTS/$testcase.xml"
# copy the stylesheets here so they only end up in artifacts
# if we have one or more result xml in artifacts
cp /deqp/testlog.css "$RESULTS/"
cp /deqp/testlog.xsl "$RESULTS/"
return 0
fi
echo $line >> $dst
done
return 1
fi
done < $qpa
done
}
extract_xml_results() {
qpas=$*
while IFS= read -r testcase; do
testcase=${testcase%,*}
extract_xml_result $testcase $qpas
done
}
# Generate junit results
generate_junit() {
results=$1
echo "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
echo "<testsuites>"
echo "<testsuite name=\"$DEQP_VER-$CI_NODE_INDEX\">"
while read line; do
testcase=${line%,*}
result=${line#*,}
# avoid counting Skip's in the # of tests:
if [ "$result" = "Skip" ]; then
continue;
fi
echo "<testcase name=\"$testcase\">"
if [ "$result" != "Pass" ]; then
echo "<failure type=\"$result\">"
echo "$result: See $CI_JOB_URL/artifacts/results/$testcase.xml"
echo "</failure>"
fi
echo "</testcase>"
done < $results
echo "</testsuite>"
echo "</testsuites>"
}
# wrapper to supress +x to avoid spamming the log
quiet() {
set +x
"$@"
set -x
}
run_cts $DEQP /tmp/case-list.txt $RESULTS/cts-runner-results.txt
DEQP_EXITCODE=$?
sed -ne \
'/StatusCode="Fail"/{x;p}; s/#beginTestCaseResult //; T; h' \
$RESULTS/results.qpa \
> /tmp/unsorted-fails.txt
quiet generate_junit $RESULTS/cts-runner-results.txt > $RESULTS/results.xml
# Scrape out the renderer that the test run used, so we can validate that the
# right driver was used.
if grep -q "dEQP-.*.info.renderer" /tmp/case-list.txt; then
# This is an ugly dependency on the .qpa format: Print 3 lines after the
# match, which happens to contain the result.
RENDERER=`sed -n '/#beginTestCaseResult dEQP-.*.info.renderer/{n;n;n;p}' $RESULTS/results.qpa | sed -n -E "s|<Text>(.*)</Text>|\1|p"`
if [ $DEQP_EXITCODE -ne 0 ]; then
# preserve caselist files in case of failures:
cp /tmp/deqp_runner.*.txt $RESULTS/
echo "Some unexpected results found (see cts-runner-results.txt in artifacts for full results):"
cat $RESULTS/cts-runner-results.txt | \
grep -v ",Pass" | \
grep -v ",Skip" | \
grep -v ",ExpectedFail" > \
$RESULTS/cts-runner-unexpected-results.txt
head -n 50 $RESULTS/cts-runner-unexpected-results.txt
echo "GL_RENDERER for this test run: $RENDERER"
if [ -z "$DEQP_NO_SAVE_RESULTS" ]; then
# Save the logs for up to the first 50 unexpected results:
head -n 50 $RESULTS/cts-runner-unexpected-results.txt | quiet extract_xml_results /tmp/*.qpa
fi
if [ -n "$DEQP_RENDERER_MATCH" ]; then
echo $RENDERER | grep -q $DEQP_RENDERER_MATCH > /dev/null
count=`cat $RESULTS/cts-runner-unexpected-results.txt | wc -l`
# Re-run fails to detect flakes. But use a small threshold, if
# something was fundamentally broken, we don't want to re-run
# the entire caselist
else
cat $RESULTS/cts-runner-results.txt | \
grep ",Flake" > \
$RESULTS/cts-runner-flakes.txt
count=`cat $RESULTS/cts-runner-flakes.txt | wc -l`
if [ $count -gt 0 ]; then
echo "Some flakes found (see cts-runner-flakes.txt in artifacts for full results):"
head -n 50 $RESULTS/cts-runner-flakes.txt
if [ -z "$DEQP_NO_SAVE_RESULTS" ]; then
# Save the logs for up to the first 50 flakes:
head -n 50 $RESULTS/cts-runner-flakes.txt | quiet extract_xml_results /tmp/*.qpa
fi
# Report the flakes to IRC channel for monitoring (if configured):
quiet report_flakes $RESULTS/cts-runner-flakes.txt
else
# no flakes, so clean-up:
rm $RESULTS/cts-runner-flakes.txt
fi
fi
if [ $DEQP_EXITCODE -ne 0 ]; then
exit $DEQP_EXITCODE
fi
sort < /tmp/unsorted-fails.txt > $RESULTS/fails.txt
comm -23 $RESULTS/fails.txt /tmp/expected-fails.txt > /tmp/new-fails.txt
if [ -s /tmp/new-fails.txt ]; then
echo "Unexpected failures:"
cat /tmp/new-fails.txt
exit 1
else
echo "No new failures"
fi
exit $DEQP_EXITCODE

View File

@@ -443,3 +443,402 @@ dEQP-GLES3.functional.texture.wrap.astc_8x8_srgb.repeat_repeat_linear_divisible
dEQP-GLES3.functional.texture.wrap.astc_8x8_srgb.repeat_repeat_linear_not_divisible
dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.int2_10_10_10.components4_quads1
dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.int2_10_10_10.components4_quads256
dEQP-GLES31.functional.debug.error_filters.case_29
dEQP-GLES31.functional.debug.negative_coverage.callbacks.buffer.read_pixels_fbo_format_mismatch
dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.blit_framebuffer_multisample
dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.read_pixels_fbo_format_mismatch
dEQP-GLES31.functional.debug.negative_coverage.log.buffer.read_pixels_fbo_format_mismatch
dEQP-GLES31.functional.draw_base_vertex.draw_elements_instanced_base_vertex.line_loop.instanced_attributes
dEQP-GLES31.functional.draw_buffers_indexed.overwrite_indexed.common_color_mask_buffer_color_mask
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.0
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.1
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.10
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.11
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.12
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.14
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.16
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.17
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.19
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.2
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.3
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.4
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.5
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.6
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.7
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.8
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.9
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.0
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.14
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.15
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.16
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.17
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.19
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.2
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.4
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.5
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.7
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.9
dEQP-GLES31.functional.draw_indirect.draw_arrays_indirect.line_strip.multiple_attributes
dEQP-GLES31.functional.fbo.no_attachments.interaction.17x512ms4_default_16x16ms2
dEQP-GLES31.functional.fbo.no_attachments.interaction.1x1ms0_default_2048x2048ms4
dEQP-GLES31.functional.fbo.no_attachments.interaction.2048x2048ms4_default_1x1ms0
dEQP-GLES31.functional.fbo.no_attachments.interaction.256x256ms0_default_512x512ms2
dEQP-GLES31.functional.fbo.no_attachments.interaction.256x256ms2_default_128x512ms0
dEQP-GLES31.functional.fbo.no_attachments.multisample.samples2
dEQP-GLES31.functional.fbo.no_attachments.multisample.samples3
dEQP-GLES31.functional.fbo.no_attachments.multisample.samples4
dEQP-GLES31.functional.fbo.no_attachments.random.1
dEQP-GLES31.functional.fbo.no_attachments.random.11
dEQP-GLES31.functional.fbo.no_attachments.random.14
dEQP-GLES31.functional.fbo.no_attachments.random.15
dEQP-GLES31.functional.fbo.no_attachments.random.4
dEQP-GLES31.functional.fbo.no_attachments.random.9
dEQP-GLES31.functional.geometry_shading.query.primitives_generated_amplification
dEQP-GLES31.functional.geometry_shading.query.primitives_generated_instanced
dEQP-GLES31.functional.geometry_shading.query.primitives_generated_no_amplification
dEQP-GLES31.functional.geometry_shading.query.primitives_generated_no_geometry
dEQP-GLES31.functional.geometry_shading.query.primitives_generated_partial_primitives
dEQP-GLES31.functional.image_load_store.early_fragment_tests.early_fragment_tests_stencil
dEQP-GLES31.functional.image_load_store.early_fragment_tests.early_fragment_tests_stencil_fbo
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.dynamically_uniform_geometry
dEQP-GLES31.functional.state_query.integer.max_framebuffer_samples_getfloat
dEQP-GLES31.functional.state_query.integer.max_framebuffer_samples_getinteger
dEQP-GLES31.functional.state_query.integer.max_framebuffer_samples_getinteger64
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_format_float
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_format_integer
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_format_pure_int
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_format_pure_uint
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_levels_float
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_levels_integer
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_levels_pure_int
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_levels_pure_uint
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_format_float
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_format_integer
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_format_pure_int
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_format_pure_uint
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_levels_float
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_levels_integer
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_levels_pure_int
dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_levels_pure_uint
dEQP-GLES31.functional.texture.border_clamp.depth_compare_mode.depth32f_stencil8.linear_size_npot
dEQP-GLES31.functional.texture.border_clamp.depth_compare_mode.depth32f_stencil8.linear_size_pot
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_clamp_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_mirror_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_repeat_clamp
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_repeat_mirror
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_repeat_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_linear_clamp_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_linear_mirror_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_linear_repeat_clamp
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_linear_repeat_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_nearest_clamp_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_nearest_mirror_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_nearest_repeat_clamp
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_nearest_repeat_mirror
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_linear_clamp_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_linear_mirror_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_linear_repeat_clamp
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_linear_repeat_mirror
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_nearest_repeat_clamp
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_nearest_repeat_mirror
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_nearest_repeat_repeat
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_nearest_repeat_mirror
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb10_a2_linear_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb10_a2_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb10_a2_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb10_a2_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb565_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb565_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb5_a1_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb5_a1_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb9_e5_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb9_e5_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba16f_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba16f_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba4_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba4_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba8_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba8_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba8_snorm_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba8_snorm_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.srgb8_alpha8_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.formats.srgb8_alpha8_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_linear
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_linear_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_linear_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.63x63x18_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.64x64x12_nearest_mipmap_linear
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.64x64x12_nearest_mipmap_nearest
dEQP-GLES31.functional.texture.filtering.cube_array.sizes.8x8x6_nearest
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8.no_corners.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8.no_corners.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8.no_corners.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.no_corners.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.no_corners.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.no_corners.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.no_corners.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.no_corners.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.no_corners.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.base_level.level_1
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.base_level.level_2
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_linear_mipmap_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_linear_mipmap_nearest_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_nearest_mipmap_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_nearest_mipmap_nearest_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_greater.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_greater.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_less.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_less.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_less.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_greater.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_greater.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_greater.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_less.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_less.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_less.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.base_level.level_1
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.base_level.level_2
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_linear_mipmap_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_linear_mipmap_nearest_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_nearest_mipmap_nearest_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.green_blue_alpha_zero
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.red_green_blue_alpha
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.base_level.level_1
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.base_level.level_2
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.filter_mode.min_nearest_mipmap_nearest_mag_nearest
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.green_blue_alpha_zero
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.red_green_blue_alpha
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.base_level.level_1
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.base_level.level_2
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.filter_mode.min_nearest_mipmap_nearest_mag_nearest
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.green_blue_alpha_zero
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.red_green_blue_alpha
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.base_level.level_1
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.base_level.level_2
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_linear_mipmap_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_linear_mipmap_nearest_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_nearest_mipmap_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_nearest_mipmap_nearest_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_greater.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_greater.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_less.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_less.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_less.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_greater.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_greater.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_greater.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_less.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_less.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_less.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.base_level.level_1
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.base_level.level_2
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_linear_mipmap_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_linear_mipmap_nearest_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_nearest_mipmap_nearest_mag_linear
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.green_blue_alpha_zero
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.red_green_blue_alpha
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.base_level.level_1
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.base_level.level_2
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.filter_mode.min_nearest_mipmap_nearest_mag_nearest
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.green_blue_alpha_zero
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.red_green_blue_alpha
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.base_level.level_1
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.base_level.level_2
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.filter_mode.min_nearest_mipmap_nearest_mag_nearest
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.green_blue_alpha_zero
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.red_green_blue_alpha
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_greater.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_greater.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_less.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_less.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_less.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_greater.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_greater.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_greater.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_less.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_less.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_less.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_greater.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_greater.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_less.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_less.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_less.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_greater.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_greater.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_greater.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_less.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_less.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_less.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_npot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_npot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_npot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_pot.clamp_to_edge_repeat
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_pot.mirrored_repeat_clamp_to_edge
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_pot.repeat_mirrored_repeat
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8i.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8i.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8i.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8i.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8ui.texture_swizzle.alpha_zero_one_red
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8ui.texture_swizzle.blue_alpha_zero_one
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8ui.texture_swizzle.one_red_green_blue
dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8ui.texture_swizzle.zero_one_red_green
dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_and_alpha_to_coverage
dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_and_sample_coverage
dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_and_sample_coverage_and_alpha_to_coverage
dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_non_effective_bits
dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_only

View File

@@ -0,0 +1,16 @@
# Note: skips lists for CI are just a list of lines that, when
# non-zero-length and not starting with '#', will regex match to
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance.*
dEQP-GLES[0-9]*.stress.*
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish.*
# Random failures
dEQP-GLES31.functional.shaders.builtin_functions.*geometry
dEQP-GLES31.functional.fbo.no_attachments.maximums.all
dEQP-GLES31.functional.fbo.no_attachments.maximums.size

45
.gitlab-ci/generate_lava.py Executable file
View File

@@ -0,0 +1,45 @@
#!/usr/bin/env python3
from jinja2 import Environment, FileSystemLoader
import argparse
import os
parser = argparse.ArgumentParser()
parser.add_argument("--template")
parser.add_argument("--pipeline-info")
parser.add_argument("--base-artifacts-url")
parser.add_argument("--device-type")
parser.add_argument("--kernel-image-name")
parser.add_argument("--kernel-image-type", nargs='?', default="")
parser.add_argument("--gpu-version")
parser.add_argument("--boot-method")
parser.add_argument("--lava-tags", nargs='?', default="")
parser.add_argument("--env-vars", nargs='?', default="")
parser.add_argument("--deqp-version")
parser.add_argument("--arch")
parser.add_argument("--ci-node-index")
parser.add_argument("--ci-node-total")
args = parser.parse_args()
env = Environment(loader = FileSystemLoader(os.path.dirname(args.template)), trim_blocks=True, lstrip_blocks=True)
template = env.get_template(os.path.basename(args.template))
values = {}
values['pipeline_info'] = args.pipeline_info
values['base_artifacts_url'] = args.base_artifacts_url
values['device_type'] = args.device_type
values['kernel_image_name'] = args.kernel_image_name
values['kernel_image_type'] = args.kernel_image_type
values['gpu_version'] = args.gpu_version
values['boot_method'] = args.boot_method
values['tags'] = args.lava_tags
values['env_vars'] = args.env_vars
values['deqp_version'] = args.deqp_version
values['arch'] = args.arch
values['ci_node_index'] = args.ci_node_index
values['ci_node_total'] = args.ci_node_total
f = open('lava-deqp.yml', "w")
f.write(template.render(values))
f.close()

View File

@@ -0,0 +1,89 @@
job_name: mesa-deqp-{{ gpu_version }} {{ pipeline_info }}
device_type: {{ device_type }}
timeouts:
job:
minutes: 40
action:
minutes: 10
actions:
power-off:
seconds: 30
priority: 75
visibility: public
{% if tags %}
{% set lavatags = tags.split(',') %}
tags:
{% for tag in lavatags %}
- {{ tag }}
{% endfor %}
{% endif %}
actions:
- deploy:
timeout:
minutes: 10
to: tftp
kernel:
url: {{ base_artifacts_url }}/{{ kernel_image_name }}
{% if kernel_image_type %}
{{ kernel_image_type }}
{% endif %}
ramdisk:
url: {{ base_artifacts_url }}/lava-rootfs-{{ arch }}.cpio.gz
compression: gz
dtb:
url: {{ base_artifacts_url }}/{{ device_type }}.dtb
os: oe
- boot:
timeout:
minutes: 5
method: {{ boot_method }}
commands: ramdisk
prompts:
- '#'
- test:
timeout:
minutes: 60
definitions:
- repository:
metadata:
format: Lava-Test Test Definition 1.0
name: deqp
description: "Mesa dEQP test plan"
os:
- oe
scope:
- functional
run:
steps:
- mount -t proc none /proc
- mount -t sysfs none /sys
- mount -t devtmpfs none /dev
- mkdir -p /dev/pts
- mount -t devpts devpts /dev/pts
{% if env_vars %}
- export {{ env_vars }}
{% endif %}
# deqp-runner.sh assumes some stuff is in pwd
- cd /
- export DEQP_NO_SAVE_RESULTS=1
- 'export DEQP_RUNNER_OPTIONS="--compact-display false --shuffle false"'
- export DEQP_EXPECTED_FAILS=deqp-{{ gpu_version }}-fails.txt
- export DEQP_SKIPS=deqp-{{ gpu_version }}-skips.txt
- export DEQP_VER={{ deqp_version }}
- export LIBGL_DRIVERS_PATH=`pwd`/install/lib/dri
- export CI_NODE_INDEX={{ ci_node_index }}
- export CI_NODE_TOTAL={{ ci_node_total }}
- "if sh /install/deqp-runner.sh; then
echo 'deqp: pass';
else
echo 'deqp: fail';
fi"
parse:
pattern: '(?P<test_case_id>\S*):\s+(?P<result>(pass|fail))'
from: inline
name: deqp
path: inline/mesa-deqp.yaml

View File

@@ -0,0 +1,136 @@
.lava-test:
extends:
- .ci-run-policy
stage: test
variables:
GIT_STRATEGY: none # testing doesn't build anything from source
ENV_VARS: "MESA_GLES_VERSION_OVERRIDE=3.0 DEQP_PARALLEL=6"
script:
- mkdir -p /srv/${FILES_HOST_NAME}/$CI_JOB_ID/
- cp /lava-files/${KERNEL_IMAGE_NAME} /srv/${FILES_HOST_NAME}/$CI_JOB_ID/.
- cp /lava-files/${DEVICE_TYPE}.dtb /srv/${FILES_HOST_NAME}/$CI_JOB_ID/.
- tar -C /lava-files/rootfs-${ARCH} -xf artifacts/install.tar
- pushd /lava-files/rootfs-${ARCH}
- find -H | cpio -H newc -o | gzip -c - > /srv/${FILES_HOST_NAME}/$CI_JOB_ID/lava-rootfs-${ARCH}.cpio.gz
- popd
- >
artifacts/generate_lava.py \
--template artifacts/lava-deqp.yml.jinja2 \
--pipeline-info "$CI_PIPELINE_URL on $CI_COMMIT_REF_NAME ${CI_NODE_INDEX}/${CI_NODE_TOTAL}" \
--base-artifacts-url ${FILES_HOST_URL}/$CI_JOB_ID \
--device-type ${DEVICE_TYPE} \
--env-vars "${ENV_VARS}" \
--arch ${ARCH} \
--deqp-version gles2 \
--kernel-image-name ${KERNEL_IMAGE_NAME} \
--kernel-image-type "${KERNEL_IMAGE_TYPE}" \
--gpu-version ${GPU_VERSION} \
--boot-method ${BOOT_METHOD} \
--lava-tags "${LAVA_TAGS}" \
--ci-node-index "${CI_NODE_INDEX}" \
--ci-node-total "${CI_NODE_TOTAL}"
- lava_job_id=`lavacli jobs submit lava-deqp.yml` || lavacli jobs submit lava-deqp.yml
- echo $lava_job_id
- rm -rf artifacts/*
- cp lava-deqp.yml artifacts/.
- lavacli jobs logs $lava_job_id | grep -a -v "{'case':" | tee artifacts/lava-deqp-$lava_job_id.log
- lavacli jobs show $lava_job_id
- result=`lavacli results $lava_job_id 0_deqp deqp | head -1`
- echo $result
- '[[ "$result" == "pass" ]]'
after_script:
- rm -rf /srv/${FILES_HOST_NAME}/$CI_JOB_ID/
artifacts:
when: always
paths:
- artifacts/
.lava-test:armhf:
variables:
ARCH: armhf
KERNEL_IMAGE_NAME: zImage
KERNEL_IMAGE_TYPE: "type:\ zimage"
BOOT_METHOD: u-boot
extends:
- .lava-test
- .use-arm_build
dependencies:
- meson-armhf
needs:
- meson-armhf
.lava-test:arm64:
variables:
ARCH: arm64
KERNEL_IMAGE_NAME: Image
KERNEL_IMAGE_TYPE: "type:\ image"
BOOT_METHOD: u-boot
extends:
- .lava-test
- .use-arm_build
dependencies:
- meson-arm64
needs:
- meson-arm64
panfrost-t720-test:arm64:
extends: .lava-test:arm64
variables:
DEVICE_TYPE: sun50i-h6-pine-h64
GPU_VERSION: panfrost-t720
FILES_HOST_NAME: "mesa-ci-lava-files.freedesktop.org"
FILES_HOST_URL: "https://mesa-ci-lava-files.freedesktop.org"
tags:
- mesa-ci-aarch64-lava-collabora
panfrost-t760-test:armhf:
extends: .lava-test:armhf
variables:
DEVICE_TYPE: rk3288-veyron-jaq
GPU_VERSION: panfrost-t760
BOOT_METHOD: depthcharge
KERNEL_IMAGE_TYPE: ""
FILES_HOST_NAME: "mesa-ci-lava-files.freedesktop.org"
FILES_HOST_URL: "https://mesa-ci-lava-files.freedesktop.org"
tags:
- mesa-ci-aarch64-lava-collabora
panfrost-t860-test:arm64:
extends: .lava-test:arm64
variables:
DEVICE_TYPE: rk3399-gru-kevin
GPU_VERSION: panfrost-t860
BOOT_METHOD: depthcharge
KERNEL_IMAGE_TYPE: ""
FILES_HOST_NAME: "mesa-ci-lava-files.freedesktop.org"
FILES_HOST_URL: "https://mesa-ci-lava-files.freedesktop.org"
tags:
- mesa-ci-aarch64-lava-collabora
.panfrost-t820-gles2:arm64:
extends: .lava-test:arm64
variables:
DEVICE_TYPE: meson-gxm-khadas-vim2
GPU_VERSION: panfrost-t820
LAVA_TAGS: panfrost
tags:
- mesa-ci-aarch64-lava-baylibre
.lima-mali400-test:armhf:
parallel: 2
extends: .lava-test:armhf
variables:
DEVICE_TYPE: sun8i-h3-libretech-all-h3-cc
GPU_VERSION: lima
ENV_VARS: "DEQP_PARALLEL=3"
tags:
- mesa-ci-aarch64-lava-baylibre
.lima-mali450-test:arm64:
extends: .lava-test:arm64
variables:
DEVICE_TYPE: meson-gxl-s905x-libretech-cc
GPU_VERSION: lima
ENV_VARS: "DEQP_PARALLEL=6"
tags:
- mesa-ci-aarch64-lava-baylibre

View File

@@ -0,0 +1,13 @@
call "C:\Program Files (x86)\Microsoft Visual Studio\%VERSION%\Common7\Tools\VsDevCmd.bat" -arch=%ARCH%
del /Q /S _build
meson _build ^
-Dbuild-tests=true ^
-Db_vscrt=mtd ^
-Dbuildtype=release ^
-Dllvm=false ^
-Dgallium-drivers=swrast ^
-Dosmesa=gallium
meson configure _build
ninja -C _build
ninja -C _build test

View File

@@ -3,20 +3,47 @@
set -e
set -o xtrace
CROSS_FILE=/cross_file-"$CROSS".txt
# We need to control the version of llvm-config we're using, so we'll
# generate a native file to do so. This requires meson >=0.49
# tweak the cross file or generate a native file to do so.
if test -n "$LLVM_VERSION"; then
LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file
if [ -n "$CROSS" ]; then
sed -i -e '/\[binaries\]/a\' -e "llvm-config = '`which $LLVM_CONFIG`'" $CROSS_FILE
fi
$LLVM_CONFIG --version
else
rm -f native.file
touch native.file
fi
# cross-xfail-$CROSS, if it exists, contains a list of tests that are expected
# to fail for the $CROSS configuration, one per line. you can then mark those
# tests in their meson.build with:
#
# test(...,
# should_fail: meson.get_cross_property('xfail', '').contains(t),
# )
#
# where t is the name of the test, and the '' is the string to search when
# not cross-compiling (which is empty, because for amd64 everything is
# expected to pass).
if [ -n "$CROSS" ]; then
CROSS_XFAIL=.gitlab-ci/cross-xfail-"$CROSS"
if [ -s "$CROSS_XFAIL" ]; then
sed -i \
-e '/\[properties\]/a\' \
-e "xfail = '$(tr '\n' , < $CROSS_XFAIL)'" \
"$CROSS_FILE"
fi
fi
rm -rf _build
meson _build --native-file=native.file \
${CROSS+--cross /cross_file-$CROSS.txt} \
--wrap-mode=nofallback \
${CROSS+--cross "$CROSS_FILE"} \
-D prefix=`pwd`/install \
-D libdir=lib \
-D buildtype=${BUILDTYPE:-debug} \
@@ -35,28 +62,3 @@ ninja -j4
LC_ALL=C.UTF-8 ninja test
ninja install
cd ..
if test -n "$MESON_SHADERDB"; then
./.gitlab-ci/run-shader-db.sh;
fi
# Delete 2MB of includes from artifacts.
rm -rf install/include
# Strip the drivers in the artifacts to cut 80% of the artifacts size.
if [ -n "$CROSS" ]; then
STRIP=`sed -n -E "s/strip\s*=\s*'(.*)'/\1/p" /cross_file-$CROSS.txt`
if [ -z "$STRIP" ]; then
echo "Failed to find strip command in cross file"
exit 1
fi
else
STRIP="strip"
fi
find install -name \*.so -exec $STRIP {} \;
# Test runs don't pull down the git tree, so put the dEQP helper
# script and associated bits there.
mkdir -p artifacts/
cp -Rp .gitlab-ci/deqp* artifacts/
# cp -Rp src/freedreno/ci/expected* artifacts/

View File

@@ -0,0 +1,36 @@
diff --git a/generated_tests/CMakeLists.txt b/generated_tests/CMakeLists.txt
index 738526546..6f89048cd 100644
--- a/generated_tests/CMakeLists.txt
+++ b/generated_tests/CMakeLists.txt
@@ -206,11 +206,6 @@ piglit_make_generated_tests(
templates/gen_variable_index_write_tests/vs.shader_test.mako
templates/gen_variable_index_write_tests/fs.shader_test.mako
templates/gen_variable_index_write_tests/helpers.mako)
-piglit_make_generated_tests(
- vs_in_fp64.list
- gen_vs_in_fp64.py
- templates/gen_vs_in_fp64/columns.shader_test.mako
- templates/gen_vs_in_fp64/regular.shader_test.mako)
piglit_make_generated_tests(
shader_framebuffer_fetch_tests.list
gen_shader_framebuffer_fetch_tests.py)
@@ -279,7 +274,6 @@ add_custom_target(gen-gl-tests
gen_extensions_defined.list
vp-tex.list
variable_index_write_tests.list
- vs_in_fp64.list
gpu_shader4_tests.list
)
diff --git a/tests/sanity.py b/tests/sanity.py
index 12f1614c9..9019087e2 100644
--- a/tests/sanity.py
+++ b/tests/sanity.py
@@ -100,7 +100,6 @@ shader_tests = (
'spec/arb_tessellation_shader/execution/barrier-patch.shader_test',
'spec/arb_tessellation_shader/execution/built-in-functions/tcs-any-bvec4-using-if.shader_test',
'spec/arb_tessellation_shader/execution/sanity.shader_test',
- 'spec/arb_vertex_attrib_64bit/execution/vs_in/vs-input-uint_uvec4-double_dmat3x4_array2-position.shader_test',
'spec/glsl-1.50/execution/geometry-basic.shader_test',
'spec/oes_viewport_array/viewport-gs-write-simple.shader_test',
)

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

29
.gitlab-ci/piglit/run.sh Executable file
View File

@@ -0,0 +1,29 @@
#!/bin/bash
set -e
set -o xtrace
VERSION=`cat install/VERSION`
cd /piglit
PIGLIT_OPTIONS=$(echo $PIGLIT_OPTIONS | head -n 1)
xvfb-run --server-args="-noreset" sh -c \
"export LD_LIBRARY_PATH=$OLDPWD/install/lib;
wflinfo --platform glx --api gl --profile core | grep \"Mesa $VERSION\\\$\" &&
./piglit run -j4 $PIGLIT_OPTIONS $PIGLIT_PROFILES $OLDPWD/results"
PIGLIT_RESULTS=${PIGLIT_RESULTS:-$PIGLIT_PROFILES}
mkdir -p .gitlab-ci/piglit
cp $OLDPWD/install/piglit/$PIGLIT_RESULTS.txt .gitlab-ci/piglit/$PIGLIT_RESULTS.txt.baseline
./piglit summary console $OLDPWD/results | head -n -1 | grep -v ": pass" >.gitlab-ci/piglit/$PIGLIT_RESULTS.txt
if diff -q .gitlab-ci/piglit/$PIGLIT_RESULTS.txt{.baseline,}; then
exit 0
fi
./piglit summary html --exclude-details=pass $OLDPWD/summary $OLDPWD/results
echo Unexpected change in results:
diff -u .gitlab-ci/piglit/$PIGLIT_RESULTS.txt{.baseline,}
exit 1

39
.gitlab-ci/prepare-artifacts.sh Executable file
View File

@@ -0,0 +1,39 @@
#!/bin/bash
set -e
set -o xtrace
CROSS_FILE=/cross_file-"$CROSS".txt
# Delete unused bin and includes from artifacts to save space.
rm -rf install/bin install/include
# Strip the drivers in the artifacts to cut 80% of the artifacts size.
if [ -n "$CROSS" ]; then
STRIP=`sed -n -E "s/strip\s*=\s*'(.*)'/\1/p" "$CROSS_FILE"`
if [ -z "$STRIP" ]; then
echo "Failed to find strip command in cross file"
exit 1
fi
else
STRIP="strip"
fi
find install -name \*.so -exec $STRIP {} \;
# Test runs don't pull down the git tree, so put the dEQP helper
# script and associated bits there.
cp VERSION install/
cp -Rp .gitlab-ci/deqp* install/
cp -Rp .gitlab-ci/piglit install/
# Tar up the install dir so that symlinks and hardlinks aren't each
# packed separately in the zip file.
mkdir -p artifacts/
tar -cf artifacts/install.tar install
# If the container has LAVA stuff, prepare the artifacts for LAVA jobs
if [ -d /lava-files ]; then
# Pass needed files to the test stage
cp $CI_PROJECT_DIR/.gitlab-ci/generate_lava.py artifacts/.
cp $CI_PROJECT_DIR/.gitlab-ci/lava-deqp.yml.jinja2 artifacts/.
fi

12
.gitlab-ci/scons-build.sh Executable file
View File

@@ -0,0 +1,12 @@
#!/bin/bash
set -e
set -o xtrace
if test -n "$LLVM_VERSION"; then
export LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
fi
rm -rf build
scons $SCONS_TARGET force_scons=on
eval $SCONS_CHECK_COMMAND

View File

@@ -0,0 +1,20 @@
[binaries]
c = ['ccache', 'x86_64-w64-mingw32-gcc']
cpp = ['ccache', 'x86_64-w64-mingw32-g++']
ar = 'x86_64-w64-mingw32-ar'
strip = 'x86_64-w64-mingw32-strip'
pkgconfig = '/usr/local/bin/x86_64-w64-mingw32-pkg-config'
windres = 'x86_64-w64-mingw32-windres'
exe_wrapper = ['wine64']
[properties]
needs_exe_wrapper = True
sys_root = '/usr/x86_64-w64-mingw32/'
[host_machine]
system = 'windows'
cpu_family = 'x86_64'
cpu = 'x86_64'
endian = 'little'
; vim: ft=dosini

View File

@@ -26,6 +26,8 @@ Alexander Monakov <amonakov@gmail.com> <amonakov@ispras.ru>
Alexander von Gluck IV <kallisti5@unixzen.com> Alexander von Gluck <kallisti5@unixzen.com>
Alexandros Frantzis <alexandros.frantzis@collabora.com> <Alexandros.Frantzis@canonical.com>
Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.prom.eng.vmware.com>
Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.vmware.com>
@@ -50,6 +52,8 @@ Andrew Randrianasulu <randrianasulu@gmail.com> <randrik@mail.ru>
Arthur Huillet <arthur.huillet@free.fr> Arthur HUILLET <arthur.huillet@free.fr>
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> <basni@chromium.org>
Benjamin Franzke <benjaminfranzke@googlemail.com> ben <benjaminfranzke@googlemail.com>
Ben Skeggs <bskeggs@redhat.com> <darktama@beleth.(none)>
@@ -129,8 +133,8 @@ David Miller <davem@davemloft.net> David S. Miller <davem@davemloft.net>
David Miller <davem@davemloft.net> Dave Miller <davem@davemloft.net>
David Miller <davem@davemloft.net> davem69 <davem69>
David Heidelberger <david.heidelberger@ixit.cz> David Heidelberg <david@ixit.cz>
David Heidelberger <david.heidelberger@ixit.cz> <d.okias@gmail.com>
David Heidelberg <david@ixit.cz> David Heidelberger <david.heidelberger@ixit.cz>
David Heidelberg <david@ixit.cz> <d.okias@gmail.com>
David Reveman <reveman@chromium.org> <c99drn@cs.umu.se>
@@ -142,6 +146,8 @@ Dylan Baker <dylanx.c.baker@intel.com> <baker.dylan.c@gmail.com>
Edward O'Callaghan <funfunctor@folklore1984.net> <eocallaghan@alterapraxis.com>
Elie Tournier <tournier.elie@gmail.com>
Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
@@ -154,6 +160,7 @@ Emil Velikov <emil.l.velikov@gmail.com> <emmil.velikov@collabora.com>
Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
Eric Engestrom <eric@engestrom.ch> <eric.engestrom@imgtec.com>
Eric Engestrom <eric@engestrom.ch> <eric.engestrom@intel.com>
Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
@@ -162,10 +169,14 @@ Fabian Bieler <der.fabe@gmx.net> <&lt;der.fabe@gmx.net&gt>
Feng, Haitao <haitao.feng@intel.com> Haitao Feng <haitao.feng@intel.com>
Frank Binns <frank.binns@imgtec.com> <francisbinns@gmail.com>
Frank Henigman <fjhenigman@google.com> <fjhenigman@chromium.org>
George Sapountzis <gsapountzis@gmail.com> George Sapountzis <gsap7@yahoo.gr>
Gert Wollny <gert.wollny@collabora.com> <gw.fossdev@gmail.com>
Gwenole Beauchesne <gwenole.beauchesne@intel.com> <gb.devel@gmail.com>
Hamish Marson <hmarson@users.sourceforge.net> hmarson <hmarson>
@@ -184,6 +195,8 @@ Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.(none)>
Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.walkyrie.se>
Jakob Bornecrantz <wallbraker@gmail.com> <jakob@tungstengraphics.com>
Jakob Bornecrantz <wallbraker@gmail.com> <wallbraker 'at' gmail 'dot' com>
Jakob Bornecrantz <wallbraker@gmail.com> <jakob.bornecrantz@collabora.com>
Jakob Bornecrantz <wallbraker@gmail.com> <jakob@collabora.com>
Jakub Bogusz <qboosh@pld-linux.org> <gboosh@pld-linux.org>
@@ -328,6 +341,7 @@ Michel Dänzer <michel@daenzer.net> <daenzer@vmware.com>
Michel Dänzer <michel@daenzer.net> <michel@tungstengraphics.com>
Michel Dänzer <michel@daenzer.net> Michel Daenzer <michel.daenzer@amd.com>
Michel Dänzer <michel@daenzer.net> Michel Daenzer <daenzer@localhost.(none)>
Michel Dänzer <michel@daenzer.net> <mdaenzer@redhat.com>
Mike Kaplinskiy <mike.kaplinskiy@gmail.com> Mike Kaplinksiy <mike.kaplinskiy@gmail.com>
Mike Kaplinskiy <mike.kaplinskiy@gmail.com> <mike.kaplinskiy@gmai.com>
@@ -453,6 +467,8 @@ Tom Fogal <tfogal@alumni.unh.edu> <tfogal@sci.utah.edu>
Tom Stellard <thomas.stellard@amd.com> <tstellar@gmail.com>
Tom Stellard <thomas.stellard@amd.com> Thomas Stellard <tom.stellard@amd.com>
Tomeu Vizoso <tomeu.vizoso@collabora.com> <tomeu@tomeuvizoso.net>
Tormod Volden <debian.tormod@gmail.com> <lists.tormod@gmail.com>
Török Edwin <edwin+mesa@etorok.net> Török Edvin <edwintorok@gmail.com>

44300
.pick_status.json Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -19,14 +19,15 @@ matrix:
before_install:
- HOMEBREW_NO_AUTO_UPDATE=1 brew install expat gettext
- if test "x$BUILD" = xmeson; then
HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja;
HOMEBREW_NO_AUTO_UPDATE=1 brew install ninja;
fi
- if test "x$BUILD" = xscons; then
HOMEBREW_NO_AUTO_UPDATE=1 brew install python2 scons;
HOMEBREW_NO_AUTO_UPDATE=1 brew install scons;
fi
# Set PATH for homebrew pip3 installs
- PATH="$HOME/Library/Python/3.6/bin:${PATH}"
- PYTHON_VERSION=$(python3 -V | awk '{print $2}' | cut -d. -f1-2)
- PATH="$HOME/Library/Python/$PYTHON_VERSION/bin:${PATH}"
# Set PKG_CONFIG_PATH for keg-only expat
- PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"
# Set PATH for keg-only gettext
@@ -53,10 +54,11 @@ install:
script:
- if test "x$BUILD" = xmeson; then
meson _build -Dbuild-tests=true;
ninja -C _build;
ninja -C _build test;
ninja -C _build || travis_terminate 1;
ninja -C _build test || travis_terminate 1;
ninja -C _build install || travis_terminate 1;
fi
- if test "x$BUILD" = xscons; then
scons;
scons check;
scons force_scons=1 || travis_terminate 1;
scons force_scons=1 check || travis_terminate 1;
fi

View File

@@ -39,7 +39,7 @@ LOCAL_CFLAGS += \
-Wno-initializer-overrides \
-Wno-mismatched-tags \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"
-DPACKAGE_BUGREPORT=\"https://gitlab.freedesktop.org/mesa/mesa/-/issues\"
# XXX: The following __STDC_*_MACROS defines should not be needed.
# It's likely due to a bug elsewhere, but let's temporarily add them

View File

@@ -24,7 +24,7 @@
# BOARD_GPU_DRIVERS should be defined. The valid values are
#
# classic drivers: i915 i965
# gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv iris lima
# gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv iris lima panfrost
#
# The main target is libGLES_mesa. For each classic driver enabled, a DRI
# module will also be built. DRI modules will be loaded by libGLES_mesa.
@@ -43,6 +43,7 @@ MESA_DRI_LDFLAGS := -Wl,--build-id=sha1
MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
MESA_PYTHON2 := python
MESA_PYTHON3 := python3
# Lists to convert driver names to boolean variables
# in form of <driver name>.<boolean make variable>
@@ -61,7 +62,8 @@ gallium_drivers := \
virgl.HAVE_GALLIUM_VIRGL \
etnaviv.HAVE_GALLIUM_ETNAVIV \
iris.HAVE_GALLIUM_IRIS \
lima.HAVE_GALLIUM_LIMA
lima.HAVE_GALLIUM_LIMA \
panfrost.HAVE_GALLIUM_PANFROST
ifeq ($(BOARD_GPU_DRIVERS),all)
MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))
@@ -88,21 +90,15 @@ MESA_ENABLE_LLVM := true
endif
define mesa-build-with-llvm
$(if $(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5), \
$(if $(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7), \
$(warning Unsupported LLVM version in Android $(MESA_ANDROID_MAJOR_VERSION)),) \
$(if $(filter 6,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_STRING=\"3.7\")) \
$(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_STRING=\"3.8\")) \
$(if $(filter 8,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_STRING=\"3.9\")) \
$(if $(filter P,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_STRING=\"3.9\")) \
$(eval LOCAL_CFLAGS += -DLLVM_AVAILABLE -DMESA_LLVM_VERSION_STRING=\"3.9\") \
$(eval LOCAL_SHARED_LIBRARIES += libLLVM)
endef
# add subdirectories
SUBDIRS := \
src/etnaviv \
src/freedreno \
src/gbm \
src/loader \

View File

@@ -56,5 +56,4 @@ Contributions are welcome, and step-by-step instructions can be found in our
documentation (`docs/submittingpatches.html
<https://mesa3d.org/submittingpatches.html>`_).
Note that Mesa uses email mailing-lists for patches submission, review and
discussions.
Note that Mesa uses gitlab for patches submission, review and discussions.

View File

@@ -1,30 +1,11 @@
Overview:
This file is similar in syntax (or more precisly a subset) of what is
used by the MAINTAINERS file in the linux kernel. Some fields do not
apply, for example, in all cases, send patches to:
mesa-dev@lists.freedesktop.org
and in all cases the patchwork instance is:
https://patchwork.freedesktop.org/project/mesa/
used by the MAINTAINERS file in the linux kernel.
The purpose is not exactly the same the MAINTAINERS file in the linux
kernel, as there are not official/formal maintainers of different
subsystems in mesa, but is meant to give an idea of who to CC for
various patches for review, and to allow the use of
scripts/get_reviewer.pl as git --cc-cmd.
Usage:
When sending patches:
git send-email --cc-cmd ./scripts/get_reviewer.pl ...
Or to configure as default:
git config sendemail.cccmd ./scripts/get_reviewer.pl
various patches for review.
Descriptions of section entries:
@@ -36,14 +17,6 @@ Descriptions of section entries:
F: drivers/net/* all files in drivers/net, but not below
F: */net/* all files in "any top level directory"/net
One pattern per line. Multiple F: lines acceptable.
N: Files and directories with regex patterns.
N: [^a-z]tegra all files whose path contains the word tegra
One pattern per line. Multiple N: lines acceptable.
scripts/get_maintainer.pl has different behavior for files that
match F: pattern and matches of N: patterns. By default,
get_maintainer will not look at git log history when an F: pattern
match occurs. When an N: match occurs, git log history is used
to also notify the people that have git commit signatures.
Maintainers List (try to look for most precise areas first)
@@ -135,3 +108,13 @@ VULKAN
R: Eric Engestrom <eric@engestrom.ch>
F: src/vulkan/
F: include/vulkan/
VMWARE DRIVER
R: Brian Paul <brianp@vmware.com>
R: Charmaine Lee <charmainel@vmware.com>
F: src/gallium/drivers/svga/
VMWARE WINSYS CODE
R: Thomas Hellstrom <thellstrom@vmware.com>
R: Deepak Rawat <drawat@vmware.com>
F: src/gallium/winsys/svga/

View File

@@ -20,6 +20,7 @@
# to get the full list of options. See scons manpage for more info.
#
from __future__ import print_function
import os
import os.path
import sys
@@ -66,6 +67,26 @@ else:
Help(opts.GenerateHelpText(env))
#######################################################################
# Print a deprecation warning for using scons on non-windows
if common.host_platform != 'windows' and env['platform'] != 'windows':
if env['force_scons']:
print("WARNING: Scons is deprecated for non-windows platforms (including cygwin) "
"please use meson instead.", file=sys.stderr)
else:
print("ERROR: Scons is deprecated for non-windows platforms (including cygwin) "
"please use meson instead. If you really need to use scons you "
"can add `force_scons=1` to the scons command line.", file=sys.stderr)
sys.exit(1)
else:
print("WARNING: Scons support is in the process of being deprecated on "
"on windows platforms (including mingw). If you haven't already "
"please try using meson for windows builds. Be sure to report any "
"issues you run into", file=sys.stderr)
#######################################################################
# Environment setup
@@ -73,7 +94,7 @@ with open("VERSION") as f:
mesa_version = f.read().strip()
env.Append(CPPDEFINES = [
('PACKAGE_VERSION', '\\"%s\\"' % mesa_version),
('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),
('PACKAGE_BUGREPORT', '\\"https://gitlab.freedesktop.org/mesa/mesa/-/issues\\"'),
])
# Includes

View File

@@ -1 +1 @@
19.2.0-devel
20.0.8

View File

@@ -38,6 +38,7 @@ cache:
- '%LOCALAPPDATA%\pip\Cache -> appveyor.yml'
- win_flex_bison-2.5.15.zip
- llvm-5.0.1-msvc2017-mtd.7z
- subprojects\packagecache -> subprojects\*.wrap
os: Visual Studio 2017
@@ -49,41 +50,21 @@ init:
environment:
WINFLEXBISON_VERSION: 2.5.15
LLVM_ARCHIVE: llvm-5.0.1-msvc2017-mtd.7z
matrix:
- compiler: msvc
buildsystem: scons
- compiler: msvc
buildsystem: meson
path: C:\Python38-x64;C:\Python38-x64\Scripts;%path%
install:
# Check git config
- git config core.autocrlf
# Check pip
- python --version
- python -m pip --version
# Install Mako
- python -m pip install Mako==1.0.7
# Install pywin32 extensions, needed by SCons
- python -m pip install pypiwin32
# Install python wheels, necessary to install SCons via pip
- python -m pip install wheel
# Install SCons
- python -m pip install scons==3.0.1
- scons --version
# Install flex/bison
- set WINFLEXBISON_ARCHIVE=win_flex_bison-%WINFLEXBISON_VERSION%.zip
- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://github.com/lexxmark/winflexbison/releases/download/v%WINFLEXBISON_VERSION%/%WINFLEXBISON_ARCHIVE%"
- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul
- set Path=%CD%\winflexbison;%Path%
- win_flex --version
- win_bison --version
# Download and extract LLVM
- if not exist "%LLVM_ARCHIVE%" appveyor DownloadFile "https://people.freedesktop.org/~jrfonseca/llvm/%LLVM_ARCHIVE%"
- 7z x -y "%LLVM_ARCHIVE%" > nul
- mkdir llvm\bin
- set LLVM=%CD%\llvm
- cmd: .appveyor\appveyor_msvc.bat install
build_script:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1
after_build:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1 check
- cmd: .appveyor\appveyor_msvc.bat build_script
test_script:
- cmd: .appveyor\appveyor_msvc.bat test_script
# It's possible to setup notification here, as described in
# http://www.appveyor.com/docs/notifications#appveyor-yml-configuration , but

0
bin/__init__.py Normal file
View File

View File

@@ -1,35 +0,0 @@
#!/bin/sh
# This script is used to generate the list of fixed bugs that
# appears in the release notes files, with HTML formatting.
#
# Note: This script could take a while until all details have
# been fetched from bugzilla.
#
# Usage examples:
#
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes
# regex pattern: trim before bug number
trim_before='s/.*show_bug.cgi?id=\([0-9]*\).*/\1/'
# regex pattern: reconstruct the url
use_after='s,^,https://bugs.freedesktop.org/show_bug.cgi?id=,'
echo "<ul>"
echo ""
# extract fdo urls from commit log
git log --pretty=medium $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\
while read url
do
id=$(echo $url | cut -d'=' -f2)
summary=$(wget --quiet -O - $url | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')
echo "<li><a href=\"$url\">Bug $id</a> - $summary</li>"
echo ""
done
echo "</ul>"

272
bin/gen_release_notes.py Executable file
View File

@@ -0,0 +1,272 @@
#!/usr/bin/env python3
# Copyright © 2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Generates release notes for a given version of mesa."""
import asyncio
import datetime
import os
import pathlib
import sys
import textwrap
import typing
import urllib.parse
import aiohttp
from mako.template import Template
from mako import exceptions
CURRENT_GL_VERSION = '4.6'
CURRENT_VK_VERSION = '1.2'
TEMPLATE = Template(textwrap.dedent("""\
<%!
import html
%>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa ${next_version} Release Notes / ${today}</h1>
<p>
%if not bugfix:
Mesa ${next_version} is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa ${next_version[:-1]}1.
%else:
Mesa ${next_version} is a bug fix release which fixes bugs found since the ${version} release.
%endif
</p>
<p>
Mesa ${next_version} implements the OpenGL ${gl_version} API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL ${gl_version}. OpenGL
${gl_version} is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa ${next_version} implements the Vulkan ${vk_version} API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
%for f in features:
<li>${html.escape(f)}</li>
%endfor
</ul>
<h2>Bug fixes</h2>
<ul>
%for b in bugs:
<li>${html.escape(b)}</li>
%endfor
</ul>
<h2>Changes</h2>
<ul>
%for c, author in changes:
%if author:
<p>${html.escape(c)}</p>
%else:
<li>${html.escape(c)}</li>
%endif
%endfor
</ul>
</div>
</body>
</html>
"""))
async def gather_commits(version: str) -> str:
p = await asyncio.create_subprocess_exec(
'git', 'log', '--oneline', f'mesa-{version}..', '--grep', r'Closes: \(https\|#\).*',
stdout=asyncio.subprocess.PIPE)
out, _ = await p.communicate()
assert p.returncode == 0, f"git log didn't work: {version}"
return out.decode().strip()
async def gather_bugs(version: str) -> typing.List[str]:
commits = await gather_commits(version)
issues: typing.List[str] = []
for commit in commits.split('\n'):
sha, message = commit.split(maxsplit=1)
p = await asyncio.create_subprocess_exec(
'git', 'log', '--max-count', '1', r'--format=%b', sha,
stdout=asyncio.subprocess.PIPE)
_out, _ = await p.communicate()
out = _out.decode().split('\n')
for line in reversed(out):
if line.startswith('Closes:'):
bug = line.lstrip('Closes:').strip()
break
else:
raise Exception('No closes found?')
if bug.startswith('h'):
# This means we have a bug in the form "Closes: https://..."
issues.append(os.path.basename(urllib.parse.urlparse(bug).path))
else:
issues.append(bug.lstrip('#'))
loop = asyncio.get_event_loop()
async with aiohttp.ClientSession(loop=loop) as session:
results = await asyncio.gather(*[get_bug(session, i) for i in issues])
typing.cast(typing.Tuple[str, ...], results)
return list(results)
async def get_bug(session: aiohttp.ClientSession, bug_id: str) -> str:
"""Query gitlab to get the name of the issue that was closed."""
# Mesa's gitlab id is 176,
url = 'https://gitlab.freedesktop.org/api/v4/projects/176/issues'
params = {'iids[]': bug_id}
async with session.get(url, params=params) as response:
content = await response.json()
return content[0]['title']
async def get_shortlog(version: str) -> str:
"""Call git shortlog."""
p = await asyncio.create_subprocess_exec('git', 'shortlog', f'mesa-{version}..',
stdout=asyncio.subprocess.PIPE)
out, _ = await p.communicate()
assert p.returncode == 0, 'error getting shortlog'
assert out is not None, 'just for mypy'
return out.decode()
def walk_shortlog(log: str) -> typing.Generator[typing.Tuple[str, bool], None, None]:
for l in log.split('\n'):
if l.startswith(' '): # this means we have a patch description
yield l, False
else:
yield l, True
def calculate_next_version(version: str, is_point: bool) -> str:
"""Calculate the version about to be released."""
if '-' in version:
version = version.split('-')[0]
if is_point:
base = version.split('.')
base[2] = str(int(base[2]) + 1)
return '.'.join(base)
return version
def calculate_previous_version(version: str, is_point: bool) -> str:
"""Calculate the previous version to compare to.
In the case of -rc to final that verison is the previous .0 release,
(19.3.0 in the case of 20.0.0, for example). for point releases that is
the last point release. This value will be the same as the input value
for a point release, but different for a major release.
"""
if '-' in version:
version = version.split('-')[0]
if is_point:
return version
base = version.split('.')
if base[1] == '0':
base[0] = str(int(base[0]) - 1)
base[1] = '3'
else:
base[1] = str(int(base[1]) - 1)
return '.'.join(base)
def get_features(is_point_release: bool) -> typing.Generator[str, None, None]:
p = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes' / 'new_features.txt'
if p.exists():
if is_point_release:
print("WARNING: new features being introduced in a point release", file=sys.stderr)
with p.open('rt') as f:
for line in f:
yield line
else:
yield "None"
async def main() -> None:
v = pathlib.Path(__file__).parent.parent / 'VERSION'
with v.open('rt') as f:
raw_version = f.read().strip()
is_point_release = '-rc' not in raw_version
assert '-devel' not in raw_version, 'Do not run this script on -devel'
version = raw_version.split('-')[0]
previous_version = calculate_previous_version(version, is_point_release)
next_version = calculate_next_version(version, is_point_release)
shortlog, bugs = await asyncio.gather(
get_shortlog(previous_version),
gather_bugs(previous_version),
)
final = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes' / f'{next_version}.html'
with final.open('wt') as f:
try:
f.write(TEMPLATE.render(
bugfix=is_point_release,
bugs=bugs,
changes=walk_shortlog(shortlog),
features=get_features(is_point_release),
gl_version=CURRENT_GL_VERSION,
next_version=next_version,
today=datetime.date.today(),
version=previous_version,
vk_version=CURRENT_VK_VERSION,
))
except:
print(exceptions.text_error_template().render())
if __name__ == "__main__":
loop = asyncio.get_event_loop()
loop.run_until_complete(main())

View File

@@ -0,0 +1,62 @@
# Copyright © 2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
from unittest import mock
import pytest
from .gen_release_notes import *
@pytest.mark.parametrize(
'current, is_point, expected',
[
('19.2.0', True, '19.2.1'),
('19.3.6', True, '19.3.7'),
('20.0.0-rc4', False, '20.0.0'),
])
def test_next_version(current: str, is_point: bool, expected: str) -> None:
assert calculate_next_version(current, is_point) == expected
@pytest.mark.parametrize(
'current, is_point, expected',
[
('19.3.6', True, '19.3.6'),
('20.0.0-rc4', False, '19.3.0'),
])
def test_previous_version(current: str, is_point: bool, expected: str) -> None:
assert calculate_previous_version(current, is_point) == expected
@pytest.mark.asyncio
async def test_get_shortlog():
# Certainly not perfect, but it's something
version = '19.2.0'
out = await get_shortlog(version)
assert out
@pytest.mark.asyncio
async def test_gather_commits():
# Certainly not perfect, but it's something
version = '19.2.0'
out = await gather_commits(version)
assert out

View File

@@ -32,7 +32,7 @@ is_sha_nomination()
{
fixes=`git show --pretty=medium -s $1 | tr -d "\n" | \
sed -e 's/'"$2"'/\nfixes:/Ig' | \
grep -Eo 'fixes:[a-f0-9]{8,40}'`
grep -Eo 'fixes:[a-f0-9]{4,40}'`
fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`
if test $fixes_count -eq 0; then
@@ -143,7 +143,7 @@ do
esac
printf "[ %8s ] " "$tag"
git --no-pager show --no-patch --oneline $sha
git --no-pager show --no-patch --pretty=oneline $sha
done
rm -f already_picked

View File

@@ -1,5 +1,6 @@
#!/usr/bin/env python3
# encoding=utf-8
# Copyright © 2017-2018 Intel Corporation
# Copyright 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal

View File

@@ -20,3 +20,4 @@
git_sha1_gen_py = files('git_sha1_gen.py')
symbols_check = find_program('symbols-check.py')
install_megadrivers_py = find_program('install_megadrivers.py')

33
bin/pick-ui.py Executable file
View File

@@ -0,0 +1,33 @@
#!/usr/bin/env python3
# Copyright © 2019-2020 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
import asyncio
import urwid
from pick.ui import UI, PALETTE
if __name__ == "__main__":
u = UI()
evl = urwid.AsyncioEventLoop(loop=asyncio.get_event_loop())
loop = urwid.MainLoop(u.render(), PALETTE, event_loop=evl)
u.mainloop = loop
loop.run()

0
bin/pick/__init__.py Normal file
View File

367
bin/pick/core.py Normal file
View File

@@ -0,0 +1,367 @@
# Copyright © 2019-2020 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Core data structures and routines for pick."""
import asyncio
import enum
import json
import pathlib
import re
import typing
import attr
if typing.TYPE_CHECKING:
from .ui import UI
import typing_extensions
class CommitDict(typing_extensions.TypedDict):
sha: str
description: str
nomintated: bool
nomination_type: typing.Optional[int]
resolution: typing.Optional[int]
master_sha: typing.Optional[str]
IS_FIX = re.compile(r'^\s*fixes:\s*([a-f0-9]{6,40})', flags=re.MULTILINE | re.IGNORECASE)
# FIXME: I dislike the duplication in this regex, but I couldn't get it to work otherwise
IS_CC = re.compile(r'^\s*cc:\s*["\']?([0-9]{2}\.[0-9])?["\']?\s*["\']?([0-9]{2}\.[0-9])?["\']?\s*\<?mesa-stable',
flags=re.MULTILINE | re.IGNORECASE)
IS_REVERT = re.compile(r'This reverts commit ([0-9a-f]{40})')
# XXX: hack
SEM = asyncio.Semaphore(50)
COMMIT_LOCK = asyncio.Lock()
class PickUIException(Exception):
pass
@enum.unique
class NominationType(enum.Enum):
CC = 0
FIXES = 1
REVERT = 2
@enum.unique
class Resolution(enum.Enum):
UNRESOLVED = 0
MERGED = 1
DENOMINATED = 2
BACKPORTED = 3
NOTNEEDED = 4
async def commit_state(*, amend: bool = False, message: str = 'Update') -> None:
"""Commit the .pick_status.json file."""
f = pathlib.Path(__file__).parent.parent.parent / '.pick_status.json'
async with COMMIT_LOCK:
p = await asyncio.create_subprocess_exec(
'git', 'add', f.as_posix(),
stdout=asyncio.subprocess.DEVNULL,
stderr=asyncio.subprocess.DEVNULL,
)
v = await p.wait()
if v != 0:
return False
if amend:
cmd = ['--amend', '--no-edit']
else:
cmd = ['--message', f'.pick_status.json: {message}']
p = await asyncio.create_subprocess_exec(
'git', 'commit', *cmd,
stdout=asyncio.subprocess.DEVNULL,
stderr=asyncio.subprocess.DEVNULL,
)
v = await p.wait()
if v != 0:
return False
return True
@attr.s(slots=True)
class Commit:
sha: str = attr.ib()
description: str = attr.ib()
nominated: bool = attr.ib(False)
nomination_type: typing.Optional[NominationType] = attr.ib(None)
resolution: Resolution = attr.ib(Resolution.UNRESOLVED)
master_sha: typing.Optional[str] = attr.ib(None)
because_sha: typing.Optional[str] = attr.ib(None)
def to_json(self) -> 'CommitDict':
d: typing.Dict[str, typing.Any] = attr.asdict(self)
if self.nomination_type is not None:
d['nomination_type'] = self.nomination_type.value
if self.resolution is not None:
d['resolution'] = self.resolution.value
return typing.cast('CommitDict', d)
@classmethod
def from_json(cls, data: 'CommitDict') -> 'Commit':
c = cls(data['sha'], data['description'], data['nominated'], master_sha=data['master_sha'], because_sha=data['because_sha'])
if data['nomination_type'] is not None:
c.nomination_type = NominationType(data['nomination_type'])
if data['resolution'] is not None:
c.resolution = Resolution(data['resolution'])
return c
async def apply(self, ui: 'UI') -> typing.Tuple[bool, str]:
# FIXME: This isn't really enough if we fail to cherry-pick because the
# git tree will still be dirty
async with COMMIT_LOCK:
p = await asyncio.create_subprocess_exec(
'git', 'cherry-pick', '-x', self.sha,
stdout=asyncio.subprocess.DEVNULL,
stderr=asyncio.subprocess.PIPE,
)
_, err = await p.communicate()
if p.returncode != 0:
return (False, err)
self.resolution = Resolution.MERGED
await ui.feedback(f'{self.sha} ({self.description}) applied successfully')
# Append the changes to the .pickstatus.json file
ui.save()
v = await commit_state(amend=True)
return (v, '')
async def abort_cherry(self, ui: 'UI', err: str) -> None:
await ui.feedback(f'{self.sha} ({self.description}) failed to apply\n{err}')
async with COMMIT_LOCK:
p = await asyncio.create_subprocess_exec(
'git', 'cherry-pick', '--abort',
stdout=asyncio.subprocess.DEVNULL,
stderr=asyncio.subprocess.DEVNULL,
)
r = await p.wait()
await ui.feedback(f'{"Successfully" if r == 0 else "Failed to"} abort cherry-pick.')
async def denominate(self, ui: 'UI') -> bool:
self.resolution = Resolution.DENOMINATED
ui.save()
v = await commit_state(message=f'Mark {self.sha} as denominated')
assert v
await ui.feedback(f'{self.sha} ({self.description}) denominated successfully')
return True
async def backport(self, ui: 'UI') -> bool:
self.resolution = Resolution.BACKPORTED
ui.save()
v = await commit_state(message=f'Mark {self.sha} as backported')
assert v
await ui.feedback(f'{self.sha} ({self.description}) backported successfully')
return True
async def resolve(self, ui: 'UI') -> None:
self.resolution = Resolution.MERGED
ui.save()
v = await commit_state(amend=True)
assert v
await ui.feedback(f'{self.sha} ({self.description}) committed successfully')
async def get_new_commits(sha: str) -> typing.List[typing.Tuple[str, str]]:
# TODO: config file that points to the upstream branch
p = await asyncio.create_subprocess_exec(
'git', 'log', '--pretty=oneline', f'{sha}..master',
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.DEVNULL)
out, _ = await p.communicate()
assert p.returncode == 0, f"git log didn't work: {sha}"
return list(split_commit_list(out.decode().strip()))
def split_commit_list(commits: str) -> typing.Generator[typing.Tuple[str, str], None, None]:
if not commits:
return
for line in commits.split('\n'):
v = tuple(line.split(' ', 1))
assert len(v) == 2, 'this is really just for mypy'
yield typing.cast(typing.Tuple[str, str], v)
async def is_commit_in_branch(sha: str) -> bool:
async with SEM:
p = await asyncio.create_subprocess_exec(
'git', 'merge-base', '--is-ancestor', sha, 'HEAD',
stdout=asyncio.subprocess.DEVNULL,
stderr=asyncio.subprocess.DEVNULL,
)
await p.wait()
return p.returncode == 0
async def full_sha(sha: str) -> str:
async with SEM:
p = await asyncio.create_subprocess_exec(
'git', 'rev-parse', sha,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.DEVNULL,
)
out, _ = await p.communicate()
if p.returncode:
raise PickUIException(f'Invalid Sha {sha}')
return out.decode().strip()
async def resolve_nomination(commit: 'Commit', version: str) -> 'Commit':
async with SEM:
p = await asyncio.create_subprocess_exec(
'git', 'log', '--pretty=medium', '-1', commit.sha,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.DEVNULL,
)
_out, _ = await p.communicate()
assert p.returncode == 0, f'git log for {commit.sha} failed'
out = _out.decode()
# We give presedence to fixes and cc tags over revert tags.
# XXX: not having the wallrus operator available makes me sad :=
m = IS_FIX.search(out)
if m:
# We set the nomination_type and because_sha here so that we can later
# check to see if this fixes another staged commit.
try:
commit.because_sha = fixed = await full_sha(m.group(1))
except PickUIException:
pass
else:
commit.nomination_type = NominationType.FIXES
if await is_commit_in_branch(fixed):
commit.nominated = True
return commit
m = IS_CC.search(out)
if m:
if m.groups() == (None, None) or version in m.groups():
commit.nominated = True
commit.nomination_type = NominationType.CC
return commit
m = IS_REVERT.search(out)
if m:
# See comment for IS_FIX path
try:
commit.because_sha = reverted = await full_sha(m.group(1))
except PickUIException:
pass
else:
commit.nomination_type = NominationType.REVERT
if await is_commit_in_branch(reverted):
commit.nominated = True
return commit
return commit
async def resolve_fixes(commits: typing.List['Commit'], previous: typing.List['Commit']) -> None:
"""Determine if any of the undecided commits fix/revert a staged commit.
The are still needed if they apply to a commit that is staged for
inclusion, but not yet included.
This must be done in order, because a commit 3 might fix commit 2 which
fixes commit 1.
"""
shas: typing.Set[str] = set(c.sha for c in previous if c.nominated)
assert None not in shas, 'None in shas'
for commit in reversed(commits):
if not commit.nominated and commit.nomination_type is NominationType.FIXES:
commit.nominated = commit.because_sha in shas
if commit.nominated:
shas.add(commit.sha)
for commit in commits:
if (commit.nomination_type is NominationType.REVERT and
commit.because_sha in shas):
for oldc in reversed(commits):
if oldc.sha == commit.because_sha:
# In this case a commit that hasn't yet been applied is
# reverted, we don't want to apply that commit at all
oldc.nominated = False
oldc.resolution = Resolution.DENOMINATED
commit.nominated = False
commit.resolution = Resolution.DENOMINATED
shas.remove(commit.because_sha)
break
async def gather_commits(version: str, previous: typing.List['Commit'],
new: typing.List[typing.Tuple[str, str]], cb) -> typing.List['Commit']:
# We create an array of the final size up front, then we pass that array
# to the "inner" co-routine, which is turned into a list of tasks and
# collected by asyncio.gather. We do this to allow the tasks to be
# asyncrounously gathered, but to also ensure that the commits list remains
# in order.
commits = [None] * len(new)
tasks = []
async def inner(commit: 'Commit', version: str, commits: typing.List['Commit'],
index: int, cb) -> None:
commits[index] = await resolve_nomination(commit, version)
cb()
for i, (sha, desc) in enumerate(new):
tasks.append(asyncio.ensure_future(
inner(Commit(sha, desc), version, commits, i, cb)))
await asyncio.gather(*tasks)
assert None not in commits
await resolve_fixes(commits, previous)
for commit in commits:
if commit.resolution is Resolution.UNRESOLVED and not commit.nominated:
commit.resolution = Resolution.NOTNEEDED
return commits
def load() -> typing.List['Commit']:
p = pathlib.Path(__file__).parent.parent.parent / '.pick_status.json'
if not p.exists():
return []
with p.open('r') as f:
raw = json.load(f)
return [Commit.from_json(c) for c in raw]
def save(commits: typing.Iterable['Commit']) -> None:
p = pathlib.Path(__file__).parent.parent.parent / '.pick_status.json'
commits = list(commits)
with p.open('wt') as f:
json.dump([c.to_json() for c in commits], f, indent=4)
asyncio.ensure_future(commit_state(message=f'Update to {commits[0].sha}'))

470
bin/pick/core_test.py Normal file
View File

@@ -0,0 +1,470 @@
# Copyright © 2019-2020 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Tests for pick's core data structures and routines."""
from unittest import mock
import textwrap
import typing
import attr
import pytest
from . import core
class TestCommit:
@pytest.fixture
def unnominated_commit(self) -> 'core.Commit':
return core.Commit('abc123', 'sub: A commit', master_sha='45678')
@pytest.fixture
def nominated_commit(self) -> 'core.Commit':
return core.Commit('abc123', 'sub: A commit', True,
core.NominationType.CC, core.Resolution.UNRESOLVED)
class TestToJson:
def test_not_nominated(self, unnominated_commit: 'core.Commit'):
c = unnominated_commit
v = c.to_json()
assert v == {'sha': 'abc123', 'description': 'sub: A commit', 'nominated': False,
'nomination_type': None, 'resolution': core.Resolution.UNRESOLVED.value,
'master_sha': '45678', 'because_sha': None}
def test_nominated(self, nominated_commit: 'core.Commit'):
c = nominated_commit
v = c.to_json()
assert v == {'sha': 'abc123',
'description': 'sub: A commit',
'nominated': True,
'nomination_type': core.NominationType.CC.value,
'resolution': core.Resolution.UNRESOLVED.value,
'master_sha': None,
'because_sha': None}
class TestFromJson:
def test_not_nominated(self, unnominated_commit: 'core.Commit'):
c = unnominated_commit
v = c.to_json()
c2 = core.Commit.from_json(v)
assert c == c2
def test_nominated(self, nominated_commit: 'core.Commit'):
c = nominated_commit
v = c.to_json()
c2 = core.Commit.from_json(v)
assert c == c2
class TestRE:
"""Tests for the regular expressions used to identify commits."""
class TestFixes:
def test_simple(self):
message = textwrap.dedent("""\
etnaviv: fix vertex buffer state emission for single stream GPUs
GPUs with a single supported vertex stream must use the single state
address to program the stream.
Fixes: 3d09bb390a39 (etnaviv: GC7000: State changes for HALTI3..5)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
""")
m = core.IS_FIX.search(message)
assert m is not None
assert m.group(1) == '3d09bb390a39'
class TestCC:
def test_single_branch(self):
"""Tests commit meant for a single branch, ie, 19.1"""
message = textwrap.dedent("""\
radv: fix DCC fast clear code for intensity formats
This fixes a rendering issue with DiRT 4 on GFX10. Only GFX10 was
affected because intensity formats are different.
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1923
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
""")
m = core.IS_CC.search(message)
assert m is not None
assert m.group(1) == '19.2'
def test_multiple_branches(self):
"""Tests commit with more than one branch specified"""
message = textwrap.dedent("""\
radeonsi: enable zerovram for Rocket League
Fixes corruption on game startup.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1888
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
""")
m = core.IS_CC.search(message)
assert m is not None
assert m.group(1) == '19.1'
assert m.group(2) == '19.2'
def test_no_branch(self):
"""Tests commit with no branch specification"""
message = textwrap.dedent("""\
anv/android: fix images created with external format support
This fixes a case where user first creates image and then later binds it
with memory created from AHW buffer.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
""")
m = core.IS_CC.search(message)
assert m is not None
def test_quotes(self):
"""Tests commit with quotes around the versions"""
message = textwrap.dedent("""\
anv: Always fill out the AUX table even if CCS is disabled
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3454>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3454>
""")
m = core.IS_CC.search(message)
assert m is not None
assert m.group(1) == '20.0'
def test_multiple_quotes(self):
"""Tests commit with quotes around the versions"""
message = textwrap.dedent("""\
anv: Always fill out the AUX table even if CCS is disabled
Cc: "20.0" "20.1" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3454>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3454>
""")
m = core.IS_CC.search(message)
assert m is not None
assert m.group(1) == '20.0'
assert m.group(2) == '20.1'
def test_single_quotes(self):
"""Tests commit with quotes around the versions"""
message = textwrap.dedent("""\
anv: Always fill out the AUX table even if CCS is disabled
Cc: '20.0' mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3454>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3454>
""")
m = core.IS_CC.search(message)
assert m is not None
assert m.group(1) == '20.0'
def test_multiple_single_quotes(self):
"""Tests commit with quotes around the versions"""
message = textwrap.dedent("""\
anv: Always fill out the AUX table even if CCS is disabled
Cc: '20.0' '20.1' mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3454>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3454>
""")
m = core.IS_CC.search(message)
assert m is not None
assert m.group(1) == '20.0'
assert m.group(2) == '20.1'
class TestRevert:
def test_simple(self):
message = textwrap.dedent("""\
Revert "radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+"
This reverts commit 2ca8629fa9b303e24783b76a7b3b0c2513e32fbd.
This was initially ported from RadeonSI, but in the meantime it has
been reverted because it might hang. Be conservative and re-introduce
this packet emission.
Unfortunately this doesn't fix anything known.
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
""")
m = core.IS_REVERT.search(message)
assert m is not None
assert m.group(1) == '2ca8629fa9b303e24783b76a7b3b0c2513e32fbd'
class TestResolveNomination:
@attr.s(slots=True)
class FakeSubprocess:
"""A fake asyncio.subprocess like classe for use with mock."""
out: typing.Optional[bytes] = attr.ib(None)
returncode: int = attr.ib(0)
async def mock(self, *_, **__):
"""A dirtly little helper for mocking."""
return self
async def communicate(self) -> typing.Tuple[bytes, bytes]:
assert self.out is not None
return self.out, b''
async def wait(self) -> int:
return self.returncode
@staticmethod
async def return_true(*_, **__) -> bool:
return True
@staticmethod
async def return_false(*_, **__) -> bool:
return False
@pytest.mark.asyncio
async def test_fix_is_nominated(self):
s = self.FakeSubprocess(b'Fixes: 3d09bb390a39 (etnaviv: GC7000: State changes for HALTI3..5)')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
with mock.patch('bin.pick.core.is_commit_in_branch', self.return_true):
await core.resolve_nomination(c, '')
assert c.nominated
assert c.nomination_type is core.NominationType.FIXES
@pytest.mark.asyncio
async def test_fix_is_not_nominated(self):
s = self.FakeSubprocess(b'Fixes: 3d09bb390a39 (etnaviv: GC7000: State changes for HALTI3..5)')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
with mock.patch('bin.pick.core.is_commit_in_branch', self.return_false):
await core.resolve_nomination(c, '')
assert not c.nominated
assert c.nomination_type is core.NominationType.FIXES
@pytest.mark.asyncio
async def test_cc_is_nominated(self):
s = self.FakeSubprocess(b'Cc: 16.2 <mesa-stable@lists.freedesktop.org>')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
await core.resolve_nomination(c, '16.2')
assert c.nominated
assert c.nomination_type is core.NominationType.CC
@pytest.mark.asyncio
async def test_cc_is_nominated2(self):
s = self.FakeSubprocess(b'Cc: mesa-stable@lists.freedesktop.org')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
await core.resolve_nomination(c, '16.2')
assert c.nominated
assert c.nomination_type is core.NominationType.CC
@pytest.mark.asyncio
async def test_cc_is_not_nominated(self):
s = self.FakeSubprocess(b'Cc: 16.2 <mesa-stable@lists.freedesktop.org>')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
await core.resolve_nomination(c, '16.1')
assert not c.nominated
assert c.nomination_type is None
@pytest.mark.asyncio
async def test_revert_is_nominated(self):
s = self.FakeSubprocess(b'This reverts commit 1234567890123456789012345678901234567890.')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
with mock.patch('bin.pick.core.is_commit_in_branch', self.return_true):
await core.resolve_nomination(c, '')
assert c.nominated
assert c.nomination_type is core.NominationType.REVERT
@pytest.mark.asyncio
async def test_revert_is_not_nominated(self):
s = self.FakeSubprocess(b'This reverts commit 1234567890123456789012345678901234567890.')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
with mock.patch('bin.pick.core.is_commit_in_branch', self.return_false):
await core.resolve_nomination(c, '')
assert not c.nominated
assert c.nomination_type is core.NominationType.REVERT
@pytest.mark.asyncio
async def test_is_fix_and_cc(self):
s = self.FakeSubprocess(
b'Fixes: 3d09bb390a39 (etnaviv: GC7000: State changes for HALTI3..5)\n'
b'Cc: 16.1 <mesa-stable@lists.freedesktop.org>'
)
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
with mock.patch('bin.pick.core.is_commit_in_branch', self.return_true):
await core.resolve_nomination(c, '16.1')
assert c.nominated
assert c.nomination_type is core.NominationType.FIXES
@pytest.mark.asyncio
async def test_is_fix_and_revert(self):
s = self.FakeSubprocess(
b'Fixes: 3d09bb390a39 (etnaviv: GC7000: State changes for HALTI3..5)\n'
b'This reverts commit 1234567890123456789012345678901234567890.'
)
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
with mock.patch('bin.pick.core.is_commit_in_branch', self.return_true):
await core.resolve_nomination(c, '16.1')
assert c.nominated
assert c.nomination_type is core.NominationType.FIXES
@pytest.mark.asyncio
async def test_is_cc_and_revert(self):
s = self.FakeSubprocess(
b'This reverts commit 1234567890123456789012345678901234567890.\n'
b'Cc: 16.1 <mesa-stable@lists.freedesktop.org>'
)
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
with mock.patch('bin.pick.core.is_commit_in_branch', self.return_true):
await core.resolve_nomination(c, '16.1')
assert c.nominated
assert c.nomination_type is core.NominationType.CC
class TestResolveFixes:
@pytest.mark.asyncio
async def test_in_new(self):
"""Because commit abcd is nominated, so f123 should be as well."""
c = [
core.Commit('f123', 'desc', nomination_type=core.NominationType.FIXES, because_sha='abcd'),
core.Commit('abcd', 'desc', True),
]
await core.resolve_fixes(c, [])
assert c[1].nominated
@pytest.mark.asyncio
async def test_not_in_new(self):
"""Because commit abcd is not nominated, commit f123 shouldn't be either."""
c = [
core.Commit('f123', 'desc', nomination_type=core.NominationType.FIXES, because_sha='abcd'),
core.Commit('abcd', 'desc'),
]
await core.resolve_fixes(c, [])
assert not c[0].nominated
@pytest.mark.asyncio
async def test_in_previous(self):
"""Because commit abcd is nominated, so f123 should be as well."""
p = [
core.Commit('abcd', 'desc', True),
]
c = [
core.Commit('f123', 'desc', nomination_type=core.NominationType.FIXES, because_sha='abcd'),
]
await core.resolve_fixes(c, p)
assert c[0].nominated
@pytest.mark.asyncio
async def test_not_in_previous(self):
"""Because commit abcd is not nominated, commit f123 shouldn't be either."""
p = [
core.Commit('abcd', 'desc'),
]
c = [
core.Commit('f123', 'desc', nomination_type=core.NominationType.FIXES, because_sha='abcd'),
]
await core.resolve_fixes(c, p)
assert not c[0].nominated
class TestIsCommitInBranch:
@pytest.mark.asyncio
async def test_no(self):
# Hopefully this is never true?
value = await core.is_commit_in_branch('ffffffffffffffffffffffffffffff')
assert not value
@pytest.mark.asyncio
async def test_yes(self):
# This commit is from 2000, it better always be in the branch
value = await core.is_commit_in_branch('88f3b89a2cb77766d2009b9868c44e03abe2dbb2')
assert value
class TestFullSha:
@pytest.mark.asyncio
async def test_basic(self):
# This commit is from 2000, it better always be in the branch
value = await core.full_sha('88f3b89a2cb777')
assert value
@pytest.mark.asyncio
async def test_invalid(self):
# This commit is from 2000, it better always be in the branch
with pytest.raises(core.PickUIException):
await core.full_sha('fffffffffffffffffffffffffffffffffff')

259
bin/pick/ui.py Normal file
View File

@@ -0,0 +1,259 @@
# Copyright © 2020-2020 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Urwid UI for pick script."""
import asyncio
import functools
import itertools
import textwrap
import typing
import attr
import urwid
from . import core
if typing.TYPE_CHECKING:
WidgetType = typing.TypeVar('WidgetType', bound=urwid.Widget)
PALETTE = [
('a', 'black', 'light gray'),
('b', 'black', 'dark red'),
('bg', 'black', 'dark blue'),
('reversed', 'standout', ''),
]
class RootWidget(urwid.Frame):
def __init__(self, *args, ui: 'UI' = None, **kwargs):
super().__init__(*args, **kwargs)
assert ui is not None
self.ui = ui
def keypress(self, size: int, key: str) -> typing.Optional[str]:
if key == 'q':
raise urwid.ExitMainLoop()
elif key == 'u':
asyncio.ensure_future(self.ui.update())
elif key == 'a':
self.ui.add()
else:
return super().keypress(size, key)
return None
class CommitWidget(urwid.Text):
# urwid.Text is normally not interactable, this is required to tell urwid
# to use our keypress method
_selectable = True
def __init__(self, ui: 'UI', commit: 'core.Commit'):
super().__init__(commit.description)
self.ui = ui
self.commit = commit
async def apply(self) -> None:
result, err = await self.commit.apply(self.ui)
if not result:
self.ui.chp_failed(self, err)
else:
self.ui.remove_commit(self)
async def denominate(self) -> None:
await self.commit.denominate(self.ui)
self.ui.remove_commit(self)
async def backport(self) -> None:
await self.commit.backport(self.ui)
self.ui.remove_commit(self)
def keypress(self, size: int, key: str) -> typing.Optional[str]:
if key == 'c':
asyncio.ensure_future(self.apply())
elif key == 'd':
asyncio.ensure_future(self.denominate())
elif key == 'b':
asyncio.ensure_future(self.backport())
else:
return key
return None
@attr.s(slots=True)
class UI:
"""Main management object.
:previous_commits: A list of commits to master since this branch was created
:new_commits: Commits added to master since the last time this script was run
"""
commit_list: typing.List['urwid.Button'] = attr.ib(factory=lambda: urwid.SimpleFocusListWalker([]), init=False)
feedback_box: typing.List['urwid.Text'] = attr.ib(factory=lambda: urwid.SimpleFocusListWalker([]), init=False)
header: 'urwid.Text' = attr.ib(factory=lambda: urwid.Text('Mesa Stable Picker', align='center'), init=False)
body: 'urwid.Columns' = attr.ib(attr.Factory(lambda s: s._make_body(), True), init=False)
footer: 'urwid.Columns' = attr.ib(attr.Factory(lambda s: s._make_footer(), True), init=False)
root: RootWidget = attr.ib(attr.Factory(lambda s: s._make_root(), True), init=False)
mainloop: urwid.MainLoop = attr.ib(None, init=False)
previous_commits: typing.List['core.Commit'] = attr.ib(factory=list, init=False)
new_commits: typing.List['core.Commit'] = attr.ib(factory=list, init=False)
def _make_body(self) -> 'urwid.Columns':
commits = urwid.ListBox(self.commit_list)
feedback = urwid.ListBox(self.feedback_box)
return urwid.Columns([commits, feedback])
def _make_footer(self) -> 'urwid.Columns':
body = [
urwid.Text('[U]pdate'),
urwid.Text('[Q]uit'),
urwid.Text('[C]herry Pick'),
urwid.Text('[D]enominate'),
urwid.Text('[B]ackport'),
urwid.Text('[A]pply additional patch')
]
return urwid.Columns(body)
def _make_root(self) -> 'RootWidget':
return RootWidget(self.body, self.header, self.footer, 'body', ui=self)
def render(self) -> 'WidgetType':
asyncio.ensure_future(self.update())
return self.root
def load(self) -> None:
self.previous_commits = core.load()
async def update(self) -> None:
self.load()
with open('VERSION', 'r') as f:
version = f.read().strip()[:4]
if self.previous_commits:
sha = self.previous_commits[0].sha
else:
sha = f'{version}-branchpoint'
new_commits = await core.get_new_commits(sha)
if new_commits:
pb = urwid.ProgressBar('a', 'b', done=len(new_commits))
o = self.mainloop.widget
self.mainloop.widget = urwid.Overlay(
urwid.Filler(urwid.LineBox(pb)), o, 'center', ('relative', 50), 'middle', ('relative', 50))
self.new_commits = await core.gather_commits(
version, self.previous_commits, new_commits,
lambda: pb.set_completion(pb.current + 1))
self.mainloop.widget = o
for commit in reversed(list(itertools.chain(self.new_commits, self.previous_commits))):
if commit.nominated and commit.resolution is core.Resolution.UNRESOLVED:
b = urwid.AttrMap(CommitWidget(self, commit), None, focus_map='reversed')
self.commit_list.append(b)
self.save()
async def feedback(self, text: str) -> None:
self.feedback_box.append(urwid.AttrMap(urwid.Text(text), None))
def remove_commit(self, commit: CommitWidget) -> None:
for i, c in enumerate(self.commit_list):
if c.base_widget is commit:
del self.commit_list[i]
break
def save(self):
core.save(itertools.chain(self.new_commits, self.previous_commits))
def add(self) -> None:
"""Add an additional commit which isn't nominated."""
o = self.mainloop.widget
def reset_cb(_) -> None:
self.mainloop.widget = o
async def apply_cb(edit: urwid.Edit) -> None:
text: str = edit.get_edit_text()
# In case the text is empty
if not text:
return
sha = await core.full_sha(text)
for c in reversed(list(itertools.chain(self.new_commits, self.previous_commits))):
if c.sha == sha:
commit = c
break
else:
raise RuntimeError(f"Couldn't find {sha}")
await commit.apply(self)
q = urwid.Edit("Comit sha\n")
ok_btn = urwid.Button('Ok')
urwid.connect_signal(ok_btn, 'click', lambda _: asyncio.ensure_future(apply_cb(q)))
urwid.connect_signal(ok_btn, 'click', reset_cb)
can_btn = urwid.Button('Cancel')
urwid.connect_signal(can_btn, 'click', reset_cb)
cols = urwid.Columns([ok_btn, can_btn])
pile = urwid.Pile([q, cols])
box = urwid.LineBox(pile)
self.mainloop.widget = urwid.Overlay(
urwid.Filler(box), o, 'center', ('relative', 50), 'middle', ('relative', 50)
)
def chp_failed(self, commit: 'CommitWidget', err: str) -> None:
o = self.mainloop.widget
def reset_cb(_) -> None:
self.mainloop.widget = o
t = urwid.Text(textwrap.dedent(f"""
Failed to apply {commit.commit.sha} {commit.commit.description} with the following error:
{err}
You can either cancel, or resolve the conflicts, commit the
changes and select ok."""))
can_btn = urwid.Button('Cancel')
urwid.connect_signal(can_btn, 'click', reset_cb)
urwid.connect_signal(
can_btn, 'click', lambda _: asyncio.ensure_future(commit.commit.abort_cherry(self, err)))
ok_btn = urwid.Button('Ok')
urwid.connect_signal(ok_btn, 'click', reset_cb)
urwid.connect_signal(
ok_btn, 'click', lambda _: asyncio.ensure_future(commit.commit.resolve(self)))
urwid.connect_signal(
ok_btn, 'click', lambda _: self.remove_commit(commit))
cols = urwid.Columns([ok_btn, can_btn])
pile = urwid.Pile([t, cols])
box = urwid.LineBox(pile)
self.mainloop.widget = urwid.Overlay(
urwid.Filler(box), o, 'center', ('relative', 50), 'middle', ('relative', 50)
)

117
bin/post_version.py Executable file
View File

@@ -0,0 +1,117 @@
#!/usr/bin/env python3
# Copyright © 2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Update the main page, release notes, and calendar."""
import argparse
import calendar
import datetime
import pathlib
from lxml import (
etree,
html,
)
def calculate_previous_version(version: str, is_point: bool) -> str:
"""Calculate the previous version to compare to.
In the case of -rc to final that verison is the previous .0 release,
(19.3.0 in the case of 20.0.0, for example). for point releases that is
the last point release. This value will be the same as the input value
for a poiont release, but different for a major release.
"""
if '-' in version:
version = version.split('-')[0]
if is_point:
return version
base = version.split('.')
if base[1] == '0':
base[0] = str(int(base[0]) - 1)
base[1] = '3'
else:
base[1] = str(int(base[1]) - 1)
return '.'.join(base)
def is_point_release(version: str) -> bool:
return not version.endswith('.0')
def update_index(is_point: bool, version: str, previous_version: str) -> None:
p = pathlib.Path(__file__).parent.parent / 'docs' / 'index.html'
with p.open('rt') as f:
tree = html.parse(f)
news = tree.xpath('.//h1')[0]
date = datetime.date.today()
month = calendar.month_name[date.month]
header = etree.Element('h2')
header.text = f"{month} {date.day}, {date.year}"
body = etree.Element('p')
a = etree.SubElement(
body, 'a', attrib={'href': f'relnotes/{previous_version}.html'})
a.text = f"Mesa {previous_version}"
if is_point:
a.tail = " is released. This is a bug fix release."
else:
a.tail = (" is released. This is a new development release. "
"See the release notes for mor information about this release.")
root = news.getparent()
index = root.index(news) + 1
root.insert(index, body)
root.insert(index, header)
tree.write(p.as_posix(), method='html')
def update_release_notes(previous_version: str) -> None:
p = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes.html'
with p.open('rt') as f:
tree = html.parse(f)
li = etree.Element('li')
a = etree.SubElement(li, 'a', href=f'relnotes/{previous_version}.html')
a.text = f'{previous_version} release notes'
ul = tree.xpath('.//ul')[0]
ul.insert(0, li)
tree.write(p.as_posix(), method='html')
def main() -> None:
parser = argparse.ArgumentParser()
parser.add_argument('version', help="The released version.")
args = parser.parse_args()
is_point = is_point_release(args.version)
previous_version = calculate_previous_version(args.version, is_point)
update_index(is_point, args.version, previous_version)
update_release_notes(previous_version)
if __name__ == "__main__":
main()

View File

@@ -1,29 +0,0 @@
#!/bin/sh
# This script is used to generate the list of changes that
# appears in the release notes files, with HTML formatting.
#
# Usage examples:
#
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 > changes
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes
in_log=0
git shortlog $* | while read l
do
if [ $in_log -eq 0 ]; then
echo '<p>'$l'</p>'
echo '<ul>'
in_log=1
elif echo "$l" | egrep -q '^$' ; then
echo '</ul>'
echo
in_log=0
else
mesg=$(echo $l | sed 's/ (cherry picked from commit [0-9a-f]\+)//;s/\&/&amp;/g;s/</\&lt;/g;s/>/\&gt;/g')
echo ' <li>'${mesg}'</li>'
fi
done

View File

@@ -19,9 +19,10 @@ PLATFORM_SYMBOLS = [
]
def get_symbols(nm, lib):
def get_symbols_nm(nm, lib):
'''
List all the (non platform-specific) symbols exported by the library
using `nm`
'''
symbols = []
platform_name = platform.system()
@@ -39,7 +40,35 @@ def get_symbols(nm, lib):
assert symbol_name[0] == '_'
symbol_name = symbol_name[1:]
symbols.append(symbol_name)
return symbols
def get_symbols_dumpbin(dumpbin, lib):
'''
List all the (non platform-specific) symbols exported by the library
using `dumpbin`
'''
symbols = []
output = subprocess.check_output([dumpbin, '/exports', lib],
stderr=open(os.devnull, 'w')).decode("ascii")
for line in output.splitlines():
fields = line.split()
# The lines with the symbols are made of at least 4 columns; see details below
if len(fields) < 4:
continue
try:
# Making sure the first 3 columns are a dec counter, a hex counter
# and a hex address
_ = int(fields[0], 10)
_ = int(fields[1], 16)
_ = int(fields[2], 16)
except ValueError:
continue
symbol_name = fields[3]
# De-mangle symbols
if symbol_name[0] == '_':
symbol_name = symbol_name[1:].split('@')[0]
symbols.append(symbol_name)
return symbols
@@ -55,12 +84,21 @@ def main():
help='path to library')
parser.add_argument('--nm',
action='store',
required=True,
help='path to binary (or name in $PATH)')
parser.add_argument('--dumpbin',
action='store',
help='path to binary (or name in $PATH)')
args = parser.parse_args()
try:
lib_symbols = get_symbols(args.nm, args.lib)
if platform.system() == 'Windows':
if not args.dumpbin:
parser.error('--dumpbin is mandatory')
lib_symbols = get_symbols_dumpbin(args.dumpbin, args.lib)
else:
if not args.nm:
parser.error('--nm is mandatory')
lib_symbols = get_symbols_nm(args.nm, args.lib)
except:
# We can't run this test, but we haven't technically failed it either
# Return the GNU "skip" error code
@@ -109,6 +147,10 @@ def main():
continue
if symbol in optional_symbols:
continue
if symbol[:2] == '_Z':
# Ignore random C++ symbols
#TODO: figure out if there's any way to avoid exporting them in the first place
continue
unknown_symbols.append(symbol)
missing_symbols = [

View File

@@ -17,6 +17,9 @@ import SCons.Script.SConscript
host_platform = _platform.system().lower()
if host_platform.startswith('cygwin'):
host_platform = 'cygwin'
# MSYS2 default platform selection.
if host_platform.startswith('mingw'):
host_platform = 'windows'
# Search sys.argv[] for a "platform=foo" argument since we don't have
# an 'env' variable at this point.
@@ -49,9 +52,18 @@ if 'PROCESSOR_ARCHITECTURE' in os.environ:
else:
host_machine = _platform.machine()
host_machine = _machine_map.get(host_machine, 'generic')
# MSYS2 default machine selection.
if _platform.system().lower().startswith('mingw') and 'MSYSTEM' in os.environ:
if os.environ['MSYSTEM'] == 'MINGW32':
host_machine = 'x86'
if os.environ['MSYSTEM'] == 'MINGW64':
host_machine = 'x86_64'
default_machine = host_machine
default_toolchain = 'default'
# MSYS2 default toolchain selection.
if _platform.system().lower().startswith('mingw'):
default_toolchain = 'mingw'
if target_platform == 'windows' and host_platform != 'windows':
default_machine = 'x86'
@@ -100,6 +112,7 @@ def AddOptions(opts):
opts.Add(BoolOption('asan', 'enable Address Sanitizer', 'no'))
opts.Add('toolchain', 'compiler toolchain', default_toolchain)
opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))
opts.Add(BoolOption('force_scons', 'Force enable scons on deprecated platforms', 'false'))
opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)',
'no'))
opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))

View File

@@ -24,8 +24,8 @@ The old bug database on SourceForge is no longer used.
<p>
To file a Mesa bug, go to
<a href="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa">
Bugzilla on freedesktop.org</a>
<a href="https://gitlab.freedesktop.org/mesa/mesa/-/issues">
GitLab on freedesktop.org</a>
</p>
<p>

View File

@@ -41,11 +41,11 @@ as if you're defining a large, static table of information.
<li>Opening braces go on the same line as the if/for/while statement.
For example:
<pre>
if (condition) {
foo;
} else {
bar;
}
if (condition) {
foo;
} else {
bar;
}
</pre>
<li>Put a space before/after operators. For example, <code>a = b + c;</code>
@@ -53,7 +53,7 @@ and not <code>a=b+c;</code>
<li>This GNU indent command generally does the right thing for formatting:
<pre>
indent -br -i3 -npcs --no-tabs infile.c -o outfile.c
indent -br -i3 -npcs --no-tabs infile.c -o outfile.c
</pre>
<li>
@@ -63,47 +63,47 @@ follow <a href="http://www.doxygen.nl">Doxygen</a> conventions.
</p>
Single-line comments:
<pre>
/* null-out pointer to prevent dangling reference below */
bufferObj = NULL;
/* null-out pointer to prevent dangling reference below */
bufferObj = NULL;
</pre>
Or,
<pre>
bufferObj = NULL; /* prevent dangling reference below */
bufferObj = NULL; /* prevent dangling reference below */
</pre>
Multi-line comment:
<pre>
/* If this is a new buffer object id, or one which was generated but
* never used before, allocate a buffer object now.
*/
/* If this is a new buffer object id, or one which was generated but
* never used before, allocate a buffer object now.
*/
</pre>
We try to quote the OpenGL specification where prudent:
<pre>
/* Page 38 of the PDF of the OpenGL ES 3.0 spec says:
*
* "An INVALID_OPERATION error is generated for any of the following
* conditions:
*
* * &lt;length&gt; is zero."
*
* Additionally, page 94 of the PDF of the OpenGL 4.5 core spec
* (30.10.2014) also says this, so it's no longer allowed for desktop GL,
* either.
*/
/* Page 38 of the PDF of the OpenGL ES 3.0 spec says:
*
* "An INVALID_OPERATION error is generated for any of the following
* conditions:
*
* * &lt;length&gt; is zero."
*
* Additionally, page 94 of the PDF of the OpenGL 4.5 core spec
* (30.10.2014) also says this, so it's no longer allowed for desktop GL,
* either.
*/
</pre>
Function comment example:
<pre>
/**
* Create and initialize a new buffer object. Called via the
* ctx-&gt;Driver.CreateObject() driver callback function.
* \param name integer name of the object
* \param type one of GL_FOO, GL_BAR, etc.
* \return pointer to new object or NULL if error
*/
struct gl_object *
_mesa_create_object(GLuint name, GLenum type)
{
/* function body */
}
/**
* Create and initialize a new buffer object. Called via the
* ctx-&gt;Driver.CreateObject() driver callback function.
* \param name integer name of the object
* \param type one of GL_FOO, GL_BAR, etc.
* \return pointer to new object or NULL if error
*/
struct gl_object *
_mesa_create_object(GLuint name, GLenum type)
{
/* function body */
}
</pre>
<li>Put the function return type and qualifiers on one line and the function
@@ -113,11 +113,11 @@ the opening brace goes on the next line by itself (see above.)
<li>Function names follow various conventions depending on the type of function:
<pre>
glFooBar() - a public GL entry point (in glapi_dispatch.c)
_mesa_FooBar() - the internal immediate mode function
save_FooBar() - retained mode (display list) function in dlist.c
foo_bar() - a static (private) function
_mesa_foo_bar() - an internal non-static Mesa function
glFooBar() - a public GL entry point (in glapi_dispatch.c)
_mesa_FooBar() - the internal immediate mode function
save_FooBar() - retained mode (display list) function in dlist.c
foo_bar() - a static (private) function
_mesa_foo_bar() - an internal non-static Mesa function
</pre>
<li>Constants, macros and enum names are <code>ALL_UPPERCASE</code>, with _

View File

@@ -77,9 +77,6 @@ To add a new GL extension to Mesa you have to do at least the following.
</li>
</ul>
</div>
</body>
</html>

View File

@@ -77,17 +77,17 @@ table.</li>
<p>This can be implemented in just a few lines of C code. The file
<code>src/mesa/glapi/glapitemp.h</code> contains code very similar to this.</p>
<blockquote>
<table border="1">
<tr><td><pre>
<figure>
<pre>
void glVertex3f(GLfloat x, GLfloat y, GLfloat z)
{
const struct _glapi_table * const dispatch = GET_DISPATCH();
(*dispatch-&gt;Vertex3f)(x, y, z);
}</pre></td></tr>
<tr><td>Sample dispatch function</td></tr></table>
</blockquote>
}
</pre>
<figcaption>Sample dispatch function</figcaption>
</figure>
<p>The problem with this simple implementation is the large amount of
overhead that it adds to every GL function call.</p>
@@ -129,15 +129,14 @@ The resulting implementation of <code>GET_DISPATCH</code> is slightly more
complex, but it avoids the expensive <code>pthread_getspecific</code> call in
the common case.</p>
<blockquote>
<table border="1">
<tr><td><pre>
<figure>
<pre>
#define GET_DISPATCH() \
(_glapi_Dispatch != NULL) \
? _glapi_Dispatch : pthread_getspecific(&amp;_glapi_Dispatch_key)
</pre></td></tr>
<tr><td>Improved <code>GET_DISPATCH</code> Implementation</td></tr></table>
</blockquote>
</pre>
<figcaption>Improved <code>GET_DISPATCH</code> Implementation</figcaption>
</figure>
<h3>3.2. ELF TLS</h3>
@@ -154,16 +153,15 @@ direct rendering drivers that use either interface. Once the pointer is
properly declared, <code>GET_DISPACH</code> becomes a simple variable
reference.</p>
<blockquote>
<table border="1">
<tr><td><pre>
<figure>
<pre>
extern __thread struct _glapi_table *_glapi_tls_Dispatch
__attribute__((tls_model("initial-exec")));
#define GET_DISPATCH() _glapi_tls_Dispatch
</pre></td></tr>
<tr><td>TLS <code>GET_DISPATCH</code> Implementation</td></tr></table>
</blockquote>
</pre>
<figcaption>TLS <code>GET_DISPATCH</code> Implementation</figcaption>
</figure>
<p>Use of this path is controlled by the preprocessor define
<code>USE_ELF_TLS</code>. Any platform capable of using ELF TLS should use this
@@ -215,13 +213,12 @@ of the assembly source file different implementations of the macro are
selected based on the defined preprocessor variables. The assembly code
then consists of a series of invocations of the macros such as:
<blockquote>
<table border="1">
<tr><td><pre>
<figure>
<pre>
GL_STUB(Color3fv, _gloffset_Color3fv)
</pre></td></tr>
<tr><td>SPARC Assembly Implementation of <code>glColor3fv</code></td></tr></table>
</blockquote>
</pre>
<figcaption>SPARC Assembly Implementation of <code>glColor3fv</code></figcaption>
</figure>
<p>The benefit of this technique is that changes to the calling pattern
(i.e., addition of a new dispatch table pointer access method) require fewer
@@ -271,8 +268,6 @@ dispatch stub.</p>
<code>src/mesa/glapi/glapi.c</code> just before <code>glprocs.h</code> is
included.</p>
<h2 id="autogen">4. Automatic Generation of Dispatch Stubs</h2>
</div>
</body>
</html>

View File

@@ -19,10 +19,10 @@
<h2>Downloading</h2>
<p>
Primary Mesa download site:
<a href="ftp://ftp.freedesktop.org/pub/mesa/">ftp.freedesktop.org</a> (FTP)
or <a href="https://mesa.freedesktop.org/archive/">mesa.freedesktop.org</a>
(HTTPS).
You can download the released versions of Mesa via
<a href="https://mesa.freedesktop.org/archive/">HTTPS</a>
or
<a href="ftp://ftp.freedesktop.org/pub/mesa/">FTP</a>.
</p>
<p>

View File

@@ -307,6 +307,8 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.
<dd>disable instruction compaction</dd>
<dt><code>nodualobj</code></dt>
<dd>suppress generation of dual-object geometry shader code</dd>
<dt><code>nofc</code></dt>
<dd>disable fast clears</dd>
<dt><code>norbc</code></dt>
<dd>disable single sampled render buffer compression</dd>
<dt><code>optimizer</code></dt>
@@ -542,6 +544,231 @@ Mesa EGL supports different sets of environment variables. See the
</dl>
<h3>RADV driver environment variables</h3>
<dl>
<dt><code>RADV_DEBUG</code></dt>
<dd>a comma-separated list of named flags, which do various things:
<dl>
<dt><code>allbos</code></dt>
<dd>force all allocated buffers to be referenced in submissions</dd>
<dt><code>allentrypoints</code></dt>
<dd>enable all device/instance entrypoints</dd>
<dt><code>checkir</code></dt>
<dd>validate the LLVM IR before LLVM compiles the shader</dd>
<dt><code>errors</code></dt>
<dd>display more info about errors</dd>
<dt><code>info</code></dt>
<dd>show GPU-related information</dd>
<dt><code>metashaders</code></dt>
<dd>dump internal meta shaders</dd>
<dt><code>nobinning</code></dt>
<dd>disable primitive binning</dd>
<dt><code>nocache</code></dt>
<dd>disable shaders cache</dd>
<dt><code>nocompute</code></dt>
<dd>disable compute queue</dd>
<dt><code>nodcc</code></dt>
<dd>disable Delta Color Compression (DCC) on images</dd>
<dt><code>nodynamicbounds</code></dt>
<dd>do not check OOB access for dynamic descriptors</dd>
<dt><code>nofastclears</code></dt>
<dd>disable fast color/depthstencil clears</dd>
<dt><code>nohiz</code></dt>
<dd>disable HIZ for depthstencil images</dd>
<dt><code>noibs</code></dt>
<dd>disable directly recording command buffers in GPU-visible memory</dd>
<dt><code>noloadstoreopt</code></dt>
<dd>disable LLVM SILoadStoreOptimizer pass</dd>
<dt><code>nomemorycache</code></dt>
<dd>disable memory shaders cache</dd>
<dt><code>nongg</code></dt>
<dd>disable NGG for GFX10+</dd>
<dt><code>nooutoforder</code></dt>
<dd>disable out-of-order rasterization</dd>
<dt><code>noshaderballot</code></dt>
<dd>disable shader ballot</dd>
<dt><code>nosisched</code></dt>
<dd>disable LLVM sisched experimental scheduler</dd>
<dt><code>nothreadllvm</code></dt>
<dd>disable LLVM threaded compilation</dd>
<dt><code>preoptir</code></dt>
<dd>dump LLVM IR before any optimizations</dd>
<dt><code>shaders</code></dt>
<dd>dump shaders</dd>
<dt><code>shaderstats</code></dt>
<dd>dump shader statistics</dd>
<dt><code>spirv</code></dt>
<dd>dump SPIR-V</dd>
<dt><code>startup</code></dt>
<dd>display info at startup</dd>
<dt><code>syncshaders</code></dt>
<dd>synchronize shaders after all draws/dispatches</dd>
<dt><code>vmfaults</code></dt>
<dd>check for VM memory faults via dmesg</dd>
<dt><code>zerovram</code></dt>
<dd>initialize all memory allocated in VRAM as zero</dd>
</dl>
</dd>
<dt><code>RADV_FORCE_FAMILY</code></dt>
<dd>force the driver to use a specific family eg. gfx900 (developers only)</dd>
<dt><code>RADV_PERFTEST</code></dt>
<dd>a comma-separated list of named flags, which do various things:
<dl>
<dt><code>aco</code></dt>
<dd>enable ACO experimental compiler</dd>
<dt><code>bolist</code></dt>
<dd>enable the global BO list</dd>
<dt><code>cswave32</code></dt>
<dd>enable wave32 for compute shaders (GFX10+)</dd>
<dt><code>dccmsaa</code></dt>
<dd>enable DCC for MSAA images</dd>
<dt><code>dfsm</code></dt>
<dd>enable dfsm</dd>
<dt><code>gewave32</code></dt>
<dd>enable wave32 for vertex/tess/geometry shaders (GFX10+)</dd>
<dt><code>localbos</code></dt>
<dd>enable local BOs</dd>
<dt><code>nobatchchain</code></dt>
<dd>disable chained submissions</dd>
<dt><code>pswave32</code></dt>
<dd>enable wave32 for pixel shaders (GFX10+)</dd>
<dt><code>shader_ballot</code></dt>
<dd>enable shader ballot</dd>
<dt><code>sisched</code></dt>
<dd>enable LLVM sisched experimental scheduler</dd>
<dt><code>tccompatcmask</code></dt>
<dd>enable TC-compat cmask for MSAA images</dd>
</dl>
</dd>
<dt><code>RADV_SECURE_COMPILE_THREADS</code></dt>
<dd>maximum number of secure compile threads (up to 32)</dd>
<dt><code>RADV_TRACE_FILE</code></dt>
<dd>generate cmdbuffer tracefiles when a GPU hang is detected</dd>
</dl>
<h3>radeonsi driver environment variables</h3>
<dl>
<dt><code>AMD_DEBUG</code></dt>
<dd>a comma-separated list of named flags, which do various things:</dd>
<dl>
<dd></dd>
<h4>Disable features / workaround flags (useful to diagnose an issue):</h4>
<dt><code>nodma</code></dt>
<dd>Disable SDMA</dd>
<dt><code>nodmaclear</code></dt>
<dd>Disable SDMA clears</dd>
<dt><code>nodmacopyimage</code></dt>
<dd>Disable SDMA image copies</dd>
<dt><code>zerovram</code></dt>
<dd>Clear VRAM allocations.</dd>
<dt><code>nodcc</code></dt>
<dd>Disable DCC.</dd>
<dt><code>nodccclear</code></dt>
<dd>Disable DCC fast clear.</dd>
<dt><code>nodccfb</code></dt>
<dd>Disable separate DCC on the main framebuffer</dd>
<dt><code>nodccmsaa</code></dt>
<dd>Disable DCC for MSAA</dd>
<dt><code>nodpbb</code></dt>
<dd>Disable DPBB.</dd>
<dt><code>nodfsm</code></dt>
<dd>Disable DFSM.</dd>
<dt><code>notiling</code></dt>
<dd>Disable tiling</dd>
<dt><code>nofmask</code></dt>
<dd>Disable MSAA compression</dd>
<dt><code>nohyperz</code></dt>
<dd>Disable Hyper-Z</dd>
<dt><code>norbplus</code></dt>
<dd>Disable RB+.</dd>
<dt><code>no2d</code></dt>
<dd>Disable 2D tiling</dd>
<h4>Info flags:</h4>
<dt><code>info</code></dt>
<dd>Print driver information</dd>
<dt><code>tex</code></dt>
<dd>Print texture info</dd>
<dt><code>compute</code></dt>
<dd>Print compute info</dd>
<dt><code>vm</code></dt>
<dd>Print virtual addresses when creating resources</dd>
<h4>Print shaders flags:</h4>
<dt><code>vs</code></dt>
<dd>Print vertex shaders</dd>
<dt><code>ps</code></dt>
<dd>Print pixel shaders</dd>
<dt><code>gs</code></dt>
<dd>Print geometry shaders</dd>
<dt><code>tcs</code></dt>
<dd>Print tessellation control shaders</dd>
<dt><code>tes</code></dt>
<dd>Print tessellation evaluation shaders</dd>
<dt><code>cs</code></dt>
<dd>Print compute shaders</dd>
<dt><code>noir</code></dt>
<dd>Don't print the LLVM IR</dd>
<dt><code>nonir</code></dt>
<dd>Don't print NIR when printing shaders</dd>
<dt><code>noasm</code></dt>
<dd>Don't print disassembled shaders</dd>
<dt><code>preoptir</code></dt>
<dd>Print the LLVM IR before initial optimizations</dd>
<h4>Shader compilation tuning flags:</h4>
<dt><code>sisched</code></dt>
<dd>Enable LLVM SI Machine Instruction Scheduler.</dd>
<dt><code>gisel</code></dt>
<dd>Enable LLVM global instruction selector.</dd>
<dt><code>w32ge</code></dt>
<dd>Use Wave32 for vertex, tessellation, and geometry shaders.</dd>
<dt><code>w32ps</code></dt>
<dd>Use Wave32 for pixel shaders.</dd>
<dt><code>w32cs</code></dt>
<dd>Use Wave32 for computes shaders.</dd>
<dt><code>w64ge</code></dt>
<dd>Use Wave64 for vertex, tessellation, and geometry shaders.</dd>
<dt><code>w64ps</code></dt>
<dd>Use Wave64 for pixel shaders.</dd>
<dt><code>w64cs</code></dt>
<dd>Use Wave64 for computes shaders.</dd>
<dt><code>checkir</code></dt>
<dd>Enable additional sanity checks on shader IR</dd>
<dt><code>mono</code></dt>
<dd>Use old-style monolithic shaders compiled on demand</dd>
<dt><code>nooptvariant</code></dt>
<dd>Disable compiling optimized shader variants.</dd>
<h4>Advanced usage flags:</h4>
<dt><code>forcedma</code></dt>
<dd>Use SDMA for all operations when possible.</dd>
<dt><code>nowc</code></dt>
<dd>Disable GTT write combining</dd>
<dt><code>check_vm</code></dt>
<dd>Check VM faults and dump debug info.</dd>
<dt><code>reserve_vmid</code></dt>
<dd>Force VMID reservation per context.</dd>
<dt><code>nogfx</code></dt>
<dd>Disable graphics. Only multimedia compute paths can be used.</dd>
<dt><code>nongg</code></dt>
<dd>Disable NGG and use the legacy pipeline.</dd>
<dt><code>nggc</code></dt>
<dd>Always use NGG culling even when it can hurt.</dd>
<dt><code>nonggc</code></dt>
<dd>Disable NGG culling.</dd>
<dt><code>alwayspd</code></dt>
<dd>Always enable the primitive discard compute shader.</dd>
<dt><code>pd</code></dt>
<dd>Enable the primitive discard compute shader for large draw calls.</dd>
<dt><code>nopd</code></dt>
<dd>Disable the primitive discard compute shader.</dd>
<dt><code>switch_on_eop</code></dt>
<dd>Program WD/IA to switch on end-of-packet.</dd>
<dt><code>nooutoforder</code></dt>
<dd>Disable out-of-order rasterization</dd>
<dt><code>dpbb</code></dt>
<dd>Enable DPBB.</dd>
<dt><code>dfsm</code></dt>
<dd>Enable DFSM.</dd>
</dl>
<p>
Other Gallium drivers have their own environment variables. These may change
frequently so the source code should be consulted for details.

View File

@@ -118,19 +118,19 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
- 'precise' qualifier DONE (softpipe)
- Dynamically uniform sampler array indices DONE (softpipe)
- Dynamically uniform UBO array indices DONE (freedreno, softpipe)
- Implicit signed -> unsigned conversions DONE (softpipe)
- Fused multiply-add DONE (softpipe)
- Packing/bitfield/conversion functions DONE (freedreno, softpipe)
- Enhanced textureGather DONE (freedreno, softpipe)
- Geometry shader instancing DONE (llvmpipe, softpipe)
- Geometry shader multiple streams DONE (softpipe)
- Implicit signed -> unsigned conversions DONE (softpipe, swr)
- Fused multiply-add DONE (softpipe, swr)
- Packing/bitfield/conversion functions DONE (freedreno, softpipe, swr)
- Enhanced textureGather DONE (freedreno, softpipe, swr)
- Geometry shader instancing DONE (llvmpipe, softpipe, swr)
- Geometry shader multiple streams DONE (softpipe, swr)
- Enhanced per-sample shading DONE ()
- Interpolation functions DONE (softpipe)
- New overload resolution rules DONE (softpipe)
GL_ARB_gpu_shader_fp64 DONE (i965/gen7+, llvmpipe, softpipe, swr)
GL_ARB_sample_shading DONE (freedreno/a6xx, i965/gen6+, nv50)
GL_ARB_shader_subroutine DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_tessellation_shader DONE (i965/gen7+)
GL_ARB_tessellation_shader DONE (i965/gen7+, swr)
GL_ARB_texture_buffer_object_rgb32 DONE (freedreno, i965/gen6+, llvmpipe, softpipe, swr)
GL_ARB_texture_cube_map_array DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_gather DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
@@ -151,13 +151,13 @@ GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
GL_ARB_texture_compression_bptc DONE (freedreno, i965)
GL_ARB_texture_compression_bptc DONE (freedreno, i965, llvmpipe, softpipe, swr)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_shader_image_load_store DONE (freedreno/a5xx+, i965, softpipe)
GL_ARB_shader_image_load_store DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_packing DONE (all drivers)
@@ -170,18 +170,18 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi, virgl
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
GL_ARB_clear_buffer_object DONE (all drivers)
GL_ARB_compute_shader DONE (freedreno/a5xx+, i965, softpipe)
GL_ARB_compute_shader DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
GL_ARB_copy_image DONE (i965, nv50, softpipe, llvmpipe, swr)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_framebuffer_no_attachments DONE (freedreno, i965, softpipe)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_framebuffer_no_attachments DONE (freedreno, i965, llvmpipe, softpipe)
GL_ARB_internalformat_query2 DONE (all drivers)
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (freedreno, i965, llvmpipe, softpipe, swr)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_robust_buffer_access_behavior DONE (i965)
GL_ARB_shader_image_size DONE (freedreno/a5xx+, i965, softpipe)
GL_ARB_shader_image_size DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
GL_ARB_shader_storage_buffer_object DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
GL_ARB_stencil_texturing DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (freedreno, nv50, i965, softpipe, llvmpipe, swr)
@@ -204,7 +204,7 @@ GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
- specified transform/feedback layout DONE
- input/output block locations DONE
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object DONE (i965/hsw+, virgl)
GL_ARB_query_buffer_object DONE (i965/hsw+, llvmpipe, virgl)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_texture_stencil8 DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
@@ -215,7 +215,7 @@ GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi, r600
GL_ARB_clip_control DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_cull_distance DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_derivative_control DONE (i965, nv50, softpipe, virgl)
GL_ARB_derivative_control DONE (i965, nv50, llvmpipe, softpipe, virgl)
GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, virgl)
@@ -224,18 +224,18 @@ GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi, r600
GL_KHR_robustness DONE (freedreno, i965)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
GL 4.6, GLSL 4.60
GL 4.6, GLSL 4.60 -- all DONE: radeonsi
GL_ARB_gl_spirv in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_indirect_parameters DONE (i965/gen7+, nvc0, radeonsi, virgl)
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_polygon_offset_clamp DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr, virgl)
GL_ARB_shader_atomic_counter_ops DONE (freedreno/a5xx+, i965/gen7+, nvc0, r600, radeonsi, llvmpipe, softpipe, virgl)
GL_ARB_shader_draw_parameters DONE (i965, nvc0, radeonsi)
GL_ARB_shader_group_vote DONE (i965, nvc0, radeonsi)
GL_ARB_spirv_extensions in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_texture_filter_anisotropic DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, nvc0, radeonsi, llvmpipe, softpipe, virgl)
GL_ARB_gl_spirv DONE (i965/gen7+)
GL_ARB_indirect_parameters DONE (i965/gen7+, nvc0, llvmpipe, virgl)
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, r600, llvmpipe, softpipe, swr)
GL_ARB_polygon_offset_clamp DONE (freedreno, i965, nv50, nvc0, r600, llvmpipe, swr, virgl)
GL_ARB_shader_atomic_counter_ops DONE (freedreno/a5xx+, i965/gen7+, nvc0, r600, llvmpipe, softpipe, virgl)
GL_ARB_shader_draw_parameters DONE (i965, llvmpipe, nvc0)
GL_ARB_shader_group_vote DONE (i965, nvc0, llvmpipe)
GL_ARB_spirv_extensions DONE (i965/gen7+)
GL_ARB_texture_filter_anisotropic DONE (freedreno, i965, nv50, nvc0, r600, softpipe (*), llvmpipe (*))
GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, nvc0, llvmpipe, softpipe, virgl)
GL_KHR_no_error DONE (all drivers)
(*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting
@@ -244,14 +244,14 @@ These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi, virgl
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_compute_shader DONE (freedreno/a5xx+, i965/gen7+, softpipe)
GL_ARB_compute_shader DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
GL_ARB_draw_indirect DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments DONE (freedreno, i965/gen7+, softpipe)
GL_ARB_framebuffer_no_attachments DONE (freedreno, i965/gen7+, llvmpipe, softpipe)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
GL_ARB_shader_image_load_store DONE (freedreno/a5xx+, i965/gen7+, softpipe)
GL_ARB_shader_image_size DONE (freedreno/a5xx+, i965/gen7+, softpipe)
GL_ARB_shader_image_load_store DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
GL_ARB_shader_image_size DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
GL_ARB_shader_storage_buffer_object DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers)
@@ -311,6 +311,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GL_ARB_shader_clock DONE (i965/gen7+, nv50, nvc0, r600, radeonsi, virgl)
GL_ARB_shader_stencil_export DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr, virgl)
GL_ARB_shader_viewport_layer_array DONE (i965/gen6+, nvc0, radeonsi)
GL_ARB_shading_language_include DONE
GL_ARB_sparse_buffer DONE (radeonsi/CIK+)
GL_ARB_sparse_texture not started
GL_ARB_sparse_texture2 not started
@@ -347,54 +348,54 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
GLX_ARB_robustness_share_group_isolation not started
GL_EXT_direct_state_access subfeatures (in the spec order):
GL 1.1: Client commands not started
GL 1.0-1.3: Matrix and transpose matrix commands not started
GL 1.1-1.2: Texture commands not started
GL 1.2: 3D texture commands not started
GL 1.2.1: Multitexture commands not started
GL 1.2.1-3.0: Indexed texture commands not started
GL 1.2.1-3.0: Indexed generic queries not started
GL 1.2.1: EnableIndexed.. Get*Indexed not started
GL_ARB_vertex_program not started
GL 1.3: Compressed texture and multitexture commands not started
GL 1.5: Buffer commands not started
GL 2.0-2.1: Uniform and uniform matrix commands not started
GL_EXT_texture_buffer_object not started
GL_EXT_texture_integer not started
GL_EXT_gpu_shader4 not started
GL_EXT_gpu_program_parameters not started
GL 1.1: Client commands DONE
GL 1.0-1.3: Matrix and transpose matrix commands DONE
GL 1.1-1.2: Texture commands DONE
GL 1.2: 3D texture commands DONE
GL 1.2.1: Multitexture commands DONE
GL 1.2.1-3.0: Indexed texture commands DONE
GL 1.2.1-3.0: Indexed generic queries DONE
GL 1.2.1: EnableIndexed.. Get*Indexed DONE
GL_ARB_vertex_program DONE
GL 1.3: Compressed texture and multitexture commands DONE
GL 1.5: Buffer commands DONE
GL 2.0-2.1: Uniform and uniform matrix commands DONE
GL_EXT_texture_buffer_object DONE
GL_EXT_texture_integer DONE
GL_EXT_gpu_shader4 DONE
GL_EXT_gpu_program_parameters DONE
GL_NV_gpu_program4 n/a
GL_NV_framebuffer_multisample_coverage n/a
GL 3.0: Renderbuffer/framebuffer commands, Gen*Mipmap not started
GL 3.0: CopyBuffer command not started
GL_EXT_geometry_shader4 commands (expose in GL 3.2) not started
GL 3.0: Renderbuffer/framebuffer commands, Gen*Mipmap DONE
GL 3.0: CopyBuffer command DONE
GL_EXT_geometry_shader4 commands (expose in GL 3.2) DONE
GL_NV_explicit_multisample n/a
GL 3.0: Vertex array/attrib/query/map commands not started
Matrix GL tokens not started
GL 3.0: Vertex array/attrib/query/map commands DONE
Matrix GL tokens DONE
GL_EXT_direct_state_access additions from other extensions (complete list):
GL_AMD_framebuffer_sample_positions n/a
GL_AMD_gpu_shader_int64 not started
GL_ARB_bindless_texture not started
GL_ARB_buffer_storage not started
GL_ARB_clear_buffer_object not started
GL_ARB_framebuffer_no_attachments not started
GL_ARB_gpu_shader_fp64 not started
GL_ARB_instanced_arrays not started
GL_ARB_internalformat_query2 not started
GL_AMD_gpu_shader_int64 n/a (not enabled in compat profile)
GL_ARB_bindless_texture DONE
GL_ARB_buffer_storage DONE
GL_ARB_clear_buffer_object DONE
GL_ARB_framebuffer_no_attachments DONE
GL_ARB_gpu_shader_fp64 DONE
GL_ARB_instanced_arrays DONE
GL_ARB_internalformat_query2 DONE
GL_ARB_sparse_texture n/a
GL_ARB_sparse_buffer not started
GL_ARB_texture_buffer_range not started
GL_ARB_texture_storage not started
GL_ARB_texture_storage_multisample not started
GL_ARB_vertex_attrib_64bit not started
GL_ARB_vertex_attrib_binding not started
GL_EXT_buffer_storage not started
GL_EXT_external_buffer not started
GL_ARB_sparse_buffer DONE
GL_ARB_texture_buffer_range DONE
GL_ARB_texture_storage DONE
GL_ARB_texture_storage_multisample DONE
GL_ARB_vertex_attrib_64bit DONE
GL_ARB_vertex_attrib_binding DONE
GL_EXT_buffer_storage DONE
GL_EXT_external_buffer n/a
GL_EXT_separate_shader_objects n/a
GL_EXT_sparse_texture n/a
GL_EXT_texture_storage n/a
GL_EXT_vertex_attrib_64bit not started
GL_EXT_vertex_attrib_64bit DONE
GL_EXT_EGL_image_storage n/a
GL_NV_bindless_texture n/a
GL_NV_gpu_shader5 n/a
@@ -408,7 +409,6 @@ we DO NOT WANT implementations of these extensions for Mesa.
GL_ARB_geometry_shader4 Superseded by GL 3.2 geometry shaders
GL_ARB_matrix_palette Superseded by GL_ARB_vertex_program
GL_ARB_shading_language_include Not interesting
GL_ARB_shadow_ambient Superseded by GL_ARB_fragment_program
GL_ARB_vertex_blend Superseded by GL_ARB_vertex_program
@@ -416,7 +416,7 @@ Vulkan 1.0 -- all DONE: anv, radv
Vulkan 1.1 -- all DONE: anv, radv
VK_KHR_16bit_storage in progress (Alejandro)
VK_KHR_16bit_storage DONE (anv/gen8+, radv)
VK_KHR_bind_memory2 DONE (anv, radv)
VK_KHR_dedicated_allocation DONE (anv, radv)
VK_KHR_descriptor_update_template DONE (anv, radv)
@@ -435,18 +435,21 @@ Vulkan 1.1 -- all DONE: anv, radv
VK_KHR_maintenance3 DONE (anv, radv)
VK_KHR_multiview DONE (anv, radv)
VK_KHR_relaxed_block_layout DONE (anv, radv)
VK_KHR_sampler_ycbcr_conversion DONE (anv)
VK_KHR_sampler_ycbcr_conversion DONE (anv, radv)
VK_KHR_shader_draw_parameters DONE (anv, radv)
VK_KHR_storage_buffer_storage_class DONE (anv, radv)
VK_KHR_variable_pointers DONE (anv, radv)
Khronos extensions that are not part of any Vulkan version:
VK_KHR_8bit_storage DONE (anv, radv)
VK_KHR_8bit_storage DONE (anv/gen8+, radv)
VK_KHR_android_surface not started
VK_KHR_create_renderpass2 DONE (anv, radv)
VK_KHR_depth_stencil_resolve DONE (anv, radv)
VK_KHR_display DONE (anv, radv)
VK_KHR_display_swapchain DONE (anv, radv)
VK_KHR_draw_indirect_count DONE (radv)
VK_KHR_display_swapchain not started
VK_KHR_draw_indirect_count DONE (anv, radv)
VK_KHR_driver_properties DONE (anv, radv)
VK_KHR_external_fence_fd DONE (anv, radv)
VK_KHR_external_fence_win32 not started
VK_KHR_external_memory_fd DONE (anv, radv)
@@ -456,13 +459,23 @@ Khronos extensions that are not part of any Vulkan version:
VK_KHR_get_display_properties2 DONE (anv, radv)
VK_KHR_get_surface_capabilities2 DONE (anv, radv)
VK_KHR_image_format_list DONE (anv, radv)
VK_KHR_imageless_framebuffer DONE (anv, radv)
VK_KHR_incremental_present DONE (anv, radv)
VK_KHR_mir_surface not started
VK_KHR_pipeline_executable_properties DONE (anv, radv)
VK_KHR_push_descriptor DONE (anv, radv)
VK_KHR_sampler_mirror_clamp_to_edge DONE (anv, radv)
VK_KHR_shader_atomic_int64 DONE (anv, radv)
VK_KHR_shader_float16_int8 DONE (anv/gen8+, radv)
VK_KHR_shader_float_controls DONE (anv/gen8+, radv)
VK_KHR_shader_subgroup_extended_types DONE (radv)
VK_KHR_shared_presentable_image not started
VK_KHR_surface DONE (anv, radv)
VK_KHR_surface_protected_capabilities DONE (anv, radv)
VK_KHR_swapchain DONE (anv, radv)
VK_KHR_swapchain_mutable_format DONE (anv, radv)
VK_KHR_uniform_buffer_standard_layout DONE (anv, radv)
VK_KHR_vulkan_memory_model not started
VK_KHR_wayland_surface DONE (anv, radv)
VK_KHR_win32_keyed_mutex not started
VK_KHR_win32_surface not started

View File

@@ -29,7 +29,7 @@ immediately checked into git because not enough people are testing them.
Just applying patches, testing and reporting back is helpful.
<li>
<b>Driver debugging.</b>
There are plenty of open bugs in the <a href="https://bugs.freedesktop.org/describecomponents.cgi?product=Mesa">bug database</a>.
There are plenty of open bugs in the <a href="https://gitlab.freedesktop.org/mesa/mesa/-/issues">bug database</a>.
<li>
<b>Remove aliasing warnings.</b>
Enable gcc's <code>-Wstrict-aliasing=2 -fstrict-aliasing</code> arguments, and
@@ -47,7 +47,7 @@ You can find some further To-do lists here:
<b>Common To-Do lists:</b>
</p>
<ul>
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/docs/features.txt">
<li><a href="https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/docs/features.txt">
<code>features.txt</code></a> - Status of OpenGL 3.x / 4.x features in
Mesa.</li>
</ul>

View File

@@ -16,6 +16,43 @@
<h1>News</h1>
<h2>January 28, 2020</h2><p><a href="relnotes/19.3.3.html">Mesa 19.3.3</a> is released. This is a bug fix release.</p><h2>January 9, 2020</h2><p><a href="relnotes/19.3.2.html">Mesa 19.3.2</a> is released. This is a bug fix release.</p><h2>December 18, 2019</h2><p><a href="relnotes/19.2.8.html">Mesa 19.2.8</a> is released. This is a bug fix release.</p><h2>December 18, 2019</h2><p><a href="relnotes/19.3.1.html">Mesa 19.3.1</a> is released. This is a bug fix release.</p><h2>December 12, 2019</h2><p><a href="relnotes/19.3.0.html">Mesa 19.3.0</a> is released. This is a new development release. See the release notes for mor information about this release.</p><h2>December 4, 2019</h2><p><a href="relnotes/19.2.7.html">Mesa 19.2.7</a> is released. This is a bug fix release.</p><h2>November 21, 2019</h2><p><a href="relnotes/19.2.6.html">Mesa 19.2.6</a> is released. This is a bug fix release.</p><h2>November 20, 2019</h2><p><a href="relnotes/19.2.5.html">Mesa 19.2.5</a> is released. This is a bug fix release.</p><h2>November 13, 2019</h2><p><a href="relnotes/19.2.4.html">Mesa 19.2.4</a> is released. This is an emergency bugfix release, all users of 19.2.3 are recomended to upgrade immediately.</p>
<h2>November 6, 2019</h2><p><a href="relnotes/19.2.3.html">Mesa 19.2.3</a> is released. This is a bug fix release.</p><h2>October 24, 2019</h2><p><a href="relnotes/19.2.2.html">Mesa 19.2.2</a> is released. This is a bug fix release.</p><h2>October 21, 2019</h2>
<p>
<a href="relnotes/19.1.8.html">Mesa 19.1.8</a> is released.
This is a bug-fix release.
</p>
<p>
NOTE: It is anticipated that 19.1.8 will be the final release in the
19.1 series. Users of 19.1 are encouraged to migrate to the 19.2
series in order to obtain future fixes.
</p>
<h2>October 9, 2019</h2><p><a href="relnotes/19.2.1.html">Mesa 19.2.1</a> is released. This is a bug fix release.</p><h2>September 25, 2019</h2>
<p>
<a href="relnotes/19.2.0.html">Mesa 19.2.0</a> is released.
This is a new development release. See the release notes for more
information about this release
</p>
<h2>September 17, 2019</h2>
<p>
<a href="relnotes/19.1.7.html">Mesa 19.1.7</a> is released.
This is a bug-fix release.
</p>
<h2>September 3, 2019</h2>
<p>
<a href="relnotes/19.1.6.html">Mesa 19.1.6</a> is released.
This is a bug-fix release.
</p>
<h2>August 23, 2019</h2>
<p>
<a href="relnotes/19.1.5.html">Mesa 19.1.5</a> is released.
This is a bug-fix release.
</p>
<h2>August 7, 2019</h2>
<p>
<a href="relnotes/19.1.4.html">Mesa 19.1.4</a> is released.
@@ -1603,7 +1640,7 @@ shading language and built-in functions.
<h2>April 4, 2007</h2>
<p>
Thomas Hellstr&ouml;m of Tungsten Graphics has written a whitepaper
Thomas Hellstr&#246;m of Tungsten Graphics has written a whitepaper
describing the new DRI memory management system.
</p>
@@ -2001,7 +2038,7 @@ d2b5ba32b53e0ad0576c637a4cc1fb41 MesaDemos-5.1.zip
</pre>
<H2>November 12, 2003</H2>
<h2>November 12, 2003</h2>
<p>
New Mesa 5.0.2 tarballs have been uploaded to SourceForge which fix a
@@ -2614,7 +2651,7 @@ https://www.nvidia.com/Products.nsf/htmlmedia/software_drivers.html</a>.</p>
quake scene, you may want to try this out, as it contains some optimizations
specifically in the Q3A rendering path.
<h2>May 13, 1999</h2>
</p><h2>May 13, 1999</h2>
<p>For those interested in the integration of Mesa into XFree86 4.0, Precision Insight
has posted their lowlevel design documents at
<a href="http://www.precisioninsight.com">www.precisioninsight.com</a>.</p>
@@ -2655,7 +2692,7 @@ http://www.quake3arena.com/news/glopt.html
<h2>March 18, 1999</h2>
<p>The new webpages are now online. Enjoy, and let me know if you find any errors.
<h2>February 16, 1999</h2>
</p><h2>February 16, 1999</h2>
<p><a href="https://www.sgi.com/">SGI</a> releases its
<a href="http://web.archive.org/web/20040805154836/http://www.sgi.com/software/opensource/glx/download.html">GLX source code</a>.
</p>
@@ -2665,4 +2702,4 @@ http://www.quake3arena.com/news/glopt.html
</div>
</body>
</html>
</html>

View File

@@ -23,7 +23,6 @@
<li><a href="#prereq-dri">For DRI and hardware acceleration</a>
</ul>
<li><a href="#meson">Building with meson</a>
<li><a href="#autoconf">Building with autoconf (Linux/Unix/X11)</a>
<li><a href="#scons">Building with SCons (Windows/Linux)</a>
<li><a href="#android">Building with AOSP (Android)</a>
<li><a href="#libs">Library Information</a>
@@ -38,17 +37,15 @@
<h4>Build system</h4>
<ul>
<li><a href="https://mesonbuild.com">meson</a> is required when building on *nix platforms.
<li>Autoconf was removed in 19.1.0, use meson instead
<li><a href="http://www.scons.org/">SCons</a> is required for building on
Windows and optional for Linux (it's an alternative to meson.)
<li><a href="https://mesonbuild.com">meson</a> is required when building on *nix platforms and is supported on windows.
<li><a href="http://www.scons.org/">SCons</a> is an alternative for building on
Windows and Linux.
</li>
<li>Android Build system when building as native Android component. Autoconf
<li>Android Build system when building as native Android component. Meson
is used when when building ARC.
</li>
</ul>
<h4>Compiler</h4>
<p>
The following compilers are known to work, if you know of others or you're
@@ -63,12 +60,6 @@ willing to maintain support for other compiler get in touch.
<h4>Third party/extra tools.</h4>
<p>
<strong>Note</strong>: These should not be required, when building from a release tarball. If
you think you've spotted a bug let developers know by filing a
<a href="bugs.html">bug report</a>.
</p>
<ul>
<li><a href="https://www.python.org/">Python</a> - Python is required.
@@ -83,7 +74,9 @@ Python Mako module is required. Version 0.8.0 or later should work.
On Linux systems, flex and bison versions 2.5.35 and 2.4.1, respectively,
(or later) should work.
On Windows with MinGW, install flex and bison with:
</p>
<pre>mingw-get install msys-flex msys-bison</pre>
<p>
For MSVC on Windows, install
<a href="http://winflexbison.sourceforge.net/">Win flex-bison</a>.
</p>
@@ -114,9 +107,12 @@ the packaging tool used by your distro.
<h2 id="meson">2. Building with meson</h2>
<p><strong>Meson &gt;= 0.46.0 is required</strong></p>
<p>
Meson is the latest build system in mesa, it is currently able to build for
*nix systems like Linux and BSD, and will be able to build for windows as well.
*nix systems like Linux and BSD, macOS, Haiku, and Windows.
</p>
<p>
@@ -127,20 +123,22 @@ The general approach is:
ninja -C builddir/
sudo ninja -C builddir/ install
</pre>
<p>On windows you can also use the visual studio backend</p>
<pre>
meson builddir --backend=vs
cd builddir
msbuild mesa.sln /m
</pre>
<p>
Please read the <a href="meson.html">detailed meson instructions</a>
for more information
</p>
<h2 id="autoconf">3. Building with autoconf (Linux/Unix/X11)</h2>
<p>
Autoconf support was removed in Mesa 19.1.0. Please use meson instead.
</p>
<h2 id="scons">4. Building with SCons (Windows/Linux)</h2>
<h2 id="scons">3. Building with SCons (Windows/Linux)</h2>
<p>
To build Mesa with SCons on Linux or Windows do
@@ -176,7 +174,7 @@ Additional information is available in <a href="README.WIN32">README.WIN32</a>.
<h2 id="android">5. Building with AOSP (Android)</h2>
<h2 id="android">4. Building with AOSP (Android)</h2>
<p>
Currently one can build Mesa for Android as part of the AOSP project, yet
@@ -195,7 +193,7 @@ Android-x86 and/or other resources.
</p>
<h2 id="libs">6. Library Information</h2>
<h2 id="libs">5. Library Information</h2>
<p>
When compilation has finished, look in the top-level <code>lib/</code>
@@ -232,7 +230,7 @@ versions of libGL and device drivers.
</p>
<h2 id="pkg-config">7. Building OpenGL programs with pkg-config</h2>
<h2 id="pkg-config">6. Building OpenGL programs with pkg-config</h2>
<p>
Running <code>ninja install</code> will install package configuration files

View File

@@ -357,46 +357,46 @@ features.
</p>
<ul>
<li>Texture mapping:
<ul>
<li>glAreTexturesResident
<li>glBindTexture
<li>glCopyTexImage1D
<li>glCopyTexImage2D
<li>glCopyTexSubImage1D
<li>glCopyTexSubImage2D
<li>glDeleteTextures
<li>glGenTextures
<li>glIsTexture
<li>glPrioritizeTextures
<li>glTexSubImage1D
<li>glTexSubImage2D
</ul>
<ul>
<li>glAreTexturesResident
<li>glBindTexture
<li>glCopyTexImage1D
<li>glCopyTexImage2D
<li>glCopyTexSubImage1D
<li>glCopyTexSubImage2D
<li>glDeleteTextures
<li>glGenTextures
<li>glIsTexture
<li>glPrioritizeTextures
<li>glTexSubImage1D
<li>glTexSubImage2D
</ul>
<li>Vertex Arrays:
<ul>
<li>glArrayElement
<li>glColorPointer
<li>glDrawElements
<li>glEdgeFlagPointer
<li>glIndexPointer
<li>glInterleavedArrays
<li>glNormalPointer
<li>glTexCoordPointer
<li>glVertexPointer
</ul>
<ul>
<li>glArrayElement
<li>glColorPointer
<li>glDrawElements
<li>glEdgeFlagPointer
<li>glIndexPointer
<li>glInterleavedArrays
<li>glNormalPointer
<li>glTexCoordPointer
<li>glVertexPointer
</ul>
<li>Client state management:
<ul>
<li>glDisableClientState
<li>glEnableClientState
<li>glPopClientAttrib
<li>glPushClientAttrib
</ul>
<ul>
<li>glDisableClientState
<li>glEnableClientState
<li>glPopClientAttrib
<li>glPushClientAttrib
</ul>
<li>Misc:
<ul>
<li>glGetPointer
<li>glIndexub
<li>glIndexubv
<li>glPolygonOffset
</ul>
<ul>
<li>glGetPointer
<li>glIndexub
<li>glIndexubv
<li>glPolygonOffset
</ul>
</ul>
</div>

View File

@@ -20,7 +20,7 @@
<p>
Mesa is a 3-D graphics library with an API which is very similar to
that of <a href="https://www.opengl.org/">OpenGL</a>.*
that of <a href="https://www.opengl.org/">OpenGL</a><sup>[<a href="#trademark">1</a>]</sup>.
To the extent that Mesa utilizes the OpenGL command syntax or state
machine, it is being used with authorization from <a
href="https://www.sgi.com/">Silicon Graphics,
@@ -38,8 +38,8 @@ library</em>.
</p>
<p>
* OpenGL is a trademark of <a href="https://www.sgi.com/"
>Silicon Graphics Incorporated</a>.
<a id="trademark">[1]</a>: OpenGL is a trademark of <a
href="https://www.sgi.com/">Silicon Graphics Incorporated</a>.
</p>

View File

@@ -56,7 +56,7 @@ It's the fastest software rasterizer for Mesa.
For Linux, on a recent Debian based distribution do:
</p>
<pre>
aptitude install llvm-dev
aptitude install llvm-dev
</pre>
<p>
If you want development snapshot builds of LLVM for Debian and derived
@@ -68,7 +68,7 @@ It's the fastest software rasterizer for Mesa.
For a RPM-based distribution do:
</p>
<pre>
yum install llvm-devel
yum install llvm-devel
</pre>
<p>
@@ -120,15 +120,15 @@ It's the fastest software rasterizer for Mesa.
To build everything on Linux invoke scons as:
<pre>
scons build=debug libgl-xlib
scons build=debug libgl-xlib
</pre>
Alternatively, you can build it with meson with:
<pre>
mkdir build
cd build
meson -D glx=gallium-xlib -D gallium-drivers=swrast
ninja
mkdir build
cd build
meson -D glx=gallium-xlib -D gallium-drivers=swrast
ninja
</pre>
but the rest of these instructions assume that scons is used.
@@ -136,7 +136,7 @@ but the rest of these instructions assume that scons is used.
For Windows the procedure is similar except the target:
<pre>
scons platform=windows build=debug libgl-gdi
scons platform=windows build=debug libgl-gdi
</pre>
@@ -148,11 +148,11 @@ For Windows the procedure is similar except the target:
<code>libGL.so</code> into</p>
<pre>
build/foo/gallium/targets/libgl-xlib/libGL.so
build/foo/gallium/targets/libgl-xlib/libGL.so
</pre>
or
<pre>
lib/gallium/libGL.so
lib/gallium/libGL.so
</pre>
<p>To use it set the <code>LD_LIBRARY_PATH</code> environment variable
@@ -206,7 +206,7 @@ any OpenGL drivers):
To profile llvmpipe you should build as
</p>
<pre>
scons build=profile &lt;same-as-before&gt;
scons build=profile &lt;same-as-before&gt;
</pre>
<p>
@@ -221,8 +221,8 @@ On Linux, it is possible to have symbol resolution of JIT code with <a href="htt
</p>
<pre>
perf record -g /my/application
perf report
perf record -g /my/application
perf report
</pre>
<p>
@@ -255,7 +255,7 @@ Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:
</p>
<pre>
build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
</pre>

View File

@@ -34,6 +34,20 @@ iframe {
float: left;
}
figure {
margin: 0.5em;
padding: 0.5em;
border: 1px solid #ccc;
}
figure pre {
margin: 0;
}
figure figcaption {
padding-top: 0.5em;
}
.content {
position: absolute;
left: 20em;

View File

@@ -26,14 +26,18 @@
<h2 id="intro">1. Introduction</h2>
<p>For general information about Meson see the
<a href="http://mesonbuild.com/">Meson website</a>.</p>
<a href="https://mesonbuild.com/">Meson website</a>.</p>
<p><strong>Mesa's Meson build system is generally considered stable and ready
for production.</strong></p>
<p>The Meson build of Mesa is tested on Linux, macOS, Cygwin and Haiku, FreeBSD,
<p><strong>Mesa requires Meson &gt;= 0.46.0 to build.</strong>
<p>The Meson build of Mesa is tested on Linux, macOS, Windows, Cygwin, Haiku, FreeBSD,
DragonflyBSD, NetBSD, and should work on OpenBSD.</p>
<h4>Unix-like OSes</h4>
<p>If Meson is not already installed on your system, you can typically
install it with your package installer. For example:</p>
<pre>
@@ -43,9 +47,7 @@ or
<pre>
sudo dnf install meson # Fedora
</pre>
<p><strong>Mesa requires Meson &gt;= 0.46.0 to build.</strong>
<p>
Some older versions of meson do not check that they are too old and will error
out in odd ways.
</p>
@@ -55,14 +57,37 @@ If it's not already installed, use apt-get or dnf to install
the <em>ninja-build</em> package.
</p>
<h4>Windows</h4>
<p>
You will need to install python3 and meson as a module using pip. This is
because we use python for generating code, and rely on external modules
(mako). You also need pkg-config (a hard dependency of meson), flex, and bison.
The easiest way to install everything you need is with <a
href="https://chocolatey.org/">chocolatey</a>.
</p>
<pre>
choco install python3 winflexbison pkgconfiglite
</pre>
<p>You can even use chocolatey to install mingw and ninja (ninja can be used with MSVC as well)</p>
<pre>
choco install ninja mingw
</pre>
<p>Then install meson using pip</p>
<pre>
py -3 -m pip install meson mako
</pre>
You may need to add the python3 scripts directory to your path for meson.
<h2 id="basic">2. Basic Usage</h2>
<p>
The meson program is used to configure the source directory and generates
either a ninja build file or Visual Studio® build files. The latter must
be enabled via the <code>--backend</code> switch, as ninja is the default
backend on all
operating systems.
backend on all operating systems.
</p>
<p>
@@ -70,7 +95,7 @@ Meson only supports out-of-tree builds, and must be passed a
directory to put built and generated sources into. We'll call that directory
"build" here.
It's recommended to create a
<a href="http://mesonbuild.com/Using-multiple-build-directories.html">
<a href="https://mesonbuild.com/Using-multiple-build-directories.html">
separate build directory</a> for each configuration you might want to use.
</p>
@@ -101,7 +126,7 @@ running "meson build/" but this feature is being discussed upstream.
For now, we have a <code>bin/meson-options.py</code> script that prints
the options for you.
If that script doesn't work for some reason, you can always look in the
<a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/meson_options.txt">
<a href="https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/meson_options.txt">
meson_options.txt</a> file at the root of the project.
</p>
@@ -119,7 +144,7 @@ meson configure build/ -Dprefix=/tmp/install -Dglx=true
<p>
Note that options taking lists (such as <code>platforms</code>) are
<a href="http://mesonbuild.com/Build-options.html#using-build-options">a bit
<a href="https://mesonbuild.com/Build-options.html#using-build-options">a bit
more complicated</a>, but the simplest form compatible with Mesa options
is to use a comma to separate values (<code>-D platforms=drm,wayland</code>)
and brackets to represent an empty list (<code>-D platforms=[]</code>).
@@ -153,12 +178,32 @@ Meson does not do this. Instead, you will need do this:
ninja -C build/ xmlpool-pot xmlpool-update-po xmlpool-gmo
</pre>
<h4>Windows specific instructions</h4>
<p>
On windows you have a couple of choices for compilers. If you installed mingw
with chocolatey and want to use ninja you should be able to open any shell
and follow the instructions above. If you want to you MSVC, clang-cl, or ICL
(the Intel Compiler), read on.
</p>
<p>
Both ICL and MSVC come with shell environments, the easiest way to use meson
with these it to open a shell. For clang-cl you will need to open an MSVC
shell, and then override the compilers, either using a <a
href="https://mesonbuild.com/Native-environments.html">native file</a>, or
with the CC and CXX environment variables.
</p>
<p>
All of these compilers are tested and work with ninja, but if you want visual
studio integration or you just like msbuild, passing
<code>--backend=vs</code> to meson will generate a visual studio solution. If
you want to use ICL or clang-cl with the vsbackend you will need meson 0.52.0
or greater. Older versions always use the microsoft compiler.
</p>
<h2 id="advanced">3. Advanced Usage</h2>
<dl>
<dt>Installation Location</dt>
<dd>
<h3>Installation Location</h3>
<p>
Meson default to installing libGL.so in your system's main lib/ directory
and DRI drivers to a dri/ subdirectory.
@@ -180,10 +225,8 @@ to run/test the driver.
<p>
Meson also honors <code>DESTDIR</code> for installs.
</p>
</dd>
<dt>Compiler Options</dt>
<dd>
<h3>Compiler Options</h3>
<p>Meson supports the common CFLAGS, CXXFLAGS, etc. environment
variables but their use is discouraged because of the many caveats
in using them.
@@ -199,11 +242,9 @@ for C++ sources:
<pre>
meson builddir/ -Dc_args=-fmax-errors=10 -Dcpp_args=-DMAGIC=123
</pre>
</dd>
<dt>Compiler Specification</dt>
<dd>
<h3>Compiler Specification</h3>
<p>
Meson supports the standard CC and CXX environment variables for
changing the default compiler. Note that Meson does not allow
@@ -224,16 +265,28 @@ ninja -C build-clang
<p>
The default compilers depends on your operating system. Meson supports most of
the popular compilers, a complete list is available
<a href="http://mesonbuild.com/Reference-tables.html#compiler-ids">here</a>.
<a href="https://mesonbuild.com/Reference-tables.html#compiler-ids">here</a>.
</p>
</dd>
<dt>LLVM</dt>
<dd><p>Meson includes upstream logic to wrap llvm-config using its standard
<h3>LLVM</h3>
<p>Meson includes upstream logic to wrap llvm-config using its standard
dependency interface.
</p></dd>
</p>
<p>
As of meson 0.51.0 meson can use cmake to find llvm (the cmake finder
was added in meson 0.49.0, but LLVM cannot be found until 0.51) Due to the
way LLVM implements its cmake finder it will only find static libraries, it
will never find libllvm.so.
<dd><p>
There is also a <code>-Dcmake_module_path</code> option in this meson version,
which points to the root of an alternative installation (the prefix). For
example:
</p>
<pre>
meson builddir -Dcmake_module_path=/home/user/mycmake/prefix
</pre>
<p>
As of meson 0.49.0 meson also has the concept of a
<a href="https://mesonbuild.com/Native-environments.html">"native file"</a>,
these files provide information about the native build environment (as opposed
@@ -243,18 +296,17 @@ find llvm-config:
custom-llvm.ini
<pre>
[binaries]
llvm-config = '/usr/local/bin/llvm/llvm-config'
[binaries]
llvm-config = '/usr/local/bin/llvm/llvm-config'
</pre>
Then configure meson:
<pre>
meson builddir/ --native-file custom-llvm.ini
meson builddir/ --native-file custom-llvm.ini
</pre>
</dd>
<dd><p>
<p>
Meson &lt; 0.49 doesn't support native files, so to specify a custom
<code>llvm-config</code> you need to modify your <code>$PATH</code> (or
<code>%PATH%</code> on windows), which will be searched for
@@ -264,9 +316,8 @@ and <code>llvm-config-<i>$version</i></code>:
<pre>
PATH=/path/to/folder/with/llvm-config:$PATH meson build
</pre>
</dd>
<dd><p>
<p>
For selecting llvm-config for cross compiling a
<a href="https://mesonbuild.com/Cross-compilation.html#defining-the-environment">"cross file"</a>
should be used. It uses the same format as the native file above:
@@ -274,30 +325,96 @@ should be used. It uses the same format as the native file above:
<p>cross-llvm.ini</p>
<pre>
[binaries]
...
llvm-config = '/usr/lib/llvm-config-32'
[binaries]
...
llvm-config = '/usr/lib/llvm-config-32'
cmake = '/usr/bin/cmake-for-my-arch'
</pre>
<p>Obviously, only cmake or llvm-config is required.</p>
<p>Then configure meson:</p>
<pre>
meson builddir/ --cross-file cross-llvm.ini
meson builddir/ --cross-file cross-llvm.ini
</pre>
See the <a href="#cross-compilation">Cross Compilation</a> section for more information.
</dd>
<dt><code>PKG_CONFIG_PATH</code></dt>
<dd><p>The
<p>On windows (and in other cases), using llvm-config or cmake may be
either undesirable or impossible. Meson's solution for this is a
<a href="https://mesonbuild.com/Wrap-dependency-system-manual.html">wrap</a>, in
this case a "binary wrap". Follow the steps below:</p>
<ul>
<li>Install the binaries and headers into the <code>$mesa_src/subprojects/llvm</code></li>
<li>Add a meson build.build file to that directory (more on that later)</li>
</ul>
<p>The wrap file must define the following:</p>
<ul>
<li><code>dep_llvm</code>: a <code>declare_dependency()</code> object with include_directories, dependencies, and version set)</li>
</ul>
<p>It may also define:</p>
<ul>
<li><code>irbuilder_h</code>: a <code>files()</code> object pointing to llvm/IR/IRBuilder.h (this is requred for SWR)</li>
<li><code>has_rtti</code>: a <code>bool</code> that declares whether LLVM was built with RTTI. Defaults to true</li>
</ul>
<p>such a meson.build file might look like:</p>
<pre>
project('llvm', ['cpp'])
cpp = meson.get_compiler('cpp')
_deps = []
_search = join_paths(meson.current_source_dir(), 'lib')
foreach d : ['libLLVMCodeGen', 'libLLVMScalarOpts', 'libLLVMAnalysis',
'libLLVMTransformUtils', 'libLLVMCore', 'libLLVMX86CodeGen',
'libLLVMSelectionDAG', 'libLLVMipo', 'libLLVMAsmPrinter',
'libLLVMInstCombine', 'libLLVMInstrumentation', 'libLLVMMC',
'libLLVMGlobalISel', 'libLLVMObjectYAML', 'libLLVMDebugInfoPDB',
'libLLVMVectorize', 'libLLVMPasses', 'libLLVMSupport',
'libLLVMLTO', 'libLLVMObject', 'libLLVMDebugInfoCodeView',
'libLLVMDebugInfoDWARF', 'libLLVMOrcJIT', 'libLLVMProfileData',
'libLLVMObjCARCOpts', 'libLLVMBitReader', 'libLLVMCoroutines',
'libLLVMBitWriter', 'libLLVMRuntimeDyld', 'libLLVMMIRParser',
'libLLVMX86Desc', 'libLLVMAsmParser', 'libLLVMTableGen',
'libLLVMFuzzMutate', 'libLLVMLinker', 'libLLVMMCParser',
'libLLVMExecutionEngine', 'libLLVMCoverage', 'libLLVMInterpreter',
'libLLVMTarget', 'libLLVMX86AsmParser', 'libLLVMSymbolize',
'libLLVMDebugInfoMSF', 'libLLVMMCJIT', 'libLLVMXRay',
'libLLVMX86AsmPrinter', 'libLLVMX86Disassembler',
'libLLVMMCDisassembler', 'libLLVMOption', 'libLLVMIRReader',
'libLLVMLibDriver', 'libLLVMDlltoolDriver', 'libLLVMDemangle',
'libLLVMBinaryFormat', 'libLLVMLineEditor',
'libLLVMWindowsManifest', 'libLLVMX86Info', 'libLLVMX86Utils']
_deps += cpp.find_library(d, dirs : _search)
endforeach
dep_llvm = declare_dependency(
include_directories : include_directories('include'),
dependencies : _deps,
version : '6.0.0',
)
has_rtti = false
irbuilder_h = files('include/llvm/IR/IRBuilder.h')
</pre>
<p>It is very important that version is defined and is accurate, if it is not,
workarounds for the wrong version of LLVM might be used resulting in build
failures.</p>
<h3><code>PKG_CONFIG_PATH</code></h3>
<p>The
<code>pkg-config</code> utility is a hard requirement for configuring and
building Mesa on Unix-like systems. It is used to search for external libraries
on the system. This environment variable is used to control the search path for
<code>pkg-config</code>. For instance, setting
<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for package
metadata in <code>/usr/X11R6</code> before the standard directories.</p>
</dd>
</dl>
<h3>Options</h3>
<p>
One of the oddities of meson is that some options are different when passed to
the <code>meson</code> than to <code>meson configure</code>. These options are

View File

@@ -41,8 +41,7 @@ will be in constant rotation.
<p>
The way the release schedule works is explained
<a href="releasing.html#schedule" target="_parent">here</a>.
</p
>
</p>
<p>
Take a look <a href="submittingpatches.html#criteria" target="_parent">here</a>
if you'd like to nominate a patch in the next stable release.
@@ -60,73 +59,48 @@ if you'd like to nominate a patch in the next stable release.
<th>Notes</th>
</tr>
<tr>
<td rowspan="3">19.1</td>
<td>2019-08-20</td>
<td>19.1.5</td>
<td>Juan A. Suarez</td>
<td>
</tr>
<tr>
<td>2019-09-03</td>
<td>19.1.6</td>
<td>Juan A. Suarez</td>
<td>
</tr>
<tr>
<td>2019-09-17</td>
<td>19.1.7</td>
<td>Juan A. Suarez</td>
<td>Last planned 19.1.x release</td>
</tr>
<tr>
<td rowspan="4">19.2</td>
<td>2019-08-06</td>
<td>19.2.0-rc1</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-13</td>
<td>19.2.0-rc2</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-20</td>
<td>19.2.0-rc3</td>
<td>Emil Velikov</td>
<td>
</tr>
<tr>
<td>2019-08-27</td>
<td>19.2.0-rc4</td>
<td>Emil Velikov</td>
<td>Last planned RC/Final release</td>
</tr>
<tr>
<td rowspan="4">19.3</td>
<td>2019-10-15</td>
<td>19.3.0-rc1</td>
<td rowspan="3">19.3</td>
<td>2020-02-05</td>
<td>19.3.4</td>
<td>Dylan Baker</td>
<td>
<td/>
</tr>
<tr>
<td>2019-10-22</td>
<td>19.3.0-rc2</td>
<td>2020-02-12</td>
<td>19.3.5</td>
<td>Dylan Baker</td>
<td>
<td/>
</tr>
<tr>
<td>2019-10-29</td>
<td>19.3.0-rc3</td>
<td>2020-02-26</td>
<td>19.3.6</td>
<td>Dylan Baker</td>
<td>
<td>Last planned 19.3 release</td>
</tr>
<tr>
<td>2019-11-05</td>
<td>19.3.0-rc4</td>
<td rowspan="4">20.0</td>
<td>2020-01-29</td>
<td>20.0.0-rc1</td>
<td>Dylan Baker</td>
<td>Last planned RC/Final release</td>
<td/>
</tr>
<tr>
<td>2020-02-05</td>
<td>20.0.0-rc2</td>
<td>Dylan Baker</td>
<td/>
</tr>
<tr>
<td>2020-02-12</td>
<td>20.0.0-rc3</td>
<td>Dylan Baker</td>
<td/>
</tr>
<tr>
<td>2020-02-19</td>
<td>20.0.0-rc4</td>
<td>Dylan Baker</td>
<td>Or 20.0.0 final</td>
</tr>
</table>

View File

@@ -26,8 +26,7 @@
<li><a href="#prerelease">Pre-release announcement</a>
<li><a href="#release">Making a new release</a>
<li><a href="#announce">Announce the release</a>
<li><a href="#website">Update the mesa3d.org website</a>
<li><a href="#bugzilla">Update Bugzilla</a>
<li><a href="#gitlab">Update Gitlab Issues</a>
</ul>
@@ -47,10 +46,10 @@ while the latter have a non-zero one.
For example:
</p>
<pre>
Mesa 10.1.0 - 10.1 branch, feature
Mesa 10.1.4 - 10.1 branch, bugfix
Mesa 12.0.0 - 12.0 branch, feature
Mesa 12.0.2 - 12.0 branch, bugfix
Mesa 10.1.0 - 10.1 branch, feature
Mesa 10.1.4 - 10.1 branch, bugfix
Mesa 12.0.0 - 12.0 branch, feature
Mesa 12.0.2 - 12.0 branch, bugfix
</pre>
@@ -184,27 +183,27 @@ This should be noted in the <a href="#prerelease">pre-announce</a> email.
</p>
<pre>
git show b10859ec41d09c57663a258f43fe57c12332698e
git show b10859ec41d09c57663a258f43fe57c12332698e
commit b10859ec41d09c57663a258f43fe57c12332698e
Author: Jonas Pfeil &lt;pfeiljonas@gmx.de&gt;
Date: Wed Mar 1 18:11:10 2017 +0100
commit b10859ec41d09c57663a258f43fe57c12332698e
Author: Jonas Pfeil &lt;pfeiljonas@gmx.de&gt;
Date: Wed Mar 1 18:11:10 2017 +0100
ralloc: Make sure ralloc() allocations match malloc()'s alignment.
ralloc: Make sure ralloc() allocations match malloc()'s alignment.
The header of ralloc needs to be aligned, because the compiler assumes
...
The header of ralloc needs to be aligned, because the compiler assumes
...
(cherry picked from commit cd2b55e536dc806f9358f71db438dd9c246cdb14)
(cherry picked from commit cd2b55e536dc806f9358f71db438dd9c246cdb14)
Squashed with commit:
Squashed with commit:
ralloc: don't leave out the alignment factor
ralloc: don't leave out the alignment factor
Experimentation shows that without alignment factor gcc and clang choose
...
Experimentation shows that without alignment factor gcc and clang choose
...
(cherry picked from commit ff494fe999510ea40e3ed5827e7818550b6de126)
(cherry picked from commit ff494fe999510ea40e3ed5827e7818550b6de126)
</pre>
<h2>Regression/functionality testing</h2>
@@ -237,8 +236,8 @@ A live branch, which contains the currently merge/rejected patches is available
in the main repository under <code>staging/X.Y</code>. For example:
</p>
<pre>
staging/18.1 - WIP branch for the 18.1 series
staging/18.2 - WIP branch for the 18.2 series
staging/18.1 - WIP branch for the 18.1 series
staging/18.2 - WIP branch for the 18.2 series
</pre>
<p>
@@ -272,20 +271,20 @@ Check if the version number is going to remain as, alternatively
To setup the branchpoint:
</p>
<pre>
git checkout master # make sure we're in master first
git tag -s X.Y-branchpoint -m "Mesa X.Y branchpoint"
git checkout -b X.Y
git checkout master
$EDITOR VERSION # bump the version number
git commit -as
cp docs/relnotes/{X.Y,X.Y+1}.html # copy/create relnotes template
git commit -as
git push origin X.Y-branchpoint X.Y
git checkout master # make sure we're in master first
git tag -s X.Y-branchpoint -m "Mesa X.Y branchpoint"
git checkout -b X.Y
git checkout master
$EDITOR VERSION # bump the version number
git commit -as
cp docs/relnotes/{X.Y,X.Y+1}.html # copy/create relnotes template
git commit -as
git push origin X.Y-branchpoint X.Y
</pre>
<p>
Now go to
<a href="https://bugs.freedesktop.org/editversions.cgi?action=add&amp;product=Mesa" target="_parent">Bugzilla</a> and add the new Mesa version X.Y.
<a href="https://gitlab.freedesktop.org/mesa/mesa/-/milestones" target="_parent">gitlab</a> and add the new Mesa version X.Y.
</p>
<p>
@@ -483,40 +482,50 @@ So we do a quick 'touch test'
</p>
<pre>
__glxgears_cmd='glxgears 2&gt;&amp;1 | grep -v "configuration file"'
__es2info_cmd='es2_info 2&gt;&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'
__es2gears_cmd='es2gears_x11 2&gt;&amp;1 | grep -v "configuration file"'
test "x$LD_LIBRARY_PATH" != 'x' &amp;&amp; __old_ld="$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/:"${__old_ld}"
export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/
export LIBGL_DEBUG=verbose
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
export LIBGL_ALWAYS_SOFTWARE=true
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
export LIBGL_ALWAYS_SOFTWARE=true
export GALLIUM_DRIVER=softpipe
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
# Smoke test DOTA2
unset LD_LIBRARY_PATH
test "x$__old_ld" != 'x' &amp;&amp; export LD_LIBRARY_PATH="$__old_ld" &amp;&amp; unset __old_ld
unset LIBGL_DRIVERS_PATH
unset LIBGL_DEBUG
unset LIBGL_ALWAYS_SOFTWARE
unset GALLIUM_DRIVER
export VK_ICD_FILENAMES=`pwd`/test/usr/local/share/vulkan/icd.d/intel_icd.x86_64.json
steam steam://rungameid/570 -vconsole -vulkan
unset VK_ICD_FILENAMES
__glxgears_cmd='glxgears 2&gt;&amp;1 | grep -v "configuration file"'
__es2info_cmd='es2_info 2&gt;&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'
__es2gears_cmd='es2gears_x11 2&gt;&amp;1 | grep -v "configuration file"'
test "x$LD_LIBRARY_PATH" != 'x' &amp;&amp; __old_ld="$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/:"${__old_ld}"
export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/
export LIBGL_DEBUG=verbose
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
export LIBGL_ALWAYS_SOFTWARE=true
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
export LIBGL_ALWAYS_SOFTWARE=true
export GALLIUM_DRIVER=softpipe
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
# Smoke test DOTA2
unset LD_LIBRARY_PATH
test "x$__old_ld" != 'x' &amp;&amp; export LD_LIBRARY_PATH="$__old_ld" &amp;&amp; unset __old_ld
unset LIBGL_DRIVERS_PATH
unset LIBGL_DEBUG
unset LIBGL_ALWAYS_SOFTWARE
unset GALLIUM_DRIVER
export VK_ICD_FILENAMES=`pwd`/test/usr/local/share/vulkan/icd.d/intel_icd.x86_64.json
steam steam://rungameid/570 -vconsole -vulkan
unset VK_ICD_FILENAMES
</pre>
<h3>Create release notes for the new release</h3>
<p>
The release notes are completely generated by the
<code>bin/gen_release_notes.py</code> script. Simply run this script before
bumping the version
The only thing left to do is add the sha256 sums.
</p>
<h3>Update version in file VERSION</h3>
<p>
@@ -524,36 +533,12 @@ Increment the version contained in the file VERSION at Mesa's top-level, then
commit this change.
</p>
<h3>Create release notes for the new release</h3>
<p>
Create a new file docs/relnotes/X.Y.Z.html, (follow the style of the previous
release notes). Note that the sha256sums section of the release notes should
be empty (TBD) at this point.
</p>
<p>
Two scripts are available to help generate portions of the release notes:
</p>
<pre>
./bin/bugzilla_mesa.sh
./bin/shortlog_mesa.sh
</pre>
<p>
The first script identifies commits that reference bugzilla bugs and obtains
the descriptions of those bugs from bugzilla. The second script generates a
log of all commits. In both cases, HTML-formatted lists are printed to stdout
to be included in the release notes.
</p>
<p>
Commit these changes and push the branch.
</p>
<pre>
git push origin HEAD
git push origin HEAD
</pre>
@@ -564,9 +549,9 @@ Start the release process.
</p>
<pre>
# For the dist/distcheck, you may want to specify which LLVM to use:
# export LLVM_CONFIG=/usr/lib/llvm-3.9/bin/llvm-config
../relative/path/to/release.sh . # append --dist if you've already done distcheck above
# For the dist/distcheck, you may want to specify which LLVM to use:
# export LLVM_CONFIG=/usr/lib/llvm-3.9/bin/llvm-config
../relative/path/to/release.sh . # append --dist if you've already done distcheck above
</pre>
<p>
@@ -587,20 +572,18 @@ Something like the following steps will do the trick:
</p>
<pre>
git cherry-pick -x X.Y~1
git cherry-pick -x X.Y
git cherry-pick -x X.Y~1
git cherry-pick -x X.Y
</pre>
<p>
Also, edit docs/relnotes.html to add a link to the new release notes,
edit docs/index.html to add a news entry and a note in case of the
last release in a series, and remove the version from
docs/release-calendar.html. Then commit and push:
<p>Then run the <code>./bin/post_verison.py X.Y.Z</code>, where X.Y.Z is the
version you just made. This will updated docs/relnotes.html and
docs/index.html. Remove docs/release-calendar.html. Then commit and push:
</p>
<pre>
git commit -as -m "docs: update calendar, add news item and link release notes for X.Y.Z"
git push origin master X.Y
git commit -as -m "docs: update calendar, add news item and link release notes for X.Y.Z"
git push origin master X.Y
</pre>
@@ -616,15 +599,7 @@ series, if that is the case.
</p>
<h2 id="website">Update the mesa3d.org website</h2>
<p>
As the hosting was moved to freedesktop, git hooks are deployed to update the
website. Manually check that it is updated 5-10 minutes after the final <code>git push</code>
</p>
<h2 id="bugzilla">Update Bugzilla</h2>
<h2 id="gitlab">Update gitlab issues</h2>
<p>
Parse through the bugreports as listed in the docs/relnotes/X.Y.Z.html

View File

@@ -21,247 +21,252 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/19.1.4.html">19.1.4 release notes</a>
<li><a href="relnotes/19.1.3.html">19.1.3 release notes</a>
<li><a href="relnotes/19.1.2.html">19.1.2 release notes</a>
<li><a href="relnotes/19.0.8.html">19.0.8 release notes</a>
<li><a href="relnotes/19.1.1.html">19.1.1 release notes</a>
<li><a href="relnotes/19.0.7.html">19.0.7 release notes</a>
<li><a href="relnotes/19.1.0.html">19.1.0 release notes</a>
<li><a href="relnotes/19.0.6.html">19.0.6 release notes</a>
<li><a href="relnotes/19.0.5.html">19.0.5 release notes</a>
<li><a href="relnotes/19.0.4.html">19.0.4 release notes</a>
<li><a href="relnotes/19.0.3.html">19.0.3 release notes</a>
<li><a href="relnotes/19.0.2.html">19.0.2 release notes</a>
<li><a href="relnotes/18.3.6.html">18.3.6 release notes</a>
<li><a href="relnotes/19.0.1.html">19.0.1 release notes</a>
<li><a href="relnotes/18.3.5.html">18.3.5 release notes</a>
<li><a href="relnotes/19.0.0.html">19.0.0 release notes</a>
<li><a href="relnotes/18.3.4.html">18.3.4 release notes</a>
<li><a href="relnotes/18.3.3.html">18.3.3 release notes</a>
<li><a href="relnotes/18.3.2.html">18.3.2 release notes</a>
<li><a href="relnotes/18.2.8.html">18.2.8 release notes</a>
<li><a href="relnotes/18.2.7.html">18.2.7 release notes</a>
<li><a href="relnotes/18.3.1.html">18.3.1 release notes</a>
<li><a href="relnotes/18.3.0.html">18.3.0 release notes</a>
<li><a href="relnotes/18.2.6.html">18.2.6 release notes</a>
<li><a href="relnotes/18.2.5.html">18.2.5 release notes</a>
<li><a href="relnotes/18.2.4.html">18.2.4 release notes</a>
<li><a href="relnotes/18.2.3.html">18.2.3 release notes</a>
<li><a href="relnotes/18.2.2.html">18.2.2 release notes</a>
<li><a href="relnotes/18.1.9.html">18.1.9 release notes</a>
<li><a href="relnotes/18.2.1.html">18.2.1 release notes</a>
<li><a href="relnotes/18.2.0.html">18.2.0 release notes</a>
<li><a href="relnotes/18.1.8.html">18.1.8 release notes</a>
<li><a href="relnotes/18.1.7.html">18.1.7 release notes</a>
<li><a href="relnotes/18.1.6.html">18.1.6 release notes</a>
<li><a href="relnotes/18.1.5.html">18.1.5 release notes</a>
<li><a href="relnotes/18.1.4.html">18.1.4 release notes</a>
<li><a href="relnotes/18.1.3.html">18.1.3 release notes</a>
<li><a href="relnotes/18.1.2.html">18.1.2 release notes</a>
<li><a href="relnotes/18.0.5.html">18.0.5 release notes</a>
<li><a href="relnotes/18.1.1.html">18.1.1 release notes</a>
<li><a href="relnotes/18.1.0.html">18.1.0 release notes</a>
<li><a href="relnotes/18.0.4.html">18.0.4 release notes</a>
<li><a href="relnotes/18.0.3.html">18.0.3 release notes</a>
<li><a href="relnotes/18.0.2.html">18.0.2 release notes</a>
<li><a href="relnotes/18.0.1.html">18.0.1 release notes</a>
<li><a href="relnotes/17.3.9.html">17.3.9 release notes</a>
<li><a href="relnotes/17.3.8.html">17.3.8 release notes</a>
<li><a href="relnotes/18.0.0.html">18.0.0 release notes</a>
<li><a href="relnotes/17.3.7.html">17.3.7 release notes</a>
<li><a href="relnotes/17.3.6.html">17.3.6 release notes</a>
<li><a href="relnotes/17.3.5.html">17.3.5 release notes</a>
<li><a href="relnotes/17.3.4.html">17.3.4 release notes</a>
<li><a href="relnotes/17.3.3.html">17.3.3 release notes</a>
<li><a href="relnotes/17.3.2.html">17.3.2 release notes</a>
<li><a href="relnotes/17.2.8.html">17.2.8 release notes</a>
<li><a href="relnotes/17.3.1.html">17.3.1 release notes</a>
<li><a href="relnotes/17.2.7.html">17.2.7 release notes</a>
<li><a href="relnotes/17.3.0.html">17.3.0 release notes</a>
<li><a href="relnotes/17.2.6.html">17.2.6 release notes</a>
<li><a href="relnotes/17.2.5.html">17.2.5 release notes</a>
<li><a href="relnotes/17.2.4.html">17.2.4 release notes</a>
<li><a href="relnotes/17.2.3.html">17.2.3 release notes</a>
<li><a href="relnotes/17.2.2.html">17.2.2 release notes</a>
<li><a href="relnotes/17.1.10.html">17.1.10 release notes</a>
<li><a href="relnotes/17.2.1.html">17.2.1 release notes</a>
<li><a href="relnotes/17.1.9.html">17.1.9 release notes</a>
<li><a href="relnotes/17.2.0.html">17.2.0 release notes</a>
<li><a href="relnotes/17.1.8.html">17.1.8 release notes</a>
<li><a href="relnotes/17.1.7.html">17.1.7 release notes</a>
<li><a href="relnotes/17.1.6.html">17.1.6 release notes</a>
<li><a href="relnotes/17.1.5.html">17.1.5 release notes</a>
<li><a href="relnotes/17.1.4.html">17.1.4 release notes</a>
<li><a href="relnotes/17.1.3.html">17.1.3 release notes</a>
<li><a href="relnotes/17.1.2.html">17.1.2 release notes</a>
<li><a href="relnotes/17.0.7.html">17.0.7 release notes</a>
<li><a href="relnotes/17.1.1.html">17.1.1 release notes</a>
<li><a href="relnotes/17.0.6.html">17.0.6 release notes</a>
<li><a href="relnotes/17.1.0.html">17.1.0 release notes</a>
<li><a href="relnotes/17.0.5.html">17.0.5 release notes</a>
<li><a href="relnotes/17.0.4.html">17.0.4 release notes</a>
<li><a href="relnotes/17.0.3.html">17.0.3 release notes</a>
<li><a href="relnotes/17.0.2.html">17.0.2 release notes</a>
<li><a href="relnotes/13.0.6.html">13.0.6 release notes</a>
<li><a href="relnotes/17.0.1.html">17.0.1 release notes</a>
<li><a href="relnotes/13.0.5.html">13.0.5 release notes</a>
<li><a href="relnotes/17.0.0.html">17.0.0 release notes</a>
<li><a href="relnotes/13.0.4.html">13.0.4 release notes</a>
<li><a href="relnotes/12.0.6.html">12.0.6 release notes</a>
<li><a href="relnotes/13.0.3.html">13.0.3 release notes</a>
<li><a href="relnotes/12.0.5.html">12.0.5 release notes</a>
<li><a href="relnotes/13.0.2.html">13.0.2 release notes</a>
<li><a href="relnotes/13.0.1.html">13.0.1 release notes</a>
<li><a href="relnotes/12.0.4.html">12.0.4 release notes</a>
<li><a href="relnotes/13.0.0.html">13.0.0 release notes</a>
<li><a href="relnotes/12.0.3.html">12.0.3 release notes</a>
<li><a href="relnotes/12.0.2.html">12.0.2 release notes</a>
<li><a href="relnotes/12.0.1.html">12.0.1 release notes</a>
<li><a href="relnotes/12.0.0.html">12.0.0 release notes</a>
<li><a href="relnotes/11.2.2.html">11.2.2 release notes</a>
<li><a href="relnotes/11.1.4.html">11.1.4 release notes</a>
<li><a href="relnotes/11.2.1.html">11.2.1 release notes</a>
<li><a href="relnotes/11.1.3.html">11.1.3 release notes</a>
<li><a href="relnotes/11.2.0.html">11.2.0 release notes</a>
<li><a href="relnotes/11.1.2.html">11.1.2 release notes</a>
<li><a href="relnotes/11.0.9.html">11.0.9 release notes</a>
<li><a href="relnotes/11.1.1.html">11.1.1 release notes</a>
<li><a href="relnotes/11.0.8.html">11.0.8 release notes</a>
<li><a href="relnotes/11.1.0.html">11.1.0 release notes</a>
<li><a href="relnotes/11.0.7.html">11.0.7 release notes</a>
<li><a href="relnotes/11.0.6.html">11.0.6 release notes</a>
<li><a href="relnotes/11.0.5.html">11.0.5 release notes</a>
<li><a href="relnotes/11.0.4.html">11.0.4 release notes</a>
<li><a href="relnotes/11.0.3.html">11.0.3 release notes</a>
<li><a href="relnotes/10.6.9.html">10.6.9 release notes</a>
<li><a href="relnotes/11.0.2.html">11.0.2 release notes</a>
<li><a href="relnotes/11.0.1.html">11.0.1 release notes</a>
<li><a href="relnotes/10.6.8.html">10.6.8 release notes</a>
<li><a href="relnotes/11.0.0.html">11.0.0 release notes</a>
<li><a href="relnotes/10.6.7.html">10.6.7 release notes</a>
<li><a href="relnotes/10.6.6.html">10.6.6 release notes</a>
<li><a href="relnotes/10.6.5.html">10.6.5 release notes</a>
<li><a href="relnotes/10.6.4.html">10.6.4 release notes</a>
<li><a href="relnotes/10.6.3.html">10.6.3 release notes</a>
<li><a href="relnotes/10.6.2.html">10.6.2 release notes</a>
<li><a href="relnotes/10.5.9.html">10.5.9 release notes</a>
<li><a href="relnotes/10.6.1.html">10.6.1 release notes</a>
<li><a href="relnotes/10.5.8.html">10.5.8 release notes</a>
<li><a href="relnotes/10.6.0.html">10.6.0 release notes</a>
<li><a href="relnotes/10.5.7.html">10.5.7 release notes</a>
<li><a href="relnotes/10.5.6.html">10.5.6 release notes</a>
<li><a href="relnotes/10.5.5.html">10.5.5 release notes</a>
<li><a href="relnotes/10.5.4.html">10.5.4 release notes</a>
<li><a href="relnotes/10.5.3.html">10.5.3 release notes</a>
<li><a href="relnotes/10.5.2.html">10.5.2 release notes</a>
<li><a href="relnotes/10.4.7.html">10.4.7 release notes</a>
<li><a href="relnotes/10.5.1.html">10.5.1 release notes</a>
<li><a href="relnotes/10.5.0.html">10.5.0 release notes</a>
<li><a href="relnotes/10.4.6.html">10.4.6 release notes</a>
<li><a href="relnotes/10.4.5.html">10.4.5 release notes</a>
<li><a href="relnotes/10.4.4.html">10.4.4 release notes</a>
<li><a href="relnotes/10.4.3.html">10.4.3 release notes</a>
<li><a href="relnotes/10.4.2.html">10.4.2 release notes</a>
<li><a href="relnotes/10.3.7.html">10.3.7 release notes</a>
<li><a href="relnotes/10.4.1.html">10.4.1 release notes</a>
<li><a href="relnotes/10.3.6.html">10.3.6 release notes</a>
<li><a href="relnotes/10.4.html">10.4 release notes</a>
<li><a href="relnotes/10.3.5.html">10.3.5 release notes</a>
<li><a href="relnotes/10.3.4.html">10.3.4 release notes</a>
<li><a href="relnotes/10.3.3.html">10.3.3 release notes</a>
<li><a href="relnotes/10.3.2.html">10.3.2 release notes</a>
<li><a href="relnotes/10.3.1.html">10.3.1 release notes</a>
<li><a href="relnotes/10.2.9.html">10.2.9 release notes</a>
<li><a href="relnotes/10.3.html">10.3 release notes</a>
<li><a href="relnotes/10.2.8.html">10.2.8 release notes</a>
<li><a href="relnotes/10.2.7.html">10.2.7 release notes</a>
<li><a href="relnotes/10.2.6.html">10.2.6 release notes</a>
<li><a href="relnotes/10.2.5.html">10.2.5 release notes</a>
<li><a href="relnotes/10.2.4.html">10.2.4 release notes</a>
<li><a href="relnotes/10.2.3.html">10.2.3 release notes</a>
<li><a href="relnotes/10.2.2.html">10.2.2 release notes</a>
<li><a href="relnotes/10.2.1.html">10.2.1 release notes</a>
<li><a href="relnotes/10.2.html">10.2 release notes</a>
<li><a href="relnotes/10.1.6.html">10.1.6 release notes</a>
<li><a href="relnotes/10.1.5.html">10.1.5 release notes</a>
<li><a href="relnotes/10.1.4.html">10.1.4 release notes</a>
<li><a href="relnotes/10.1.3.html">10.1.3 release notes</a>
<li><a href="relnotes/10.1.2.html">10.1.2 release notes</a>
<li><a href="relnotes/10.1.1.html">10.1.1 release notes</a>
<li><a href="relnotes/10.1.html">10.1 release notes</a>
<li><a href="relnotes/10.0.5.html">10.0.5 release notes</a>
<li><a href="relnotes/10.0.4.html">10.0.4 release notes</a>
<li><a href="relnotes/10.0.3.html">10.0.3 release notes</a>
<li><a href="relnotes/10.0.2.html">10.0.2 release notes</a>
<li><a href="relnotes/10.0.1.html">10.0.1 release notes</a>
<li><a href="relnotes/10.0.html">10.0 release notes</a>
<li><a href="relnotes/9.2.5.html">9.2.5 release notes</a>
<li><a href="relnotes/9.2.4.html">9.2.4 release notes</a>
<li><a href="relnotes/9.2.3.html">9.2.3 release notes</a>
<li><a href="relnotes/9.2.2.html">9.2.2 release notes</a>
<li><a href="relnotes/9.2.1.html">9.2.1 release notes</a>
<li><a href="relnotes/9.2.html">9.2 release notes</a>
<li><a href="relnotes/9.1.7.html">9.1.7 release notes</a>
<li><a href="relnotes/9.1.6.html">9.1.6 release notes</a>
<li><a href="relnotes/9.1.5.html">9.1.5 release notes</a>
<li><a href="relnotes/9.1.4.html">9.1.4 release notes</a>
<li><a href="relnotes/9.1.3.html">9.1.3 release notes</a>
<li><a href="relnotes/9.1.2.html">9.1.2 release notes</a>
<li><a href="relnotes/9.1.1.html">9.1.1 release notes</a>
<li><a href="relnotes/9.1.html">9.1 release notes</a>
<li><a href="relnotes/9.0.3.html">9.0.3 release notes</a>
<li><a href="relnotes/9.0.2.html">9.0.2 release notes</a>
<li><a href="relnotes/9.0.1.html">9.0.1 release notes</a>
<li><a href="relnotes/9.0.html">9.0 release notes</a>
<li><a href="relnotes/8.0.5.html">8.0.5 release notes</a>
<li><a href="relnotes/8.0.4.html">8.0.4 release notes</a>
<li><a href="relnotes/8.0.3.html">8.0.3 release notes</a>
<li><a href="relnotes/8.0.2.html">8.0.2 release notes</a>
<li><a href="relnotes/8.0.1.html">8.0.1 release notes</a>
<li><a href="relnotes/8.0.html">8.0 release notes</a>
<li><a href="relnotes/7.11.2.html">7.11.2 release notes</a>
<li><a href="relnotes/7.11.1.html">7.11.1 release notes</a>
<li><a href="relnotes/7.11.html">7.11 release notes</a>
<li><a href="relnotes/7.10.3.html">7.10.3 release notes</a>
<li><a href="relnotes/7.10.2.html">7.10.2 release notes</a>
<li><a href="relnotes/7.10.1.html">7.10.1 release notes</a>
<li><a href="relnotes/7.10.html">7.10 release notes</a>
<li><a href="relnotes/7.9.2.html">7.9.2 release notes</a>
<li><a href="relnotes/7.9.1.html">7.9.1 release notes</a>
<li><a href="relnotes/7.9.html">7.9 release notes</a>
<li><a href="relnotes/7.8.3.html">7.8.3 release notes</a>
<li><a href="relnotes/7.8.2.html">7.8.2 release notes</a>
<li><a href="relnotes/7.8.1.html">7.8.1 release notes</a>
<li><a href="relnotes/7.8.html">7.8 release notes</a>
<li><a href="relnotes/7.7.1.html">7.7.1 release notes</a>
<li><a href="relnotes/7.7.html">7.7 release notes</a>
<li><a href="relnotes/7.6.1.html">7.6.1 release notes</a>
<li><a href="relnotes/7.6.html">7.6 release notes</a>
<li><a href="relnotes/7.5.2.html">7.5.2 release notes</a>
<li><a href="relnotes/7.5.1.html">7.5.1 release notes</a>
<li><a href="relnotes/7.5.html">7.5 release notes</a>
<li><a href="relnotes/7.4.4.html">7.4.4 release notes</a>
<li><a href="relnotes/7.4.3.html">7.4.3 release notes</a>
<li><a href="relnotes/7.4.2.html">7.4.2 release notes</a>
<li><a href="relnotes/7.4.1.html">7.4.1 release notes</a>
<li><a href="relnotes/7.4.html">7.4 release notes</a>
<li><a href="relnotes/7.3.html">7.3 release notes</a>
<li><a href="relnotes/7.2.html">7.2 release notes</a>
<li><a href="relnotes/7.1.html">7.1 release notes</a>
<li><a href="relnotes/7.0.4.html">7.0.4 release notes</a>
<li><a href="relnotes/7.0.3.html">7.0.3 release notes</a>
<li><a href="relnotes/7.0.2.html">7.0.2 release notes</a>
<li><a href="relnotes/7.0.1.html">7.0.1 release notes</a>
<li><a href="relnotes/7.0.html">7.0 release notes</a>
<li><a href="relnotes/6.5.3.html">6.5.3 release notes</a>
<li><a href="relnotes/6.5.2.html">6.5.2 release notes</a>
<li><a href="relnotes/6.5.1.html">6.5.1 release notes</a>
<li><a href="relnotes/6.5.html">6.5 release notes</a>
<li><a href="relnotes/6.4.2.html">6.4.2 release notes</a>
<li><a href="relnotes/6.4.1.html">6.4.1 release notes</a>
<li><a href="relnotes/6.4.html">6.4 release notes</a>
</ul>
<li><a href="relnotes/19.3.3.html">19.3.3 release notes</a></li><li><a href="relnotes/19.3.2.html">19.3.2 release notes</a></li><li><a href="relnotes/19.2.8.html">19.2.8 release notes</a></li><li><a href="relnotes/19.3.1.html">19.3.1 release notes</a></li><li><a href="relnotes/19.3.0.html">19.3.0 release notes</a></li><li><a href="relnotes/19.2.7.html">19.2.7 release notes</a></li><li><a href="relnotes/19.2.6.html">19.2.6 release notes</a></li><li><a href="relnotes/19.2.5.html">19.2.5 release notes</a></li><li><a href="relnotes/19.2.4.html">19.2.4 release notes</a></li><li><a href="relnotes/19.2.3.html">19.2.3 release notes</a></li><li><a href="relnotes/19.2.2.html">19.2.2 release notes</a></li><li><a href="relnotes/19.1.8.html">19.1.8 release notes</a>
</li><li><a href="relnotes/19.2.1.html">19.2.1 release notes</a></li><li><a href="relnotes/19.2.0.html">19.2.0 release notes</a>
</li><li><a href="relnotes/19.1.7.html">19.1.7 release notes</a>
</li><li><a href="relnotes/19.1.6.html">19.1.6 release notes</a>
</li><li><a href="relnotes/19.1.5.html">19.1.5 release notes</a>
</li><li><a href="relnotes/19.1.4.html">19.1.4 release notes</a>
</li><li><a href="relnotes/19.1.3.html">19.1.3 release notes</a>
</li><li><a href="relnotes/19.1.2.html">19.1.2 release notes</a>
</li><li><a href="relnotes/19.0.8.html">19.0.8 release notes</a>
</li><li><a href="relnotes/19.1.1.html">19.1.1 release notes</a>
</li><li><a href="relnotes/19.0.7.html">19.0.7 release notes</a>
</li><li><a href="relnotes/19.1.0.html">19.1.0 release notes</a>
</li><li><a href="relnotes/19.0.6.html">19.0.6 release notes</a>
</li><li><a href="relnotes/19.0.5.html">19.0.5 release notes</a>
</li><li><a href="relnotes/19.0.4.html">19.0.4 release notes</a>
</li><li><a href="relnotes/19.0.3.html">19.0.3 release notes</a>
</li><li><a href="relnotes/19.0.2.html">19.0.2 release notes</a>
</li><li><a href="relnotes/18.3.6.html">18.3.6 release notes</a>
</li><li><a href="relnotes/19.0.1.html">19.0.1 release notes</a>
</li><li><a href="relnotes/18.3.5.html">18.3.5 release notes</a>
</li><li><a href="relnotes/19.0.0.html">19.0.0 release notes</a>
</li><li><a href="relnotes/18.3.4.html">18.3.4 release notes</a>
</li><li><a href="relnotes/18.3.3.html">18.3.3 release notes</a>
</li><li><a href="relnotes/18.3.2.html">18.3.2 release notes</a>
</li><li><a href="relnotes/18.2.8.html">18.2.8 release notes</a>
</li><li><a href="relnotes/18.2.7.html">18.2.7 release notes</a>
</li><li><a href="relnotes/18.3.1.html">18.3.1 release notes</a>
</li><li><a href="relnotes/18.3.0.html">18.3.0 release notes</a>
</li><li><a href="relnotes/18.2.6.html">18.2.6 release notes</a>
</li><li><a href="relnotes/18.2.5.html">18.2.5 release notes</a>
</li><li><a href="relnotes/18.2.4.html">18.2.4 release notes</a>
</li><li><a href="relnotes/18.2.3.html">18.2.3 release notes</a>
</li><li><a href="relnotes/18.2.2.html">18.2.2 release notes</a>
</li><li><a href="relnotes/18.1.9.html">18.1.9 release notes</a>
</li><li><a href="relnotes/18.2.1.html">18.2.1 release notes</a>
</li><li><a href="relnotes/18.2.0.html">18.2.0 release notes</a>
</li><li><a href="relnotes/18.1.8.html">18.1.8 release notes</a>
</li><li><a href="relnotes/18.1.7.html">18.1.7 release notes</a>
</li><li><a href="relnotes/18.1.6.html">18.1.6 release notes</a>
</li><li><a href="relnotes/18.1.5.html">18.1.5 release notes</a>
</li><li><a href="relnotes/18.1.4.html">18.1.4 release notes</a>
</li><li><a href="relnotes/18.1.3.html">18.1.3 release notes</a>
</li><li><a href="relnotes/18.1.2.html">18.1.2 release notes</a>
</li><li><a href="relnotes/18.0.5.html">18.0.5 release notes</a>
</li><li><a href="relnotes/18.1.1.html">18.1.1 release notes</a>
</li><li><a href="relnotes/18.1.0.html">18.1.0 release notes</a>
</li><li><a href="relnotes/18.0.4.html">18.0.4 release notes</a>
</li><li><a href="relnotes/18.0.3.html">18.0.3 release notes</a>
</li><li><a href="relnotes/18.0.2.html">18.0.2 release notes</a>
</li><li><a href="relnotes/18.0.1.html">18.0.1 release notes</a>
</li><li><a href="relnotes/17.3.9.html">17.3.9 release notes</a>
</li><li><a href="relnotes/17.3.8.html">17.3.8 release notes</a>
</li><li><a href="relnotes/18.0.0.html">18.0.0 release notes</a>
</li><li><a href="relnotes/17.3.7.html">17.3.7 release notes</a>
</li><li><a href="relnotes/17.3.6.html">17.3.6 release notes</a>
</li><li><a href="relnotes/17.3.5.html">17.3.5 release notes</a>
</li><li><a href="relnotes/17.3.4.html">17.3.4 release notes</a>
</li><li><a href="relnotes/17.3.3.html">17.3.3 release notes</a>
</li><li><a href="relnotes/17.3.2.html">17.3.2 release notes</a>
</li><li><a href="relnotes/17.2.8.html">17.2.8 release notes</a>
</li><li><a href="relnotes/17.3.1.html">17.3.1 release notes</a>
</li><li><a href="relnotes/17.2.7.html">17.2.7 release notes</a>
</li><li><a href="relnotes/17.3.0.html">17.3.0 release notes</a>
</li><li><a href="relnotes/17.2.6.html">17.2.6 release notes</a>
</li><li><a href="relnotes/17.2.5.html">17.2.5 release notes</a>
</li><li><a href="relnotes/17.2.4.html">17.2.4 release notes</a>
</li><li><a href="relnotes/17.2.3.html">17.2.3 release notes</a>
</li><li><a href="relnotes/17.2.2.html">17.2.2 release notes</a>
</li><li><a href="relnotes/17.1.10.html">17.1.10 release notes</a>
</li><li><a href="relnotes/17.2.1.html">17.2.1 release notes</a>
</li><li><a href="relnotes/17.1.9.html">17.1.9 release notes</a>
</li><li><a href="relnotes/17.2.0.html">17.2.0 release notes</a>
</li><li><a href="relnotes/17.1.8.html">17.1.8 release notes</a>
</li><li><a href="relnotes/17.1.7.html">17.1.7 release notes</a>
</li><li><a href="relnotes/17.1.6.html">17.1.6 release notes</a>
</li><li><a href="relnotes/17.1.5.html">17.1.5 release notes</a>
</li><li><a href="relnotes/17.1.4.html">17.1.4 release notes</a>
</li><li><a href="relnotes/17.1.3.html">17.1.3 release notes</a>
</li><li><a href="relnotes/17.1.2.html">17.1.2 release notes</a>
</li><li><a href="relnotes/17.0.7.html">17.0.7 release notes</a>
</li><li><a href="relnotes/17.1.1.html">17.1.1 release notes</a>
</li><li><a href="relnotes/17.0.6.html">17.0.6 release notes</a>
</li><li><a href="relnotes/17.1.0.html">17.1.0 release notes</a>
</li><li><a href="relnotes/17.0.5.html">17.0.5 release notes</a>
</li><li><a href="relnotes/17.0.4.html">17.0.4 release notes</a>
</li><li><a href="relnotes/17.0.3.html">17.0.3 release notes</a>
</li><li><a href="relnotes/17.0.2.html">17.0.2 release notes</a>
</li><li><a href="relnotes/13.0.6.html">13.0.6 release notes</a>
</li><li><a href="relnotes/17.0.1.html">17.0.1 release notes</a>
</li><li><a href="relnotes/13.0.5.html">13.0.5 release notes</a>
</li><li><a href="relnotes/17.0.0.html">17.0.0 release notes</a>
</li><li><a href="relnotes/13.0.4.html">13.0.4 release notes</a>
</li><li><a href="relnotes/12.0.6.html">12.0.6 release notes</a>
</li><li><a href="relnotes/13.0.3.html">13.0.3 release notes</a>
</li><li><a href="relnotes/12.0.5.html">12.0.5 release notes</a>
</li><li><a href="relnotes/13.0.2.html">13.0.2 release notes</a>
</li><li><a href="relnotes/13.0.1.html">13.0.1 release notes</a>
</li><li><a href="relnotes/12.0.4.html">12.0.4 release notes</a>
</li><li><a href="relnotes/13.0.0.html">13.0.0 release notes</a>
</li><li><a href="relnotes/12.0.3.html">12.0.3 release notes</a>
</li><li><a href="relnotes/12.0.2.html">12.0.2 release notes</a>
</li><li><a href="relnotes/12.0.1.html">12.0.1 release notes</a>
</li><li><a href="relnotes/12.0.0.html">12.0.0 release notes</a>
</li><li><a href="relnotes/11.2.2.html">11.2.2 release notes</a>
</li><li><a href="relnotes/11.1.4.html">11.1.4 release notes</a>
</li><li><a href="relnotes/11.2.1.html">11.2.1 release notes</a>
</li><li><a href="relnotes/11.1.3.html">11.1.3 release notes</a>
</li><li><a href="relnotes/11.2.0.html">11.2.0 release notes</a>
</li><li><a href="relnotes/11.1.2.html">11.1.2 release notes</a>
</li><li><a href="relnotes/11.0.9.html">11.0.9 release notes</a>
</li><li><a href="relnotes/11.1.1.html">11.1.1 release notes</a>
</li><li><a href="relnotes/11.0.8.html">11.0.8 release notes</a>
</li><li><a href="relnotes/11.1.0.html">11.1.0 release notes</a>
</li><li><a href="relnotes/11.0.7.html">11.0.7 release notes</a>
</li><li><a href="relnotes/11.0.6.html">11.0.6 release notes</a>
</li><li><a href="relnotes/11.0.5.html">11.0.5 release notes</a>
</li><li><a href="relnotes/11.0.4.html">11.0.4 release notes</a>
</li><li><a href="relnotes/11.0.3.html">11.0.3 release notes</a>
</li><li><a href="relnotes/10.6.9.html">10.6.9 release notes</a>
</li><li><a href="relnotes/11.0.2.html">11.0.2 release notes</a>
</li><li><a href="relnotes/11.0.1.html">11.0.1 release notes</a>
</li><li><a href="relnotes/10.6.8.html">10.6.8 release notes</a>
</li><li><a href="relnotes/11.0.0.html">11.0.0 release notes</a>
</li><li><a href="relnotes/10.6.7.html">10.6.7 release notes</a>
</li><li><a href="relnotes/10.6.6.html">10.6.6 release notes</a>
</li><li><a href="relnotes/10.6.5.html">10.6.5 release notes</a>
</li><li><a href="relnotes/10.6.4.html">10.6.4 release notes</a>
</li><li><a href="relnotes/10.6.3.html">10.6.3 release notes</a>
</li><li><a href="relnotes/10.6.2.html">10.6.2 release notes</a>
</li><li><a href="relnotes/10.5.9.html">10.5.9 release notes</a>
</li><li><a href="relnotes/10.6.1.html">10.6.1 release notes</a>
</li><li><a href="relnotes/10.5.8.html">10.5.8 release notes</a>
</li><li><a href="relnotes/10.6.0.html">10.6.0 release notes</a>
</li><li><a href="relnotes/10.5.7.html">10.5.7 release notes</a>
</li><li><a href="relnotes/10.5.6.html">10.5.6 release notes</a>
</li><li><a href="relnotes/10.5.5.html">10.5.5 release notes</a>
</li><li><a href="relnotes/10.5.4.html">10.5.4 release notes</a>
</li><li><a href="relnotes/10.5.3.html">10.5.3 release notes</a>
</li><li><a href="relnotes/10.5.2.html">10.5.2 release notes</a>
</li><li><a href="relnotes/10.4.7.html">10.4.7 release notes</a>
</li><li><a href="relnotes/10.5.1.html">10.5.1 release notes</a>
</li><li><a href="relnotes/10.5.0.html">10.5.0 release notes</a>
</li><li><a href="relnotes/10.4.6.html">10.4.6 release notes</a>
</li><li><a href="relnotes/10.4.5.html">10.4.5 release notes</a>
</li><li><a href="relnotes/10.4.4.html">10.4.4 release notes</a>
</li><li><a href="relnotes/10.4.3.html">10.4.3 release notes</a>
</li><li><a href="relnotes/10.4.2.html">10.4.2 release notes</a>
</li><li><a href="relnotes/10.3.7.html">10.3.7 release notes</a>
</li><li><a href="relnotes/10.4.1.html">10.4.1 release notes</a>
</li><li><a href="relnotes/10.3.6.html">10.3.6 release notes</a>
</li><li><a href="relnotes/10.4.html">10.4 release notes</a>
</li><li><a href="relnotes/10.3.5.html">10.3.5 release notes</a>
</li><li><a href="relnotes/10.3.4.html">10.3.4 release notes</a>
</li><li><a href="relnotes/10.3.3.html">10.3.3 release notes</a>
</li><li><a href="relnotes/10.3.2.html">10.3.2 release notes</a>
</li><li><a href="relnotes/10.3.1.html">10.3.1 release notes</a>
</li><li><a href="relnotes/10.2.9.html">10.2.9 release notes</a>
</li><li><a href="relnotes/10.3.html">10.3 release notes</a>
</li><li><a href="relnotes/10.2.8.html">10.2.8 release notes</a>
</li><li><a href="relnotes/10.2.7.html">10.2.7 release notes</a>
</li><li><a href="relnotes/10.2.6.html">10.2.6 release notes</a>
</li><li><a href="relnotes/10.2.5.html">10.2.5 release notes</a>
</li><li><a href="relnotes/10.2.4.html">10.2.4 release notes</a>
</li><li><a href="relnotes/10.2.3.html">10.2.3 release notes</a>
</li><li><a href="relnotes/10.2.2.html">10.2.2 release notes</a>
</li><li><a href="relnotes/10.2.1.html">10.2.1 release notes</a>
</li><li><a href="relnotes/10.2.html">10.2 release notes</a>
</li><li><a href="relnotes/10.1.6.html">10.1.6 release notes</a>
</li><li><a href="relnotes/10.1.5.html">10.1.5 release notes</a>
</li><li><a href="relnotes/10.1.4.html">10.1.4 release notes</a>
</li><li><a href="relnotes/10.1.3.html">10.1.3 release notes</a>
</li><li><a href="relnotes/10.1.2.html">10.1.2 release notes</a>
</li><li><a href="relnotes/10.1.1.html">10.1.1 release notes</a>
</li><li><a href="relnotes/10.1.html">10.1 release notes</a>
</li><li><a href="relnotes/10.0.5.html">10.0.5 release notes</a>
</li><li><a href="relnotes/10.0.4.html">10.0.4 release notes</a>
</li><li><a href="relnotes/10.0.3.html">10.0.3 release notes</a>
</li><li><a href="relnotes/10.0.2.html">10.0.2 release notes</a>
</li><li><a href="relnotes/10.0.1.html">10.0.1 release notes</a>
</li><li><a href="relnotes/10.0.html">10.0 release notes</a>
</li><li><a href="relnotes/9.2.5.html">9.2.5 release notes</a>
</li><li><a href="relnotes/9.2.4.html">9.2.4 release notes</a>
</li><li><a href="relnotes/9.2.3.html">9.2.3 release notes</a>
</li><li><a href="relnotes/9.2.2.html">9.2.2 release notes</a>
</li><li><a href="relnotes/9.2.1.html">9.2.1 release notes</a>
</li><li><a href="relnotes/9.2.html">9.2 release notes</a>
</li><li><a href="relnotes/9.1.7.html">9.1.7 release notes</a>
</li><li><a href="relnotes/9.1.6.html">9.1.6 release notes</a>
</li><li><a href="relnotes/9.1.5.html">9.1.5 release notes</a>
</li><li><a href="relnotes/9.1.4.html">9.1.4 release notes</a>
</li><li><a href="relnotes/9.1.3.html">9.1.3 release notes</a>
</li><li><a href="relnotes/9.1.2.html">9.1.2 release notes</a>
</li><li><a href="relnotes/9.1.1.html">9.1.1 release notes</a>
</li><li><a href="relnotes/9.1.html">9.1 release notes</a>
</li><li><a href="relnotes/9.0.3.html">9.0.3 release notes</a>
</li><li><a href="relnotes/9.0.2.html">9.0.2 release notes</a>
</li><li><a href="relnotes/9.0.1.html">9.0.1 release notes</a>
</li><li><a href="relnotes/9.0.html">9.0 release notes</a>
</li><li><a href="relnotes/8.0.5.html">8.0.5 release notes</a>
</li><li><a href="relnotes/8.0.4.html">8.0.4 release notes</a>
</li><li><a href="relnotes/8.0.3.html">8.0.3 release notes</a>
</li><li><a href="relnotes/8.0.2.html">8.0.2 release notes</a>
</li><li><a href="relnotes/8.0.1.html">8.0.1 release notes</a>
</li><li><a href="relnotes/8.0.html">8.0 release notes</a>
</li><li><a href="relnotes/7.11.2.html">7.11.2 release notes</a>
</li><li><a href="relnotes/7.11.1.html">7.11.1 release notes</a>
</li><li><a href="relnotes/7.11.html">7.11 release notes</a>
</li><li><a href="relnotes/7.10.3.html">7.10.3 release notes</a>
</li><li><a href="relnotes/7.10.2.html">7.10.2 release notes</a>
</li><li><a href="relnotes/7.10.1.html">7.10.1 release notes</a>
</li><li><a href="relnotes/7.10.html">7.10 release notes</a>
</li><li><a href="relnotes/7.9.2.html">7.9.2 release notes</a>
</li><li><a href="relnotes/7.9.1.html">7.9.1 release notes</a>
</li><li><a href="relnotes/7.9.html">7.9 release notes</a>
</li><li><a href="relnotes/7.8.3.html">7.8.3 release notes</a>
</li><li><a href="relnotes/7.8.2.html">7.8.2 release notes</a>
</li><li><a href="relnotes/7.8.1.html">7.8.1 release notes</a>
</li><li><a href="relnotes/7.8.html">7.8 release notes</a>
</li><li><a href="relnotes/7.7.1.html">7.7.1 release notes</a>
</li><li><a href="relnotes/7.7.html">7.7 release notes</a>
</li><li><a href="relnotes/7.6.1.html">7.6.1 release notes</a>
</li><li><a href="relnotes/7.6.html">7.6 release notes</a>
</li><li><a href="relnotes/7.5.2.html">7.5.2 release notes</a>
</li><li><a href="relnotes/7.5.1.html">7.5.1 release notes</a>
</li><li><a href="relnotes/7.5.html">7.5 release notes</a>
</li><li><a href="relnotes/7.4.4.html">7.4.4 release notes</a>
</li><li><a href="relnotes/7.4.3.html">7.4.3 release notes</a>
</li><li><a href="relnotes/7.4.2.html">7.4.2 release notes</a>
</li><li><a href="relnotes/7.4.1.html">7.4.1 release notes</a>
</li><li><a href="relnotes/7.4.html">7.4 release notes</a>
</li><li><a href="relnotes/7.3.html">7.3 release notes</a>
</li><li><a href="relnotes/7.2.html">7.2 release notes</a>
</li><li><a href="relnotes/7.1.html">7.1 release notes</a>
</li><li><a href="relnotes/7.0.4.html">7.0.4 release notes</a>
</li><li><a href="relnotes/7.0.3.html">7.0.3 release notes</a>
</li><li><a href="relnotes/7.0.2.html">7.0.2 release notes</a>
</li><li><a href="relnotes/7.0.1.html">7.0.1 release notes</a>
</li><li><a href="relnotes/7.0.html">7.0 release notes</a>
</li><li><a href="relnotes/6.5.3.html">6.5.3 release notes</a>
</li><li><a href="relnotes/6.5.2.html">6.5.2 release notes</a>
</li><li><a href="relnotes/6.5.1.html">6.5.1 release notes</a>
</li><li><a href="relnotes/6.5.html">6.5 release notes</a>
</li><li><a href="relnotes/6.4.2.html">6.4.2 release notes</a>
</li><li><a href="relnotes/6.4.1.html">6.4.1 release notes</a>
</li><li><a href="relnotes/6.4.html">6.4 release notes</a>
</li></ul>
<p>
Versions of Mesa prior to 6.4 are summarized in the
@@ -270,32 +275,32 @@ Versions of Mesa prior to 6.4 are summarized in the
<ul>
<li><a href="relnotes/6.3.2">6.3.2 release notes</a>
<li><a href="relnotes/6.3.1">6.3.1 release notes</a>
<li><a href="relnotes/6.3">6.3 release notes</a>
<li><a href="relnotes/6.2.1">6.2.1 release notes</a>
<li><a href="relnotes/6.2">6.2 release notes</a>
<li><a href="relnotes/6.1">6.1 release notes</a>
<li><a href="relnotes/6.0.1">6.0.1 release notes</a>
<li><a href="relnotes/6.0">6.0 release notes</a>
<li><a href="relnotes/5.1">5.1 release notes</a>
<li><a href="relnotes/5.0.2">5.0.2 release notes</a>
<li><a href="relnotes/5.0.1">5.0.1 release notes</a>
<li><a href="relnotes/5.0">5.0 release notes</a>
<li><a href="relnotes/4.1">4.1 release notes</a>
<li><a href="relnotes/4.0.3">4.0.3 release notes</a>
<li><a href="relnotes/4.0.2">4.0.2 release notes</a>
<li><a href="relnotes/4.0.1">4.0.1 release notes</a>
<li><a href="relnotes/4.0">4.0 release notes</a>
<li><a href="relnotes/3.5">3.5 release notes</a>
<li><a href="relnotes/3.4.2">3.4.2 release notes</a>
<li><a href="relnotes/3.4.1">3.4.1 release notes</a>
<li><a href="relnotes/3.4">3.4 release notes</a>
<li><a href="relnotes/3.3">3.3 release notes</a>
<li><a href="relnotes/3.2.1">3.2.1 release notes</a>
<li><a href="relnotes/3.2">3.2 release notes</a>
<li><a href="relnotes/3.1">3.1 release notes</a>
</ul>
</li><li><a href="relnotes/6.3.1">6.3.1 release notes</a>
</li><li><a href="relnotes/6.3">6.3 release notes</a>
</li><li><a href="relnotes/6.2.1">6.2.1 release notes</a>
</li><li><a href="relnotes/6.2">6.2 release notes</a>
</li><li><a href="relnotes/6.1">6.1 release notes</a>
</li><li><a href="relnotes/6.0.1">6.0.1 release notes</a>
</li><li><a href="relnotes/6.0">6.0 release notes</a>
</li><li><a href="relnotes/5.1">5.1 release notes</a>
</li><li><a href="relnotes/5.0.2">5.0.2 release notes</a>
</li><li><a href="relnotes/5.0.1">5.0.1 release notes</a>
</li><li><a href="relnotes/5.0">5.0 release notes</a>
</li><li><a href="relnotes/4.1">4.1 release notes</a>
</li><li><a href="relnotes/4.0.3">4.0.3 release notes</a>
</li><li><a href="relnotes/4.0.2">4.0.2 release notes</a>
</li><li><a href="relnotes/4.0.1">4.0.1 release notes</a>
</li><li><a href="relnotes/4.0">4.0 release notes</a>
</li><li><a href="relnotes/3.5">3.5 release notes</a>
</li><li><a href="relnotes/3.4.2">3.4.2 release notes</a>
</li><li><a href="relnotes/3.4.1">3.4.1 release notes</a>
</li><li><a href="relnotes/3.4">3.4 release notes</a>
</li><li><a href="relnotes/3.3">3.3 release notes</a>
</li><li><a href="relnotes/3.2.1">3.2.1 release notes</a>
</li><li><a href="relnotes/3.2">3.2 release notes</a>
</li><li><a href="relnotes/3.1">3.1 release notes</a>
</li></ul>
</div>
</body>
</html>
</html>

119
docs/relnotes/19.1.5.html Normal file
View File

@@ -0,0 +1,119 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.5 Release Notes / August 23, 2019</h1>
<p>
Mesa 19.1.5 is a bug fix release which fixes bugs found since the 19.1.4 release.
</p>
<p>
Mesa 19.1.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
7b54e14e35c7251b171b4cf9d84cbc1d760eafe00132117db193454999cd6eb4 mesa-19.1.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109630">Bug 109630</a> - vkQuake flickering geometry under Intel</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110395">Bug 110395</a> - Shadows are flickering in SuperTuxKart</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111113">Bug 111113</a> - ANGLE BlitFramebufferTest.MultisampleDepthClear/ES3_OpenGL fails on Intel Ubuntu19.04</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111267">Bug 111267</a> - [CM246] Flickering with multiple draw calls within the same graphics pipeline if a compute pipeline is present</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Do non-uniform lowering before bool lowering.</li>
<li>ac/nir: Use correct cast for readfirstlane and ptrs.</li>
<li>radv: Avoid binning RAVEN hangs.</li>
<li>radv: Avoid VEGA/RAVEN scissor bug in binning.</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>util: fix mem leak of program path</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>gallium/dump: add missing query-type to short-list</li>
<li>gallium/dump: add missing query-type to short-list</li>
</ul>
<p>Greg V (2):</p>
<ul>
<li>anv: remove unused Linux-specific include</li>
<li>intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.4</li>
<li>cherry-ignore: panfrost: Make ctx-&gt;job useful</li>
<li>Update version to 19.1.5</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: disable SDMA image copies on dGPUs to fix corruption in games</li>
<li>radeonsi: fix an assertion failure: assert(!res-&gt;b.is_shared)</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>meson: Test for program_invocation_name</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>i965/clear: clear_value better precision</li>
</ul>
</div>
</body>
</html>

132
docs/relnotes/19.1.6.html Normal file
View File

@@ -0,0 +1,132 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.6 Release Notes / September 3, 2019</h1>
<p>
Mesa 19.1.6 is a bug fix release which fixes bugs found since the 19.1.5 release.
</p>
<p>
Mesa 19.1.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
2a369b7b48545c6486e7e44913ad022daca097c8bd937bf30dcf3f17a94d3496 mesa-19.1.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104395">Bug 104395</a> - [CTS] GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels tests fail on 32bit Mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111213">Bug 111213</a> - VA-API nouveau SIGSEGV and asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111241">Bug 111241</a> - Shadertoy shader causing hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111411">Bug 111411</a> - SPIR-V shader leads to GPU hang, sometimes making machine unstable</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv: additional query fixes</li>
</ul>
<p>Daniel Schürmann (1):</p>
<ul>
<li>nir/lcssa: handle deref instructions properly</li>
</ul>
<p>Danylo Piliaiev (1):</p>
<ul>
<li>nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled</li>
<li>intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>gallium/vl: use compute preference for all multimedia, not just blit</li>
</ul>
<p>Jonas Ådahl (1):</p>
<ul>
<li>wayland/egl: Ensure correct buffer size when allocating</li>
</ul>
<p>Juan A. Suarez Romero (6):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.5</li>
<li>cherry-ignore: add explicit 19.2 only nominations</li>
<li>cherry-ignore: iris: Replace devinfo-&gt;gen with GEN_GEN</li>
<li>cherry-ignore: iris: Update fast clear colors on Gen9 with direct immediate writes.</li>
<li>cherry-ignore: iris: Avoid unnecessary resolves on transfer maps</li>
<li>Update version to 19.1.6</li>
</ul>
<p>Kenneth Graunke (6):</p>
<ul>
<li>iris: Fix broken aux.possible/sampler_usages bitmask handling</li>
<li>iris: Drop copy format hacks from copy region based transfer path.</li>
<li>iris: Fix large timeout handling in rel2abs()</li>
<li>util: Add a _mesa_i64roundevenf() helper.</li>
<li>mesa: Fix _mesa_float_to_unorm() on 32-bit systems.</li>
<li>intel/compiler: Fix src0/desc setter ordering</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: fix scratch buffer WAVESIZE setting leading to corruption</li>
</ul>
<p>Paulo Zanoni (1):</p>
<ul>
<li>intel/fs: grab fail_msg from v32 instead of v16 when v32-&gt;run_cs fails</li>
</ul>
<p>Pierre-Eric Pelloux-Prayer (1):</p>
<ul>
<li>glsl: replace 'x + (-x)' with constant 0</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>egl: reset blob cache set/get functions on terminate</li>
</ul>
</div>
</body>
</html>

Some files were not shown because too many files have changed in this diff Show More