Compare commits

...

1127 Commits

Author SHA1 Message Date
Dylan Baker
34896d2299 VERSION: bump for 19.2.8 2019-12-18 11:02:09 -08:00
Dylan Baker
1743dec475 docs: add relnotes for 19.2.8 2019-12-18 11:01:53 -08:00
Lionel Landwerlin
6979d19fcb mesa: avoid triggering assert in implementation
When tearing down a GL context with an active performance query, the
implementation can be confused by a query marked active when it's
being deleted.

This shouldn't happen in the implementation because the context will
already be idle.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2235
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>
(cherry picked from commit 2c8742ed85)
2019-12-17 09:10:49 -08:00
Gert Wollny
f42f9bbcd6 virgl: Increase the shader transfer buffer by doubling the size
With only linearly increasing the size of the shader transfer buffer
the transfer of very large shaders may fail, so with each attempt double
the size of the buffer.

CTS:
  dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.48
  for VTK-GL-CTS b5dcfb9c5 and newer

virglrenderer bug:
  https://gitlab.freedesktop.org/virgl/virglrenderer/issues/150

Fixes: a8987b88ff
    virgl: add driver for virtio-gpu 3D (v2)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>
(cherry picked from commit cffa7bb990)
2019-12-17 09:08:49 -08:00
Dylan Baker
4244f4af88 cherry-ignore: Update for 19.2.8 2019-12-16 16:09:30 -08:00
Bas Nieuwenhuizen
ed9c1f7f42 amd/common: Fix tcCompatible degradation on Stoney.
addrlib sometimes returns smaller sizes for tcCompat as it does
not seem to take into account the depth+stencil matching config
gymnastics with tcCompat.

This fixes
dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
(cherry picked from commit e197fb1c2f)
Conflicts resolved by Dylan Baker

Conflicts:
	src/amd/common/ac_surface.c
2019-12-16 16:09:30 -08:00
Iván Briano
5d2d6442ff anv: Export filter_minmax support only when it's really supported
Fixes: bea4d4c78c ("anv: add VK_EXT_sampler_filter_minmax support")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
(cherry picked from commit 0fd93b9589)
2019-12-16 15:16:50 -08:00
Bas Nieuwenhuizen
e8913e62a6 amd/common: Always use addrlib for HTILE tc-compat.
Even without depth+stencil addrlib can (correctly!) decide to
disable tc compatible HTILE.

One example is 8x sampling with 32-bit depth on Stoney. The row size
on Stoney is 1024, while the tile size is 2048, which results in
tile splits which are not supported with tc-compat.

On Stoney, this fixes
dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
(cherry picked from commit b53856aca3)
2019-12-16 15:16:50 -08:00
Lionel Landwerlin
448b90a4a0 anv: fix fence underlying primitive checks
We appear to have got lucky that the only type of temporary fence
payload we could have was a syncobj and that would only happen when
the type of the permanent payload was also a syncobj.

This code was broken if that assumption changed and it did in commit
f9a3d9738b.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit 52bc235f2a)
2019-12-16 15:16:49 -08:00
Kenneth Graunke
06b97d1e34 iris: Default to X-tiling for scanout buffers without modifiers
Neither Mutter nor KWin's wayland compositors appear to use modifiers.
In the non-modifier case, iris was still trying to use Y-tiling for
scan-out surfaces, leading to this error:

(gnome-shell:7247): mutter-WARNING **: 09:23:47.787: meta_drm_buffer_gbm_new failed: drmModeAddFB failed: Invalid argument

We now fall back to the historical X-tiling for scanout buffers, which
ought to work everyone, at lower performance.  To regain that, we need
to ensure modifiers are actually supported in environments people use.

Fixes: fbf3124771 ("iris: Rework tiling/modifiers handling")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit dcb4230e5e)
2019-12-16 15:16:49 -08:00
Jason Ekstrand
9925871a1a anv: Don't leak when set_tiling fails
Fixes: a44744e01d "anv: Require a dedicated allocation for..."
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 0a36fafa95)
Conflicts resolved by Dylan Baker

Conflicts:
	src/intel/vulkan/anv_device.c
2019-12-16 15:16:49 -08:00
Dylan Baker
325ef15f26 meson/broadcom: libbroadcom_cle also needs zlib
Fixes: 1ae8018a6a
       ("meson: Add support for the vc4 driver.")
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d0eebda990)
2019-12-11 13:19:24 -08:00
Dylan Baker
5e100b6ba7 meson/broadcom: libbroadcom_cle needs expat headers
Fixes: 1ae8018a6a
       ("meson: Add support for the vc4 driver.")
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 85a9698ac3)
2019-12-11 13:19:17 -08:00
Alyssa Rosenzweig
47c8b41f8f gallium/util: Support POLYGON in u_stream_outputs_for_vertices
u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is
trivial to support as a special case directly (since we have the number
of vertices directly).

Fixes aborts in Panfrost in apps using GL_POLYGON.

Fixes: e881aa8c12 ("gallium/util: Add u_stream_outputs_for_vertices helper")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Revewied-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit a37822f5f7)
2019-12-10 09:09:05 -08:00
Jason Ekstrand
af0d38bfde anv: Re-emit all compute state on pipeline switch
It's a very odd case to hit in the real world.  However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.

Fixes: bc612536eb "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 0f60aa4037)
2019-12-10 09:08:59 -08:00
Nanley Chery
f3507690f8 gallium: Store the image format in winsys_handle
This format will be used to properly handle planar images with modifiers
in iris.

Fixes: 246eebba4a ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 51ee8fff9b)
2019-12-10 09:08:41 -08:00
Nanley Chery
96b8f42611 gallium/dri2: Fix creation of multi-planar modifier images
The commit noted below assumed and enforced that DRM_MOD_INVALID was the
only valid modifier for multi-planar imported images. Due to that, it
required that modifier on multi-planar images to:

   1. Allow multiple planes.
   2. Perform YUV format lowering and extent adjustments.
   3. Use buffer_index to correctly map the given planes.

Fix these issues by removing or updating the code built on that
assumption.

Fixes: 2066966c10 ("gallium/dri2: Support creating multi-planar modifier images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d5c857837a)
2019-12-10 09:08:35 -08:00
Timothy Arceri
0a70ed2aa3 glsl/nir: iterate the system values list when adding varyings
Iterate the system values list when adding varyings to the program
resource list in the NIR linker. This is needed to avoid CTS
regressions when using the NIR to build the GLSL resource list in
an upcoming series. Presumably it also fixes a bug with the current
ARB_gl_spirv support.

Fixes: ffdb44d3a0 ("nir/linker: Add inputs/outputs to the program resource list")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 1abca2b3c8)
2019-12-10 09:08:29 -08:00
Rob Clark
3742910ba2 nir/lower_clip: Fix incorrect driver loc for clipdist outputs
Somehow adjusting maxloc based on existing outputs got lost, resulting
in the clipdist varying clobbering the position varying.  Causing a
shader that had no position output in freedreno/ir3, which triggers GPU
hangs in neverball.

Fixes: d0f746b645 ("nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
(cherry picked from commit 372ed42d22)
2019-12-04 14:44:25 -08:00
Lionel Landwerlin
21a4be582c intel/perf: fix improper pointer access
This expression was unused by the macro, probably why it didn't
register in the compilation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ddacd3d43b)
2019-12-04 14:44:04 -08:00
Lionel Landwerlin
272c4f2711 intel/perf: simplify the processing of OA reports
This is a more accurate description of what happens in processing the
OA reports.

Previously we only had a somewhat difficult to parse state machine
tracking the context ID.

What we really only need to do to decide if the delta between 2
reports (r0 & r1) should be accumulated in the query result is :

   * whether the r0 is tagged with the context ID relevant to us

   * if r0 is not tagged with our context ID and r1 is: does r0 have a
     invalid context id? If not then we're in a case where i915 has
     resubmitted the same context for execution through the execlist
     submission port

v2: Update comment (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8c0b058263)
2019-12-04 14:43:58 -08:00
Lionel Landwerlin
d4bb049e98 intel/perf: take into account that reports read can be fairly old
If we read the OA reports late enough after the query happens, we can
get a timestamp in the report that is significantly in the past
compared to the start timestamp of the query. The current code must
deal with the wraparound of the timestamp value (every ~6 minute). So
consider that if the difference is greater than half that wraparound
period, we're probably dealing with an old report and make the caller
aware it should read more reports when they're available.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b364e920bf)
2019-12-04 14:43:51 -08:00
Lionel Landwerlin
21df9767ad intel/perf: set read buffer len to 0 to identify empty buffer
We always add an empty buffer in the list when creating the query.
Let's set the len appropriately so that we can recognize it when we
read OA reports up to the end of a query.

We were using an 0 timestamp value associated with the empty buffer
and incorrectly assuming this was a valid value. In turn that led to
not reading enough reports and resulted in deltas added to our counter
values which should have been discarded because those would be flagged
for a different context.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9d0a5c817c)
2019-12-04 14:43:38 -08:00
Lionel Landwerlin
82177cdc1d intel/perf: fix invalid hw_id in query results
Accumulation happens between 2 reports, it can be between a start/end
report from another context. So only consider updating the hw_id of
the results when it's not already valid and that we have a valid value
to put in there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit acea59dbf8)
2019-12-04 14:43:29 -08:00
Dylan Baker
e5b4f23475 docs: Add SHA256 sums for 19.2.7 2019-12-04 14:36:13 -08:00
Dylan Baker
65d255cd1e VERSION: bump version for 19.2.7 2019-12-04 13:48:11 -08:00
Dylan Baker
d8e767ede8 docs: Add release notes for 19.2.7 2019-12-04 13:47:44 -08:00
Rhys Perry
4a0199b6e4 radv: set writes_memory for global memory stores/atomics
Fixes: 13ab63bb62 ('radv: Implement VK_EXT_buffer_device_address.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 35fab1ba33)
2019-12-04 13:43:32 -08:00
Samuel Pitoiset
3ed8c94244 radv: fix compute pipeline keys when optimizations are disabled
If an app first creates a compute pipeline with
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT set, then re-compile it
without that flag, the driver should re-compile the compute shader.
Otherwise, it will return the unoptimized one.

Fixes: ce188813bf ("radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 9ab27647ff)
2019-12-04 13:43:32 -08:00
Samuel Pitoiset
5c98b36577 radv/gfx10: fix implementation of exclusive scans
This implementation is loosely based on ROCm.
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10.

Fixes: 227c29a80d ("amd/common/gfx10: implement scan & reduce operations")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit c9aa843961)
Conflicts resolved by Dylan Baker
2019-12-04 13:43:32 -08:00
Samuel Pitoiset
a3869c14c0 radv: fix enabling sample shading with SampleID/SamplePosition
When a fragment shader includes an input variable decorated with
SampleId or SamplePosition, sample shading should be enabled
because minSampleShadingFactor is expected to be 1.0.

Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 86a5fbfd4a)
Conflicts resolved by Dylan Baker
2019-12-04 13:43:32 -08:00
Yevhenii Kolesnikov
bda6890f58 meson: Fix linkage of libgallium_nine with libgalliumvl
Do not link libgallium_nine with libgalliumvl_stub if it's already
linked with libgalliumvl. Linking with stub leads to "duplicate
symbol" errors.

Fixes: 6b4c7047d5
       ("meson: build gallium nine state_tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2040

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 9af22ccddc)
Conflicts resolved by Dylan Baker
2019-12-04 13:43:32 -08:00
Jason Ekstrand
2bf47550ce anv: Set up SBE_SWIZ properly for gl_Viewport
gl_Viewport is also in the VUE header so we need to whack the read
offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that
case as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit b1f37688ba)
2019-12-04 13:43:32 -08:00
Jonathan Gray
b4e83559cb i965: update Makefile.sources for perf changes
brw_performance_query_metrics.h was removed in
134e750e16 and
brw_performance_query.h was removed in
8ae6667992

remove reference to these files from Makefile.sources

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Fixes: 134e750e16 ("i965: extract performance query metrics")
Fixes: 8ae6667992 ("intel/perf: move query_object into perf")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 34dda0ca65)
2019-12-04 13:43:32 -08:00
Boris Brezillon
a39d364af3 gallium: Fix the ->set_damage_region() implementation
BACK_LEFT attachment can be outdated when the user calls
KHR_partial_update() (->lastStamp != ->texture_stamp), leading to a
damage region update on the wrong pipe_resource object.
Let's delay the ->set_damage_region() call until the attachments are
updated when we're in that case.

Reported-by: Carsten Haitzler <raster@rasterman.com>
Fixes: 492ffbed63 ("st/dri2: Implement DRI2bufferDamageExtension")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b196e1a8cf)
2019-12-04 13:43:32 -08:00
Jonathan Gray
c166435dc2 winsys/amdgpu: avoid double simple_mtx_unlock()
pthread_mutex_unlock() when unlocked is documented by posix as
being undefined behaviour.  On OpenBSD pthread_mutex_unlock() will call
abort(3) if this happens.

This occurs in amdgpu_winsys_create() after
cb446dc0fa
winsys/amdgpu: Add amdgpu_screen_winsys

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 3fe3bde4f2)
2019-12-04 13:43:32 -08:00
Bas Nieuwenhuizen
52a8d43a24 radv: Unify max_descriptor_set_size.
They were out of sync. Besides syncing, lets ensure they never diverge
again.

Fixes: 8d2654a419 "radv: Support VK_EXT_inline_uniform_block."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4cde0e04e3)
2019-12-04 13:43:32 -08:00
Bas Nieuwenhuizen
2e379b0a65 radv: Allocate cmdbuffer space for buffer marker write.
Fixes: 946193ae00 "radv: add support for VK_AMD_buffer_marker"
Reviewed-by:  Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 25bc9102d8)
2019-12-04 13:43:32 -08:00
Zebediah Figura
336f59f8ba Revert "draw: revert using correct order for prim decomposition."
This reverts commit f97b731c82.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/250

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit a3c8bc10aa)
2019-12-04 13:43:32 -08:00
Ian Romanick
38c8af9e2a intel/fs: Disable conditional discard optimization on Gen4 and Gen5
The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of
valid data and 31 bits of junk.  Results of comparisons that are used as
Boolean values need to have a fixup applied to generate the proper 0/~0
values.

Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup
code from being generated.  This results in a sequence like:

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
(+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD     g8<8,8,1>UD

instead of

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
        or(16) g4<1>UD g4<8,8,1>UD     g8<8,8,1>UD
(+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD     1UD

I examined a couple of the shaders hurt by this change, and ALL of them
would have been affected by this bug. :(

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836
Fixes: 0ba9497e66 ("intel/fs: Improve discard_if code generation")

Iron Lake
total instructions in shared programs: 8122757 -> 8122957 (<.01%)
instructions in affected programs: 8307 -> 8507 (2.41%)
helped: 0
HURT: 100
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.58% 3.03%
Instructions are HURT.

total cycles in shared programs: 188510100 -> 188510376 (<.01%)
cycles in affected programs: 76018 -> 76294 (0.36%)
helped: 0
HURT: 55
HURT stats (abs)   min: 2 max: 12 x̄: 5.02 x̃: 4
HURT stats (rel)   min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56%
95% mean confidence interval for cycles value: 4.33 5.71
95% mean confidence interval for cycles %-change: 0.60% 1.12%
Cycles are HURT.

GM45
total instructions in shared programs: 4994403 -> 4994503 (<.01%)
instructions in affected programs: 4212 -> 4312 (2.37%)
helped: 0
HURT: 50
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.45% 3.07%
Instructions are HURT.

total cycles in shared programs: 128928750 -> 128928982 (<.01%)
cycles in affected programs: 67442 -> 67674 (0.34%)
helped: 0
HURT: 47
HURT stats (abs)   min: 2 max: 12 x̄: 4.94 x̃: 4
HURT stats (rel)   min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53%
95% mean confidence interval for cycles value: 4.19 5.68
95% mean confidence interval for cycles %-change: 0.50% 1.00%
Cycles are HURT.

(cherry picked from commit e51eda99df)
2019-12-04 13:43:32 -08:00
Dylan Baker
5836dd66e0 VERSION: bumpre to 19.2.6 2019-11-21 16:04:47 -08:00
Dylan Baker
264d1187df docs: Add release notes for 19.2.6 2019-11-21 16:04:11 -08:00
Dylan Baker
aa620fdf8e meson: generate .pc files for gles and gles2 with old glvnd
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1921
2019-11-21 17:28:23 +00:00
Yevhenii Kolesnikov
05d5784ea6 glsl: Enable textureSize for samplerExternalOES
From OES_EGL_image_external_essl3

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1901

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-11-21 09:26:26 -08:00
Dave Airlie
b1f505463d llvmpipe/ppc: fix if/ifdef confusion in backport.
Fixes: 32aba91c07 (llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2131
2019-11-21 09:26:26 -08:00
Hyunjun Ko
c2488d810b freedreno/ir3: fix printing output registers of FS.
Fixes: cea39af2fb ("freedreno/ir3: Generalize ir3_shader_disasm()")

Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit d0f38394b1)
2019-11-21 09:26:26 -08:00
Alejandro Piñeiro
e594e4cefd v3d: adds an extra MOV for any sig.ld*
Specifically when we are in non-uniform control flow, as we would need
to set the condition for the last instruction. If (for example) a
image atomic load stores directly their value on a NIR register,
last_inst would be a nop, and would fail when set the condition.

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")

v2: (Changes suggested by Eric Anholt)
   * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of
     them have the same restriction.
   * Update comment explaining why we add a MOV in that case
   * Tweak commit message.

v3:
   * Drop extra set of parens (Eric)
   * Add missing ld signal to is_ld_signal to fix shader-db regression.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b4bc59e37e)
2019-11-21 09:26:26 -08:00
Jose Maria Casanova Crespo
8f526ee7cd v3d: Fix predication with atomic image operations
Fixes dEQP test:
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d983055184)
2019-11-21 09:26:26 -08:00
Eric Engestrom
be8a46d064 vulkan: delete typo'd header
Two files exist in that directory:
- vulkan_xlib_randr.h
- vulkan_xlib_xrandr.h

Both were imported in 205c271562 ("vulkan: Update the XML and
headers to 1.1.70") with identical contents (ie. the
VK_EXT_acquire_xlib_display extension), but the former was never
included anywhere and can't be found upstream [1], while the latter is
included in vulkan.h and found upstream.

[1] https://github.com/KhronosGroup/Vulkan-Headers/tree/master/include/vulkan

Fixes: 205c271562 ("vulkan: Update the XML and headers to 1.1.70")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 344859c32d)
2019-11-21 09:26:26 -08:00
Dylan Baker
4c86eda2c2 docs/relnotes/19.2.5: Add SHA256 sum 2019-11-20 09:18:11 -08:00
Dylan Baker
f418e9231a VERSION: bump for 19.2.5 2019-11-20 08:54:30 -08:00
Dylan Baker
9e0a0d2ebb docs: Add relnotes for 19.2.5 2019-11-20 08:54:10 -08:00
Jason Ekstrand
e10851ff34 anv: Stop bounds-checking pushed UBOs
The bounds checking is actually less safe than just pushing the data.
If the bounds checking actually ever kicks in and it's not on the last
UBO push range, then the shrinking will cause all subsequent ranges to
be pushed to the wrong place in the GRF.  One of the behaviors we
definitely don't want is for OOB UBO access to result in completely
unrelated UBOs returning garbage values.  It's safer to just push the
UBOs as-requested.  If we're really concerned about robustness, we can
emit shader code to do bounds checking which should be stupid cheap (a
CMP followed by SEL).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-20 08:24:09 -08:00
Brian Paul
023ddb01b5 Call shmget() with permission 0600 instead of 0777
A security advisory (TALOS-2019-0857/CVE-2019-5068) found that
creating shared memory regions with permission mode 0777 could allow
any user to access that memory.  Several Mesa drivers use shared-
memory XImages to implement back buffers for improved performance.

This path changes the shmget() calls to use 0600 (user r/w).

Tested with legacy Xlib driver and llvmpipe.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
(cherry picked from commit 02c3dad0f3)
2019-11-20 08:24:09 -08:00
Danylo Piliaiev
3199172eaa i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround
Re-emitting 3DSTATE_CC_STATE_POINTERS after emitting
3DSTATE_BLEND_STATE_POINTERS fixes the shadow flickering in
SuperTuxCart and Tropico 6 which was seen only on Haswell.
The reason for this is unknown and fix was found empirically.

The closest mention in PRM is that it should improve performance.
From the HSW PRM, volume 2b, page 823 (3DSTATE_BLEND_STATE_POINTERS):
 "When the BLEND_STATE pointer changes but not the CC_STATE pointer,
  driver needs to force a CC_STATE pointer change to improve
  blend performance in pixel backend."

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1834
Fixes: eca4a654 ("i965: Disable dual source blending when shader doesn't support it on gen8+")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6f17fe0606)
2019-11-20 08:24:09 -08:00
Ben Crocker
ae071434e9 llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders
Large programs, e.g. gnome-shell and firefox, may tax the
addressability of the Medium code model once a (potentially unbounded)
number of dynamically generated JIT-compiled shader programs are
linked in and relocated.  Yet the default code model as of LLVM 8 is
Medium or even Small.

The cost of changing from Medium to Large is negligible:
- an additional 8-byte pointer stored immediately before the shader entrypoint;
- change an add-immediate (addis) instruction to a load (ld).

Testing with WebGL Conformance
(https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html)
yields clean runs with this change (and crashes without it).

Testing with glxgears shows no detectable performance difference.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/223

Co-authored by: Nemanja Ivanovic <nemanjai@ca.ibm.com>, Tom Stellard <tstellar@redhat.com>

CC: mesa-stable@lists.freedesktop.org

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
(cherry picked from commit 9c3be6d21f)
Conflicts resolved Dylan (PIPE_ARCH -> UTIL_ARCH rename)
2019-11-20 08:24:09 -08:00
Pierre-Eric Pelloux-Prayer
60c299c542 radeonsi: fix shader disk cache key
Use unsigned values otherwise signed extension will produce a 64 bits value where
the 32 left-most bits are 1.

Fixes: 2afeed3010 ("radeonsi: tell the shader disk cache what IR is used")
2019-11-20 08:24:09 -08:00
Pierre-Eric Pelloux-Prayer
b5b09acb74 radeonsi: tell the shader disk cache what IR is used
Until 8bef4df196 the IR (TGSI or NIR) was used in disk_cache driver_flags.
This commit restores this features to avoid crashing when switching from
one IR to the other.

As radeonsi's default is TGSI, I used "driver_flags & 0x8000000 = 0" for TGSI
to keep the same driver_flags.

Fixes: 8bef4df196 ("radeonsi: add si_debug_options for convenient adding/removing of options")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-20 08:24:09 -08:00
Pierre-Eric Pelloux-Prayer
38bd621f0d radeonsi: disable sdma for gfx10
Disable sdma on gfx10 until all timeouts bugs are fixed.

See:
    https://gitlab.freedesktop.org/mesa/mesa/issues/1907
    https://bugs.freedesktop.org/show_bug.cgi?id=111481

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-20 08:24:09 -08:00
Marek Olšák
0e7e56aa2f tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes
radeonsi doesn't use the format and internal shaders don't set it.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
(cherry picked from commit f704fb7f0b)
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2112
2019-11-20 08:23:55 -08:00
Marek Olšák
2353a63a58 tgsi_to_nir: fix masked out image loads
This caused a failure in NIR validation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3906fce88b)
2019-11-19 16:56:01 -08:00
Illia Iorin
856c7eddf7 mesa/main: Ignore filter state for MS texture completeness
After the discussion in
https://github.com/KhronosGroup/OpenGL-API/issues/45
the section 8.17 (texture completeness) of the OpenGL 4.6 core profile
was changed to explicitly say that multisample texture completeness
ignores filter state of the texture.

"Using the preceding definitions, a texture is complete unless any of the
 following conditions hold true:
   ...
  - The minification filter requires a mipmap (is neither NEAREST nor LINEAR),
    the texture is not multisample, and the texture is not mipmap complete.
  - The texture is not multisample; either the magnification filter is not
    NEAREST, or the minification filter is neither NEAREST nor NEAREST_-
    MIPMAP_NEAREST; and any of
    – The internal format of the texture is integer (see table 8.12).
    – The internal format is STENCIL_INDEX.
    – The internal format is DEPTH_STENCIL, and the value of DEPTH_-
      STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6b672e342a)
2019-11-19 16:56:00 -08:00
Ian Romanick
4c82d426bd nir/algebraic: Mark other comparison exact when removing a == a
This prevents some additional optimizations that would change the
original result.  This includes things like (b < a && b < c) => b <
min(a, c) and !(a < b) => b >= a.  Both of these optimizations were
specifically observed in the piglit tests added in piglit!160.

This was discovered while investigating
https://gitlab.freedesktop.org/mesa/mesa/issues/1958.  However, the
problem in that issue was Chrome or Angle is replacing calls to isnan()
with some stuff that we (correctly) optimize to false.  If they had left
the calls to isnan() alone, everything would have just worked.

No shader-db changes on any Intel platform.

I also tried marking the comparison generated by the isnan() function
precise.  The precise marker "infects" every computation involved in
calculating the parameter to the isnan() function, and this severely
hurt all of the (few) shaders in shader-db that use isnan().

I also considered adding a new ir_unop_isnan opcode that would implement
the functionality.  During GLSL IR-to-NIR translation, the resulting
comparison operation would be marked exact (and the samething would need
to happen in SPIR-V translation).

This approach taken by this patch seemed easier, but we may want to do
the ir_unop_isnan thing anyway.

Fixes: d55835b8bd ("nir/algebraic: Add optimizations for "a == a && a CMP b"")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit 9be4a422a0)
2019-11-19 16:56:00 -08:00
Ian Romanick
d8a37880b5 nir/algebraic: Add the ability to mark a replacement as exact
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit ea19f2fb68)
2019-11-19 16:56:00 -08:00
Paulo Zanoni
bd2f6150ca intel/compiler: fix nir_op_{i,u}*32 on ICL
On ICL we have the src1 restriction which is applied through
fix_byte_src() and potentially changes the type of the operands from 8
to 32 bits. When this change happens, we fall into the "else if
(bit_size < 32)" case and miscompute src_type because it takes into
consideration bit_size (8) instead of the adjusted size of temp_op
(32). This results in the shader reading unused memory, giving us
mostly failures, but occasional passes due to whatever was already in
the registers we were reading.

This commit fixes a lot of dEQP subgroup i8vec2 tests on ICL, such as:
    dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2

This can also be verified by simply changing fix_byte_src() to apply
on all platforms.

Fixes: 5847de6e9a ("intel/compiler: don't use byte operands for src1 on ICL")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit eb6352162d)
2019-11-19 16:56:00 -08:00
Marek Olšák
9ffb0176e2 st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them
They use the "sample" keyword as a variable name.

Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit e00791c552)
2019-11-19 16:56:00 -08:00
Lionel Landwerlin
a390cf739f anv/wsi: signal the semaphore in the acquireNextImage
We seem to have forgotten about the semaphore in the
acquireNextImageInfo.

v2: Signal semaphore/fence regardless of presentation status (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit edc6606d4e)
2019-11-19 16:56:00 -08:00
Lionel Landwerlin
a993dc20b6 anv: remove list items on batch fini
This doesn't seem to fix anything because those destroy() calls happen
right before the command buffer object & its list of batch_bo is also
destroyed. Still looks a bit cleaner.

v2: Found a second occurence

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Fixes: 26ba0ad54d ("vk: Re-name command buffer implementation files")
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 935f8f0e56)
2019-11-19 16:56:00 -08:00
Lionel Landwerlin
2281258a7b anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit
We always close the in_fence at the end the anv_cmd_buffer_execbuf()
so when we take it from the semaphore, let's not forget to invalidate
it.

Note that the code leaks the fence_in if we get any error before
reaching the close(). Let's fix that in another patch or better,
rewrite the whole thing!

v2: drop redundant fd = -1 (Jason)

v3: Update commit message (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 048f0690ee)
2019-11-19 16:56:00 -08:00
Dylan Baker
a89d4090bc cherry-ignore: Update for 19.2.4 cycle 2019-11-19 16:56:00 -08:00
Eric Engestrom
cb835d281b egl: fix _EGL_NATIVE_PLATFORM fallback
When the X11 or Haiku platforms were compiled in, they would bypass the
`_EGL_NATIVE_PLATFORM` fallback by always returning themselves instead.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 86d3a346f1)
2019-11-13 11:10:13 -08:00
Caio Marcelo de Oliveira Filho
6e98a923cb spirv: Don't leak GS initialization to other stages
The stage specific fields of shader_info are in an union.  We've
likely been lucky that this value was either overwritten or ignored by
other stages.  The recent change in shader_info layout in commit
84a1a2578d ("compiler: pack shader_info from 160 bytes to 96 bytes")
made this issue visible.

Fixes: cf2257069c ("nir/spirv: Set a default number of invocations for geometry shaders")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 087ecd9ca5)
2019-11-13 11:10:13 -08:00
Lepton Wu
2b89e68a2e gallium: dri2: Use index as plane number.
This fix wrong color when playing video under Android + virgl
configuration.

Fixes: 2decad495f ("gallium/dri2: Support images with multiple planes for modifiers")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Lepton Wu <lepton@chromium.org>
(cherry picked from commit 5a40e153fd)
2019-11-13 11:10:13 -08:00
Dylan Baker
9cbffed5d0 docs: Add SHA256 sum for for 19.2.4 2019-11-13 11:09:32 -08:00
Dylan Baker
6f1b8e554f VERSION: bump to 19.2.4 2019-11-13 10:38:55 -08:00
Dylan Baker
cebe2b12f5 docs: Add release notes for 19.2.4 2019-11-13 10:38:40 -08:00
Lionel Landwerlin
330d31f63b mesa: check framebuffer completeness only after state update
The change made in 88d665830f ("mesa: check draw buffer completeness
on glClearBufferfi/glClearBufferiv") correctly updated the state prior
to checking the framebuffer completeness on glClearBufferiv but not in
glClearBufferfi.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: 88d665830f ("mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2072
(cherry picked from commit f93bb90302)
2019-11-13 10:34:47 -08:00
Dylan Baker
019dbc4b3b Bump version to 19.2.3 2019-11-06 08:19:59 -08:00
Dylan Baker
2e4da07a3e docs: add release notes for 19.2.3 2019-11-06 08:19:42 -08:00
Dylan Baker
85608bad46 meson: Add dep_glvnd to egl deps when building with glvnd
Otherwise if glvnd is not installed systemwide, but only in a prefix,
it's headers wont be found. This happens because if it's headers are in
/usr/include/ then another dependence will provide the necessary -I
arguments and compilation will work.

Fixes: 035ec7a2bb
       ("meson: Add support for EGL glvnd")
Acked-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 5d085ad052)
2019-11-05 08:48:03 -08:00
Paulo Zanoni
a2b3911a47 intel/compiler: remove the operand restriction for src1 on GLK
Commit 5847de6e9a implemented a restriction that applies to ICL, but
wrongly marked it as also applying to GLK. Reviewers or MR !1125
pointed this, and the commit history shows removal of GLK to parts of
the patch, but it turns there was still a left-over GLK check in the
code.

This code was breaking some of the i8vec2 tests on GLK, for example:
  dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2

Removing the GLK check solves the issue for GLK. I don't see a reason
on why implementing this restriction would actually break GLK, so
there's still more to investigate here since this bug may be affecting
ICL+, but let's apply the real GLK fix while we analyze and discuss
the other possible issues.

Fixes: 5847de6e9a ("intel/compiler: don't use byte operands for src1
on ICL")
BSpec: 3017
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit b57383a944)
2019-11-05 08:48:03 -08:00
Kenneth Graunke
a74300e4cf iris: Fix "Force Zero RTA Index Enable" setting again
In 2ca0d913ea, we began updating cso_fb->layers to the actual layer
count, rather than 0.  This fixed cases where we were setting "Force
Zero RTA Index Enable" even when doing layered rendering.  Sadly, it
also broke the check entirely: cso_fb->layers is now 1 for non-layered
cases, but the Force Zero RTA Index check was still comparing for 0.

Fixes: 2ca0d913ea ("iris: Fix framebuffer layer count")
(cherry picked from commit fc7b748086)
2019-11-05 08:46:55 -08:00
Dylan Baker
c1f1a1056f nir: correct use of identity check in python
Python has the identity operator `is`, and the equality operator `==`.
Using `is` with strings sometimes works in CPython due to optimizations
(they have some kind of cache), but it may not always work.

Fixes: 96c4b135e3
       ("nir/algebraic: Don't put quotes around floating point literals")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 717606f9f3)
2019-11-05 08:46:48 -08:00
Jon Turney
13768f3714 Fix timespec_from_nsec test for 32-bit time_t
Since struct timespec's tv_sec member is of type time_t, adjust the
expected value to allow for the truncation which will occur with 32-bit
time_t.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit dd1dba80b9)
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2043
2019-11-04 08:17:58 -08:00
Lionel Landwerlin
aab604398d mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 88d665830f)
2019-11-04 08:05:19 -08:00
Dylan Baker
5fbc187720 cherry-ignore: update for 19.2.3 cycle 2019-11-04 08:05:19 -08:00
Jason Ekstrand
aaba438f80 anv/tests: Zero-initialize instances
Some of the tests were actually relying on some of those uninitialized
bits to be non-zero.  In particular, a couple want use_softpin = true.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 9076e9f375)
2019-11-04 08:02:36 -08:00
Jason Ekstrand
3de8c6e8fb anv: Fix a potential BO handle leak
Fixes: 731c4adcf9 "anv/allocator: Add support for non-userptr"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit bb257e1852)
2019-11-04 08:02:36 -08:00
Pierre-Eric Pelloux-Prayer
6d96ee8507 mesa: enable msaa in clear_with_quad if needed
If the DrawBuffer sample count is > 1 and msaa is enabled we must also
enable msaa when clearing it.

Fixes: ea5b7de138 ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1991

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
(cherry picked from commit 8a723282e3)
2019-11-04 08:02:36 -08:00
Bas Nieuwenhuizen
4575aa2a30 anv: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.

CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3e86d553a4)
2019-11-04 08:02:36 -08:00
Bas Nieuwenhuizen
ff9237e8ea turnip: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.

CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 72f858fc07)
2019-11-04 08:02:36 -08:00
Bas Nieuwenhuizen
a839a023c5 radv: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.

CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 344ba56b0f)
2019-11-04 08:02:36 -08:00
Bas Nieuwenhuizen
63ff2a51d3 radv: Fix timeout handling in syncobj wait.
libdrm returns -errno instead of directly the ioctl ret of -1.

Fixes: 1c3cda7d27 "radv: Add syncobj signal/reset/wait to winsys."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ec770085c2)
2019-11-04 08:02:36 -08:00
Ilia Mirkin
43234ba02a nv50/ir: mark STORE destination inputs as used
Observed an issue when looking at the code generatedy by the
image-vertex-attrib-input-output piglit test. Even though the test
itself worked fine (due to TIC 0 being used for the image), this needs
to be fixed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1b9d1e13d8)
2019-11-04 08:02:35 -08:00
Jonathan Marek
49e105048b etnaviv: fix depth bias
Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.polygon_offset.*

Fixes: 6c3c05dc ("etnaviv: fix polygon offset")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 7b524e1acb)
2019-10-31 08:43:30 -07:00
Sagar Ghuge
db4f7eae8b intel/blorp: Assign correct view while clearing depth stencil
We never saw any failures regarding this typo but it's good to assign
correct stencil view while constructing blorp_params.

Fixes: 0cabf93b80 "intel/blorp: Add an entrypoint for clearing depth and stencil"

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit ce208be2d8)
2019-10-30 10:55:34 -07:00
Caio Marcelo de Oliveira Filho
5290062a3f anv: Fix output of INTEL_DEBUG=bat for chained batches
The anv_batch_bo contents are linked one to another, and when printing
we have to start with the first of those.  Since in `u_vector` new
elements are added to the head, to get the first element we need the
vector's tail.

Fixes: 32ffd90002 ("anv: add support for INTEL_DEBUG=bat")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit e2155158e9)
2019-10-29 08:16:07 -07:00
Nanley Chery
7affc43e81 iris: Disallow incomplete resource creation
If a modifier specifies an aux, it must be created.

Fixes: 75a3947af4 ("iris/resource: Fall back to no aux if creation fails")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d298740a1c)
2019-10-29 08:16:03 -07:00
Nanley Chery
8cb1070ea3 iris: Don't leak the resource for unsupported modifier
Make sure the res struct is free'd before returning.

Fixes: 2dce0e94a3 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f2fc5dece9)
2019-10-29 08:15:58 -07:00
Nanley Chery
88ca3cf11d iris: Clear ::has_hiz when disabling aux
Fixes: 2cddc953cd ("iris: some initial HiZ bits")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6cd9731d96)
2019-10-29 08:15:52 -07:00
Nanley Chery
73a9f31b57 intel/blorp: Disable depth testing for slow depth clears
We'll start doing slow depth clears more often on HIZ_CCS buffers in a
future commit. Reduce the performance impact by making them use less
bandwidth.

From the Depth Test section of the BSpec:

   This function is enabled by the Depth Test Enable state variable. If
   enabled, the pixel's ("source") depth value is first computed. After
   computation the pixel's depth value is clamped to the range defined
   by Minimum Depth and Maximum Depth in the selected CC_VIEWPORT state.
   Then the current ("destination") depth buffer value for this pixel is
   read.

and from the Depth Buffer Updates section of the BSpec:

   If depth testing is disabled or the depth test passed, the incoming
   pixel's depth value is written to the Depth Buffer.

Taken together, it's clear that depth testing isn't necessary to perform
a depth buffer clear. Mark Janes and I analyzed this patch with
frameretrace and a depthrange piglit test. I disabled HiZ to ensure we'd
get slow depth clears. We've observed the bandwidth consumption by the
depth buffer access to be cut ~50% on BDW and SKL during depth clears.
On a more graphically intensive workload, the Shadowmapping Sascha
benchmark, I took the average of 3 runs on a BDW with a display
resolution of about 1920x1200 (minus some desktop environment
decorations). I measured a 22.61% FPS improvement when HiZ is disabled.

v2. The BSpec doesn't mandate this behavior, update comment accordingly.
    (Ken)

Fixes: bc4bb5a7e3 ("intel/blorp: Emit more complete DEPTH_STENCIL state")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d5fb9cccdc)
2019-10-29 08:15:45 -07:00
Nanley Chery
d8847c2f28 anv: Properly allocate aux-tracking space for CCS_E
add_aux_state_tracking_buffer() actually checks the aux usage when
determining how many dwords to allocate for state tracking. Move the
function call to the point after the CCS_E aux usage is assigned.

Fixes: de3be61801 ("anv/cmd_buffer: Rework aux tracking")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d0fcc2dd50)
2019-10-29 08:15:40 -07:00
Danylo Piliaiev
aa51bd5aa1 glsl: Initialize all fields of ir_variable in constructor
Better be safe, even if we could technically avoid this for
some fields.

Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1999
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 8818e0df74)
2019-10-28 08:32:19 -07:00
Tapani Pälli
5378782434 i965: setup sized internalformat for MESA_FORMAT_R10G10B10A2_UNORM
Commit d2b60e433e introduced restrictions (as per GLES spec) on the
internal format. We need to setup a sized format for the texture image
so framebuffers created with that are considered complete.

This change fixes following Android CTS test in AHardwareBufferNativeTests
category:

   SingleLayer_ColorTest_GpuColorOutputAndSampledImage_R10G10B10A2_UNORM

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Fixes: d2b60e433e ("mesa/main: R10G10B10_(A2) formats are not color renderable in ES")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 412badd059)
2019-10-28 08:32:12 -07:00
Dylan Baker
2936df10df bin/gen_release_notes.py: Add a warning if new features are introduced in a point release
Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 8a4541aae2)
2019-10-28 08:32:06 -07:00
Dylan Baker
1d86a89733 bin/gen_release_notes.py: html escape all external data
All of these (bug titles, patch titles, features, and people's names)
can contain characters that are not valid html. Just escape everything
for safety.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit b153785370)
2019-10-28 08:31:58 -07:00
Dylan Baker
05605ad196 bin/post_release.py: Add .html to hrefs
oops.

Fixes: 3226b12a09
       ("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 7e4b87f987)
2019-10-28 08:31:53 -07:00
Dylan Baker
a884f9d5e9 bin/post_version.py: white space fixes
Fixes: 3226b12a09
       ("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 5eef803625)
2019-10-28 08:31:46 -07:00
Dylan Baker
e9a1f47123 bin/post_version.py: Pass version as an argument
I made a bad assumption; I assumed this would be run in the release
branch. But we don't do that, we run in the master branch. As a result
we need to pass the version as an argument.

Fixes: 3226b12a09
       ("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit abf9e7ac7b)
2019-10-28 08:31:41 -07:00
Dylan Baker
fa47f0f5cd bin/gen_release_notes.py: Return "None" if there are no new features
Which is very likely .Z > 0 releases.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit c6d41e7f0b)
2019-10-28 08:31:35 -07:00
Dylan Baker
701457466c bin/gen_release_notes.py: strip '#' from gitlab bugs
If they use the `Fixes: #1` form.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit df3d4ad82d)
2019-10-28 08:31:30 -07:00
Dylan Baker
58a37997a3 bin/gen_release_notes.py: fix conditional of bugfix
Previously this would result in the .0 warning be generated for .z > 0
and the .z == 0 would get the other message.

Fixes: 86079447da
       ("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 69f540c017)
2019-10-28 08:31:24 -07:00
Illia Iorin
cab0e230ec Revert "mesa/main: Fix multisample texture initialize"
This reverts commit a113a42e73.

Per https://github.com/KhronosGroup/OpenGL-API/issues/45 it
was a wrong way to fix the issue.

Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 71d4ece366)
2019-10-28 08:31:19 -07:00
Jon Turney
87db4b9376 rbug: Fix use of alloca() without #include "c99_alloca.h"
[12/60] Compiling C object 'src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o'.
FAILED: src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o
[...]
../src/gallium/auxiliary/rbug/rbug_texture.c: In function 'rbug_send_texture_info_reply':
../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: error: implicit declaration of function 'alloca'; did you mean 'malloc'? [-Werror=implicit-function-declaration]
  uint32_t *height = alloca(sizeof(uint32_t) * height_len);
                     ^~~~~~
                     malloc
../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
../src/gallium/auxiliary/rbug/rbug_texture.c:303:20: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
  uint32_t *depth = alloca(sizeof(uint32_t) * height_len);
                    ^~~~~~
cc1: some warnings being treated as errors

Include c99_alloca.h to portably make the alloca() prototype available.

See also: 498d9d0f, adfb9c5c, fc8139b1

Fixes: 6174cba7 ("rbug: fix transmitted texture sizes")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 2649609ac5)
2019-10-25 09:06:25 -07:00
Thomas Hellstrom
5f6f349bcf winsys/svga: Limit the maximum DMA hardware buffer size
The kernel total GMR/DMA size is limited, but it's definitely possible for the
kernel to allow a larger buffer allocation to succeed, but command
submission using that buffer as a GMR would fail typically causing an
application crash.

So have the winsys limit the size of GMR/DMA buffers. The pipe driver will
then resort to allocating smaller buffers and perform the DMA transfer in
multiple bands, also allowing for the pre-flush mechanism to kick in.

This avoids the related application crashes.

Fixes: e7843273fa ("winsys/svga: Update to vmwgfx kernel module 2.1")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 91146c0796)
2019-10-25 09:06:25 -07:00
Thomas Hellstrom
dacfad708f svga: Fix banded DMA upload unmap
Even with banded DMA uploads, st->hwbuf is always non-NULL, but when we've
allocated a software buffer to hold the full upload, unmapping of the
hardware buffer has already been done before
svga_texture_transfer_unmap_dma(), and the code was performing an unmap of
an already mapped buffer.

Fix this by testing for software buffer not present.

Fixes: a9c4a861d5 ("svga: refactor svga_texture_transfer_map/unmap functions")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 00db976905)
2019-10-25 09:06:25 -07:00
Marek Olšák
340849f311 util/u_queue: skip util_queue_finish if num_threads is 0
This fixes a deadlock in pthread_barrier_destroy.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c2efd2cbfb)
2019-10-25 09:06:25 -07:00
Samuel Pitoiset
e2b416e8ae radv: fix vkUpdateDescriptorSets with inline uniform blocks
descriptorCount is the number of bytes into the descriptor, so
it shouldn't be used as an index. srcArrayElement/dstArrayElement
specify the starting byte offset within the binding to copy from/to.

This fixes new CTS tests:
dEQP-VK.binding_model.descriptor_copy.*.inline_uniform_block_*
dEQP-VK.binding_model.descriptor_copy.*.mix_3
dEQP-VK.binding_model.descriptor_copy.*.mix_array1

Fixes: 8d2654a419 ("radv: Support VK_EXT_inline_uniform_block.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 7562a2cbe3)
2019-10-25 09:06:25 -07:00
Samuel Pitoiset
a352e1d2f7 radv/gfx10: fix 3D images
GFX10 does act like GFX9 actually.

This fixes
dEQP-VK.glsl.texture_functions.query.texturesize.*sampler3d_*.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 9c92a21fe5)
2019-10-25 09:06:25 -07:00
Samuel Pitoiset
e8d4a75d9a radv: do not emit rbplus if attachments are undefined
Fixes some crashes with dEQP-VK.geometry.layered.*.secondary_cmd_buffer
on Raven and other chips that allow rbplus.

This just prevents a crash and rbplus probaby needs more work.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 956d825ed8)
2019-10-25 09:06:25 -07:00
Samuel Pitoiset
2bd9eb83b4 radv: do not create meta pipelines with 16 samples
The driver only supports up to 8 samples, so it's useless to
create more pipelines than needed.

This fixes a conditional jump reported by Valgrind on GFX10:

==194282== Conditional jump or move depends on uninitialised value(s)
==194282==    at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242)
==194282==    by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334)
==194282==    by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440)
==194282==    by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764)
==194282==    by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788)
==194282==    by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114)
==194282==    by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297)
==194282==    by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277)
==194282==    by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363)
==194282==    by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080

This is caused by an out of bound access of 'fmask_array' (ie. index
is 4 as for 16 samples).

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit f4ab58c1a0)
2019-10-25 09:06:25 -07:00
Lionel Landwerlin
f0cad4c038 anv: fix unwind of vkCreateDevice fail
We're skipping the context destruction in some cases which is the
grand scheme of thing is not that important because closing device->fd
will destroy the associated context as well.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: b30e01aef5 ("anv: fix memory leak on device destroy")
(cherry picked from commit 0dfa643feb)
2019-10-25 09:06:25 -07:00
Dylan Baker
c5f5ce1e37 Bump version for 19.2.2 release 2019-10-23 09:06:39 -07:00
Dylan Baker
db2797251d docs: Add release notes for 19.2.2 2019-10-23 09:06:39 -07:00
Samuel Pitoiset
56f0434232 radv: fix updating bound fast ds clear values with different aspects
On GFX9, the driver is able to do an optimized fast depth/stencil
clear with only one aspect (ie. clear the stencil part of a
depth/stencil image). When this happens, the driver should only
update the clear values of the given aspect.

Note that it's currently only supported on GFX9 but I have some
local patches that extend this optimized path for other gens.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a13320370e)
2019-10-22 08:51:44 -07:00
Bas Nieuwenhuizen
ad56aaef4f radv: Fix single stage constant flush with merged shaders.
e.g. a VERTEX only flush with tess on Vega should look at the TCS
to see which bits are needed.

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1953
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit fd21ee8b52)
Conflicts resolved by Dylan Baker

Conflicts:
	src/amd/vulkan/radv_cmd_buffer.c
2019-10-22 08:51:15 -07:00
Lepton Wu
35b900310b egl/android: Remove our own reference to buffers.
We currently doesn't maintain it correctly and the buffer gets leaked if
surface is destroyed before calling swapping buffers.

From Android frameworks/native/libs/nativewindow/include/system/window.h:

  The window holds a reference to the buffer between dequeueBuffer and
  either queueBuffer or cancelBuffer, so clients only need their own
  reference if they might use the buffer after queueing or canceling it.

v2: Remove our own reference.

Fixes: 0212db3504 ("egl/android: Cancel any outstanding ANativeBuffer in surface destructor")

Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1)
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
(cherry picked from commit f4ba31ff50)
2019-10-21 08:59:25 -07:00
Lionel Landwerlin
cb0215a6fb anv: fix memory leak on device destroy
v2: handle vma destruction if vkCreateDevice fails (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1959
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit b30e01aef5)
2019-10-21 08:59:18 -07:00
Lionel Landwerlin
425fbe2902 anv: fix vkUpdateDescriptorSets with inline uniform blocks
With inline uniform blocks descriptor, the meaning of descriptorCount
is a number of bytes to copy into the descriptor. Don't try to use
that size as an index into the descriptor table.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 43f40dc7cb ("anv: Implement VK_EXT_inline_uniform_block")
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1195
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3f8f52b241)
2019-10-21 08:59:11 -07:00
Lucas Stach
62f9ba1bf2 rbug: unwrap index buffer resource
All resources passed to the drivers below rbug need to be unwrapped before
being passed down. We missed to do this for the index buffer resource when
this was made part of the draw_info structure.

Fixes: 330d0607ed (gallium: remove pipe_index_buffer and set_index_buffer)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit a75eb888e0)
2019-10-18 09:56:11 -07:00
Lucas Stach
306e82acb6 rbug: fix transmitted texture sizes
The rbug wire format defines the texture size parameters to be uint32_t sized
and uses memcpy to move the function parameters to the message structure.
This caused totally wrong transmitted texture sizes since the height and depth
paramterds have been changed to uint16_t in the gallium API. Fix this by doing
an explicit conversion to the correct representation before packing into the
wire message.

Fixes: e6428092f5 (gallium: decrease the size of pipe_resource - 64 -> 48 bytes)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 6174cba748)
2019-10-18 09:56:07 -07:00
Ian Romanick
ef906b4636 intel/vec4: Don't try both sources as immediates for DPH
DPH isn't actually commutative, so this doesn't work.  If the immediate
in src0 would be a VF candidate, we could do better. *shrug*

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b04beaf41d ("intel/vec4: Try both sources as candidates for being immediates")
(cherry picked from commit 92252219d3)
2019-10-18 09:56:03 -07:00
Ian Romanick
e958b35a40 nir/search: Fix possible NULL dereference in is_fsign
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
(cherry picked from commit 050e4e28bf)
2019-10-18 09:55:58 -07:00
Roland Scheidegger
16af8e9772 gallivm: Fix saturated signed psub/padd intrinsics on llvm 8
LLVM 8 did remove both the signed and unsigned sse2/avx intrinsics in
the end, and provide arch-independent llvm intrinsics instead.
Fixes a crash when using snorm framebuffers (tested with piglit
arb_color_buffer_float-render GL_RGBA8_SNORM -auto).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 045f05a2f6)
Conflicts resolved by Dylan Baker
2019-10-17 09:29:01 -07:00
Samuel Pitoiset
01e31f8cab radv: fix DCC fast clear code for intensity formats (correctly)
Previous fix was pretty bogus.

This fixes a rendering regression with Nier (minimap too large).

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1943
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1952
Fixes: ea92273cea ("radv: fix DCC fast clear code for intensity formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit c644644c65)
2019-10-17 09:12:02 -07:00
Eric Engestrom
4d6fcddc65 util/u_atomic: fix return type of p_atomic_{inc,dec}_return() and p_atomic_{cmp,}xchg()
We're trying to cast the return type to the type of the var, but instead
we were casting `sizeof(*v)`.

Fixes: 6df72e970c ("util: Make u_atomic.h typeless.")
Fixes: 0a7f17cf5b ("util/u_atomic: add p_atomic_xchg")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit aaab70035a)
2019-10-17 09:11:21 -07:00
Pierre-Eric Pelloux-Prayer
5694c20188 mesa: fix invalid target error handling for teximage
This commit moves the target check before using _mesa_get_current_tex_object
to fix a "Mesa implementation error: bad target in _mesa_get_current_tex_object()"
error.

Fixes: 9dd1f7cec0 ("mesa: pass gl_texture_object as arg to not depend on state")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 16233797f4)
2019-10-17 09:11:13 -07:00
James Xiong
9a929cfef2 iris: finish aux import on get_param
A buffer and its aux are imported separately, if the aux import is
not completed yet when resource_get_param is called, merge the
separate aux a.k.a the 2nd image into the main image.

Fixes: 246eebba4a ("iris: Export and import surfaces with modifiers that have aux data")

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit fd235484fe)
2019-10-17 09:11:06 -07:00
Lionel Landwerlin
ae4f569232 etnaviv: remove variable from global namespace
Found out by accident this was clashing with another driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 701e0ac077)
2019-10-17 09:11:00 -07:00
Alan Coopersmith
9d13289ad4 intel/common: include unistd.h for ioctl() prototype on Solaris
Fixes build errors of:
In file included from ../src/intel/vulkan/anv_private.h:48,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
../src/intel/common/gen_gem.h: In function ‘gen_ioctl’:
../src/intel/common/gen_gem.h:68:15: error: implicit declaration of function ‘ioctl’ [-Werror=implicit-function-declaration]
   68 |         ret = ioctl(fd, request, arg);
      |               ^~~~~
In file included from ../include/c11/threads_posix.h:35,
                 from ../include/c11/threads.h:66,
                 from ../src/mesa/main/mtypes.h:39,
                 from ../src/intel/compiler/brw_compiler.h:30,
                 from ../src/intel/vulkan/anv_private.h:51,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
/usr/include/unistd.h: At top level:
/usr/include/unistd.h:471:12: error: conflicting types for ‘ioctl’
  471 | extern int ioctl(int, int, ...);
      |            ^~~~~
/usr/include/unistd.h:471:1: note: a parameter list with an ellipsis can’t match an empty parameter name list declaration
  471 | extern int ioctl(int, int, ...);
      | ^~~~~~
In file included from ../src/intel/vulkan/anv_private.h:48,
                 from ../src/intel/vulkan/genX_blorp_exec.c:26:
../src/intel/common/gen_gem.h:68:15: note: previous implicit declaration of ‘ioctl’ was here
   68 |         ret = ioctl(fd, request, arg);
      |               ^~~~~

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 6804b8e1ff)
2019-10-17 09:09:03 -07:00
Alan Coopersmith
9b49a4ea12 meson: recognize "sunos" as the system name for Solaris
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit d8a9420f6f)
Minor conflicts resolved by Dylan Baker
2019-10-17 09:08:50 -07:00
Alan Coopersmith
55a04df479 util: Solaris has linux-style pthread_setname_np
Fixes: dcf9d91a ("util: Handle differences in pthread_setname_np")

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 7040795a69)
2019-10-17 09:08:10 -07:00
Alan Coopersmith
45aa00da9f util: Workaround lack of flock on Solaris
v2: Replace autoconf check for flock() with meson check

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit b3028a9fb8)
Minor conflicts resolved by Dylan Baker
2019-10-17 09:07:40 -07:00
Alan Coopersmith
a1e6d1fb30 util: Make Solaris implemention of p_atomic_add work with gcc
gcc is very particular about where you place the (void) cast
The previous placement made it error out with:

In file included from disk_cache.c:40:0:
../../src/util/u_atomic.h:203:29: error: void value not ignored as it ought to be
 #define p_atomic_add(v, i) ((void)         \
                              ^
disk_cache.c:658:4: note: in expansion of macro ‘p_atomic_add’
    p_atomic_add(cache->size, size);
    ^

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit a56c3e3a47)
2019-10-17 09:07:01 -07:00
Alan Coopersmith
2380173433 c99_compat.h: Don't try to use 'restrict' in C++ code
Fixes build failures on Solaris in C++ files using gcc:

../src/util/u_math.h:628:41: error: expected ‘,’ or ‘...’ before ‘dest’
  628 | util_memcpy_cpu_to_le32(void * restrict dest, const void * restrict src, size_t n)
      |                                         ^~~~
../src/util/u_math.h: In function ‘void* util_memcpy_cpu_to_le32(void*)’:
../src/util/u_math.h:641:18: error: ‘dest’ was not declared in this scope
  641 |    return memcpy(dest, src, n);
      |                  ^~~~
../src/util/u_math.h:641:24: error: ‘src’ was not declared in this scope
  641 |    return memcpy(dest, src, n);
      |                        ^~~
../src/util/u_math.h:641:29: error: ‘n’ was not declared in this scope; did you mean ‘yn’?
  641 |    return memcpy(dest, src, n);
      |                             ^
      |                             yn

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit ddde652e70)
2019-10-17 09:07:00 -07:00
Samuel Pitoiset
45ebe99a88 Revert "radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+"
This reverts commit 2ca8629fa9.

This was initially ported from RadeonSI, but in the meantime it has
been reverted because it might hang. Be conservative and re-introduce
this packet emission.

Unfortunately this doesn't fix anything known.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 4a3bdc6d22)
2019-10-15 10:05:50 -07:00
Lucas Stach
c1ca1602dd etnaviv: fix vertex buffer state emission for single stream GPUs
GPUs with a single supported vertex stream must use the single state
address to program the stream.

Fixes: 3d09bb390a (etnaviv: GC7000: State changes for HALTI3..5)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
(cherry picked from commit ce23bc9283)
2019-10-15 10:05:26 -07:00
Kenneth Graunke
91960ae890 iris: Implement the Gen < 9 tessellation quads workaround
Fixes several CTS tests:
- KHR-GL46.tessellation_shader.vertex.vertex_spacing
- KHR-GL46.tessellation_shader.tessellation_shader_point_mode.points_verification

Fixes: 823609b1a3 ("iris/WIP: add broadwell support")
(cherry picked from commit ac7af7c500)
2019-10-14 11:16:04 -07:00
Samuel Pitoiset
59e56bf05d radv: fix DCC fast clear code for intensity formats
This fixes a rendering issue with DiRT 4 on GFX10. Only GFX10 was
affected because intensity formats are different.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1923
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit ea92273cea)
2019-10-14 11:16:00 -07:00
Timothy Arceri
6f30614d73 glsl: fix crash compiling bindless samplers inside unnamed UBOs
The check to see if we were dealing with a buffer block was
too late and only worked for named UBOs.

Fixes: f32b01ca43 "glsl/linker: remove ubo explicit binding handling"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1900
(cherry picked from commit 1294f01e06)
2019-10-14 11:15:56 -07:00
Bas Nieuwenhuizen
b87edab8a2 nir/dead_cf: Remove dead control flow after infinite loops.
And after discard-only loops. Otherwise we end up with dead code
which confuses nir_repair_ssa into adding a whole bunch of uses
of undefined. However, for derefs, we sometimes always expect to
get a variable instead of undefined.

Fixes dEQP-VK.graphicsfuzz.write-red-in-loop-nest on radv.

Fixes: c832820ce9 "nir/dead_cf: Repair SSA if the pass makes progress"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1928
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit 6da3bf2600)
2019-10-11 11:09:52 -07:00
Eric Engestrom
8355658fa8 meson: skip installation of GLVND-provided headers
Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1846
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 34ba363ab0)
2019-10-11 11:09:47 -07:00
Eric Engestrom
089aa74d57 meson: split Mesa headers as a separate installation
Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 1a7e9652c4)
2019-10-11 11:09:41 -07:00
Eric Engestrom
7f0d0ab83d meson: split headers one per line
Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit daae003f47)
2019-10-11 11:09:37 -07:00
Eric Engestrom
7b70f2ec47 meson: move a couple of include installs around
Preparation for a later commit.

Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit b9a5fb1f05)
2019-10-11 11:09:32 -07:00
Eric Engestrom
f446e56d30 meson: rename glvnd_missing_pc_files to not glvnd_has_headers_and_pc_files
This reflects better what is provided by glvnd or not.

Fixes: 93df862b6a ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit b57fa7ca49)
2019-10-11 11:09:26 -07:00
Eric Engestrom
767965b6fa GL: drop symbols mangling support
SCons and Meson have never supported that feature, and Autotools was
deleted over 6 months ago and no-one complained yet, so it's pretty
obvious nobody cares about it.

Fixes: 95aefc94a9 ("Delete autotools")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit a0829cf23b)
2019-10-11 11:09:22 -07:00
Bas Nieuwenhuizen
3deb4fa226 radv: Disallow sparse shared images.
Since we really cannot share them ever.

Also remove an unused switch.

Fixes: b70829708a "radv: Implement VK_KHR_external_memory"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 53b1372571)
2019-10-11 11:09:16 -07:00
Connor Abbott
0056943e69 nir/sink: Don't sink load_ubo to outside of its defining loop
Previously, this could have made the resource divergent in code like
that which is genereated by nir_lower_non_uniform_access.

Fixes: da8ed68a ('nir: replace nir_move_load_const() with nir_opt_sink()')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit 5ac32b2954)
2019-10-10 09:09:35 -07:00
Connor Abbott
d14d70de2f nir/sink: Rewrite loop handling logic
Previously, for code like:
loop {
    loop {
        a = load_ubo()
    }
    use(a)
}
adjust_block_for_loops() would return the block before the first loop.
Now we compute the range of allowed blocks and then walk the dominance
tree directly, guaranteeing directly that we always choose a block that
dominates all the uses and is dominated by the definition.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit af9296b8c0)
2019-10-10 09:09:27 -07:00
Alejandro Piñeiro
0beee2f723 v3d: take into account prim_counts_offset
Specifically when reading the primitive counters.

This fixed ~700 CTS tests using this pattern:
dEQP-GLES3.functional.transform_feedback.*

when run after tests like
dEQP-GLES3.functional.prerequisite.read_pixels on the same
caselist. When run individually those tests were passing because
prim_counts_offset was zero.

Fixes: 0f2d1dfe65 ("v3d: use the GPU to
       record primitives written to transform feedback")

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit fa41a51891)
2019-10-10 09:05:34 -07:00
Samuel Pitoiset
c48dc6ad5f radv: bump minTexelBufferOffsetAlignment to 4
The spec has probably been misinterpreted during RADV bringup.

This fixes GPU hangs with dEQP-VK.binding_model.*offset_nonzero*.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 030e67fac3)
2019-10-10 09:04:55 -07:00
Samuel Pitoiset
03df69d6a1 drirc: enable vk_x11_override_min_image_count for DOOM
DOOM fails to handle more images than expected when the adaptative
sync mode is enabled.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1902
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit ad96c4987c)
2019-10-10 09:04:55 -07:00
Clément Guérin
47bc45ba1a radeonsi: enable zerovram for Rocket League
Fixes corruption on game startup.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1888

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 5afbe87d21)
2019-10-10 09:04:55 -07:00
Kenneth Graunke
aa89c0a2bd iris: Properly unreference extra VBOs for draw parameters
bound_vertex_buffers doesn't include extra draw parameters buffers.
Tracking this correctly is kind of complicated, and iris_destroy_state
isn't exactly in a hot path, so just loop over all VBO bindings.

Fixes: 4122665dd9 (iris: Enable ARB_shader_draw_parameters support)
Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
(cherry picked from commit face221283)
2019-10-10 09:04:55 -07:00
Dylan Baker
3e15620451 docs: Add SHA256 sum for 19.2.1 2019-10-09 10:19:16 -07:00
Dylan Baker
877417918f Bump version for 19.2.1 2019-10-09 09:47:07 -07:00
Dylan Baker
e19dc53aae docs: Add relnotes for 19.2.1 2019-10-09 09:47:07 -07:00
Dylan Baker
a06f8341d8 bin: delete unused releasing scripts
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 974e3ad004)
2019-10-09 09:46:04 -07:00
Dylan Baker
4f38287970 release: Add an update_release_calendar.py script
This script is for updating post version bump.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 3226b12a09)
2019-10-09 09:46:04 -07:00
Dylan Baker
a4348e9594 scripts: Mesa 19.2.x only implements GL 4.5 2019-10-09 09:46:04 -07:00
Dylan Baker
61b3371acc scripts: Add a gen_release_notes.py script
This script is responsible for generating an entire page in the
docs/relnotes/ directory. It includes a template for the page, and uses
mako to fill in the necessary bits. It is designed to be purely fire and
forget, calculating previous versions, shortlogs, bug fixes, and dates.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
(cherry picked from commit 86079447da)
2019-10-09 09:15:28 -07:00
Eric Engestrom
5a0361ce09 meson: add missing idep_nir_headers in iris_gen_libs
Fixes: 4929f020c3 ("iris: better SBE")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 731097c747)
2019-10-08 09:18:31 -07:00
Tapani Pälli
06b8b29a0a anv/android: fix images created with external format support
This fixes a case where user first creates image and then later binds it
with memory created from AHW buffer.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit e4a826b2c8)
2019-10-08 09:18:22 -07:00
Prodea Alexandru-Liviu
9052318565 scons/MSYS2-MinGW-W64: Fix build options defaults
Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>

When building in a MSYS2 Mingw-w64 environment Mesa3D sets wrong default build options which inevitably lead to build failure.

(cherry picked from commit 6309c31fd8)
2019-10-07 10:49:04 -07:00
Lionel Landwerlin
85193e808a intel/isl: Set null surface format to R32_UINT
It appears we never had a test in piglit or deqp sampling from a null
surface...

It turns out this triggers a hang on IVB only. Updating the null
surface format to R32_UINT fixes the hang on ivb and doesn't affect
other platforms, so set it by default for all platforms.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1872
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c445d6f66e)
2019-10-07 10:48:14 -07:00
Lionel Landwerlin
34d738ff2e intel: fix subslice computation from topology data
We're missing the offset of the slice in the subslice mask...

This worked for most platforms that don't have first slice fused off
because we would reread the same mask from slice0 again and again...

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c1900f5b0f ("intel: devinfo: add helper functions to fill fusing masks values")
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1869
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit d36763b2a4)
2019-10-07 10:48:14 -07:00
Andres Gomez
c289ac9a22 egl: Remove the 565 pbuffer-only EGL config under X11.
The CTS finally has agreed to drop the requirement for a
565-no-depth-no-stencil config for ES 3.0. Hence we can now remove the
code to satisfy this requirement using a pbuffer-only visual with
whatever other buffers the driver happens to have given us.

This reverts commit 82607f8a90,
commit 6ad31c4ff3 and
commit dacb11a585.

v2:
  - Reference the VK-GL-CTS issue (Eric E.).

v3:
  - Don't revert
    fc21394bc4 ("egl: Quiet warning about front buffer rendering for pixmaps/pbuffers")
    (Kenneth).

References: VK-GL-CTS issue 1601.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 02c265be9d)
Conflicts resolved by Dylan Baker
2019-10-07 10:48:14 -07:00
Dylan Baker
bad5e64da8 meson: Only error building gallium video without libdrm when the platform is drm
Fixes: 3b265f61f5
       ("meson: gallium media state trackers require libdrm with x11")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1878
Tested-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit 1481d05409)
2019-10-07 10:48:14 -07:00
Dylan Baker
52cf623955 .cherry-ignore: Update for 19.2.1 cycle 2019-10-07 10:48:14 -07:00
Bas Nieuwenhuizen
78a05b8cbb radv: Fix condition for skipping the continue CS.
We need the continue CS for referencing the tess/GDS/sample position BOs.

Fixes: 46e52df34d "radv: add tessellation ring allocation support. (v2)"
Fixes: e1dc3ab753 "radv/gfx10: allocate GDS/OA buffer objects for NGG streamout"
Fixes: 1171b304f3 "radv: overhaul fragment shader sample positions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 8ad3d8b178)
2019-10-04 15:54:22 +02:00
Lionel Landwerlin
676471a092 intel: fix topology query
i915 will report ENODEV on generations prior to Haswell because there
is no point in reporting values on those. This is prior any fusing
could happen on parts with identical PCI ids.

This query call was previously only triggered on generations that
support performance queries, which happens to match generation for
which i915 reports topology, but the commit pointed below started
using it on all generations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1860
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: 96e1c945f2 ("i965: Move device info initialization to common code")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 1c6fdbc83c)
2019-10-03 09:56:36 -07:00
Marek Olšák
0b97377f58 radeonsi/gfx10: fix corruption for chips with harvested TCCs
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 235ebe9163)
2019-10-01 15:29:22 -07:00
Marek Olšák
33eecbcc9b ac: add radeon_info::tcc_harvested
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 8cbe83445b)
Conflicts resolved by Dylan Baker

Conflicts:
	src/amd/common/ac_gpu_info.h
2019-10-01 15:28:53 -07:00
Lionel Landwerlin
2bbe4c69c8 mesa: don't forget to clear _Layer field on texture unit
On the Android Antutu benchmark we ran into an assert in ISL where the
(base layer + num layers) > total layers. It turns out the core of
mesa forgot to clear the _Layer variable, potentially leaving an
inconsistent value.

v2: Pull setting u->_Layer out of the conditional blocks (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 2208d79dde)
2019-10-01 12:30:07 -07:00
Ken Mays
db56bc2c52 haiku: fix Mesa build
1. The hgl.c file is a read-only file versus read-write.
Ref: src/gallium/state_trackers/hgl/hgl.c

2.  I've included the Haiku-specific patches I used to get a successful
build of Mesa 19.1.7 on Haiku using the meson/ninja build procedure.
Shows "[764/764] linking target ... libswpipe.so" at build completion.

v2:
Remove autotools files (Eric)

v3:
Update the patch

Reported-by: Ken Mays <kmays2000@gmail.com>
Tested-by: Ken Mays <kmays2000@gmail.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Alexander von Gluck IV <kallisti5@unixzen.com>
(cherry picked from commit 4943c89d6d)
2019-10-01 12:30:02 -07:00
Kenneth Graunke
4169d86913 iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.
We can't just check for the BO base address, we need to check for the
full address including any offset we may have applied.  When updating
the address, we need to include the offset again.

Fixes: 5ad0c88dbe ("iris: Replace buffer backing storage and rebind to update addresses.")
(cherry picked from commit 309924c3c9)
2019-10-01 12:29:58 -07:00
Dylan Baker
680e18c159 meson: gallium media state trackers require libdrm with x11
v2: - update copyright year in all changed files
    - rebase on master

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 3b265f61f5)
2019-10-01 12:29:54 -07:00
Kenneth Graunke
d4dab05a09 iris: Disable CCS_E for 32-bit floating point textures.
A while back, Michael Larabel noticed that Paraview's Wavelet Volume
case runs significantly slower on iris than i965.  It turns out this
is because we enable CCS_E for 32-bit floating point formats, while
i965 disables it, with an oblique comment saying that we benchmarked
it (on what exactly?) and determined that it was a loss.

Paraview uses both R32_FLOAT and R32G32B32A32_FLOAT, and I observed
large framerate drops when enabling CCS_E for either format.  However,
several other benchmarks (Aztec Ruins, many Synmark cases) use 16-bit
floating point formats, with no apparent ill effects.

So, disable compression for 32-bit float formats for now, but leave it
enabled for 16-bit float formats as they seem to be working fine.

Improves performance in Paraview's Wavelet Volume test by 62% on a
Skylake GT4e.

Fixes: 3cfc6a207b ("iris: Fill out res->aux.possible_usages")
(cherry picked from commit a0a93763fb)
2019-10-01 12:29:51 -07:00
Marek Olšák
f5cccfe0e6 ac: fix num_good_cu_per_sh for harvested chips
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit b7c2f7c5a6)
2019-10-01 12:29:45 -07:00
Marek Olšák
0a2285b1d4 ac: fix incorrect vram_size reported by the kernel
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 7d97013294)
2019-10-01 12:29:15 -07:00
Marek Olšák
8f95245068 radeonsi/gfx10: fix L2 cache rinse programming
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 3c0938bece)
2019-10-01 12:29:11 -07:00
pal1000
a413b55157 scons/windows: Support build with LLVM 9.
As X86AsmPrinter component is gone, LLVMX86AsmPrinter got replaced
with LLVMRemarks, LLVMBitstreamReader and LLVMDebugInfoDWARF.

Tests done with llvm-config on both LLVM 8 and 9 indicate that
mcjit, bitwriter and x86asmprinter fully fit inside engine component.

On other platforms and with meson build mcdisassembler was used to replace
X86AsmPrinter but mcdisassembler also fully fits inside engine component
for LLVM>=8 according to same tests.

v2: Avoid duplicating code related to Mingw pthreads.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>

On 19.1 this patch does not apply cleanly without 88eb2a1f

(cherry picked from commit bcb4dfb14b)
2019-09-30 09:10:21 -07:00
Michel Zou
5a027b6201 scons: add py3 support
SCons 3.1 has moved to python 3, requiring this fix
to continue supporting scons builds.

Closes: #944
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 3f92d17894)
2019-09-30 09:10:16 -07:00
Mauro Rossi
769a18d1f3 android: compiler/nir: build nir_divergence_analysis.c
Prerequisite to avoid following radv linking error happening with aco

FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so
...
external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:178:
error: undefined reference to 'nir_divergence_analysis'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: df86c5f ("nir: add divergence analysis pass.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
(cherry picked from commit 268fb10e9c)
2019-09-30 09:10:11 -07:00
Andrii Simiklit
cb2649768f glsl: disallow incompatible matrices multiplication
glsl 4.4 spec section '5.9 expressions':
"The operator is multiply (*), where both operands are matrices or one operand is a vector and the
 other a matrix. A right vector operand is treated as a column vector and a left vector operand as a
 row vector. In all these cases, it is required that the number of columns of the left operand is equal
 to the number of rows of the right operand. Then, the multiply (*) operation does a linear
 algebraic multiply, yielding an object that has the same number of rows as the left operand and the
 same number of columns as the right operand. Section 5.10 “Vector and Matrix Operations”
 explains in more detail how vectors and matrices are operated on."

This fix disallows a multiplication of incompatible matrices like:
mat4x3(..) * mat4x3(..)
mat4x2(..) * mat4x2(..)
mat3x2(..) * mat3x2(..)
....

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111664
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
(cherry picked from commit b32bb888c7)
2019-09-30 09:10:07 -07:00
Jason Ekstrand
e6edeebd15 intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates
Without this, we were DCEing flag writes because we didn't think their
results were used because we didn't understand that an ANY32 predicate
actually read all the flags.

Fixes: df1aec763e "i965/fs: Define methods to calculate the flag..."
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6c858b9a91)
2019-09-30 09:10:02 -07:00
Dylan Baker
2dbf10ba3d meson: Link xvmc with libxv
Prior to xvmc 1.0.12 libxvmc incorrectly required libxv, but that was
fixed. This results in compilation failures for the gallium xvmc tracker
and tools. This patch fixes that by explicitly linking to libxv.

Fixes: 22a817af8a
       ("meson: build gallium xvmc state tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1844
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit e456a053c3)
2019-09-30 09:09:57 -07:00
Dylan Baker
daeb959c91 meson: Try finding libxvmcw via pkg-config before using find_library
This fixes cross compiling issues, because pkg-config is less likely to
get the wrong libs.

v2: - Fix typo in comment

Fixes: 22a817af8a
       ("meson: build gallium xvmc state tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/939
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 8c5c21d7e3)
2019-09-30 09:09:52 -07:00
Andreas Gottschling
db1ed17ac4 drisw: Fix shared memory leak on drawable resize
XDestroyImage will mark the segment as to-be-destroyed, but it will
persist until we detach it, and we weren't doing so.

Cc: mesa-stable@lists.freedesktop.org
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/121
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit c5a2ccec5e)
2019-09-30 09:09:47 -07:00
pal1000
dc0995669d scons: Fix MSYS2 Mingw-w64 build.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

This patch is based on 28e3f85e09/mingw-w64-mesa/link-ole32.patch but with tweaks to avoid MSVC build break when applied.

v2: Create Mingw platform alias pointing to windows host platform define to avoid spurious crosscompilation;

v3: Fix obviously wrong compiler flags for swr driver;

v4: Update original patch URL because it has been relocated;

v5: Don't bother patching autools stuff as it's not used by MSYS2 Mingw-w64 build and it's days are numbered anyway;

v6: After Mingw posix flag fix in 295851eb things are far simpler as we don't need more linking of uuid, ole32, version and shell32 than what is already in place.
(cherry picked from commit ffb0d3a25c)
2019-09-30 09:09:00 -07:00
Michel Dänzer
434ab094c0 radeonsi: fix VAAPI segfault due to various bugs
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111236
(cherry picked from commit 67d930d64b)
2019-09-26 14:40:54 -07:00
Tapani Pälli
e35a7a0238 iris: disable aux on first get_param if not created with aux
This moves the fix from commit 361f3d19f1 to happen in get_param
(used now instead of get_handle by st/dri). This fixes artifacts
seen with Xorg and CCS_E.

Fixes: fc12fd05f5 "iris: Implement pipe_screen::resource_get_param"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f4d9169204)
2019-09-26 08:47:50 -07:00
Marek Olšák
95d87a897b gallium: extend resource_get_param to be as capable as resource_get_handle
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d307aa56f9)
2019-09-26 08:47:50 -07:00
Lionel Landwerlin
c4b70fef71 intel: use proper label for Comet Lake skus
Fixes: 82f6a746e8 ("intel: Add support for Comet Lake")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 813f3460e7)
2019-09-26 08:47:50 -07:00
Ian Romanick
bc6cc94d5a nir/range-analysis: Bail if the types don't match
Some shaders are hurt by this change because now a
load_const(0x00000000) is not recognized as eq_zero when loaded as a
float.  This behavior is restored in a later patch (nir/range-analysis:
Use types to provide better ranges from bcsel and mov).

v2: Add a comment about reinterpretation of int/uint/bool.  Suggested by
Caio.  Rewrite condition the check for types being float versus checking
for types not being all the things that aren't float.

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16327543 -> 16328255 (<.01%)
instructions in affected programs: 55928 -> 56640 (1.27%)
helped: 0
HURT: 208
HURT stats (abs)   min: 1 max: 16 x̄: 3.42 x̃: 3
HURT stats (rel)   min: 0.33% max: 6.74% x̄: 1.31% x̃: 1.12%
95% mean confidence interval for instructions value: 3.06 3.79
95% mean confidence interval for instructions %-change: 1.17% 1.46%
Instructions are HURT.

total cycles in shared programs: 363682759 -> 363683977 (<.01%)
cycles in affected programs: 325758 -> 326976 (0.37%)
helped: 44
HURT: 133
helped stats (abs) min: 1 max: 179 x̄: 33.61 x̃: 5
helped stats (rel) min: 0.06% max: 14.21% x̄: 2.47% x̃: 0.29%
HURT stats (abs)   min: 1 max: 157 x̄: 20.28 x̃: 14
HURT stats (rel)   min: 0.07% max: 14.44% x̄: 1.42% x̃: 0.73%
95% mean confidence interval for cycles value: 0.38 13.39
95% mean confidence interval for cycles %-change: -0.06% 0.96%
Inconclusive result (%-change mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10787433 -> 10787443 (<.01%)
instructions in affected programs: 1842 -> 1852 (0.54%)
helped: 0
HURT: 10
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.33% max: 1.85% x̄: 0.73% x̃: 0.49%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.36% 1.10%
Instructions are HURT.

total cycles in shared programs: 153724543 -> 153724563 (<.01%)
cycles in affected programs: 8407 -> 8427 (0.24%)
helped: 1
HURT: 3
helped stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18
helped stats (rel) min: 0.98% max: 0.98% x̄: 0.98% x̃: 0.98%
HURT stats (abs)   min: 4 max: 18 x̄: 12.67 x̃: 16
HURT stats (rel)   min: 0.21% max: 0.75% x̄: 0.56% x̃: 0.72%
95% mean confidence interval for cycles value: -21.31 31.31
95% mean confidence interval for cycles %-change: -1.11% 1.46%
Inconclusive result (value mean confidence interval includes 0).

No shader-db changes on Iron Lake or GM45.

(cherry picked from commit 018d2b524a)
2019-09-26 08:47:50 -07:00
Lionel Landwerlin
6273d4d4ed anv: gem-stubs: return a valid fd got anv_gem_userptr()
Fixes invalid close(-1) in the unit tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit da2d67fc3b)
2019-09-26 08:47:50 -07:00
Danylo Piliaiev
960ab3e465 st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader
RET as a last instruction could be safely ignored.
Remove it to prevent crashes/warnings in case underlying driver
doesn't implement arbitrary returns.

A better way would be to remove the RET after the whole shader
is parsed which will handle a possible case when the last RET is
followed by a comment.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
(cherry picked from commit 2d8f77db83)
2019-09-26 08:47:50 -07:00
Dylan Baker
41b57b8b73 meson: fix logic for generating .pc files with old glvnd
We want to generate PC files for non-glvnd builds and for builds with
old glvnd, but the current logic doesn't do that, it builds them
unconditionally, and for GLES it builds the shared libraries, which is
also not what we want. This does not generate .pc files for gles1 or
gles2. Which it we weren't doing before either, making this not a
regression but a return to status-quo.o

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1838
Fixes: 93df862b6a
       ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit fafd20f67d)
2019-09-26 08:47:50 -07:00
Erik Faye-Lund
109137ee7b glsl: correct bitcast-helpers
Without this, we'll incorrectly round off huge values to the nearest
representable double instead of keeping it at the exact value  as
we're supposed to.

Found by inspecting compiler-warnings.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 85faf5082f ("glsl: Add 64-bit integer support for constant expressions")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 88f909eb37)
2019-09-26 08:47:50 -07:00
Rhys Perry
f4faf5cbd7 nir/opt_remove_phis: handle phis with no sources
This can happen with loops with unreachable exits which are later
optimized away.

Fixes assertion in dEQP-VK.graphicsfuzz.unreachable-loops with RADV.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 12372d60ff)
2019-09-26 08:47:50 -07:00
Marek Olšák
b29682c290 gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH
because vl doesn't call flush_resource and I wasn't able to find
all places where flush_resource needs to be called.

This fixes corrupted / unflushed surfaces with fullscreen videos on Raven.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f52afdf672)
2019-09-26 08:47:50 -07:00
Connor Abbott
1b67e47c0c nir/opt_large_constants: Handle store writemasks
This fixes some piglit tests on radeonsi NIR where a varying is
initialized to a constant array in the vertex shader. Varying packing
after nir_lower_io_to_temporaries creates writemasked stores which
persist after pulling the constant initialization down into the fragment
shader.

While we're here, rewrite handle_constant_store() to do the loop over
components outside the switch, so that we don't have to duplicate the
writemask checking for every bitsize.

Fixes: 1235850522 ("nir: Add a large constants optimization pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 270fe55256)
2019-09-26 08:47:50 -07:00
Eric Engestrom
0c56cb50c7 meson: drop -Wno-foo bug workaround for Meson < 0.46
This was a workaround for a bug in Meson that was fixed in 0.46 [1].

[1] https://github.com/mesonbuild/meson/pull/2284

Fixes: f7b6a8d12f ("meson: bump required version to 0.46")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 3fd0afd5e3)
2019-09-26 08:47:50 -07:00
Eric Engestrom
40d592473e radv: fix s/load/store/ copy-paste typo
Fixes: cdc6efddf9 ("radv: implement all depth/stencil resolve modes using graphics")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 30f639c181)
2019-09-26 08:47:50 -07:00
Stephen Barber
2947b89369 nouveau: add idep_nir_headers as dep for libnouveau
Fixes a compilation error when building libnouveau:

In file included from ../src/gallium/drivers/nouveau/nv50/nv50_program.c:25:
../src/compiler/nir/nir.h:1115:10: fatal error: nir_intrinsics.h: No such file or directory
 #include "nir_intrinsics.h"
           ^~~~~~~~~~~~~~~~~~
           compilation terminated.

Fixes: f014ae3c7c ("nouveau: add support for nir")
Signed-off-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 8c3ace6991)
2019-09-26 08:47:50 -07:00
Dylan Baker
0c47b502c2 Bump version for 19.2.0 final 2019-09-25 09:56:31 -07:00
Dylan Baker
8ea440fa3a docs: Add release notes for 19.2.0 2019-09-25 09:55:33 -07:00
Eric Engestrom
e59e9cd58a meson: re-add incorrect pkg-config files with GLVND for backward compatibility
This is a bit counter-intuitive, but the issue is that GLVND is broken
in versions <= 1.1.1, so we need to keep wrongly providing these files
to cover up their mistake, otherwise the rest of the world ends up
broken.

Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 93df862b6a)
2019-09-25 09:30:31 -07:00
Mauro Rossi
0d781fe4b8 android: anv: libmesa_vulkan_common: add libmesa_util static dependency
Change needed to fix the following building error:

In file included from external/mesa/src/intel/vulkan/anv_device.c:43:
external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 4dcb1ff ("anv: add support for driconf")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit ae5ac26dfa)
2019-09-24 08:25:33 -07:00
Mauro Rossi
ba8b282ae4 android: mesa: revert "Enable asm unconditionally"
This patch partially reverts 20294dc ("mesa: Enable asm unconditionally, ...")

Android makefile build logic needs to disable assembler optimization
in 32bit builds to avoid text relocations for libglapi.so shared

Fixes the following build error with Android x86 32bit target:

[  0% 4/477] target SharedLib: libglapi (out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so)
FAILED: out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so
...
prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: warning: shared library text segment is not shareable
prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: error: treating warnings as errors
clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 20294dc ("mesa: Enable asm unconditionally, now that gen_matypes is gone.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 7a6e7803a7)
2019-09-24 08:25:27 -07:00
Erik Faye-Lund
f7e25ae6d6 util: fix SSE-version needed for double opcodes
This code generates CVTSD2SI, which requires SSE2. So let's fix the
required SSE-version.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5de29ae (util: try to use SSE instructions with MSVC and 32-bit gcc)
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2ade1c5cf7)
2019-09-24 08:25:01 -07:00
Bas Nieuwenhuizen
5037ffb646 radv: Add workaround for hang in The Surge 2.
Released today and hangs on RADV. We don't have the root cause yet,
but this should unblock people playing the game.

No drirc because the radv debugflags are not usable from drirc and
I want this backported.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 780182f0a0)
2019-09-24 08:24:22 -07:00
Juan A. Suarez Romero
c90dc1232d bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars
The script only handles commits with "Fixes: <sha1>" where <sha1> is
equal or great than 8 chars. But <sha1> can be smaller, like 7 chars.

This commit relax the restriction to handle <sha1> 4 or more chars.

Fixes: 533fead423 ("bin/get-pick-list.sh: tweak the commit sha matching pattern")

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit b3c25e6f99)
2019-09-24 08:24:17 -07:00
Kenneth Graunke
a1b6212710 intel: Increase Gen11 compute shader scratch IDs to 64.
From the MEDIA_VFE_STATE docs:

   "Starting with this configuration, the Maximum Number of Threads must
    be set to (#EU * 8) for GPGPU dispatches.

    Although there are only 7 threads per EU in the configuration, the
    FFTID is calculated as if there are 8 threads per EU, which in turn
    requires a larger amount of Scratch Space to be allocated by the
    driver."

It's pretty clear that we need to increase this for scratch address
calculations, because the FFTID has a certain bit-pattern.  The quote
above seems to indicate that we should increase the actual thread count
programmed in MEDIA_VFE_STATE as well, but we think the intention is to
only bump the scratch space.

Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.

Fixes: 5ac804bd9a ("intel: Add a preliminary device for Ice Lake")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b9e93db208)
2019-09-24 08:23:56 -07:00
Marek Olšák
9f94cd7140 ac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 48742de601)
2019-09-24 08:23:50 -07:00
Marek Olšák
5e289248a4 amd: add more PCI IDs for Navi14
trivial and urgent

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 65b698136c)
2019-09-24 08:23:45 -07:00
Jason Ekstrand
dc7129777f nir/repair_ssa: Replace the unreachable check with the phi builder
In a3268599f3, I attempted to fix nir_repair_ssa for unreachable
blocks.  However, that commit missed the possibility that the use is in
a block which, itself, is unreachable.  In this case, we can end up in
an infinite loop trying to replace a def with itself.  Even though a
no-op replacement is a fine operation, it keeps extending the end of the
uses list as we're walking it.  Instead of explicitly checking for the
group of conditions, just check if the phi builder gives us a different
def.  That's guaranteed to be 100% reliable and, while it lacks symmetry
with the is_valid checks, should be more reliable.

Fixes: a3268599 "nir/repair_ssa: Repair dominance for unreachable..."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit d63162cff0)
2019-09-23 11:11:59 -07:00
Hal Gentz
6485fd8362 gallium/osmesa: Fix the inability to set no context as current.
Currently there is no way to make no context current w/gallium + osmesa.
The non-gallium version of osmesa does this if the context and buffer
passed to `OSMesaMakeCurrent` are both null. This small change makes it
so that this is also the case with the gallium version.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 57c894334e)
2019-09-23 11:11:59 -07:00
Ian Romanick
2d30c0fc59 nir/algebraic: Do not apply late DPH optimization in vertex processing stages
Some shaders do not use 'invariant' in vertex and (possibly) geometry
shader stages on some outputs that are intended to be invariant.  For
various reasons, this optimization may not be fully applied in all
shaders used for different rendering passes of the same geometry.  This
can result in Z-fighting artifacts (at best).  For now, disable this
optimization in these stages.

In tessellation stages applications seem to use 'precise' when
necessary, so allow the optimization in those stages.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111490
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")

All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16194726 -> 16344745 (0.93%)
instructions in affected programs: 2855172 -> 3005191 (5.25%)
helped: 6
HURT: 20279
helped stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1
helped stats (rel) min: 0.44% max: 1.00% x̄: 0.54% x̃: 0.44%
HURT stats (abs)   min: 1 max: 32 x̄: 7.40 x̃: 7
HURT stats (rel)   min: 0.14% max: 42.86% x̄: 8.58% x̃: 6.56%
95% mean confidence interval for instructions value: 7.34 7.45
95% mean confidence interval for instructions %-change: 8.48% 8.67%
Instructions are HURT.

total cycles in shared programs: 364471296 -> 365014683 (0.15%)
cycles in affected programs: 32421530 -> 32964917 (1.68%)
helped: 2925
HURT: 16144
helped stats (abs) min: 1 max: 403 x̄: 18.39 x̃: 5
helped stats (rel) min: <.01% max: 22.61% x̄: 1.97% x̃: 1.15%
HURT stats (abs)   min: 1 max: 18471 x̄: 36.99 x̃: 15
HURT stats (rel)   min: 0.02% max: 52.58% x̄: 5.60% x̃: 3.87%
95% mean confidence interval for cycles value: 21.58 35.41
95% mean confidence interval for cycles %-change: 4.36% 4.52%
Cycles are HURT.

(cherry picked from commit 92f70df8c3)
2019-09-23 11:11:59 -07:00
Adam Jackson
ac9ef75a25 docs: Update bug report URLs for the gitlab migration
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 5b5c5bf833)
2019-09-23 11:11:59 -07:00
Andres Gomez
75836b70c7 docs: Add the maximum implemented Vulkan API version in 19.2 rel notes
Currently, Vulkan 1.1.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 41b0e0d7e0)
2019-09-23 11:11:59 -07:00
Arcady Goldmints-Orlov
0b54e1d176 anv: fix descriptor limits on gen8
Later generations support bindless for samplers, images, and buffers and
thus per-stage descriptors are not limited by the binding table size.
However, gen8 doesn't support bindless images and thus needs to report a
lower per-stage limit so that all combinations of descriptors that fit
within the advertised limits are reported as supported by
vkGetDescriptorSetLayoutSupport.

Fixes test dEQP-VK.api.maintenance3_check.descriptor_set
Fixes: 79fb0d27f3 ("anv: Implement SSBOs bindings with GPU addresses in the descriptor BO")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5ec5fecc26)
2019-09-23 11:11:59 -07:00
Tapani Pälli
00eaba7761 egl: check for NULL value like eglGetSyncAttribKHR does
Commit d1e1563bb6 added a NULL check for eglGetSyncAttribKHR
but eglGetSyncAttrib does not do this. Patch adds same check to
happen with eglGetSyncAttrib.

Fixes crashes in (when exposing EGL 1.5):
   dEQP-EGL.functional.fence_sync.invalid.get_invalid_value

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 99cbec0a5f)
2019-09-23 11:11:59 -07:00
Paulo Zanoni
770e77dcd1 intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32
The current code can create functions with a width of 32, which is not
supported by our hardware. Add some code to simplify how we express
what we want and prevent such cases.

For some unknown reason, all the tests I could run seem to work even
with these unsupported MOVs.

Fixes: b0858c1cc6 "intel/fs: Add a couple of simple helper opcodes"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 8e614c7a29)
2019-09-23 11:11:59 -07:00
Eric Engestrom
0c29a15a09 gl: drop incorrect pkg-config file for glvnd
Akin to 1a25980c46 ("egl: drop incorrect pkg-config file for
glvnd") and b01524fff0 ("meson: don't build libGLES*.so with
GLVND") , removes a pkg-config file that shouldn't have been there in
the first place, but was needed because of that GLVND bug.

Now that the glvnd bug has been fixed, it was apparent that this gl.pc
pkg-config file was forgotten to be removed, so let's do just that :)

Suggested-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit a1de3011f3)
2019-09-23 11:11:59 -07:00
Samuel Pitoiset
401de43891 radv/gfx10: fix VK_KHR_pipeline_executable_properties with NGG GS
No GS copy shader if a pipeline enables NGG GS.

This fixes
dEQP-VK.pipeline.executable_properties.graphics.*geometry_stage*.

Fixes: 86864eedd2 ("radv: Implement radv_GetPipelineExecutablePropertiesKHR.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 99c186fbbe)
2019-09-23 11:11:59 -07:00
Dylan Baker
e27644cc2e bin/get-pick-list: use --oneline=pretty instead of --oneline
--oneline shortens hashes, while --oneline=pretty doesn't, otherwise
they are the same. Having full hashes is convenient as that is the
format that the bin/.cherry-ignore script requires to work correctly.
2019-09-23 11:11:55 -07:00
Dylan Baker
b1ea3dcc1f rehardcode from origin/master to upstream/master 2019-09-23 11:11:50 -07:00
Dylan Baker
6ee069f94a cherry-ignore: Add patches 2019-09-23 11:08:24 -07:00
Samuel Pitoiset
d083220879 radv: fix loading 64-bit GS inputs
We have to load 2 32-bit integer and to cast correctly.

This fixes crashes with gs-double-interpolator.vk_shader_test.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111734
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 68820007fd)
2019-09-18 09:38:42 -07:00
Bas Nieuwenhuizen
9bf80c2b53 tu: Set up glsl types.
Addresses this assert:

deqp-vk: ../mesa-freedreno-9999/src/compiler/glsl_types.cpp:1244: static const glsl_type *glsl_type::get_interface_instance(const glsl_struct_field *, unsigned int, enum glsl_interface_packing, bool, const char *): Assertion `glsl_type_users > 0' failed.

running dEQP-VK.api.smoke.triangle .

Fixes: 624789e370 "compiler/glsl: handle case where we have multiple users for types"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7999e10cab)
2019-09-18 09:38:42 -07:00
Haihao Xiang
fa59ef37ed i965: support AYUV/XYUV for external import only
Fixes: 89785e2d56 ("i965: add support for sampling from AYUV")
Fixes: 7cab8d3661 ("i965: Add support for sampling from XYUV images")
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 8a9b81ab9d)
2019-09-18 09:38:42 -07:00
Samuel Iglesias Gonsálvez
0ff13c291b intel/nir: do not apply the fsin and fcos trig workarounds for consts
If we have fsin or fcos trigonometric operations with constant values
as inputs, we will multiply the result by 0.99997 in
brw_nir_apply_trig_workarounds, making the result wrong.

Adjusting the rules so they do not apply to const values we let a
later constant fold to deal with it.

v2:
- Do not early constant fold but only apply the trig workaround for
  non constants (Caio).
- Add fixes tag to commit log (Caio).

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 3c474f8513)
2019-09-18 09:38:42 -07:00
Dylan Baker
71fafc13b9 Bump version for 19.2.0-rc4 2019-09-18 09:10:19 -07:00
Marek Olšák
e2bb51a99b radeonsi: add Navi12 PCI ID
trivial and urgent

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 83f195414a)
2019-09-18 09:09:54 -07:00
Tapani Pälli
2e9a37cf1c iris: close screen fd on iris_destroy_screen
Otherwise it never gets closed, this fixes errors seen with deqp-egl
where we end up opening 1024 files.

Fixes: 2dce0e94 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 631255387f)
2019-09-18 09:09:54 -07:00
Rhys Perry
b94ab6f5e3 radv: always emit a position export in gs copy shaders
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8d0337299 ('radv: add multiple streams support for the GS copy shader')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ffabcbba60)
2019-09-18 09:09:54 -07:00
Lionel Landwerlin
b2a09536ff drirc: include unreal engine version 0 to 23
This was meant to include up to version 23.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0616b7ac90 ("vulkan: add vk_x11_strict_image_count option")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit dcf13fbac9)
2019-09-18 09:09:54 -07:00
Lionel Landwerlin
3aeddc1f2f util/xmlconfig: fix regexp compile failure check
This is embarrasing...

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 04dc6074cf ("driconfig: add a new engine name/version parameter")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 10206ba17b)
2019-09-18 09:09:54 -07:00
Sergii Romantsov
d5fe3f73fc nir/large_constants: more careful data copying
A filed of nir_variable.location may be equel to -1.
That may cause copying to invalid address of list-node,
making some internal fields corrupted.

Patch fixes segfault during freeing context due to
corrupted address of ralloc_header.destructor.

v2: copy data if var is constant (Connor Abbott)

CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b6d4753568 (nir/large_constants: De-duplicate constants)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111676
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit c7b2a2fd36)
2019-09-18 09:09:54 -07:00
Lionel Landwerlin
5650308d08 vulkan: add vk_x11_strict_image_count option
This option strictly allocate the minImageCount given by the
application at swapchain creation.

This works around application that do not deal with the fact that the
implementation allocates more images than the minimum specified.

v2: Add values in default drirc (Bas)

v3: specify engine name/version (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0616b7ac90)
2019-09-18 09:09:54 -07:00
Lionel Landwerlin
fbd96932d6 driconfig: add a new engine name/version parameter
Vulkan applications can register with the following structure :

typedef struct VkApplicationInfo {
    VkStructureType    sType;
    const void*        pNext;
    const char*        pApplicationName;
    uint32_t           applicationVersion;
    const char*        pEngineName;
    uint32_t           engineVersion;
    uint32_t           apiVersion;
} VkApplicationInfo;

This enables the Vulkan implementations to apply workarounds based off
matching this description.

Here we add a new parameter for matching the driconfig options with
the following :

    <device driver="anv">
        <application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42">
            <option name="blaaah" value="true" />
        </application>
    </device>

v2: switch engine name match to use regexps

v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric)

v4: Add missing bit that went to the following commit (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 04dc6074cf)
2019-09-18 09:09:54 -07:00
Lionel Landwerlin
82fc77b521 radv: store engine name
We'll use this later for a new driconfig matching parameter.

v2: Avoid leak in device creation error case (Bas)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6d5f11ab34)
2019-09-18 09:09:54 -07:00
Kenneth Graunke
78c34c3bfb iris: Initialize ice->state.prim_mode to an invalid value
It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we
fail to notice an initial primitive of points being new, and fail at
updating the "primitive is points or lines" field.

We do not need to reset this on device loss because we're tracking
the last primitive mode sent to us on the CPU via draw_vbo, not the
last primitive mode sent to the GPU.

Fixes several tests:
- dEQP-GLES3.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner

Fixes: dcfca0af7c ("iris: Set XY Clipping correctly.")
(cherry picked from commit c9fb704f72)
2019-09-18 09:09:54 -07:00
Samuel Pitoiset
cd0402c582 radv: fix allocating number of user sgprs if streamout is used
streamout_buffers is assigned after that function, so the previous
fix was completely wrong. This probably fix something when streamout
buffers and push constants are used/inlined in the same shader.

Fixes: 378e2d2414 ("radv: fix computing number of user SGPRs for streamout buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 8137df3a46)
[Juan A. Suarez: fix the structure usage]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-09-18 09:09:54 -07:00
Iago Toral Quiroga
110dc21ed3 v3d: make sure we have enough space in the CL for the primitive counts packet
Fixes: 0f2d1dfe65 ("v3d: use the GPU to record primitives written to transform feedback")

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b9a07eed00)
2019-09-18 09:09:54 -07:00
Jason Ekstrand
cfcb96da38 intel/fs: Handle UNDEF in split_virtual_grfs
When the UNDEF instruction was added, we didn't do anything special in
split_virtual_grfs.  This mean that anything with an UNDEF wasn't
getting split which causes problems for the compiler.  Among other
things, it makes RA harder because things are in bigger chunks.  It also
meant that dvec4s weren't getting split which means that they are larger
than the maximum register size.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 14959202 -> 14960035 (<.01%)
    instructions in affected programs: 96197 -> 97030 (0.87%)
    helped: 140
    HURT: 128
    helped stats (abs) min: 1 max: 17 x̄: 1.62 x̃: 1
    helped stats (rel) min: 0.09% max: 6.15% x̄: 0.65% x̃: 0.45%
    HURT stats (abs)   min: 1 max: 825 x̄: 8.28 x̃: 1
    HURT stats (rel)   min: 0.13% max: 139.83% x̄: 1.70% x̃: 0.50%
    95% mean confidence interval for instructions value: -2.96 9.18
    95% mean confidence interval for instructions %-change: -0.56% 1.51%
    Inconclusive result (value mean confidence interval includes 0).

    total loops in shared programs: 4372 -> 4372 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 352646771 -> 352840997 (0.06%)
    cycles in affected programs: 218600800 -> 218795026 (0.09%)
    helped: 21167
    HURT: 21411
    helped stats (abs) min: 1 max: 2924 x̄: 36.89 x̃: 10
    helped stats (rel) min: <.01% max: 41.90% x̄: 2.97% x̃: 0.98%
    HURT stats (abs)   min: 1 max: 26027 x̄: 45.54 x̃: 10
    HURT stats (rel)   min: <.01% max: 324.46% x̄: 3.88% x̃: 1.06%
    95% mean confidence interval for cycles value: 2.87 6.26
    95% mean confidence interval for cycles %-change: 0.40% 0.55%
    Cycles are HURT.

    total spills in shared programs: 8840 -> 8953 (1.28%)
    spills in affected programs: 126 -> 239 (89.68%)
    helped: 1
    HURT: 2

    total fills in shared programs: 21782 -> 21914 (0.61%)
    fills in affected programs: 431 -> 563 (30.63%)
    helped: 1
    HURT: 3

    LOST:   0
    GAINED: 5

Shader-db results on Haswell:

    total instructions in shared programs: 13320918 -> 13320769 (<.01%)
    instructions in affected programs: 40998 -> 40849 (-0.36%)
    helped: 146
    HURT: 56
    helped stats (abs) min: 1 max: 8 x̄: 2.73 x̃: 2
    helped stats (rel) min: 0.16% max: 8.60% x̄: 2.52% x̃: 2.22%
    HURT stats (abs)   min: 2 max: 23 x̄: 4.45 x̃: 4
    HURT stats (rel)   min: 0.21% max: 10.26% x̄: 6.83% x̃: 10.26%
    95% mean confidence interval for instructions value: -1.26 -0.21
    95% mean confidence interval for instructions %-change: -0.62% 0.77%
    Inconclusive result (%-change mean confidence interval includes 0).

    total loops in shared programs: 4373 -> 4373 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 374518258 -> 374384193 (-0.04%)
    cycles in affected programs: 231101954 -> 230967889 (-0.06%)
    helped: 21427
    HURT: 19438
    helped stats (abs) min: 1 max: 2035 x̄: 31.09 x̃: 8
    helped stats (rel) min: <.01% max: 40.95% x̄: 2.42% x̃: 0.86%
    HURT stats (abs)   min: 1 max: 20875 x̄: 27.38 x̃: 8
    HURT stats (rel)   min: <.01% max: 59.09% x̄: 2.49% x̃: 0.80%
    95% mean confidence interval for cycles value: -4.49 -2.07
    95% mean confidence interval for cycles %-change: -0.14% -0.04%
    Cycles are helped.

    total spills in shared programs: 23406 -> 23411 (0.02%)
    spills in affected programs: 3 -> 8 (166.67%)
    helped: 0
    HURT: 2

    total fills in shared programs: 34845 -> 34850 (0.01%)
    fills in affected programs: 3 -> 8 (166.67%)
    helped: 0
    HURT: 2

    LOST:   0
    GAINED: 0

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111566
Fixes: f4ef34f207 "intel/fs: Add an UNDEF instruction to avoid..."
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit acfa2340e6)
2019-09-18 09:09:54 -07:00
Dylan Baker
b0b63d9593 add patches to be ignored 2019-09-18 09:09:54 -07:00
Danylo Piliaiev
5eba28227c tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE
Translating TGSI_INTERPOLATE_COLOR as INTERP_MODE_SMOOTH made
it for drivers impossible to have flatshaded color inputs.

Translate it to INTERP_MODE_NONE which drivers interpret as
smooth or flat depending on flatshading state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111467

Fixes: 770faf54 ("tgsi_to_nir: Improve interpolation modes.")

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 175c32e9bd)
2019-09-18 09:09:54 -07:00
Dylan Baker
f9ef2b4375 meson: don't generate file into subdirs
This is unsupported by meson and may become a hard error in the future.

Fixes: 5adfc8602c
       ("lima/ppir: move sin/cos input scaling into NIR")
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
(cherry picked from commit 52cf2d05a7)
2019-09-18 09:09:54 -07:00
Kenneth Graunke
b85d10e14b gallium: Fix util_format_get_depth_only
This is a pipe format, not a boolean.

Fixes: 5849e0612c ("gallium/auxiliary: Add util_format_get_depth_only() helper.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit c6d40b5182)
2019-09-18 09:09:54 -07:00
Caio Marcelo de Oliveira Filho
7882268cc9 glsl/nir: Avoid overflow when setting max_uniform_location
Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned
max_uniform_location.  Those unmapped uniforms don't have to be
accounted at this point.

Fixes: 7a9e5cdfbb ("nir/linker: Add gl_nir_link_uniforms()")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 4f33f96c45)
2019-09-18 09:09:54 -07:00
Kenneth Graunke
4eabbc04f2 iris: Fix constant buffer sizes for non-UBOs
Since the system value refactor, we've accidentally only been setting
cbuf->buffer_size in the UBO case, and not in the uploaded-constants
case.  We use cbuf->buffer_size to fill out the SURFACE_STATE entry,
so it needs to be initialized in both cases.

Fixes: 3b6d787e40 ("iris: move sysvals to their own constant buffer")
(cherry picked from commit 077a1952cc)
2019-09-18 09:09:54 -07:00
Danylo Piliaiev
97792279e4 nir/loop_analyze: Treat do{}while(false) loops as 0 iterations
Loops like:

block block_0:
vec1 32 ssa_2 = load_const (0x00000020)
vec1 32 ssa_3 = load_const (0x00000001)
loop {
    vec1 32 ssa_7 = phi block_0: ssa_3, block_4: ssa_9
    vec1 1 ssa_8 = ige ssa_2, ssa_7
    if ssa_8 {
        break
    } else {
    }
    vec1 32 ssa_9 = iadd ssa_7, ssa_1
}

Were treated as having more than 1 iteration and after unrolling
produced wrong results, however such loop will exit during
the first iteration if not unrolled.

So we check if loop will actually loop.

Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit e71fc7f238)
2019-09-18 09:09:54 -07:00
Dylan Baker
3771534b2f Bump version for rc3 2019-09-11 09:05:32 -07:00
Marek Olšák
481d82b65b radeonsi/gfx10: fix wave occupancy computations
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit d95afd8b9e)
2019-09-10 09:51:57 -07:00
Marek Olšák
732950bf36 radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts
This fixes a crash.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 28adf0d00c)
2019-09-10 09:51:57 -07:00
Lepton Wu
637e02c1b1 virgl: Fix pipe_resource leaks under multi-sample.
Fixes: 900a80f9e4 ("virgl: virgl_transfer should own its virgl_resource")

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
(cherry picked from commit 263136fb5d)
2019-09-10 09:51:57 -07:00
Timur Kristóf
e4bb2ceb78 st/nine: Properly initialize GLSL types for NIR shaders.
NIR shaders use GLSL types (note: these live outside libglsl), and
nine needs to properly initialize these just like the other state
trackers. This fixes an assertion failure when TTN is used.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
2019-09-09 18:01:13 +00:00
Bas Nieuwenhuizen
d858388bfc Revert "ac/nir: Lower large indirect variables to scratch"
This reverts commit 74470baebb.

This change introduces some significant performance regressions. We
are fixing those on master, but the follow up work is large enough
not to backport to 19.2 .

Fixes: 74470baebb "ac/nir: Lower large indirect variables to scratch"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111576
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-09 17:32:38 +00:00
Jason Ekstrand
7635b74acf nir/dead_cf: Repair SSA if the pass makes progress
The dead_cf pass calls into the CF manipulation helpers which attempt to
keep NIR's SSA form sane.  However, when the only break is removed from
a loop, dominance gets messed up anyway because the CF SSA clean-up code
only looks at phis and doesn't consider the case of code becoming
unreachable.  One solution to this would be to put the loop into LCSSA
form before we modify any of its contents.  Another (and the approach
taken by this pass) is to just run the repair_ssa pass afterwards
because the CF manipulation helpers are smart enough to keep all the
use/def stuff sane; they just don't always preserve dominance
properties.

While we're here, we clean up some bogus indentation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit c832820ce9)
2019-09-09 09:14:48 -07:00
Jason Ekstrand
2012c4a75c nir/repair_ssa: Insert deref casts when needed
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 1005272a2b)
2019-09-09 09:14:43 -07:00
Jason Ekstrand
3f56fa8fe9 nir/repair_ssa: Repair dominance for unreachable blocks
NIR currently assumes that unreachable blocks are trivially dominated by
everything.  However, when considering well-formed SSA, there is no path
from any block to an unreachable block.  Therefore, we can break any
use-def chains where the use is in an unreachable block.  This removes
any dependencies on code created by uses in unreachable blocks and lets
DCE do a better job of cleaning it up.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit a3268599f3)
2019-09-09 09:14:38 -07:00
Jason Ekstrand
2c6e34ac93 nir: Add a block_is_unreachable helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit f81a2623d8)
2019-09-09 09:14:34 -07:00
Jason Ekstrand
23bc3a401d nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 517142252f)
2019-09-09 09:13:30 -07:00
Jason Ekstrand
8889cc1241 nir: Handle complex derefs in nir_split_array_vars
We already bail and don't split the vars but we were passing a NULL to
_mesa_hash_table_search which is not allowed.

Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 37cdb7fc44)
2019-09-09 09:13:23 -07:00
Eric Engestrom
89776bb48c drirc: override minImageCount=2 for gfxbench
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110765
Fixes: 4689e98fe8 ("vulkan/wsi: Set X11 minImageCount to 3.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 27339fe9a7)
2019-09-09 09:13:19 -07:00
Eric Engestrom
55a10da818 radv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5eb7d48b58)
2019-09-09 09:13:14 -07:00
Eric Engestrom
c9b1f07e55 amd: move adaptive sync to performance section, as it is defined in xmlpool
Fixes: 3844ed8d44 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2a ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 4ad99ee961)
2019-09-09 09:13:09 -07:00
Eric Engestrom
9d22b0dc90 anv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 037b5b567f)
2019-09-09 09:13:04 -07:00
Eric Engestrom
cb548d566d wsi: add minImageCount override
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a72cdd00ab)
2019-09-09 09:12:46 -07:00
Eric Engestrom
172cff6357 anv: add support for driconf
No option is supported yet, this is just the boilerplate.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 4dcb1fff19)
2019-09-09 09:12:41 -07:00
Jason Ekstrand
c1114994f3 anv: Bump maxComputeWorkgroupSize
Fixes: 9a129510f5 "anv: Bump maxComputeWorkgroupInvocations"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 3b1a7e5333)
2019-09-09 09:12:36 -07:00
Caio Marcelo de Oliveira Filho
1a99cfef28 nir/lower_explicit_io: Handle 1 bit loads and stores
Load a 32-bit value then convert to 1-bit.  Convert 1-bit to 32-bit
value, then Store it.

These cases started to appear when we changed Anvil to use derefs for
shared memory.

v2: Use `bit_size` in a couple of places we were missing.  (Jason)
    Reassign `value` instead of `src[0]`.  (Jason)

Fixes: 024a46a407 ("anv: use derefs for shared memory access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c0c55bd84f)
2019-09-06 08:33:07 -07:00
Jonathan Marek
f42a4300aa freedreno/a2xx: ir2: fix lowering of instructions after float lowering
Some instructions generated by int/bool float lowering need to be lowered
by opt_algebraic.

Fixes: 43dbd7d6

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3516a90ab4)
2019-09-06 08:30:40 -07:00
Sergii Romantsov
0e6ccd180c intel/dri: finish proper glthread
KWin was able to get NULL-context in the call
intelUnbindContext. But a call _mesa_glthread_finish
is not resistent to such case.
Case can be catched with steps:
	1. Create both glx and egl contexts
	2. Make glx as current
	3. Make egl as current
	4. Reset glx context
	5. Make egl as current

Solution adds proper finishing of glthread-context
(context will be taken from the requested dri-context
for unbinding, but not from the saved current context).

Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271
Fixes: dca36d5516 (i965: Implement threaded GL support)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1dce75c183)
2019-09-06 08:30:35 -07:00
Dave Stevenson
5e72987777 broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride.
Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride." for v3d.

Allows YUV buffers with a single buffer and plane offsets to be
passed in.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 873b092e91)
2019-09-05 15:16:51 -07:00
Connor Abbott
95145376d1 radv: Call nir_propagate_invariant()
Without this, invariant qualifiers don't do anything. Together with a
fix to the game, this fixes flickering in No Man's Sky.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3f5b541fc8)
2019-09-05 08:44:57 -07:00
Samuel Pitoiset
3ef013f0e9 nir: do not assume that the result of fexp2(a) is always an integral
It's only correct when 'a' is an integral greater or equal to 0.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493
Fixes: 5544b2cbbd ("nir/algebraic: Use value range analysis to eliminate useless unary ops")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 966a455bb9)
(conflicts resolved by Dylan Baker)
2019-09-04 16:25:49 -07:00
Dylan Baker
b84cbdfd4d nir: Add is_not_negative helper function
This was taken from 636da12433, and is
needed by the next patch.
2019-09-04 16:25:49 -07:00
Hal Gentz
f180b04d65 glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX.
When run in optirun, applications that linked to `libGLX.so` and then
proceeded to querying Mesa for extension strings caused a SEGV in Mesa.

`glXQueryExtensionsString` was calling a chain of functions that
eventually led to `__glXQueryServerString`. This function would call
`xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`.
The latter for some unknown reason returned `NULL`. Passing this `NULL`
to `xcb_glx_query_server_string_string_length` would cause a SEGV as the
function tried to dereference it.

The reason behind the function returning `NULL` is yet to be determined,
however, simply checking that the ptr is not `NULL` resolves this. A
similar check has been added to `__glXGetString` for completeness sake,
although not immediately necessary.

In addition to that, we stumbled into a similar problem in
`AllocAndFetchScreenConfigs` which tries to access the configs to free
them if `__glXQueryServerString` fails. This, of course, SEGVs, because the
configs are yet to have been allocated. Simply continuing past the configs
if their config ptrs are `NULL` resolves this. We also switch to `calloc`
to make sure that the config ptrs are `NULL` by default, and not some
uninitialized value.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 24b8a8cfe8 "glx: implement __glXGetString, hide __glXGetStringFromServer"
Fixes: cb3610e37c "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it "
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
(cherry picked from commit 1591d1fee5)
2019-09-04 16:15:49 -07:00
Kenneth Graunke
45b22fb873 iris: Report correct number of planes for planar images
We were only handling the modifiers case and not counting the number of
planes in actual planar images.

Fixes Piglit's ext_image_dma_buf_import-export.

Fixes: fc12fd05f5 ("iris: Implement pipe_screen::resource_get_param")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit f8887909c6)
2019-09-04 16:15:41 -07:00
Eric Engestrom
ffa1f33ec0 nir: fix memleak in error path
Fixes: 2cf59861a8 ("nir: Add partial redundancy elimination for compares")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7659c6197f)
2019-09-04 16:15:36 -07:00
Eric Engestrom
0bbf6eab52 freedreno/drm-shim: fix mem leak
Fixes: 494ecef6b4 ("freedreno: Add support for drm-shim.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit c4969b0a25)
2019-09-04 16:15:29 -07:00
Eric Engestrom
ef0ccde381 anv: fix format string in error message
Fixes: 9775894f10 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7abf65aedc)
2019-09-04 16:15:23 -07:00
Eric Engestrom
ba255cdd50 util/os_file: fix double-close()
Fixes: 955c63d364 ("util/os_file: resize buffer to what was actually needed")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 1667360f7d)
2019-09-04 16:14:52 -07:00
Eric Engestrom
1a4e7d7293 egl: fix deadlock in malloc error path
Fixes: cb0980e69a ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 43d470404c)
2019-09-04 16:14:41 -07:00
Eric Engestrom
1cfc906898 ttn: fix 64-bit shift on 32-bit 1
Fixes: 4d0b2c7aaa ("ttn: Update shader->info as we generate code.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 3afe9d798a)
2019-09-04 16:14:31 -07:00
Lionel Landwerlin
0a83226dc6 vulkan/overlay: bounce image back to present layout
Once we write the overlay to an image to be presented, we must not
forget to put it back into present layout.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 320b0f66c2)
2019-09-04 16:14:26 -07:00
Lionel Landwerlin
26cd407481 egl: fix platform selection
Add missing "device" platform

v2: Add the missing platform (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jean Hertel <jean.hertel@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111529
Fixes: d6edccee8d ("egl: add EGL_platform_device support")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 6775a52400)
2019-09-04 16:14:01 -07:00
Erik Faye-Lund
451ddeb429 gallium/auxiliary/indices: consistently apply start only to input
The majority of these only apply the start argument to the input, but a
few of them also does for the output-array. util_primconvert, the only
user of this argument expects this pass a non-zero start-argument does
not expect this to be applied to the output; if it is, it will write
outside of allocated memory, leading to VRAM corruption.

The reason this doesn't seem to have been noticed before, is that no
driver currently use util_primconvert to convert a primitive-type to
itself, which is the cases where this was broken. But for Zink, this
will no longer be true, because we need to eliminate the use of 8-bit
index-buffers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 28f3f8d413 ("gallium/auxiliary/indices: add start param")
Reviewed-by: Rob Clark <robdclark@chromium.org>
(cherry picked from commit 52af1427c6)
2019-09-04 16:13:57 -07:00
Vinson Lee
6ac1d9b46e travis: Fail build if any command in if statement fails.
Travis is checking the exit code of the entire if statement.

Fixes: 64ffc289be ("travis: add MacOS Scons build")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 029b07b2ad)
2019-09-04 16:13:52 -07:00
Vinson Lee
529db83e7b swr: Fix build with llvm-9.0 again.
Commit 6f7306c029 ("swr/rast: Refactor memory API between rasterizer
core and swr") unintentionally removed changes for llvm-9.0.

Fixes: 6f7306c029 ("swr/rast: Refactor memory API between rasterizer core and swr")
Fixes: 5dd9ad1570 ("swr/rasterizer: Better implementation of scatter")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
(cherry picked from commit 3664a6600e)
2019-09-04 16:13:45 -07:00
Dylan Baker
7f9b49218f bump version to 19.2-rc2 2019-09-04 14:22:26 -07:00
Pierre-Eric Pelloux-Prayer
efa4aee99d glsl: replace 'x + (-x)' with constant 0
This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:

   value -= value

with value being undefined.
In this case LLVM replace the operation with an assignment to NaN.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 47cc660d9c)
2019-09-04 11:57:07 -07:00
Thong Thai
ec74c76a1a Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu"
This reverts commit 5a2e65be89.

Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL
still needs to be emitted by the UMD, or else the driver will hang

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2a3a560407)
2019-09-04 11:57:03 -07:00
Ian Romanick
6934bc4f08 nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel
I discovered this while looking at a shader that was hurt by some other
work I'm doing.  When I examined the changes, I was confused that one
instance of a comparison that was used in a discard_if was (incorrectly)
eliminated, while another instance used by a bcsel was (correctly) not
eliminated.  I had to use NIR_PRINT=true to see exactly where things
when wrong.

A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and
Strike Suit Zero were impacted.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280659 -> 16281075 (<.01%)
instructions in affected programs: 21042 -> 21458 (1.98%)
helped: 0
HURT: 136
HURT stats (abs)   min: 1 max: 9 x̄: 3.06 x̃: 3
HURT stats (rel)   min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03%
95% mean confidence interval for instructions value: 2.93 3.19
95% mean confidence interval for instructions %-change: 2.08% 2.37%
Instructions are HURT.

total cycles in shared programs: 367168270 -> 367170313 (<.01%)
cycles in affected programs: 172020 -> 174063 (1.19%)
helped: 14
HURT: 111
helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9
helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79%
HURT stats (abs)   min: 2 max: 584 x̄: 21.08 x̃: 5
HURT stats (rel)   min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40%
95% mean confidence interval for cycles value: 5.41 27.28
95% mean confidence interval for cycles %-change: 0.64% 1.81%
Cycles are HURT.

(cherry picked from commit 7dba7df5e5)
2019-09-04 11:56:59 -07:00
Ian Romanick
6aa7a10370 nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)
Found by inspection.  I tried really, really hard to make a test case
that would trigger this problem, but I was unsuccesful.  It's very hard
to get an instruction to produce a ne_zero result without ne_zero
sources.  The most plausible way is using bcsel.  That proves
problematic because bcsel interprets its sources as integers, so it
cannot currently be used to "clean" values for floating point
instructions.

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
(cherry picked from commit 0b4782fccd)
2019-09-04 11:56:54 -07:00
Ian Romanick
97e44d6817 nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-fma-compare-zero.shader_test
    - fs-underflow-mul-compare-zero.shader_test

v2: Add back part of comment accidentally deleted.  Noticed by
Caio. Remove is_not_zero function as it is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: fa116ce357 ("nir/range-analysis: Range tracking for ffma and flrp")
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms** had similar results. (Ice Lake shown)
total instructions in shared programs: 16278465 -> 16279492 (<.01%)
instructions in affected programs: 16765 -> 17792 (6.13%)
helped: 0
HURT: 23
HURT stats (abs)   min: 7 max: 275 x̄: 44.65 x̃: 8
HURT stats (rel)   min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62%
95% mean confidence interval for instructions value: 9.57 79.74
95% mean confidence interval for instructions %-change: 1.85% 6.61%
Instructions are HURT.

total cycles in shared programs: 367135159 -> 367154270 (<.01%)
cycles in affected programs: 279306 -> 298417 (6.84%)
helped: 0
HURT: 23
HURT stats (abs)   min: 13 max: 6029 x̄: 830.91 x̃: 54
HURT stats (rel)   min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49%
95% mean confidence interval for cycles value: 100.89 1560.94
95% mean confidence interval for cycles %-change: 0.94% 13.71%
Cycles are HURT.

total spills in shared programs: 8870 -> 8869 (-0.01%)
spills in affected programs: 19 -> 18 (-5.26%)
helped: 1
HURT: 0

total fills in shared programs: 21904 -> 21901 (-0.01%)
fills in affected programs: 81 -> 78 (-3.70%)
helped: 1
HURT: 0

LOST:   0
GAINED: 1

** On Broadwell, a shader was hurt for spills / fills instead of
   helped.

No changes on any earlier platforms.

(cherry picked from commit ef2e235252)
2019-09-04 11:56:43 -07:00
Ian Romanick
a8525d7751 nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-exp2-compare-zero.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

Most of the shaders affected are, unsurprisingly, in Unigine Heaven.

All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16278207 -> 16278465 (<.01%)
instructions in affected programs: 11374 -> 11632 (2.27%)
helped: 0
HURT: 58
HURT stats (abs)   min: 2 max: 13 x̄: 4.45 x̃: 4
HURT stats (rel)   min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82%
95% mean confidence interval for instructions value: 3.77 5.13
95% mean confidence interval for instructions %-change: 2.19% 2.64%
Instructions are HURT.

total cycles in shared programs: 367134284 -> 367135159 (<.01%)
cycles in affected programs: 81207 -> 82082 (1.08%)
helped: 17
HURT: 36
helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6
helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78%
HURT stats (abs)   min: 4 max: 235 x̄: 66.97 x̃: 16
HURT stats (rel)   min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09%
95% mean confidence interval for cycles value: -20.36 53.38
95% mean confidence interval for cycles %-change: -1.08% 4.67%
Inconclusive result (value mean confidence interval includes 0).

No changes on any earlier platforms.

(cherry picked from commit 33ad2bab4b)
2019-09-04 11:56:39 -07:00
Ian Romanick
2971f079e1 nir/algebraic: Mark some value range analysis-based optimizations imprecise
This didn't fix bug #111308, but it was found will trying to find the
actual cause of that bug.

Fixes piglit tests (new in piglit!110):

    - fs-fract-of-NaN.shader_test
    - fs-lt-nan-tautology.shader_test
    - fs-ge-nan-tautology.shader_test

No shader-db changes on any Intel platform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: b77070e293 ("nir/algebraic: Use value range analysis to eliminate tautological compares")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit ccb236d1bc)
2019-09-04 11:56:34 -07:00
Kenneth Graunke
da03ddf677 iris: Fix partial fast clear checks to account for miplevel.
We enabled fast clears at level > 0, but didn't minify the dimensions
when comparing the box size, so we always thought it was a partial
clear and as a result never actually enabled any.

This eliminates some slow clears in Civilization VI, but they are mostly
during initialization and not the main rendering.

Thanks to Dan Walsh for noticing we had too many slow clears.

Fixes: 393f659ed8 ("iris: Enable fast clears on other miplevels and layers than 0.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 30b9ed92ea)
2019-09-04 11:56:30 -07:00
Ian Romanick
753ea83477 intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware
See the previous commit for the explanation of the Fixes tag.

Hurts 21 shaders in shader-db.  All of the hurt shaders are in Unreal
Engine 4 tech demos.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
(cherry picked from commit b418269d7d)
2019-09-04 11:56:12 -07:00
Ian Romanick
91fa24a686 nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled
This caused a problem on Sandybridge where an open-coded
bitfieldReverse() function could be optimized to a
nir_op_bitfield_reverse that would generate an unsupported BFREV
instruction in the backend.  This was encountered in some Unreal4 tech
demos in shader-db.  The bug was not previously noticed because we don't
actually try to run those demos on Sandybridge.

The fixes tag is a bit a lie.  The actual bug was introduced about
26,000 commits earlier in 371c4b3c48 ("nir: Recognize open-coded
bitfield_reverse.").  Without the NIR lowering pass, the flag needed to
avoid the optimization does not exist.  Hopefully nobody will care to
fix this on an earlier Mesa release.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
(cherry picked from commit d3fd1c761a)
2019-09-04 11:56:08 -07:00
Kenneth Graunke
c96de002b7 mesa: Fix _mesa_float_to_unorm() on 32-bit systems.
This fixes the following CTS test on 32-bit systems:
GTF-GL46.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_init

It does glGetTexImage of a 16-bit SNORM image, requesting 32-bit UNORM
data.  In get_tex_rgba_uncompressed, we round trip through float to
handle image transfer ops for clamping.  _mesa_format_convert does:

   _mesa_float_to_unorm(0.571428597f, 32)

which translated to:

   _mesa_lroundevenf(0.571428597f * 0xffffffffu)

which produced different results on 64-bit and 32-bit systems:

   64-bit: result = 0x92492500
   32-bit: result = 0x80000000

This is because the size of "long" varies between the two systems, and
0x92492500 is too large to fit in a signed 32-bit integer.  To fix this,
we switch to the new _mesa_i64roundevenf function which always does the
64-bit operation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104395
Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit e18cd5452a)
2019-09-04 11:56:04 -07:00
Kenneth Graunke
07ac4269a5 util: Add a _mesa_i64roundevenf() helper.
This always returns a int64_t, translating to _mesa_lroundevenf on
systems where long is 64-bit, and llrintf where "long long" is needed.

Fixes: 594fc0f859 ("mesa: Replace F_TO_I() with _mesa_lroundevenf().")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b59914e179)
2019-09-04 11:55:50 -07:00
Marek Olšák
c2aad5dc4d radeonsi: fix scratch buffer WAVESIZE setting leading to corruption
Cc: 19.2 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 360cf3c4b0)
2019-09-04 11:55:44 -07:00
Marek Olšák
952fd55015 radeonsi: unbind blend/DSA/rasterizer state correctly in delete functions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111414

Fixes: b758eed9c3 ("radeonsi: make sure that blend state != NULL and remove all NULL checking")

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit f95a28d361)
2019-09-04 11:55:40 -07:00
Dave Airlie
9433241cc9 gallivm: fix atomic compare-and-swap
Not sure how I missed this before, but compswap was hitting an
assert here as it is it's own special case.

Fixes: b5ac381d8f ("gallivm: add buffer operations to the tgsi->llvm conversion.")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 1eda49cc3d)
2019-09-04 11:55:35 -07:00
Paulo Zanoni
6e07e58ef6 intel/fs: grab fail_msg from v32 instead of v16 when v32->run_cs fails
Looks like a copy/paste error. This patch prevents a segfault when
running the following on BDW:

    INTEL_DEBUG=no8,no16,do32 ./deqp-vk -n \
        dEQP-VK.subgroups.arithmetic.compute.subgroupmin_dvec4

For the curious, the message we're getting is:

    CS compile failed: Failure to register allocate.  Reduce number
    of live scalar values to avoid this.

Fixes: 864737ce6c ("i965/fs: Build 32-wide compute shader when needed.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 848d5e444a)
2019-09-04 11:55:31 -07:00
Kenneth Graunke
7160c70f0f isl: Don't set UnormPathInColorPipe for integer surfaces.
This fixes dEQP-GLES3.functional.texture.specification subtests on iris:

- texsubimage3d_depth.depth24_stencil8_2d_array
- texsubimage3d_depth.depth32f_stencil8_2d_array
- texsubimage3d_depth.depth_component32f_2d_array
- texsubimage3d_depth.depth_component24_2d_array
- texstorage2d.format.depth24_stencil8_2d
- texstorage2d.format.depth32f_stencil8_2d
- texstorage2d.format.depth_component24_2d
- texstorage2d.format.depth_component32f_2d
- texstorage3d.format.depth24_stencil8_2d_array
- texstorage3d.format.depth32f_stencil8_2d_array
- texstorage3d.format.depth_component24_2d_array
- texstorage3d.format.depth_component32f_2d_array

Here, something appears to be going wrong with having this bit set
during blorp_copy operations for texture upload, which override the
format to R8G8B8A8_UINT.

AFAICT this bit should have no effect for integer surfaces, as it has
to do with blending, and integer blending is not a thing.  So it should
be harmless to disable it.

The Windows driver appears to be setting this bit universally, so
I am unclear why we would need to.  Perhaps they simply haven't run
into this issue.

Fixes: f741de236b ("isl: Enable Unorm Path in Color Pipe")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 2e1be771e4)
2019-09-04 11:55:22 -07:00
Kenneth Graunke
b871874de7 isl: Drop UnormPathInColorPipe for buffer surfaces.
Jason suggested I remove this in review, and he's right.  AFAICT this
affects blending, and that just isn't going to happen on buffers.

Fixes: f741de236b ("isl: Enable Unorm Path in Color Pipe")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1b090f065e)
2019-09-04 11:55:17 -07:00
Samuel Pitoiset
80514527e5 radv: fix getting the index type size for uint8_t
16-bit and 32-bit values match hardware values but 8-bit doesn't.

This fixes dEQP-VK.pipeline.input_assembly.* with 8-bit index.

Fixes: 372c3dcfdb ("radv: implement VK_EXT_index_type_uint8")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
(cherry picked from commit 89671ef205)
2019-09-04 11:55:09 -07:00
Dave Airlie
7d8eee2bdb virgl: fix format conversion for recent gallium changes.
The virgl formats are fixed in time snapshots of the gallium ones,
we just need to provide a translation table between them when
we enter the hardware.

This fixes a regression since Eric renumbered the gallium table.

Fixes: c45c33a5a2 (gallium: Remove manual defining of PIPE_FORMAT enum values.)
Bugzilla: https://bugs.freedesktop.org/111454

v1 by Dave Airlie <airlied@redhat.com>
v2: virgl: Add a number of formats to the table that are used, e.g. for vertex
    attributes
v3: cover some more missing formats from a piglit run

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit bba4d2f442)
2019-09-04 11:55:04 -07:00
Alex Smith
6ea07af9c1 radv: Change memory type order for GPUs without dedicated VRAM
Put the uncached GTT type at a higher index than the visible VRAM type,
rather than having GTT first.

When we don't have dedicated VRAM, we don't have a non-visible VRAM
type, and the property flags for GTT and visible VRAM are identical.
According to the spec, for types with identical flags, we should give
the one with better performance a lower index.

Previously, apps which follow the spec guidance for choosing a memory
type would have picked the GTT type in preference to visible VRAM (all
Feral games will do this), and end up with lower performance.

On a Ryzen 5 2500U laptop (Raven Ridge), this improves average FPS in
the Rise of the Tomb Raider benchmark by up to ~30%. Tested a couple of
other (Feral) games and saw similar improvement on those as well.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(Bas: CCing this to 19.2-rc due to high impact and limited complexity)
(cherry picked from commit fe0ec41c4d)
2019-09-04 11:55:00 -07:00
Rafael Antognolli
1ec895b4a7 anv: Only re-emit non-dynamic state that has changed.
On commit f6e7de41d7, we started emitting 3DSTATE_LINE_STIPPLE as part
of the non-dynamic state. That gets re-emitted every time we bind a new
VkPipeline. But that instruction is non-pipelined, and it caused a perf
regression of about 9-10% on Dota2.

This commit makes anv_dynamic_state_copy() return a mask with only the
state that has changed when copying it. 3DSTATE_LINE_STIPPLE won't be
emitted anymore unless it has changed, fixing the problem above.

v2: Improve commit message and add documentation about skipped checks
(Jason)

Fixes: f6e7de41d7 ("anv: Implement VK_EXT_line_rasterization")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 2b7ba9f239)
2019-09-04 11:54:49 -07:00
Lionel Landwerlin
7ff682a12c util: fix compilation on macos
timespec_get() is not available on macos, we need to pull in the
include/c11/threads_posix.h helper.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103674
Fixes: e2d761de03 ("util: drop final reference to p_compiler.h")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 9d3fc737af)
2019-09-04 11:54:45 -07:00
Andres Rodriguez
9fff4192bc radv: additional query fixes
Make sure we read the updated data from the gpu in cases where WAIT_BIT
is not set.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a410823b3e)
2019-09-04 11:54:34 -07:00
Kenneth Graunke
5c1362581a iris: Fix large timeout handling in rel2abs()
...by copying the implementation of anv_get_absolute_timeout().

Appears to fix a CTS test with 32-bit builds:
GTF-GL46.gtf32.GL3Tests.sync.sync_functionality_clientwaitsync_flush

Fixes: f459c56be6 ("iris: Add fence support using drm_syncobj")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 7ee7b0ecbc)
2019-09-04 11:54:29 -07:00
Samuel Pitoiset
649040ed8d radv/gfx10: do not use NGG with NAVI14
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a4e6e59db8)
2019-09-04 11:54:25 -07:00
Samuel Pitoiset
7200ed1399 radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0
Only gfx9 and older use it to get InstanceID in VGPR1.
Ported from RadeonSI.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 0813c27d8d)
2019-09-04 11:54:20 -07:00
Tapani Pälli
966a2bdc99 egl: reset blob cache set/get functions on terminate
Fixes errors seen with eglSetBlobCacheFuncsANDROID on Android when
running dEQP that terminates and reinitializes a display.

Fixes: 6f5b57093b "egl: add support for EGL_ANDROID_blob_cache"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 3e03a3fc53)
2019-09-04 11:54:16 -07:00
Kenneth Graunke
dff3ab5c04 iris: Avoid unnecessary resolves on transfer maps
We were always resolving the buffer as if we were accessing it via
CPU maps, which don't understand any auxiliary surfaces.  But we often
copy to a temporary using BLORP, which understands compression just
fine.  So we can avoid the resolve, and accelerate the copy as well.

Fixes: 9d1334d2a0 ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 2d79925034)
2019-09-04 11:54:10 -07:00
Kenneth Graunke
d78f39eba0 iris: Drop copy format hacks from copy region based transfer path.
This doesn't work for compressed formats, as the source texture and
temporary texture would have different block sizes.  (Forcing the driver
to always take the GPU path would expose the bug.)  Instead, just use
the source format for the temporary, and let blorp_copy deal with
overrides.

The one case where we can't do this is ASTC, because isl won't let us
create a linear ASTC surface.  Fall back to the CPU paths there for now.

Fixes: 9d1334d2a0 ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 136629a1e3)
2019-09-04 11:54:05 -07:00
Kenneth Graunke
1be5f26cfb iris: Update fast clear colors on Gen9 with direct immediate writes.
Gen11 stores the fast clear color in an "indirect clear buffer", as
a packed pixel value.  Gen9 hardware stores it as a float or integer
value, which is interpreted via the format.  We were trying to store
that in a buffer, for similarity with Icelake, and MI_COPY_MEM_MEM
it from there to the actual SURFACE_STATE bytes where it's stored.

This unfortunately doesn't work for blorp_copy(), which does bit-for-bit
copies, and overrides the format to a CCS-compatible UINT format.  This
causes the clear color to be interpreted in the overridden format.

Normally, we provide the clear color on the CPU, and blorp_blit.c:2611
converts it to a packed pixel value in the original format, then unpacks
it in the overridden format, so the clear color we use expands to the
bits we originally desired.

However, BLORP doesn't support this pack/unpack with an indirect clear
buffer, as it would need to do the math on the GPU.  On Gen11+, it isn't
necessary, as the hardware does the right thing.

This patch changes Gen9 to stop using an indirect clear buffer and
simply do PIPE_CONTROLs with post-sync write immediate operations
to store the new color over the surface states for regular drawing.
BLORP continues streaming out surface states, and handles fast clear
colors on the CPU.

Fixes: 53c484ba8a ("iris: blorp using resolve hooks")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 1cd13ccee7)
2019-09-04 11:52:53 -07:00
Kenneth Graunke
14588c0727 iris: Fix broken aux.possible/sampler_usages bitmask handling
For renderable surfaces, we allocate SURFACE_STATEs for each bit in
res->aux.possible_usages.  Sampler views use res->aux.sampler_usages.

When pinning buffers, we call surf_state_offset_for_aux() to calculate
the offset to the desired surface state.  surf_state_offset_for_aux()
took an aux_modes parameter, which should be one of those two fields.
However...it was not using that parameter.  It always used the broader
res->aux.possible_usages field directly.

One of the callers, update_clear_value(), was passing incorrect masks
for this parameter.  It iterated through the bits in order, using
u_bit_scan(), which destructively modifies the mask.  So each time we
called it, the count of bits before our selected mode was 0, which would
cause us to always update the SURFACE_STATE for ISL_AUX_USAGE_NONE,
rather than updating each in turn.  This was hidden by the earlier bug
where surf_state_offset_for_aux() ignored the parameter.

Fixes: 7339660e80 ("iris: Add aux.sampler_usages.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 117a0368b0)
2019-09-04 11:52:46 -07:00
Kenneth Graunke
973d58e9b3 iris: Replace devinfo->gen with GEN_GEN
This is genxml, we can compile out this code.

Fixes: 2660667284 ("iris/gen8: Re-emit the SURFACE_STATE if the clear color changed.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit f6c44549ee)
2019-09-04 11:52:41 -07:00
Alyssa Rosenzweig
58acce6dd9 pan/midgard: Fix writeout combining
shader-db regression in the scheduler.

Fixes: dff4986b1a ("pan/midgard: Emit store_output branch just-in-time")

total bundles in shared programs: 2055 -> 2019 (-1.75%)
bundles in affected programs: 1055 -> 1019 (-3.41%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 20.00% x̄: 6.71% x̃: 5.16%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -8.45% -4.97%
Bundles are helped.

total quadwords in shared programs: 3444 -> 3408 (-1.05%)
quadwords in affected programs: 1897 -> 1861 (-1.90%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 14.29% x̄: 3.97% x̃: 2.99%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -5.08% -2.86%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
(cherry picked from commit 272ce6f5a7)
2019-09-04 11:52:36 -07:00
Bas Nieuwenhuizen
bd0300f8ef radv: Disable NGG for geometry shaders.
A bunch of remaining issues including some that affect users.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111248
Fixes: ee21bd7440 "radv/gfx10: implement NGG support (VS only)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit c037fe5ad1)
2019-09-04 11:52:30 -07:00
Lionel Landwerlin
4385e6cf02 util/timespec: use unsigned 64 bit integers for nsec values
We added this utility for vulkan where all timeouts are given as
uint64_t values. We can switch from signed to unsigned as this is the
only user and if we ever deal with signed integers somewhere else
we'll have to be careful to use the corresponding
timespec_(add|sub)_msec and always pass absolute values.

v2: Forgot to drop the test calling add_nsec() with a negative number

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Fixes: d2d70c3bb5 ("util: add a timespec helper")
Acked-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 5833f43305)
2019-09-04 11:52:26 -07:00
Tapani Pälli
18511e3f5b iris/android: fix build and link with libmesa_intel_perf
Fixes: 0fd4359733 "iris/perf: implement routines to return counter info"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 728ebcdec2)
2019-09-04 11:52:09 -07:00
Samuel Pitoiset
7c615873e5 ac: fix exclusive scans on GFX8-GFX9
This fixes a regression introduced with scan&reduce operations
on GFX10. Note that some subgroups CTS still fail on GFX10 but
I assume it's a different issue.

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.

Fixes: 227c29a80d "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 2d9f401a83)
2019-09-04 11:52:02 -07:00
Tapani Pälli
6af303f6fc util: fix os_create_anonymous_file on android
Commit fixes current crashes with Vulkan applications on Android.

Fixes: c0376a1234 "util: add anon_file.h for all memfd/temp file usage"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit ce8fd042a5)
2019-09-04 11:51:55 -07:00
Kenneth Graunke
844fbc5c42 gallium/noop: Implement resource_get_param
v2: Pass through to oscreen rather than faking it (review from Marek).

Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit bc844d92ce)
2019-09-04 11:51:50 -07:00
Kenneth Graunke
813ed8629e gallium/rbug: Wrap resource_get_param if available
Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f02d1a0b75)
2019-09-04 11:51:44 -07:00
Kenneth Graunke
6e6f137a4e gallium/trace: Wrap resource_get_param if available
Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c43a44791b)
2019-09-04 11:51:39 -07:00
Kenneth Graunke
07760c1c9e gallium/ddebug: Wrap resource_get_param if available
Fixes: 0346b70083 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 0e6b573ae5)
2019-09-04 11:51:31 -07:00
Jose Maria Casanova Crespo
0504bff354 mesa: recover target_check before get_current_tex_objects
At compressed_tex_sub_image we only can obtain the tex_object after
compressed_subtexture_target_check is validated for TEX_MODE_CURRENT.
So if the target is wrong the error is raised to the user.

This completes the fix for the regression introduced on "mesa: refactor
compressed_tex_sub_image function" of the pending failing tests:

dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d
dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.compressedtexsubimage3d

v2: Fix warning that texObj might be used uninitialized (Gert Wollny)

Fixes: 7df233d68d ("mesa: refactor compressed_tex_sub_image function")
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
(cherry picked from commit 74a7e3ed3b)
2019-09-04 11:51:25 -07:00
Samuel Pitoiset
637a9cbd3b radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood
This gives a nice boost, +20% at this time on my Vega 56. Shader
ballot should be enabled by default at some point but it reduces
performance a bit (-6%) with Wolfeinstein II. Enable it only for
Youngblood at the moment, like what we did for Talos in the past.

As a bonus point, it gets rid of some minor artifacts that only
happens when ballot is disabled for some reasons.

Cc: 19.2 <mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a6ad9e8ccf)
2019-09-04 11:51:15 -07:00
Samuel Pitoiset
690f050608 radv: add a new debug option called RADV_DEBUG=noshaderballot
Shader ballot will be enabled by default for Wolfenstein
Youngblood. This follows what we did for sisched.

Cc: 19.2 <mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit f202ac27a9)
2019-09-04 11:50:59 -07:00
Samuel Pitoiset
3ab1368c4f radv: allow to enable VK_AMD_shader_ballot only on GFX8+
Scans aren't implemented on SI/CIK.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit e73d863a66)
2019-09-04 11:50:53 -07:00
Danylo Piliaiev
71daf2ef67 nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll
Without loop_prepare_for_unroll loops are losing phis.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411
Fixes: 5db98195 "nir: add loop unroll support for wrapper loops"
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 84b3ef6a96)
2019-09-04 11:50:48 -07:00
Bas Nieuwenhuizen
614def1a89 radv: Emit VGT_GS_ONCHIP_CNTL for tess on GFX10.
Otherwise hangs are possible. This register was already set for
GS and NGG.

Fixes: 5eaed7ecfc "radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit e04761d0f9)
2019-09-04 11:50:42 -07:00
Bas Nieuwenhuizen
55334521f7 radv: Use correct vgpr_comp_cnt for VS if both prim_id and instance_id are needed.
Should take the max of the 2.

Fixes: ea337c8b7e "radv/gfx10: fix VS input VGPRs with the legacy path"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2e763f7c87)
2019-09-04 11:50:37 -07:00
Ilia Mirkin
8ee40f6b63 gallium/vl: use compute preference for all multimedia, not just blit
The compute paths in vl are a bit AMD-specific. For example, they (on
nouveau), try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader render)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 958390a9bf)
2019-09-04 11:50:32 -07:00
Marek Olšák
25de459644 radeonsi: consolidate determining VGPR_COMP_CNT for API VS 2019-08-27 16:10:40 -04:00
Marek Olšák
5d7754017c radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags
We need two different values of the register, one for NGG and one for
legacy, in order to fix edge flags for the legacy pipeline.

Passing the ngg flag to emit_clip_regs would be too complicated,
so CONTEXT_REG_RMW is used for partial register updates.
2019-08-27 16:10:40 -04:00
Marek Olšák
d23bf14d44 radeonsi/gfx10: remove incorrect ngg/pos_writes_edgeflag variables
It varies depending on si_shader_key::as_ngg.
2019-08-27 16:10:40 -04:00
Marek Olšák
514eb1587e radeonsi: add PKT3_CONTEXT_REG_RMW 2019-08-27 16:10:40 -04:00
Marek Olšák
a935da7cef winsys/amdgpu+radeon: process AMD_DEBUG in addition to R600_DEBUG 2019-08-27 16:10:40 -04:00
Marek Olšák
b9330a6189 radeonsi/gfx10: add AMD_DEBUG=nongg 2019-08-27 16:10:40 -04:00
Marek Olšák
0207c318e0 radeonsi/gfx10: finish up Navi14, add PCI ID 2019-08-27 16:10:40 -04:00
Marek Olšák
e09d469622 radeonsi/gfx10: always use the legacy pipeline for streamout
The best way to prevent GDS hangs is not to use GDS.
2019-08-27 16:10:40 -04:00
Marek Olšák
f208b04dba radeonsi/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0
Only gfx9 and older use it to get InstanceID in VGPR1.
2019-08-27 16:10:40 -04:00
Marek Olšák
c0716446a4 radeonsi/gfx10: fix InstanceID for legacy VS+GS 2019-08-27 16:10:40 -04:00
Marek Olšák
4d3097f36a radeonsi/gfx10: add as_ngg variant for VS as ES to select Wave32/64
Legacy GS only works with Wave64.
2019-08-27 16:10:40 -04:00
Marek Olšák
a3a266807e radeonsi/gfx10: create the GS copy shader if using legacy streamout 2019-08-27 16:10:40 -04:00
Marek Olšák
beea2dee8a radeonsi/gfx10: fix the PRIMITIVES_GENERATED query if using legacy streamout 2019-08-27 16:10:40 -04:00
Marek Olšák
78c603ebf5 radeonsi/gfx10: fix tessellation for the legacy pipeline
ported from PAL
2019-08-27 16:10:40 -04:00
Marek Olšák
3dec21a8aa radeonsi: move some global shader cache flags to per-binary flags 2019-08-27 16:10:40 -04:00
Marek Olšák
6e07ac3343 radeonsi/gfx10: fix the legacy pipeline by storing as_ngg in the shader cache
It could load an NGG shader when we want a legacy shader and vice versa.
2019-08-27 16:10:40 -04:00
Emil Velikov
c0b9399d9d Update version to 19.2.0-rc1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2019-08-20 23:18:37 +01:00
Erico Nunes
71fb721ca5 lima/ppir: use ra_get_best_spill_node to select spill node
ra_get_best_spill_node is what other users of the mesa register
allocator use.
Switching to it now also fixes an infinite loop issue with ppir regalloc
with the ppir control flow patchset, and also provides a small gain over
the previous herusitic on number of spilled nodes testing with
shader-db.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-20 21:16:02 +00:00
Eric Anholt
c1dc84e71d tgsi: Remove unused tgsi_check_soa_dependencies().
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-08-20 13:31:13 -07:00
Eric Anholt
4ebe6b2e72 tgsi: Drop the SSE2 constants setup that's been dead code since 2011.
The SSE2 executor was removed in 4eb3225b38 ("Remove tgsi_sse2.")

Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-08-20 13:31:13 -07:00
Eric Anholt
98c58355d3 tgsi: drop a stale comment
This was fixed in 912ed84f83 ("tgsi: move to using vector for system
values.")

Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-08-20 13:31:13 -07:00
Eric Anholt
553cd82d64 gitlab-ci: Enable the GLES2/3 CTS on softpipe.
The GLES2 CTS takes about 8 minutes of total runtime (at parallel 4 is
~2 minutes in the test stage if runners are free), while GLES3 takes
about 25.  Since the GLES3 run is pretty expensive, just do a cheap
touch test of 1 out of every 10 tests in the test list on MRs, until
we can get the runtime down.

v2: Drop the full run for now until we can bring runtime down or bring
    up a dedicated mesa runner.

Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v1)
Reviewed-By: Gert Wollny <gert.wollny@collabora.com> (v1)
2019-08-20 13:31:13 -07:00
Jose Maria Casanova Crespo
6c904773fe mesa: reverse no_error on compressed_tex_sub_image for TEX_MODE_CURRENT
This fixes the regression introduced on "mesa: refactor
compressed_tex_sub_image function" that started to crash
KHR-GLES2.texture_3d.compressed_texture.negative_compressed_tex_sub_image

Fixes: 7df233d68d ("mesa: refactor compressed_tex_sub_image function")
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-20 20:45:21 +01:00
Adam Jackson
b283919398 glx: Eliminate glx_config::{rgb,float,colorIndex}Mode
These are redundant with glx_config::renderType, let's just use that
consistently.
2019-08-20 14:05:07 -04:00
Adam Jackson
74ca87e4bc glx: Remove unused glx_config::pixmapMode
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-20 14:05:03 -04:00
Adam Jackson
35fc7bdf0e glx: convert glx_config_create_list to one big calloc
Simpler, less failure prone, less malloc overhead, what's not to like.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-20 14:05:01 -04:00
Adam Jackson
97d58eabcc glx: convert a malloc+memset to calloc
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-20 14:04:59 -04:00
Adam Jackson
cabd09c9e7 glx: Fix parameter documentation of glx_config_create_list
'minimum_size' is not, in fact, an argument to this function.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-20 14:04:56 -04:00
Arcady Goldmints-Orlov
3835535537 anv: inline uniforms blocks don't count toward descriptor set limits
In a descriptor set inline uniform blocks don't use up any bindings.
However, the presence of any inline uniform blocks doed require the
use of the descriptor buffer, which takes up one binding.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 16:48:45 +00:00
Daniel Schürmann
df86c5ffb3 nir: add divergence analysis pass.
This pass expects the shader to be in LCSSA form.
The algorithm is based on 'The Simple Divergence Analysis' from
Diogo Sampaio, Rafael De Souza, Sylvain Collange, Fernando Magno Quintão Pereira.
Divergence Analysis. ACM Transactions on Programming Languages and Systems (TOPLAS)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-20 17:40:13 +02:00
Rhys Perry
7b07034931 nir/subgroups: Lower clustered reductions with cluster_size >= subgroup_size into reductions
The behavior for reductions with cluster_size >= subgroup_size is implementation defined.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 17:40:10 +02:00
Rhys Perry
911a1dfad2 nir/lcssa: allow to create LCSSA phis for loop-invariant booleans
ACO depends on LCSSA phis for divergent booleans to work correctly.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 17:40:05 +02:00
Daniel Schürmann
9c40ad49d5 nir/lcssa: Skip loop invariant variables when converting to LCSSA.
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 17:40:01 +02:00
Rhys Perry
8a6cfaa15a nir: make nir_to_lcssa() a general NIR pass.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 17:39:54 +02:00
Daniel Schürmann
204846ad06 nir/lcssa: handle deref instructions properly
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 414148cdc1 "nir: Support deref instructions in loop_analyze"
2019-08-20 17:39:52 +02:00
Jose Maria Casanova Crespo
7c56a68c8b tgsi_to_nir: only update TGSI properties of the current shader stage
The implementation introduced in "tgsi_to_nir: be careful about not
losing any TGSI properties silently (v2)" updates all the TGSI properties,
but it didn't take into account that the shader_info structure uses a union
to store the different attributes for each shader stage.

Now we only update the attributes if they affect current shader stage,
avoiding to overwrite members of the union that should be overwritten.
This has created hundreds of regressions in v3d.

For example the TGSI_PROPERTY_VS_BLIT_SGPRS_AMD was overwritting the
same position used by TGSI_PROPERY_CS_FIXED_BLOCK_DEPTH.

Fixes: e300365197 ("tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-20 10:30:21 +00:00
Samuel Pitoiset
83a63a5b12 radv/gfx10: do not emit PA_SC_TILE_STEERING_OVERRIDE twice
CLEAR_STATE emits it for us.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-20 12:13:44 +02:00
Samuel Pitoiset
2ca8629fa9 radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+
It's emitted by the kernel.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-20 12:13:41 +02:00
Gert Wollny
6a09405368 mesa/program: Take ARB_framebuffers_no_attachments into account in wpos correction
If a drawbuffer is an fbo without an attachment then its 'Height' will be zero,
and we have to take its 'DefaultGeometry.Height' into account.

Fixes on softpipe (with the exception of tests that use multisample):
  dEQP-GLES31.functional.fbo.no_attachments.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-20 10:04:24 +02:00
Sagar Ghuge
fe0e9db797 iris: Enable non coherent framebuffer fetch on broadwell
v2: Use GEN_GEN in iris_state (Kenneth Graunke)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:58 -07:00
Sagar Ghuge
57ce422e20 iris: Free resource if failed to allocate surface state
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:55 -07:00
Sagar Ghuge
02244bc515 iris: Pass isl_surf to fill_surface_state
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:45 -07:00
Sagar Ghuge
638a157e02 iris: Add infrastructure to support non coherent framebuffer fetch
Create separate SURFACE_STATE for render target read in order to support
non coherent framebuffer fetch on broadwell.

Also we need to resolve framebuffer in order to support CCS_D.

v2: Add outputs_read check (Kenneth Graunke)

v3: 1) Import Curro's comment from get_isl_surf
    2) Rename get_isl_surf method
    3) Clean up allocation in case of failure

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:44 -07:00
Sagar Ghuge
61c0637afb iris: Add helper functions to get tile offset
All helper functions are ported from i965 driver.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:43 -07:00
Sagar Ghuge
7e816991cc iris: Add helper function to get isl dim layout
v2: Add missing space (Caio)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:41 -07:00
Sagar Ghuge
58471e20d2 iris: Add render target read entry in binding table
This will be used in next patches for supporting non coherent
framebuffer fetch on Broadwell.

v2: Fix comment (Kenneth Graunke)

v3: 1) Fix a few nits (Caio)
    2) Add comment (Caio)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:31 -07:00
Kai Wasserbäch
1abe87383e build: Bump C++ standard requirement to C++14 to fix FTBFS with LLVM 10
When building Mesa against a recent LLVM 10 with C++11, the build fails
if the AMD common code is built as well due to "std::index_sequence"
being undeclared.

LLVM requires a minimum of C++14.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-20 05:39:19 +00:00
Rob Herring
d0ec5d38f6 panfrost: Add madvise support to BO cache
The kernel now supports madvise ioctl to indicate which BOs can be freed
when there is memory pressure. Mark BOs purgeable when they are in the
BO cache. The BOs must also be munmapped when they are in the cache or
they cannot be purged.

We could optimize avoiding the madvise ioctl on older kernels once the
driver version bump lands, but probably not worth it given the other
driver features also being added.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2019-08-19 19:33:20 -05:00
Rob Herring
c45c2d7960 panfrost: Sync UAPI header from kernel
Sync the panfrost_drm.h UAPI header with the latest from the kernel.
This adds madvise ioctl and GPU feature params.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2019-08-19 19:33:20 -05:00
Pierre-Eric Pelloux-Prayer
0f07d18e48 mesa: add ext_dsa GetMultiTexLevelParameterEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-19 18:50:08 -04:00
Pierre-Eric Pelloux-Prayer
e8c5dc9c24 mesa: add EXT_dsa glCompressedMultiTex* functions display list support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-19 18:50:07 -04:00
Pierre-Eric Pelloux-Prayer
1cb8e12717 mesa: add EXT_dsa glCompressedMultiTex* functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-19 18:50:05 -04:00
Pierre-Eric Pelloux-Prayer
a886025ef5 mesa: add EXT_dsa glCompressedTex* functions display list support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-19 18:50:03 -04:00
Pierre-Eric Pelloux-Prayer
8c76221886 mesa: add EXT_dsa glCompressedTexture(Sub)Image1D/2D/3D functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-19 18:49:57 -04:00
Pierre-Eric Pelloux-Prayer
7df233d68d mesa: refactor compressed_tex_sub_image function
Combine compressed_tex_sub_image, compressed_tex_sub_image_error and
compressed_tex_sub_image_no_error in a single function.

The added "enum tex_mode mode" parameter allows to implement the
DSA / non-DSA variants and their error/no_error combination.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-19 18:49:43 -04:00
Bas Nieuwenhuizen
6c5d983865 radv: Add Renoir support.
Took the freedom to enable dfsm even though I don't have benchmark
results yet, but it seems Raven-like.

Rest is from radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-19 22:34:11 +00:00
Marek Olšák
223b3174bd radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it
This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.

This solution is better, because the IR isn't dependent on wave32.
2019-08-19 17:23:38 -04:00
Marek Olšák
5d37194d43 radeonsi: remove the unsafemath debug option
unlikely to be used in the future

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
5586411de4 radeonsi/nir: fix counting shader inputs & outputs 2019-08-19 17:23:38 -04:00
Marek Olšák
452cb7055f radeonsi/nir: fix assertion in si_nir_load_sampler_desc 2019-08-19 17:23:38 -04:00
Marek Olšák
1f8a661748 radeonsi: clean up si_llvm_context_set_tgsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
43f8b5642b radeonsi: allocate and resize global_buffers as needed
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
c315cb509d radeonsi/gfx10: don't set PA_SC_TILE_STEERING_OVERRIDE if CLEAR_STATE sets it
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
5a2e65be89 radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
8d0d753bd0 radeonsi: fix an assertion failure: assert(!res->b.is_shared)
This only appears to happen on Raven2.

Possible way to reproduce:

resource_get_handle(WINSYS_HANDLE_TYPE_KMS) --> sets is_shared = true
resource_get_handle(WINSYS_HANDLE_TYPE_DMABUF) --> fail

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
2019-08-19 17:23:38 -04:00
Marek Olšák
bdcbac9459 radeonsi: handle the use_ngg_streamout flag in si_update_ngg 2019-08-19 17:23:38 -04:00
Marek Olšák
a6b3ca1c70 radeonsi: move the tess factor ring size assertion to a place where it matters
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
21217efdfe ac/nir: set image=true when loading FMASK for images
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Christian Gmeiner
f52b9218ff etnaviv: rs: add support for 64bpp clears
Starting with HALTI2 the RS supports 64bpp clears.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-19 22:36:45 +02:00
Christian GMEINER
7492685b1b etnaviv: update headers from rnndb
Update to etna_viv commit c51353e.

Signed-off-by: Christian GMEINER <christian.GMEINER@bachmann.info>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-19 22:36:45 +02:00
Eric Anholt
1395503424 swrast: Make the fetch funcs table sparse.
This shrinks the table, avoids needing to update the table with NULL
entries on every MESA_FORMAT addition, and removes a surprising,
non-unit-tested format number ordering dependency.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:48:03 -07:00
Eric Anholt
c45c33a5a2 gallium: Remove manual defining of PIPE_FORMAT enum values.
Now that SVGA doesn't have a table that has to be in PIPE_FORMAT
order, we can let the enums have whatever values they naturally would
without worrying about holes.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:48:01 -07:00
Eric Anholt
84db6ba740 svga: Drop unsupported formats from the format table.
Now that we're using the array initializers, we don't need to manually
fill out all these stub entries.

Produced with "sed -i '/.*INVALID.*INVALID.*INVALID/d'
src/gallium/drivers/svga/svga_format.c"

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:43:02 -07:00
Eric Anholt
ef37da52c0 svga: Remove duplication in the format table.
By using the [ ] = {} array initializer syntax, we no longer need the
entries to be listed in PIPE_FORMAT_* value order.  This means that
people adding new gallium formats don't need to cargo-cult changes to
this driver or regress that non-unit-tested requirement.

While I'm here, drop the lines for formats that no longer exist (the
numbered ones in the table).

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:42:55 -07:00
Eric Anholt
42efa789b5 svga: Factor out the format conversion table entry lookup.
Seemed like a sensible cleanup, while I was looking at whether I could
make the table sparse.

To make the svga table not require fixups on every new gallium format,
we may want to change how it's populated.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:42:36 -07:00
Jason Ekstrand
5167e94f23 nir: Add more source types to nir_tex_instr_src_type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 17:03:34 +00:00
Alyssa Rosenzweig
2bb4dc4054 pan/midgard: Compute liveness per-block
Rather than using a regalloc based on live internals, computed hastily
with repeated invocations of a forward-analysis pass, we switch to
compute liveness information on a per-block basis.

Within a given basic block, we compute liveness backwards with a
linear-time algorithm; for common shaders, this may help RA terminate
quicker.

Across blocks, we use a work list (really a work set) and check if we're
making progress. This isn't terribly efficient, but it gets the job
done. Point is, we get the live_in/live_out for each block.

From there, it's simple to rerun the linear-time update algorithm to
compute the interference graph.

The benefit of this technique is the ability to ignore "gaps" in
liveness across intermediate blocks that are never executed. On simple
shaders like the loops in glmark, this results in a minor reduction in
register pressure. The motivation was a complex shader in Krita that
failed register allocation due to an unfortunate interaction between
texture pipeline registers and control flow. This shader now compiles
successfully.

total instructions in shared programs: 3439 -> 3438 (-0.03%)
instructions in affected programs: 22 -> 21 (-4.55%)
helped: 1
HURT: 0

total bundles in shared programs: 2077 -> 2076 (-0.05%)
bundles in affected programs: 12 -> 11 (-8.33%)
helped: 1
HURT: 0

total quadwords in shared programs: 3457 -> 3456 (-0.03%)
quadwords in affected programs: 20 -> 19 (-5.00%)
helped: 1
HURT: 0

total registers in shared programs: 341 -> 338 (-0.88%)
registers in affected programs: 9 -> 6 (-33.33%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
24c91bb54b pan/midgard: Analyze load/store for swizzle propagation
If there's a nontrivial swizzle fed into an extra (shortened) argument,
we bail on copyprop. No glmark changes (since it doesn't use fancy
texturing/loads).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
9ae4d3653e pan/midgard: Treat cubemaps "stores" as loads
It's always been ambiguous which they are, but their primary register is
their output, not their input; therefore, they are loads.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
20dd482668 pan/midgard: Clamp cubemap swizzle to XYXX
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
2788721cc4 pan/midgard: Clamp st_vary swizzle by number of components
Same issue with liveness analysis. If we store out a vec3, we should not
reference the .w component.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
edc8e41566 pan/midgard: Use type-appropriate swizzle for texture coordinate
The texture coordinate for a 2D texture could be a vec2 or a vec3,
depending if it's an array texture or not. If it's vec2 (non-array
texture), we should not reference the z component; otherwise, liveness
analysis will get very confused when z is never written.

v2: Fix typo (Ilia).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
2bcb3d9226 pan/midgard: Set mask for lowered read-hazard moves
If we need to lower a move for a read from a vec2 texture coordinate, we
shouldn't write zw, even incidentally.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
739e09c297 pan/midgard: Fix texw lowering with complex control flow
Fixes shaders with control flow like:

   out = 0;

   if (A) {
      if (B)
         out = texture(A, ...)
   } else {
      out = texture(B, ...)
   }

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
6f1c8c148d pan/midgard: Add mir_rewrite_index_dst_single helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
d68019ad1f pan/midgard: Print predecessors in MIR
Just as a sanity check.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
e3a418fe86 pan/midgard: Index blocks for printing
Better than having pointers flying about.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
2f92479ffc pan/midgard: Add mir_foreach_src
This is repeated often enough.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
84580c6dbc pan/midgard: Add mir_foreach_instr_in_block_rev
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
c8c4471a92 pan/midgard: Add mir_foreach_successor helper
Now we should be able to walk the control-flow graph naturally.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
b8e526c520 pan/midgard: Add mir_foreach_predecessor utility
It's ugly, but c'est la vie.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
b4b2e111f8 pan/midgard: Link exit block
The exit block has been 'dangling' in the successors graph, so let's
ensure it's linked in.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
07c960cac0 pan/midgard: Add mir_exit_block helper
The exit block is gauranteed to be empty, signaling the end of the
program.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
aeeeef1242 pan/midgard: Maintain block predecessor set
While we already compute the successors array, for backwards data flow
analysis, it is useful to walk the control flow graph backwards based on
predecessors, so let's compute that information as well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
4fa09329c1 pan/midgard: Use ralloc on ctx/blocks
This will allow us to get some level of automatic memory management.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig
b59b1793b8 pan/midgard: Shrink successors[] to 2 length
A block can't have more.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Roman Stratiienko
fdd6151612 nir: Add missing dependency in Android.nir.gen.mk
Fixes incremental build with Android

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-19 09:53:18 +03:00
Erico Nunes
99d5bdcfa5 meson: build lima tools as part of 'all' tools
This is primarily so that this build gets tested in CI and we don't
break it again.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-18 22:27:55 +02:00
Connor Abbott
c550d367a7 ac/nir: Fix store_scratch with a non-full writemask
By adding one more helper to ac_llvm_build, we can also easily keep
vector stores together.

Fixes the
tests/spec/glsl-1.30/execution/fs-large-local-array-vec4.shader_test
piglit test.

Fixes: 74470baebb ("ac/nir: Lower large indirect variables to scratch")
Reviewed-by: Marek Olšák <marek.olsak@amd.com
2019-08-18 15:15:45 +02:00
Vasily Khoruzhick
0e394cda0d glsl/standalone: init shader stage in init_gl_program()
Otherwise lima standalone compiler fails when trying to compile fragment
shader with:

lima_compiler: ../src/compiler/nir/nir.c:55: nir_shader_create: Assertion `si->stage == stage' failed

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-17 11:14:40 -07:00
Jason Ekstrand
16edd02bfa iris: Only request an input mask if the shader needs it
Fixes: aebca3961b "iris: Fix handling of SIMD32 fragment shaders"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-16 19:59:42 -05:00
Xiong, James
dcad15ff54 gallium: add back YVU support
PIPE_FORMAT_YV12 is not handled so switching to PIPE_FORMAT_IYUV and
adding back YVU support.

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-16 13:24:49 -07:00
Erico Nunes
7a51abab42 lima: actually wait for bo in lima_bo_wait
PIPE_TIMEOUT_INFINITE is unsigned and gets assigned to signed fields
where it ends up as -1. When this reaches the kernel as a timeout it
gets translated as no timeout, which cause the waiting functions to
return immediately and not actually wait for a completion.

This seems to cause unstable results with lima where even piglit tests
randomly fail.

Handle this by setting the signed max value in case of infinite timeout.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-16 16:31:29 +02:00
Rhys Perry
0a790c3019 nir/algebraic: add a few masking-before-unpack optimizations
Helps some Dawn of War 3 and F1 2017 shaders with ACO:
Totals from affected shaders:
SGPRS: 2136 -> 2128 (-0.37 %)
VGPRS: 1624 -> 1628 (0.25 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 168068 -> 164332 (-2.22 %) bytes
LDS: 44 -> 44 (0.00 %) blocks
Max Waves: 222 -> 221 (-0.45 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-16 12:13:01 +01:00
Vasily Khoruzhick
861c2b8d31 lima: fix compilation of standalone compiler
Fixes: e0aeee946004("lima: add summary report for shader-db")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-15 16:59:51 -07:00
Bas Nieuwenhuizen
b9fb90e6d3 Revert "radv/gfx10: Enable DCC for storage images."
Quite useless without DCC for LAYOUT_GENERAL.

Fixes: b4dad3afaa Revert "radv: Do not decompress on LAYOUT_GENERAL."
Acked-by: Dave Airlie <airlied@redhat.com>
2019-08-16 01:22:54 +02:00
Bas Nieuwenhuizen
b4dad3afaa Revert "radv: Do not decompress on LAYOUT_GENERAL."
Causes issues with a bunch of games with DXVK.

Fixes: 50add1b33a "radv: Do not decompress on LAYOUT_GENERAL."
Acked-by: Dave Airlie <airlied@redhat.com>
2019-08-16 01:22:35 +02:00
Dave Airlie
f3af7886fe mesa: add support for CET to x86/x86-64 asm files.
Control-flow enforcement technology is a new instructions on x86
processors to denote where indirect jumps can land. Gcc auto adds
the instruction (which encodes as a NOP on older CPUs) to entrypoints
but assembler files need manual adding. This adds it to all the
entry points in the mesa x86/x86-64 assembler files.

This will only happen if mesa is built with the -fcf-protection flag
to gcc as some distros are wanting to do.

Acked-by: Eric Anholt <eric@anholt.net>
2019-08-16 09:00:35 +10:00
Alyssa Rosenzweig
78eda70892 pan/bifrost: Manually constant fold register class
Fixes errors for some people building Mesa:

../src/panfrost/bifrost/bifrost_sched.c:32:31: error: initializer
element is not constant
 const unsigned max_vec2_reg = max_primary_reg / 2;

../src/panfrost/bifrost/bifrost_sched.c:33:31: error: initializer
element is not constant
 const unsigned max_vec3_reg = max_primary_reg / 4; // XXX: Do we need
to align vec3 to vec4 boundary?

../src/panfrost/bifrost/bifrost_sched.c:34:31: error: initializer
element is not constant
 const unsigned max_vec4_reg = max_primary_reg / 4;

../src/panfrost/bifrost/bifrost_sched.c:35:32: error: initializer
element is not constant
 const unsigned max_registers = max_primary_reg +

../src/panfrost/bifrost/bifrost_sched.c:40:28: error: initializer
element is not constant
 const unsigned vec2_base = primary_base + max_primary_reg;

../src/panfrost/bifrost/bifrost_sched.c:41:28: error: initializer
element is not constant
 const unsigned vec3_base = vec2_base + max_vec2_reg;

../src/panfrost/bifrost/bifrost_sched.c:42:28: error: initializer
element is not constant
 const unsigned vec4_base = vec3_base + max_vec3_reg;

../src/panfrost/bifrost/bifrost_sched.c:43:27: error: initializer
element is not constant
 const unsigned vec4_end = vec4_base + max_vec4_reg;

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-15 19:06:35 +00:00
Erik Faye-Lund
18ab42644b gallium/util: widen type before multiplication
This method returns size_t, but the multiplication multiplies two
integers, leading to overflow rather than type widening.

Noticed by compiling with MSVC, which emits a warning.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:53 +02:00
Erik Faye-Lund
0091f62978 mesa: avoid warning on Windows
On Windows, p_atomic_inc_return returns an unsigned long long rather
than the type the pointer refers to, so let's make sure we cast the
result to the right type. Otherwise, we'll trigger a warning about
the wrong format-string for the type.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:49 +02:00
Erik Faye-Lund
544b088616 win32: unify strcasecmp definitions
There was two incompatible definitions of strcasecmp, which lead to a
compiler warning. Let's clean this up by only leaving one of them, and
using that one all the time.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:44 +02:00
Erik Faye-Lund
ecd312be96 mesa/main: avoid warning when casting offset to pointer
This generates a warning on some 64-bit systems, so let's cast to a
properly sized integer first.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:39 +02:00
Erik Faye-Lund
c646cd4bac nir: avoid warning when casting bogus pointer
This intentionally-bogus pointer generates a warning on some 64-bit
systems, so let's cast to a properly-sized integer first.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:35 +02:00
Erik Faye-Lund
b355eef920 glsl: fixup u64-warning
Similarly to the unsigned-version, we need to first cast the result to a
suiting integer before negating the number, otherwise we'll trigger a
warning.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:13 +02:00
Kenneth Graunke
f741de236b isl: Enable Unorm Path in Color Pipe
Improves performance on my Icelake 8x8 locked to 700Mhz.  For example,
some GfxBench5 subtests have the following results:

- [i965] gl_manhattan: ................ 7.01119% +/- 0.180971% (n=5)
- [i965] gl_4 (Car Chase):              4.24351% +/- 0.175622% (n=5)
- [i965] gl_blending:  ................ 3.36327% +/- 0.180267% (n=5)
- [i965] gl_5_normal (Aztec Ruins):     1.67962% +/- 0.243534% (n=10)
- [iris] gl_manhattan: ................ 3.92357% +/- 0.073965% (n=25)
- [iris] gl_4 (Car Chase):              2.17746% +/- 0.0826858% (n=5)
- [iris] gl_blending:  ................ 2.79599% +/- 0.803652% (n=15)
- [iris] gl_5_normal (Aztec Ruins):     1.30930% +/- 0.106523% (n=25)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-15 10:39:09 -07:00
Rafael Antognolli
ceeaf93c8e anv: Properly initialize device->slice_hash.
When subslices_delta == 0 and we take the early return,
device->slice_hash is not initialized on GEN11. It then causes a
segfault when going through anv_DestroyDevice, if compiled with
valgrind.

Fixes: 7bc022b4bb ("anv/gen11: Emit SLICE_HASH_TABLE when pipes are
                    unbalanced.)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-15 09:42:48 -07:00
Danylo Piliaiev
72354d43d4 intel/compiler: Fix resource leak in error path
CID: 1452261

Fixes: 04a99515 "intel/compiler: add ability to override shader's assembly"

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-15 08:17:36 +00:00
Alyssa Rosenzweig
44a6c38bd6 panfrost: Implement native RECT textures
We started honouring the normalized_coords flag in the texture
descriptor, but a bisection revealed that broke RECT textures -- since
we were *also* lowering them in the shader. So just remove the
shader-based lowering, use native RECT textures, and enjoy the nominal
reduction in complexity and performance boost.

Fixes: 3e47a1181b ("panfrost: Add MALI_SAMP_NORM_COORDS flag")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:42 -07:00
Alyssa Rosenzweig
6fe4822cca panfrost: Add R10G10B10A2_SSCALED vertex format
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
e823a47f02 pan/midgard: Disassemble UBO index explicitly
It's a bit of a special case but that's fine.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
3d54ed2488 pan/midgard: Account for unaligned UBOs when promoting uniforms
We only know how to promote aligned accesses, although theoretically we
should be able to promote unaligned to swizzles in the future. Check
this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
03350eb8b8 pan/midgard: Add mir_ubo_shift helper
Different UBO reads have different shift requirements.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
cf3bb10f51 pan/midgard: Address emit_ubo_read offset in bytes
We'll want to be smarter about unaligned reads, so let's get this code
all in one place.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
65e6cb4eb0 pan/midgard: Wire writemask into UBO reads
Helps the disassembly be clearer and maybe regalloc be smarter.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
ec2f0b580f pan/midgard: Identify UBO/SSBO op symmetry
It's the same thing, just shifted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
375d4c2c74 panfrost: Extend blending to MRT
Our hardware supports independent (per-RT) blending, but we need to
route those settings through from Gallium.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
dff4986b1a pan/midgard: Emit store_output branch just-in-time
We'll need multiple branches for MRT, so we can't defer. Also, we need
to track dependencies to ensure r0 is set to the correct value for each
store_output.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
2fc44c4dc8 pan/midgard: Add dont_eliminate flag
We need to treat fragment writes specially.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
6ed3843224 pan/mfbd: Stuff in RT count
Fixes DATA_INVALID_FAULTs with multiple render targets.

We do always allocate space for 4 cbufs just to keep things sane. This
may not be strictly necessary.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
716be7862e pan/decode: Dump FBD tagged pointer
Turns out the rt count is stuffed in here..

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
358372b256 pan/decode: Decode invalid access type upon fault
We don't have a good way to confirm this, but it parallels the kernel
definitons for MMU faults nicely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:39 -07:00
Alyssa Rosenzweig
f5cc5ef404 pan/decode: Fix duplicate heap_end property
This was supposed to read heap_start. It's the same value but still,
better get this right.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:39 -07:00
Alyssa Rosenzweig
b78e04c17b panfrost: Note "MFBD preload disable" bit
It's a chicken bit, as far as I can tell. Buck buck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:39:57 -07:00
Alyssa Rosenzweig
64720d1e9e pan/bifrost: Link in compiler
We enable the standalone compiler, build the new files, and let it
blast.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
b93fa7d232 pan/bifrost: Check in remainder of the Bifrost compiler
What it says on the tin.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
0e126aa0f0 pan/bifrost: Add bifrost_print.c/h
IR printers.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
d8d8b08fe5 pan/bifrost: Style format the disassembler
$ astyle *.c *.h --style=linux -s8

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
fca491c0e1 pan/bifrost: Stub out standalone compiler
We don't actually have a standalone compiler in-tree yet, but let's get
prepared for when we do.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
62bbc23da5 pan/bifrost: Sync disassembler with Ryan's tree
The disassembler was updated to move common code with the compiler into
a shared header. Additional, some new ops and control registers relating
to rounding were added.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
b73cbd6880 panfrost: Remove standalone pandecode tool
Now that panwrap has gained the ability to trace directly without
dumping to the filesystem, there's no need to lug around this tool.

I can assure you nobody will miss it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
6f4d796911 pan/midgard: Fix disassembly termination condition
Fixes: 863bdd1f8d ("pan/midgard: Break, not return, in disassembler")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
de2efd5ea7 panfrost: Ensure we upload at least 1 blend RT
Otherwise we'll get memory junk.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
54438267c3 panfrost: Zero tripipe on initialize
I don't think the hardware cares, but this adds a lot of noise to traces
that we would rather not need to look at.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
1ab6290746 pan/midgard: Improve disassembler robustness
Some memory corruption / etc issues let to an accidental "fuzzing" of
the disassembler ;) This uncovered some issues leading to a disassembler
hang, so let's fix that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
9c4c7211a3 pan/decode: Split public.h out
We want a defined ABI for tracing; this set of functions should be as
small as strictly necessary to minimize panwrap shenanigans.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
4f03728fb7 pan/decode: Prefer uint64_t to mali_ptr
This removes an unwanted dependency on panfrost-job.h

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
6c84a2665c pan/midgard: Allocate spill_slot once
Multiple spill moves share a single spill slot. Issue found in Krita.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Alyssa Rosenzweig
2a9031ea44 pan/midgard: Use hint on midgard_instruction for spill_move
This allows us to have multiple spill moves, whereas otherwise for N
spill moves, the first N-1 would be clobbered. Issue found in Krita.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Alyssa Rosenzweig
3e6f2e7aba panfrost: Remove panfrost_add_dependency asserts
It doesn't... make a ton of sense to need to assert and this routine is
hotter than you might expect. Doesn't matter for release builds, of
course.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Marek Olšák
aafc95ceb6 radeonsi: add support for Renoir
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-14 17:31:04 -04:00
Eric Engestrom
a3d6024199 meson: add nir tests to the compiler/nir test suite
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-14 22:17:06 +01:00
Eric Engestrom
d0916edfcb EGL: sync headers with Khronos
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-14 21:48:23 +01:00
Christian Gmeiner
2c4fe6af78 relnotes: Add new ext on etnaviv for 19.2.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 21:47:35 +02:00
Christian Gmeiner
17200bb67a etnaviv: fix weird indentation
Fixes: 797a2e4fd0 ("etnaviv: update logic to determine uniform limits")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 21:29:48 +02:00
Ian Romanick
0e6581b87d nir/algebraic: Reassociate shift-by-constant of shift-by-constant
v2: After some review discussion with Alyssa, the replacements now
correct account for cases where (b+c) >= bitsize.

v3: Use a temporary to simplify the Python code quite a bit.  Suggested
by Jason.

Haswell and all Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16251155 -> 16249576 (<.01%)
instructions in affected programs: 232627 -> 231048 (-0.68%)
helped: 547
HURT: 1
helped stats (abs) min: 1 max: 15 x̄: 2.89 x̃: 3
helped stats (rel) min: 0.04% max: 7.84% x̄: 1.14% x̃: 1.06%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
95% mean confidence interval for instructions value: -3.12 -2.65
95% mean confidence interval for instructions %-change: -1.20% -1.06%
Instructions are helped.

total cycles in shared programs: 365924392 -> 365372103 (-0.15%)
cycles in affected programs: 59207053 -> 58654764 (-0.93%)
helped: 497
HURT: 34
helped stats (abs) min: 1 max: 29300 x̄: 1118.16 x̃: 16
helped stats (rel) min: <.01% max: 10.59% x̄: 1.82% x̃: 1.82%
HURT stats (abs)   min: 2 max: 424 x̄: 101.03 x̃: 63
HURT stats (rel)   min: 0.07% max: 46.17% x̄: 4.72% x̃: 2.06%
95% mean confidence interval for cycles value: -1426.41 -653.77
95% mean confidence interval for cycles %-change: -1.66% -1.15%
Cycles are helped.

total spills in shared programs: 8870 -> 8871 (0.01%)
spills in affected programs: 104 -> 105 (0.96%)
helped: 0
HURT: 1

Ivy Bridge and all pre-Gen7 platforms had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11956236 -> 11955635 (<.01%)
instructions in affected programs: 94110 -> 93509 (-0.64%)
helped: 106
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 5.67 x̃: 4
helped stats (rel) min: 0.12% max: 4.71% x̄: 1.96% x̃: 0.76%
95% mean confidence interval for instructions value: -6.62 -4.72
95% mean confidence interval for instructions %-change: -2.27% -1.64%
Instructions are helped.

total cycles in shared programs: 179296340 -> 178788044 (-0.28%)
cycles in affected programs: 51009603 -> 50501307 (-1.00%)
helped: 82
HURT: 7
helped stats (abs) min: 5 max: 27820 x̄: 6199.00 x̃: 16
helped stats (rel) min: 0.30% max: 8.16% x̄: 2.58% x̃: 3.11%
HURT stats (abs)   min: 2 max: 8 x̄: 3.14 x̃: 2
HURT stats (rel)   min: 0.02% max: 1.40% x̄: 0.34% x̃: 0.10%
95% mean confidence interval for cycles value: -7649.38 -3773.00
95% mean confidence interval for cycles %-change: -2.71% -1.99%
Cycles are helped.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v2]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-14 11:15:37 -07:00
Ian Romanick
73aaeac0a3 nir/algebraic: Reassociate add-and-shift to be shift-and-add
A common thing in many shaders:

    uniform vs { vec4 bones[...]; };

    ...

    x = some_calculation(bones[i + 0]);
    y = some_calculation(bones[i + 1]);
    z = some_calculation(bones[i + 2]);

This turns into stuff like

    vec1 32 ssa_12 = iadd ssa_11, ssa_0
    vec1 32 ssa_13 = ishl ssa_12, ssa_3
    vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
    vec1 32 ssa_15 = iadd ssa_11, ssa_1
    vec1 32 ssa_16 = ishl ssa_15, ssa_3
    vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
    vec1 32 ssa_18 = iadd ssa_11, ssa_2
    vec1 32 ssa_19 = ishl ssa_18, ssa_3
    vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)

By reassociating the shift and the add, we can reduce this to

    vec1 32 ssa_12 = ishl ssa_11, ssa_3
    vec1 32 ssa_13 = iadd ssa_12, ssa_0
    vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
    vec1 32 ssa_16 = iadd ssa_12, ssa_1
    vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
    vec1 32 ssa_19 = iadd ssa_12, ssa_2
    vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)

v2: Add some commentary from Rhys Perry's nearly identical patch.

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16277758 -> 16250704 (-0.17%)
instructions in affected programs: 1440284 -> 1413230 (-1.88%)
helped: 4920
HURT: 6
helped stats (abs) min: 1 max: 69 x̄: 5.50 x̃: 4
helped stats (rel) min: 0.10% max: 18.33% x̄: 2.21% x̃: 1.79%
HURT stats (abs)   min: 1 max: 12 x̄: 4.50 x̃: 3
HURT stats (rel)   min: 0.18% max: 3.23% x̄: 1.91% x̃: 2.55%
95% mean confidence interval for instructions value: -5.67 -5.31
95% mean confidence interval for instructions %-change: -2.26% -2.16%
Instructions are helped.

total cycles in shared programs: 367118526 -> 365895358 (-0.33%)
cycles in affected programs: 93504145 -> 92280977 (-1.31%)
helped: 2754
HURT: 1269
helped stats (abs) min: 1 max: 47039 x̄: 460.66 x̃: 16
helped stats (rel) min: <.01% max: 34.93% x̄: 3.77% x̃: 1.12%
HURT stats (abs)   min: 1 max: 1500 x̄: 35.85 x̃: 9
HURT stats (rel)   min: 0.01% max: 17.35% x̄: 2.18% x̃: 0.75%
95% mean confidence interval for cycles value: -387.31 -220.78
95% mean confidence interval for cycles %-change: -2.11% -1.68%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-14 11:15:32 -07:00
Andrii Simiklit
ff2225cf88 nir/find_array_copies: Reject copies with mismatched lengths
copy_deref for wildcard dereferences requires the same
arrays lengths otherwise it leads to a crash in optimizations
like 'nir_opt_copy_prop_vars' because these optimizations expect
'copy_deref' just for arrays with the same lengths.

v2: check was moved to 'try_match_deref' to fix aoa cases
                 (Jason Ekstrand <jason@jlekstrand.net>)
v3: -fixed comment
    -the condition merged with other one
                 (Jason Ekstrand <jason@jlekstrand.net>)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111286
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-08-14 18:11:31 +00:00
Alyssa Rosenzweig
c4a4f3db5a pan/midgard: Prefix blobber-db output for grepping
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig
5f0f9e1333 pan/midgard: Implement blobber-db
We wire through some shader-db-style stats on the current shader in the
disassemble so we can get a quick estimate of shader complexity from a
trace.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Rob Clark <robdclark@chromium.org>
2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig
863bdd1f8d pan/midgard: Break, not return, in disassembler
We'll want to dump some stats after the shader, and I refuse to use one
teensy little goto.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 10:31:09 -07:00
Ian Romanick
f2965fde9b nir/range-analysis: Fail gracefully on non-SSA sources
Tested-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-14 09:02:38 -07:00
Christian Gmeiner
1290cc3e27 etnaviv: split destroy_shader
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner
f90b23b8c4 etnaviv: split link_shader
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner
0765a1dd0e etnaviv: split dump_shader
Also this adds the missing impl for etna_dump_shader_nir(..).

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner
a36d04daa1 etnaviv: mv etnaviv_compiler.c etnaviv_compiler_tgsi.c
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner
b2da8a8357 etnaviv: correct PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE handling
Have a correct answer to GL_MAX_FRAGMENT_UNIFORM_VECTORS and
GL_MAX_VERTEX_UNIFORM_VECTORS.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach l.stach@pengutronix.de
2019-08-14 12:29:56 +02:00
Christian Gmeiner
797a2e4fd0 etnaviv: update logic to determine uniform limits
Taken 1:1 from the header file.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach l.stach@pengutronix.de
2019-08-14 12:29:56 +02:00
Christian Gmeiner
45cb5eee5d etnaviv: put uniform limit determination into own function
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach l.stach@pengutronix.de
2019-08-14 12:29:56 +02:00
Marek Vasut
8f97262cdd etnaviv: Use reentrant screen lock around flush
The flush callback may be called on the same pipe context, and thus
the same stream, from two different threads of execution. However,
etna_cmd_stream_flush{,2}() must not be called on the same stream
from two different threads of execution as that would mess up the
etna_bo refcounting and likely have other ugly side effects.

Fix this by using a reentrant screen lock around the flush callback.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-08-14 10:36:36 +02:00
Marek Vasut
6bb4b6d078 etnaviv: Add valgrind support
Add Valgrind support for etnaviv to track BO leaks.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-08-14 10:36:20 +02:00
Marek Vasut
cf92074277 etnaviv: Use hash table to track BO indexes
Use hash table instead of ad-hoc arrays.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-08-14 10:36:04 +02:00
Marek Vasut
23f5f126d5 etnaviv: Fix double-free in etna_bo_cache_free()
The following situation can happen in a multithreaded OpenGL application.
A BO is submitted from etna_cmd_stream #1 with flags set for read.
A BO is submitted from etna_cmd_stream #2 with flags set for write.
This triggers a flush on stream #1 and clears the BO's current_stream
pointer. If at this point, stream #2 attempts to queue BO again, which
does happen, the BO will be added to the submit list twice. The Linux
kernel driver correctly detects this and warns about it with "BO at
index %u already on submit list" kernel message.

However, when cleaning the BO cache in etna_bo_cache_free(), the BO
which was submitted twice will also be free()d twice, this triggering
a glibc double free detector.

The fix is easy, even if the BO does not have current_stream set,
iterate over current streams' list of BOs before adding the BO to it
and verify that the BO is not yet there.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-08-14 10:35:48 +02:00
Roman Stratiienko
1ea95e37cc kmsro: Add missing definitions to Android.mk
Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Rob Herring robh@kernel.org
2019-08-14 07:39:53 +00:00
Gert Wollny
742d3c918f softpipe: Add support for ARB_derivative_control
Enables and passes piglits:

spec/ARB_drivative_control/
        dfdx-coarse
        dfdx-dfdy
        dfdx-fine
        dfdy-coarse
        dfdy-fine

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-14 07:03:15 +00:00
Vasily Khoruzhick
b579af77f3 lima/ppir: print srcs and dests in ppir_node_print_prog()
Now we have an accessors for ppir src, so it's possible to easily
print all srcs and dests while dumping ppir representation.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-13 22:44:07 -07:00
Vasily Khoruzhick
6920710af5 lima/ppir: use src accessors in ppir regalloc
Get rid of most switch/case by using src accessors

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-13 22:44:07 -07:00
Vasily Khoruzhick
a5e7c12ced lima/ppir: add ppir_node to ppir_src
We'll need it if we want to walk through node sources

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-13 22:43:58 -07:00
Vasily Khoruzhick
afa64a2105 lima/ppir: introduce accessors for ppir_node sources
Sometimes we need to walk through ppir_node sources, common
accessor for all node types will simplify code a lot.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-13 22:38:07 -07:00
Jordan Justen
0f5be81edd iris: Expose aux buffer as 2nd plane w/modifiers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:47 -07:00
Jordan Justen
246eebba4a iris: Export and import surfaces with modifiers that have aux data
The DRI interface for modifiers with aux data treats the aux data as a
separate plane of the main surface.

When the dri layer requests the plane associated with the aux data, we
save the required information into the dri aux plane image.

Later when the image is used, the dri plane image will be available in
the pipe_resource structure's `next` field. Therefore in iris, we
reconstruct the aux setup from this separate dri plane image when the
image is used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:47 -07:00
Kenneth Graunke
99c8eb997d iris: Do proper format checks for Y+CCS modifier support
We need to ensure that the DRI image format supports CCS.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-08-13 15:20:47 -07:00
Jordan Justen
51f941c20c iris: Create single bo for surfaces with modifiers and aux data
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:47 -07:00
Jordan Justen
2c7b577e13 iris: Split iris_resource_alloc_aux to enable aux modifiers
Reworks:

 * If the aux-state is not ISL_AUX_STATE_AUX_INVALID, then use memset
   even when memset_value is zero. The hiz buffer initial aux-state
   will be set to invalid, and therefore we can skip the memset. But,
   for CCS it will be set to ISL_AUX_STATE_PASS_THROUGH, and therefore
   the aux data must be cleared to 0 with the memset. Previously we
   would use BO_ALLOC_ZEROED with the CCS aux data, so this memset
   wasn't required. Now, the CCS aux data may be part of the main
   surface. We prefer to not use BO_ALLOC_ZEROED excessively, so the
   memset is needed for the CCS case. (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:46 -07:00
Jordan Justen
aad36dfd16 iris: Add aux offset into hiz_address
This is not currently required because the hiz buffer is in a separate
buffer, and therefore the offset is 0. If we combine the aux buffer
with the main surface buffer, then the hiz offset may become non-zero.

Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:39 -07:00
Marek Olšák
f5e1f9ccef tgsi_to_nir: add assertions for max varying slots
Nine uses GENERIC slots > 31.

Trivial.
2019-08-13 18:15:53 -04:00
Marek Olšák
fad962eddc tgsi_to_nir: expand vec3 system values to vec4
for nir_intrinsic_load_work_group_id

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 18:15:53 -04:00
Marek Olšák
88a511bd42 tgsi_to_nir: fix incorrect number of image src1 components
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 18:15:53 -04:00
Mauro Rossi
37841f52b2 i965/gen11: fix genX_bits.h include path
Instead of "genX_bits.h" use "genxml/genX_bits.h"
as already done in other similar cases

Besides being more correct, it also fixes building error in Android.

Fixes: f0d2923 ("i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-13 23:58:25 +02:00
Alyssa Rosenzweig
0c56330361 panfrost: Workaround bug in partial update implementation
We can't intersect with empty regions.

Fixes: 65ae86b854 ("panfrost: Add support for KHR_partial_update()")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-13 11:13:48 -07:00
Eric Anholt
46daaca55e gitlab-ci: Run the GLES2 CTS on llvmpipe.
This is the start of doing CTS tests on merges to Mesa master.  We use
the surfaceless platform so that we don't need to bother bringing up
weston or X11.  The surface size is kept low to reduce runtime, but
this comes at the cost of many rendering tests skipping due to
too-small render targets (as we see the impact of Mesa on the shared
runner pool, we can reevaluate this and what set of CTS tests we want
to run).

We split the job up across 4 runners (each at 4 llvmpipe threads), so
that the job can load-balance across our shared runners and finish
sooner (since dEQP is very single-thread-performance bound).

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-13 10:30:01 -07:00
Eric Anholt
ab49873b44 gitlab-ci: Switch the meson-main build type to debugoptimized.
Now that we're running the drivers we build, building with
optimization is important for keeping our runtime down.  Shaves about
4 minutes of runtime off of GLES2 CTS of llvmpipe at 64x64.

v2: Only switch meson-main until we enable CTS for other builds
    on request by Michel.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-13 10:30:01 -07:00
Eric Anholt
9605749f99 gitlab-ci: Set the prefix to ./install instead of the DESTDIR.
If we don't set DESTDIR, then the DEFAULT_DRIVER_DIR built into the
libraries is correct and we don't need to use LIBGL_DRIVERS_PATH and
friends for CI usage.  Incidentally, this moves our installed paths
from /builds/anholt/mesa/install/usr/local/lib (for example) to
/builds/anholt/mesa/install/lib for simplicity.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-13 10:30:01 -07:00
Eric Anholt
f417ced5cc gitlab-ci: Build the CTS in the debian build image.
This will let us reuse the image for test runs.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-13 10:30:01 -07:00
Eric Anholt
86ae3c2186 surfaceless: Fix swrast-path segfault when loader doesn't know driver name.
If we're hitting the swrast fallback path here, it's probably because
we stumbled across a KMS-only device (such as the ASpeed that some of
our CI runners have) that will then return a NULL driver_name.  Don't
crash in that case.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-13 10:30:01 -07:00
Eric Anholt
6a8d39dccd surfaceless: Fix swrast path.
We get a getDrawableInfo() call in the MakeCurrent path, which
platform_device was handling correctly by returning the pbuffer's
width/height but platform_surfaceless segfaulted for.  Reuse
platform_device's implementation.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-08-13 10:29:34 -07:00
Eric Anholt
030aa6e184 gitlab-ci: Move around which builds cover which swrast.
I want to enable CI of llvmpipe out of the meson-main build.  So, kick
classic swrast/osmesa to meson-i386, then promote llvmpipe to
meson-main (along with nine, now that classic osmesa isn't keeping it
out of there).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-13 10:29:34 -07:00
Eric Anholt
b816edcbf4 meson: Don't require DRI classic swrast for OSMesa.
OSMesa doesn't care about this build option, it links against
src/mesa/swrast regardless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2019-08-13 10:29:34 -07:00
Alyssa Rosenzweig
29cfd154e3 panfrost: Implement transform feedback
Midgard has no hardware support for transform feedback, so we simulate
it in software. Lucky us.

What Midgard does do is write out vertex shader outputs to main memory
unconditonally. Fragment shaders read varyings back from main memory;
there's no on-chip storage for varyings. Whether this was a reasonable
design is a question I will not be engaging in this commit message.

What that does mean is that, in some sense, Midgard *always* does
transform feedback uncondtionally, and there's no way to turn off
transform feedback. Normally, we would allocate some scratch memory
every frame to store the varyings in an arbitrary format (interleaved
for simplicity), and then feed that scratch to the fragment shader and
discard when the rendering completes.

The only difference now is that sometimes, for some buffers, we use a BO
provided to us by Gallium and a format provided by Gallium, instead of
allocating the memory and choosing the format ourselves. This has some
limitations -- in particular, it only works at vec4 granularity, so a
corresponding GLSL linkage patch is needed to correctly implement
transform feedback for non-vec4 types. Nevertheless, given the hardware
already works in this admittedly-bizarre fashion, transform feedback is
"free". Or, at least, it's no more expensive than any other rendering.

Specifically not implemented is dynamically-sized transform feedback
(i.e. with geometry/tesselation shaders).

Spoiler alert: Midgard has no support for geometry *or* tessellation
shaders, despite advertising support. They get compiled to *massive*
compute shaders. How's that for checkbox compliance?

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:41 -07:00
Alyssa Rosenzweig
7c29588c07 panfrost: Increment offsets[] per draw
We have to maintain the internal offset ourselves. Per v3d.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:39 -07:00
Alyssa Rosenzweig
e7a05a601e panfrost: Fixup stream out information per variant
We could probably get away with doing this once per pipe_shader_state
but let's not jump down that rabbit hole quite yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:32 -07:00
Alyssa Rosenzweig
5b0a1a4e49 panfrost: Route outputs_written through the compiler
It's there in shader_info, but we need to access it from pan_context.c

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:17 -07:00
Alyssa Rosenzweig
f714eab882 panfrost: Import stream out utility from iris
We'll need this in a moment. Ken's implementation, lightly edited for
Panfrost.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:14 -07:00
Alyssa Rosenzweig
9b2514d6c6 panfrost: Flush when using transform feedback
This is a huge hack to workaround incomplete BO flushing logic, but it's
enough for the dEQP transform feedback tests, and doing the resource
management to get this right is out-of-scope for this patch series.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:11 -07:00
Alyssa Rosenzweig
4b0001c42d panfrost: Set PIPE_CAP_TGSI_TEXCOORD
It doesn't really make sense, since we don't have special texture
coordinate varyings, but it'll make some code simpler for XFB and it
doesn't hurt us, even if I lose a bit of my soul setting it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:09 -07:00
Alyssa Rosenzweig
72fc06df9c panfrost: Wire up statistics for primitives
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN should now be handled.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:43:04 -07:00
Alyssa Rosenzweig
7c224c1008 panfrost: Implement callbacks for PRIMITIVES queries
We're just going to compute them in the driver but let's get the
structures setup to handle them. Implementation from v3d.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-13 09:42:48 -07:00
Rob Clark
72d086fc36 freedreno/a6xx: move SSBO/image consts to IBO stateobj
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
ab01ab4d4f freedreno/a6xx: move VS driverparams to it's own stateobj
If driver-params are required, we really should emit it on every draw
for correctness.  And if not required, we should emit a DISABLE so that
un-applied state updates from previous draws don't corrupt the const
state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
882d53d8e3 freedreno/ir3+a6xx: same VBO state for draw/binning
Worth ~+20% on gl_driver2

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
4b82d1bbb7 freedreno/a6xx: add fd_emit_take_group()
Which takes ownership of the stateobj.  Useful for streaming state-
objs, to avoid an extra ref/unref

Worth ~5% at gl_driver2

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
4a188e4215 freedreno/ir3: track # of driver params
To avoid emitting unneeded const state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
7f1e3391c6 freedreno/a6xx: move immediates to program stateobj
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
f0b91730a1 freedreno/a6xx: stop using ir3_emit_{vs,fs}_consts()
Should be no functional change.  Next step is to re-arrange various
const state into different stateobjs.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
53667a43c4 freedreno/ir3: push ctx further up call chain
Move more of the code to deal just w/ screen, without requiring ctx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
4080dfb8af freedreno/ir3: move ring_wfi() further up call chain
Hoist them out of code-paths that will eventually be called directly for
various a6xx+ const related stateobjs.

This ends up duplicating one constlen check in ir3_emit_vs_consts(), to
avoid what could otherwise be an unnecessary WFI on older gens.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
c6fab232c8 freedreno/all: move more emit helpers to screen
framebuffer_barrier() still depends on the ctx, but the rest can move to
screen.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
684f4b5843 freedreno/a3xx-a6xx+ir3: move emit_const* to screen
These don't need to be in context, and we'll need them in screen in a
later patch.  Plus it's a good cleanup.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
566f2281c5 freedreno/a6xx: add fd6_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:26 -07:00
Rob Clark
e89255b0a5 freedreno/a5xx: add fd5_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:25 -07:00
Rob Clark
d256e3f34a freedreno/a3xx: add fd3_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:11:25 -07:00
Rob Clark
b9d3f39728 freedreno/a2xx: add fd2_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark
ec0ec641d8 freedreno/a4xx: add fd4_emit_init_screen()
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark
2f94de2372 freedreno/a2xx: call fd2_emit_ib() directly from fd2
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark
eb45422c5f freedreno/a5xx: call fd5_emit_ib() directly from fd5
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark
50e15e1c6f freedreno/a4xx: call fd4_emit_ib() directly from fd4
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark
4326eeac97 freedreno/a3xx: call fd3_emit_ib() directly from fd3
No reason for the indirection when called from a3xx specific code.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark
32014afa44 freedreno/ir3: move VS driver-param emit
Move DP emit to it's own function.  No functional change, just code
motion to prepare for splitting up const state into multiple state-
objs on a6xx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Rob Clark
5722149bf1 freedreno/ir3: drop unneeded ir3_ra() args
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-13 08:08:07 -07:00
Boris Brezillon
65ae86b854 panfrost: Add support for KHR_partial_update()
Implement ->set_damage_region() region to support partial updates.

This is a dummy implementation in that it does not try to merge
damage rects. It also does not deal with distinct regions and instead
pick the largest quad as the only damage rect and generate up to 4
reload rects out of it (the left/right/top/bottom regions surrounding
the biggest damage rect).

We also do not try to reduce the number of draws by passing all quad
vertices to the blit request (would require extending u_blitter)

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-13 14:41:10 +02:00
Daniel Stone
492ffbed63 st/dri2: Implement DRI2bufferDamageExtension
Add a pipe_screen->set_damage_region() hook to propagate
set-damage-region requests to the driver, it's then up to the driver to
decide what to do with this piece of information.

If the hook is left unassigned, the buffer-damage extension is
considered unsupported.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-13 14:40:45 +02:00
Harish Krupo
a4a8ebe156 egl/dri: Use __DRI2_BUFFER_DAMAGE extension for KHR_partial_update
Use the DRI2 interface callback to pass the damage rects to
the driver.

Signed-off-by: Harish Krupo <harishkrupo@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-13 14:40:31 +02:00
Daniel Stone
bd08a83b09 dri_interface: add DRI2_BufferDamage interface
Add a new DRI2_BufferDamage interface to support the
EGL_KHR_partial_update extension, informing the driver of an overriding
scissor region for a particular drawable.

Based on a commit originally authored by:
Harish Krupo <harish.krupo.kps@intel.com>
renamed extension, retargeted at DRI drawable instead of context,
rewritten description

Signed-off-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-13 14:40:14 +02:00
Harish Krupo
b4345da876 egl/android: Delete set_damage_region from egl dri vtbl
The intension of the KHR_partial_update was not to send the damage back
to the platform but to send the damage to the driver to ensure that the
following rendering could be restricted to those regions.
This patch removes the set_damage_region from the egl_dri vtbl and all
the platfrom_*.c files.
Then upcomming patches add a new dri2 interface for the drivers to
implement

Signed-off-by: Harish Krupo <harishkrupo@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-13 14:39:38 +02:00
Jordan Justen
fc12fd05f5 iris: Implement pipe_screen::resource_get_param
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 01:12:30 -07:00
Jordan Justen
3198c5b7bf gallium/dri2: Use pipe_screen::resource_get_param in image queries
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2019-08-13 01:12:29 -07:00
Jordan Justen
2decad495f gallium/dri2: Support images with multiple planes for modifiers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2019-08-13 01:12:29 -07:00
Jordan Justen
6e749a6b2b gallium/dri2: Refactor image property queries
This refactor will let us more easily use
pipe_screen::resource_get_param as an alternative to
pipe_screen::resource_get_handle.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2019-08-13 01:12:29 -07:00
Jordan Justen
c5c2365455 state_tracker/winsys_handle: Add plane input field
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2019-08-13 01:12:29 -07:00
Jordan Justen
2066966c10 gallium/dri2: Support creating multi-planar modifier images
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2019-08-13 01:12:29 -07:00
Jordan Justen
fe06655e86 gallium/dri2: Implement dri2ImageExtension.queryDmaBufFormatModifierAttribs
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2019-08-13 01:12:29 -07:00
Jordan Justen
0346b70083 gallium/screen: Add pipe_screen::resource_get_param
This function retrieves individual parameters selected by enum
pipe_resource_param. It can be used as a more direct alternative to
pipe_screen::resource_get_handle.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2019-08-13 01:12:24 -07:00
Iago Toral Quiroga
2353f7f7ef vc4: clamp gl_PointSize to a minimum of 1.0
The OpenGL ES spec requires that the value of gl_PointSize is clamped
to an implementation-dependent range matching what is advertised by
GL_ALIASED_POINT_SIZE_RANGE. For VC4 this is [1.0, 512.0], but the
hardware won't clamp to the minimum side of the range and won't render
points with a size strictly smaller than 1.0 either, so we need to
clamp manually. For points larger than the maximum size of the range
the hardware clamps automatically.

Fixes piglit test:
spec/!opengl 2.0/vs-point_size-zero

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 09:44:54 +02:00
Iago Toral Quiroga
3539bd63dd v3d: clamp gl_PointSize to a minimum of 1.0
The OpenGL ES spec requires that the value of gl_PointSize is clamped
to an implementation-dependent range matching what is advertised by
GL_ALIASED_POINT_SIZE_RANGE. For V3D this is [1.0, 512.0], but the
hardware won't clamp to the minimum side of the range and won't render
points with a size strictly smaller than 1.0 either, so we need to
clamp manually. For points larger than the maximum size of the range
the hardware clamps automatically.

Fixes piglit test:
spec/!opengl 2.0/vs-point_size-zero

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 09:44:54 +02:00
Iago Toral Quiroga
48f5c34301 nir: add a pass to clamp gl_PointSize to a range
The OpenGL and OpenGL ES specs require that implementations clamp the
value of gl_PointSize to an implementation-depedent range. This pass
is useful for any GPU hardware that doesn't do this automatically
for either one or both sides of the range, such as V3D.

v2:
 - Turn into a generic NIR pass (Eric).
 - Make the pass work before lower I/O so we can use the deref variable
   to inspect if we are writing to gl_PointSize (Eric).
 - Make the pass take the range to clamp as parameter and allow it
   to clamp to both sides of the range or just one side.
 - Make the pass report progress.

v3:
 - Fix copyright header (Eric)
 - use fmin/fmax instead of bcsel to clamp (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 09:44:12 +02:00
Iago Toral Quiroga
62e0ca3064 v3d: line length style fixes
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:38:19 +02:00
Iago Toral Quiroga
99e9809cab v3d: honor the write mask on store operations
v2:
  - Fix incremental update of the const offset when we need to emit a sequence
    with more than one write because of the writemask.
  - Do not move the tmu write emission to a separate helper.

v3:
  - Get the store writemask before the loop, use ffs to get the first component
    to write and clear writemask bits as we process the components (Eric).
  - Simplified the code that figured out the number of components for the TMU
    config based on the number of tmu writes for stores and atomics.

v4:
  - Code clean-ups (Eric).

Fixes:
KHR-GLES31.core.shader_image_load_store.advanced-cast-cs
KHR-GLES31.core.shader_image_load_store.advanced-cast-fs
KHR-GLES31.core.shader_storage_buffer_object.advanced-switchBuffers-cs
KHR-GLES31.core.shader_storage_buffer_object.advanced-switchPrograms-cs
KHR-GLES31.core.shader_storage_buffer_object.basic-operations-case1-cs

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:38:19 +02:00
Iago Toral Quiroga
3d65d2a488 v3d: refactor ntq_emit_tmu_general() slightly
When we implement write masks on store operations we might need to
emit multiple write sequences for a given store intrinsic. To make
that easier, let's split the emission of the tmud instructions to
their own block after we are done with the code that only needs to
run once no matter how many write sequences we need to emit.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:38:19 +02:00
Iago Toral Quiroga
b594796f1b v3d: do not automatically flush current job for SSBOs and shader images
If the current job has a sequence of draw calls involving SSBOs and/or
shader images, we would flush the job in between each draw call.
With this change, we won't flush the current job and we rely on the
application inserting correct barriers by issuing glMemoryBarrier()
when needed.

v2 (Eric):
 - When mapping a buffer for writing, we always need to flush.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:25:15 +02:00
Iago Toral Quiroga
f1cf1153e8 v3d: only process glMemoryBarrier() for SSBOs and images
PIPE_BARRIER_UPDATE is defined as:
PIPE_BARRIER_UPDATE_BUFFER | PIPE_BARRIER_UPDATE_TEXTURE

Which means we were flushing for any flags other than these two, but
this was intended to only flush for ssbos and images.

Actually, the driver automatically flushes jobs as we need, including
writes/reads involving SSBOs and images, so we don't really need to
flush anything when the program emits a barrier. However, this may
lead to excessive flushing in some cases, so we will soon change this
to avoid atutomatic flushing of the current job for SSBOs and images,
meaning that we will rely on the application to emit correct memory
barriers for these that we should make sure to process here.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:25:15 +02:00
Iago Toral Quiroga
f1559ca922 v3d: fix flushing of SSBOs and shader images
If the current draw call includes SSBOs, then we must flush any jobs
that are writing to the same SSBOs (so that our SSBOs reads are correct),
as well as jobs reading from the same SSBO (so that our SSBO writes don't
stomp previous SSBO reads).

The exact same logic applies to shader images. In this case we were already
flushing previous writes, but we should also flush previous reads.

Note that We don't need to call v3d_flush_jobs_reading_resource() and
v3d_flush_jobs_writing_resource() separately though, since flushing
jobs that read a resource also flushes those writing to it.

Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 08:25:15 +02:00
Caio Marcelo de Oliveira Filho
1021abab07 intel/tools: Fix aub_file initialization in intel_dump_gpu
The `device` can be set earlier either by a command line or a by
intercepting an ioctl call to get the I915_PARAM_CHIPSET_ID done by
the application early.  In both cases `aub_file` and `devinfo` would
not be initialized.

Fix by splitting the conditions

- `device == 0`: use the FD to get both device and devinfo.
- Or `devinfo.gen == 0`: use `device` to initialize it.

And separatedly, initialize aub_file the first time it is needed.

Fixes: d594d2a052 ("intel/tools: use device info initializer")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-12 19:18:26 -07:00
Rafael Antognolli
f0d29238df i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.
If the pixel pipes have a different number of subslices, emit a slice
hashing table that will ensure proper workload distribution.

v2: Set Mask field to 0xffff for workaround (Ken).
2019-08-12 16:19:08 -07:00
Rafael Antognolli
7bc022b4bb anv/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.
If the pixel pipes have a different number of subslices, emit a slice
hashing table that will ensure proper workload distribution.

v2: Don't need to set the mask - it's mbo (Ken).
2019-08-12 16:19:08 -07:00
Rafael Antognolli
a1a499e7fe iris/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.
If the pixel pipes have a different number of subslices, emit a slice
hashing table that will ensure proper workload distribution.

v2: Don't need to set the mask - it's mbo (Ken).
v3: Don't keep a reference to the resource used for emitting the table
(Ken).
2019-08-12 16:19:08 -07:00
Rafael Antognolli
ad513fd386 intel: Get information about pixel pipes subslices.
v2: Use 1 instead of 1UL (Ken).
2019-08-12 16:19:08 -07:00
Rafael Antognolli
32344dc581 intel/gen_decoder: Decode SLICE_HASH_TABLE. 2019-08-12 16:19:08 -07:00
Rafael Antognolli
e1cb71c182 intel/genxml: Update 3D_MODE and add SLICE_HASH_TABLE.
Add these fields and the 3DSTATE_SLICE_TABLE_STATE_POINTERS instruction
so we can properly configure the slice and subslice hashing on ICL+

v2: Make 'Mask' field a mbo (Ken).
2019-08-12 16:19:08 -07:00
Jason Ekstrand
d787a2d05e anv: Implement VK_KHR_pipeline_executable_properties
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
67cb55ad11 anv: Add a ralloc context to anv_pipeline
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
fec4bdff40 anv: Force a full re-compile when CAPTURE_INTERNAL_REPRESENTATION_TEXT is set
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
651fbbf9b8 anv/pipeline: Split setting up per-stage keys into its own loop
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
78f3dfb4a2 anv: Record shader compile stats in the pipeline cache
We're going to want these to be available regardless of caching.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
2af380d20f anv/pipeline: Stash generated code in the pipeline stage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
8d3cbd0393 intel/fs: Add SLM size to brw_cs_prog_data
We don't need it for state setup but it's a useful statistic we want to
pass on to developers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Jason Ekstrand
134607760a intel/compiler: Fill a compiler statistics struct
This commit is all annoying plumbing work which just adds support for a
new brw_compile_stats struct.  This struct provides a binary driver
readable form of the same statistics we dump out to stderr when we
INTEL_DEBUG is set with a shader stage.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Khaled Emara
2720ad5fd9 freedreno: disable tiling for cubemaps
Tiling doesn't work quite well with cubemaps.
Revert to linear textures, until it's fixed.
2019-08-12 22:30:54 +00:00
Khaled Emara
0ae16fb565 freedreno: add tiling parameters for 2D/2DArray/3D 2019-08-12 22:30:54 +00:00
Khaled Emara
aeaba3e4a6 freedreno: simplified slices setup for a3xx
a3xx doesn't support ASTC and layout_first always returns false
2019-08-12 22:30:54 +00:00
Khaled Emara
e11a239e8c freedreno: enable tiled textures for debug builds 2019-08-12 22:30:54 +00:00
Paulo Zanoni
866bb775de intel/fs: add 64 bit integer multiplication lowering
While NIR's lower_imul64() solves the case of 64 bit integer multiplications
generated early, we don't have a way to lower such instructions when they are
generated by our own backend, such as the scan/reduce intrinsics. We'll need
this soon, so implement it now.

An easy way to test this is to simply disable nir_lower_imul64 to let
those operations reach the backend.

v2:
  - Fix Q/UQ copy/paste errors (Caio).
  - Transform an 'if' into 'else if' (Caio).
  - Add an extra comment to clarify the need for 64b = 32b * 32b
    (Caio).
  - Make private functions private (Caio).
v3:
  - Remove ambiguity with 'b' and 'd' variables (Caio).
  - Allocate potentially less regs for the dwords (Caio).

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Matt Turner <matt.turner@intel.com>
Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-08-12 15:16:23 -07:00
Paulo Zanoni
9217cf3b5e intel/compiler: invert the logic of lower_integer_multiplication()
Invert the logic of how progress is handled: remove the continue
statements and mark progress inside the places where it actually
happens.

We're going to add a new lowering that also looks for BRW_OPCODE_MUL,
so inverting the logic here makes the resulting code much easier to
follow.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-08-12 15:16:23 -07:00
Paulo Zanoni
6ba4717924 intel/compiler: don't instantiate a builder for each instruction
Don't instantiate a builder for each instruction during
lower_integer_multiplication(). Instantiate one only when needed.

On the other hand, these unneeded builders don't seem to cost much to
init, so I don't expect any significant difference in performance:
this is mostly about code organization.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-08-12 15:16:23 -07:00
Paulo Zanoni
75b3868dcc intel/compiler: extract subfunctions of lower_integer_multiplication()
The lower_integer_multiplication() function is already a little too
big. I want to add more to it, so let's reorganize the existing code
first. Let's start with just extracting the current code to
subfunctions. Later we'll change them a little more.

v2: Make private functions private (Caio).
v3: Fix typo (Caio).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-08-12 15:16:23 -07:00
Rhys Perry
7740149852 nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo
v2: add to series
v3: update Makefile.sources
v4: don't remove a comment and break statement
v4: use nir_can_move_instr

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 22:01:30 +00:00
Rhys Perry
da8ed68aca nir: replace nir_move_load_const() with nir_opt_sink()
This is mostly the same as nir_move_load_const() but can also move
undef instructions, comparisons and some intrinsics (being careful with
loops).

v2: actually delete nir_move_load_const.c
v3: fix nir_opt_sink() usage in freedreno
v3: update Makefile.sources
v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def
v4: handle if uses
v4: fix handling of nested loops
v5: re-write adjust_block_for_loops
v5: re-write setting of use_block for if uses

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 22:01:30 +00:00
Francisco Jerez
c2fe7a0fb8 anv/gen9: Optimize slice and subslice load balancing behavior.
See "i965/gen9: Optimize slice and subslice load balancing behavior."
for the rationale.  According to Jason, improves Aztec Ruins
performance by 2.7%.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)

v2: Undo CPU performance micro-optimization done in i965 and iris due
    to lack of data justifying it on anv.  Use
    cmd_buffer_apply_pipe_flushes wrapper instead of emitting pipe
    control command directly.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-12 14:40:21 -07:00
Andreas Baierl
1c45541c7f lima/ppir: Add fddx and fddy
Lower fddx and fddy and set the right bits in codegen.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-08-12 23:20:04 +02:00
Bas Nieuwenhuizen
f1da129220 radv: Enable VK_KHR_pipeline_executable_properties.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen
afad67cd7a radv: Implement radv_GetPipelineExecutableStatisticsKHR.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen
35302f0189 radv: Implement radv_GetPipelineExecutableInternalRepresentationsKHR.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen
86864eedd2 radv: Implement radv_GetPipelineExecutablePropertiesKHR.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen
8874af8ef4 radv: Keep shader info when needed.
This allows enabling the shader info keeping on a per shader basis.
Also disables the cache on a per shader basis.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen
e8a256eb54 radv: Add VK_KHR_pipeline_executable_properties in disabled state.
So we can add the functions.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen
5444d3e0c2 radv: Use string for nir dumping.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Allows us to easily dump all nir shaders for combined variants in
vega and simplifies ownership.
2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen
739a2880f5 radv: Get max workgroup size without nir.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen
290ca0c4dd radv: Add utility function to calculate max waves.
Not AC because a lot of it is data extraction out of radv structs.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 23:00:24 +02:00
Francisco Jerez
026773397b iris/gen9: Optimize slice and subslice load balancing behavior.
See "i965/gen9: Optimize slice and subslice load balancing behavior."
for the rationale.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-12 13:17:58 -07:00
Francisco Jerez
03cba9f5d9 intel/genxml: Add GT_MODE hashing defs for Gen9.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-12 13:17:58 -07:00
Francisco Jerez
9406b3a5c1 i965/gen9: Optimize slice and subslice load balancing behavior.
The default pixel hashing mode settings used for slice and subslice
load balancing are far from optimal under certain conditions (see the
comments below for the gory details).  The top-of-the-line GT4 parts
suffer from a particularly severe performance problem currently due to
a subslice load balancing issue.  Fixing this seems to improve
graphics performance across the board for most of the benchmarks in my
test set, up to ~20% in some cases, e.g. from SKL GT4:

unigine/valley:                3.44% ±0.11%
gfxbench/gl_manhattan31:       3.99% ±0.13%
gputest/pixmark_piano:         7.95% ±0.33%
synmark/OglTexFilterAniso:    15.22% ±0.07%
synmark/OglTexMem128:         22.26% ±0.06%

Lower-end platforms are also affected by some subslice load imbalance
to a lesser degree, especially during CCS resolve and fast clear
operations, which are handled specially here due to rasterization
ocurring in reduced CCS coordinates, which changes the semantics of
the pixel hashing mode settings.

No regressions seen during my tests on some SKL, KBL and BXT
configurations.  Additional benchmark reports welcome on any Gen9
platforms (that includes anything with Skylake, Broxton, Kabylake,
Geminilake, Coffeelake, Whiskey Lake, Comet Lake or Amber Lake in your
renderer string).

P.S.: A similar problem is likely to be present on other non-Gen9
      platforms, especially for CCS resolve and fast clear operations.
      Will follow-up with additional patches fixing the hashing mode
      for those once I have enough performance data to justify it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-12 13:17:58 -07:00
Alyssa Rosenzweig
b1965831e4 pan/midgard: Handle 64-bit address in mir_mask_of_read_components
This is a bit of a hack, but it'll hold us over until we have 64-bit
support wired through.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig
41e68094f8 pan/midgard: Allocate separate spill indices for lowered moves
This helps RA be slightly more reasonable.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig
14b5b9ac38 pan/midgard: Extend liveness analysis to trinary ops
Fixes RA fails with multiple indirect SSBO writes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig
c690b37d76 pan/midgard: Fix load/store pairing
This used a delicate hack to try to find indirect inputs and skip them
as candidates for pairing. Let's use a better criterion -- no sources --
and pair based on that.

We could do better, but that would require more complex data flow
analysis than we're interested in doing here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig
15954ab6ca pan/midgard: Implement nir_intrinsic_load_num_work_groups
Just a sysval to route through.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig
7229af794b pan/midgard: Implement some compute builtins
We implement gl_WorkGroupID and gl_LocalInvocationID, which map to
ld_compute_id with special sources.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig
2b4e579585 pan/midgard: Rename ld_global_id -> ld_compute_id
It's used for more general loads within a compute shader.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig
a5059f2cba pan/midgard: Handle partial writes in liveness analysis
This allows liveness analysis within a loop to be more fine grained,
fixing RA failures with partial spilled movs within a loop, as well as
enabling a slight reduction of register pressure more generally:

total registers in shared programs: 350 -> 347 (-0.86%)
registers in affected programs: 12 -> 9 (-25.00%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
e333bf606f pan/midgard: Dump "no spill"?
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
cc3df917d3 pan/midgard: Absorb nonexistance sources
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
0a7cc239bd pan/midgard: Pretty-print destinations
They're not "sources" but they follow the same conventions.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
ba8ec19a64 pan/midgard: Pretty-print units
Since we are seeing some use of MIR post-scheduling, let's get this
printed right.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
73f54f286a pan/midgard: Print mask in dumped MIR
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
2ec4f9a74b pan/midgard: Add no_spill flag
Hint for the RA to avoid infinite spilling loops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
7090971f2f pan/midgard: Generalize mir_mask_of_read_components
This now works for load/store and texture instructions as well as ALU.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
419ddd63b0 pan/midgard: Implement SSBO access
Just laying the groundwork. Reads and writes should be supported (both
direct and indirect, either int or float, vec1/2/3/4), but no bounds
checking is done at the moment.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig
a8639b91b5 pan/midgard: Pipe uniform mask through when spilling
This is a corner case that happens a lot with SSBOs. Basically, if we
only read a few components of a uniform, we need to only spill a few
components or otherwise we try to spill what we spilled and RA hangs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:00 -07:00
Alyssa Rosenzweig
63e240dd05 pan/midgard: Clamp sysval component count
We don't want to load a 128-bit sysval when 64-bits will do. Fixes RA
failures with SSBO indirect writes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig
e7ac46be7a pan/midgard: Pass uploaded midgard_instruction through
We want to edit it after emission in some cases.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig
fa68740187 pan/midgard: Allow sysval destination override
Sometimes a sysval is used to facilitate an instruction but is not the
instruction itself.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig
60d80157d1 panfrost: Force flush every compute job
This is of course suboptimal for performance, forcing each
glDispatchCompute call to be submitted separately to the kernel and
finish to completion. However, for the initial bring-up of compute jobs,
this simplifies quite a bit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig
2efa025b05 panfrost: Add SSBO system value
For each SSBO index we get from Gallium/NIR, we need two pieces of
information in the shader:

1. The address of the SSBO in GPU memory. Within the shader, we'll be
accessing it with raw memory load/store, so we need the actual address,
not just an index.

2. The size of the SSBO. This is not strictly necessary, but at some
point, we may like to do bounds checking on SSBO accesses.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig
e881aa8c12 gallium/util: Add u_stream_outputs_for_vertices helper
This u_prim.h helper determines the number of outputs for stream output,
given a particular primitive type and a vertex count. This is useful for
statically calculating sizes of stream output buffers (i.e. when there
is no geometry/tessellation shader in use).

This helper will be used in Panfrost's transform feedback
implementation, as you can probably guess since why else would I be
submitting it....

See also dEQP's getTransformFeedbackOutputCount routine.

v2: Simplify definition using new helpers, which also extends to non-ES2
primitive types (Eric).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 12:22:54 -07:00
Marek Olšák
8ce4f9bbc3 radeonsi: remove the always_nir option
tgsi_to_nir is no longer optional if NIR is enabled.
2019-08-12 14:52:17 -04:00
Marek Olšák
4e545f934f radeonsi/nir: implement default tess level system values
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
9c7746ceae compiler: add SYSTEM_VALUE_TESS_LEVEL_OUTER/INNER_DEFAULT
TCS system values for internal passthru TCS, needed by radeonsi NIR support

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
5167ca27fa gallium: add TGSI_SEMANTIC_DEFAULT_OUTER/INNER_LEVEL
for radeonsi NIR support.
2019-08-12 14:52:17 -04:00
Marek Olšák
f8d4198998 tgsi_to_nir: handle tess level inner/outer varyings
for internal radeonsi shaders

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
8ac2583cd8 tgsi_to_nir: add support for the stencil FS output
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
f3f1d0dfd0 tgsi_to_nir: add support for TEX_LZ
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 14:52:17 -04:00
Marek Olšák
1b881852bc compiler: add SYSTEM_VALUE_USER_DATA_AMD
for internal radeonsi shaders
2019-08-12 14:52:17 -04:00
Marek Olšák
f0ccc5457a compiler: add shader_info.cs.user_data_components_amd 2019-08-12 14:52:17 -04:00
Marek Olšák
155789c8e7 tgsi_to_nir: add basic compute shader support
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
5a0adfd9f0 tgsi_to_nir: add support for LOAD & STORE with SSBOs and images
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
0b121cb89a tgsi_to_nir: make setup_texture_info reusable
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 14:52:17 -04:00
Marek Olšák
70fd85172b tgsi_to_nir: add support for TXF_LZ
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
028dbd35ba compiler: add shader_info.vs.blit_sgprs_amd
for internal radeonsi shaders
2019-08-12 14:52:17 -04:00
Marek Olšák
e300365197 tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)
v2: squash with Timur Kristof's commit

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
8b6814211a tgsi/scan: don't set GS_INVOCATIONS for all shader stages
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
9fb2fd0b43 compiler: add ACCESS_STREAM_CACHE_POLICY
radeonsi will use this.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
902dd50cf0 gallium: add AMD-specific compute TGSI enums
for tgsi_to_nir
2019-08-12 14:52:17 -04:00
Marek Olšák
6a2bdb8d01 gallium: add TGSI_PROPERTY_VS_BLIT_SGPRS_AMD for tgsi_to_nir
needed by radeonsi NIR support
2019-08-12 14:52:17 -04:00
Marek Olšák
d1ad4fda31 st/mesa: don't allocate mipmapped texture for NEAREST_MIPMAP_LINEAR
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-08-12 14:52:17 -04:00
Kenneth Graunke
5180a222c0 glsl: Optimize the SoftFP64 shader when first creating it.
By optimizing the shader before inlining, we avoid having to redo this
work for each inlined copy of a function.  It should also reduce the
memory consumption a bit.

This cuts the KHR-GL46.arrays_of_arrays_gl.SubroutineFunctionCalls2
runtime by 25% on my Icelake.  That test compiles many shaders, which
contain large types (dmat4) and division (expensive operations).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-12 10:42:32 -07:00
Christian Gmeiner
914ecc9384 etnaviv: fix compile warnings in release build
[27/31] Compiling C object 'src/gallium/drivers/etnaviv/df32d18@@etnaviv@sta/etnaviv_compiler_nir.c.o'.
In file included from ../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c:552:
../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir_emit.h: In function 'ra_assign':
../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir_emit.h:903:9: warning: unused variable 'ok' [-Wunused-variable]
    bool ok = ra_allocate(g);
         ^~
../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c: In function 'etna_compile_shader_nir':
../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c:663:9: warning: unused variable 'ok' [-Wunused-variable]
    bool ok = emit_shader(c->nir, &options, &v->num_temps, &num_consts);
         ^~

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-12 16:58:13 +00:00
Bas Nieuwenhuizen
e040c1b274 radv: Do not setup attachments without a framebuffer.
Test that found this: dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer

Fixes: 49e6c2fb78 "radv: Store color/depth surface info in attachment info instead of framebuffer."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 17:19:24 +02:00
Jason Ekstrand
14c96a6300 anv: Implement VK_EXT_subgroup_size_control version 2
The version bump adds a proper features struct.

Fixes: d10de25309 "anv: Implement VK_EXT_subgroup_size_control"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-12 14:56:33 +00:00
Jason Ekstrand
8aef89cc2d vulkan: Update the XML and headers to 1.1.119
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 14:56:33 +00:00
Bas Nieuwenhuizen
d062bec48d radv: Hash Wave32 settings in shader key.
Can result in different shaders.

Fixes: 8a86908e9a "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 13:32:18 +00:00
Bas Nieuwenhuizen
ba8d3c362b radv: Properly use Wave64 for non-NGG GS and copy shader.
Fixes: 8a86908e9a "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 13:32:18 +00:00
Bas Nieuwenhuizen
035406ecf7 radv: Put wave size in shader options/info.
Instead of having the three values everywhere. This is also more
future proof if we want the driver to make those decisions eventually.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 13:32:18 +00:00
Bas Nieuwenhuizen
71621e877f relnotes: Make entries for radv more consistent.
Always use 'on' as for the rest of the drivers.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 13:29:27 +00:00
Bas Nieuwenhuizen
38961729a8 relnotes: Add new exts on radv for 19.2.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 13:29:27 +00:00
Tapani Pälli
d4b574f26a iris: reorder arguments as expected by the function
CID: 1452262
Fixes: b4c54894bb "iris: Handle vertex shader with window space position"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
2019-08-12 13:08:26 +03:00
Tapani Pälli
590ba15d6e iris/android: move iris_query.c to 'per gen' LIBIRIS_SRC_FILES
Fixes Iris build on Android.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-08-12 10:06:36 +03:00
Kenneth Graunke
0f3768bc5d iris: Free query on error path
CID: 1452276
2019-08-11 14:04:31 -07:00
Kenneth Graunke
661be3fef9 iris: Add missing 'break'
We don't want to fall through to unreachable().

CID: 1452277
2019-08-11 14:04:31 -07:00
Caio Marcelo de Oliveira Filho
5ed4e31c08 spirv: Drop lower_workgroup_access_to_offsets
Intel drivers are not using this anymore, and turnip still don't have
Compute Shaders, so won't make a difference to stop using this option.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@chromium.org>
2019-08-10 22:15:35 -07:00
Caio Marcelo de Oliveira Filho
925e9142bd i965/spirv: Lower shared memory later
Instead of asking spirv_to_nir to lower the workgroup (shared memory)
to offsets, keep them as derefs longer, then lower it later on.

Because Workgroup memory doesn't have explicit offsets, we need to set
those using nir_lower_vars_to_explicit_types before calling the I/O
lowering pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-10 22:15:35 -07:00
Danylo Piliaiev
61d6be84f3 i965: Use force_compat_profile driconf option
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-10 11:39:29 -07:00
Eric Engestrom
d7eb40962b i965: fix mem leak in error path
Fixes: 8ae6667992 ("intel/perf: move query_object into perf")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2019-08-10 12:14:56 +01:00
Eric Engestrom
1c82fa0a92 gitlab-ci: simplify $CROSS option
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-08-10 12:11:28 +01:00
Kenneth Graunke
f1dba99639 iris: minor restyling 2019-08-10 00:16:45 -07:00
Mark Janes
9c597514d4 iris/query: enable amd performance monitors
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:34 -07:00
Mark Janes
469af7fdc9 iris/perf: get monitor results
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:32 -07:00
Mark Janes
1cb4fc184f iris/perf: add begin/end hooks
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:24 -07:00
Mark Janes
8c4c346665 iris/perf: add delete query
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:17 -07:00
Mark Janes
aca42759ff iris/perf: implement iris_create_monitor_object
This is the first call that provides the iris context to the monitor
implementation.  On the first call, use the iris context to initialize
the monitor context.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:14 -07:00
Mark Janes
0fd4359733 iris/perf: implement routines to return counter info
With this commit, Iris will report that AMD_performance_monitor is
supported, and will allow the caller to query the available metrics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:03 -07:00
Eric Engestrom
e4aa0fc63a anv: add missing break
Fixes: f6e7de41d7 ("anv: Implement VK_EXT_line_rasterization")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-09 23:34:31 +01:00
Lionel Landwerlin
e2d761de03 util: drop final reference to p_compiler.h
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:59:43 +03:00
Lionel Landwerlin
85bf1dc2de util: os_misc: drop p_compiler.h include
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:59:43 +03:00
Lionel Landwerlin
c44c3948c7 util: u_math: drop p_compiler.h include
This file was moved from gallium so drop depending on gallium headers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:59:43 +03:00
Lionel Landwerlin
8818db8f2c vc4: prepare for p_compiler.h dependency removal
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:59:43 +03:00
Lionel Landwerlin
8a884a25c5 amd: prepare dropping include of p_compiler.h
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:59:43 +03:00
Lionel Landwerlin
a233a3a74e mesa: be consistent on GL_TRUE/GL_FALSE & TRUE/FALSE
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:59:43 +03:00
Lionel Landwerlin
8f4dea20fc mesa: drop some p_compiler.h types
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:50:29 +03:00
Lionel Landwerlin
7abac7a8bf mesa: add stddef include in preparation for dropping p_compiler.h
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:50:17 +03:00
Lionel Landwerlin
6637395073 panfrost: prepare for p_compiler.h dependency removal
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:50:03 +03:00
Lionel Landwerlin
351c2ad157 i965: don't use p_compiler.h types
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:49:48 +03:00
Eric Engestrom
9be5ce1d73 gitlab-ci: generate meson cross-files earlier
Suggested-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-09 20:07:50 +01:00
Alyssa Rosenzweig
9bc99e60a8 panfrost: Assign varying buffers dynamically
Rather than hardcoding certain varying buffer indices "by convention",
work it out at draw time. This added flexibility is needed for
futureproofing and will be enable streamout.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig
46dae9ef58 panfrost: Assign indices at draw-time
This will allow us to shuffle buffers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig
af6d3f7cb5 panfrost: Break out pan_varyings.c
This code is fairly self-contained, so let's factor it out of the giant
pan_context.c monster.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig
4dba493fd7 panfrost: Enable PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS
Just as easy/hard as the rest of XFB.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig
5ff7973560 panfrost: Import streamout data structures
Pretty much copypasted from v3d to jumpstart us.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig
c82672c9c1 pan/midgard: Account for swizzle/mask in st_vary
Register allocation for varying stores is a bit different, since the
instructions ignore the writemask (varyings are normalized
packed/vectorized..)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:50:45 -07:00
Alyssa Rosenzweig
5ad83015cd pan/decode: Resolve crash with NULL attr/varyings
This case needs more investigation, but this was found with geometry
shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:50:45 -07:00
Krzysztof Raszkowski
c0ab268f9c gallium/swr: Fix glClear when it's used with glEnable/glDisable GL_SCISSOR_TEST
When GL_SCISSOR_TEST is enabled glClear is handled by state tracker
and there is no need to do this in gallium driver.

Reviewed-by: Alok Hota alok.hota@intel.com
2019-08-09 18:56:13 +02:00
Gurchetan Singh
d6f8ce1c96 util: Revert "util: added missing headers in anon-file"
This reverts commit c73988300f.

Reason: Made a fix for this, then saw @eric's change
        ("util/anon_file: add missing"), but some sequence of events
        I don't really remember caused this to get merged. So revert ;-)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 09:13:45 -07:00
Marek Vasut
bb47bedc85 etnaviv: Remove etna_bo_from_handle() prototype
Remove etna_bo_from_handle() as there are no known users.

Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-08-09 17:21:55 +02:00
Lionel Landwerlin
cefb4341b7 anv: drop unused code
We stopped using this when we moved to Jason's mi_builder.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-09 17:01:38 +03:00
Christian Gmeiner
889e752965 etnaviv: fix typo
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-09 13:08:20 +00:00
Christian Gmeiner
de5070ea8d etnaviv: add gpu_supports_texture_target(..)
Currently I am seeing a handful of the following debug message:
translate_texture_target:495: Unhandled texture target: 0

PIPE_BUFFER is not handled in translate_texture_target(..) which makes
sense as it is used to translate from PIPE_XXX to GPU specific value
during etna_create_sampler_view_state(..).

To fix this problem introduce gpu_supports_texture_target(..) which just
checks if the texture target is supported.

Fixes: dfe048058f ("etnaviv: support 3D and 2D array textures")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-09 13:08:20 +00:00
Jon Turney
0141b7c6b2 util: Cygwin has linux-style pthread_setname_np
Fixes: dcf9d91a ("util: Handle differences in pthread_setname_np")
2019-08-09 12:46:43 +00:00
Tapani Pälli
5e38db0c47 anv/android: disable shared representable image support explicitly
Android 9 loader conditionally advertises VK_KHR_shared_presentable_image
extension based on this property and it looks like it does not
initialize the struct before query.

Pragmas are added to ignore warnings with Android specific structure
types in same manner as commit 8d386e6eef  did.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-09 08:53:54 +03:00
Vasily Khoruzhick
39a90749af lima: introduce a struct describing texture descriptor
Use a struct with bitfields to construct texture descriptor
instead of poking bits in array of uint32_t. It improves code
readability and makes it easier to experiment with unknown fields.

Also fix mipmapping while we're at it - Utgard can have up to 13
levels, but 64 bytes is enough only for 10. Calculate descriptor
size dynamically to account extra levels if we need them.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-08 19:17:20 -07:00
Vasily Khoruzhick
edf008c04e lima: add texel format table
Introduce a table for supported texel formats and use it to check
whether format is supported and for converting pipe format to lima
texel format.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-08 19:17:20 -07:00
Gurchetan Singh
c73988300f util: added missing headers in anon-file
Otherwise I get:

../src/util/anon_file.c: In function ‘create_tmpfile_cloexec’:
../src/util/anon_file.c:75:9: error: implicit declaration of function ‘mkostemp’
[-Werror=implicit-function-declaration]
    fd = mkostemp(tmpname, O_CLOEXEC);
         ^~~~~~~~

../src/util/anon_file.c:133:7: error: implicit declaration of function ‘asprintf’
[-Werror=implicit-function-declaration]
       asprintf(&name, "%s/mesa-shared-%s-XXXXXX", path, debug_name);
       ^~~~~~~~
../src/util/anon_file.c:141:4: error: implicit declaration of function ‘free’
[-Werror=implicit-function-declaration]
    free(name)

Fixes: c0376a ("util: add anon_file.h for all memfd/temp file usage")
2019-08-08 16:21:57 -07:00
Gurchetan Singh
42759dc986 virgl: check scanout mask
Otherwise, virgl will report renderable or texturable formats as
also scan-out formats.

v2: drop host feature check (@kusma)

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-08-08 16:21:57 -07:00
Gurchetan Singh
3da029ac1a virgl: fixup_readback_format --> fixup_formats
This function is generalizable.

Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-08-08 16:21:57 -07:00
Gurchetan Singh
bf0ca99ec7 virgl: access caps in a less verbose way in virgl_is_format_supported
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-08-08 16:21:57 -07:00
Alyssa Rosenzweig
5a898e2a65 pan/midgard: Disassemble load/store barrel shift
Arm assembly intensifies.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 15:49:12 -07:00
Eric Engestrom
525a917c6c util/anon_file: const string param
Fixes: c0376a1234 ("util: add anon_file.h for all memfd/temp file usage")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Eric Anholt <eric@anholt.net>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
2019-08-08 22:02:54 +01:00
Eric Engestrom
8a028b0df2 util/anon_file: drop unused #include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Eric Anholt <eric@anholt.net>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
2019-08-08 22:02:54 +01:00
Eric Engestrom
60af7f5a81 util/anon_file: add missing #include
Fixes: c0376a1234 ("util: add anon_file.h for all memfd/temp file usage")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Eric Anholt <eric@anholt.net>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
2019-08-08 22:02:54 +01:00
Greg V
ac1561088d intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 134e750e16 ("i965: extract performance query metrics")
2019-08-08 21:44:33 +01:00
Greg V
0233372581 util: fix cpuset support on FreeBSD
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-08 21:44:33 +01:00
Greg V
c00ee00031 i965/tiled_memcpy: avoid creating bswap32 if it exists as a macro (e.g. on FreeBSD)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-08 21:44:33 +01:00
Greg V
7b520dc74f anv: add MAP_POPULATE fallback define for portability
FreeBSD does not have MAP_POPULATE

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-08 21:44:33 +01:00
Greg V
2be3f16600 anv: remove unused Linux-specific include
Fixes: 4201cc2dd3 ("anv: Implement VK_KHX_external_semaphore_fd")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-08 21:44:33 +01:00
Greg V
c0dc5c1859 meson: define ETIME to ETIMEDOUT if not present
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-08 21:44:33 +01:00
Roman Stratiienko
28061e0ab0 lima: Fix Android.mk
1. Update LOCAL_SRC_FILES according to commit
54434fe670 ("lima/gpir: Rework the scheduler").

2. Add libpanfrost_shared.a dependency.

3. Generate lima_nir_algebraic.c with Android.mk
Fixes Android build error introduced by commit 5adfc8602c
("lima/ppir: move sin/cos input scaling into NIR")

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Qiang Yu <yuq825@gmail.com>
2019-08-08 17:47:22 +00:00
Roman Stratiienko
26a01a6797 Add libpanfrost_shared to Android build
1. Add missing directory to ./Android.mk
2. Fix ./src/panfrost/Android.shared.mk

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Qiang Yu <yuq825@gmail.com>
2019-08-08 17:47:22 +00:00
Rhys Perry
c52c54a746 anv,i965,iris: deduplicate setting of total_shared
v5: add patch

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Rhys Perry
024a46a407 anv: use derefs for shared memory access
vkpipeline-db for my Skylake GPU:
total instructions in shared programs: 8847602 -> 8847896 (<.01%)
instructions in affected programs: 10165 -> 10459 (2.89%)
helped: 8
HURT: 2

total cycles in shared programs: 1606273555 -> 1606251634 (<.01%)
cycles in affected programs: 2201803 -> 2179882 (-1.00%)
helped: 7
HURT: 3

The shaders with more instructions is due to a loop over a shared array
in Three Kingdoms being unrolled (and creating a lot of nested ifs). Not sure
if that's good or bad.

One of the shaders with worse cycles is only worse by 0.04% and the other
two are the shaders with loops unrolled.

v2: add patch
v4: don't set spirv_options.shared_addr_format
v4: move comment concerning the shared address format used and NULL
v4: add vkpipeline-db results
v5: rename to nir_lower_vars_to_explicit_types
v5: move setting of total_shared to outside brw_compile_cs
v6: set shared_addr_format
v6: formatting changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Rhys Perry
fd73ed1bd7 nir: add nir_lower_to_explicit()
v2: use glsl_type_size_align_func
v2: move get_explicit_type() to glsl_types.cpp/nir_types.cpp
v2: use align() instead of util_align_npot()
v2: pack arrays a bit tighter
v2: rename mem_* to field_*
v2: don't attempt to handle when struct offsets are already set
v2: use column_type() instead of recreating it
v2: use a branch instead of |= in nir_lower_to_explicit_impl()
v2: assign locations to variables and update shared_size and num_shared
v2: allow the pass to be used with nir_var_{shader_temp,function_temp}
v4: rebase
v5: add TODO
v5: small formatting changes
v5: remove incorrect assert in get_explicit_type()
v5: rename to nir_lower_vars_to_explicit_types
v5: correctly update progress when only variables are updated
v5: rename get_explicit_type() to get_explicit_shared_type()
v5: add comment explaining how get_explicit_shared_type() is different
v5: update cast strides
v6: update progress when lowering nir_var_function_temp variables
v6: formatting changes
v6: add more detailed documentation comment for get_explicit_shared_type
v6: rename get_explicit_shared_type to get_explicit_type_for_size_align
v7: fix comment in nir_lower_vars_to_explicit_types_impl()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Rhys Perry
8bd2e138f5 nir/lower_explicit_io: add nir_var_mem_shared support
v2: require nir_address_format_32bit_offset instead
v3: don't call nir_intrinsic_set_access() for shared atomics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Erik Faye-Lund
1e21bb4123 mesa: avoid warning on Windows
On Windows, p_atomic_inc_return returns an unsigned long long rather
than the type the pointer refers to, so let's make sure we cast the
result to the right type. Otherwise, we'll trigger a warning about
the wrong format-string for the type.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-08 18:20:29 +02:00
Erik Faye-Lund
e0a740c633 mesa/main: cast away constness
This avoids a warning about implicitly casting away the constness of the
pointer.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-08 18:20:29 +02:00
Erik Faye-Lund
75097114d9 spirv: fixup signature
This avoids a warning on some compiler, complaining about implicitly
casting the function-pointer.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: d482a8f "spirv: Update the OpenCL.std.h header"
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-08 18:20:29 +02:00
Lucas Stach
68c24b09c2 etnaviv: remember data offset into BO
Imported resources might not start at offset 0 into the buffer object.
Make sure to remember the offset that is provided with the handle on
import.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-08-08 16:11:34 +02:00
Danylo Piliaiev
b8842bc312 i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D
There is an object-level  preemption workaround which requires this.
However, even without object-level preemption, we seem to have issues
with geometry flickering when 3D and compute are combined in the same
batch and this appears to fix it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2019-08-08 13:39:15 +00:00
Bas Nieuwenhuizen
23a9d20997 radv: Avoid VEGA/RAVEN scissor bug in binning.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-08 14:08:21 +02:00
Bas Nieuwenhuizen
4a3f987afd radv: Avoid binning RAVEN hangs.
Mirroring radeonsi.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-08 14:08:21 +02:00
Bas Nieuwenhuizen
66ecc3eac8 radv: Fix off by one for S_028C48_MAX_ALLOC_COUNT.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-08 14:08:21 +02:00
Jan Zielinski
207026d29e swr/rasterizer: modernize thread TLB
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 12:33:21 +02:00
Jan Zielinski
387599a661 swr/rasterizer: Refactor events collection mechanism
Several improvements and cleanups in events and statstics mechanisms

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 11:15:07 +02:00
Jan Zielinski
ff75c35846 swr/rasterizer: improvements in simdlib
1. fix build issues with MSVC 2019 compiler

The MSVC 2019 compiler seems to have an issue with optimized code-gen
when using the _mm256_and_si256() intrinsic.
Only disable use of integer vpand on buggy versions MSVC 2019.
Otherwise allow use of integer vpand intrinsic.

2. Remove unused vec/matrix functionality

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:53:47 +02:00
Jan Zielinski
b55a93fdd4 swr/rasterizer: Events are now grouped and enabled by knobs
All events are now grouped as follows:

-Framework (i.e. ThreadStart) [always ON]
-Api (i.e. SwrSync) [always ON]
-Pipeline [default ON]
-Shader [default ON]
-SWTag [default OFF]
-Memory [default OFF]

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:33:25 +02:00
Jan Zielinski
982d99490f swr/rasterizer: do not mark tiles dirty until actually rendered
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:16:20 +02:00
Jan Zielinski
4f04f260d9 swr/rasterizer: enable size accumulation in mem stats
Small refactoring is also performed

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:16:20 +02:00
Jan Zielinski
365ad367f1 swr/rasterizer: enable using AOS vertex data format
Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-08 10:16:20 +02:00
Iago Toral Quiroga
fb9f7872e7 v3d: handle wait requirement when retrieving query results correctly
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga
0f2d1dfe65 v3d: use the GPU to record primitives written to transform feedback
We can use the PRIMITIVE_COUNTS_FEEDBACK packet to write various primitive
counts to a buffer, including the number of primives written to transform
feedback buffers, which will handle buffer overflow correctly.

There are a couple of caveats with this:

Primitive counters are reset when we emit a 'Tile Binning Mode Configuration'
packet, which can happen in the middle of a primitives query, so we need to
read the buffer when we submit a job and accumulate the counts in the context
so we don't lose them.

We also need to do the same when we switch primitive type during transform
feedback so we can compute the correct number of recorded vertices from
the number of primitives. This is necessary so we can provide an accurate
vertex count for draw from transform feedback.

v2:
 - When computing the number of vertices for a primitive, pass in the base
   primitive, since that is what the hardware will count.
 - No need to update primitive counts when switching primitive types if
   the base primitives are the same.
 - Log perf warning when mapping the primitive counts BO for readback (Eric).
 - Only emit the primitive counts packet once at job end (Eric).
 - Use u_upload mechanism for the primitive counts buffer (Eric).
 - Use the XML to generate indices into the primitive counters buffer (Eric).

Fixes piglit tests:
spec/ext_transform_feedback/overflow-edge-cases
spec/ext_transform_feedback/query-primitives_written-bufferrange
spec/ext_transform_feedback/query-primitives_written-bufferrange-discard
spec/ext_transform_feedback/change-size base-shrink
spec/ext_transform_feedback/change-size base-grow
spec/ext_transform_feedback/change-size offset-shrink
spec/ext_transform_feedback/change-size offset-grow
spec/ext_transform_feedback/change-size range-shrink
spec/ext_transform_feedback/change-size range-grow
spec/ext_transform_feedback/intervening-read prims-written

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga
cf8986bce0 gallium/util: add a helper to compute vertex count from primitive count
v2:
  - Only compute vertex counts for base primitives.
  - Add a unit test (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga
9eb8699e0f v3d: be more explicit about the query types supported
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga
9b316ab57a v3d: generate packet unpack functions
These were not being compiled because of the lack of __gen_unpack_address.

v2:
 - Shift raw address correctly (Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Iago Toral Quiroga
5ffb8b1716 v3d: add header guards in v3d_packet_helpers.h
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-08 08:36:52 +02:00
Tomeu Vizoso
e7eac8a1e8 panfrost: Print errors from kernel
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso
7c8434889d panfrost: Mark buffers as PANFROST_BO_HEAP
What we call GROWABLE in Mesa corresponds to the HEAP BO flag in the
kernel. These buffers cannot be memory mapped in the CPU side at the
moment, so make sure they are also marked INVISIBLE.

This allows us to allocate a big heap upfront (16MB) without actually
reserving space unless it's needed.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso
19afd41e65 panfrost: Mark BOs as NOEXEC
Unless a BO has the EXECUTABLE flag, mark it as NOEXEC.

v2: - Rework version detection (Alyssa).

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso
9398932c2d panfrost: Take into account flags when looking up in the BO cache
This will be useful right now so we avoid retrieving a non-executable
buffer when a executable one is needed.

As we support more flags, this logic will need to be extended to
consider the different trade-offs to be made when matching BO
specifications to BOs in the cache.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso
950b5fc596 panfrost: Allocate shaders in their own BOs
Instead of all shaders being stored in a single BO, have each shader in
its own.

This removes the need for a 16MB allocation per context, and allows us
to place transient blend shaders in BOs marked as executable (before
they were allocated in the transient pool, which shouldn't be
executable).

v2: - Store compiled blend shaders in a malloc'ed buffer, to avoid
      reading from GPU-accessible memory when patching (Alyssa).
    - Free struct panfrost_blend_shader (Alyssa).
    - Give the job a reference to regular shaders when emitting
      (Alyssa).

v3: - Split out the allocation flags change (Rob).

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 07:42:52 +02:00
Tomeu Vizoso
5804d75b9c util/hash_table: Fix hashing in clears on 32-bit
Some hash functions (eg. key_u64_hash) will attempt to dereference the
key, causing an invalid access when passed DELETED_KEY_VALUE (0x1) or
FREED_KEY_VALUE (0x0).

When in 32-bit arch a 64-bit key value doesn't fit into a pointer, so
hash_table_u64 internally use a pointer to a struct containing the
64-bit key value.

Fix _mesa_hash_table_u64_clear() to handle the 32-bit case by creating a
temporary hash_key_u64 to pass to the hash function.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-08-08 07:42:52 +02:00
Tapani Pälli
aba57b11ee anv: support GetSwapchainGrallocUsage2ANDROID for Android
New function supports gralloc1 usage flags that get set separately
for producer and consumer. As we still need to support old method too,
let's share common code and use android_convertGralloc0To1Usage helper.
Bump the VK_ANDROID_native_buffer version to indicate support for the
new call.

Changes were tested on Android Celadon P with Basemark GPU and various
Sascha Willems Vulkan demos.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 05:08:01 +00:00
Mark Janes
51c3ab618b st/mesa: eliminate unnecessary redirection
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
61c54a8878 intel/perf: fix debug typo
Misspelling was seen with INTEL_DEBUG=perfmon.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
2df1ab4d48 intel/perf: make gen_perf_query_object private
Encapsulate the details of this structure within the perf implemenation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
deea3798b6 intel/perf: make perf context private
Encapsulate the details of this data structure.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
1f4f421ce0 intel/perf: print debug information
INTEL_DEBUG=perfmon will iterate over the perf queries, printing
information about the state of each query.  Some of this information
will be private to intel/perf, and needs to a dump routine that can be
called from i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
a663c8c26e intel/perf: make internal methods private
Now that all references from i965 have been moved to perf, we can make
internal methods private again.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
be8b466cff intel/perf: make oa_sample_buffers private
All references to this data structure have been moved inside the perf
subsystem.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
f2a049b4e3 intel/perf: expose method to create query
By encapsulating this implementation within perf, we can eventually
make struct gen_perf_ctx private.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
9f5c160d82 intel/perf: move initialization of pipeline statistics metrics to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
9f84efb452 intel/perf: move get_query_data into gen_perf
This refactor moves several helper functions for get_query_data as
well:

 - accumulate_oa_reports
 - read_gt_frequency
 - get_pipeline_stats_data
 - get_oa_counter_data

Functions which are no longer referenced in brw_performance_query.c
have been removed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
73eccdc4a5 intel/perf: move delete_query to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
8c9eac1234 intel/perf: move is_query_ready to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
a9be292722 intel/perf: move wait_query to perf
The following methods have duplicate implementation of read_oa_samples_until in
brw_performance_query.c:

 - read_oa_samples_for_query
 - read_oa_samples_until

They ar still referenced by other methods in the file and will be
removed on the subsequent commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
3c8ed58486 intel/perf: create a vtable entry for bo_busy
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
6fed756388 intel/perf: create a vtable entry for bo_wait_rendering
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
511bb15d4b intel/perf: create a vtable entry for batch_references
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
3ecb23092e intel/perf: refactor gen_perf_end_query into gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
018f9b81e5 intel/perf: refactor gen_perf_begin_query into gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
52d3db9ab6 intel/perf: move perf-related state into gen_perf_context
To move more operations into intel/perf, several state items are
needed.  Save references to that state in the perf_ctxt, rather than
passing them in for every operation.

This commit includes an initializer for gen_perf_context, to set those
references and also encapsulate the initialization of the sample
buffer state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
df18acee78 intel/perf: create a vtable entries for buffer object map/unmap
These operations are needed to refactor subsequent methods into perf

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
a330d759c5 intel/perf: move client reference counts into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
4d0d4aa1b5 intel/perf: move open_perf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
79ded7cc8f intel/perf: move close_perf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
f57c8a6dc1 intel/perf: create a vtable entry for emit_mi_flush
This method is needed to move subsequent methods into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
52f7a0bff7 intel/perf: use temporary pointers to simplify access to perf state
Most accesses to perf state were made through repeated dereferences of
brw_context members.  Prefering temporary variables of perf_ctx and
perf_cfg has the following advantages:

 - more concise implementation
 - easier refactor when moving subsequent methods to perf

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
a157f5acb1 intel/perf: move snapshot_statistics_registers into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
8ae6667992 intel/perf: move query_object into perf
Query objects can now be encapsulated within the perf subsystem.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
7e890ed476 intel/perf: create a vtable entry for store_register_mem64
This method is needed to move subsequent methods into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
4b2c885207 intel/perf: move free_sample_bufs into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
2f712d21b9 intel/perf: move reap_old_sample_buffers into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
31758bd36c intel/perf: move get_free_sample_buf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
e08a69b7f4 intel/perf: move the perf context into perf
The "context" that is necessary to submit and process perf commands to
the hardware was previously present in the brw_context.perfquery
struct.  This commit moves it into perf and provides a more
understandable name.

The intention is for this struct to be private, when all methods that
access it are migrated into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
fb622054f7 intel/perf: move get_metric_id to perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
b14e15e26a intel/perf: move oa_sample_buf structure to perf
oa_sample_buf holds the data provided by the kernel that will be
collated into performance metrics.  Since this functionality will be
implemented in perf, the struct needs to be defined there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
e091f33990 intel/perf: enumerate query-based metrics in perf
Iris and i965 both need to enumerate the available metrics, so these
routines must be located in perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
2446f5cfd8 intel/perf: move perf-related constants to common location
The perf subsystem needs several macro definitions that were
duplicated in Iris and i965 headers.  Place these macros within perf,
if the perf implementation contains the only references to the values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
67675a5802 intel/perf: create a vtable entry for capture_frequency_stat_register
In preparation for calling both Iris and i965 implementions from perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
ae3fac851d intel/perf: create a vtable entry for batchbuffer_flush
In preparation for calling both Iris and i965 implementions from perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
a921b215dd intel/perf: create a vtable entry for emit_report_count
In preparation for calling both Iris and i965 implementions from perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
9a2a2e8bea intel/perf: create a vtable entry for bo_unreference
In preparation for calling both Iris and i965 implementions from perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
439d5a3eff intel/perf: create a vtable for low-level driver functions
Performance metrics collections requires several actions (eg bo_map())
that have different implementations for Iris and i965.  The perf
subsystem needs a vtable for each of these actions, so it can invoke
the corresponding implementation for each driver.

The first call to be added to the table is bo_alloc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
ea66484e86 intel/perf: use common ioctl wrapper
There were multiple ioctl-wrapper functions, so a common
implementation was put in gen_gem.h.   With a common implementation,
perf no longer needs the caller to configure one for it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
07d3bd5c46 intel/perf: rename gen_perf to gen_perf_config
This structure contains the configurations of the metrics for the
current platform, and the settings needed for the perf subsystem to
query that configuration from the device.  This data is available
without a rendering context, and needed to support MDAPI metrics for
Vulkan.

A gen_perf_context struct will be added later, which holds additional
state from the rendering context necessary for metric data
collection.  The gen_perf struct needs a more precise name to reduce
confusion.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Ilia Mirkin
9ff8da0e50 nvc0: fix program dumping, use _debug_printf
This debug situation is unforunate. debug_printf only does something
with DEBUG set, but in practice all that needs to be moved to !NDEBUG.
For now, use _debug_printf which always prints. However the whole
function is guarded by !NDEBUG.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-08-07 22:32:02 -04:00
Ilia Mirkin
f6af104340 nvc0: add support for ATOMC_WRAP TGSI operations
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-08-07 22:32:02 -04:00
Ilia Mirkin
a2bb7b26a1 gallium: redefine ATOMINC_WRAP to be more hardware-friendly
Both AMD and NVIDIA hardware define it this way. Instead of replicating
the logic everywhere, just fix it up in one place.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 22:31:56 -04:00
Ilia Mirkin
582c86346d st/mesa: relax EXT_shader_image_load_store enable
There's no reason to bring format-less load requirement into this
extension. It requires a size to be provided, and a compatible format is
computed from the size + data type. For example

  layout(size1x32) uniform iimage1D image;

becomes

  DCL IMAGE[0], 1D, PIPE_FORMAT_R32_SINT, WR

whereas PIPE_CAP_IMAGE_LOAD_FORMATTED is designed to allow
PIPE_FORMAT_NONE to be provided as a format and still enable LOAD
operations to be performed.

So the shader has all the information it needs about the format.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 22:31:38 -04:00
Mark Janes
a29bc3a3ad i965/perf: restore mdapi statistics query metrics
Registration of mdapi metrics based on statistics query registers was
inadvertently removed in the commit that checks for OA kernel support.

The statistics queries are not dependent on OA.

Fixes: 96e1c945f2 ("i965: Move device info initialization to common code")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-07 17:20:04 -07:00
Greg V
c0376a1234 util: add anon_file.h for all memfd/temp file usage
Move the Weston os_create_anonymous_file code from egl/wayland into util,
add support for Linux memfd and FreeBSD SHM_ANON,
use that code in anv/aubinator instead of explicit memfd calls for portability.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-07 22:57:55 +00:00
Pierre-Eric Pelloux-Prayer
519bebdb40 radeonsi: limit DPBB context_states_per_bin batches when using gfx9 workaround
It seems that using 'context_states_per_bin = 1' for DPBB fixes the reported issue.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110214

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 18:45:24 -04:00
Pierre-Eric Pelloux-Prayer
120d0ef937 radeonsi: reduce DPBB persistent_states_per_bin value for APUs
Fixes some reported GPU hangs on RAVEN.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111231

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 18:45:22 -04:00
Pierre-Eric Pelloux-Prayer
6bda9ca062 radeonsi: fix typo in DPBB register field
Also only set FLUSH_ON_BINNING_TRANSITION for GPU families that needs it (matches
what si_emit_dpbb_disable is doing).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 18:45:20 -04:00
Pierre-Eric Pelloux-Prayer
90bded140e radeonsi: fix S_028C48_MAX_ALLOC_COUNT value
This field uses "value minus 1" encoding.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 18:45:09 -04:00
Christian Gmeiner
323cda475b etnaviv: drop struct etna_3d_state
Also drop #if 0 code block.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
2019-08-07 22:12:00 +02:00
Yevhenii Kolesnikov
0325860e90 mesa: Use _mesa_delete_transform_feedback_object in drivers
Function _mesa_delete_transform_feedback_object called from within
drivers once driver-specific clean-up has been done. Brings into
conformity with how other GL objects are handled.

CC: Eric Anholt <eric@anholt.net>
CC: Kenneth Graunke <kenneth@whitecape.org>

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-07 17:25:22 +00:00
Yevhenii Kolesnikov
4f767ded6e mesa: use _mesa_delete_query in drivers
Now drivers can call _mesa_delete_query once driver-specific
clean-up has been done. Brings into conformity with how other GL
objects are handled.

CC: Eric Anholt <eric@anholt.net>
CC: Kenneth Graunke <kenneth@whitecape.org>

Suggested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-07 17:25:22 +00:00
Juan A. Suarez Romero
4619535ab7 docs: update calendar, add news item and link release notes for 19.1.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-08-07 18:51:32 +02:00
Juan A. Suarez Romero
a19d43ebd5 docs: add sha256 checksums for 19.1.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 7fcb69a33c)
2019-08-07 18:49:25 +02:00
Juan A. Suarez Romero
8484fafc78 docs: add release notes for 19.1.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit b84ffa028d)
2019-08-07 18:49:23 +02:00
Bas Nieuwenhuizen
5a26f528cb meson,i965: Link with android deps when building for android.
The DBG marco in brw_blorp.c ends up calling an android log function:

error: undefined reference to '__android_log_print'

v2: On suggestion from Lionel, hang the Android dependency onto a new
    libintel_common dependency.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-07 15:34:46 +02:00
Erik Faye-Lund
da9e2958ec gallium/dump: add missing query-type to short-list
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 3f6b3d9db7 ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 12:03:24 +00:00
Erik Faye-Lund
70a93922db gallium/dump: add missing query-type to short-list
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: a677799e51 ("gallium: add PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE
                     and corresponding cap")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 12:03:24 +00:00
Eric Engestrom
32ce010951 gitlab-ci: don't install autotools deps
These could've been deleted a long time ago, but apparent we forgot.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2019-08-07 10:18:25 +01:00
Eric Engestrom
5b10ddf358 util: fix mem leak of program path
Fixes: 759b940389 ("util: Get program name based on path when possible")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-07 08:42:42 +01:00
Eric Engestrom
991137144a meson: build intel-ui tools as part of all tools
Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111289
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-07 08:19:31 +01:00
Eric Engestrom
c32ebfe003 gitlab-ci: add gtk3 dev files for -D tools=intel-ui
We also need to update wayland-protocols and libXrandr (and randrproto),
as they are too old for gdk3 (which gtk3 depends on).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-07 08:19:30 +01:00
Jan Vesely
6b8269d0bb clover: Fix build after clang r367864
v2: Drop special case of llvm-9
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2019-08-06 23:33:55 -04:00
Timothy Arceri
d81e11332b mesa: remove super old TODOs from shaderapi.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-07 13:31:40 +10:00
John Stultz
fcfa2d1447 mesa: freedreno: Android.registers.mk: Fix up register xml.h file generation
The current Androdi.registers.mk file causes build failures that
look like:
 FAILED:
 external/mesa3d/src/freedreno/Android.registers.mk:49: error: implicit rules are obsolete: out/target/product/linaro_db845c/gen/STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/%.xml.h

Caused by the following Android build rule change:
https://android.googlesource.com/platform/build/+/HEAD/Changes.md#implicit_rules

I tried to replace this with something similar to the static
pattern suggested in the URL above, but ended up getting all the
xml.h files generated using only the first a2xx.xml source file.

So I've fallen back to explicitly defining the make rules for
each.

Additionally, we needed to provide the proper
LOCAL_EXPORT_C_INCLUDE_DIRS and add the defined static library
to the components that depend on the register headers.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2019-08-07 02:18:38 +00:00
John Stultz
96baf052b2 mesa: Add ir3/ir3_nir_imul.c generation to Android.mk
With current master we're seeing build failures with AOSP:
  error: undefined symbol: ir3_nir_lower_imul

This is due to the ir3_nir_imul.c file not being generated
in the Android.mk files.

This patch simply adds it to the Android build, after which
thigns build and book ok on db410c.

Cc: Rob Clark <robdclark@chromium.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Greg Hartman <ghartman@google.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2019-08-07 02:18:19 +00:00
Rohan Garg
16edd56fcc panfrost: Take into account a index_bias for glDrawElementsBaseVertex calls
Midgard does not accept a index_bias directly and relies instead on a
bias correction offset (offset_bias_correction) in order to calculate
the unbiased vertex index.

We need to make sure we adjust offset_start and vertex_count in order to
take into account the index_bias as required by a
glDrawElementsBaseVertex call and then supply a additional
offset_bias_correction to the hardware.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-06 17:18:19 -07:00
Bas Nieuwenhuizen
4bb17c08ae radv/gfx10: Enable DCC for storage images.
v2: Hide it behind a perftest flag.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
3a5950f501 radv: Add device argument for dcc compression check.
Because it is about to be generation dependent.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
8c63ffe54d radv: Disable compression for compute DCC decompress store.
Previously we relied on stores not using DCC but that is going to
change, so disable compression explicitly.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
216a9d8871 radv: Add extra struct to image view creation.
For extra args. Unlike image creation, I'm not embedding the vk
struct in there, so all the inline structs can be kept.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
50add1b33a radv: Do not decompress on LAYOUT_GENERAL.
We handle render loops properly now and STORAGE still disables
DCC/TC-compat HTILE in general.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
66131ceb8b radv: Pass through render loop detection to internal layout decisions.
And do nothing with it yet.

Everything outside a renderpass has no render loop.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
a171a6663d radv: Add render loop detection in renderpass.
VK spec 7.3:

"Applications must ensure that all accesses to memory that backs
image subresources used as attachments in a given renderpass instance
either happen-before the load operations for those attachments, or
happen-after the store operations for those attachments."

So the only renderloops we can have is with input attachments. Detect
these.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Timothy Arceri
a5b9394b87 drirc: Add vendor workaround for Divinity: Original Sin EE
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551
2019-08-07 10:12:49 +10:00
Timothy Arceri
dca119f12c mesa/gallium: add dric option to allow overriding GL vendor string
Will be used in the following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551
2019-08-07 10:12:49 +10:00
Marek Olšák
c95e2a1c6b relnotes/19.2: document EXT_texture_dhadow_lod 2019-08-06 20:10:15 -04:00
Bas Nieuwenhuizen
04c6feb12c radv: Fix config reg assert.
Using the wrong bounds

Fixes: "219d6939df8 radv: add more assertions to make sure packets are correctly emitted"
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 08:58:23 +10:00
Marek Olšák
16577f5002 tgsi_to_nir: add a few needed double opcodes
for internal radeonsi shaders

v2 (Connor):
- Split out prep work from adding opcodes, and rewrite the former

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 18:03:26 -04:00
Marek Olšák
2207daf549 tgsi_to_nir: implement a few needed 64-bit integer opcodes
for internal radeonsi shaders

v2 (Connor):
- Split this out from the prep work, and rework the former
- Add support for U64SNE

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 18:03:24 -04:00
Connor Abbott
37f6350c1d ttn: Prepare for 64-bit sources and destinations
v2: Properly handle 32->64 bit conversions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 18:03:22 -04:00
Connor Abbott
4b10949482 ttn: Use 1-bit NIR comparison opcodes
We shouldn't be using the versions that output a 32-bit boolean, since
nir_opt_algebraic won't optimize them as well. Drivers will lower these
to the 32-bit versions after optimizing, if appropriate. Also, this will
make implementing 64-bit comparisons easier.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 18:03:19 -04:00
Connor Abbott
e7fd90e8ef nir/builder: Add nir_b2i
Same as nir_b2f but for integers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 18:03:10 -04:00
Pierre-Eric Pelloux-Prayer
f84c9ad17a radeonsi: enable EXT_shader_image_load_store
This depends on LLVM 10 because this needs https://reviews.llvm.org/D65283

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:07 -04:00
Pierre-Eric Pelloux-Prayer
25fff591c1 radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrap
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:06 -04:00
Pierre-Eric Pelloux-Prayer
8789248541 radeonsi: add support for tgsi ATOMDEC_WRAP / ATOMINC_WRAP opcodes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:04 -04:00
Pierre-Eric Pelloux-Prayer
704a6b5948 ac: add ac_atomic_inc_wrap / ac_atomic_dec_wrap support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:03 -04:00
Pierre-Eric Pelloux-Prayer
a9ec718652 nir: add atomic_inc_wrap/atomic_dec_wrap image intrinsics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:02 -04:00
Pierre-Eric Pelloux-Prayer
fc0a2e5d01 glsl: add EXT_shader_image_load_store new image functions
This extension has 2 functions that are missing from the ARB versions:
- imageAtomicIncWrap
- imageAtomicDecWrap

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:00 -04:00
Pierre-Eric Pelloux-Prayer
70a47fb032 glsl: add EXT_shader_image_load_store keywords to lexer
All of them already existed for ARB_shader_image_load_store.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:58 -04:00
Pierre-Eric Pelloux-Prayer
cfba168b6c glsl: add size qualifiers from EXT_shader_image_load_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:56 -04:00
Pierre-Eric Pelloux-Prayer
cd45d09226 glsl: handle differences between ARB/EXT versions of shader_image_load_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:55 -04:00
Pierre-Eric Pelloux-Prayer
5db28b0cf7 mesa: add EXT_shader_image_load_store glBindImageTextureEXT function
The implementation is almost identical to glBindImageTexture except for error
checking.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:53 -04:00
Pierre-Eric Pelloux-Prayer
71e619a825 glapi: add EXT_shader_image_load_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:52 -04:00
Pierre-Eric Pelloux-Prayer
91924453ee gallium: add PIPE_CAP_TGSI_ATOMINC_WRAP to indicate support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:51 -04:00
Pierre-Eric Pelloux-Prayer
8b6bfed3d2 tgsi: add ATOMICINC_WRAP/ATOMICDEC_WRAP opcode
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:34 -04:00
Marek Olšák
1d8a71af57 radeonsi/gfx10: enable all CUs for GS if NGG is never used
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:09:03 -04:00
Marek Olšák
91227a1e17 radeonsi/gfx10: add global use_ngg and use_ngg_streamout flags
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:09:02 -04:00
Marek Olšák
f064b530f6 radeonsi/gfx10: remove an obsolete VGT_REUSE_OFF workaround
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:09:01 -04:00
Marek Olšák
37dd8ebcf7 radeonsi/gfx10: disable LATE_ALLOC_GS on Navi14
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:59 -04:00
Marek Olšák
c5a6ecf61a radeonsi/gfx10: implement a bug workaround for GE_PC_ALLOC
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:58 -04:00
Marek Olšák
8f8c28767e radeonsi/gfx10: implement a bug workaround for NGG -> legacy transitions
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:57 -04:00
Marek Olšák
cb9d95623b radeonsi/gfx10: implement a GE bug workaround
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:56 -04:00
Marek Olšák
e08b0d7ac4 radeonsi/gfx10: set GE_CNTL for tessellation correctly
to match PAL

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:54 -04:00
Marek Olšák
71b53020b7 radeonsi/gfx10: simplify NGG code in si_update_shaders
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:53 -04:00
Marek Olšák
a232f5e07c radeonsi/gfx10: fix input VGPRs for legacy VS
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:51 -04:00
Marek Olšák
8d90157d49 radeonsi: make sure that rasterizer state != NULL and remove all NULL checking
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Marek Olšák
8b8819e88a radeonsi: make sure that DSA state != NULL and remove all NULL checking
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Marek Olšák
b758eed9c3 radeonsi: make sure that blend state != NULL and remove all NULL checking
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Marek Olšák
8b68511ebc radeonsi: DCC MSAA blending bug - include logic op, limit to Navi14 and older
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:50 -04:00
Marek Olšák
e69c1c8b8f radeonsi: determine accurately whether logic op is enabled
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:48 -04:00
Marek Olšák
b38f5eb17a radeonsi: skip draw calls with 0-sized index buffers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Marek Olšák
e777720173 radeonsi/nir: lower PS inputs before scanning the shader
Lowering PS inputs can eliminate some of them, which messes up
persp/linear barycentric coord usage info.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:46 -04:00
Marek Olšák
f818d9ae3c radeonsi/nir: handle key.mono.u.ps.interpolate_at_sample_force_center
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Marek Olšák
b3eed3cff9 radeonsi: add missing prints into si_dump_shader_key
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:15 -04:00
Marek Olšák
6b3ee86989 radeonsi: disable SDMA image copies on dGPUs to fix corruption in games
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:08 -04:00
Pierre-Eric Pelloux-Prayer
0556932f4a mesa: add EXT_dsa glMultiTexCoordPointerEXT function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:22 -04:00
Pierre-Eric Pelloux-Prayer
e364ddece3 mesa: add EXT_dsa glMultiTexGen* functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:21 -04:00
Pierre-Eric Pelloux-Prayer
e8e0de6a8f mesa: add EXT_dsa glCopyMultiTexImage* and glCopyMultiTexSubImage*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:19 -04:00
Pierre-Eric Pelloux-Prayer
f28d9ab1a3 mesa: add EXT_dsa glGetMultiTexParameteriv/fvEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:18 -04:00
Pierre-Eric Pelloux-Prayer
989c375852 mesa: add EXT_dsa glMultiTexSubImage1D/2D/3DEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:16 -04:00
Pierre-Eric Pelloux-Prayer
aac6578732 mesa: add EXT_dsa glMultiTexImage1D/2D/3DEXT + glGetMultiTexImageEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:15 -04:00
Pierre-Eric Pelloux-Prayer
885dbe2e84 mesa: add glBindMultiTextureEXT display list support
Fixes: 0972b0b059 ("mesa: add support for glBindMultiTextureEXT")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:13 -04:00
Pierre-Eric Pelloux-Prayer
d9e26c3483 mesa: add EXT_dsa glMultiTexParameter* functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:12 -04:00
Pierre-Eric Pelloux-Prayer
e04f95057f mesa: add EXT_dsa (Get)MultiTexEnv functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:10 -04:00
Pierre-Eric Pelloux-Prayer
04b8e50bb8 mesa: add _mesa_(get)texenvi(f)v_indexed helpers
They are exactly like _mesa_GetTexEnvfv/_mesa_GetTexEnviv except they take
a GLuint texunit parameter instead of relying of ctx->Texture.CurrentUnit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:08 -04:00
Pierre-Eric Pelloux-Prayer
0e595326c4 mesa: add new helper _mesa_get_texobj_by_target_and_texunit
Based on the 'static get_texobj_by_target' function from texparam.c,
but extended to also take the texunit as a parameter.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:03:06 -04:00
Pierre-Eric Pelloux-Prayer
58030d2b3d mesa: replace _mesa_get_current_fixedfunc_tex_unit with _mesa_get_fixedfunc_tex_unit
The new function implements the same feature but doesn't depend
on ctx->Texture.CurrentUnit.
This change allows to use it from indexed functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:02:52 -04:00
Danylo Piliaiev
b4c54894bb iris: Handle vertex shader with window space position
Iris advertises support for PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION
so let's actually implement it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110657

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-06 20:25:35 +00:00
Erico Nunes
b783f9f77e lima: fix pipe_debug_callback warnings
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-06 20:29:53 +02:00
Vasily Khoruzhick
5adfc8602c lima/ppir: move sin/cos input scaling into NIR
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-06 17:49:22 +00:00
Antia Puentes
954224b714 nir/spirv: Fix gl_BaseVertex for non-indexed draws for OpenGL
Lowers BaseVertex to the correct system value for OpenGL.

v2: use options->environment rather than adding a new flag to
    spirv_to_nir_options

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-06 09:11:27 -07:00
Kenneth Graunke
382f92a814 iris: Increase BATCH_SZ to 64kB
This seems to improve performance by roughly ~1% across the board.
Thanks to Rafael Antognolli and Dan Walsh for their help tuning.
2019-08-06 09:09:26 -07:00
Bas Nieuwenhuizen
2af00b1fdd ac/nir: Use correct cast for readfirstlane and ptrs.
Fixes: 028ce527 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-06 15:48:50 +00:00
Bas Nieuwenhuizen
2301b2e029 radv: Do non-uniform lowering before bool lowering.
Since it can introduce comparisons.

Fixes: 028ce52739 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-06 15:48:50 +00:00
Jonathan Marek
dfe048058f etnaviv: support 3D and 2D array textures
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-08-06 10:37:36 -04:00
Jonathan Marek
3508f2fb18 etnaviv: fix 3d texture upload
Fix uploading of 3D textures and 2D array textures:

* Remove asserts in BLT and RS checking z
* Use box->z/box->depth in etna_copy_resource_box and CPU tile/untile
* Track mip level depth and use it in etna_copy_resource

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-08-06 10:37:36 -04:00
Jonathan Marek
ed7a27719a etnaviv: add alternative NIR compiler
enable with ETNA_MESA_DEBUG=nir

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2019-08-06 10:33:17 -04:00
Jonathan Marek
ee1ed59458 etnaviv: prep for UBOs
Allow UBO relocs and only emitting uniforms that are actually used.

GC7000Lite has no address register, so upload uniforms to a UBO object to
LOAD from.

I removed the code to check for changes to individual uniforms and just
reupload to entire uniform state when the state is dirty. I think there
was very limited benefit to it and it isn't compatible with relocs.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-08-06 10:33:17 -04:00
Jonathan Marek
ca58c1120e etnaviv: disasm: add dual16 bits, immediate decoding, and some opcodes
Also use structs from etnaviv_asm since they hold the same information.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-08-06 10:33:17 -04:00
Jonathan Marek
e9a5181ad6 etnaviv: asm: new features
* Dual16 bits
* Halti5 disable multiple uniform src
* write_mask compose
* Halti2+ immediates

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-08-06 10:33:17 -04:00
Jonathan Marek
98e59f0a0a etnaviv: update headers from rnndb
Update to etna_viv commit f38ba2d.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-08-06 10:33:17 -04:00
Erico Nunes
e0aeee9460 lima: add summary report for shader-db
Very basic summary, loops and gpir spills:fills are not updated yet and
are only there to comply with the strings to shader-db report.py regex.

For now it can be used to analyze the impact of changes in instruction
count in both gpir and ppir.

The LIMA_DEBUG=shaderdb setting can be useful to output stats on
applications other than shader-db.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-06 15:43:31 +02:00
Erico Nunes
9e41a514a8 lima: add support for debug callback
This adds support for glDebugMessageCallback which is required to
support shader-db reports.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-06 15:43:26 +02:00
Tomeu Vizoso
67f4e1e787 panfrost/ci: Remove two tests from list of failures
These tests have been fixed by:

b514f41183 ("glcpp: use pre-expansion line number for __LINE__")

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-08-06 15:19:43 +02:00
Jon Turney
84fae8e649 st/dri: Move dri2_format_mapping table and it's accessors from dri2.c to dri_helpers.c
8af1990a exposed dri2_get_mapping_by_fourcc() in dri_helpers.h, so it
could be used by dri_get_egl_image(), but didn't move it.  This breaks
the build in the with_dri=false case (e.g. when building for a target
which doesn't have libdrm, so swrast is only dri driver built)
2019-08-06 12:21:56 +00:00
Jonathan Marek
b514f41183 glcpp: use pre-expansion line number for __LINE__
Fixes the following deqp tests:
dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.line_2_*

It don't see the spec requiring this, but it seems to be better, as the
clang preprocessor for example has this behavior.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-06 11:27:04 +00:00
Jason Ekstrand
bc612536eb anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D
There is an object-level  preemption workaround which requires this.
However, even without object-level preemption, we seem to have issues
with geometry flickering when 3D and compute are combined in the same
batch and this appears to fix it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109630
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111267
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-06 05:46:28 +00:00
Ian Romanick
5544b2cbbd nir/algebraic: Use value range analysis to eliminate useless unary ops
Sandy Bridge is the big winner because it lies at something of a
crossroads.  It supports a fairly high OpenGL version, and it still has
the old style math box.  The high OpenGL version means a lot more
shaders can run on it.  The old style math box means extra moves are
necessary to resolve source modifiers on operands to complex math
instructions like COS, SQRT, and RCP.

v2: Remove a couple patterns that are now redundant.

All Gen7+ platforms had similar results.  (Ice Lake shown)
total instructions in shared programs: 16282006 -> 16278207 (-0.02%)
instructions in affected programs: 174555 -> 170756 (-2.18%)
helped: 661
HURT: 0
helped stats (abs) min: 1 max: 36 x̄: 5.75 x̃: 3
helped stats (rel) min: 0.06% max: 23.68% x̄: 2.81% x̃: 1.94%
95% mean confidence interval for instructions value: -6.16 -5.34
95% mean confidence interval for instructions %-change: -3.02% -2.60%
Instructions are helped.

total cycles in shared programs: 367168597 -> 367134284 (<.01%)
cycles in affected programs: 1105276 -> 1070963 (-3.10%)
helped: 460
HURT: 150
helped stats (abs) min: 1 max: 568 x̄: 96.60 x̃: 82
helped stats (rel) min: 0.02% max: 32.50% x̄: 7.99% x̃: 4.27%
HURT stats (abs)   min: 1 max: 901 x̄: 67.49 x̃: 39
HURT stats (rel)   min: 0.07% max: 20.00% x̄: 4.90% x̃: 4.22%
95% mean confidence interval for cycles value: -65.68 -46.82
95% mean confidence interval for cycles %-change: -5.59% -4.05%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10824272 -> 10802557 (-0.20%)
instructions in affected programs: 1237988 -> 1216273 (-1.75%)
helped: 8199
HURT: 0
helped stats (abs) min: 1 max: 41 x̄: 2.65 x̃: 2
helped stats (rel) min: 0.12% max: 20.00% x̄: 2.04% x̃: 1.73%
95% mean confidence interval for instructions value: -2.70 -2.59
95% mean confidence interval for instructions %-change: -2.07% -2.00%
Instructions are helped.

total cycles in shared programs: 154009894 -> 153843598 (-0.11%)
cycles in affected programs: 10650486 -> 10484190 (-1.56%)
helped: 4973
HURT: 1533
helped stats (abs) min: 1 max: 3904 x̄: 40.20 x̃: 20
helped stats (rel) min: 0.02% max: 41.72% x̄: 2.63% x̃: 1.67%
HURT stats (abs)   min: 1 max: 453 x̄: 21.94 x̃: 8
HURT stats (rel)   min: 0.02% max: 41.91% x̄: 1.54% x̃: 0.58%
95% mean confidence interval for cycles value: -28.02 -23.10
95% mean confidence interval for cycles %-change: -1.74% -1.56%
Cycles are helped.

LOST:   0
GAINED: 2

GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs: 8135196 -> 8134888 (<.01%)
instructions in affected programs: 31920 -> 31612 (-0.96%)
helped: 169
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 1.82 x̃: 2
helped stats (rel) min: 0.43% max: 3.23% x̄: 1.23% x̃: 1.16%
95% mean confidence interval for instructions value: -2.01 -1.64
95% mean confidence interval for instructions %-change: -1.32% -1.15%
Instructions are helped.

total cycles in shared programs: 188575724 -> 188574092 (<.01%)
cycles in affected programs: 406840 -> 405208 (-0.40%)
helped: 169
HURT: 0
helped stats (abs) min: 4 max: 72 x̄: 9.66 x̃: 10
helped stats (rel) min: 0.07% max: 2.16% x̄: 0.57% x̃: 0.47%
95% mean confidence interval for cycles value: -10.72 -8.59
95% mean confidence interval for cycles %-change: -0.63% -0.50%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:14 -07:00
Ian Romanick
8d14380971 nir/algebraic: Use value range analysis to convert fmin to fsat
All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16297320 -> 16282006 (-0.09%)
instructions in affected programs: 2434498 -> 2419184 (-0.63%)
helped: 8091
HURT: 1
helped stats (abs) min: 1 max: 51 x̄: 1.89 x̃: 2
helped stats (rel) min: 0.04% max: 14.29% x̄: 0.98% x̃: 0.95%
HURT stats (abs)   min: 7 max: 7 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28%
95% mean confidence interval for instructions value: -1.94 -1.85
95% mean confidence interval for instructions %-change: -0.99% -0.96%
Instructions are helped.

total cycles in shared programs: 367221624 -> 367168597 (-0.01%)
cycles in affected programs: 126409635 -> 126356608 (-0.04%)
helped: 5612
HURT: 1023
helped stats (abs) min: 1 max: 2332 x̄: 31.11 x̃: 16
helped stats (rel) min: <.01% max: 30.31% x̄: 1.69% x̃: 1.42%
HURT stats (abs)   min: 1 max: 2372 x̄: 118.84 x̃: 16
HURT stats (rel)   min: <.01% max: 46.98% x̄: 1.46% x̃: 0.35%
95% mean confidence interval for cycles value: -11.52 -4.46
95% mean confidence interval for cycles %-change: -1.26% -1.14%
Cycles are helped.

total spills in shared programs: 8868 -> 8870 (0.02%)
spills in affected programs: 28 -> 30 (7.14%)
helped: 0
HURT: 1

total fills in shared programs: 21903 -> 21904 (<.01%)
fills in affected programs: 42 -> 43 (2.38%)
helped: 0
HURT: 1

Haswell
total instructions in shared programs: 13353925 -> 13338728 (-0.11%)
instructions in affected programs: 2265850 -> 2250653 (-0.67%)
helped: 8127
HURT: 5
helped stats (abs) min: 1 max: 51 x̄: 1.88 x̃: 2
helped stats (rel) min: 0.04% max: 20.00% x̄: 1.13% x̃: 1.07%
HURT stats (abs)   min: 5 max: 16 x̄: 9.00 x̃: 6
HURT stats (rel)   min: 0.19% max: 0.52% x̄: 0.35% x̃: 0.28%
95% mean confidence interval for instructions value: -1.91 -1.83
95% mean confidence interval for instructions %-change: -1.15% -1.11%
Instructions are helped.

total cycles in shared programs: 375535444 -> 375536343 (<.01%)
cycles in affected programs: 131206582 -> 131207481 (<.01%)
helped: 5590
HURT: 1055
helped stats (abs) min: 1 max: 2844 x̄: 34.15 x̃: 16
helped stats (rel) min: <.01% max: 21.57% x̄: 2.08% x̃: 1.60%
HURT stats (abs)   min: 1 max: 2487 x̄: 181.78 x̃: 21
HURT stats (rel)   min: <.01% max: 40.66% x̄: 1.96% x̃: 0.37%
95% mean confidence interval for cycles value: -4.74 5.01
95% mean confidence interval for cycles %-change: -1.51% -1.37%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 23401 -> 23407 (0.03%)
spills in affected programs: 248 -> 254 (2.42%)
helped: 2
HURT: 5

total fills in shared programs: 34850 -> 34845 (-0.01%)
fills in affected programs: 383 -> 378 (-1.31%)
helped: 2
HURT: 5

Ivy Bridge
total instructions in shared programs: 11975423 -> 11968117 (-0.06%)
instructions in affected programs: 845703 -> 838397 (-0.86%)
helped: 4071
HURT: 0
helped stats (abs) min: 1 max: 51 x̄: 1.79 x̃: 1
helped stats (rel) min: 0.08% max: 8.21% x̄: 1.04% x̃: 0.93%
95% mean confidence interval for instructions value: -1.87 -1.71
95% mean confidence interval for instructions %-change: -1.06% -1.02%
Instructions are helped.

total cycles in shared programs: 179674318 -> 179635552 (-0.02%)
cycles in affected programs: 5100065 -> 5061299 (-0.76%)
helped: 2650
HURT: 611
helped stats (abs) min: 1 max: 900 x̄: 21.85 x̃: 16
helped stats (rel) min: <.01% max: 21.55% x̄: 2.39% x̃: 1.40%
HURT stats (abs)   min: 1 max: 1841 x̄: 31.33 x̃: 6
HURT stats (rel)   min: <.01% max: 58.71% x̄: 1.64% x̃: 0.37%
95% mean confidence interval for cycles value: -14.14 -9.64
95% mean confidence interval for cycles %-change: -1.75% -1.52%
Cycles are helped.

LOST:   3
GAINED: 7

Sandy Bridge
total instructions in shared programs: 10828844 -> 10824272 (-0.04%)
instructions in affected programs: 525678 -> 521106 (-0.87%)
helped: 2386
HURT: 0
helped stats (abs) min: 1 max: 51 x̄: 1.92 x̃: 2
helped stats (rel) min: 0.11% max: 7.96% x̄: 1.05% x̃: 0.94%
95% mean confidence interval for instructions value: -2.04 -1.80
95% mean confidence interval for instructions %-change: -1.08% -1.03%
Instructions are helped.

total cycles in shared programs: 154024591 -> 154009894 (<.01%)
cycles in affected programs: 4005766 -> 3991069 (-0.37%)
helped: 1245
HURT: 506
helped stats (abs) min: 1 max: 585 x̄: 21.07 x̃: 16
helped stats (rel) min: 0.02% max: 11.57% x̄: 1.98% x̃: 0.83%
HURT stats (abs)   min: 1 max: 639 x̄: 22.81 x̃: 6
HURT stats (rel)   min: 0.01% max: 26.21% x̄: 1.07% x̃: 0.26%
95% mean confidence interval for cycles value: -10.57 -6.21
95% mean confidence interval for cycles %-change: -1.23% -0.97%
Cycles are helped.

GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs: 8137248 -> 8135196 (-0.03%)
instructions in affected programs: 148322 -> 146270 (-1.38%)
helped: 992
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.07 x̃: 2
helped stats (rel) min: 0.41% max: 9.73% x̄: 1.74% x̃: 1.51%
95% mean confidence interval for instructions value: -2.16 -1.98
95% mean confidence interval for instructions %-change: -1.80% -1.67%
Instructions are helped.

total cycles in shared programs: 188583424 -> 188575724 (<.01%)
cycles in affected programs: 4409620 -> 4401920 (-0.17%)
helped: 956
HURT: 6
helped stats (abs) min: 2 max: 168 x̄: 8.09 x̃: 8
helped stats (rel) min: 0.04% max: 6.76% x̄: 0.27% x̃: 0.18%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.10% max: 0.10% x̄: 0.10% x̃: 0.10%
95% mean confidence interval for cycles value: -8.41 -7.60
95% mean confidence interval for cycles %-change: -0.29% -0.25%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:14 -07:00
Ian Romanick
b77070e293 nir/algebraic: Use value range analysis to eliminate tautological compares
It's only one application on one platform (Haswell) that's affected,
but spills and fills increase quite dramatically. :(

All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16320850 -> 16297320 (-0.14%)
instructions in affected programs: 448012 -> 424482 (-5.25%)
helped: 1938
HURT: 0
helped stats (abs) min: 2 max: 264 x̄: 12.14 x̃: 10
helped stats (rel) min: 0.35% max: 43.75% x̄: 5.85% x̃: 5.38%
95% mean confidence interval for instructions value: -12.80 -11.48
95% mean confidence interval for instructions %-change: -5.99% -5.72%
Instructions are helped.

total cycles in shared programs: 367496943 -> 367221624 (-0.07%)
cycles in affected programs: 8557232 -> 8281913 (-3.22%)
helped: 1907
HURT: 26
helped stats (abs) min: 4 max: 12802 x̄: 147.21 x̃: 48
helped stats (rel) min: 0.03% max: 75.85% x̄: 5.55% x̃: 3.94%
HURT stats (abs)   min: 4 max: 1870 x̄: 208.23 x̃: 20
HURT stats (rel)   min: 0.16% max: 32.11% x̄: 8.31% x̃: 0.79%
95% mean confidence interval for cycles value: -165.38 -119.48
95% mean confidence interval for cycles %-change: -5.68% -5.04%
Cycles are helped.

LOST:   1
GAINED: 0

Haswell
total instructions in shared programs: 13374211 -> 13353925 (-0.15%)
instructions in affected programs: 349868 -> 329582 (-5.80%)
helped: 1669
HURT: 1
helped stats (abs) min: 1 max: 264 x̄: 12.57 x̃: 10
helped stats (rel) min: 0.12% max: 46.81% x̄: 6.86% x̃: 6.49%
HURT stats (abs)   min: 700 max: 700 x̄: 700.00 x̃: 700
HURT stats (rel)   min: 64.34% max: 64.34% x̄: 64.34% x̃: 64.34%
95% mean confidence interval for instructions value: -13.25 -11.04
95% mean confidence interval for instructions %-change: -7.01% -6.63%
Instructions are helped.

total cycles in shared programs: 375763544 -> 375535444 (-0.06%)
cycles in affected programs: 6932686 -> 6704586 (-3.29%)
helped: 1622
HURT: 48
helped stats (abs) min: 2 max: 12229 x̄: 148.31 x̃: 68
helped stats (rel) min: 0.06% max: 74.03% x̄: 5.94% x̃: 4.12%
HURT stats (abs)   min: 3 max: 7451 x̄: 259.44 x̃: 41
HURT stats (rel)   min: 0.05% max: 54.99% x̄: 8.52% x̃: 2.88%
95% mean confidence interval for cycles value: -159.86 -113.31
95% mean confidence interval for cycles %-change: -5.86% -5.18%
Cycles are helped.

total spills in shared programs: 23258 -> 23401 (0.61%)
spills in affected programs: 54 -> 197 (264.81%)
helped: 4
HURT: 2

total fills in shared programs: 34775 -> 34850 (0.22%)
fills in affected programs: 52 -> 127 (144.23%)
helped: 4
HURT: 1

LOST:   5
GAINED: 0

Ivy Bridge
total instructions in shared programs: 11996051 -> 11977964 (-0.15%)
instructions in affected programs: 346679 -> 328592 (-5.22%)
helped: 1508
HURT: 0
helped stats (abs) min: 2 max: 198 x̄: 11.99 x̃: 10
helped stats (rel) min: 0.26% max: 19.83% x̄: 5.73% x̃: 5.43%
95% mean confidence interval for instructions value: -12.65 -11.34
95% mean confidence interval for instructions %-change: -5.86% -5.60%
Instructions are helped.

total cycles in shared programs: 179891389 -> 179691339 (-0.11%)
cycles in affected programs: 7869479 -> 7669429 (-2.54%)
helped: 1485
HURT: 23
helped stats (abs) min: 1 max: 12615 x̄: 136.16 x̃: 54
helped stats (rel) min: 0.02% max: 71.84% x̄: 4.69% x̃: 3.49%
HURT stats (abs)   min: 1 max: 403 x̄: 93.48 x̃: 6
HURT stats (rel)   min: 0.04% max: 34.01% x̄: 8.68% x̃: 0.81%
95% mean confidence interval for cycles value: -154.59 -110.73
95% mean confidence interval for cycles %-change: -4.79% -4.19%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10829247 -> 10828844 (<.01%)
instructions in affected programs: 21258 -> 20855 (-1.90%)
helped: 88
HURT: 0
helped stats (abs) min: 2 max: 17 x̄: 4.58 x̃: 5
helped stats (rel) min: 0.52% max: 3.92% x̄: 2.05% x̃: 2.21%
95% mean confidence interval for instructions value: -5.03 -4.13
95% mean confidence interval for instructions %-change: -2.21% -1.89%
Instructions are helped.

total cycles in shared programs: 154035437 -> 154024591 (<.01%)
cycles in affected programs: 430176 -> 419330 (-2.52%)
helped: 78
HURT: 10
helped stats (abs) min: 2 max: 4649 x̄: 143.06 x̃: 32
helped stats (rel) min: 0.05% max: 6.02% x̄: 2.03% x̃: 1.07%
HURT stats (abs)   min: 3 max: 265 x̄: 31.30 x̃: 6
HURT stats (rel)   min: 0.10% max: 8.67% x̄: 1.03% x̃: 0.21%
95% mean confidence interval for cycles value: -232.53 -13.97
95% mean confidence interval for cycles %-change: -2.13% -1.23%
Cycles are helped.

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8137402 -> 8137248 (<.01%)
instructions in affected programs: 2280 -> 2126 (-6.75%)
helped: 10
HURT: 0
helped stats (abs) min: 12 max: 19 x̄: 15.40 x̃: 15
helped stats (rel) min: 3.90% max: 11.73% x̄: 7.19% x̃: 6.95%
95% mean confidence interval for instructions value: -17.69 -13.11
95% mean confidence interval for instructions %-change: -8.99% -5.39%
Instructions are helped.

total cycles in shared programs: 188538716 -> 188583424 (0.02%)
cycles in affected programs: 69326 -> 114034 (64.49%)
helped: 0
HURT: 10
HURT stats (abs)   min: 2068 max: 7686 x̄: 4470.80 x̃: 4870
HURT stats (rel)   min: 27.20% max: 173.66% x̄: 69.55% x̃: 59.41%
95% mean confidence interval for cycles value: 2830.86 6110.74
95% mean confidence interval for cycles %-change: 39.18% 99.91%
Cycles are HURT.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
96fcb3f95b nir/algebraic: Use value range analysis to eliminate tautological compares not used by if-statements
This just eliminates tautological / contradictory compares that are used
for bcsel and other non-if-statement cases.  If-statements are not
affected because removing flow control can cause the i965 instrution
scheduler to create some very long live ranges resulting in unncessary
spilling.  This causes some shaders to fall of a performance cliff.

Since many small if-statements are already flattened to bcsel, this
optimization covers more than 68% of the possible cases (2417 shaders
helped for instructions on Skylake vs. 3554).

v2: Reorder and add whitespace to make the relationship between the
patterns more obvious.  Suggested by Caio.

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16333474 -> 16322028 (-0.07%)
instructions in affected programs: 438559 -> 427113 (-2.61%)
helped: 1765
HURT: 0
helped stats (abs) min: 1 max: 275 x̄: 6.48 x̃: 4
helped stats (rel) min: 0.20% max: 36.36% x̄: 4.07% x̃: 1.82%
95% mean confidence interval for instructions value: -6.87 -6.10
95% mean confidence interval for instructions %-change: -4.30% -3.84%
Instructions are helped.

total cycles in shared programs: 367608554 -> 367511103 (-0.03%)
cycles in affected programs: 8368829 -> 8271378 (-1.16%)
helped: 1541
HURT: 129
helped stats (abs) min: 1 max: 4468 x̄: 66.78 x̃: 39
helped stats (rel) min: 0.01% max: 45.69% x̄: 4.10% x̃: 2.17%
HURT stats (abs)   min: 1 max: 973 x̄: 42.25 x̃: 10
HURT stats (rel)   min: 0.02% max: 64.39% x̄: 2.15% x̃: 0.60%
95% mean confidence interval for cycles value: -64.90 -51.81
95% mean confidence interval for cycles %-change: -3.89% -3.36%
Cycles are helped.

total spills in shared programs: 8867 -> 8868 (0.01%)
spills in affected programs: 18 -> 19 (5.56%)
helped: 0
HURT: 1

total fills in shared programs: 21900 -> 21903 (0.01%)
fills in affected programs: 78 -> 81 (3.85%)
helped: 0
HURT: 1

All Gen6 and earlier platforms had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10829877 -> 10829247 (<.01%)
instructions in affected programs: 30240 -> 29610 (-2.08%)
helped: 177
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 3.56 x̃: 3
helped stats (rel) min: 0.37% max: 17.39% x̄: 2.68% x̃: 1.94%
95% mean confidence interval for instructions value: -3.93 -3.18
95% mean confidence interval for instructions %-change: -3.04% -2.32%
Instructions are helped.

total cycles in shared programs: 154036580 -> 154035437 (<.01%)
cycles in affected programs: 352402 -> 351259 (-0.32%)
helped: 96
HURT: 28
helped stats (abs) min: 1 max: 128 x̄: 14.73 x̃: 6
helped stats (rel) min: 0.03% max: 24.00% x̄: 1.51% x̃: 0.46%
HURT stats (abs)   min: 1 max: 117 x̄: 9.68 x̃: 4
HURT stats (rel)   min: 0.03% max: 2.24% x̄: 0.43% x̃: 0.23%
95% mean confidence interval for cycles value: -13.40 -5.03
95% mean confidence interval for cycles %-change: -1.62% -0.53%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
fa116ce357 nir/range-analysis: Range tracking for ffma and flrp
A similar technique could be used for fmin3, fmax3, and fmid3.

This could be squashed with the previous commit.  I kept it separate to
ease review.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
586602c5d9 nir/range-analysis: Range tracking for bcsel
This could be squashed with the previous commit.  I kept it separate to
ease review.

v2: Add some missing cases.  Use nir_src_is_const helper.  Both
suggested by Caio.  Use a table for mapping source ranges to a result
range.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
3009cbed50 nir/range-analysis: Tighten the range of fsat based on the range of its source
This could be squashed with the previous commit.  I kept it separate to
ease review.

v2: Use a switch statement and add more comments.  Both suggested by
Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
405de7ccb6 nir/range-analysis: Rudimentary value range analysis pass
Most integer operations are omitted because dealing with integer
overflow is hard.  There are a few things that could be smarter if there
was a small amount more tracking of ranges of integer types (i.e.,
operands are Boolean, operand values fit in 16 bits, etc.).

The changes to nir_search_helpers.h are included in this patch to
simplify reordering the changes to nir_opt_algebraic.py.

v2: Memoize range analysis results.  Without this, some shaders appear
to get stuck in infinite loops.

v3: Rebase on many months of Mesa changes, including 1-bit Boolean
changes.

v4: Rebase on "nir: Drop imov/fmov in favor of one mov instruction".

v5: Use nir_alu_srcs_equal for detecting (a*a).  Previously just the SSA
value was compared, and this incorrectly matched (a.x*a.y).

v6: Many code improvements including (but not limited to) better names,
more comments, and better use of helper functions.  All suggested by
Caio.  Rework the handling of several opcodes to use a table for mapping
source ranges to a result range.  This change fixed a bug that caused
fmax(gt_zero, ge_zero) to be incorrectly recognized as ge_zero.
Slightly tighten the range of fmul by recognizing that x*x is gt_zero if
x is gt_zero.  Add similar handling for -x*x.

v7: Use _______ in the tables as an alias for unknown.  Suggested by
Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
d24edb4b8c nir/algebraic: Simplify some comparisons like a+constant < constant
v2: Remove unsafe integer versions of the optimizations.  This change
had no effect on shader-db results.  Suggested by Caio.

All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16333713 -> 16332631 (<.01%)
instructions in affected programs: 258112 -> 257030 (-0.42%)
helped: 1275
HURT: 407
helped stats (abs) min: 1 max: 7 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.20% max: 8.33% x̄: 1.33% x̃: 0.86%
HURT stats (abs)   min: 1 max: 2 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.11% max: 2.94% x̄: 0.98% x̃: 0.98%
95% mean confidence interval for instructions value: -0.70 -0.59
95% mean confidence interval for instructions %-change: -0.84% -0.70%
Instructions are helped.

total cycles in shared programs: 367596791 -> 367601268 (<.01%)
cycles in affected programs: 3420062 -> 3424539 (0.13%)
helped: 1553
HURT: 783
helped stats (abs) min: 1 max: 742 x̄: 24.36 x̃: 6
helped stats (rel) min: 0.05% max: 21.12% x̄: 1.47% x̃: 0.65%
HURT stats (abs)   min: 1 max: 557 x̄: 54.04 x̃: 14
HURT stats (rel)   min: 0.01% max: 33.66% x̄: 3.36% x̃: 1.43%
95% mean confidence interval for cycles value: -1.60 5.43
95% mean confidence interval for cycles %-change: -0.03% 0.33%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 2

Iron Lake
total instructions in shared programs: 8137992 -> 8137874 (<.01%)
instructions in affected programs: 17501 -> 17383 (-0.67%)
helped: 104
HURT: 2
helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.25% max: 2.63% x̄: 0.87% x̃: 0.72%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -1.22 -1.00
95% mean confidence interval for instructions %-change: -0.94% -0.76%
Instructions are helped.

total cycles in shared programs: 188540038 -> 188539650 (<.01%)
cycles in affected programs: 704574 -> 704186 (-0.06%)
helped: 125
HURT: 84
helped stats (abs) min: 2 max: 96 x̄: 6.45 x̃: 4
helped stats (rel) min: <.01% max: 3.47% x̄: 0.42% x̃: 0.25%
HURT stats (abs)   min: 2 max: 58 x̄: 4.98 x̃: 4
HURT stats (rel)   min: 0.01% max: 2.75% x̄: 0.36% x̃: 0.33%
95% mean confidence interval for cycles value: -3.20 -0.52
95% mean confidence interval for cycles %-change: -0.19% -0.03%
Cycles are helped.

GM45
total instructions in shared programs: 5008889 -> 5008830 (<.01%)
instructions in affected programs: 8824 -> 8765 (-0.67%)
helped: 52
HURT: 1
helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.25% max: 2.38% x̄: 0.86% x̃: 0.72%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -1.27 -0.95
95% mean confidence interval for instructions %-change: -0.96% -0.71%
Instructions are helped.

total cycles in shared programs: 128969426 -> 128969128 (<.01%)
cycles in affected programs: 399798 -> 399500 (-0.07%)
helped: 74
HURT: 30
helped stats (abs) min: 2 max: 22 x̄: 6.76 x̃: 6
helped stats (rel) min: <.01% max: 1.83% x̄: 0.46% x̃: 0.29%
HURT stats (abs)   min: 2 max: 58 x̄: 6.73 x̃: 6
HURT stats (rel)   min: 0.06% max: 2.75% x̄: 0.42% x̃: 0.21%
95% mean confidence interval for cycles value: -4.60 -1.14
95% mean confidence interval for cycles %-change: -0.32% -0.08%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
7c64cbf49d nir/algebraic: Recognize (a < 0 || 0 < b) as min(a, -b) < 0
Similar to commit 97e6c1b9 and f5cf74d8ba.

First apply 0 < b => -b < 0 to get (a < 0 || -b < 0), then apply some
pre-existing rules to get min(a, -b) < 0.

v2: Substantially update the comment explaining the use of is_used_once
and the duplication of patterns.  Suggested by Caio.  Also, while flt
and fge are not commutative, ior and iand are.  Half of the original
patterns were redundant, so delete them.  As alternate justification for
deleting them, fmin(a, -b) < 0 <=> 0 < fmax(-a, b).  Proof left as an
exercise for the reader.

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16333789 -> 16333713 (<.01%)
instructions in affected programs: 11424 -> 11348 (-0.67%)
helped: 32
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 2.38 x̃: 2
helped stats (rel) min: 0.20% max: 1.67% x̄: 0.76% x̃: 0.69%
95% mean confidence interval for instructions value: -3.03 -1.72
95% mean confidence interval for instructions %-change: -0.89% -0.62%
Instructions are helped.

total cycles in shared programs: 367598295 -> 367596791 (<.01%)
cycles in affected programs: 141414 -> 139910 (-1.06%)
helped: 23
HURT: 6
helped stats (abs) min: 3 max: 386 x̄: 72.52 x̃: 20
helped stats (rel) min: 0.15% max: 4.86% x̄: 1.01% x̃: 0.76%
HURT stats (abs)   min: 4 max: 88 x̄: 27.33 x̃: 12
HURT stats (rel)   min: 0.22% max: 3.95% x̄: 1.08% x̃: 0.59%
95% mean confidence interval for cycles value: -93.51 -10.21
95% mean confidence interval for cycles %-change: -1.10% -0.05%
Cycles are helped.

total instructions in shared programs: 10830836 -> 10830779 (<.01%)
instructions in affected programs: 6895 -> 6838 (-0.83%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 4.75 x̃: 1
helped stats (rel) min: 0.14% max: 1.61% x̄: 0.65% x̃: 0.33%
95% mean confidence interval for instructions value: -8.46 -1.04
95% mean confidence interval for instructions %-change: -1.03% -0.27%
Instructions are helped.

total cycles in shared programs: 154028477 -> 154032740 (<.01%)
cycles in affected programs: 178433 -> 182696 (2.39%)
helped: 3
HURT: 9
helped stats (abs) min: 3 max: 20 x̄: 11.00 x̃: 10
helped stats (rel) min: 0.07% max: 0.20% x̄: 0.12% x̃: 0.09%
HURT stats (abs)   min: 27 max: 1415 x̄: 477.33 x̃: 262
HURT stats (rel)   min: 0.22% max: 6.45% x̄: 2.49% x̃: 1.76%
95% mean confidence interval for cycles value: 28.68 681.82
95% mean confidence interval for cycles %-change: 0.37% 3.30%
Cycles are HURT.

Iron Lake
total instructions in shared programs: 8137966 -> 8137992 (<.01%)
instructions in affected programs: 3281 -> 3307 (0.79%)
helped: 0
HURT: 6
HURT stats (abs)   min: 3 max: 7 x̄: 4.33 x̃: 3
HURT stats (rel)   min: 0.63% max: 1.01% x̄: 0.76% x̃: 0.64%
95% mean confidence interval for instructions value: 2.17 6.50
95% mean confidence interval for instructions %-change: 0.56% 0.96%
Instructions are HURT.

total cycles in shared programs: 188539386 -> 188540038 (<.01%)
cycles in affected programs: 103826 -> 104478 (0.63%)
helped: 0
HURT: 7
HURT stats (abs)   min: 16 max: 218 x̄: 93.14 x̃: 80
HURT stats (rel)   min: 0.14% max: 0.95% x̄: 0.53% x̃: 0.46%
95% mean confidence interval for cycles value: 10.26 176.02
95% mean confidence interval for cycles %-change: 0.24% 0.81%
Cycles are HURT.

GM45
total instructions in shared programs: 5008876 -> 5008889 (<.01%)
instructions in affected programs: 1645 -> 1658 (0.79%)
helped: 0
HURT: 3
HURT stats (abs)   min: 3 max: 7 x̄: 4.33 x̃: 3
HURT stats (rel)   min: 0.63% max: 1.00% x̄: 0.76% x̃: 0.63%

total cycles in shared programs: 128968950 -> 128969426 (<.01%)
cycles in affected programs: 64854 -> 65330 (0.73%)
helped: 0
HURT: 4
HURT stats (abs)   min: 18 max: 218 x̄: 119.00 x̃: 120
HURT stats (rel)   min: 0.14% max: 0.95% x̄: 0.60% x̃: 0.66%
95% mean confidence interval for cycles value: -62.92 300.92
95% mean confidence interval for cycles %-change: -0.05% 1.26%
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
92b75c126b nir/algebraic: Replace checks that a value is between (or not) [0, 1]
v2: Add an extra line to one of the proofs.  Suggested by Caio.

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16329772 -> 16329427 (<.01%)
instructions in affected programs: 41980 -> 41635 (-0.82%)
helped: 110
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 3.14 x̃: 2
helped stats (rel) min: 0.19% max: 5.56% x̄: 1.12% x̃: 0.94%
95% mean confidence interval for instructions value: -4.10 -2.17
95% mean confidence interval for instructions %-change: -1.28% -0.96%
Instructions are helped.

total cycles in shared programs: 367551273 -> 367549979 (<.01%)
cycles in affected programs: 492462 -> 491168 (-0.26%)
helped: 76
HURT: 25
helped stats (abs) min: 1 max: 400 x̄: 42.86 x̃: 12
helped stats (rel) min: 0.06% max: 10.72% x̄: 1.23% x̃: 0.75%
HURT stats (abs)   min: 2 max: 730 x̄: 78.52 x̃: 16
HURT stats (rel)   min: 0.17% max: 6.89% x̄: 2.08% x̃: 1.23%
95% mean confidence interval for cycles value: -37.79 12.16
95% mean confidence interval for cycles %-change: -0.90% 0.07%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 2

Sandy Bridge
total instructions in shared programs: 10831115 -> 10830836 (<.01%)
instructions in affected programs: 37830 -> 37551 (-0.74%)
helped: 70
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 3.99 x̃: 2
helped stats (rel) min: 0.33% max: 7.14% x̄: 1.21% x̃: 0.97%
95% mean confidence interval for instructions value: -5.47 -2.50
95% mean confidence interval for instructions %-change: -1.49% -0.92%
Instructions are helped.

total cycles in shared programs: 154029323 -> 154028477 (<.01%)
cycles in affected programs: 247909 -> 247063 (-0.34%)
helped: 52
HURT: 6
helped stats (abs) min: 2 max: 254 x̄: 25.81 x̃: 4
helped stats (rel) min: 0.07% max: 4.39% x̄: 0.81% x̃: 0.19%
HURT stats (abs)   min: 4 max: 403 x̄: 82.67 x̃: 8
HURT stats (rel)   min: 0.18% max: 1.60% x̄: 0.71% x̃: 0.53%
95% mean confidence interval for cycles value: -34.83 5.65
95% mean confidence interval for cycles %-change: -0.98% -0.32%
Inconclusive result (value mean confidence interval includes 0).

Iron Lake
total instructions in shared programs: 8138007 -> 8137966 (<.01%)
instructions in affected programs: 4060 -> 4019 (-1.01%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.32 x̃: 1
helped stats (rel) min: 0.68% max: 8.33% x̄: 1.45% x̃: 0.90%
95% mean confidence interval for instructions value: -1.50 -1.15
95% mean confidence interval for instructions %-change: -2.11% -0.79%
Instructions are helped.

total cycles in shared programs: 188539492 -> 188539386 (<.01%)
cycles in affected programs: 26280 -> 26174 (-0.40%)
helped: 25
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 4.24 x̃: 4
helped stats (rel) min: 0.08% max: 2.11% x̄: 0.54% x̃: 0.50%
95% mean confidence interval for cycles value: -5.08 -3.40
95% mean confidence interval for cycles %-change: -0.70% -0.37%
Cycles are helped.

GM45
total instructions in shared programs: 5008897 -> 5008876 (<.01%)
instructions in affected programs: 2096 -> 2075 (-1.00%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1
helped stats (rel) min: 0.68% max: 7.69% x̄: 1.41% x̃: 0.89%
95% mean confidence interval for instructions value: -1.57 -1.06
95% mean confidence interval for instructions %-change: -2.32% -0.49%
Instructions are helped.

total cycles in shared programs: 128969020 -> 128968950 (<.01%)
cycles in affected programs: 18490 -> 18420 (-0.38%)
helped: 15
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 4.67 x̃: 4
helped stats (rel) min: 0.08% max: 2.11% x̄: 0.51% x̃: 0.48%
95% mean confidence interval for cycles value: -6.03 -3.30
95% mean confidence interval for cycles %-change: -0.78% -0.24%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Jonathan Marek
a44b4200f3 tgsi_to_nir: fix nir_gather_ssa_types for TGSI->NIR shaders
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-08-05 22:09:47 -04:00
Jason Ekstrand
f6e7de41d7 anv: Implement VK_EXT_line_rasterization
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-06 02:05:28 +00:00
Jason Ekstrand
f03512f90b genxml: Rename 3DSTATE_SF::Anti-Aliasing Enable
This makes it consistent with the new name when it's moved to
3DSTATE_RASTER.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-06 02:05:28 +00:00
Jason Ekstrand
abf9e10488 anv: Use dirty bits for dynamic state tracking
Previously, we assumed that the dirty bit was always 1 << VK_DYNAMIC_*
and this assumption is about to be false.  Extensions which define new
VK_DYNAMIC_* enums won't be nice and tightly packed which this really
requires.  Instead, add functions to don the conversions and rework the
bits a bit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-06 02:05:28 +00:00
Jason Ekstrand
aa13f75f01 anv: Advertise the right line width range on gen9 and CHV
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-06 02:05:28 +00:00
Alyssa Rosenzweig
77295b1fdc meson: Add panfrost to the --auto list
Look ma, we're a real driver now! I was waiting until Panfrost
stabilises a bit for this, but now that 19.2 is almost here, let's make
us official :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-08-05 17:42:05 -07:00
Erico Nunes
360bda0b1d lima/ppir: enable lower_vector_cmp to lower fall_equal
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-05 23:36:46 +02:00
Erico Nunes
9e8f8dbcd1 lima: re-run nir_opt_algebraic after int lowering
nir_lower_int_to_float is currently only meant to run once, and some ops
must be lowered after being converted from int ops to be implementable,
so re-run nir_opt_algebraic after lowering ints to floats.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-05 23:36:35 +02:00
Alyssa Rosenzweig
3db4949197 pan/midgard: Extend SSA concurrency checks to other args
No glmark changes, but this seems like a good idea.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-05 11:22:49 -07:00
Alyssa Rosenzweig
2869758355 pan/midgard: Rewrite bidirectionally when eliminating moves
Symptom: the sky is black in SuperTuxKart (flashbacks to SMB/NES
emulation intensify).

Essentially, what happened is a fixed (special) move to r0 was
eliminated but scheduling did not factor this in, so
can_run_concurrent_ssa returned true even when there was a logical data
dependency that needed to be resolved.

Fixes: 20771ede1c ("pan/midgard: Add post-RA move elimination")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-05 10:58:39 -07:00
Danylo Piliaiev
04a9951580 intel/compiler: add ability to override shader's assembly
When dumping shader's assembly with INTEL_DEBUG=vs,tcs,...
sha1 of the resulting assembly is also printed, having environment
variable INTEL_SHADER_ASM_READ_PATH present driver will try to
load a "%sha1%.bin" file from the path and substitute current
assembly with the one from the file.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-05 17:19:09 +00:00
Danylo Piliaiev
430823c96b intel/tools: add binary output type to i965_asm
Add '-t,--type' command line option to specify the output type
which can be 'bin', 'c_literal' or 'hex'.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-08-05 17:19:09 +00:00
Alyssa Rosenzweig
1f8b653acb panfrost: Add app blacklist
In preparation for an initial 19.2 release, add a blacklist for apps
known to be buggy under Panfrost to protect users. Panfrost is NOT a
conformant implementation at this time.

Distros: please do not revert this patch. If blacklisted apps are run
using Panfrost, dragons will bite you. Thanks :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-08-05 16:04:47 +00:00
Kenneth Graunke
64b73b770b iris: Fix bad external BO hash table and zombie list interactions
A while ago, we started deferring GEM object closure and VMA release
until buffers were idle.  This had some unforeseen interactions with
external buffers.

We keep imported buffers in hash tables, so if we have repeated imports
of the same GEM object, we map those to the same iris_bo structure.
This is critical for several reasons.  Unfortunately, we broke this
assumption.  When freeing a non-idle external buffer, we would drop it
from the hash tables, then move it to the zombie list.  If someone
reimported the same GEM object, we would not find it in the hash tables,
and go ahead and make a second iris_bo for that GEM object.  But the old
iris_bo would still be in the zombie list, and so we would eventually
call GEM_CLOSE on it - closing a BO that should have still been live.

To work around this, we defer removing a BO from the hash tables until
it's actually fully closed.  This has the strange effect that an
external BO may be on the zombie list, and yet be resurrected before
it can be properly cleaned up.  In this case, we remove it from the
list so it won't be freed.

Fixes severe instability in Weston, which was hitting EINVALs and
ENOENTs from execbuf2, due to batches referring to a GEM object that
had been closed, or at least had its VMA torched.

Fixes: 457a55716e ("iris: Defer closing and freeing VMA until buffers are idle.")
2019-08-05 08:53:41 -07:00
Kenneth Graunke
48e5a99d86 iris/bufmgr: Move iris_bo_reference into hash_find_bo, rename it
Everybody importing an external buffer was looking it up in the hash
table, then referencing it.  We can just do that in the helper instead,
which also gives us a convenient spot to stash extra code shortly.
2019-08-05 08:53:07 -07:00
Ahmad Fatoum
4f75ea57c2 gallium: add stm DRM entry point
The STM32MP157 features a Vivante GC400 GPU supported by etnaviv.
Add a DRM entry point for the STM display controller, so mesa
can be used with it.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-05 14:53:31 +00:00
Eric Engestrom
c251e2e662 gitlab-ci: don't remove a package we don't install anymore
Fixes: 85dace1c0b ("gitlab-ci: remove software-properties-common")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-08-05 15:43:26 +01:00
Andrii Simiklit
dc471f2ef8 etnaviv: fix a null pointer dereference
This issue was found by cppcheck

Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2019-08-05 15:31:43 +02:00
Connor Abbott
74470baebb ac/nir: Lower large indirect variables to scratch
results from radeonsi NIR:

Totals from affected shaders:
SGPRS: 704 -> 464 (-34.09 %)
VGPRS: 2056 -> 672 (-67.32 %)
Spilled SGPRs: 24 -> 0 (-100.00 %)
Spilled VGPRs: 28406 -> 0 (-100.00 %)
Private memory VGPRs: 0 -> 3182 (0.00 %)
Scratch size: 1064 -> 3228 (203.38 %) dwords per thread
Code Size: 935260 -> 40180 (-95.70 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 28 -> 70 (150.00 %)
Wait states: 0 -> 0 (0.00 %)

results from radv:

Totals from affected shaders:
SGPRS: 80 -> 48 (-40.00 %)
VGPRS: 204 -> 108 (-47.06 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 256 (0.00 %) dwords per thread
Code Size: 15792 -> 9504 (-39.82 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1 -> 2 (100.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-05 11:45:18 +02:00
Timothy Arceri
3c9144f9e5 drirc: Add discard workaround for Divinity: Original Sin EE
This adds an additional work around for the game to fix the blocky
shadows as reported in bug 105282

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105282
2019-08-05 15:35:00 +10:00
Erico Nunes
486b33558a lima/ppir: simplify load uni/temp op lowering and scheduling
The load uniform/temporary operations output only to a pipeline
register, which must be consumed by another op in the same instruction
later.
The current implementation delays the decision of who will consume this
result to until the scheduling step. If the consumer node is not able to
use the pipeline register, a mov node may have to be created, during the
scheduler step.

As part of the ppir scheduler simplification, and now that the ppir
scheduler supports pipeline register dependencies, this can be
simplified by always creating a single mov node outputting to a normal
register that can be used directly by all consumers.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-04 13:38:19 +02:00
Erico Nunes
fd29c4d6c5 lima/ppir: simplify select op lowering and scheduling
The select operation relies on the select condition coming from the
result of the the alu scalar mult slot, in the same instruction.
The current implementation creates a mov node to be the predecessor of
select, and then relies on an exception during scheduling to ensure that
both ops are inserted in the same instruction.

Now that the ppir scheduler supports pipeline register dependencies,
this can be simplified by making the mov explicitly output to the fmul
pipeline register, and the scheduler can place it without an exception.

Since the select condition can only be placed in the scalar mult slot,
differently than a regular mov, define a separate op for it.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-04 13:38:18 +02:00
Erico Nunes
eb82637c2f lima/ppir: support pipeline registers in scheduler
The ppir scheduler grew to be rather complicated and containing many
exceptions as it also has to take care of inserting additional nodes
when it is mandatory for nodes to be in the same instruction.
As such, the lima lowering and scheduling process can be difficult to
understand and maintain.
The ppir lowering step created nodes hoping that the scheduler would
notice the exception and do the right thing.

This proposal adds a simple refactor to the scheduler so that it places
nodes with pipeline registers in the same instruction.
With the scheduler handling this in a general way, it is possible to
create same-instruction dependencies by using pipeline registers during
the lowering stage.
This is simpler to maintain because now we can make these dependencies
explicit in a single place (lowering), and we can drop exceptions from
scheduling.
Reducing the complexity of the scheduler is also useful as preparatory
work to support control flow in ppir.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-04 13:38:11 +02:00
Eric Engestrom
a1da8eccbe docs: fix "empty array" meson syntax
On recent versions of Meson (0.47+) these are synonymous, but we still
support older versions than that, so let's use the correct syntax to
avoid confusing users of old Meson versions.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-04 12:21:19 +01:00
Eric Engestrom
1361ab3c82 egl: drop unnecessary function deref
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-08-04 11:26:20 +01:00
Eric Engestrom
e7e3fd5c03 glx: drop unnecessary pointer deref for function calls
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2019-08-04 11:26:20 +01:00
Eric Engestrom
9668d7f539 introduce c11_compat.h to provide C11 things in C99
Right now, all it does is provide the new standard `static_assert()` name.

Fixes: fbf7c38da3 ("egl/wayland: use bitset.h for `formats` bit set")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Bhushan Shah <bshah@kde.org>
2019-08-04 11:14:25 +01:00
Eric Engestrom
64ffc289be travis: add MacOS Scons build
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-08-04 11:11:32 +01:00
Eric Engestrom
8f1cdac793 symbols-check: fix nm invocation on MacOS
According to Mac OSX's man page [1], this is how we should get the list
of exported symbols:
  nm -g -P foo.dylib

-g to only show the exported symbols
-P to show it in a "portable" format, ie. readable by a script

Since this is supported by GNU nm as well, let's use that everywhere,
although some care needs to be taken as there are some differences in
the output.

[1] https://www.unix.com/man-page/osx/1/nm/

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-04 11:06:27 +01:00
Eric Engestrom
59f8809f3c symbols-check: discard platform symbols early
(as the comment there already claimed)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-04 11:06:27 +01:00
Eric Engestrom
81b3d141b3 symbols-check: skip test if we can't get the symbols list
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-04 11:06:27 +01:00
Vasily Khoruzhick
c780af7771 lima/ppir: move alu vec to scalar lowering into NIR
Utgard PP is vec4, but some operations are scalar, utilize
NIR vec to scalar lowering pass and indicate operations that we
want to lower.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-04 02:17:12 +00:00
Jason Ekstrand
aebca3961b iris: Fix handling of SIMD32 fragment shaders
The brw_wm_prog_data_dispatch_grf_start_reg and _prog_offset helpers
read the _NPixelDispatchEnable fields from 3DSTATE_PS to figure out
which bits to pull out of the prog data and stuff where.  Therefore,
they need to be called with the final set of _NPixelDispatchEnable bits
after we've done the workaround for SIMD32 and 16x MSAA.  Otherwise, if
you end up with a somewhat odd combination of enables, the GRF start reg
and KSP data ends up in the wrong slots.  In particular, running
SIMD32-only is broken but several other combinations are as well.

Fixes: 5445c176e2 "iris: Disable SIMD32 when using a 16x MSAA..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-03 22:24:40 +00:00
Bas Nieuwenhuizen
9f37c9903b mesa: Rename GLX_USE_TLS to USE_ELF_TLS.
These days it is not GLX only and it does not work with all TLS
implementations.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-03 20:18:17 +02:00
Bas Nieuwenhuizen
d7ca1efc6c meson: Do not use GLX_USE_TLS on Android.
The asm code expects a specific kind of implementation, but Android
uses something different (emutls).

Turns out mesa has a fallback with pthread_getspecific, with an
optimizaiton if only a single thread is used. emutls also uses
getspecific, so lets just use the optimized mesa implementation.

Fixes: 20294dceeb "mesa: Enable asm unconditionally, now that gen_matypes is gone."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-03 18:40:04 +02:00
Christian Gmeiner
2dd598c129 etnaviv: s/boolean/bool
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
2019-08-03 12:32:28 +02:00
Andreas Baierl
5254e53deb lima/ppir: Add gl_FrontFace handling
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-03 08:04:12 +00:00
Jason Ekstrand
b62b0cfa71 intel/nir: Add 1-bit opcodes to brw_cmod_for_nir_comparison_op
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-03 00:35:48 +00:00
Jason Ekstrand
c02c3ff612 intel/nir: Add a common nir comparison -> cmod helper
We already had one in the vec4 code, we just had move it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-03 00:35:48 +00:00
Eric Engestrom
2fd30e3722 util: fix pointer type on NetBSD
NetBSD expects a `void *` argument [1] as the printf-style arguments to
the formatting string, so we need to cast the `const` away.

[1] https://netbsd.gw.com/cgi-bin/man-cgi?pthread_setname_np++NetBSD-current

Suggested-by: Kamil Rytarowski <n54@gmx.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-03 00:20:21 +00:00
Eric Engestrom
b558fa4dfe meson: remove unused field
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-03 00:08:37 +00:00
Eric Engestrom
9a07606b84 meson: replace last uses of libxmlconfig with idep_xmlconfig
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-03 00:08:37 +00:00
Eric Engestrom
178811d8f6 meson: drop unused dep_{thread,dl}
Unused as of last commit.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-03 00:08:37 +00:00
Eric Engestrom
d2d85b950d meson: replace libmesa_util with idep_mesautil
This automates the include_directories and dependencies tracking so that
all users of libmesa_util don't need to add them manually.

Next commit will remove the ones that were only added for that reason.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-03 00:08:37 +00:00
Alyssa Rosenzweig
8ddb38209d pan/midgard: Print texture outmod
I have no idea who thought this was a good idea.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 16:54:53 -07:00
Alyssa Rosenzweig
ad864a0bbb pan/midgard: Promote all 16 uniforms
Now that register spilling is in place, this is reasonable. It turns out
for some shaders, it's actually better to cap at 8 work registers and
extra >8 uniform reigsters and tolerate the spilling, since the extra
resulting threads make up for the spillage. So incidentally, the shader
that spills here is in -bterrain, which jumps from 19fps to 21fps as a
result of this change.

total instructions in shared programs: 3513 -> 3448 (-1.85%)
instructions in affected programs: 776 -> 711 (-8.38%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 3.25 x̃: 2
helped stats (rel) min: 3.57% max: 16.00% x̄: 8.37% x̃: 7.19%
95% mean confidence interval for instructions value: -4.28 -2.22
95% mean confidence interval for instructions %-change: -10.02% -6.73%
Instructions are helped.

total bundles in shared programs: 2067 -> 2024 (-2.08%)
bundles in affected programs: 515 -> 472 (-8.35%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 6 x̄: 2.37 x̃: 2
helped stats (rel) min: 2.13% max: 17.86% x̄: 10.19% x̃: 11.11%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for bundles value: -3.01 -1.29
95% mean confidence interval for bundles %-change: -12.13% -6.91%
Bundles are helped.

total quadwords in shared programs: 3468 -> 3426 (-1.21%)
quadwords in affected programs: 764 -> 722 (-5.50%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.26 x̃: 2
helped stats (rel) min: 1.41% max: 12.50% x̄: 6.76% x̃: 7.14%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.08% max: 1.08% x̄: 1.08% x̃: 1.08%
95% mean confidence interval for quadwords value: -2.83 -1.37
95% mean confidence interval for quadwords %-change: -8.08% -4.65%
Quadwords are helped.

total registers in shared programs: 383 -> 360 (-6.01%)
registers in affected programs: 112 -> 89 (-20.54%)
helped: 19
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 1.21 x̃: 1
helped stats (rel) min: 12.50% max: 27.27% x̄: 20.63% x̃: 20.00%
95% mean confidence interval for registers value: -1.47 -0.95
95% mean confidence interval for registers %-change: -22.39% -18.87%
Registers are helped.

total threads in shared programs: 432 -> 451 (4.40%)
threads in affected programs: 19 -> 38 (100.00%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.73 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.41 2.04
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are [helped].

total loops in shared programs: 4 -> 4 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 0 -> 4
spills in affected programs: 0 -> 4
helped: 0
HURT: 2

total fills in shared programs: 0 -> 7
fills in affected programs: 0 -> 7
helped: 0
HURT: 2

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 16:52:21 -07:00
Alyssa Rosenzweig
e94239b9a4 pan/midgard: Break mir_spill_register into its function
No functional changes, just breaks out a megamonster function and fixes
the indentation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 16:52:21 -07:00
Alyssa Rosenzweig
d4bcca19da pan/midgard: Switch sources to an array for trinary sources
We need three independent sources to support indirect SSBO writes (as
well as textures with both LOD/bias and offsets). Now is a good time to
make sources just an array so we don't have to rewrite a ton of code if
we ever needed a fourth source for some reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 16:48:54 -07:00
Alyssa Rosenzweig
513d02cfeb pan/midgard: Remove "r27-only" register class
As far as I know, there's no such thing as a load/store op that only
takes its argument in r27. We just need to set the appropriate arg_1
field in the RA to specify other registers if we want them.

To facilitate this, various RA-related changes are needed across the
compiler ; this should also fix indirect offsets which were implicitly
interpreted as "r27-only" despite not even passing through RA yet. One
ripple effect change is switching the move insertion point and adjusting
the liveness analysis accordingly, so while this was intended as a
purely functional change, there are some shader-db changes:

total instructions in shared programs: 3511 -> 3498 (-0.37%)
instructions in affected programs: 563 -> 550 (-2.31%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1
helped stats (rel) min: 0.93% max: 5.00% x̄: 2.58% x̃: 2.33%
95% mean confidence interval for instructions value: -1.27 -0.90
95% mean confidence interval for instructions %-change: -3.23% -1.93%
Instructions are helped.

total bundles in shared programs: 2067 -> 2067 (0.00%)
bundles in affected programs: 398 -> 398 (0.00%)
helped: 7
HURT: 4
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.54% max: 10.00% x̄: 5.04% x̃: 5.56%
HURT stats (abs)   min: 1 max: 2 x̄: 1.75 x̃: 2
HURT stats (rel)   min: 2.13% max: 4.26% x̄: 3.72% x̃: 4.26%
95% mean confidence interval for bundles value: -0.95 0.95
95% mean confidence interval for bundles %-change: -5.21% 1.50%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 3464 -> 3454 (-0.29%)
quadwords in affected programs: 1199 -> 1189 (-0.83%)
helped: 18
HURT: 4
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.03% max: 5.26% x̄: 2.44% x̃: 1.79%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 2.56% max: 2.82% x̄: 2.63% x̃: 2.56%
95% mean confidence interval for quadwords value: -0.98 0.07
Inconclusive result (value mean confidence interval includes 0).

total registers in shared programs: 383 -> 373 (-2.61%)
registers in affected programs: 56 -> 46 (-17.86%)
helped: 12
HURT: 2
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 9.09% max: 33.33% x̄: 29.58% x̃: 33.33%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 20.00% max: 50.00% x̄: 35.00% x̃: 35.00%
95% mean confidence interval for registers value: -1.13 -0.29
95% mean confidence interval for registers %-change: -35.07% -5.63%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig
5d9b7a8ddb pan/midgard: Handle get/set_swizzle for load/store arguments
Load/store's  main "argument 0" already has its swizzle handled
correctly (for stores, that is). But the tinier arguments, the compact
ones with a component select but not a full swizzle, those are not yet
handled. Let's do something about that!
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig
9aeb726045 pan/midgard: Fix block successors
Rather than an ersatz thing that sort of looks like successors but is in
fact just the source order traversal with some backward jumps hacked in
for loops... construct an actual flow graph so we can do analysis
sanely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig
1a116037d8 pan/midgard: Add helper to pack load/store registers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig
e112d9d333 pan/midgard: Decode register/component in load/store argument
3-bits out of 8 down!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig
5a572f4b55 pan/midgard: Fix REGISTER_OFFSET
r27 isn't the special one, usually.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig
c908772ee4 pan/midgard: Split ld/st unknown to arg_1/arg_2 fields
The 16-bit field can be decomposed to two independent 8-bit fields, each
representing a single (additional) argument to the load/store op,
generally used for encoding registers. Addressable registers here are
substantially limited compared to the main register in a load/store op.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:02 -07:00
Bas Nieuwenhuizen
2d54fdb563 radv: Expose VK_KHR_imageless_framebuffer.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:35:25 +02:00
Bas Nieuwenhuizen
9475782eac radv: Implement VK_KHR_imageless_framebuffer.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:35:19 +02:00
Bas Nieuwenhuizen
a7041f3b4e radv: Store image view also outside framebuffer.
So we can use it with imageless framebuffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:19:16 +02:00
Bas Nieuwenhuizen
49e6c2fb78 radv: Store color/depth surface info in attachment info instead of framebuffer.
That way we can use it for imageless framebuffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:18:51 +02:00
Alyssa Rosenzweig
cd98d94516 panfrost: Allocate polygon lists on-demand
Rather than alloacting a huge (64MB) polygon list on context creation
and sharing it across framebuffers, we instead allocate polygon lists as
BOs (which consistently hit the cache) sized appropriately; for about a
month, we've known how to calculate the polygon list size so this has
only recently become possible.

The good news is we can render to truly massive framebuffers without
crashing and, more importantly, we eliminate the 64MB upfront overhead.
If a list that size isn't actually needed, it's not allocated.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-02 21:54:58 +02:00
Boris Brezillon
ed501c00cb panfrost: Handle the bo == NULL case in panfrost_bo_[un]reference()
Allows us to pass BOs without checking if they're NULL or not.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 21:54:58 +02:00
Boris Brezillon
12f72175f3 panfrost: Get rid of the skippable param in attach_vt_framebuffer()
The only user of this function always passes true.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 21:54:58 +02:00
Boris Brezillon
8227d284f7 panfrost: Don't emit a new FB desc when setting a new FB state
The FB desc will be emitted/attached on the first draw targetting
this new FB.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 21:54:58 +02:00
Boris Brezillon
95507a3dd4 panfrost: Bail out early when doing a wallpaper blit
The wallpaper blit is a bit special in that the operation is targetting
the current FB, but the u_blitter logic creates a new surface for it
which makes util_framebuffer_state_equal() return false. In that case
we don't want a new FB descriptor to be emitted/attached, so let's just
copy the new state into ctx->pipe_framebuffer and exit the function.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 21:54:58 +02:00
Boris Brezillon
8645afce4c panfrost: Bail out early when new and current FB states are equal
If the current FB matches the new one there's nothing to be done in
panfrost_set_framebuffer_state(). By bailing out early in that case we
avoid emitting new FB descriptors (the old ones are still valid).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 21:54:58 +02:00
Boris Brezillon
17d6ee2bd1 panfrost: Delay FB descriptor allocation
No need to emit SFBD/MFBD at frame invalidation. They can be emitted
when the framebuffer is attached, which saves us a potential FB desc
re-allocation if a new FB is bound after the swap.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 21:54:58 +02:00
Boris Brezillon
b5ca1e5458 panfrost: Remove job from ctx->jobs at submission time
This guarantees that new draws targetting the same framebuffer will
get a new job instance.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 21:54:58 +02:00
Boris Brezillon
20b00e1ff2 panfrost: Make ctx->job useful
ctx->job is supposed to serve as a cache to avoid an hash table lookup
everytime we access the job attached to the currently bound FB, except
it was never assigned to anything but NULL.

Fix that by adding the missing assignment in panfrost_get_job_for_fbo().
Also add a missing NULL assignment in the ->set_framebuffer_state()
path.

While at it, add extra assert()s to make sure ctx->job is consistent.

Fixes: 59c9623d0a ("panfrost: Import job data structures from v3d")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 21:52:56 +02:00
Bas Nieuwenhuizen
72e7b7a00b ac/nir,radv: Optimize bounds check for 64 bit CAS.
When the application does not ask for robust buffer access.

Only implemented the check in radv.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 21:21:55 +02:00
Roland Scheidegger
74baeacafc gallivm: fix issue with AtomicCmpXchg wrapper on llvm 3.5-3.8
These versions still need wrapper but already have both success and
failure ordering.
(Compile tested on llvm 3.3, 3.7, 3.8.)

v2: don't duplicate whole function (suggested by Brian).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111102

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-08-02 20:16:17 +02:00
Matt Turner
dcf9d91a80 util: Handle differences in pthread_setname_np
There are a lot of unfortunate differences in the implementation of this
function. NetBSD and Mac OS X in particular require different arguments.

https://stackoverflow.com/questions/2369738/how-to-set-the-name-of-a-thread-in-linux-pthreads/7989973#7989973
provides for a good overview of the differences.

Fixes: 9c411e020d ("util: Drop preprocessor guards for glibc-2.12")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111264
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
[Eric: use DETECT_OS_* instead of PIPE_OS_*]
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
55eadf971a util/os_time: use detect_os.h to uncouple from gallium
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
bffa23313a util/u_debug: use detect_os.h
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
7f12a66ad5 util/os_misc: use detect_os.h to start uncoupling from gallium
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
87adc898b3 util/os_memory: use detect_os.h to uncouple it from gallium
While at it, remove p_compiler.h as well as it is unused.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
9a5148190a gallium: deduplicate os detection logic by using detect_os.h
This allows us to avoid having to rename all the PIPE_OS_* at once while
still making sure PIPE_OS_* and DETECT_OS_* are always in sync.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
8c52bca112 gallium/utils: drop PIPE_SUBSYSTEM_WINDOWS_USER
This is basically just an alias for PIPE_OS_WINDOWS.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
e740e7a6f0 scons: rename PIPE_SUBSYSTEM_EMBEDDED to EMBEDDED_DEVICE
It has nothing to do with the PIPE_SUBSYSTEM_* stuff from gallium.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
8c63348c94 gallium: remove never-used PIPE_SUBSYSTEM_DRI
PIPE_SUBSYSTEM_DRI was introduced in dacfef1589 ("gallium: New
configuration header.") 11 years ago, and was never used.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
bfb70032d4 util: fix typo in comment
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Eric Engestrom
362e9d8682 util: introduce detect_os.h
Mostly copied from src/gallium/include/pipe/p_config.h, so I kept its
copyright and authorship.

Other than the obvious rename, the big difference is that these are
always defined, to be used as `#if DETECT_OS_LINUX`.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-02 18:38:52 +01:00
Rob Clark
9d5beab441 freedreno/batch: fix dependency loop detection
We can have a scenario like:

  A -> B
  A -> C -> B

When adding the A->C dependency, it doesn't really matter that C depends
on something that A depends on, that isn't a necessary condition for a
dependency loop.

Instead what we want to know is that nothing C depends on, directly or
indirectly, depends on A.  We can detect this by recursively OR'ing the
dependents_mask of C and all it's dependencies.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-02 10:24:14 -07:00
Rob Clark
e1790c532a freedreno/a6xx: add missing flush/invalidates for blit
Various things we were missing for multiple blits in a single batch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-02 10:24:14 -07:00
Rob Clark
d8379da19e freedreno/a6xx: skip tiles with no geometry
If no clear, and no geometry according to VSC_STATE[pipe] we can skip
the tile entirely.  If there is a fast-clear, we can't skip restore
(clear) or resolve IBs, but we can still skip draw IB.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Rob Clark
de3e130fc9 freedreno/a6xx: VSC overflow detection/handling
Check VSC_SIZE/VSC_SIZE2 regs from cmdstream to detect overflow, and
skip use of VSC visibility stream when overflow is detected, to avoid
GPU hangs.  This is done w/ introduction of some CP_REG_TEST/
CP_COND_REG_EXEC packet pairs.

In addition, eventually (after a frame or two) detect the condition and
resize the VSC buffers until overflow no longer happens.

Note that this significantly reduces the initial size of the VSC
buffers, backing out a previous hack to make them 16x larger than
what should be typically required (the previous "solution" for
VSC overflow).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Rob Clark
401f532bea freedreno/a6xx: remove USE/IGNORE_VISIBILITY draw patching
Seems this isn't needed anymore on a6xx to control whether visibility
stream is used.  And it would be hard to deal with if it was, for
disabling use of VSC stream in draw pass.  So just remove it and
simplify things.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Rob Clark
146d6e6463 freedreno/a6xx: cleanup "blit_mem"
Rename to "control_mem", and switch to using a struct to manage the
layout, rather than just ad-hoc hard-coded offsets.

For recovering from VSC stream overflow, we'll need to add more, but
best to clean it up first.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Rob Clark
1cbb7f7601 freedreno: refresh tile debug
Fix some #ifdef'd bitrot, and get rid of #ifdef so it doesn't bitrot
again.

And add a prints for per-tile state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Rob Clark
44f3c1cf01 freedreno: update registers
Pull in some updates of VSC regs

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Rob Clark
c179ded9cb freedreno/gmem: small cleanup
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Rob Clark
e2bb3e84ab freedreno/drm: convert ring_pool to child_pool
Worth another couple percent at driver2

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-02 10:24:14 -07:00
Rob Clark
9ac23794c9 freedreno/drm: remove idx_lock
Since it ends up contended, it is a bit of a bottleneck for workloads
with high driver overhead.  Worth nearly +10% at gfxbench driver2.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-08-02 10:24:14 -07:00
Rob Clark
e439f63467 freedreno/batch: always update last_fence
Not all flush paths come thru fd_context_flush(), so we should also set
last_fence in the batch flush path.  This avoids some no-op flushes just
to get a fence.  For example when pctx->flush_resource() triggers a
flush.

We should probably keep the last_fence update in fd_context_flush() as
well to handle deferred flush case.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Rob Clark
c93eae7f10 freedreno: drop unused fd_fence_ref param
The pscreen param was just there to satisfy pipe_screen::fence_reference

But some of the internal uses passed NULL for screen.  Which is a bit
ugly.  Instead drop the param and add a shim function to plug into the
screen.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 10:24:14 -07:00
Alyssa Rosenzweig
1637a53890 pan/midgard: Print invert modifier
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig
62a5ee3bb4 pan/midgard: Flip conditionals
We would like to flip ops to have a constant in the second place to
enable inlining of the constant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig
d066ca3575 pan/midgard: Add bitwise src/invert fusing
De Morgan's Laws and some special ops basically.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig
620c2717cf pan/midgard: Add .not propagation pass
Essentially .pos propagation but for bitwise.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig
b821e1b85e pan/midgard: Fuse invert into bitwise ops
We use the new invert flag to produce ops like inand.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Jonathan Marek
d8584c5cf2 freedreno: a2xx: implement texture tiling
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-02 15:58:22 +00:00
Jonathan Marek
fb5c3db0ab freedreno: a2xx: use nir_lower_alu_to_scalar instead of lowering pass
nir_lower_alu_to_scalar can now be used to only lower certain ops, so we
don't need the custom pass. And we can lower fall_equal/fany_nequal with
lower_vector_cmp instead.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-02 15:58:22 +00:00
Jonathan Marek
e652ca4e0b freedreno: a2xx: fix HW binning for batches with >256K vertices
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-02 15:58:22 +00:00
Jonathan Marek
257957b026 freedreno: a2xx: fix fneg/fabs/fsat opcodes
Previously we would get a fmov with modifiers, but now that mov has no type
these opcodes need to be supported.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-02 15:58:22 +00:00
Jonathan Marek
43dbd7d603 freedreno: a2xx: fix order of NIR opts
int_to_float needs to come after bool_to_float, and lower_to_source_mods
needs to come after both, since they don't deal wih source mods.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-02 15:58:22 +00:00
Jonathan Marek
57e980a4fb freedreno: a2xx: fix non-etc1 cubemaps
Not sure how this happened, but apparently all cubemaps need swapped XY.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-02 15:58:22 +00:00
Jonathan Marek
2e029acbe2 freedreno: a2xx: fix fast clear not being used for Z24X8 buffers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-02 15:58:22 +00:00
Jonathan Marek
e25388c97b freedreno: align renderonly scanout buffers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-02 15:58:22 +00:00
Eric Engestrom
6125c93e00 gitlab-ci: just build all the tools
This line was mistakenly added while there is already a `-D tools=all`
a few lines below.

Fixes: f60defa72d ("gitlab-ci: Add a shader-db run using v3d on drm-shim.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-02 16:41:19 +01:00
Sergii Romantsov
a86eccfb78 i965/clear: clear_value better precision
Test-case with depth-clear 0.5 and format
MESA_FORMAT_Z24_UNORM_X8_UINT fails due inconsistent
clear-value of 0.4999997.
Maybe its better to improve?

CC: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 0ae9ce0f29 (i965/clear: Quantize the depth clear value based on the format)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111113
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-02 14:25:34 +00:00
Samuel Pitoiset
e8110e51c6 radv: fix image_has_{cmask,fmask}() helpers
The driver should now rely on cmask_offset because CMASK can be
disabled by the driver for some reasons (eg. mipmaps). Apply the
same change for FMASK, although it should be useless.

Fixes: ad1bc8621d ("radv: remove radv_get_image_fmask_info()")
Fixes: 10d08da52c ("radv/gfx10: add missing dcc_tile_swizzle tweak")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 14:00:50 +02:00
Samuel Pitoiset
ad1bc8621d radv: remove radv_get_image_fmask_info()
It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:46 +02:00
Samuel Pitoiset
10d08da52c radv/gfx10: add missing dcc_tile_swizzle tweak
Fixes: c90f46700d ("radv/gfx10: mask DCC tile swizzle by alignment")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:43 +02:00
Samuel Pitoiset
9c9745e8dd radv: remove radv_get_image_cmask_info()
It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:41 +02:00
Samuel Pitoiset
856487a280 radv: only account for tile_swizzle for color surfaces with DCC
It's 0 for depth surfaces with TC compat HTILE enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:39 +02:00
Bas Nieuwenhuizen
e1c5d8a364 radv: Enable VK_KHR_shader_atomic_int64
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 12:26:32 +02:00
Bas Nieuwenhuizen
a17f2206d3 ac/nir: Implement LLVM9 64-bit buffer compare & exchange.
LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this
extracts the ptr, does a bound check and then uses a cmpxchg LLVM
instruction.

Not ideal, but the earliest release we're going to get a proper
intrinsic is LLVM 10.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 12:26:11 +02:00
Connor Abbott
73274c9ec2 Revert "ac/nir: handle negate modifier"
This reverts commit bfea7e4d29.
2019-08-02 11:14:50 +02:00
Connor Abbott
4a382d66ee Revert "ac/nir: handle abs modifier"
This reverts commit d3c80733cd.

These were only appearing due to memory corruption.
2019-08-02 11:14:08 +02:00
Timothy Arceri
06ec14d692 iris: bump compat profile support to 4.6
All of the current piglit compat profile tests pass.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-02 18:56:53 +10:00
Timothy Arceri
74f96b06d6 egl: fix OpenGL 3.1 context creation
>From the EGL_KHR_create_context spec:

   "* If OpenGL 3.1 is requested, the context returned may implement
       any of the following versions:

         * Version 3.1. The GL_ARB_compatibility extension may or may
           not be implemented, as determined by the implementation.
         * The core profile of version 3.2 or greater."

Fixes CTS tests:

    dEQP-EGL.functional.create_context_ext.gl_31.rgb888_depth_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_depth_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgb888_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgb888_no_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_no_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_no_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_no_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_depth_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_depth_stencil

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-02 18:56:53 +10:00
Connor Abbott
f41516bdb5 nir/find_array_copies: Reject copies with mismatched type
When we detect a scalar/vector copy through load_deref/store_deref, we
have to be careful since those can bitcast an int to a float and
vice-versa even though copy_deref can't.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251
Fixes: 156306e5e6 ("nir/find_array_copies: Handle wildcards and overlapping copies")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 10:34:29 +02:00
Samuel Pitoiset
7368000868 radv: re-apply "Optimize rebinding the same descriptor set."
This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.

This optimization has been reverted a while back because of
random GPU hangs on GFX9, no it looks fine, at least CTS no longer
hangs on GFX9 and it doesn't hang on GFX10 as well.

It fixes a performance problem with Wolfenstein Youngblood.

Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2019-08-02 09:56:55 +02:00
Samuel Pitoiset
96a5445559 radv/gfx10: use the correct target machine for Wave32
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 09:37:38 +02:00
Samuel Pitoiset
8a86908e9a radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders
It can be enabled with RADV_PERFTEST=gewave32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 09:37:36 +02:00
Samuel Pitoiset
953bbacc23 radv/gfx10: add Wave32 support for fragment shaders
It can be enabled with RADV_PERFTEST=pswave32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 09:37:34 +02:00
Kenneth Graunke
18c2e09dc7 gallium: Implement GL_EXT_shader_samples_identical via a new capability
This exposes the textureSamplesIdenticalEXT function in GLSL.

We enable it for iris and radeonsi, because their compilers already
have support for this.  Tested on Intel Kabylake and AMD Vega 64.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-01 23:38:54 -07:00
Kenneth Graunke
adcc0a8fdc intel/tools: Fix aubinator_viewer build.
This functions was recently renamed and not all callers were updated.

Fixes: 086c486a75 ("intel/device: rename gen_get_device_info")
2019-08-01 23:36:41 -07:00
Francisco Jerez
54fbc625ea intel/ir: Fix CFG corruption in opt_predicated_break().
Specifically the optimization of a conditional BREAK + WHILE sequence
into a conditional WHILE seems pretty broken.  The list of successors
of "earlier_block" (where the conditional BREAK was found) is emptied
and then re-created with the same edges for no apparent reason.  On
top of that the list of predecessors of the block immediately after
the WHILE loop is emptied, but only one of the original edges will be
added back, which means that potentially several blocks that still
have it on their list of successors won't be on its list of
predecessors anymore, causing all sorts of hilarity due to the
inconsistency in the control flow graph.

The solution is to remove the code that's removing valid edges from
the CFG.  cfg_t::remove_block() will already clean up after itself.
The assert in bblock_t::combine_with() also needs to be removed since
we will be merging a block with multiple children into the first one
of them.

Found the issue on a hardware enabling branch originally, but
apparently somebody reproduced the same problem independently on
master in the meantime.

Fixes: d13bcdb3a9 ("i965/fs: Extend predicated break pass to predicate WHILE.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111009
Cc: jiradet.jd@gmail.com
Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-01 16:56:48 -07:00
Mark Janes
ddb59cd20e intel/device: make internal functions private
The device info initializer makes several fuctions internal:

  - handling of device override
  - updating topology from kernel information

The implementation file is slightly reordered due to the renamed
functions being static.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:40:03 -07:00
Mark Janes
086c486a75 intel/device: rename gen_get_device_info
Rename the original device info initialization routine so callers
don't mistakenly call the wrong one:

  gen_get_device_info_from_fd:

      Queries kernel for full device info, including topology
      details.

  gen_get_device_info_from_pci_id:

      Partially initializes device info based on PCI ID lookup, when
      the kernel is not available.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:39:56 -07:00
Mark Janes
d594d2a052 intel/tools: use device info initializer
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:39:54 -07:00
Mark Janes
e4a0070db4 anv: use initialization routine for gen_device_info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:39:51 -07:00
Mark Janes
49465f1330 iris/screen: use initialization routine for gen_device_info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:39:48 -07:00
Mark Janes
96e1c945f2 i965: Move device info initialization to common code
With perf queries, initializing the device info is much more complex
than just getting a PCI ID and calling gen_get_device_info.  This commit
adds a new gen_get_device_info_from_fd helper in common code which does
all of the requisite kernel queries to get device info including all of
the topology information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:39:44 -07:00
Mark Janes
1186f6ea69 i965/perf: verify kernel support before registering OA metrics
When gen_device_info updates the topology in it's initializer, the
kernel queries will fail silently.  Iris and anv have minimum
kernel requirements that support the queries.  i965 must verify kernel
support before reporting OA metrics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:39:41 -07:00
Mark Janes
7852fe5415 intel/common: provide common ioctl routine
i965 links against libdrm for drmIoctl, but anv and iris both
re-implement this routine to avoid the dependency.

intel/dev also needs an ioctl wrapper, so lets share the same
implementation everywhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:38:40 -07:00
Alyssa Rosenzweig
b40ba2db6c panfrost: Remove unused argument
A relic from when we didn't have an online compiler, hah.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
ff345d4a01 panfrost: Handle MESA_SHADER_COMPUTE in compile callback
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
73c40d6bbb pan/midgard: Use standard list traversal to find initial tag
Fixes a hang (and abort) on empty shaders, which you shouldn't have
anyway but better safe than sorry. DCE going on the fritz is no reason
to freeze the system.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
4647999327 panfrost: Use gl_shader_stage directly for compiles
No need to add a third set of enums to the mix.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
d9eb65c60c panfrost: Emit "draw" info for compute jobs
Important fields relating to shader state and UBOs are filled out from
this (misnomer) function.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
22a8f6de61 panfrost: Feed compute shaders into the compiler
The path for compute shader compiles resembles the graphic shader
compile path, although it is substantially simpler as we don't need any
shader keying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
1b284628ef panfrost: Expose compute shaders as panfrost_shader_variants
Whether variants are packed by graphics or compute is irrelevant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
8b53230d47 panfrost: Remove shader state *base
It is now unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
c228046b4b panfrost: Remove CSO dependency from shader_compile
We want this routine to be generic across graphics and compute, so let
the caller deal with the typing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
428bed3bde panfrost: Generalize UBO upload for other shader stages
Now that everything is unified, this generalization is nice and easy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
a34370e855 panfrost: Guard vertex upload by ctx->vertex != NULL
This is irrelevant for graphics but matters for compute workloads.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
3bfdb878aa panfrost: Generalize vertex shader upload
This allows us to reuse the same code path for compute.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
3b7224190e panfrost: Share gl_enables between VERTEX/COMPUTE
Catch-all for magic bits.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
871c02b12e panfrost: Invoke compute shader according to grid info
We already have helpers for packing invocations (due to its role in
instanced vertex shaders), so we can reuse this drop in for compute
shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
748ccbc808 panfrost: Explain and include compute FBD
Squint at it hard enough and you realize it's the beginning of an
SFBD... I guess...

A compute shader with register spilling would be able to confirm this,
but we would expect to see the first field | 1 and an address splattered
later, setting up TLS.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
3113be3127 panfrost: Unify-driven cleanup
Again, now that stages are unified some logic goes away.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
ac6aa93f9e panfrost: Unify ctx->vs and ctx->fs
It's a little verbose, but this way we can support other shader stages
without too much contortion.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig
4b93152c29 panfrost: Flesh out launch_grid stub
It's still incomplette, but we're able to hook into launch_grid to
create a stub COMPUTE job.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:02 -07:00
Alyssa Rosenzweig
cd1be4605c panfrost: Cleanup via payload unification
Since these are now indexable, quite a bit of code cleans up.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:02 -07:00
Alyssa Rosenzweig
0da52015a1 panfrost: Unify payload_vertex/payload_tiler
Rather than disparate variables, let's use an array of payloads indexed
by the shader stage.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:02 -07:00
Alyssa Rosenzweig
902115f94f panfrost: Only wallpaper if we drew something
last_tiler.gpu may be NULL at flush time despite no clear and existing
jobs -- if we executed a compute-only workload.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:02 -07:00
Alyssa Rosenzweig
2d86828243 panfrost: Adjust shader CAPs to expose dEQP compute
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig
39fe9f5e2f panfrost: Expose NIR as our PIPE_SHADER_CAP_SUPPORTED_IRS
We *could* expose TGSI as well -- we pipe it through tgsi_to_nir for
Gallium-internal shaders anyway -- but we'd rather not.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig
1697760e05 panfrost: Copy freedreno's panfrost_get_compute_param
Values reported here aren't remotely correct, but it's a start to just
get the entrypoint stubbed out.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig
c8bc664447 panfrost: Expose COMPUTE-related caps for GLES3.1
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig
5a8b83ca0b panfrost: Stub out launch_grid
Just dumps some information about the invocation for later debug.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig
a8fc40aaf5 panfrost: Stub out compute CSO
Doesn't do anything, just gets the functions there.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig
e913986868 panfrost: Implement gl_FrontFacing
Interestingly, this requires no compiler changes. It's just exposed as a
special varying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:15:03 -07:00
Alyssa Rosenzweig
f3e15122d4 panfrost: Add support for decoding gl_FrontFacing
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:15:03 -07:00
Alyssa Rosenzweig
9e66ff3ea9 pan/decode: Use max varying index as varying buffer count
This allows us to decode asymmetric varyings correctly, which occurs
with e.g. gl_FrontFacing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:15:03 -07:00
Timothy Arceri
2afedfaf9a iris: add support for gl_ClipVertex in tess eval shaders
Required for OpenGL compat support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-01 16:12:37 -07:00
Timothy Arceri
00b5bf2d72 iris: add support for gl_ClipVertex in geometry shaders
This will enable us to support the OpenGL compat profile.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-01 16:12:27 -07:00
Jason Ekstrand
70dc017aec nir: Stop whacking gl_FrontFacing to a system value
We have a cap bit for gallium and a GLSL compiler flag to control this.
Just trust what GLSL gives us and stop forcing it.  In order for this to
be safe, we have to advertise another cap in some of the gallium
drivers.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-01 21:59:37 +00:00
Alyssa Rosenzweig
4e736b88f3 panfrost: Implement panfrost_set_shader_buffers callback
Just copy over the passed SSBO for now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-01 14:32:08 -07:00
Alyssa Rosenzweig
898a18ea89 gallium/util: Add util_set_shader_buffers_mask helper
Conceptually follows util_set_vertex_buffers_mask but for SSBOs.

v2: Fix missing ~ when clearing mask. Adjust mask behaviour to match
freedreno/v3d when buffer == NULL.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-01 14:31:56 -07:00
Jonathan Marek
3e33173200 kmsro: move entry points from etnaviv to kmsro
These drivers are kmsro drivers so they should be part of the kmsro #if

This fixes missing imx_drm driver when building with only freedreno+kmsro

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-01 16:31:51 -04:00
933 changed files with 37686 additions and 17043 deletions

View File

@@ -14,7 +14,7 @@
# repository's registry will be used there as well.
variables:
UPSTREAM_REPO: mesa/mesa
DEBIAN_TAG: "2019-07-23"
DEBIAN_TAG: "2019-08-09"
DEBIAN_VERSION: stretch-slim
DEBIAN_IMAGE: "$CI_REGISTRY_IMAGE/debian/$DEBIAN_VERSION:$DEBIAN_TAG"
@@ -26,6 +26,7 @@ include:
stages:
- containers-build
- build+test
- test
# When to automatically run the CI
@@ -39,6 +40,14 @@ stages:
when:
- runner_system_failure
.ci-deqp-artifacts: &ci-deqp-artifacts
artifacts:
when: always
untracked: false
paths:
# Watch out! Artifacts are relative to the build dir.
# https://gitlab.com/gitlab-org/gitlab-ce/commit/8788fb925706cad594adf6917a6c5f6587dd1521
- artifacts
# CONTAINERS
@@ -77,6 +86,12 @@ debian:
- ccache --zero-stats || true
- ccache --show-stats || true
after_script:
# In case the install dir is being saved as artifacts, tar it up
# so that symlinks and hardlinks aren't each packed separately in
# the zip file.
- if [ -d install ]; then
tar -cf artifacts/install.tar install;
fi
- export CCACHE_DIR="$PWD/ccache"
- ccache --show-stats
@@ -100,12 +115,8 @@ debian:
# gallium drivers combined.
# Start this early so that it doesn't limit the total run time.
#
# We also put softpipe (and therefore gallium nine, which requires
# it) here, since softpipe/llvmpipe can't be built alongside classic
# swrast.
#
# Putting glvnd here is arbitrary, but we want it in one of the builds
# for coverage.
# We also stick the glvnd build here, since we want non-glvnd in
# meson-main for actual driver CI.
meson-swr-glvnd:
extends: .meson-build
variables:
@@ -120,10 +131,9 @@ meson-swr-glvnd:
-D gallium-omx=disabled
-D gallium-va=false
-D gallium-xa=false
-D gallium-nine=true
-D gallium-nine=false
-D gallium-opencl=disabled
-D osmesa=gallium
GALLIUM_DRIVERS: "swr,swrast,iris"
GALLIUM_DRIVERS: "swr,iris"
LLVM_VERSION: "6.0"
meson-clang:
@@ -163,24 +173,25 @@ meson-main:
-D gbm=true
-D egl=true
-D platforms=x11,wayland,drm,surfaceless
-D osmesa=classic
DRI_DRIVERS: "i915,i965,r100,r200,swrast,nouveau"
DRI_DRIVERS: "i915,i965,r100,r200,nouveau"
GALLIUM_ST: >
-D dri3=true
-D tools=drm-shim
-D gallium-extra-hud=true
-D gallium-vdpau=true
-D gallium-xvmc=true
-D gallium-omx=bellagio
-D gallium-va=true
-D gallium-xa=true
-D gallium-nine=false
-D gallium-nine=true
-D gallium-opencl=disabled
GALLIUM_DRIVERS: "iris,nouveau,kmsro,r300,r600,freedreno,svga,v3d,vc4,virgl,etnaviv,panfrost,lima"
GALLIUM_DRIVERS: "iris,nouveau,kmsro,r300,r600,freedreno,swrast,svga,v3d,vc4,virgl,etnaviv,panfrost,lima"
LLVM_VERSION: "7"
EXTRA_OPTION: >
-D osmesa=gallium
-D tools=all
MESON_SHADERDB: "true"
BUILDTYPE: "debugoptimized"
<<: *ci-deqp-artifacts
meson-clover:
extends: .meson-build
@@ -252,19 +263,14 @@ meson-vulkan:
-D gallium-xa=false
-D gallium-nine=false
-D llvm=false
CROSS: >
--cross /tmp/cross_file.txt
<<: *ci-deqp-artifacts
script:
- /usr/share/meson/debcrossgen --arch ${ARCH} -o /tmp/cross_file.txt
# Work around a bug in debcrossgen that should be fixed in the next release
- sed -i "s|cpu_family = 'i686'|cpu_family = 'x86'|g" /tmp/cross_file.txt
- .gitlab-ci/meson-build.sh
meson-armhf:
extends: .meson-cross
variables:
ARCH: armhf
CROSS: armhf
VULKAN_DRIVERS: freedreno
GALLIUM_DRIVERS: "etnaviv,freedreno,kmsro,lima,nouveau,panfrost,tegra,v3d,vc4"
# Disable the tests since we're cross compiling.
@@ -276,7 +282,7 @@ meson-armhf:
meson-arm64:
extends: .meson-cross
variables:
ARCH: arm64
CROSS: arm64
VULKAN_DRIVERS: freedreno
GALLIUM_DRIVERS: "etnaviv,freedreno,kmsro,lima,nouveau,panfrost,tegra,v3d,vc4"
# Disable the tests since we're cross compiling.
@@ -285,17 +291,23 @@ meson-arm64:
-D I-love-half-baked-turnips=true
-D vulkan-overlay-layer=true
# While the main point of this build is testing the i386 cross build,
# we also use this one to test some other options that are exclusive
# with meson-main's choices (classic swrast and osmesa)
meson-i386:
extends: .meson-cross
variables:
ARCH: i386
CROSS: i386
VULKAN_DRIVERS: intel
GALLIUM_DRIVERS: "swrast"
DRI_DRIVERS: "swrast"
GALLIUM_DRIVERS: "iris"
# Disable i386 tests, because u_format_tests gets precision
# failures in dxtn unpacking
EXTRA_OPTION: >
-D build-tests=false
-D vulkan-overlay-layer=true
-D llvm=false
-D osmesa=classic
scons-nollvm:
extends: .scons-build
@@ -311,3 +323,60 @@ scons-llvm:
LLVM_VERSION: "3.4"
# LLVM 3.4 packages were built with an old libstdc++ ABI
CXX: "g++ -D_GLIBCXX_USE_CXX11_ABI=0"
.deqp-test:
<<: *ci-run-policy
stage: test
image: $DEBIAN_IMAGE
variables:
GIT_STRATEGY: none # testing doesn't build anything from source
DEQP_SKIPS: deqp-default-skips.txt
script:
# Note: Build dir (and thus install) may be dirty due to GIT_STRATEGY
- rm -rf install
- tar -xf artifacts/install.tar
- ./artifacts/deqp-runner.sh
artifacts:
when: on_failure
name: "$CI_JOB_NAME-$CI_COMMIT_REF_NAME"
paths:
- results/
test-llvmpipe-gles2:
parallel: 4
variables:
DEQP_VER: gles2
DEQP_EXPECTED_FAILS: deqp-llvmpipe-fails.txt
LIBGL_ALWAYS_SOFTWARE: "true"
DEQP_RENDERER_MATCH: "llvmpipe"
extends: .deqp-test
dependencies:
- meson-main
test-softpipe-gles2:
parallel: 4
variables:
DEQP_VER: gles2
DEQP_EXPECTED_FAILS: deqp-softpipe-fails.txt
LIBGL_ALWAYS_SOFTWARE: "true"
DEQP_RENDERER_MATCH: "softpipe"
GALLIUM_DRIVER: "softpipe"
extends: .deqp-test
dependencies:
- meson-main
# The GLES2 CTS run takes about 8 minutes of CPU time, while GLES3 is
# 25 minutes. Until we can get its runtime down, just do a partial
# (every 10 tests) run.
test-softpipe-gles3-limited:
variables:
DEQP_VER: gles3
DEQP_EXPECTED_FAILS: deqp-softpipe-fails.txt
LIBGL_ALWAYS_SOFTWARE: "true"
DEQP_RENDERER_MATCH: "softpipe"
GALLIUM_DRIVER: "softpipe"
CI_NODE_INDEX: 1
CI_NODE_TOTAL: 10
extends: .deqp-test
dependencies:
- meson-main

View File

@@ -49,6 +49,7 @@ echo "deb https://deb.debian.org/debian/ buster main" >/etc/apt/sources.list.d/b
echo "deb https://deb.debian.org/debian/ buster-updates main" >/etc/apt/sources.list.d/buster-updates.list
apt-get update
apt-get install -y \
git \
bzip2 \
zlib1g-dev \
pkg-config \
@@ -69,19 +70,18 @@ apt-get install -y \
libelf-dev \
libunwind-dev \
libglvnd-dev \
libgtk-3-dev \
libpng-dev \
libgbm-dev \
libgles2-mesa-dev \
python-mako \
python3-mako \
meson \
scons
# autotools build deps
apt-get install -y \
automake \
libtool \
bison \
flex \
gettext \
make
cmake \
meson \
scons
# Cross-build Mesa deps
for arch in $CROSS_ARCHITECTURES; do
@@ -120,14 +120,14 @@ export DRI2PROTO_VERSION=dri2proto-2.8
export LIBPCIACCESS_VERSION=libpciaccess-0.13.4
export LIBDRM_VERSION=libdrm-2.4.99
export XCBPROTO_VERSION=xcb-proto-1.13
export RANDRPROTO_VERSION=randrproto-1.3.0
export LIBXRANDR_VERSION=libXrandr-1.3.0
export RANDRPROTO_VERSION=randrproto-1.5.0
export LIBXRANDR_VERSION=libXrandr-1.5.0
export LIBXCB_VERSION=libxcb-1.13
export LIBXSHMFENCE_VERSION=libxshmfence-1.3
export LIBVDPAU_VERSION=libvdpau-1.1
export LIBVA_VERSION=libva-1.7.0
export LIBWAYLAND_VERSION=wayland-1.15.0
export WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
export WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.12
wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2
tar -xvf $XORGMACROS_VERSION.tar.bz2 && rm $XORGMACROS_VERSION.tar.bz2
@@ -212,13 +212,74 @@ apt-get install -y ccache
# We need xmllint to validate the XML files in Mesa
apt-get install -y libxml2-utils
# Remove unused packages
# Generate cross build files for Meson
for arch in $CROSS_ARCHITECTURES; do
cross_file="/cross_file-$arch.txt"
/usr/share/meson/debcrossgen --arch "$arch" -o "$cross_file"
# Work around a bug in debcrossgen that should be fixed in the next release
if [ "$arch" = "i386" ]; then
sed -i "s|cpu_family = 'i686'|cpu_family = 'x86'|g" "$cross_file"
fi
done
############### Build dEQP
git config --global user.email "mesa@example.com"
git config --global user.name "Mesa CI"
# XXX: Use --depth 1 once we can drop the cherry-picks.
git clone \
https://github.com/KhronosGroup/VK-GL-CTS.git \
-b opengl-es-cts-3.2.5.1 \
/VK-GL-CTS
cd /VK-GL-CTS
# Fix surfaceless build
git cherry-pick -x 22f41e5e321c6dcd8569c4dad91bce89f06b3670
git cherry-pick -x 1daa8dff73161ea60ead965bd6c9f2a0a2165648
# surfaceless links against libkms and such despite not using it.
sed -i '/gbm/d' targets/surfaceless/surfaceless.cmake
sed -i '/libkms/d' targets/surfaceless/surfaceless.cmake
sed -i '/libgbm/d' targets/surfaceless/surfaceless.cmake
python3 external/fetch_sources.py
mkdir -p /deqp
cd /deqp
cmake -G Ninja \
-DDEQP_TARGET=surfaceless \
-DCMAKE_BUILD_TYPE=Release \
/VK-GL-CTS
ninja
# Copy out the mustpass lists we want from a bunch of other junk.
mkdir /deqp/mustpass
for gles in gles2 gles3 gles31; do
cp \
/deqp/external/openglcts/modules/gl_cts/data/mustpass/gles/aosp_mustpass/3.2.5.x/$gles-master.txt \
/deqp/mustpass/$gles-master.txt
done
# Remove the rest of the build products that we don't need.
rm -rf /deqp/external
rm -rf /deqp/modules/internal
rm -rf /deqp/executor
rm -rf /deqp/execserver
rm -rf /deqp/modules/egl
rm -rf /deqp/framework
du -sh *
rm -rf /VK-GL-CTS
############### Uninstall the build software
apt-get purge -y \
automake \
git \
libtool \
curl \
unzip \
gnupg \
software-properties-common
cmake \
git \
libgles2-mesa-dev \
libgbm-dev
apt-get autoremove -y --purge

View File

@@ -0,0 +1,10 @@
# Note: skips lists for CI are just a list of lines that, when
# non-zero-length and not starting with '#', will regex match to
# delete lines from the test list. Be careful.
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance
dEQP-GLES[0-9]*.stress
# These are really slow on tiling architectures (including llvmpipe).
dEQP-GLES[0-9]*.functional.flush_finish

View File

@@ -0,0 +1,124 @@
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
dEQP-GLES2.functional.clipping.point.wide_point_clip
dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_center
dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_corner
dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_neg_y_neg_z_and_neg_x_neg_y_pos_z
dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_pos_y_pos_z_and_neg_x_neg_y_neg_z
dEQP-GLES2.functional.fbo.render.color_clear.rbo_rgba4
dEQP-GLES2.functional.fbo.render.color_clear.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.color_clear.rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.depth.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.no_rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.no_rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgba4_stencil_index8
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4
dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
dEQP-GLES2.functional.polygon_offset.default_displacement_with_units
dEQP-GLES2.functional.polygon_offset.fixed16_displacement_with_units
dEQP-GLES2.functional.rasterization.interpolation.basic.line_loop_wide
dEQP-GLES2.functional.rasterization.interpolation.basic.line_strip_wide
dEQP-GLES2.functional.rasterization.interpolation.basic.lines_wide
dEQP-GLES2.functional.rasterization.interpolation.projected.line_loop_wide
dEQP-GLES2.functional.rasterization.interpolation.projected.line_strip_wide
dEQP-GLES2.functional.rasterization.interpolation.projected.lines_wide
dEQP-GLES2.functional.rasterization.limits.points
dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2d_bias
dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2dproj_vec3_bias
dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2dproj_vec4_bias
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_clamp_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_mirror_etc1
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_mirror_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_repeat_etc1
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_repeat_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_clamp_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_mirror_etc1
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_mirror_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_etc1
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_l8
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_rgb888
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_rgba4444
dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_clamp_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_mirror_etc1
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_mirror_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_repeat_etc1
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_repeat_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_clamp_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_mirror_etc1
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_mirror_rgba8888
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_etc1
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_l8
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_rgb888
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_rgba4444
dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_rgba8888
dEQP-GLES2.functional.texture.mipmap.2d.affine.linear_linear_repeat
dEQP-GLES2.functional.texture.mipmap.2d.affine.nearest_linear_clamp
dEQP-GLES2.functional.texture.mipmap.2d.affine.nearest_linear_mirror
dEQP-GLES2.functional.texture.mipmap.2d.affine.nearest_linear_repeat
dEQP-GLES2.functional.texture.mipmap.2d.basic.linear_linear_repeat
dEQP-GLES2.functional.texture.mipmap.2d.basic.linear_linear_repeat_non_square
dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_clamp
dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_clamp_non_square
dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_mirror
dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_mirror_non_square
dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_repeat
dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_repeat_non_square
dEQP-GLES2.functional.texture.mipmap.2d.projected.linear_linear_repeat
dEQP-GLES2.functional.texture.mipmap.2d.projected.nearest_linear_clamp
dEQP-GLES2.functional.texture.mipmap.2d.projected.nearest_linear_mirror
dEQP-GLES2.functional.texture.mipmap.2d.projected.nearest_linear_repeat
dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_linear
dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_nearest
dEQP-GLES2.functional.texture.mipmap.cube.bias.linear_linear
dEQP-GLES2.functional.texture.mipmap.cube.bias.linear_nearest
dEQP-GLES2.functional.texture.mipmap.cube.projected.linear_linear
dEQP-GLES2.functional.texture.mipmap.cube.projected.linear_nearest
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_repeat
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_clamp
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_mirror
dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_repeat
dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_clamp
dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_mirror
dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_repeat
dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_clamp
dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_mirror
dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_repeat
dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_clamp
dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_mirror
dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_repeat
dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_linear_clamp
dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_linear_mirror
dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_linear_repeat
dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_nearest_clamp
dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_nearest_mirror
dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_nearest_repeat
dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_linear_clamp
dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_linear_mirror
dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_linear_repeat
dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_nearest_clamp
dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_nearest_mirror
dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_nearest_repeat
dEQP-GLES2.functional.texture.vertex.cube.wrap.clamp_clamp
dEQP-GLES2.functional.texture.vertex.cube.wrap.clamp_mirror
dEQP-GLES2.functional.texture.vertex.cube.wrap.clamp_repeat
dEQP-GLES2.functional.texture.vertex.cube.wrap.mirror_clamp
dEQP-GLES2.functional.texture.vertex.cube.wrap.mirror_mirror
dEQP-GLES2.functional.texture.vertex.cube.wrap.mirror_repeat
dEQP-GLES2.functional.texture.vertex.cube.wrap.repeat_clamp
dEQP-GLES2.functional.texture.vertex.cube.wrap.repeat_mirror
dEQP-GLES2.functional.texture.vertex.cube.wrap.repeat_repeat

112
.gitlab-ci/deqp-runner.sh Executable file
View File

@@ -0,0 +1,112 @@
#!/bin/bash
set -ex
DEQP_OPTIONS=(--deqp-surface-width=256 --deqp-surface-height=256)
DEQP_OPTIONS+=(--deqp-surface-type=pbuffer)
DEQP_OPTIONS+=(--deqp-gl-config-name=rgba8888d24s8ms0)
DEQP_OPTIONS+=(--deqp-visibility=hidden)
DEQP_OPTIONS+=(--deqp-log-images=disable)
DEQP_OPTIONS+=(--deqp-watchdog=enable)
DEQP_OPTIONS+=(--deqp-crashhandler=enable)
if [ -z "$DEQP_VER" ]; then
echo 'DEQP_VER must be set to something like "gles2" or "gles31" for the test run'
exit 1
fi
if [ -z "$DEQP_SKIPS" ]; then
echo 'DEQP_SKIPS must be set to something like "deqp-default-skips.txt"'
exit 1
fi
# Prep the expected failure list
if [ -n "$DEQP_EXPECTED_FAILS" ]; then
export DEQP_EXPECTED_FAILS=`pwd`/artifacts/$DEQP_EXPECTED_FAILS
else
export DEQP_EXPECTED_FAILS=/tmp/expect-no-failures.txt
touch $DEQP_EXPECTED_FAILS
fi
sort < $DEQP_EXPECTED_FAILS > /tmp/expected-fails.txt
# Fix relative paths on inputs.
export DEQP_SKIPS=`pwd`/artifacts/$DEQP_SKIPS
# Be a good citizen on the shared runners.
export LP_NUM_THREADS=4
# Set up the driver environment.
export LD_LIBRARY_PATH=`pwd`/install/lib/
export EGL_PLATFORM=surfaceless
# the runner was failing to look for libkms in /usr/local/lib for some reason
# I never figured out.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
RESULTS=`pwd`/results
mkdir -p $RESULTS
cd /deqp/modules/$DEQP_VER
# Generate test case list file
cp /deqp/mustpass/$DEQP_VER-master.txt /tmp/case-list.txt
# Note: not using sorted input and comm, becuase I want to run the tests in
# the same order that dEQP would.
while read -r line; do
if echo "$line" | grep -q '^[^#]'; then
sed -i "/$line/d" /tmp/case-list.txt
fi
done < $DEQP_SKIPS
# If the job is parallel, take the corresponding fraction of the caselist.
# Note: N~M is a gnu sed extension to match every nth line (first line is #1).
if [ -n "$CI_NODE_INDEX" ]; then
sed -ni $CI_NODE_INDEX~$CI_NODE_TOTAL"p" /tmp/case-list.txt
fi
if [ ! -s /tmp/case-list.txt ]; then
echo "Caselist generation failed"
exit 1
fi
# Cannot use tee because dash doesn't have pipefail
touch /tmp/result.txt
tail -f /tmp/result.txt &
./deqp-$DEQP_VER "${DEQP_OPTIONS[@]}" --deqp-log-filename=$RESULTS/results.qpa --deqp-caselist-file=/tmp/case-list.txt >> /tmp/result.txt
DEQP_EXITCODE=$?
sed -ne \
'/StatusCode="Fail"/{x;p}; s/#beginTestCaseResult //; T; h' \
$RESULTS/results.qpa \
> /tmp/unsorted-fails.txt
# Scrape out the renderer that the test run used, so we can validate that the
# right driver was used.
if grep -q "dEQP-.*.info.renderer" /tmp/case-list.txt; then
# This is an ugly dependency on the .qpa format: Print 3 lines after the
# match, which happens to contain the result.
RENDERER=`sed -n '/#beginTestCaseResult dEQP-.*.info.renderer/{n;n;n;p}' $RESULTS/results.qpa | sed -n -E "s|<Text>(.*)</Text>|\1|p"`
echo "GL_RENDERER for this test run: $RENDERER"
if [ -n "$DEQP_RENDERER_MATCH" ]; then
echo $RENDERER | grep -q $DEQP_RENDERER_MATCH > /dev/null
fi
fi
if [ $DEQP_EXITCODE -ne 0 ]; then
exit $DEQP_EXITCODE
fi
sort < /tmp/unsorted-fails.txt > $RESULTS/fails.txt
comm -23 $RESULTS/fails.txt /tmp/expected-fails.txt > /tmp/new-fails.txt
if [ -s /tmp/new-fails.txt ]; then
echo "Unexpected failures:"
cat /tmp/new-fails.txt
exit 1
else
echo "No new failures"
fi

View File

@@ -0,0 +1,445 @@
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
dEQP-GLES2.functional.clipping.point.wide_point_clip
dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_center
dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_corner
dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_neg_y_neg_z_and_neg_x_neg_y_pos_z
dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_pos_y_pos_z_and_neg_x_neg_y_neg_z
dEQP-GLES2.functional.polygon_offset.default_displacement_with_units
dEQP-GLES2.functional.polygon_offset.fixed16_displacement_with_units
dEQP-GLES2.functional.rasterization.interpolation.basic.line_loop_wide
dEQP-GLES2.functional.rasterization.interpolation.basic.line_strip_wide
dEQP-GLES2.functional.rasterization.interpolation.basic.lines_wide
dEQP-GLES2.functional.rasterization.interpolation.projected.line_loop_wide
dEQP-GLES2.functional.rasterization.interpolation.projected.line_strip_wide
dEQP-GLES2.functional.rasterization.interpolation.projected.lines_wide
dEQP-GLES2.functional.rasterization.limits.points
dEQP-GLES2.functional.rasterization.primitives.points
dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_center
dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_corner
dEQP-GLES3.functional.clipping.point.wide_point_clip
dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner
dEQP-GLES3.functional.clipping.triangle_vertex.clip_two.clip_neg_y_neg_z_and_neg_x_neg_y_pos_z
dEQP-GLES3.functional.clipping.triangle_vertex.clip_two.clip_pos_y_pos_z_and_neg_x_neg_y_neg_z
dEQP-GLES3.functional.draw.random.124
dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth24_stencil8
dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth32f_stencil8
dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth_component16
dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth_component24
dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth_component32f
dEQP-GLES3.functional.fbo.depth.depth_write_clamp.depth32f_stencil8
dEQP-GLES3.functional.fbo.depth.depth_write_clamp.depth_component32f
dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_color
dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth
dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth_stencil
dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_stencil
dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_color
dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth
dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth_stencil
dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_stencil
dEQP-GLES3.functional.fbo.msaa.2_samples.depth24_stencil8
dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8
dEQP-GLES3.functional.fbo.msaa.2_samples.depth_component16
dEQP-GLES3.functional.fbo.msaa.2_samples.depth_component24
dEQP-GLES3.functional.fbo.msaa.2_samples.depth_component32f
dEQP-GLES3.functional.fbo.msaa.2_samples.r11f_g11f_b10f
dEQP-GLES3.functional.fbo.msaa.2_samples.r16f
dEQP-GLES3.functional.fbo.msaa.2_samples.r8
dEQP-GLES3.functional.fbo.msaa.2_samples.rg16f
dEQP-GLES3.functional.fbo.msaa.2_samples.rg8
dEQP-GLES3.functional.fbo.msaa.2_samples.rgb10_a2
dEQP-GLES3.functional.fbo.msaa.2_samples.rgb565
dEQP-GLES3.functional.fbo.msaa.2_samples.rgb5_a1
dEQP-GLES3.functional.fbo.msaa.2_samples.rgb8
dEQP-GLES3.functional.fbo.msaa.2_samples.rgba4
dEQP-GLES3.functional.fbo.msaa.2_samples.rgba8
dEQP-GLES3.functional.fbo.msaa.2_samples.srgb8_alpha8
dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8
dEQP-GLES3.functional.fbo.msaa.4_samples.depth24_stencil8
dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8
dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component16
dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component24
dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component32f
dEQP-GLES3.functional.fbo.msaa.4_samples.r11f_g11f_b10f
dEQP-GLES3.functional.fbo.msaa.4_samples.r16f
dEQP-GLES3.functional.fbo.msaa.4_samples.r8
dEQP-GLES3.functional.fbo.msaa.4_samples.rg16f
dEQP-GLES3.functional.fbo.msaa.4_samples.rg8
dEQP-GLES3.functional.fbo.msaa.4_samples.rgb10_a2
dEQP-GLES3.functional.fbo.msaa.4_samples.rgb565
dEQP-GLES3.functional.fbo.msaa.4_samples.rgb5_a1
dEQP-GLES3.functional.fbo.msaa.4_samples.rgb8
dEQP-GLES3.functional.fbo.msaa.4_samples.rgba4
dEQP-GLES3.functional.fbo.msaa.4_samples.rgba8
dEQP-GLES3.functional.fbo.msaa.4_samples.srgb8_alpha8
dEQP-GLES3.functional.fbo.msaa.4_samples.stencil_index8
dEQP-GLES3.functional.multisample.fbo_max_samples.proportionality_alpha_to_coverage
dEQP-GLES3.functional.multisample.fbo_max_samples.proportionality_sample_coverage
dEQP-GLES3.functional.multisample.fbo_max_samples.proportionality_sample_coverage_inverted
dEQP-GLES3.functional.multisample.fbo_max_samples.sample_coverage_invert
dEQP-GLES3.functional.negative_api.buffer.blit_framebuffer_multisample
dEQP-GLES3.functional.negative_api.buffer.read_pixels_fbo_format_mismatch
dEQP-GLES3.functional.polygon_offset.default_displacement_with_units
dEQP-GLES3.functional.polygon_offset.fixed16_displacement_with_units
dEQP-GLES3.functional.polygon_offset.fixed24_displacement_with_units
dEQP-GLES3.functional.polygon_offset.float32_displacement_with_units
dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.interpolation.lines_wide
dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.interpolation.lines_wide
dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.points
dEQP-GLES3.functional.rasterization.fbo.texture_2d.interpolation.lines_wide
dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.points
dEQP-GLES3.functional.rasterization.interpolation.basic.line_loop_wide
dEQP-GLES3.functional.rasterization.interpolation.basic.line_strip_wide
dEQP-GLES3.functional.rasterization.interpolation.basic.lines_wide
dEQP-GLES3.functional.rasterization.interpolation.projected.line_loop_wide
dEQP-GLES3.functional.rasterization.interpolation.projected.line_strip_wide
dEQP-GLES3.functional.rasterization.interpolation.projected.lines_wide
dEQP-GLES3.functional.rasterization.primitives.points
dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_points
dEQP-GLES3.functional.rasterizer_discard.basic.write_stencil_points
dEQP-GLES3.functional.rasterizer_discard.fbo.write_depth_points
dEQP-GLES3.functional.rasterizer_discard.fbo.write_stencil_points
dEQP-GLES3.functional.rasterizer_discard.scissor.write_depth_points
dEQP-GLES3.functional.rasterizer_discard.scissor.write_stencil_points
dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.float_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.float_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec2_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec3_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec4_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec4_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.float_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.float_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec2_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec2_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec3_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec3_mediump
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec4_highp
dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec4_mediump
dEQP-GLES3.functional.state_query.integers.max_samples_getfloat
dEQP-GLES3.functional.state_query.integers.max_samples_getinteger64
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_clamp_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_clamp_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_clamp_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_mirror_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_mirror_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_mirror_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_clamp_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_clamp_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_clamp_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_mirror_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_mirror_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_mirror_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_clamp_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_clamp_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_clamp_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_mirror_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_mirror_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_mirror_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_clamp_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_clamp_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_clamp_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_mirror_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_mirror_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_mirror_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_clamp_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_clamp_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_clamp_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_mirror_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_mirror_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_mirror_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_clamp_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_clamp_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_mirror_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_mirror_repeat
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_repeat_mirror
dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_repeat_repeat
dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_linear
dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_linear
dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_linear_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_linear_mipmap_nearest
dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_nearest_mipmap_linear
dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_nearest_mipmap_nearest
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_linear_clamp
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_linear_mirror
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_linear_repeat
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_linear_clamp
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_linear_mirror
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_linear_repeat
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_nearest_clamp
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_nearest_mirror
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_nearest_repeat
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_nearest_linear_repeat
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_nearest_clamp
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_nearest_mirror
dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_nearest_repeat
dEQP-GLES3.functional.texture.vertex.3d.filtering.nearest_linear_repeat
dEQP-GLES3.functional.texture.vertex.3d.filtering.nearest_mipmap_linear_linear_repeat
dEQP-GLES3.functional.texture.vertex.3d.filtering.nearest_mipmap_nearest_linear_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_clamp_clamp
dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_clamp_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_clamp_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_mirror_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_mirror_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_repeat_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_repeat_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_clamp_clamp
dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_clamp_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_clamp_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_mirror_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_mirror_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_repeat_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_repeat_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_clamp_clamp
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_clamp_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_clamp_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_mirror_clamp
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_mirror_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_mirror_repeat
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_repeat_clamp
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_repeat_mirror
dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_repeat_repeat
dEQP-GLES3.functional.texture.wrap.astc_8x8.repeat_repeat_linear_divisible
dEQP-GLES3.functional.texture.wrap.astc_8x8.repeat_repeat_linear_not_divisible
dEQP-GLES3.functional.texture.wrap.astc_8x8_srgb.repeat_repeat_linear_divisible
dEQP-GLES3.functional.texture.wrap.astc_8x8_srgb.repeat_repeat_linear_not_divisible
dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.int2_10_10_10.components4_quads1
dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.int2_10_10_10.components4_quads256

View File

@@ -16,9 +16,10 @@ fi
rm -rf _build
meson _build --native-file=native.file \
${CROSS} \
${CROSS+--cross /cross_file-$CROSS.txt} \
-D prefix=`pwd`/install \
-D libdir=lib \
-D buildtype=debug \
-D buildtype=${BUILDTYPE:-debug} \
-D build-tests=true \
-D libunwind=${UNWIND} \
${DRI_LOADERS} \
@@ -32,9 +33,30 @@ cd _build
meson configure
ninja -j4
LC_ALL=C.UTF-8 ninja test
DESTDIR=$PWD/../install ninja install
ninja install
cd ..
if test -n "$MESON_SHADERDB"; then
./.gitlab-ci/run-shader-db.sh;
fi
# Delete 2MB of includes from artifacts.
rm -rf install/include
# Strip the drivers in the artifacts to cut 80% of the artifacts size.
if [ -n "$CROSS" ]; then
STRIP=`sed -n -E "s/strip\s*=\s*'(.*)'/\1/p" /cross_file-$CROSS.txt`
if [ -z "$STRIP" ]; then
echo "Failed to find strip command in cross file"
exit 1
fi
else
STRIP="strip"
fi
find install -name \*.so -exec $STRIP {} \;
# Test runs don't pull down the git tree, so put the dEQP helper
# script and associated bits there.
mkdir -p artifacts/
cp -Rp .gitlab-ci/deqp* artifacts/
# cp -Rp src/freedreno/ci/expected* artifacts/

View File

@@ -5,8 +5,8 @@ ARTIFACTSDIR=`pwd`/shader-db
mkdir -p $ARTIFACTSDIR
export DRM_SHIM_DEBUG=true
LIBDIR=`pwd`/install/usr/local/lib
export LIBGL_DRIVERS_PATH=$LIBDIR/dri
LIBDIR=`pwd`/install/lib
export LD_LIBRARY_PATH=$LIBDIR
cd /usr/local/shader-db

View File

@@ -9,8 +9,22 @@ env:
global:
- PKG_CONFIG_PATH=""
matrix:
include:
- env:
- BUILD=meson
- env:
- BUILD=scons
before_install:
- HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext
- HOMEBREW_NO_AUTO_UPDATE=1 brew install expat gettext
- if test "x$BUILD" = xmeson; then
HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja;
fi
- if test "x$BUILD" = xscons; then
HOMEBREW_NO_AUTO_UPDATE=1 brew install python2 scons;
fi
# Set PATH for homebrew pip3 installs
- PATH="$HOME/Library/Python/3.6/bin:${PATH}"
# Set PKG_CONFIG_PATH for keg-only expat
@@ -28,10 +42,21 @@ before_install:
- PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"
install:
- pip3 install --user meson
- pip3 install --user mako
- if test "x$BUILD" = xmeson; then
pip3 install --user meson;
pip3 install --user mako;
fi
- if test "x$BUILD" = xscons; then
pip2 install --user mako;
fi
script:
- meson _build -Dbuild-tests=true
- ninja -C _build
- ninja -C _build test
- if test "x$BUILD" = xmeson; then
meson _build -Dbuild-tests=true;
ninja -C _build || travis_terminate 1;
ninja -C _build test || travis_terminate 1;
fi
- if test "x$BUILD" = xscons; then
scons || travis_terminate 1;
scons check || travis_terminate 1;
fi

View File

@@ -39,7 +39,7 @@ LOCAL_CFLAGS += \
-Wno-initializer-overrides \
-Wno-mismatched-tags \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"
-DPACKAGE_BUGREPORT=\"https://gitlab.freedesktop.org/mesa/mesa/issues\"
# XXX: The following __STDC_*_MACROS defines should not be needed.
# It's likely due to a bug elsewhere, but let's temporarily add them
@@ -103,9 +103,12 @@ ifeq ($(shell test $(PLATFORM_SDK_VERSION) -ge 26 && echo true),true)
LOCAL_CFLAGS += -DHAVE_SYS_SHM_H
endif
ifeq ($(strip $(MESA_ENABLE_ASM)),true)
ifeq ($(TARGET_ARCH),x86)
LOCAL_CFLAGS += \
-DUSE_X86_ASM
endif
endif
ifeq ($(ARCH_ARM_HAVE_NEON),true)
LOCAL_CFLAGS_arm += -DUSE_ARM_ASM

View File

@@ -83,6 +83,13 @@ endif
$(foreach d, $(MESA_BUILD_CLASSIC) $(MESA_BUILD_GALLIUM), $(eval $(d) := true))
# host and target must be the same arch to generate matypes.h
ifeq ($(TARGET_ARCH),$(HOST_ARCH))
MESA_ENABLE_ASM := true
else
MESA_ENABLE_ASM := false
endif
ifneq ($(filter true, $(HAVE_GALLIUM_RADEONSI)),)
MESA_ENABLE_LLVM := true
endif
@@ -115,7 +122,8 @@ SUBDIRS := \
src/broadcom \
src/intel \
src/mesa/drivers/dri \
src/vulkan
src/vulkan \
src/panfrost \
INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))
INC_DIRS += $(call all-named-subdir-makefiles,src/gallium)

View File

@@ -73,7 +73,7 @@ with open("VERSION") as f:
mesa_version = f.read().strip()
env.Append(CPPDEFINES = [
('PACKAGE_VERSION', '\\"%s\\"' % mesa_version),
('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),
('PACKAGE_BUGREPORT', '\\"https://gitlab.freedesktop.org/mesa/mesa/issues\\"'),
])
# Includes

View File

@@ -1 +1 @@
19.2.0-devel
19.2.8

51
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,51 @@
# warnings that are not useful
da5ebe30105f70e3520ce3ae145793b755552569
6b8cb087568699ca9a6e9e8b7bf49179e622b59f
# Jason doesn't want this applied to 19.2 (it's a revert)
d15fe8ca8262d502435c4f83985ac414f950bc5f
# This doesn't apply to 19.2
f833b4cada07b746a10ffa4d93fcd821920c3cb1
d2db43fcad6a2ea2070ff5f7884411f4b7d3925c
66f2aa6ccd0b226eebe2c1a46281160b0a54d522
# The author requested that this not be applied to 19.2
dcc0e23438f3e5929c2ef74d57e8207be25ecb41
# This doesn't apply cleanly, and no one really cares about this file on stable
# branches anyway.
bcd9224728dcb8d8fe4bcddc4bd9b2c36fcfe9dd
# De-nominated by its author due to alternate fix not being backported
43041627445540afda1a05d11861935963660344
# This is immediately reverted, so just don't apply
19546108d3dd5541a189e36df4ea83b3f519e48f
# The authors requested these not be applied to 19.2
869e32593a9096b845dd6106f8f86e1c41fac968
a2c3c65a31de90fdb55f76f2894860dfbafe2043
bb0c5c487e63e88acbb792f092dd8f392bad8540
937b9055698be0dfdb7d2e0673a989e2ecc05912
21376cffb37018160ad3eef38b5a640ba1675a4f
# This is reverted shortly after it was landed
4432a2d14d80081d062f7939a950d65ea3a16eed
# These aren't relevant for 19.2
1a05811936dd8d0c3a367c6f00629624ef39d537
911a8261419f48dcd756f78832fa5a5f4c5b8d93
# This was manually backported
2afeed301010917c4eae55dcd2544f9d329df934
4b392ced2d744fccffe95490ff57e6b41033c266
# This is not being backported to 19.2 due to causing build regressions for
# downstream projects
eaf43966027cf9654e91ca57aecc8f5a65b58f49
# Invalid sha warnings
023282a4f667695ea1dbbe9fbe1cd3a9d550a426
2fca325ea65f068043d4c18c9cd0fe7f25bde8f7
7564c5fc6d79a2ddec49a19f67183fb3be799fe5

0
bin/__init__.py Normal file
View File

View File

@@ -1,35 +0,0 @@
#!/bin/sh
# This script is used to generate the list of fixed bugs that
# appears in the release notes files, with HTML formatting.
#
# Note: This script could take a while until all details have
# been fetched from bugzilla.
#
# Usage examples:
#
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes
# regex pattern: trim before bug number
trim_before='s/.*show_bug.cgi?id=\([0-9]*\).*/\1/'
# regex pattern: reconstruct the url
use_after='s,^,https://bugs.freedesktop.org/show_bug.cgi?id=,'
echo "<ul>"
echo ""
# extract fdo urls from commit log
git log --pretty=medium $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\
while read url
do
id=$(echo $url | cut -d'=' -f2)
summary=$(wget --quiet -O - $url | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')
echo "<li><a href=\"$url\">Bug $id</a> - $summary</li>"
echo ""
done
echo "</ul>"

272
bin/gen_release_notes.py Executable file
View File

@@ -0,0 +1,272 @@
#!/usr/bin/env python3
# Copyright © 2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Generates release notes for a given version of mesa."""
import asyncio
import datetime
import os
import pathlib
import sys
import textwrap
import typing
import urllib.parse
import aiohttp
from mako.template import Template
from mako import exceptions
CURRENT_GL_VERSION = '4.5'
CURRENT_VK_VERSION = '1.1'
TEMPLATE = Template(textwrap.dedent("""\
<%!
import html
%>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa ${next_version} Release Notes / ${today}</h1>
<p>
%if not bugfix:
Mesa ${next_version} is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa ${version[:-1]}1.
%else:
Mesa ${next_version} is a bug fix release which fixes bugs found since the ${version} release.
%endif
</p>
<p>
Mesa ${next_version} implements the OpenGL ${gl_version} API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL ${gl_version}. OpenGL
${gl_version} is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa ${next_version} implements the Vulkan ${vk_version} API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
%for f in features:
<li>${html.escape(f)}</li>
%endfor
</ul>
<h2>Bug fixes</h2>
<ul>
%for b in bugs:
<li>${html.escape(b)}</li>
%endfor
</ul>
<h2>Changes</h2>
<ul>
%for c, author in changes:
%if author:
<p>${html.escape(c)}</p>
%else:
<li>${html.escape(c)}</li>
%endif
%endfor
</ul>
</div>
</body>
</html>
"""))
async def gather_commits(version: str) -> str:
p = await asyncio.create_subprocess_exec(
'git', 'log', f'mesa-{version}..', '--grep', r'Closes: \(https\|#\).*',
stdout=asyncio.subprocess.PIPE)
out, _ = await p.communicate()
assert p.returncode == 0, f"git log didn't work: {version}"
return out.decode().strip()
async def gather_bugs(version: str) -> typing.List[str]:
commits = await gather_commits(version)
issues: typing.List[str] = []
for commit in commits.split('\n'):
sha, message = commit.split(maxsplit=1)
p = await asyncio.create_subprocess_exec(
'git', 'log', '--max-count', '1', r'--format=%b', sha,
stdout=asyncio.subprocess.PIPE)
_out, _ = await p.communicate()
out = _out.decode().split('\n')
for line in reversed(out):
if line.startswith('Closes:'):
bug = line.lstrip('Closes:').strip()
break
else:
raise Exception('No closes found?')
if bug.startswith('h'):
# This means we have a bug in the form "Closes: https://..."
issues.append(os.path.basename(urllib.parse.urlparse(bug).path))
else:
issues.append(bug.lstrip('#'))
loop = asyncio.get_event_loop()
async with aiohttp.ClientSession(loop=loop) as session:
results = await asyncio.gather(*[get_bug(session, i) for i in issues])
typing.cast(typing.Tuple[str, ...], results)
return list(results)
async def get_bug(session: aiohttp.ClientSession, bug_id: str) -> str:
"""Query gitlab to get the name of the issue that was closed."""
# Mesa's gitlab id is 176,
url = 'https://gitlab.freedesktop.org/api/v4/projects/176/issues'
params = {'iids[]': bug_id}
async with session.get(url, params=params) as response:
content = await response.json()
return content[0]['title']
async def get_shortlog(version: str) -> str:
"""Call git shortlog."""
p = await asyncio.create_subprocess_exec('git', 'shortlog', f'mesa-{version}..',
stdout=asyncio.subprocess.PIPE)
out, _ = await p.communicate()
assert p.returncode == 0, 'error getting shortlog'
assert out is not None, 'just for mypy'
return out.decode()
def walk_shortlog(log: str) -> typing.Generator[typing.Tuple[str, bool], None, None]:
for l in log.split('\n'):
if l.startswith(' '): # this means we have a patch description
yield l, False
else:
yield l, True
def calculate_next_version(version: str, is_point: bool) -> str:
"""Calculate the version about to be released."""
if '-' in version:
version = version.split('-')[0]
if is_point:
base = version.split('.')
base[2] = str(int(base[2]) + 1)
return '.'.join(base)
return version
def calculate_previous_version(version: str, is_point: bool) -> str:
"""Calculate the previous version to compare to.
In the case of -rc to final that verison is the previous .0 release,
(19.3.0 in the case of 20.0.0, for example). for point releases that is
the last point release. This value will be the same as the input value
for a point release, but different for a major release.
"""
if '-' in version:
version = version.split('-')[0]
if is_point:
return version
base = version.split('.')
if base[1] == '0':
base[0] = str(int(base[0]) - 1)
base[1] = '3'
else:
base[1] = str(int(base[1]) - 1)
return '.'.join(base)
def get_features(is_point_release: bool) -> typing.Generator[str, None, None]:
p = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes' / 'new_features.txt'
if p.exists():
if is_point_release:
print("WARNING: new features being introduced in a point release", file=sys.stderr)
with p.open('rt') as f:
for line in f:
yield line
else:
yield "None"
async def main() -> None:
v = pathlib.Path(__file__).parent.parent / 'VERSION'
with v.open('rt') as f:
raw_version = f.read().strip()
is_point_release = '-rc' not in raw_version
assert '-devel' not in raw_version, 'Do not run this script on -devel'
version = raw_version.split('-')[0]
previous_version = calculate_previous_version(version, is_point_release)
next_version = calculate_next_version(version, is_point_release)
shortlog, bugs = await asyncio.gather(
get_shortlog(previous_version),
gather_bugs(previous_version),
)
final = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes' / f'{next_version}.html'
with final.open('wt') as f:
try:
f.write(TEMPLATE.render(
bugfix=is_point_release,
bugs=bugs,
changes=walk_shortlog(shortlog),
features=get_features(is_point_release),
gl_version=CURRENT_GL_VERSION,
next_version=next_version,
today=datetime.date.today(),
version=previous_version,
vk_version=CURRENT_VK_VERSION,
))
except:
print(exceptions.text_error_template().render())
if __name__ == "__main__":
loop = asyncio.get_event_loop()
loop.run_until_complete(main())

View File

@@ -0,0 +1,62 @@
# Copyright © 2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
from unittest import mock
import pytest
from .gen_release_notes import *
@pytest.mark.parametrize(
'current, is_point, expected',
[
('19.2.0', True, '19.2.1'),
('19.3.6', True, '19.3.7'),
('20.0.0-rc4', False, '20.0.0'),
])
def test_next_version(current: str, is_point: bool, expected: str) -> None:
assert calculate_next_version(current, is_point) == expected
@pytest.mark.parametrize(
'current, is_point, expected',
[
('19.3.6', True, '19.3.6'),
('20.0.0-rc4', False, '19.3.0'),
])
def test_previous_version(current: str, is_point: bool, expected: str) -> None:
assert calculate_previous_version(current, is_point) == expected
@pytest.mark.asyncio
async def test_get_shortlog():
# Certainly not perfect, but it's something
version = '19.2.0'
out = await get_shortlog(version)
assert out
@pytest.mark.asyncio
async def test_gather_commits():
# Certainly not perfect, but it's something
version = '19.2.0'
out = await gather_commits(version)
assert out

View File

@@ -32,7 +32,7 @@ is_sha_nomination()
{
fixes=`git show --pretty=medium -s $1 | tr -d "\n" | \
sed -e 's/'"$2"'/\nfixes:/Ig' | \
grep -Eo 'fixes:[a-f0-9]{8,40}'`
grep -Eo 'fixes:[a-f0-9]{4,40}'`
fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`
if test $fixes_count -eq 0; then
@@ -92,7 +92,7 @@ is_revert_nomination()
}
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
latest_branchpoint=`git merge-base upstream/master HEAD`
# List all the commits between day 1 and the branch point...
git log --reverse --pretty=%H $latest_branchpoint > already_landed
@@ -103,7 +103,7 @@ git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_bra
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for potential candidates
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable\|^CC:.*mesa-dev\|\<fixes\>\|\<broken by\>\|This reverts commit' $latest_branchpoint..origin/master |\
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable\|^CC:.*mesa-dev\|\<fixes\>\|\<broken by\>\|This reverts commit' $latest_branchpoint..upstream/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
@@ -143,7 +143,7 @@ do
esac
printf "[ %8s ] " "$tag"
git --no-pager show --no-patch --oneline $sha
git --no-pager show --no-patch --pretty=oneline $sha
done
rm -f already_picked

117
bin/post_version.py Executable file
View File

@@ -0,0 +1,117 @@
#!/usr/bin/env python3
# Copyright © 2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Update the main page, release notes, and calendar."""
import argparse
import calendar
import datetime
import pathlib
from lxml import (
etree,
html,
)
def calculate_previous_version(version: str, is_point: bool) -> str:
"""Calculate the previous version to compare to.
In the case of -rc to final that verison is the previous .0 release,
(19.3.0 in the case of 20.0.0, for example). for point releases that is
the last point release. This value will be the same as the input value
for a poiont release, but different for a major release.
"""
if '-' in version:
version = version.split('-')[0]
if is_point:
return version
base = version.split('.')
if base[1] == '0':
base[0] = str(int(base[0]) - 1)
base[1] = '3'
else:
base[1] = str(int(base[1]) - 1)
return '.'.join(base)
def is_point_release(version: str) -> bool:
return not version.endswith('.0')
def update_index(is_point: bool, version: str, previous_version: str) -> None:
p = pathlib.Path(__file__).parent.parent / 'docs' / 'index.html'
with p.open('rt') as f:
tree = html.parse(f)
news = tree.xpath('.//h1')[0]
date = datetime.date.today()
month = calendar.month_name[date.month]
header = etree.Element('h2')
header.text = f"{month} {date.day}, {date.year}"
body = etree.Element('p')
a = etree.SubElement(
body, 'a', attrib={'href': f'relnotes/{previous_version}.html'})
a.text = f"Mesa {previous_version}"
if is_point:
a.tail = " is released. This is a bug fix release."
else:
a.tail = (" is released. This is a new development release. "
"See the release notes for mor information about this release.")
root = news.getparent()
index = root.index(news) + 1
root.insert(index, body)
root.insert(index, header)
tree.write(p.as_posix(), method='html')
def update_release_notes(previous_version: str) -> None:
p = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes.html'
with p.open('rt') as f:
tree = html.parse(f)
li = etree.Element('li')
a = etree.SubElement(li, 'a', href=f'relnotes/{previous_version}.html')
a.text = f'{previous_version} release notes'
ul = tree.xpath('.//ul')[0]
ul.insert(0, li)
tree.write(p.as_posix(), method='html')
def main() -> None:
parser = argparse.ArgumentParser()
parser.add_argument('version', help="The released version.")
args = parser.parse_args()
is_point = is_point_release(args.version)
previous_version = calculate_previous_version(args.version, is_point)
update_index(is_point, args.version, previous_version)
update_release_notes(previous_version)
if __name__ == "__main__":
main()

View File

@@ -1,29 +0,0 @@
#!/bin/sh
# This script is used to generate the list of changes that
# appears in the release notes files, with HTML formatting.
#
# Usage examples:
#
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 > changes
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes
in_log=0
git shortlog $* | while read l
do
if [ $in_log -eq 0 ]; then
echo '<p>'$l'</p>'
echo '<ul>'
in_log=1
elif echo "$l" | egrep -q '^$' ; then
echo '</ul>'
echo
in_log=0
else
mesg=$(echo $l | sed 's/ (cherry picked from commit [0-9a-f]\+)//;s/\&/&amp;/g;s/</\&lt;/g;s/>/\&gt;/g')
echo ' <li>'${mesg}'</li>'
fi
done

View File

@@ -1,8 +1,9 @@
#!/usr/bin/env python
import argparse
import subprocess
import os
import platform
import subprocess
# This list contains symbols that _might_ be exported for some platforms
PLATFORM_SYMBOLS = [
@@ -23,11 +24,22 @@ def get_symbols(nm, lib):
List all the (non platform-specific) symbols exported by the library
'''
symbols = []
output = subprocess.check_output([nm, '--format=bsd', '-D', '--defined-only', lib],
platform_name = platform.system()
output = subprocess.check_output([nm, '-gP', lib],
stderr=open(os.devnull, 'w')).decode("ascii")
for line in output.splitlines():
(_, _, symbol_name) = line.split()
fields = line.split()
if len(fields) == 2 or fields[1] == 'U':
continue
symbol_name = fields[0]
if platform_name == 'Linux':
if symbol_name in PLATFORM_SYMBOLS:
continue
elif platform_name == 'Darwin':
assert symbol_name[0] == '_'
symbol_name = symbol_name[1:]
symbols.append(symbol_name)
return symbols
@@ -47,7 +59,12 @@ def main():
help='path to binary (or name in $PATH)')
args = parser.parse_args()
lib_symbols = get_symbols(args.nm, args.lib)
try:
lib_symbols = get_symbols(args.nm, args.lib)
except:
# We can't run this test, but we haven't technically failed it either
# Return the GNU "skip" error code
exit(77)
mandatory_symbols = []
optional_symbols = []
with open(args.symbols_file) as symbols_file:
@@ -92,8 +109,6 @@ def main():
continue
if symbol in optional_symbols:
continue
if symbol in PLATFORM_SYMBOLS:
continue
unknown_symbols.append(symbol)
missing_symbols = [

View File

@@ -17,6 +17,9 @@ import SCons.Script.SConscript
host_platform = _platform.system().lower()
if host_platform.startswith('cygwin'):
host_platform = 'cygwin'
# MSYS2 default platform selection.
if host_platform.startswith('mingw'):
host_platform = 'windows'
# Search sys.argv[] for a "platform=foo" argument since we don't have
# an 'env' variable at this point.
@@ -49,9 +52,18 @@ if 'PROCESSOR_ARCHITECTURE' in os.environ:
else:
host_machine = _platform.machine()
host_machine = _machine_map.get(host_machine, 'generic')
# MSYS2 default machine selection.
if _platform.system().lower().startswith('mingw') and 'MSYSTEM' in os.environ:
if os.environ['MSYSTEM'] == 'MINGW32':
host_machine = 'x86'
if os.environ['MSYSTEM'] == 'MINGW64':
host_machine = 'x86_64'
default_machine = host_machine
default_toolchain = 'default'
# MSYS2 default toolchain selection.
if _platform.system().lower().startswith('mingw'):
default_toolchain = 'mingw'
if target_platform == 'windows' and host_platform != 'windows':
default_machine = 'x86'

View File

@@ -24,8 +24,8 @@ The old bug database on SourceForge is no longer used.
<p>
To file a Mesa bug, go to
<a href="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa">
Bugzilla on freedesktop.org</a>
<a href="https://gitlab.freedesktop.org/mesa/mesa/issues">
GitLab on freedesktop.org</a>
</p>
<p>

View File

@@ -166,8 +166,8 @@ extern __thread struct _glapi_table *_glapi_tls_Dispatch
</blockquote>
<p>Use of this path is controlled by the preprocessor define
<code>GLX_USE_TLS</code>. Any platform capable of using TLS should use this as
the default dispatch method.</p>
<code>USE_ELF_TLS</code>. Any platform capable of using ELF TLS should use this
as the default dispatch method.</p>
<h3>3.3. Assembly Language Dispatch Stubs</h3>
@@ -204,7 +204,7 @@ terribly relevant.</p>
few preprocessor defines.</p>
<ul>
<li>If <code>GLX_USE_TLS</code> is defined, method #3 is used.</li>
<li>If <code>USE_ELF_TLS</code> is defined, method #3 is used.</li>
<li>If <code>HAVE_PTHREAD</code> is defined, method #2 is used.</li>
<li>If none of the preceding are defined, method #1 is used.</li>
</ul>

View File

@@ -215,7 +215,7 @@ GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi, r600
GL_ARB_clip_control DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_cull_distance DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
GL_ARB_derivative_control DONE (i965, nv50, virgl)
GL_ARB_derivative_control DONE (i965, nv50, softpipe, virgl)
GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, virgl)

View File

@@ -29,7 +29,7 @@ immediately checked into git because not enough people are testing them.
Just applying patches, testing and reporting back is helpful.
<li>
<b>Driver debugging.</b>
There are plenty of open bugs in the <a href="https://bugs.freedesktop.org/describecomponents.cgi?product=Mesa">bug database</a>.
There are plenty of open bugs in the <a href="https://gitlab.freedesktop.org/mesa/mesa/issues">bug database</a>.
<li>
<b>Remove aliasing warnings.</b>
Enable gcc's <code>-Wstrict-aliasing=2 -fstrict-aliasing</code> arguments, and

View File

@@ -16,6 +16,12 @@
<h1>News</h1>
<h2>August 7, 2019</h2>
<p>
<a href="relnotes/19.1.4.html">Mesa 19.1.4</a> is released.
This is a bug-fix release.
</p>
<h2>July 23, 2019</h2>
<p>
<a href="relnotes/19.1.3.html">Mesa 19.1.3</a> is released.

View File

@@ -51,7 +51,7 @@ There are several examples of OSMesa in the mesa/demos repository.
Configure and build Mesa with something like:
<pre>
meson builddir -Dosmesa=gallium -Dgallium-drivers=swrast -Ddri-drivers= -Dvulkan-drivers= -Dprefix=$PWD/builddir/install
meson builddir -Dosmesa=gallium -Dgallium-drivers=swrast -Ddri-drivers=[] -Dvulkan-drivers=[] -Dprefix=$PWD/builddir/install
ninja -C builddir install
</pre>

View File

@@ -60,13 +60,7 @@ if you'd like to nominate a patch in the next stable release.
<th>Notes</th>
</tr>
<tr>
<td rowspan="4">19.1</td>
<td>2019-08-06</td>
<td>19.1.4</td>
<td>Juan A. Suarez</td>
<td>
</tr>
<tr>
<td rowspan="3">19.1</td>
<td>2019-08-20</td>
<td>19.1.5</td>
<td>Juan A. Suarez</td>

View File

@@ -285,7 +285,7 @@ To setup the branchpoint:
<p>
Now go to
<a href="https://bugs.freedesktop.org/editversions.cgi?action=add&amp;product=Mesa" target="_parent">Bugzilla</a> and add the new Mesa version X.Y.
<a href="https://gitlab.freedesktop.org/mesa/mesa/-/milestones" target="_parent">gitlab</a> and add the new Mesa version X.Y.
</p>
<p>

View File

@@ -21,6 +21,7 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/19.1.4.html">19.1.4 release notes</a>
<li><a href="relnotes/19.1.3.html">19.1.3 release notes</a>
<li><a href="relnotes/19.1.2.html">19.1.2 release notes</a>
<li><a href="relnotes/19.0.8.html">19.0.8 release notes</a>

227
docs/relnotes/19.1.4.html Normal file
View File

@@ -0,0 +1,227 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.1.4 Release Notes / August 7, 2019</h1>
<p>
Mesa 19.1.4 is a bug fix release which fixes bugs found since the 19.1.3 release.
</p>
<p>
Mesa 19.1.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
a6d268a7d9edcfd92b6da80f2e34e6e0a7baaa442efbeba2fc66c404943c6bfb mesa-19.1.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109203">Bug 109203</a> - [cfl dxvk] GPU Crash Launching Monopoly Plus (Iris Plus 655 / Wine + DXVK)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109524">Bug 109524</a> - &quot;Invalid glsl version in shading_language_version()&quot; when trying to run directX games using wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110309">Bug 110309</a> - [icl][bisected] regression on piglit arb_gpu_shader_int 64.execution.fs-ishl-then-* tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110663">Bug 110663</a> - threads_posix.h:96: undefined reference to `pthread_once'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110955">Bug 110955</a> - Mesa 18.2.8 implementation error: Invalid GLSL version in shading_language_version()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111010">Bug 111010</a> - Cemu Shader Cache Corruption Displaying Solid Color After commit 11e16ca7ce0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111071">Bug 111071</a> - SPIR-V shader processing fails with message about &quot;extra dangling SSA sources&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111075">Bug 111075</a> - Processing of SPIR-V shader causes device hang, sometimes leading to system reboot</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111097">Bug 111097</a> - Can not detect VK_ERROR_OUT_OF_DATE_KHR or VK_SUBOPTIMAL_KHR when window resizing</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv: fix queries with WAIT_BIT returning VK_NOT_READY</li>
</ul>
<p>Andrii Simiklit (2):</p>
<ul>
<li>intel/compiler: don't use a keyword struct for a class fs_reg</li>
<li>meson: add a warning for meson &lt; 0.46.0</li>
</ul>
<p>Arcady Goldmints-Orlov (1):</p>
<ul>
<li>anv: report HOST_ALLOCATION as supported for images</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Set correct metadata size for GFX9+.</li>
<li>radv: Take variable descriptor counts into account for buffer entries.</li>
<li>radv: Fix descriptor set allocation failure.</li>
</ul>
<p>Boyuan Zhang (4):</p>
<ul>
<li>radeon/uvd: fix poc for hevc encode</li>
<li>radeon/vcn: fix poc for hevc encode</li>
<li>radeon/uvd: enable rate control for hevc encoding</li>
<li>radeon/vcn: enable rate control for hevc encoding</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>anv: Remove special allocation for anv_push_constants</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>nir: Allow qualifiers on copy_deref and image instructions</li>
</ul>
<p>Daniel Schürmann (1):</p>
<ul>
<li>spirv: Fix order of barriers in SpvOpControlBarrier</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>st/nir: fix arb fragment stage conversion</li>
</ul>
<p>Dylan Baker (1):</p>
<ul>
<li>meson: allow building all glx without any drivers</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>egl/drm: ensure the backing gbm is set before using it</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>freedreno: Fix data races with allocating/freeing struct ir3.</li>
</ul>
<p>Eric Engestrom (5):</p>
<ul>
<li>nir: don't return void</li>
<li>util: fix no-op macro (bad number of arguments)</li>
<li>gallium+mesa: fix tgsi_semantic array type</li>
<li>scons+meson: suppress spammy build warning on MacOS</li>
<li>nir: remove explicit nir_intrinsic_index_flag values</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>intel/ir: Fix CFG corruption in opt_predicated_break().</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>gallium/vl: fix compute tgsi shaders to not process undefined components</li>
<li>nv50,nvc0: update sampler/view bind functions to accept NULL array</li>
<li>nvc0: allow a non-user buffer to be bound at position 0</li>
<li>nv50/ir: handle insn not being there for definition of CVT arg</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>intel/fs: Stop stack allocating large arrays</li>
<li>anv: Disable transform feedback on gen7</li>
<li>isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSW</li>
<li>anv: Don't claim support for 24 and 48-bit formats on IVB</li>
<li>intel/fs: Use ALIGN16 instructions for all derivatives on gen &lt;= 7</li>
<li>intel/fs: Implement quad_swap_horizontal with a swizzle on gen7</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>docs: add sha256 checksums for 19.1.3</li>
<li>Update version to 19.1.4</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>mesa: Fix ReadBuffers with pbuffers</li>
<li>egl: Quiet warning about front buffer rendering for pixmaps/pbuffers</li>
<li>egl: Make the 565 pbuffer-only config single buffered.</li>
<li>egl: Only expose 565 pbuffer configs if X can export them as DRI3 images</li>
</ul>
<p>Lionel Landwerlin (5):</p>
<ul>
<li>anv: fix use of comma operator</li>
<li>nir: add access to image_deref intrinsics</li>
<li>spirv: wrap push ssa/pointer values</li>
<li>spirv: propagate access qualifiers through ssa &amp; pointer</li>
<li>spirv: don't discard access set by vtn_pointer_dereference</li>
</ul>
<p>Mark Menzynski (1):</p>
<ul>
<li>nvc0/ir: Fix assert accessing null pointer</li>
</ul>
<p>Nataraj Deshpande (1):</p>
<ul>
<li>egl/android: Update color_buffers querying for buffer age</li>
</ul>
<p>Nicolas Dufresne (1):</p>
<ul>
<li>egl: Also query modifiers when exporting DMABuf</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>ac/nir: fix txf_ms with an offset</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix crash in vkCmdClearAttachments with unused attachment</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: add glsl_type ref to one_time_init and decref to atexit</li>
</ul>
<p>Yevhenii Kolesnikov (1):</p>
<ul>
<li>main: Fix memleaks in mesa_use_program</li>
</ul>
</div>
</body>
</html>

View File

@@ -29,6 +29,11 @@ Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.0 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksums</h2>
<pre>
@@ -40,21 +45,405 @@ TBD.
<ul>
<li>GL_ARB_post_depth_coverage on radeonsi (Navi)</li>
<li>GL_ARB_seamless_cubemap_per_texture on etnaviv (if GPU supports SEAMLESS_CUBE_MAP)</li>
<li>GL_EXT_shader_image_load_store on radeonsi (with LLVM >= 10)</li>
<li>GL_EXT_shader_samples_identical on iris and radeonsi (if using NIR)</li>
<li>GL_EXT_texture_shadow_lod on i965, iris</li>
<li>EGL_EXT_platform_device</li>
<li>VK_EXT_queue_family_foreign for radv</li>
<li>VK_AMD_buffer_marker on radv</li>
<li>VK_EXT_index_type_uint8 on radv</li>
<li>VK_EXT_post_depth_coverage on radv</li>
<li>VK_EXT_queue_family_foreign on radv</li>
<li>VK_EXT_sample_locations on radv</li>
<li>VK_EXT_shader_demote_to_helper_invocation on Intel.</li>
<li>VK_KHR_depth_stencil_resolve on radv</li>
<li>VK_KHR_imageless_framebuffer on radv</li>
<li>VK_KHR_shader_atomic_int64 on radv</li>
<li>VK_KHR_uniform_buffer_standard_layout on radv</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>TBD</li>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103674">Bug 103674</a> - u_queue.c:173:7: error: implicit declaration of function 'timespec_get' is invalid in C99</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104395">Bug 104395</a> - [CTS] GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels tests fail on 32bit Mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110765">Bug 110765</a> - ANV regression: Assertion `pass-&gt;attachment_count == framebuffer-&gt;attachment_count' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110814">Bug 110814</a> - KWin compositor crashes on launch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111069">Bug 111069</a> - Assertion fails in nir_opt_remove_phis.c during compilation of SPIR-V shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111213">Bug 111213</a> - VA-API nouveau SIGSEGV and asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111241">Bug 111241</a> - Shadertoy shader causing hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111248">Bug 111248</a> - Navi10 Font rendering issue in Overwatch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111271">Bug 111271</a> - Crash in eglMakeCurrent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111308">Bug 111308</a> - [Regression, NIR, bisected] Black squares in Unigine Heaven via DXVK</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111401">Bug 111401</a> - Vulkan overlay layer - async compute not supported, making overlay disappear in Doom</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111405">Bug 111405</a> - Some infinite 'do{}while' loops lead mesa to an infinite compilation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111411">Bug 111411</a> - SPIR-V shader leads to GPU hang, sometimes making machine unstable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111414">Bug 111414</a> - [REGRESSION] [BISECTED] Segmentation fault in si_bind_blend_state after removal of the blend state NULL check</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111467">Bug 111467</a> - WOLF RPG Editor + Gallium Nine Standalone: Rendering issue when using Iris driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111490">Bug 111490</a> - [REGRESSION] [BISECTED] Shadow Tactics: Blades of the Shogun - problems rendering water</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111493">Bug 111493</a> - In the game The Surge (378540) - textures disappear then appear again when I change the camera angle view</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111509">Bug 111509</a> - [regression][bisected] piglit.spec.ext_image_dma_buf_import.ext_image_dma_buf_import-export fails on iris</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111522">Bug 111522</a> - [bisected] Supraland no longer start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111529">Bug 111529</a> - EGL_PLATFORM=drm doesn't expose MESA_query_driver extension</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111552">Bug 111552</a> - Geekbench 5.0 Vulkan compute benchmark fails on Anvil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111566">Bug 111566</a> - [REGRESSION] [BISECTED] Large CS workgroup sizes broken in combination with FP64 on Intel.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111576">Bug 111576</a> - [bisected] Performance regression in X4:Foundations in 19.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111676">Bug 111676</a> - Tropico 6 apitrace throws error into logs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=111734">Bug 111734</a> - Geometry shader with double interpolators fails in LLVM</li>
</ul>
</ul>
<h2>Changes</h2>
<ul>
<li>TBD</li>
<p>Adam Jackson (1):</p>
<ul>
<li>docs: Update bug report URLs for the gitlab migration</li>
</ul>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Change memory type order for GPUs without dedicated VRAM</li>
</ul>
<p>Alyssa Rosenzweig (1):</p>
<ul>
<li>pan/midgard: Fix writeout combining</li>
</ul>
<p>Andres Gomez (1):</p>
<ul>
<li>docs: Add the maximum implemented Vulkan API version in 19.2 rel notes</li>
</ul>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv: additional query fixes</li>
</ul>
<p>Arcady Goldmints-Orlov (1):</p>
<ul>
<li>anv: fix descriptor limits on gen8</li>
</ul>
<p>Bas Nieuwenhuizen (6):</p>
<ul>
<li>radv: Use correct vgpr_comp_cnt for VS if both prim_id and instance_id are needed.</li>
<li>radv: Emit VGT_GS_ONCHIP_CNTL for tess on GFX10.</li>
<li>radv: Disable NGG for geometry shaders.</li>
<li>Revert "ac/nir: Lower large indirect variables to scratch"</li>
<li>tu: Set up glsl types.</li>
<li>radv: Add workaround for hang in The Surge 2.</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (2):</p>
<ul>
<li>nir/lower_explicit_io: Handle 1 bit loads and stores</li>
<li>glsl/nir: Avoid overflow when setting max_uniform_location</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>radv: Call nir_propagate_invariant()</li>
</ul>
<p>Danylo Piliaiev (3):</p>
<ul>
<li>nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll</li>
<li>nir/loop_analyze: Treat do{}while(false) loops as 0 iterations</li>
<li>tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>virgl: fix format conversion for recent gallium changes.</li>
<li>gallivm: fix atomic compare-and-swap</li>
</ul>
<p>Dave Stevenson (1):</p>
<ul>
<li>broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride.</li>
</ul>
<p>Dylan Baker (9):</p>
<ul>
<li>bump version to 19.2-rc2</li>
<li>nir: Add is_not_negative helper function</li>
<li>Bump version for rc3</li>
<li>meson: don't generate file into subdirs</li>
<li>add patches to be ignored</li>
<li>Bump version for 19.2.0-rc4</li>
<li>cherry-ignore: Add patches</li>
<li>rehardcode from origin/master to upstream/master</li>
<li>bin/get-pick-list: use --oneline=pretty instead of --oneline</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>Update version to 19.2.0-rc1</li>
</ul>
<p>Eric Engestrom (14):</p>
<ul>
<li>ttn: fix 64-bit shift on 32-bit `1`</li>
<li>egl: fix deadlock in malloc error path</li>
<li>util/os_file: fix double-close()</li>
<li>anv: fix format string in error message</li>
<li>freedreno/drm-shim: fix mem leak</li>
<li>nir: fix memleak in error path</li>
<li>anv: add support for driconf</li>
<li>wsi: add minImageCount override</li>
<li>anv: add support for vk_x11_override_min_image_count</li>
<li>amd: move adaptive sync to performance section, as it is defined in xmlpool</li>
<li>radv: add support for vk_x11_override_min_image_count</li>
<li>drirc: override minImageCount=2 for gfxbench</li>
<li>gl: drop incorrect pkg-config file for glvnd</li>
<li>meson: re-add incorrect pkg-config files with GLVND for backward compatibility</li>
</ul>
<p>Erik Faye-Lund (2):</p>
<ul>
<li>gallium/auxiliary/indices: consistently apply start only to input</li>
<li>util: fix SSE-version needed for double opcodes</li>
</ul>
<p>Haihao Xiang (1):</p>
<ul>
<li>i965: support AYUV/XYUV for external import only</li>
</ul>
<p>Hal Gentz (2):</p>
<ul>
<li>glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX.</li>
<li>gallium/osmesa: Fix the inability to set no context as current.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>v3d: make sure we have enough space in the CL for the primitive counts packet</li>
</ul>
<p>Ian Romanick (8):</p>
<ul>
<li>nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled</li>
<li>intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware</li>
<li>nir/algebraic: Mark some value range analysis-based optimizations imprecise</li>
<li>nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero</li>
<li>nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero</li>
<li>nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)</li>
<li>nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel</li>
<li>nir/algebraic: Do not apply late DPH optimization in vertex processing stages</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>gallium/vl: use compute preference for all multimedia, not just blit</li>
</ul>
<p>Jason Ekstrand (9):</p>
<ul>
<li>anv: Bump maxComputeWorkgroupSize</li>
<li>nir: Handle complex derefs in nir_split_array_vars</li>
<li>nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block</li>
<li>nir: Add a block_is_unreachable helper</li>
<li>nir/repair_ssa: Repair dominance for unreachable blocks</li>
<li>nir/repair_ssa: Insert deref casts when needed</li>
<li>nir/dead_cf: Repair SSA if the pass makes progress</li>
<li>intel/fs: Handle UNDEF in split_virtual_grfs</li>
<li>nir/repair_ssa: Replace the unreachable check with the phi builder</li>
</ul>
<p>Jonathan Marek (1):</p>
<ul>
<li>freedreno/a2xx: ir2: fix lowering of instructions after float lowering</li>
</ul>
<p>Jose Maria Casanova Crespo (1):</p>
<ul>
<li>mesa: recover target_check before get_current_tex_objects</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars</li>
</ul>
<p>Kenneth Graunke (20):</p>
<ul>
<li>gallium/ddebug: Wrap resource_get_param if available</li>
<li>gallium/trace: Wrap resource_get_param if available</li>
<li>gallium/rbug: Wrap resource_get_param if available</li>
<li>gallium/noop: Implement resource_get_param</li>
<li>iris: Replace devinfo-&gt;gen with GEN_GEN</li>
<li>iris: Fix broken aux.possible/sampler_usages bitmask handling</li>
<li>iris: Update fast clear colors on Gen9 with direct immediate writes.</li>
<li>iris: Drop copy format hacks from copy region based transfer path.</li>
<li>iris: Avoid unnecessary resolves on transfer maps</li>
<li>iris: Fix large timeout handling in rel2abs()</li>
<li>isl: Drop UnormPathInColorPipe for buffer surfaces.</li>
<li>isl: Don't set UnormPathInColorPipe for integer surfaces.</li>
<li>util: Add a _mesa_i64roundevenf() helper.</li>
<li>mesa: Fix _mesa_float_to_unorm() on 32-bit systems.</li>
<li>iris: Fix partial fast clear checks to account for miplevel.</li>
<li>iris: Report correct number of planes for planar images</li>
<li>iris: Fix constant buffer sizes for non-UBOs</li>
<li>gallium: Fix util_format_get_depth_only</li>
<li>iris: Initialize ice-&gt;state.prim_mode to an invalid value</li>
<li>intel: Increase Gen11 compute shader scratch IDs to 64.</li>
</ul>
<p>Lepton Wu (1):</p>
<ul>
<li>virgl: Fix pipe_resource leaks under multi-sample.</li>
</ul>
<p>Lionel Landwerlin (9):</p>
<ul>
<li>util/timespec: use unsigned 64 bit integers for nsec values</li>
<li>util: fix compilation on macos</li>
<li>egl: fix platform selection</li>
<li>vulkan/overlay: bounce image back to present layout</li>
<li>radv: store engine name</li>
<li>driconfig: add a new engine name/version parameter</li>
<li>vulkan: add vk_x11_strict_image_count option</li>
<li>util/xmlconfig: fix regexp compile failure check</li>
<li>drirc: include unreal engine version 0 to 23</li>
</ul>
<p>Marek Olšák (23):</p>
<ul>
<li>radeonsi/gfx10: fix the legacy pipeline by storing as_ngg in the shader cache</li>
<li>radeonsi: move some global shader cache flags to per-binary flags</li>
<li>radeonsi/gfx10: fix tessellation for the legacy pipeline</li>
<li>radeonsi/gfx10: fix the PRIMITIVES_GENERATED query if using legacy streamout</li>
<li>radeonsi/gfx10: create the GS copy shader if using legacy streamout</li>
<li>radeonsi/gfx10: add as_ngg variant for VS as ES to select Wave32/64</li>
<li>radeonsi/gfx10: fix InstanceID for legacy VS+GS</li>
<li>radeonsi/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0</li>
<li>radeonsi/gfx10: always use the legacy pipeline for streamout</li>
<li>radeonsi/gfx10: finish up Navi14, add PCI ID</li>
<li>radeonsi/gfx10: add AMD_DEBUG=nongg</li>
<li>winsys/amdgpu+radeon: process AMD_DEBUG in addition to R600_DEBUG</li>
<li>radeonsi: add PKT3_CONTEXT_REG_RMW</li>
<li>radeonsi/gfx10: remove incorrect ngg/pos_writes_edgeflag variables</li>
<li>radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags</li>
<li>radeonsi: consolidate determining VGPR_COMP_CNT for API VS</li>
<li>radeonsi: unbind blend/DSA/rasterizer state correctly in delete functions</li>
<li>radeonsi: fix scratch buffer WAVESIZE setting leading to corruption</li>
<li>radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts</li>
<li>radeonsi/gfx10: fix wave occupancy computations</li>
<li>radeonsi: add Navi12 PCI ID</li>
<li>amd: add more PCI IDs for Navi14</li>
<li>ac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir</li>
</ul>
<p>Mauro Rossi (2):</p>
<ul>
<li>android: mesa: revert "Enable asm unconditionally"</li>
<li>android: anv: libmesa_vulkan_common: add libmesa_util static dependency</li>
</ul>
<p>Paulo Zanoni (2):</p>
<ul>
<li>intel/fs: grab fail_msg from v32 instead of v16 when v32-&gt;run_cs fails</li>
<li>intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32</li>
</ul>
<p>Pierre-Eric Pelloux-Prayer (1):</p>
<ul>
<li>glsl: replace 'x + (-x)' with constant 0</li>
</ul>
<p>Rafael Antognolli (1):</p>
<ul>
<li>anv: Only re-emit non-dynamic state that has changed.</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>radv: always emit a position export in gs copy shaders</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>intel/nir: do not apply the fsin and fcos trig workarounds for consts</li>
</ul>
<p>Samuel Pitoiset (11):</p>
<ul>
<li>radv: allow to enable VK_AMD_shader_ballot only on GFX8+</li>
<li>radv: add a new debug option called RADV_DEBUG=noshaderballot</li>
<li>radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood</li>
<li>ac: fix exclusive scans on GFX8-GFX9</li>
<li>radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0</li>
<li>radv/gfx10: do not use NGG with NAVI14</li>
<li>radv: fix getting the index type size for uint8_t</li>
<li>nir: do not assume that the result of fexp2(a) is always an integral</li>
<li>radv: fix allocating number of user sgprs if streamout is used</li>
<li>radv: fix loading 64-bit GS inputs</li>
<li>radv/gfx10: fix VK_KHR_pipeline_executable_properties with NGG GS</li>
</ul>
<p>Sergii Romantsov (2):</p>
<ul>
<li>intel/dri: finish proper glthread</li>
<li>nir/large_constants: more careful data copying</li>
</ul>
<p>Tapani Pälli (5):</p>
<ul>
<li>util: fix os_create_anonymous_file on android</li>
<li>iris/android: fix build and link with libmesa_intel_perf</li>
<li>egl: reset blob cache set/get functions on terminate</li>
<li>iris: close screen fd on iris_destroy_screen</li>
<li>egl: check for NULL value like eglGetSyncAttribKHR does</li>
</ul>
<p>Thong Thai (1):</p>
<ul>
<li>Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu"</li>
</ul>
<p>Timur Kristóf (1):</p>
<ul>
<li>st/nine: Properly initialize GLSL types for NIR shaders.</li>
</ul>
<p>Vinson Lee (2):</p>
<ul>
<li>swr: Fix build with llvm-9.0 again.</li>
<li>travis: Fail build if any command in if statement fails.</li>
</ul>
</ul>
</div>

159
docs/relnotes/19.2.1.html Normal file
View File

@@ -0,0 +1,159 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.1 Release Notes / 2019-10-09</h1>
<p>
Mesa 19.2.1 is a bug fix release which fixes bugs found since the 19.2.0 release.
</p>
<p>
Mesa 19.2.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.1 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
4cc53ca1a8d12c6ff0e5ea44a5213c05c88447ab50d7e28bb350cd29199f01e9 mesa-19.2.1.tar.xz
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>meson.build:1447:6: ERROR: Problem encountered: libdrm required for gallium video statetrackers when using x11</li>
<li>Mesa doesn't build with current Scons version (3.1.0)</li>
<li>libXvMC-1.0.12 breaks mesa build</li>
<li>Meson can't find 32-bit libXvMCW in non-standard path</li>
<li>Mesa installs gl.pc and egl.pc even with libglvnd >= 1.2.0</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Andreas Gottschling (1):</p>
<li> drisw: Fix shared memory leak on drawable resize</li>
<p></p>
<p>Andres Gomez (1):</p>
<li> egl: Remove the 565 pbuffer-only EGL config under X11.</li>
<p></p>
<p>Andrii Simiklit (1):</p>
<li> glsl: disallow incompatible matrices multiplication</li>
<p></p>
<p>Bas Nieuwenhuizen (1):</p>
<li> radv: Fix condition for skipping the continue CS.</li>
<p></p>
<p>Connor Abbott (1):</p>
<li> nir/opt_large_constants: Handle store writemasks</li>
<p></p>
<p>Danylo Piliaiev (1):</p>
<li> st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader</li>
<p></p>
<p>Dylan Baker (9):</p>
<li> meson: fix logic for generating .pc files with old glvnd</li>
<li> meson: Try finding libxvmcw via pkg-config before using find_library</li>
<li> meson: Link xvmc with libxv</li>
<li> meson: gallium media state trackers require libdrm with x11</li>
<li> .cherry-ignore: Update for 19.2.1 cycle</li>
<li> meson: Only error building gallium video without libdrm when the platform is drm</li>
<li> scripts: Add a gen_release_notes.py script</li>
<li> release: Add an update_release_calendar.py script</li>
<li> bin: delete unused releasing scripts</li>
<p></p>
<p>Eric Engestrom (3):</p>
<li> radv: fix s/load/store/ copy-paste typo</li>
<li> meson: drop -Wno-foo bug workaround for Meson < 0.46</li>
<li> meson: add missing idep_nir_headers in iris_gen_libs</li>
<p></p>
<p>Erik Faye-Lund (1):</p>
<li> glsl: correct bitcast-helpers</li>
<p></p>
<p>Ian Romanick (1):</p>
<li> nir/range-analysis: Bail if the types don't match</li>
<p></p>
<p>Jason Ekstrand (1):</p>
<li> intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates</li>
<p></p>
<p>Ken Mays (1):</p>
<li> haiku: fix Mesa build</li>
<p></p>
<p>Kenneth Graunke (2):</p>
<li> iris: Disable CCS_E for 32-bit floating point textures.</li>
<li> iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.</li>
<p></p>
<p>Lionel Landwerlin (6):</p>
<li> anv: gem-stubs: return a valid fd got anv_gem_userptr()</li>
<li> intel: use proper label for Comet Lake skus</li>
<li> mesa: don't forget to clear _Layer field on texture unit</li>
<li> intel: fix topology query</li>
<li> intel: fix subslice computation from topology data</li>
<li> intel/isl: Set null surface format to R32_UINT</li>
<p></p>
<p>Marek Olšák (7):</p>
<li> gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH</li>
<li> gallium: extend resource_get_param to be as capable as resource_get_handle</li>
<li> radeonsi/gfx10: fix L2 cache rinse programming</li>
<li> ac: fix incorrect vram_size reported by the kernel</li>
<li> ac: fix num_good_cu_per_sh for harvested chips</li>
<li> ac: add radeon_info::tcc_harvested</li>
<li> radeonsi/gfx10: fix corruption for chips with harvested TCCs</li>
<p></p>
<p>Mauro Rossi (1):</p>
<li> android: compiler/nir: build nir_divergence_analysis.c</li>
<p></p>
<p>Michel Dänzer (1):</p>
<li> radeonsi: fix VAAPI segfault due to various bugs</li>
<p></p>
<p>Michel Zou (1):</p>
<li> scons: add py3 support</li>
<p></p>
<p>Prodea Alexandru-Liviu (1):</p>
<li> scons/MSYS2-MinGW-W64: Fix build options defaults</li>
<p></p>
<p>Rhys Perry (1):</p>
<li> nir/opt_remove_phis: handle phis with no sources</li>
<p></p>
<p>Stephen Barber (1):</p>
<li> nouveau: add idep_nir_headers as dep for libnouveau</li>
<p></p>
<p>Tapani Pälli (2):</p>
<li> iris: disable aux on first get_param if not created with aux</li>
<li> anv/android: fix images created with external format support</li>
<p></p>
<p>pal1000 (2):</p>
<li> scons: Fix MSYS2 Mingw-w64 build.</li>
<li> scons/windows: Support build with LLVM 9.</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

147
docs/relnotes/19.2.2.html Normal file
View File

@@ -0,0 +1,147 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.2 Release Notes / 2019-10-23</h1>
<p>
Mesa 19.2.2 is a bug fix release which fixes bugs found since the 19.2.1 release.
</p>
<p>
Mesa 19.2.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.2 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>Vulkan version of &quot;Middle-earth: Shadow of Mordor&quot; has graphics glitches on RADV driver (part 2)</li>
<li>Vulkan version of &quot;Middle-earth: Shadow of Mordor&quot; has graphics glitches on RADV driver</li>
<li>[amdgpu][Navi][llvm] Minimap problem in Nier Automata</li>
<li>Black ground in Dirt 4</li>
<li>Superbibles examples crashing Mesa drivers (radeonsi) and causing gpu reset</li>
<li>[CTS] dEQP-VK.graphicsfuzz.write-red-in-loop-nest crashes</li>
<li>mesa and libglvnd install the same headers</li>
<li>Regression: Doom (2016) crashes on Mesa 19.2 and above and Radeon 380 with Vulkan (worked on Mesa 19.1)</li>
<li>Rocket League displays corruption when the game starts</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Alan Coopersmith (6):</p>
<li> c99_compat.h: Don&#x27;t try to use &#x27;restrict&#x27; in C++ code</li>
<li> util: Make Solaris implemention of p_atomic_add work with gcc</li>
<li> util: Workaround lack of flock on Solaris</li>
<li> util: Solaris has linux-style pthread_setname_np</li>
<li> meson: recognize &quot;sunos&quot; as the system name for Solaris</li>
<li> intel/common: include unistd.h for ioctl() prototype on Solaris</li>
<p></p>
<p>Alejandro Piñeiro (1):</p>
<li> v3d: take into account prim_counts_offset</li>
<p></p>
<p>Bas Nieuwenhuizen (3):</p>
<li> radv: Disallow sparse shared images.</li>
<li> nir/dead_cf: Remove dead control flow after infinite loops.</li>
<li> radv: Fix single stage constant flush with merged shaders.</li>
<p></p>
<p>Clément Guérin (1):</p>
<li> radeonsi: enable zerovram for Rocket League</li>
<p></p>
<p>Connor Abbott (2):</p>
<li> nir/sink: Rewrite loop handling logic</li>
<li> nir/sink: Don&#x27;t sink load_ubo to outside of its defining loop</li>
<p></p>
<p>Dylan Baker (1):</p>
<li> docs: Add SHA256 sum for 19.2.1</li>
<p></p>
<p>Eric Engestrom (7):</p>
<li> GL: drop symbols mangling support</li>
<li> meson: rename `glvnd_missing_pc_files` to `not glvnd_has_headers_and_pc_files`</li>
<li> meson: move a couple of include installs around</li>
<li> meson: split headers one per line</li>
<li> meson: split Mesa headers as a separate installation</li>
<li> meson: skip installation of GLVND-provided headers</li>
<li> util/u_atomic: fix return type of p_atomic_{inc,dec}_return() and p_atomic_{cmp,}xchg()</li>
<p></p>
<p>Ian Romanick (2):</p>
<li> nir/search: Fix possible NULL dereference in is_fsign</li>
<li> intel/vec4: Don&#x27;t try both sources as immediates for DPH</li>
<p></p>
<p>James Xiong (1):</p>
<li> iris: finish aux import on get_param</li>
<p></p>
<p>Kenneth Graunke (2):</p>
<li> iris: Properly unreference extra VBOs for draw parameters</li>
<li> iris: Implement the Gen &lt; 9 tessellation quads workaround</li>
<p></p>
<p>Lepton Wu (1):</p>
<li> egl/android: Remove our own reference to buffers.</li>
<p></p>
<p>Lionel Landwerlin (3):</p>
<li> etnaviv: remove variable from global namespace</li>
<li> anv: fix vkUpdateDescriptorSets with inline uniform blocks</li>
<li> anv: fix memory leak on device destroy</li>
<p></p>
<p>Lucas Stach (3):</p>
<li> etnaviv: fix vertex buffer state emission for single stream GPUs</li>
<li> rbug: fix transmitted texture sizes</li>
<li> rbug: unwrap index buffer resource</li>
<p></p>
<p>Pierre-Eric Pelloux-Prayer (1):</p>
<li> mesa: fix invalid target error handling for teximage</li>
<p></p>
<p>Roland Scheidegger (1):</p>
<li> gallivm: Fix saturated signed psub/padd intrinsics on llvm 8</li>
<p></p>
<p>Samuel Pitoiset (6):</p>
<li> drirc: enable vk_x11_override_min_image_count for DOOM</li>
<li> radv: bump minTexelBufferOffsetAlignment to 4</li>
<li> radv: fix DCC fast clear code for intensity formats</li>
<li> Revert &quot;radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+&quot;</li>
<li> radv: fix DCC fast clear code for intensity formats (correctly)</li>
<li> radv: fix updating bound fast ds clear values with different aspects</li>
<p></p>
<p>Timothy Arceri (1):</p>
<li> glsl: fix crash compiling bindless samplers inside unnamed UBOs</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

146
docs/relnotes/19.2.3.html Normal file
View File

@@ -0,0 +1,146 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.3 Release Notes / 2019-11-06</h1>
<p>
Mesa 19.2.3 is a bug fix release which fixes bugs found since the 19.2.2 release.
</p>
<p>
Mesa 19.2.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.3 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>19.2.2 fails mesa:util / timespec test on x86</li>
<li>Objects leaving trails in Firefox with antialias and preserveDrawingBuffer in three.js WebGLRednerer with mesa 19.2</li>
<li>glLinkProgram crash when using gcc-9 -O3 -flto due to use of uninitialised value</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Bas Nieuwenhuizen (4):</p>
<li> radv: Fix timeout handling in syncobj wait.</li>
<li> radv: Remove _mesa_locale_init/fini calls.</li>
<li> turnip: Remove _mesa_locale_init/fini calls.</li>
<li> anv: Remove _mesa_locale_init/fini calls.</li>
<p></p>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<li> anv: Fix output of INTEL_DEBUG=bat for chained batches</li>
<p></p>
<p>Danylo Piliaiev (1):</p>
<li> glsl: Initialize all fields of ir_variable in constructor</li>
<p></p>
<p>Dylan Baker (11):</p>
<li> bin/gen_release_notes.py: fix conditional of bugfix</li>
<li> bin/gen_release_notes.py: strip &#x27;#&#x27; from gitlab bugs</li>
<li> bin/gen_release_notes.py: Return &quot;None&quot; if there are no new features</li>
<li> bin/post_version.py: Pass version as an argument</li>
<li> bin/post_version.py: white space fixes</li>
<li> bin/post_release.py: Add .html to hrefs</li>
<li> bin/gen_release_notes.py: html escape all external data</li>
<li> bin/gen_release_notes.py: Add a warning if new features are introduced in a point release</li>
<li> cherry-ignore: update for 19.2.3 cycle</li>
<li> nir: correct use of identity check in python</li>
<li> meson: Add dep_glvnd to egl deps when building with glvnd</li>
<p></p>
<p>Ilia Mirkin (1):</p>
<li> nv50/ir: mark STORE destination inputs as used</li>
<p></p>
<p>Illia Iorin (1):</p>
<li> Revert &quot;mesa/main: Fix multisample texture initialize&quot;</li>
<p></p>
<p>Jason Ekstrand (2):</p>
<li> anv: Fix a potential BO handle leak</li>
<li> anv/tests: Zero-initialize instances</li>
<p></p>
<p>Jon Turney (2):</p>
<li> rbug: Fix use of alloca() without #include &quot;c99_alloca.h&quot;</li>
<li> Fix timespec_from_nsec test for 32-bit time_t</li>
<p></p>
<p>Jonathan Marek (1):</p>
<li> etnaviv: fix depth bias</li>
<p></p>
<p>Kenneth Graunke (1):</p>
<li> iris: Fix &quot;Force Zero RTA Index Enable&quot; setting again</li>
<p></p>
<p>Lionel Landwerlin (2):</p>
<li> anv: fix unwind of vkCreateDevice fail</li>
<li> mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv</li>
<p></p>
<p>Marek Olšák (1):</p>
<li> util/u_queue: skip util_queue_finish if num_threads is 0</li>
<p></p>
<p>Nanley Chery (5):</p>
<li> anv: Properly allocate aux-tracking space for CCS_E</li>
<li> intel/blorp: Disable depth testing for slow depth clears</li>
<li> iris: Clear ::has_hiz when disabling aux</li>
<li> iris: Don&#x27;t leak the resource for unsupported modifier</li>
<li> iris: Disallow incomplete resource creation</li>
<p></p>
<p>Paulo Zanoni (1):</p>
<li> intel/compiler: remove the operand restriction for src1 on GLK</li>
<p></p>
<p>Pierre-Eric Pelloux-Prayer (1):</p>
<li> mesa: enable msaa in clear_with_quad if needed</li>
<p></p>
<p>Sagar Ghuge (1):</p>
<li> intel/blorp: Assign correct view while clearing depth stencil</li>
<p></p>
<p>Samuel Pitoiset (4):</p>
<li> radv: do not create meta pipelines with 16 samples</li>
<li> radv: do not emit rbplus if attachments are undefined</li>
<li> radv/gfx10: fix 3D images</li>
<li> radv: fix vkUpdateDescriptorSets with inline uniform blocks</li>
<p></p>
<p>Tapani Pälli (1):</p>
<li> i965: setup sized internalformat for MESA_FORMAT_R10G10B10A2_UNORM</li>
<p></p>
<p>Thomas Hellstrom (2):</p>
<li> svga: Fix banded DMA upload unmap</li>
<li> winsys/svga: Limit the maximum DMA hardware buffer size</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

65
docs/relnotes/19.2.4.html Normal file
View File

@@ -0,0 +1,65 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.4 Release Notes / 2019-11-13</h1>
<p>
Mesa 19.2.4 is an emergency bug fix release to fix on ciritcal bug in 19.2.3.
</p>
<p>
Mesa 19.2.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.4 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
09000a0f7dbbd82e193b81a8f1bf0c118eab7ca975c0329181968596e548e30f mesa-19.2.4.tar.xz
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>Dirt Rally: Menu system doesn&#x27;t show up with Mesa 19.2.3</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Lionel Landwerlin (1):</p>
<li> mesa: check framebuffer completeness only after state update</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

115
docs/relnotes/19.2.5.html Normal file
View File

@@ -0,0 +1,115 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.5 Release Notes / 2019-11-20</h1>
<p>
Mesa 19.2.5 is a bug fix release which fixes bugs found since the 19.2.4 release.
</p>
<p>
Mesa 19.2.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.5 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
3d010a366b28d10bdd71e32091d8684baf1522e6466c5c5703667091b2108c8b mesa-19.2.5.tar.xz
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>HSW. Tropico 6 and SuperTuxKart have shadows flickering</li>
<li>glxgears segfaults on POWER / Xvnc</li>
<li>Cannot start Civ6 with AMD GPU on Linux</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Ben Crocker (1):</p>
<li> llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders</li>
<p></p>
<p>Brian Paul (1):</p>
<li> Call shmget() with permission 0600 instead of 0777</li>
<p></p>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<li> spirv: Don&#x27;t leak GS initialization to other stages</li>
<p></p>
<p>Danylo Piliaiev (1):</p>
<li> i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround</li>
<p></p>
<p>Dylan Baker (2):</p>
<li> docs: Add SHA256 sum for for 19.2.4</li>
<li> cherry-ignore: Update for 19.2.4 cycle</li>
<p></p>
<p>Eric Engestrom (1):</p>
<li> egl: fix _EGL_NATIVE_PLATFORM fallback</li>
<p></p>
<p>Ian Romanick (2):</p>
<li> nir/algebraic: Add the ability to mark a replacement as exact</li>
<li> nir/algebraic: Mark other comparison exact when removing a == a</li>
<p></p>
<p>Illia Iorin (1):</p>
<li> mesa/main: Ignore filter state for MS texture completeness</li>
<p></p>
<p>Jason Ekstrand (1):</p>
<li> anv: Stop bounds-checking pushed UBOs</li>
<p></p>
<p>Lepton Wu (1):</p>
<li> gallium: dri2: Use index as plane number.</li>
<p></p>
<p>Lionel Landwerlin (3):</p>
<li> anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit</li>
<li> anv: remove list items on batch fini</li>
<li> anv/wsi: signal the semaphore in the acquireNextImage</li>
<p></p>
<p>Marek Olšák (3):</p>
<li> st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them</li>
<li> tgsi_to_nir: fix masked out image loads</li>
<li> tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes</li>
<p></p>
<p>Paulo Zanoni (1):</p>
<li> intel/compiler: fix nir_op_{i,u}*32 on ICL</li>
<p></p>
<p>Pierre-Eric Pelloux-Prayer (3):</p>
<li> radeonsi: disable sdma for gfx10</li>
<li> radeonsi: tell the shader disk cache what IR is used</li>
<li> radeonsi: fix shader disk cache key</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

87
docs/relnotes/19.2.6.html Normal file
View File

@@ -0,0 +1,87 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.6 Release Notes / 2019-11-21</h1>
<p>
Mesa 19.2.6 is a bug fix release which fixes bugs found since the 19.2.5 release.
</p>
<p>
Mesa 19.2.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.6 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>glesv2.pc is not built since fafd20f67dec9f589</li>
<li>textureSize(samplerExternalOES, int) missing in desktop mesa 19.1.7 implementation</li>
<li>[19.2.5] lp_bld_misc: broken #if PIPE_ARCH_LITTLE_ENDIAN on ppc64l</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Alejandro Piñeiro (1):</p>
<li> v3d: adds an extra MOV for any sig.ld*</li>
<p></p>
<p>Dave Airlie (1):</p>
<li> llvmpipe/ppc: fix if/ifdef confusion in backport.</li>
<p></p>
<p>Dylan Baker (2):</p>
<li> docs/relnotes/19.2.5: Add SHA256 sum</li>
<li> meson: generate .pc files for gles and gles2 with old glvnd</li>
<p></p>
<p>Eric Engestrom (1):</p>
<li> vulkan: delete typo&#x27;d header</li>
<p></p>
<p>Hyunjun Ko (1):</p>
<li> freedreno/ir3: fix printing output registers of FS.</li>
<p></p>
<p>Jose Maria Casanova Crespo (1):</p>
<li> v3d: Fix predication with atomic image operations</li>
<p></p>
<p>Yevhenii Kolesnikov (1):</p>
<li> glsl: Enable textureSize for samplerExternalOES</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

96
docs/relnotes/19.2.7.html Normal file
View File

@@ -0,0 +1,96 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.7 Release Notes / 2019-12-04</h1>
<p>
Mesa 19.2.7 is a bug fix release which fixes bugs found since the 19.2.6 release.
</p>
<p>
Mesa 19.2.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.7 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
e3799fb7896fd9ed2f90f651fb907b95cdebfbd494968ff116e6bf1be143579e mesa-19.2.7.tar.xz
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>ld.lld: error: duplicate symbol (mesa-19.3.0-rc1)</li>
<li>triangle strip clipping with GL_FIRST_VERTEX_CONVENTION causes wrong vertex&#x27;s attribute to be broadcasted for flat interpolation</li>
<li>[bisected][regression][g45,g965,ilk] piglit arb_fragment_program kil failures</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Bas Nieuwenhuizen (2):</p>
<li> radv: Allocate cmdbuffer space for buffer marker write.</li>
<li> radv: Unify max_descriptor_set_size.</li>
<p></p>
<p>Boris Brezillon (1):</p>
<li> gallium: Fix the -&gt;set_damage_region() implementation</li>
<p></p>
<p>Ian Romanick (1):</p>
<li> intel/fs: Disable conditional discard optimization on Gen4 and Gen5</li>
<p></p>
<p>Jason Ekstrand (1):</p>
<li> anv: Set up SBE_SWIZ properly for gl_Viewport</li>
<p></p>
<p>Jonathan Gray (2):</p>
<li> winsys/amdgpu: avoid double simple_mtx_unlock()</li>
<li> i965: update Makefile.sources for perf changes</li>
<p></p>
<p>Rhys Perry (1):</p>
<li> radv: set writes_memory for global memory stores/atomics</li>
<p></p>
<p>Samuel Pitoiset (3):</p>
<li> radv: fix enabling sample shading with SampleID/SamplePosition</li>
<li> radv/gfx10: fix implementation of exclusive scans</li>
<li> radv: fix compute pipeline keys when optimizations are disabled</li>
<p></p>
<p>Yevhenii Kolesnikov (1):</p>
<li> meson: Fix linkage of libgallium_nine with libgalliumvl</li>
<p></p>
<p>Zebediah Figura (1):</p>
<li> Revert &quot;draw: revert using correct order for prim decomposition.&quot;</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

108
docs/relnotes/19.2.8.html Normal file
View File

@@ -0,0 +1,108 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.8 Release Notes / 2019-12-18</h1>
<p>
Mesa 19.2.8 is a bug fix release which fixes bugs found since the 19.2.7 release.
</p>
<p>
Mesa 19.2.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.8 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>i965/iris: assert when destroy GL context with active query</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Alyssa Rosenzweig (1):</p>
<li> gallium/util: Support POLYGON in u_stream_outputs_for_vertices</li>
<p></p>
<p>Bas Nieuwenhuizen (2):</p>
<li> amd/common: Always use addrlib for HTILE tc-compat.</li>
<li> amd/common: Fix tcCompatible degradation on Stoney.</li>
<p></p>
<p>Dylan Baker (4):</p>
<li> docs: Add SHA256 sums for 19.2.7</li>
<li> meson/broadcom: libbroadcom_cle needs expat headers</li>
<li> meson/broadcom: libbroadcom_cle also needs zlib</li>
<li> cherry-ignore: Update for 19.2.8</li>
<p></p>
<p>Gert Wollny (1):</p>
<li> virgl: Increase the shader transfer buffer by doubling the size</li>
<p></p>
<p>Iván Briano (1):</p>
<li> anv: Export filter_minmax support only when it&#x27;s really supported</li>
<p></p>
<p>Jason Ekstrand (2):</p>
<li> anv: Re-emit all compute state on pipeline switch</li>
<li> anv: Don&#x27;t leak when set_tiling fails</li>
<p></p>
<p>Kenneth Graunke (1):</p>
<li> iris: Default to X-tiling for scanout buffers without modifiers</li>
<p></p>
<p>Lionel Landwerlin (7):</p>
<li> intel/perf: fix invalid hw_id in query results</li>
<li> intel/perf: set read buffer len to 0 to identify empty buffer</li>
<li> intel/perf: take into account that reports read can be fairly old</li>
<li> intel/perf: simplify the processing of OA reports</li>
<li> intel/perf: fix improper pointer access</li>
<li> anv: fix fence underlying primitive checks</li>
<li> mesa: avoid triggering assert in implementation</li>
<p></p>
<p>Nanley Chery (2):</p>
<li> gallium/dri2: Fix creation of multi-planar modifier images</li>
<li> gallium: Store the image format in winsys_handle</li>
<p></p>
<p>Rob Clark (1):</p>
<li> nir/lower_clip: Fix incorrect driver loc for clipdist outputs</li>
<p></p>
<p>Timothy Arceri (1):</p>
<li> glsl/nir: iterate the system values list when adding varyings</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

View File

@@ -33,12 +33,16 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.khronos.org/registry/egl
**
** Khronos $Git commit SHA1: 9ed2ec4c67 $ on $Git commit date: 2019-01-09 17:54:35 -0800 $
** Khronos $Git commit SHA1: cb927ca98d $ on $Git commit date: 2019-08-08 01:05:38 -0700 $
*/
#include <EGL/eglplatform.h>
/* Generated on date 20190124 */
#ifndef EGL_EGL_PROTOTYPES
#define EGL_EGL_PROTOTYPES 1
#endif
/* Generated on date 20190808 */
/* Generated C header for:
* API: egl
@@ -118,6 +122,31 @@ typedef void (*__eglMustCastToProperFunctionPointerType)(void);
#define EGL_VERSION 0x3054
#define EGL_WIDTH 0x3057
#define EGL_WINDOW_BIT 0x0004
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCHOOSECONFIGPROC) (EGLDisplay dpy, const EGLint *attrib_list, EGLConfig *configs, EGLint config_size, EGLint *num_config);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOPYBUFFERSPROC) (EGLDisplay dpy, EGLSurface surface, EGLNativePixmapType target);
typedef EGLContext (EGLAPIENTRYP PFNEGLCREATECONTEXTPROC) (EGLDisplay dpy, EGLConfig config, EGLContext share_context, const EGLint *attrib_list);
typedef EGLSurface (EGLAPIENTRYP PFNEGLCREATEPBUFFERSURFACEPROC) (EGLDisplay dpy, EGLConfig config, const EGLint *attrib_list);
typedef EGLSurface (EGLAPIENTRYP PFNEGLCREATEPIXMAPSURFACEPROC) (EGLDisplay dpy, EGLConfig config, EGLNativePixmapType pixmap, const EGLint *attrib_list);
typedef EGLSurface (EGLAPIENTRYP PFNEGLCREATEWINDOWSURFACEPROC) (EGLDisplay dpy, EGLConfig config, EGLNativeWindowType win, const EGLint *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLDESTROYCONTEXTPROC) (EGLDisplay dpy, EGLContext ctx);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLDESTROYSURFACEPROC) (EGLDisplay dpy, EGLSurface surface);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETCONFIGATTRIBPROC) (EGLDisplay dpy, EGLConfig config, EGLint attribute, EGLint *value);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETCONFIGSPROC) (EGLDisplay dpy, EGLConfig *configs, EGLint config_size, EGLint *num_config);
typedef EGLDisplay (EGLAPIENTRYP PFNEGLGETCURRENTDISPLAYPROC) (void);
typedef EGLSurface (EGLAPIENTRYP PFNEGLGETCURRENTSURFACEPROC) (EGLint readdraw);
typedef EGLDisplay (EGLAPIENTRYP PFNEGLGETDISPLAYPROC) (EGLNativeDisplayType display_id);
typedef EGLint (EGLAPIENTRYP PFNEGLGETERRORPROC) (void);
typedef __eglMustCastToProperFunctionPointerType (EGLAPIENTRYP PFNEGLGETPROCADDRESSPROC) (const char *procname);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLINITIALIZEPROC) (EGLDisplay dpy, EGLint *major, EGLint *minor);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLMAKECURRENTPROC) (EGLDisplay dpy, EGLSurface draw, EGLSurface read, EGLContext ctx);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYCONTEXTPROC) (EGLDisplay dpy, EGLContext ctx, EGLint attribute, EGLint *value);
typedef const char *(EGLAPIENTRYP PFNEGLQUERYSTRINGPROC) (EGLDisplay dpy, EGLint name);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSURFACEPROC) (EGLDisplay dpy, EGLSurface surface, EGLint attribute, EGLint *value);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSPROC) (EGLDisplay dpy, EGLSurface surface);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLTERMINATEPROC) (EGLDisplay dpy);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLWAITGLPROC) (void);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLWAITNATIVEPROC) (EGLint engine);
#if EGL_EGL_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglChooseConfig (EGLDisplay dpy, const EGLint *attrib_list, EGLConfig *configs, EGLint config_size, EGLint *num_config);
EGLAPI EGLBoolean EGLAPIENTRY eglCopyBuffers (EGLDisplay dpy, EGLSurface surface, EGLNativePixmapType target);
EGLAPI EGLContext EGLAPIENTRY eglCreateContext (EGLDisplay dpy, EGLConfig config, EGLContext share_context, const EGLint *attrib_list);
@@ -142,6 +171,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglSwapBuffers (EGLDisplay dpy, EGLSurface surface
EGLAPI EGLBoolean EGLAPIENTRY eglTerminate (EGLDisplay dpy);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitGL (void);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitNative (EGLint engine);
#endif
#endif /* EGL_VERSION_1_0 */
#ifndef EGL_VERSION_1_1
@@ -160,10 +190,16 @@ EGLAPI EGLBoolean EGLAPIENTRY eglWaitNative (EGLint engine);
#define EGL_TEXTURE_RGB 0x305D
#define EGL_TEXTURE_RGBA 0x305E
#define EGL_TEXTURE_TARGET 0x3081
typedef EGLBoolean (EGLAPIENTRYP PFNEGLBINDTEXIMAGEPROC) (EGLDisplay dpy, EGLSurface surface, EGLint buffer);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLRELEASETEXIMAGEPROC) (EGLDisplay dpy, EGLSurface surface, EGLint buffer);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSURFACEATTRIBPROC) (EGLDisplay dpy, EGLSurface surface, EGLint attribute, EGLint value);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPINTERVALPROC) (EGLDisplay dpy, EGLint interval);
#if EGL_EGL_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglBindTexImage (EGLDisplay dpy, EGLSurface surface, EGLint buffer);
EGLAPI EGLBoolean EGLAPIENTRY eglReleaseTexImage (EGLDisplay dpy, EGLSurface surface, EGLint buffer);
EGLAPI EGLBoolean EGLAPIENTRY eglSurfaceAttrib (EGLDisplay dpy, EGLSurface surface, EGLint attribute, EGLint value);
EGLAPI EGLBoolean EGLAPIENTRY eglSwapInterval (EGLDisplay dpy, EGLint interval);
#endif
#endif /* EGL_VERSION_1_1 */
#ifndef EGL_VERSION_1_2
@@ -199,11 +235,18 @@ typedef void *EGLClientBuffer;
#define EGL_SWAP_BEHAVIOR 0x3093
#define EGL_UNKNOWN EGL_CAST(EGLint,-1)
#define EGL_VERTICAL_RESOLUTION 0x3091
typedef EGLBoolean (EGLAPIENTRYP PFNEGLBINDAPIPROC) (EGLenum api);
typedef EGLenum (EGLAPIENTRYP PFNEGLQUERYAPIPROC) (void);
typedef EGLSurface (EGLAPIENTRYP PFNEGLCREATEPBUFFERFROMCLIENTBUFFERPROC) (EGLDisplay dpy, EGLenum buftype, EGLClientBuffer buffer, EGLConfig config, const EGLint *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLRELEASETHREADPROC) (void);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLWAITCLIENTPROC) (void);
#if EGL_EGL_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglBindAPI (EGLenum api);
EGLAPI EGLenum EGLAPIENTRY eglQueryAPI (void);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePbufferFromClientBuffer (EGLDisplay dpy, EGLenum buftype, EGLClientBuffer buffer, EGLConfig config, const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglReleaseThread (void);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitClient (void);
#endif
#endif /* EGL_VERSION_1_2 */
#ifndef EGL_VERSION_1_3
@@ -232,7 +275,10 @@ EGLAPI EGLBoolean EGLAPIENTRY eglWaitClient (void);
#define EGL_OPENGL_API 0x30A2
#define EGL_OPENGL_BIT 0x0008
#define EGL_SWAP_BEHAVIOR_PRESERVED_BIT 0x0400
typedef EGLContext (EGLAPIENTRYP PFNEGLGETCURRENTCONTEXTPROC) (void);
#if EGL_EGL_PROTOTYPES
EGLAPI EGLContext EGLAPIENTRY eglGetCurrentContext (void);
#endif
#endif /* EGL_VERSION_1_4 */
#ifndef EGL_VERSION_1_5
@@ -284,6 +330,17 @@ typedef void *EGLImage;
#define EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z 0x30B8
#define EGL_IMAGE_PRESERVED 0x30D2
#define EGL_NO_IMAGE EGL_CAST(EGLImage,0)
typedef EGLSync (EGLAPIENTRYP PFNEGLCREATESYNCPROC) (EGLDisplay dpy, EGLenum type, const EGLAttrib *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLDESTROYSYNCPROC) (EGLDisplay dpy, EGLSync sync);
typedef EGLint (EGLAPIENTRYP PFNEGLCLIENTWAITSYNCPROC) (EGLDisplay dpy, EGLSync sync, EGLint flags, EGLTime timeout);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETSYNCATTRIBPROC) (EGLDisplay dpy, EGLSync sync, EGLint attribute, EGLAttrib *value);
typedef EGLImage (EGLAPIENTRYP PFNEGLCREATEIMAGEPROC) (EGLDisplay dpy, EGLContext ctx, EGLenum target, EGLClientBuffer buffer, const EGLAttrib *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLDESTROYIMAGEPROC) (EGLDisplay dpy, EGLImage image);
typedef EGLDisplay (EGLAPIENTRYP PFNEGLGETPLATFORMDISPLAYPROC) (EGLenum platform, void *native_display, const EGLAttrib *attrib_list);
typedef EGLSurface (EGLAPIENTRYP PFNEGLCREATEPLATFORMWINDOWSURFACEPROC) (EGLDisplay dpy, EGLConfig config, void *native_window, const EGLAttrib *attrib_list);
typedef EGLSurface (EGLAPIENTRYP PFNEGLCREATEPLATFORMPIXMAPSURFACEPROC) (EGLDisplay dpy, EGLConfig config, void *native_pixmap, const EGLAttrib *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLWAITSYNCPROC) (EGLDisplay dpy, EGLSync sync, EGLint flags);
#if EGL_EGL_PROTOTYPES
EGLAPI EGLSync EGLAPIENTRY eglCreateSync (EGLDisplay dpy, EGLenum type, const EGLAttrib *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroySync (EGLDisplay dpy, EGLSync sync);
EGLAPI EGLint EGLAPIENTRY eglClientWaitSync (EGLDisplay dpy, EGLSync sync, EGLint flags, EGLTime timeout);
@@ -294,6 +351,7 @@ EGLAPI EGLDisplay EGLAPIENTRY eglGetPlatformDisplay (EGLenum platform, void *nat
EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformWindowSurface (EGLDisplay dpy, EGLConfig config, void *native_window, const EGLAttrib *attrib_list);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformPixmapSurface (EGLDisplay dpy, EGLConfig config, void *native_pixmap, const EGLAttrib *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitSync (EGLDisplay dpy, EGLSync sync, EGLint flags);
#endif
#endif /* EGL_VERSION_1_5 */
#ifdef __cplusplus

View File

@@ -33,12 +33,12 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.khronos.org/registry/egl
**
** Khronos $Git commit SHA1: 9ed2ec4c67 $ on $Git commit date: 2019-01-09 17:54:35 -0800 $
** Khronos $Git commit SHA1: cb927ca98d $ on $Git commit date: 2019-08-08 01:05:38 -0700 $
*/
#include <EGL/eglplatform.h>
#define EGL_EGLEXT_VERSION 20190124
#define EGL_EGLEXT_VERSION 20190808
/* Generated C header for:
* API: egl
@@ -462,6 +462,10 @@ EGLAPI EGLint EGLAPIENTRY eglWaitSyncKHR (EGLDisplay dpy, EGLSyncKHR sync, EGLin
#endif
#endif /* EGL_KHR_wait_sync */
#ifndef EGL_ANDROID_GLES_layers
#define EGL_ANDROID_GLES_layers 1
#endif /* EGL_ANDROID_GLES_layers */
#ifndef EGL_ANDROID_blob_cache
#define EGL_ANDROID_blob_cache 1
typedef khronos_ssize_t EGLsizeiANDROID;
@@ -1129,6 +1133,11 @@ EGLAPI EGLBoolean EGLAPIENTRY eglPostSubBufferNV (EGLDisplay dpy, EGLSurface sur
#endif
#endif /* EGL_NV_post_sub_buffer */
#ifndef EGL_NV_quadruple_buffer
#define EGL_NV_quadruple_buffer 1
#define EGL_QUADRUPLE_BUFFER_NV 0x3231
#endif /* EGL_NV_quadruple_buffer */
#ifndef EGL_NV_robustness_video_memory_purge
#define EGL_NV_robustness_video_memory_purge 1
#define EGL_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV 0x334C
@@ -1170,6 +1179,12 @@ EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerGLTextureExternalAttribsNV (EGLDi
#define EGL_STREAM_CROSS_SYSTEM_NV 0x334F
#endif /* EGL_NV_stream_cross_system */
#ifndef EGL_NV_stream_dma
#define EGL_NV_stream_dma 1
#define EGL_STREAM_DMA_NV 0x3371
#define EGL_STREAM_DMA_SERVER_NV 0x3372
#endif /* EGL_NV_stream_dma */
#ifndef EGL_NV_stream_fifo_next
#define EGL_NV_stream_fifo_next 1
#define EGL_PENDING_FRAME_NV 0x3329
@@ -1221,6 +1236,21 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryStreamMetadataNV (EGLDisplay dpy, EGLStrea
#endif
#endif /* EGL_NV_stream_metadata */
#ifndef EGL_NV_stream_origin
#define EGL_NV_stream_origin 1
#define EGL_STREAM_FRAME_ORIGIN_X_NV 0x3366
#define EGL_STREAM_FRAME_ORIGIN_Y_NV 0x3367
#define EGL_STREAM_FRAME_MAJOR_AXIS_NV 0x3368
#define EGL_CONSUMER_AUTO_ORIENTATION_NV 0x3369
#define EGL_PRODUCER_AUTO_ORIENTATION_NV 0x336A
#define EGL_LEFT_NV 0x336B
#define EGL_RIGHT_NV 0x336C
#define EGL_TOP_NV 0x336D
#define EGL_BOTTOM_NV 0x336E
#define EGL_X_AXIS_NV 0x336F
#define EGL_Y_AXIS_NV 0x3370
#endif /* EGL_NV_stream_origin */
#ifndef EGL_NV_stream_remote
#define EGL_NV_stream_remote 1
#define EGL_STREAM_STATE_INITIALIZING_NV 0x3240
@@ -1317,6 +1347,11 @@ EGLAPI EGLuint64NV EGLAPIENTRY eglGetSystemTimeNV (void);
#endif /* KHRONOS_SUPPORT_INT64 */
#endif /* EGL_NV_system_time */
#ifndef EGL_NV_triple_buffer
#define EGL_NV_triple_buffer 1
#define EGL_TRIPLE_BUFFER_NV 0x3230
#endif /* EGL_NV_triple_buffer */
#ifndef EGL_TIZEN_image_native_buffer
#define EGL_TIZEN_image_native_buffer 1
#define EGL_NATIVE_BUFFER_TIZEN 0x32A0

View File

@@ -77,11 +77,17 @@ typedef HDC EGLNativeDisplayType;
typedef HBITMAP EGLNativePixmapType;
typedef HWND EGLNativeWindowType;
#elif defined(__EMSCRIPTEN__)
typedef int EGLNativeDisplayType;
typedef int EGLNativePixmapType;
typedef int EGLNativeWindowType;
#elif defined(__WINSCW__) || defined(__SYMBIAN32__) /* Symbian */
typedef int EGLNativeDisplayType;
typedef void *EGLNativeWindowType;
typedef void *EGLNativePixmapType;
typedef void *EGLNativeWindowType;
#elif defined(WL_EGL_PLATFORM)
@@ -100,15 +106,15 @@ typedef void *EGLNativeWindowType;
struct ANativeWindow;
struct egl_native_pixmap_t;
typedef struct ANativeWindow* EGLNativeWindowType;
typedef struct egl_native_pixmap_t* EGLNativePixmapType;
typedef void* EGLNativeDisplayType;
typedef struct egl_native_pixmap_t* EGLNativePixmapType;
typedef struct ANativeWindow* EGLNativeWindowType;
#elif defined(USE_OZONE)
typedef intptr_t EGLNativeDisplayType;
typedef intptr_t EGLNativeWindowType;
typedef intptr_t EGLNativePixmapType;
typedef intptr_t EGLNativeWindowType;
#elif defined(__unix__) || defined(__APPLE__)

View File

@@ -27,11 +27,6 @@
#ifndef __gl_h_
#define __gl_h_
#if defined(USE_MGL_NAMESPACE)
#include "gl_mangle.h"
#endif
/**********************************************************************
* Begin system-specific stuff.
*/
@@ -2101,13 +2096,6 @@ typedef void (APIENTRYP PFNGLEGLIMAGETARGETRENDERBUFFERSTORAGEOESPROC) (GLenum t
#endif
/**
** NOTE!!!!! If you add new functions to this file, or update
** glext.h be sure to regenerate the gl_mangle.h file. See comments
** in that file for details.
**/
#ifdef __cplusplus
}
#endif

File diff suppressed because it is too large Load Diff

View File

@@ -32,11 +32,6 @@
#include <GL/gl.h>
#if defined(USE_MGL_NAMESPACE)
#include "glx_mangle.h"
#endif
#ifdef __cplusplus
extern "C" {
#endif

View File

@@ -1,82 +0,0 @@
/*
* Mesa 3-D graphics library
*
* Copyright (C) 1999-2006 Brian Paul All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef GLX_MANGLE_H
#define GLX_MANGLE_H
#define glXChooseVisual mglXChooseVisual
#define glXCreateContext mglXCreateContext
#define glXDestroyContext mglXDestroyContext
#define glXMakeCurrent mglXMakeCurrent
#define glXCopyContext mglXCopyContext
#define glXSwapBuffers mglXSwapBuffers
#define glXCreateGLXPixmap mglXCreateGLXPixmap
#define glXDestroyGLXPixmap mglXDestroyGLXPixmap
#define glXQueryExtension mglXQueryExtension
#define glXQueryVersion mglXQueryVersion
#define glXIsDirect mglXIsDirect
#define glXGetConfig mglXGetConfig
#define glXGetCurrentContext mglXGetCurrentContext
#define glXGetCurrentDrawable mglXGetCurrentDrawable
#define glXWaitGL mglXWaitGL
#define glXWaitX mglXWaitX
#define glXUseXFont mglXUseXFont
#define glXQueryExtensionsString mglXQueryExtensionsString
#define glXQueryServerString mglXQueryServerString
#define glXGetClientString mglXGetClientString
#define glXCreateGLXPixmapMESA mglXCreateGLXPixmapMESA
#define glXReleaseBuffersMESA mglXReleaseBuffersMESA
#define glXCopySubBufferMESA mglXCopySubBufferMESA
#define glXGetVideoSyncSGI mglXGetVideoSyncSGI
#define glXWaitVideoSyncSGI mglXWaitVideoSyncSGI
/* GLX 1.2 */
#define glXGetCurrentDisplay mglXGetCurrentDisplay
/* GLX 1.3 */
#define glXChooseFBConfig mglXChooseFBConfig
#define glXGetFBConfigAttrib mglXGetFBConfigAttrib
#define glXGetFBConfigs mglXGetFBConfigs
#define glXGetVisualFromFBConfig mglXGetVisualFromFBConfig
#define glXCreateWindow mglXCreateWindow
#define glXDestroyWindow mglXDestroyWindow
#define glXCreatePixmap mglXCreatePixmap
#define glXDestroyPixmap mglXDestroyPixmap
#define glXCreatePbuffer mglXCreatePbuffer
#define glXDestroyPbuffer mglXDestroyPbuffer
#define glXQueryDrawable mglXQueryDrawable
#define glXCreateNewContext mglXCreateNewContext
#define glXMakeContextCurrent mglXMakeContextCurrent
#define glXGetCurrentReadDrawable mglXGetCurrentReadDrawable
#define glXQueryContext mglXQueryContext
#define glXSelectEvent mglXSelectEvent
#define glXGetSelectedEvent mglXGetSelectedEvent
/* GLX 1.4 */
#define glXGetProcAddress mglXGetProcAddress
#define glXGetProcAddressARB mglXGetProcAddressARB
#endif

View File

@@ -85,6 +85,7 @@ typedef struct __DRI2throttleExtensionRec __DRI2throttleExtension;
typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension;
typedef struct __DRI2interopExtensionRec __DRI2interopExtension;
typedef struct __DRI2blobExtensionRec __DRI2blobExtension;
typedef struct __DRI2bufferDamageExtensionRec __DRI2bufferDamageExtension;
typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension;
typedef struct __DRIimageDriverExtensionRec __DRIimageDriverExtension;
@@ -488,6 +489,48 @@ struct __DRI2interopExtensionRec {
struct mesa_glinterop_export_out *out);
};
/**
* Extension for limiting window system back buffer rendering to user-defined
* scissor region.
*/
#define __DRI2_BUFFER_DAMAGE "DRI2_BufferDamage"
#define __DRI2_BUFFER_DAMAGE_VERSION 1
struct __DRI2bufferDamageExtensionRec {
__DRIextension base;
/**
* Provides an array of rectangles representing an overriding scissor region
* for rendering operations performed to the specified drawable. These
* rectangles do not replace client API scissor regions or draw
* co-ordinates, but instead inform the driver of the overall bounds of all
* operations which will be issued before the next flush.
*
* Any rendering operations writing pixels outside this region to the
* drawable will have an undefined effect on the entire drawable.
*
* This entrypoint may only be called after the drawable has either been
* newly created or flushed, and before any rendering operations which write
* pixels to the drawable. Calling this entrypoint at any other time will
* have an undefined effect on the entire drawable.
*
* Calling this entrypoint with @nrects 0 and @rects NULL will reset the
* region to the buffer's full size. This entrypoint may be called once to
* reset the region, followed by a second call with a populated region,
* before a rendering call is made.
*
* Used to implement EGL_KHR_partial_update.
*
* \param drawable affected drawable
* \param nrects number of rectangles provided
* \param rects the array of rectangles, lower-left origin
*/
void (*set_damage_region)(__DRIdrawable *drawable, unsigned int nrects,
int *rects);
};
/*@}*/
/**

View File

@@ -12,7 +12,6 @@ Normal Haiku Op*enGL layout:
* headers/os/opengl/GLView.h
* headers/os/opengl/GLRenderer.h
* headers/os/opengl/GL/gl.h
* headers/os/opengl/GL/gl_mangle.h
* headers/os/opengl/GL/glext.h
* headers/os/opengl/GL/osmesa.h (needed?)

View File

@@ -2,7 +2,7 @@
#define __khrplatform_h_
/*
** Copyright (c) 2008-2009 The Khronos Group Inc.
** Copyright (c) 2008-2018 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -26,18 +26,16 @@
/* Khronos platform-specific types and definitions.
*
* $Revision: 32517 $ on $Date: 2016-03-11 02:41:19 -0800 (Fri, 11 Mar 2016) $
* The master copy of khrplatform.h is maintained in the Khronos EGL
* Registry repository at https://github.com/KhronosGroup/EGL-Registry
* The last semantic modification to khrplatform.h was at commit ID:
* 67a3e0864c2d75ea5287b9f3d2eb74a745936692
*
* Adopters may modify this file to suit their platform. Adopters are
* encouraged to submit platform specific modifications to the Khronos
* group so that they can be included in future versions of this file.
* Please submit changes by sending them to the public Khronos Bugzilla
* (http://khronos.org/bugzilla) by filing a bug against product
* "Khronos (general)" component "Registry".
*
* A predefined template which fills in some of the bug fields can be
* reached using http://tinyurl.com/khrplatform-h-bugreport, but you
* must create a Bugzilla login first.
* Please submit changes by filing pull requests or issues on
* the EGL Registry repository linked above.
*
*
* See the Implementer's Guidelines for information about where this file
@@ -92,12 +90,20 @@
* int arg2) KHRONOS_APIATTRIBUTES;
*/
#if defined(__SCITECH_SNAP__) && !defined(KHRONOS_STATIC)
# define KHRONOS_STATIC 1
#endif
/*-------------------------------------------------------------------------
* Definition of KHRONOS_APICALL
*-------------------------------------------------------------------------
* This precedes the return type of the function in the function prototype.
*/
#if defined(_WIN32) && !defined(__SCITECH_SNAP__)
#if defined(KHRONOS_STATIC)
/* If the preprocessor constant KHRONOS_STATIC is defined, make the
* header compatible with static linking. */
# define KHRONOS_APICALL
#elif defined(_WIN32)
# define KHRONOS_APICALL __declspec(dllimport)
#elif defined (__SYMBIAN32__)
# define KHRONOS_APICALL IMPORT_C
@@ -115,7 +121,7 @@
* This follows the return type of the function and precedes the function
* name in the function prototype.
*/
#if defined(_WIN32) && !defined(_WIN32_WCE) && !defined(__SCITECH_SNAP__)
#if defined(_WIN32) && !defined(_WIN32_WCE) && !defined(KHRONOS_STATIC)
/* Win32 but not WinCE */
# define KHRONOS_APIENTRY __stdcall
#else

27
include/c11_compat.h Normal file
View File

@@ -0,0 +1,27 @@
/* Copyright 2019 Intel Corporation */
/* SPDX-License-Identifier: MIT */
#include "no_extern_c.h"
#ifndef _C11_COMPAT_H_
#define _C11_COMPAT_H_
#if defined(__cplusplus)
/* This is C++ code, not C */
#elif (__STDC_VERSION__ >= 201112L)
/* Already C11 */
#else
/*
* C11 static_assert() macro
* assert.h only defines that name for C11 and above
*/
#ifndef static_assert
#define static_assert _Static_assert
#endif
#endif /* !C++ && !C11 */
#endif /* _C11_COMPAT_H_ */

View File

@@ -96,7 +96,7 @@
* - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
*/
#ifndef restrict
# if (__STDC_VERSION__ >= 199901L)
# if (__STDC_VERSION__ >= 199901L) && !defined(__cplusplus)
/* C99 */
# elif defined(__GNUC__)
# define restrict __restrict__

View File

@@ -18,6 +18,9 @@ extern "C" {
#define DRM_PANFROST_MMAP_BO 0x03
#define DRM_PANFROST_GET_PARAM 0x04
#define DRM_PANFROST_GET_BO_OFFSET 0x05
#define DRM_PANFROST_PERFCNT_ENABLE 0x06
#define DRM_PANFROST_PERFCNT_DUMP 0x07
#define DRM_PANFROST_MADVISE 0x08
#define DRM_IOCTL_PANFROST_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
#define DRM_IOCTL_PANFROST_WAIT_BO DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
@@ -25,6 +28,16 @@ extern "C" {
#define DRM_IOCTL_PANFROST_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_MMAP_BO, struct drm_panfrost_mmap_bo)
#define DRM_IOCTL_PANFROST_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_GET_PARAM, struct drm_panfrost_get_param)
#define DRM_IOCTL_PANFROST_GET_BO_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_GET_BO_OFFSET, struct drm_panfrost_get_bo_offset)
#define DRM_IOCTL_PANFROST_MADVISE DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_MADVISE, struct drm_panfrost_madvise)
/*
* Unstable ioctl(s): only exposed when the unsafe unstable_ioctls module
* param is set to true.
* All these ioctl(s) are subject to deprecation, so please don't rely on
* them for anything but debugging purpose.
*/
#define DRM_IOCTL_PANFROST_PERFCNT_ENABLE DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_ENABLE, struct drm_panfrost_perfcnt_enable)
#define DRM_IOCTL_PANFROST_PERFCNT_DUMP DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump)
#define PANFROST_JD_REQ_FS (1 << 0)
/**
@@ -71,6 +84,9 @@ struct drm_panfrost_wait_bo {
__s64 timeout_ns; /* absolute */
};
#define PANFROST_BO_NOEXEC 1
#define PANFROST_BO_HEAP 2
/**
* struct drm_panfrost_create_bo - ioctl argument for creating Panfrost BOs.
*
@@ -116,6 +132,45 @@ struct drm_panfrost_mmap_bo {
enum drm_panfrost_param {
DRM_PANFROST_PARAM_GPU_PROD_ID,
DRM_PANFROST_PARAM_GPU_REVISION,
DRM_PANFROST_PARAM_SHADER_PRESENT,
DRM_PANFROST_PARAM_TILER_PRESENT,
DRM_PANFROST_PARAM_L2_PRESENT,
DRM_PANFROST_PARAM_STACK_PRESENT,
DRM_PANFROST_PARAM_AS_PRESENT,
DRM_PANFROST_PARAM_JS_PRESENT,
DRM_PANFROST_PARAM_L2_FEATURES,
DRM_PANFROST_PARAM_CORE_FEATURES,
DRM_PANFROST_PARAM_TILER_FEATURES,
DRM_PANFROST_PARAM_MEM_FEATURES,
DRM_PANFROST_PARAM_MMU_FEATURES,
DRM_PANFROST_PARAM_THREAD_FEATURES,
DRM_PANFROST_PARAM_MAX_THREADS,
DRM_PANFROST_PARAM_THREAD_MAX_WORKGROUP_SZ,
DRM_PANFROST_PARAM_THREAD_MAX_BARRIER_SZ,
DRM_PANFROST_PARAM_COHERENCY_FEATURES,
DRM_PANFROST_PARAM_TEXTURE_FEATURES0,
DRM_PANFROST_PARAM_TEXTURE_FEATURES1,
DRM_PANFROST_PARAM_TEXTURE_FEATURES2,
DRM_PANFROST_PARAM_TEXTURE_FEATURES3,
DRM_PANFROST_PARAM_JS_FEATURES0,
DRM_PANFROST_PARAM_JS_FEATURES1,
DRM_PANFROST_PARAM_JS_FEATURES2,
DRM_PANFROST_PARAM_JS_FEATURES3,
DRM_PANFROST_PARAM_JS_FEATURES4,
DRM_PANFROST_PARAM_JS_FEATURES5,
DRM_PANFROST_PARAM_JS_FEATURES6,
DRM_PANFROST_PARAM_JS_FEATURES7,
DRM_PANFROST_PARAM_JS_FEATURES8,
DRM_PANFROST_PARAM_JS_FEATURES9,
DRM_PANFROST_PARAM_JS_FEATURES10,
DRM_PANFROST_PARAM_JS_FEATURES11,
DRM_PANFROST_PARAM_JS_FEATURES12,
DRM_PANFROST_PARAM_JS_FEATURES13,
DRM_PANFROST_PARAM_JS_FEATURES14,
DRM_PANFROST_PARAM_JS_FEATURES15,
DRM_PANFROST_PARAM_NR_CORE_GROUPS,
DRM_PANFROST_PARAM_THREAD_TLS_ALLOC,
};
struct drm_panfrost_get_param {
@@ -135,6 +190,39 @@ struct drm_panfrost_get_bo_offset {
__u64 offset;
};
struct drm_panfrost_perfcnt_enable {
__u32 enable;
/*
* On bifrost we have 2 sets of counters, this parameter defines the
* one to track.
*/
__u32 counterset;
};
struct drm_panfrost_perfcnt_dump {
__u64 buf_ptr;
};
/* madvise provides a way to tell the kernel in case a buffers contents
* can be discarded under memory pressure, which is useful for userspace
* bo cache where we want to optimistically hold on to buffer allocate
* and potential mmap, but allow the pages to be discarded under memory
* pressure.
*
* Typical usage would involve madvise(DONTNEED) when buffer enters BO
* cache, and madvise(WILLNEED) if trying to recycle buffer from BO cache.
* In the WILLNEED case, 'retained' indicates to userspace whether the
* backing pages still exist.
*/
#define PANFROST_MADV_WILLNEED 0 /* backing pages are needed, status returned in 'retained' */
#define PANFROST_MADV_DONTNEED 1 /* backing pages not needed */
struct drm_panfrost_madvise {
__u32 handle; /* in, GEM handle */
__u32 madv; /* in, PANFROST_MADV_x */
__u32 retained; /* out, whether backing store still exists */
};
#if defined(__cplusplus)
}
#endif

View File

@@ -22,52 +22,77 @@ inc_include = include_directories('.')
inc_d3d9 = include_directories('D3D9')
inc_haikugl = include_directories('HaikuGL')
if with_gles1
install_headers(
'GLES/egl.h', 'GLES/gl.h', 'GLES/glext.h', 'GLES/glplatform.h',
subdir : 'GLES',
)
if not glvnd_has_headers_and_pc_files
if with_gles1 or with_gles2 or with_opengl or with_egl
install_headers('KHR/khrplatform.h', subdir : 'KHR')
endif
if with_gles1
install_headers(
'GLES/egl.h',
'GLES/gl.h',
'GLES/glext.h',
'GLES/glplatform.h',
subdir : 'GLES',
)
endif
if with_gles2
install_headers(
'GLES2/gl2.h',
'GLES2/gl2ext.h',
'GLES2/gl2platform.h',
subdir : 'GLES2',
)
install_headers(
'GLES3/gl3.h',
'GLES3/gl31.h',
'GLES3/gl32.h',
'GLES3/gl3ext.h',
'GLES3/gl3platform.h',
subdir : 'GLES3',
)
endif
if with_opengl
install_headers(
'GL/gl.h',
'GL/glcorearb.h',
'GL/glext.h',
subdir : 'GL',
)
endif
if with_glx != 'disabled'
install_headers(
'GL/glx.h',
'GL/glxext.h',
subdir : 'GL')
endif
if with_egl
install_headers(
'EGL/egl.h',
'EGL/eglext.h',
'EGL/eglplatform.h',
subdir : 'EGL',
)
endif
endif
if with_gles2
# Non-upstream headers
if with_egl
install_headers(
'GLES2/gl2.h', 'GLES2/gl2ext.h', 'GLES2/gl2platform.h',
subdir : 'GLES2',
'EGL/eglmesaext.h',
'EGL/eglextchromium.h',
subdir : 'EGL',
)
install_headers(
'GLES3/gl3.h', 'GLES3/gl31.h', 'GLES3/gl32.h', 'GLES3/gl3ext.h',
'GLES3/gl3platform.h',
subdir : 'GLES3',
)
endif
if with_gles1 or with_gles2 or with_opengl or with_egl
install_headers('KHR/khrplatform.h', subdir : 'KHR')
endif
if with_opengl
install_headers(
'GL/gl.h', 'GL/glext.h', 'GL/glcorearb.h', 'GL/gl_mangle.h',
subdir : 'GL',
)
endif
if with_glx != 'disabled'
install_headers('GL/glx.h', 'GL/glxext.h', 'GL/glx_mangle.h', subdir : 'GL')
endif
if with_osmesa != 'none'
install_headers('GL/osmesa.h', subdir : 'GL')
endif
if with_egl
install_headers(
'EGL/eglext.h', 'EGL/egl.h', 'EGL/eglextchromium.h', 'EGL/eglmesaext.h',
'EGL/eglplatform.h',
subdir : 'EGL',
)
endif
if with_dri
install_headers('GL/internal/dri_interface.h', subdir : 'GL/internal')
endif

View File

@@ -186,29 +186,29 @@ CHIPSET(0x3EA5, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA6, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA7, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA8, cfl_gt3, "Intel(R) HD Graphics (Coffeelake 3x8 GT3)")
CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 2x6 GT1)")
CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT1)")
CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)")
CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT2)")
CHIPSET(0x3EA2, cfl_gt3, "Intel(R) HD Graphics (Whiskey Lake 3x8 GT3)")
CHIPSET(0x9B21, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA0, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA2, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA4, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA5, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA8, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAA, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAB, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAC, cfl_gt1, "Intel(R) HD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9B41, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC0, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC2, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC4, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC5, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC8, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCA, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCB, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCC, cfl_gt2, "Intel(R) HD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x3EA1, cfl_gt1, "Intel(R) UHD Graphics (Whiskey Lake 2x6 GT1)")
CHIPSET(0x3EA4, cfl_gt1, "Intel(R) UHD Graphics (Whiskey Lake 3x8 GT1)")
CHIPSET(0x3EA0, cfl_gt2, "Intel(R) UHD Graphics (Whiskey Lake 3x8 GT2)")
CHIPSET(0x3EA3, cfl_gt2, "Intel(R) UHD Graphics (Whiskey Lake 3x8 GT2)")
CHIPSET(0x3EA2, cfl_gt3, "Intel(R) UHD Graphics (Whiskey Lake 3x8 GT3)")
CHIPSET(0x9B21, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA0, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA2, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA4, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA5, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BA8, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAA, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAB, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9BAC, cfl_gt1, "Intel(R) UHD Graphics (Comet Lake 2x6 GT1)")
CHIPSET(0x9B41, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC0, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC2, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC4, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC5, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BC8, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCA, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCB, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x9BCC, cfl_gt2, "Intel(R) UHD Graphics (Comet Lake 3x8 GT2)")
CHIPSET(0x5A49, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
CHIPSET(0x5A4A, cnl_2x8, "Intel(R) HD Graphics (Cannonlake 2x8 GT0.5)")
CHIPSET(0x5A41, cnl_3x8, "Intel(R) HD Graphics (Cannonlake 3x8 GT1)")

View File

@@ -254,6 +254,8 @@ CHIPSET(0x66AF, VEGA20)
CHIPSET(0x15DD, RAVEN)
CHIPSET(0x15D8, RAVEN)
CHIPSET(0x1636, RENOIR)
CHIPSET(0x738C, ARCTURUS)
CHIPSET(0x7388, ARCTURUS)
CHIPSET(0x738E, ARCTURUS)
@@ -265,3 +267,9 @@ CHIPSET(0x7319, NAVI10)
CHIPSET(0x731A, NAVI10)
CHIPSET(0x731B, NAVI10)
CHIPSET(0x731F, NAVI10)
CHIPSET(0x7360, NAVI12)
CHIPSET(0x7340, NAVI14)
CHIPSET(0x7341, NAVI14)
CHIPSET(0x7347, NAVI14)

View File

@@ -43,7 +43,7 @@ extern "C" {
#define VK_VERSION_MINOR(version) (((uint32_t)(version) >> 12) & 0x3ff)
#define VK_VERSION_PATCH(version) ((uint32_t)(version) & 0xfff)
// Version of this file
#define VK_HEADER_VERSION 117
#define VK_HEADER_VERSION 119
#define VK_NULL_HANDLE 0
@@ -447,6 +447,7 @@ typedef enum VkStructureType {
VK_STRUCTURE_TYPE_MEMORY_HOST_POINTER_PROPERTIES_EXT = 1000178001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_MEMORY_HOST_PROPERTIES_EXT = 1000178002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_ATOMIC_INT64_FEATURES_KHR = 1000180000,
VK_STRUCTURE_TYPE_PIPELINE_COMPILER_CONTROL_CREATE_INFO_AMD = 1000183000,
VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT = 1000184000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_AMD = 1000185000,
VK_STRUCTURE_TYPE_DEVICE_MEMORY_OVERALLOCATION_CREATE_INFO_AMD = 1000189000,
@@ -487,6 +488,8 @@ typedef enum VkStructureType {
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SCALAR_BLOCK_LAYOUT_FEATURES_EXT = 1000221000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_SIZE_CONTROL_PROPERTIES_EXT = 1000225000,
VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_REQUIRED_SUBGROUP_SIZE_CREATE_INFO_EXT = 1000225001,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_SIZE_CONTROL_FEATURES_EXT = 1000225002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_2_AMD = 1000227000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_BUDGET_PROPERTIES_EXT = 1000237000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PRIORITY_FEATURES_EXT = 1000238000,
VK_STRUCTURE_TYPE_MEMORY_PRIORITY_ALLOCATE_INFO_EXT = 1000238001,
@@ -515,6 +518,12 @@ typedef enum VkStructureType {
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_LINE_RASTERIZATION_PROPERTIES_EXT = 1000259002,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_HOST_QUERY_RESET_FEATURES_EXT = 1000261000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_INDEX_TYPE_UINT8_FEATURES_EXT = 1000265000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PIPELINE_EXECUTABLE_PROPERTIES_FEATURES_KHR = 1000269000,
VK_STRUCTURE_TYPE_PIPELINE_INFO_KHR = 1000269001,
VK_STRUCTURE_TYPE_PIPELINE_EXECUTABLE_PROPERTIES_KHR = 1000269002,
VK_STRUCTURE_TYPE_PIPELINE_EXECUTABLE_INFO_KHR = 1000269003,
VK_STRUCTURE_TYPE_PIPELINE_EXECUTABLE_STATISTIC_KHR = 1000269004,
VK_STRUCTURE_TYPE_PIPELINE_EXECUTABLE_INTERNAL_REPRESENTATION_KHR = 1000269005,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_DEMOTE_TO_HELPER_INVOCATION_FEATURES_EXT = 1000276000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TEXEL_BUFFER_ALIGNMENT_FEATURES_EXT = 1000281000,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TEXEL_BUFFER_ALIGNMENT_PROPERTIES_EXT = 1000281001,
@@ -1664,6 +1673,8 @@ typedef enum VkPipelineCreateFlagBits {
VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT = 0x00000008,
VK_PIPELINE_CREATE_DISPATCH_BASE = 0x00000010,
VK_PIPELINE_CREATE_DEFER_COMPILE_BIT_NV = 0x00000020,
VK_PIPELINE_CREATE_CAPTURE_STATISTICS_BIT_KHR = 0x00000040,
VK_PIPELINE_CREATE_CAPTURE_INTERNAL_REPRESENTATIONS_BIT_KHR = 0x00000080,
VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT_KHR = VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT,
VK_PIPELINE_CREATE_DISPATCH_BASE_KHR = VK_PIPELINE_CREATE_DISPATCH_BASE,
VK_PIPELINE_CREATE_FLAG_BITS_MAX_ENUM = 0x7FFFFFFF
@@ -6361,6 +6372,99 @@ typedef struct VkPhysicalDeviceUniformBufferStandardLayoutFeaturesKHR {
#define VK_KHR_pipeline_executable_properties 1
#define VK_KHR_PIPELINE_EXECUTABLE_PROPERTIES_SPEC_VERSION 1
#define VK_KHR_PIPELINE_EXECUTABLE_PROPERTIES_EXTENSION_NAME "VK_KHR_pipeline_executable_properties"
typedef enum VkPipelineExecutableStatisticFormatKHR {
VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_BOOL32_KHR = 0,
VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_INT64_KHR = 1,
VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_UINT64_KHR = 2,
VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_FLOAT64_KHR = 3,
VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_BEGIN_RANGE_KHR = VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_BOOL32_KHR,
VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_END_RANGE_KHR = VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_FLOAT64_KHR,
VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_RANGE_SIZE_KHR = (VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_FLOAT64_KHR - VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_BOOL32_KHR + 1),
VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_MAX_ENUM_KHR = 0x7FFFFFFF
} VkPipelineExecutableStatisticFormatKHR;
typedef struct VkPhysicalDevicePipelineExecutablePropertiesFeaturesKHR {
VkStructureType sType;
void* pNext;
VkBool32 pipelineExecutableInfo;
} VkPhysicalDevicePipelineExecutablePropertiesFeaturesKHR;
typedef struct VkPipelineInfoKHR {
VkStructureType sType;
const void* pNext;
VkPipeline pipeline;
} VkPipelineInfoKHR;
typedef struct VkPipelineExecutablePropertiesKHR {
VkStructureType sType;
void* pNext;
VkShaderStageFlags stages;
char name[VK_MAX_DESCRIPTION_SIZE];
char description[VK_MAX_DESCRIPTION_SIZE];
uint32_t subgroupSize;
} VkPipelineExecutablePropertiesKHR;
typedef struct VkPipelineExecutableInfoKHR {
VkStructureType sType;
const void* pNext;
VkPipeline pipeline;
uint32_t executableIndex;
} VkPipelineExecutableInfoKHR;
typedef union VkPipelineExecutableStatisticValueKHR {
VkBool32 b32;
int64_t i64;
uint64_t u64;
double f64;
} VkPipelineExecutableStatisticValueKHR;
typedef struct VkPipelineExecutableStatisticKHR {
VkStructureType sType;
void* pNext;
char name[VK_MAX_DESCRIPTION_SIZE];
char description[VK_MAX_DESCRIPTION_SIZE];
VkPipelineExecutableStatisticFormatKHR format;
VkPipelineExecutableStatisticValueKHR value;
} VkPipelineExecutableStatisticKHR;
typedef struct VkPipelineExecutableInternalRepresentationKHR {
VkStructureType sType;
void* pNext;
char name[VK_MAX_DESCRIPTION_SIZE];
char description[VK_MAX_DESCRIPTION_SIZE];
VkBool32 isText;
size_t dataSize;
void* pData;
} VkPipelineExecutableInternalRepresentationKHR;
typedef VkResult (VKAPI_PTR *PFN_vkGetPipelineExecutablePropertiesKHR)(VkDevice device, const VkPipelineInfoKHR* pPipelineInfo, uint32_t* pExecutableCount, VkPipelineExecutablePropertiesKHR* pProperties);
typedef VkResult (VKAPI_PTR *PFN_vkGetPipelineExecutableStatisticsKHR)(VkDevice device, const VkPipelineExecutableInfoKHR* pExecutableInfo, uint32_t* pStatisticCount, VkPipelineExecutableStatisticKHR* pStatistics);
typedef VkResult (VKAPI_PTR *PFN_vkGetPipelineExecutableInternalRepresentationsKHR)(VkDevice device, const VkPipelineExecutableInfoKHR* pExecutableInfo, uint32_t* pInternalRepresentationCount, VkPipelineExecutableInternalRepresentationKHR* pInternalRepresentations);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkGetPipelineExecutablePropertiesKHR(
VkDevice device,
const VkPipelineInfoKHR* pPipelineInfo,
uint32_t* pExecutableCount,
VkPipelineExecutablePropertiesKHR* pProperties);
VKAPI_ATTR VkResult VKAPI_CALL vkGetPipelineExecutableStatisticsKHR(
VkDevice device,
const VkPipelineExecutableInfoKHR* pExecutableInfo,
uint32_t* pStatisticCount,
VkPipelineExecutableStatisticKHR* pStatistics);
VKAPI_ATTR VkResult VKAPI_CALL vkGetPipelineExecutableInternalRepresentationsKHR(
VkDevice device,
const VkPipelineExecutableInfoKHR* pExecutableInfo,
uint32_t* pInternalRepresentationCount,
VkPipelineExecutableInternalRepresentationKHR* pInternalRepresentations);
#endif
#define VK_EXT_debug_report 1
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDebugReportCallbackEXT)
#define VK_EXT_DEBUG_REPORT_SPEC_VERSION 9
@@ -8741,6 +8845,22 @@ VKAPI_ATTR void VKAPI_CALL vkCmdWriteBufferMarkerAMD(
#endif
#define VK_AMD_pipeline_compiler_control 1
#define VK_AMD_PIPELINE_COMPILER_CONTROL_SPEC_VERSION 1
#define VK_AMD_PIPELINE_COMPILER_CONTROL_EXTENSION_NAME "VK_AMD_pipeline_compiler_control"
typedef enum VkPipelineCompilerControlFlagBitsAMD {
VK_PIPELINE_COMPILER_CONTROL_FLAG_BITS_MAX_ENUM_AMD = 0x7FFFFFFF
} VkPipelineCompilerControlFlagBitsAMD;
typedef VkFlags VkPipelineCompilerControlFlagsAMD;
typedef struct VkPipelineCompilerControlCreateInfoAMD {
VkStructureType sType;
const void* pNext;
VkPipelineCompilerControlFlagsAMD compilerControlFlags;
} VkPipelineCompilerControlCreateInfoAMD;
#define VK_EXT_calibrated_timestamps 1
#define VK_EXT_CALIBRATED_TIMESTAMPS_SPEC_VERSION 1
#define VK_EXT_CALIBRATED_TIMESTAMPS_EXTENSION_NAME "VK_EXT_calibrated_timestamps"
@@ -9288,8 +9408,15 @@ typedef struct VkPhysicalDeviceScalarBlockLayoutFeaturesEXT {
#define VK_EXT_subgroup_size_control 1
#define VK_EXT_SUBGROUP_SIZE_CONTROL_SPEC_VERSION 1
#define VK_EXT_SUBGROUP_SIZE_CONTROL_SPEC_VERSION 2
#define VK_EXT_SUBGROUP_SIZE_CONTROL_EXTENSION_NAME "VK_EXT_subgroup_size_control"
typedef struct VkPhysicalDeviceSubgroupSizeControlFeaturesEXT {
VkStructureType sType;
void* pNext;
VkBool32 subgroupSizeControl;
VkBool32 computeFullSubgroups;
} VkPhysicalDeviceSubgroupSizeControlFeaturesEXT;
typedef struct VkPhysicalDeviceSubgroupSizeControlPropertiesEXT {
VkStructureType sType;
void* pNext;
@@ -9307,6 +9434,23 @@ typedef struct VkPipelineShaderStageRequiredSubgroupSizeCreateInfoEXT {
#define VK_AMD_shader_core_properties2 1
#define VK_AMD_SHADER_CORE_PROPERTIES_2_SPEC_VERSION 1
#define VK_AMD_SHADER_CORE_PROPERTIES_2_EXTENSION_NAME "VK_AMD_shader_core_properties2"
typedef enum VkShaderCorePropertiesFlagBitsAMD {
VK_SHADER_CORE_PROPERTIES_FLAG_BITS_MAX_ENUM_AMD = 0x7FFFFFFF
} VkShaderCorePropertiesFlagBitsAMD;
typedef VkFlags VkShaderCorePropertiesFlagsAMD;
typedef struct VkPhysicalDeviceShaderCoreProperties2AMD {
VkStructureType sType;
void* pNext;
VkShaderCorePropertiesFlagsAMD shaderCoreFeatures;
uint32_t activeComputeUnitCount;
} VkPhysicalDeviceShaderCoreProperties2AMD;
#define VK_EXT_memory_budget 1
#define VK_EXT_MEMORY_BUDGET_SPEC_VERSION 1
#define VK_EXT_MEMORY_BUDGET_EXTENSION_NAME "VK_EXT_memory_budget"

View File

@@ -1,54 +0,0 @@
#ifndef VULKAN_XLIB_RANDR_H_
#define VULKAN_XLIB_RANDR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2017 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_EXT_acquire_xlib_display 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"
typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);
typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
VkDisplayKHR display);
VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
RROutput rrOutput,
VkDisplayKHR* pDisplay);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -26,7 +26,7 @@ project(
).stdout(),
license : 'MIT',
meson_version : '>= 0.46',
default_options : ['buildtype=debugoptimized', 'b_ndebug=if-release', 'c_std=c99', 'cpp_std=c++11']
default_options : ['buildtype=debugoptimized', 'b_ndebug=if-release', 'c_std=c99', 'cpp_std=c++14']
)
cc = meson.get_compiler('c')
@@ -42,7 +42,7 @@ pre_args = [
'-D__STDC_FORMAT_MACROS',
'-D__STDC_LIMIT_MACROS',
'-DPACKAGE_VERSION="@0@"'.format(meson.project_version()),
'-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"',
'-DPACKAGE_BUGREPORT="https://gitlab.freedesktop.org/mesa/mesa/issues"',
]
with_vulkan_icd_dir = get_option('vulkan-icd-dir')
@@ -61,6 +61,8 @@ if with_tools.contains('all')
'freedreno',
'glsl',
'intel',
'intel-ui',
'lima',
'nir',
'nouveau',
'xvmc',
@@ -91,7 +93,7 @@ with_shared_glapi = get_option('shared-glapi')
# shared-glapi is required if at least two OpenGL APIs are being built
if not with_shared_glapi
if ((with_gles1 == 'true' and with_gles2 == 'true') or
if ((with_gles1 == 'true' and with_gles2 == 'true') or
(with_gles1 == 'true' and with_opengl) or
(with_gles2 == 'true' and with_opengl))
error('shared-glapi required for building two or more of OpenGL, OpenGL ES 1.x, OpenGL ES 2.x')
@@ -115,7 +117,7 @@ with_any_opengl = with_opengl or with_gles1 or with_gles2
# Only build shared_glapi if at least one OpenGL API is enabled
with_shared_glapi = get_option('shared-glapi') and with_any_opengl
system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'gnu/kfreebsd', 'dragonfly', 'linux'].contains(host_machine.system())
system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'gnu/kfreebsd', 'dragonfly', 'linux', 'sunos'].contains(host_machine.system())
dri_drivers = get_option('dri-drivers')
if dri_drivers.contains('auto')
@@ -158,7 +160,7 @@ if gallium_drivers.contains('auto')
elif ['arm', 'aarch64'].contains(host_machine.cpu_family())
gallium_drivers = [
'kmsro', 'v3d', 'vc4', 'freedreno', 'etnaviv', 'nouveau',
'tegra', 'virgl', 'lima', 'swrast'
'tegra', 'virgl', 'lima', 'panfrost', 'swrast'
]
else
error('Unknown architecture @0@. Please pass -Dgallium-drivers to set driver options. Patches gladly accepted to fix this.'.format(
@@ -375,11 +377,15 @@ if with_egl and not (with_platform_drm or with_platform_surfaceless or with_plat
endif
endif
pre_args += '-DGLX_USE_TLS'
# Android uses emutls for versions <= P/28. For USE_ELF_TLS we need ELF TLS.
if not with_platform_android or get_option('platform-sdk-version') >= 29
pre_args += '-DUSE_ELF_TLS'
endif
if with_glx != 'disabled'
if not (with_platform_x11 and with_any_opengl)
error('Cannot build GLX support without X11 platform support and at least one OpenGL API')
elif with_glx == 'gallium-xlib'
elif with_glx == 'gallium-xlib'
if not with_gallium
error('Gallium-xlib based GLX requires at least one gallium driver')
elif not with_gallium_softpipe
@@ -387,7 +393,7 @@ if with_glx != 'disabled'
elif with_dri
error('gallium-xlib conflicts with any dri driver')
endif
elif with_glx == 'xlib'
elif with_glx == 'xlib'
if with_dri
error('xlib conflicts with any dri driver')
endif
@@ -496,10 +502,12 @@ elif not (with_gallium_r600 or with_gallium_nouveau)
endif
endif
dep_xvmc = null_dep
dep_xv = null_dep
with_gallium_xvmc = false
if _xvmc != 'false'
dep_xvmc = dependency('xvmc', version : '>= 1.0.6', required : _xvmc == 'true')
with_gallium_xvmc = dep_xvmc.found()
dep_xv = dependency('xv', required : _xvmc == 'true')
with_gallium_xvmc = dep_xvmc.found() and dep_xv.found()
endif
xvmc_drivers_path = get_option('xvmc-libs-path')
@@ -773,6 +781,11 @@ if cc.get_id() == 'gcc' and cc.version().version_compare('< 4.4.6')
error('When using GCC, version 4.4.6 or later is required.')
endif
# Support systems without ETIME (e.g. FreeBSD)
if cc.get_define('ETIME', prefix : '#include <errno.h>') == ''
pre_args += '-DETIME=ETIMEDOUT'
endif
# Define DEBUG for debug builds only (debugoptimized is not included on this one)
if get_option('buildtype') == 'debug'
pre_args += '-DDEBUG'
@@ -845,6 +858,8 @@ endif
# TODO: this is very incomplete
if ['linux', 'cygwin', 'gnu', 'gnu/kfreebsd'].contains(host_machine.system())
pre_args += '-D_GNU_SOURCE'
elif host_machine.system() == 'sunos'
pre_args += '-D__EXTENSIONS__'
endif
# Check for generic C arguments
@@ -854,6 +869,8 @@ foreach a : ['-Werror=implicit-function-declaration',
'-Werror=incompatible-pointer-types',
'-Werror=format',
'-Wformat-security',
'-Wno-missing-field-initializers',
'-Wno-format-truncation',
'-fno-math-errno',
'-fno-trapping-math', '-Qunused-arguments']
if cc.has_argument(a)
@@ -861,12 +878,6 @@ foreach a : ['-Werror=implicit-function-declaration',
endif
endforeach
foreach a : ['missing-field-initializers', 'format-truncation']
if cc.has_argument('-W' + a)
c_args += '-Wno-' + a
endif
endforeach
c_vis_args = []
if cc.has_argument('-fvisibility=hidden')
c_vis_args += '-fvisibility=hidden'
@@ -877,6 +888,9 @@ cpp_args = []
foreach a : ['-Werror=return-type',
'-Werror=format',
'-Wformat-security',
'-Wno-non-virtual-dtor',
'-Wno-missing-field-initializers',
'-Wno-format-truncation',
'-fno-math-errno', '-fno-trapping-math',
'-Qunused-arguments']
if cpp.has_argument(a)
@@ -884,19 +898,11 @@ foreach a : ['-Werror=return-type',
endif
endforeach
# For some reason, the test for -Wno-foo always succeeds with gcc, even if the
# option is not supported. Hence, check for -Wfoo instead.
foreach a : ['non-virtual-dtor', 'missing-field-initializers', 'format-truncation']
if cpp.has_argument('-W' + a)
cpp_args += '-Wno-' + a
endif
endforeach
no_override_init_args = []
foreach a : ['override-init', 'initializer-overrides']
if cc.has_argument('-W' + a)
no_override_init_args += '-Wno-' + a
foreach a : ['-Wno-override-init',
'-Wno-initializer-overrides']
if cc.has_argument(a)
no_override_init_args += a
endif
endforeach
@@ -1024,13 +1030,13 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
pre_args += '-DMAJOR_IN_MKDEV'
endif
foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h', 'dlfcn.h', 'execinfo.h', 'sys/shm.h']
foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h', 'dlfcn.h', 'execinfo.h', 'sys/shm.h', 'cet.h']
if cc.compiles('#include <@0@>'.format(h), name : '@0@'.format(h))
pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
endif
endforeach
foreach f : ['strtof', 'mkostemp', 'posix_memalign', 'timespec_get', 'memfd_create', 'random_r']
foreach f : ['strtof', 'mkostemp', 'posix_memalign', 'timespec_get', 'memfd_create', 'random_r', 'flock']
if cc.has_function(f)
pre_args += '-DHAVE_@0@'.format(f.to_upper())
endif
@@ -1133,6 +1139,12 @@ if dep_thread.found() and host_machine.system() != 'windows'
args : '-D_GNU_SOURCE')
pre_args += '-DHAVE_PTHREAD_SETAFFINITY'
endif
if cc.has_function(
'pthread_setaffinity_np',
dependencies : dep_thread,
prefix : '#include <pthread_np.h>')
pre_args += '-DPTHREAD_SETAFFINITY_IN_NP_HEADER'
endif
endif
dep_expat = dependency('expat')
# this only exists on linux so either this is linux and it will be found, or
@@ -1290,8 +1302,12 @@ else
endif
dep_glvnd = null_dep
glvnd_has_headers_and_pc_files = false
if with_glvnd
dep_glvnd = dependency('libglvnd', version : '>= 0.2.0')
# GLVND before 1.2 was missing its pkg-config and header files, forcing every
# vendor to provide them and the distro maintainers to resolve the conflict.
glvnd_has_headers_and_pc_files = dep_glvnd.version().version_compare('>= 1.2.0')
pre_args += '-DUSE_LIBGLVND=1'
endif
@@ -1325,9 +1341,6 @@ else
endif
if with_osmesa != 'none'
if with_osmesa == 'classic' and not with_dri_swrast
error('OSMesa classic requires dri (classic) swrast.')
endif
if with_osmesa == 'gallium' and not with_gallium_softpipe
error('OSMesa gallium requires gallium softpipe or llvmpipe.')
endif
@@ -1405,6 +1418,9 @@ if with_platform_x11
with_gallium_omx != 'disabled'))
dep_xcb = dependency('xcb')
dep_x11_xcb = dependency('x11-xcb')
if with_dri_platform == 'drm' and not dep_libdrm.found()
error('libdrm required for gallium video statetrackers when using x11')
endif
endif
if with_any_vk or with_egl or (with_glx == 'dri' and with_dri_platform == 'drm')
dep_xcb_dri2 = dependency('xcb-dri2', version : '>= 1.8')
@@ -1425,7 +1441,7 @@ if with_platform_x11
if with_glx == 'dri' or with_glx == 'gallium-xlib'
dep_glproto = dependency('glproto', version : '>= 1.4.14')
endif
if with_glx == 'dri'
if with_glx == 'dri'
if with_dri_platform == 'drm'
dep_dri2proto = dependency('dri2proto', version : '>= 2.8')
dep_xxf86vm = dependency('xxf86vm')

View File

@@ -128,9 +128,9 @@ def generate(env):
if not path:
path = []
if SCons.Util.is_String(path):
path = string.split(path, os.pathsep)
path = str.split(path, os.pathsep)
env['ENV']['PATH'] = string.join([dir] + path, os.pathsep)
env['ENV']['PATH'] = str.join(os.pathsep, [dir] + path)
# Most of mingw is the same as gcc and friends...
gnu_tools = ['gcc', 'g++', 'gnulink', 'ar', 'gas']

View File

@@ -262,8 +262,12 @@ def parse_source_list(env, filename, names=None):
sym_table = parser.parse(src.abspath)
if names:
if isinstance(names, basestring):
names = [names]
if sys.version_info[0] >= 3:
if isinstance(names, str):
names = [names]
else:
if isinstance(names, basestring):
names = [names]
symbols = names
else:

View File

@@ -132,7 +132,7 @@ def check_cc(env, cc, expr, cpp_opt = '-E'):
sys.stdout.write('Checking for %s ... ' % cc)
source = tempfile.NamedTemporaryFile(suffix='.c', delete=False)
source.write('#if !(%s)\n#error\n#endif\n' % expr)
source.write(('#if !(%s)\n#error\n#endif\n' % expr).encode())
source.close()
# sys.stderr.write('%r %s %s\n' % (env['CC'], cpp_opt, source.name));
@@ -237,6 +237,9 @@ def generate(env):
hosthost_platform = host_platform.system().lower()
if hosthost_platform.startswith('cygwin'):
hosthost_platform = 'cygwin'
# Avoid spurious crosscompilation in MSYS2 environment.
if hosthost_platform.startswith('mingw'):
hosthost_platform = 'windows'
host_machine = os.environ.get('PROCESSOR_ARCHITEW6432', os.environ.get('PROCESSOR_ARCHITECTURE', host_platform.machine()))
host_machine = {
'x86': 'x86',
@@ -403,10 +406,8 @@ def generate(env):
]
if env['build'] in ('debug', 'checked'):
cppdefines += ['_DEBUG']
if platform == 'windows':
cppdefines += ['PIPE_SUBSYSTEM_WINDOWS_USER']
if env['embedded']:
cppdefines += ['PIPE_SUBSYSTEM_EMBEDDED']
cppdefines += ['EMBEDDED_DEVICE']
env.Append(CPPDEFINES = cppdefines)
# C compiler options

View File

@@ -30,6 +30,7 @@ Tool-specific initialization for LLVM
import os
import os.path
import re
import platform as host_platform
import sys
import distutils.version
@@ -100,8 +101,36 @@ def generate(env):
env.Prepend(CPPPATH = [os.path.join(llvm_dir, 'include')])
env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
# LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter irreader`
if llvm_version >= distutils.version.LooseVersion('5.0'):
# LLVM 5.0 and newer requires MinGW w/ pthreads due to use of std::thread and friends.
if llvm_version >= distutils.version.LooseVersion('5.0') and env['crosscompile']:
assert env['gcc']
env.AppendUnique(CXXFLAGS = ['-posix'])
# LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter irreader` for LLVM<=7.0
# and `llvm-config --libs engine irreader` for LLVM>=8.0
# LLVMAggressiveInstCombine library part of engine component can be safely omitted as it's not used.
if llvm_version >= distutils.version.LooseVersion('9.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
'LLVMDebugInfoCodeView', 'LLVMCodeGen',
'LLVMScalarOpts', 'LLVMInstCombine',
'LLVMTransformUtils',
'LLVMBitWriter', 'LLVMX86Desc',
'LLVMMCDisassembler', 'LLVMX86Info',
'LLVMX86Utils',
'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
'LLVMAnalysis', 'LLVMProfileData',
'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
'LLVMBitReader', 'LLVMMC', 'LLVMCore',
'LLVMSupport',
'LLVMIRReader', 'LLVMAsmParser',
'LLVMDemangle', 'LLVMGlobalISel', 'LLVMDebugInfoMSF',
'LLVMBinaryFormat',
'LLVMRemarks', 'LLVMBitstreamReader', 'LLVMDebugInfoDWARF',
])
elif llvm_version >= distutils.version.LooseVersion('5.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
@@ -120,10 +149,6 @@ def generate(env):
'LLVMDemangle', 'LLVMGlobalISel', 'LLVMDebugInfoMSF',
'LLVMBinaryFormat',
])
if env['platform'] == 'windows' and env['crosscompile']:
# LLVM 5.0 requires MinGW w/ pthreads due to use of std::thread and friends.
assert env['gcc']
env.AppendUnique(CXXFLAGS = ['-posix'])
elif llvm_version >= distutils.version.LooseVersion('4.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
@@ -217,6 +242,12 @@ def generate(env):
'uuid',
])
# Mingw-w64 zlib is required when building with LLVM support in MSYS2 environment
if host_platform.system().lower().startswith('mingw'):
env.Append(LIBS = [
'z',
])
if env['msvc']:
# Some of the LLVM C headers use the inline keyword without
# defining it.
@@ -269,7 +300,7 @@ def generate(env):
env.ParseConfig('%s --ldflags' % llvm_config)
if llvm_version >= distutils.version.LooseVersion('3.5'):
env.ParseConfig('%s --system-libs' % llvm_config)
env.Append(CXXFLAGS = ['-std=c++11'])
env.Append(CXXFLAGS = ['-std=c++14'])
except OSError:
print('scons: llvm-config version %s failed' % llvm_version)
return

View File

@@ -77,24 +77,22 @@
#define AMDGPU_ICELAND_RANGE 0x01, 0x14
#define AMDGPU_TONGA_RANGE 0x14, 0x28
#define AMDGPU_FIJI_RANGE 0x3C, 0x50
#define AMDGPU_POLARIS10_RANGE 0x50, 0x5A
#define AMDGPU_POLARIS11_RANGE 0x5A, 0x64
#define AMDGPU_POLARIS12_RANGE 0x64, 0x6E
#define AMDGPU_VEGAM_RANGE 0x6E, 0xFF
#define AMDGPU_CARRIZO_RANGE 0x01, 0x21
#define AMDGPU_BRISTOL_RANGE 0x10, 0x21
#define AMDGPU_STONEY_RANGE 0x61, 0xFF
#define AMDGPU_VEGA10_RANGE 0x01, 0x14
#define AMDGPU_VEGA12_RANGE 0x14, 0x28
#define AMDGPU_VEGA20_RANGE 0x28, 0xFF
#define AMDGPU_VEGA20_RANGE 0x28, 0x32
#define AMDGPU_ARCTURUS_RANGE 0x32, 0xFF
#define AMDGPU_RAVEN_RANGE 0x01, 0x81
#define AMDGPU_RAVEN2_RANGE 0x81, 0xFF
#define AMDGPU_ARCTURUS_RANGE 0x32, 0xFF
#define AMDGPU_RAVEN2_RANGE 0x81, 0x91
#define AMDGPU_RENOIR_RANGE 0x91, 0xFF
#define AMDGPU_NAVI10_RANGE 0x01, 0x0A
#define AMDGPU_NAVI12_RANGE 0x0A, 0x14
@@ -130,7 +128,6 @@
#define ASICREV_IS_VEGAM_P(r) ASICREV_IS(r, VEGAM)
#define ASICREV_IS_CARRIZO(r) ASICREV_IS(r, CARRIZO)
#define ASICREV_IS_CARRIZO_BRISTOL(r) ASICREV_IS(r, BRISTOL)
#define ASICREV_IS_STONEY(r) ASICREV_IS(r, STONEY)
#define ASICREV_IS_VEGA10_M(r) ASICREV_IS(r, VEGA10)
@@ -138,11 +135,11 @@
#define ASICREV_IS_VEGA12_P(r) ASICREV_IS(r, VEGA12)
#define ASICREV_IS_VEGA12_p(r) ASICREV_IS(r, VEGA12)
#define ASICREV_IS_VEGA20_P(r) ASICREV_IS(r, VEGA20)
#define ASICREV_IS_ARCTURUS(r) ASICREV_IS(r, ARCTURUS)
#define ASICREV_IS_RAVEN(r) ASICREV_IS(r, RAVEN)
#define ASICREV_IS_RAVEN2(r) ASICREV_IS(r, RAVEN2)
#define ASICREV_IS_ARCTURUS(r) ASICREV_IS(r, ARCTURUS)
#define ASICREV_IS_RENOIR(r) ASICREV_IS(r, RENOIR)
#define ASICREV_IS_NAVI10_P(r) ASICREV_IS(r, NAVI10)
#define ASICREV_IS_NAVI12(r) ASICREV_IS(r, NAVI12)

View File

@@ -1312,6 +1312,11 @@ ChipFamily Gfx9Lib::HwlConvertChipFamily(
m_settings.applyAliasFix = 1;
}
if (ASICREV_IS_RENOIR(uChipRevision))
{
m_settings.isRaven = 1;
}
m_settings.isDcn1 = m_settings.isRaven;
m_settings.metaBaseAlignFix = 1;

View File

@@ -26,6 +26,7 @@
#include "ac_gpu_info.h"
#include "sid.h"
#include "util/macros.h"
#include "util/u_math.h"
#include <stdio.h>
@@ -91,6 +92,14 @@ static bool has_syncobj(int fd)
return value ? true : false;
}
static uint64_t fix_vram_size(uint64_t size)
{
/* The VRAM size is underreported, so we need to fix it, because
* it's used to compute the number of memory modules for harvesting.
*/
return align64(size, 256*1024*1024);
}
bool ac_query_gpu_info(int fd, void *dev_p,
struct radeon_info *info,
struct amdgpu_gpu_info *amdinfo)
@@ -264,7 +273,7 @@ bool ac_query_gpu_info(int fd, void *dev_p,
/* Note: usable_heap_size values can be random and can't be relied on. */
info->gart_size = meminfo.gtt.total_heap_size;
info->vram_size = meminfo.vram.total_heap_size;
info->vram_size = fix_vram_size(meminfo.vram.total_heap_size);
info->vram_vis_size = meminfo.cpu_accessible_vram.total_heap_size;
} else {
/* This is a deprecated interface, which reports usable sizes
@@ -295,7 +304,7 @@ bool ac_query_gpu_info(int fd, void *dev_p,
}
info->gart_size = gtt.heap_size;
info->vram_size = vram.heap_size;
info->vram_size = fix_vram_size(vram.heap_size);
info->vram_vis_size = vram_vis.heap_size;
}
@@ -418,6 +427,9 @@ bool ac_query_gpu_info(int fd, void *dev_p,
}
if (info->chip_class >= GFX10) {
info->tcc_cache_line_size = 128;
/* This is a hack, but it's all we can do without a kernel upgrade. */
info->tcc_harvested =
(info->vram_size / info->num_tcc_blocks) != 512*1024*1024;
} else {
info->tcc_cache_line_size = 64;
}
@@ -449,6 +461,12 @@ bool ac_query_gpu_info(int fd, void *dev_p,
info->num_good_cu_per_sh = info->num_good_compute_units /
(info->max_se * info->max_sh_per_se);
/* Round down to the nearest multiple of 2, because the hw can't
* disable CUs. It can only disable whole WGPs (dual-CUs).
*/
if (info->chip_class >= GFX10)
info->num_good_cu_per_sh -= info->num_good_cu_per_sh % 2;
memcpy(info->si_tile_mode_array, amdinfo->gb_tile_mode,
sizeof(amdinfo->gb_tile_mode));
info->enabled_rb_mask = amdinfo->enabled_rb_pipes_mask;
@@ -460,7 +478,7 @@ bool ac_query_gpu_info(int fd, void *dev_p,
info->gart_page_size = alignment_info.size_remote;
if (info->chip_class == GFX6)
info->gfx_ib_pad_with_type2 = TRUE;
info->gfx_ib_pad_with_type2 = true;
unsigned ib_align = 0;
ib_align = MAX2(ib_align, gfx.ib_start_alignment);
@@ -477,7 +495,8 @@ bool ac_query_gpu_info(int fd, void *dev_p,
if (info->drm_minor >= 31 &&
(info->family == CHIP_RAVEN ||
info->family == CHIP_RAVEN2)) {
info->family == CHIP_RAVEN2 ||
info->family == CHIP_RENOIR)) {
if (info->num_render_backends == 1)
info->use_display_dcc_unaligned = true;
else
@@ -532,6 +551,7 @@ void ac_print_gpu_info(struct radeon_info *info)
printf(" num_sdma_rings = %i\n", info->num_sdma_rings);
printf(" clock_crystal_freq = %i\n", info->clock_crystal_freq);
printf(" tcc_cache_line_size = %u\n", info->tcc_cache_line_size);
printf(" tcc_harvested = %u\n", info->tcc_harvested);
printf(" use_display_dcc_unaligned = %u\n", info->use_display_dcc_unaligned);
printf(" use_display_dcc_with_retile_blit = %u\n", info->use_display_dcc_with_retile_blit);

View File

@@ -58,6 +58,7 @@ struct radeon_info {
uint32_t num_sdma_rings;
uint32_t clock_crystal_freq;
uint32_t tcc_cache_line_size;
bool tcc_harvested;
/* There are 2 display DCC codepaths, because display expects unaligned DCC. */
/* Disable RB and pipe alignment to skip the retile blit. (1 RB chips only) */
@@ -173,7 +174,7 @@ unsigned ac_get_compute_resource_limits(struct radeon_info *info,
unsigned max_waves_per_sh,
unsigned threadgroups_per_cu);
static inline unsigned ac_get_max_simd_waves(enum radeon_family family)
static inline unsigned ac_get_max_wave64_per_simd(enum radeon_family family)
{
switch (family) {
@@ -188,10 +189,26 @@ static inline unsigned ac_get_max_simd_waves(enum radeon_family family)
}
}
static inline uint32_t
ac_get_num_physical_sgprs(enum chip_class chip_class)
static inline unsigned ac_get_num_physical_vgprs(enum chip_class chip_class,
unsigned wave_size)
{
return chip_class >= GFX8 ? 800 : 512;
/* The number is per SIMD. */
if (chip_class >= GFX10)
return wave_size == 32 ? 1024 : 512;
else
return 256;
}
static inline uint32_t
ac_get_num_physical_sgprs(const struct radeon_info *info)
{
/* The number is per SIMD. There is enough SGPRs for the maximum number
* of Wave32, which is double the number for Wave64.
*/
if (info->chip_class >= GFX10)
return 128 * ac_get_max_wave64_per_simd(info->family) * 2;
return info->chip_class >= GFX8 ? 800 : 512;
}
#ifdef __cplusplus

View File

@@ -60,7 +60,8 @@ void
ac_llvm_context_init(struct ac_llvm_context *ctx,
struct ac_llvm_compiler *compiler,
enum chip_class chip_class, enum radeon_family family,
enum ac_float_mode float_mode, unsigned wave_size)
enum ac_float_mode float_mode, unsigned wave_size,
unsigned ballot_mask_bits)
{
LLVMValueRef args[1];
@@ -69,6 +70,7 @@ ac_llvm_context_init(struct ac_llvm_context *ctx,
ctx->chip_class = chip_class;
ctx->family = family;
ctx->wave_size = wave_size;
ctx->ballot_mask_bits = ballot_mask_bits;
ctx->module = ac_create_module(wave_size == 32 ? compiler->tm_wave32
: compiler->tm,
ctx->context);
@@ -93,6 +95,7 @@ ac_llvm_context_init(struct ac_llvm_context *ctx,
ctx->v4f32 = LLVMVectorType(ctx->f32, 4);
ctx->v8i32 = LLVMVectorType(ctx->i32, 8);
ctx->iN_wavemask = LLVMIntTypeInContext(ctx->context, ctx->wave_size);
ctx->iN_ballotmask = LLVMIntTypeInContext(ctx->context, ballot_mask_bits);
ctx->i8_0 = LLVMConstInt(ctx->i8, 0, false);
ctx->i8_1 = LLVMConstInt(ctx->i8, 1, false);
@@ -626,6 +629,22 @@ ac_build_expand(struct ac_llvm_context *ctx,
return ac_build_gather_values(ctx, chan, dst_channels);
}
/* Extract components [start, start + channels) from a vector.
*/
LLVMValueRef
ac_extract_components(struct ac_llvm_context *ctx,
LLVMValueRef value,
unsigned start,
unsigned channels)
{
LLVMValueRef chan[channels];
for (unsigned i = 0; i < channels; i++)
chan[i] = ac_llvm_extract_elem(ctx, value, i + start);
return ac_build_gather_values(ctx, chan, channels);
}
/* Expand a scalar or vector to <4 x type> by filling the remaining channels
* with undef. Extract at most num_channels components from the input.
*/
@@ -2580,6 +2599,8 @@ static const char *get_atomic_name(enum ac_atomic_op op)
case ac_atomic_and: return "and";
case ac_atomic_or: return "or";
case ac_atomic_xor: return "xor";
case ac_atomic_inc_wrap: return "inc";
case ac_atomic_dec_wrap: return "dec";
}
unreachable("bad atomic op");
}
@@ -3840,6 +3861,8 @@ ac_build_readlane(struct ac_llvm_context *ctx, LLVMValueRef src, LLVMValueRef la
LLVMConstInt(ctx->i32, i, 0), "");
}
}
if (LLVMGetTypeKind(src_type) == LLVMPointerTypeKind)
return LLVMBuildIntToPtr(ctx->builder, ret, src_type, "");
return LLVMBuildBitCast(ctx->builder, ret, src_type, "");
}
@@ -4195,13 +4218,47 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
{
LLVMValueRef result, tmp;
if (ctx->chip_class >= GFX10) {
result = inclusive ? src : identity;
if (inclusive) {
result = src;
} else if (ctx->chip_class >= GFX10) {
/* wavefront shift_right by 1 on GFX10 (emulate dpp_wf_sr1) */
LLVMValueRef active, tmp1, tmp2;
LLVMValueRef tid = ac_get_thread_id(ctx);
tmp1 = ac_build_dpp(ctx, identity, src, dpp_row_sr(1), 0xf, 0xf, false);
tmp2 = ac_build_permlane16(ctx, src, (uint64_t)~0, true, false);
if (maxprefix > 32) {
active = LLVMBuildICmp(ctx->builder, LLVMIntEQ, tid,
LLVMConstInt(ctx->i32, 32, false), "");
tmp2 = LLVMBuildSelect(ctx->builder, active,
ac_build_readlane(ctx, src,
LLVMConstInt(ctx->i32, 31, false)),
tmp2, "");
active = LLVMBuildOr(ctx->builder, active,
LLVMBuildICmp(ctx->builder, LLVMIntEQ,
LLVMBuildAnd(ctx->builder, tid,
LLVMConstInt(ctx->i32, 0x1f, false), ""),
LLVMConstInt(ctx->i32, 0x10, false), ""), "");
src = LLVMBuildSelect(ctx->builder, active, tmp2, tmp1, "");
} else if (maxprefix > 16) {
active = LLVMBuildICmp(ctx->builder, LLVMIntEQ, tid,
LLVMConstInt(ctx->i32, 16, false), "");
src = LLVMBuildSelect(ctx->builder, active, tmp2, tmp1, "");
}
result = src;
} else if (ctx->chip_class >= GFX8) {
src = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
result = src;
} else {
if (inclusive)
result = src;
else
result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
if (!inclusive)
src = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
result = src;
}
if (maxprefix <= 1)
return result;
@@ -4227,33 +4284,31 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
return result;
if (ctx->chip_class >= GFX10) {
/* dpp_row_bcast{15,31} are not supported on gfx10. */
LLVMBuilderRef builder = ctx->builder;
LLVMValueRef tid = ac_get_thread_id(ctx);
LLVMValueRef cc;
/* TODO-GFX10: Can we get better code-gen by putting this into
* a branch so that LLVM generates EXEC mask manipulations? */
if (inclusive)
tmp = result;
else
tmp = ac_build_alu_op(ctx, result, src, op);
tmp = ac_build_permlane16(ctx, tmp, ~(uint64_t)0, true, false);
tmp = ac_build_alu_op(ctx, result, tmp, op);
cc = LLVMBuildAnd(builder, tid, LLVMConstInt(ctx->i32, 16, false), "");
cc = LLVMBuildICmp(builder, LLVMIntNE, cc, ctx->i32_0, "");
result = LLVMBuildSelect(builder, cc, tmp, result, "");
LLVMValueRef active;
tmp = ac_build_permlane16(ctx, result, ~(uint64_t)0, true, false);
active = LLVMBuildICmp(ctx->builder, LLVMIntNE,
LLVMBuildAnd(ctx->builder, tid,
LLVMConstInt(ctx->i32, 16, false), ""),
ctx->i32_0, "");
tmp = LLVMBuildSelect(ctx->builder, active, tmp, identity, "");
result = ac_build_alu_op(ctx, result, tmp, op);
if (maxprefix <= 32)
return result;
if (inclusive)
tmp = result;
else
tmp = ac_build_alu_op(ctx, result, src, op);
tmp = ac_build_readlane(ctx, tmp, LLVMConstInt(ctx->i32, 31, false));
tmp = ac_build_alu_op(ctx, result, tmp, op);
cc = LLVMBuildICmp(builder, LLVMIntUGE, tid,
LLVMConstInt(ctx->i32, 32, false), "");
result = LLVMBuildSelect(builder, cc, tmp, result, "");
tmp = ac_build_readlane(ctx, result, LLVMConstInt(ctx->i32, 31, false));
active = LLVMBuildICmp(ctx->builder, LLVMIntUGE, tid,
LLVMConstInt(ctx->i32, 32, false), "");
tmp = LLVMBuildSelect(ctx->builder, active, tmp, identity, "");
result = ac_build_alu_op(ctx, result, tmp, op);
return result;
}

View File

@@ -81,6 +81,7 @@ struct ac_llvm_context {
LLVMTypeRef v4f32;
LLVMTypeRef v8i32;
LLVMTypeRef iN_wavemask;
LLVMTypeRef iN_ballotmask;
LLVMValueRef i8_0;
LLVMValueRef i8_1;
@@ -114,7 +115,9 @@ struct ac_llvm_context {
enum chip_class chip_class;
enum radeon_family family;
unsigned wave_size;
unsigned ballot_mask_bits;
LLVMValueRef lds;
};
@@ -123,7 +126,8 @@ void
ac_llvm_context_init(struct ac_llvm_context *ctx,
struct ac_llvm_compiler *compiler,
enum chip_class chip_class, enum radeon_family family,
enum ac_float_mode float_mode, unsigned wave_size);
enum ac_float_mode float_mode, unsigned wave_size,
unsigned ballot_mask_bits);
void
ac_llvm_context_dispose(struct ac_llvm_context *ctx);
@@ -190,6 +194,13 @@ LLVMValueRef
ac_build_gather_values(struct ac_llvm_context *ctx,
LLVMValueRef *values,
unsigned value_count);
LLVMValueRef
ac_extract_components(struct ac_llvm_context *ctx,
LLVMValueRef value,
unsigned start,
unsigned channels);
LLVMValueRef ac_build_expand_to_vec4(struct ac_llvm_context *ctx,
LLVMValueRef value,
unsigned num_channels);
@@ -516,6 +527,8 @@ enum ac_atomic_op {
ac_atomic_and,
ac_atomic_or,
ac_atomic_xor,
ac_atomic_inc_wrap,
ac_atomic_dec_wrap,
};
enum ac_image_dim {

View File

@@ -132,6 +132,7 @@ const char *ac_get_llvm_processor_name(enum radeon_family family)
case CHIP_VEGA20:
return "gfx906";
case CHIP_RAVEN2:
case CHIP_RENOIR:
return HAVE_LLVM >= 0x0800 ? "gfx909" : "gfx902";
case CHIP_ARCTURUS:
return "gfx908";

View File

@@ -153,12 +153,6 @@ static LLVMBasicBlockRef get_block(struct ac_nir_context *nir,
return (LLVMBasicBlockRef)entry->data;
}
static LLVMValueRef emit_iabs(struct ac_llvm_context *ctx,
LLVMValueRef src0)
{
return ac_build_imax(ctx, src0, LLVMBuildNeg(ctx->builder, src0, ""));
}
static LLVMValueRef get_alu_src(struct ac_nir_context *ctx,
nir_alu_src src,
unsigned num_components)
@@ -193,38 +187,8 @@ static LLVMValueRef get_alu_src(struct ac_nir_context *ctx,
swizzle, "");
}
}
LLVMTypeRef type = LLVMTypeOf(value);
if (LLVMGetTypeKind(type) == LLVMVectorTypeKind)
type = LLVMGetElementType(type);
if (src.abs) {
if (LLVMGetTypeKind(type) == LLVMIntegerTypeKind) {
value = emit_iabs(&ctx->ac, value);
} else {
char name[128];
unsigned fsize = type == ctx->ac.f16 ? 16 :
type == ctx->ac.f32 ? 32 : 64;
if (LLVMGetTypeKind(LLVMTypeOf(value)) == LLVMVectorTypeKind) {
snprintf(name, sizeof(name), "llvm.fabs.v%uf%u",
LLVMGetVectorSize(LLVMTypeOf(value)), fsize);
} else {
snprintf(name, sizeof(name), "llvm.fabs.f%u", fsize);
}
value = ac_build_intrinsic(&ctx->ac, name, LLVMTypeOf(value),
&value, 1, AC_FUNC_ATTR_READNONE);
}
}
if (src.negate) {
if (LLVMGetTypeKind(type) == LLVMIntegerTypeKind)
value = LLVMBuildNeg(ctx->ac.builder, value, "");
else
value = LLVMBuildFNeg(ctx->ac.builder, value, "");
}
assert(!src.negate);
assert(!src.abs);
return value;
}
@@ -314,6 +278,12 @@ static LLVMValueRef emit_bcsel(struct ac_llvm_context *ctx,
ac_to_integer_or_pointer(ctx, src2), "");
}
static LLVMValueRef emit_iabs(struct ac_llvm_context *ctx,
LLVMValueRef src0)
{
return ac_build_imax(ctx, src0, LLVMBuildNeg(ctx->builder, src0, ""));
}
static LLVMValueRef emit_uint_carry(struct ac_llvm_context *ctx,
const char *intrin,
LLVMValueRef src0, LLVMValueRef src1)
@@ -1577,6 +1547,9 @@ static unsigned get_cache_policy(struct ac_nir_context *ctx,
cache_policy |= ac_glc;
}
if (access & ACCESS_STREAM_CACHE_POLICY)
cache_policy |= ac_slc;
return cache_policy;
}
@@ -1668,13 +1641,71 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
}
}
static LLVMValueRef emit_ssbo_comp_swap_64(struct ac_nir_context *ctx,
LLVMValueRef descriptor,
LLVMValueRef offset,
LLVMValueRef compare,
LLVMValueRef exchange)
{
LLVMBasicBlockRef start_block = NULL, then_block = NULL;
if (ctx->abi->robust_buffer_access) {
LLVMValueRef size = ac_llvm_extract_elem(&ctx->ac, descriptor, 2);
LLVMValueRef cond = LLVMBuildICmp(ctx->ac.builder, LLVMIntULT, offset, size, "");
start_block = LLVMGetInsertBlock(ctx->ac.builder);
ac_build_ifcc(&ctx->ac, cond, -1);
then_block = LLVMGetInsertBlock(ctx->ac.builder);
}
LLVMValueRef ptr_parts[2] = {
ac_llvm_extract_elem(&ctx->ac, descriptor, 0),
LLVMBuildAnd(ctx->ac.builder,
ac_llvm_extract_elem(&ctx->ac, descriptor, 1),
LLVMConstInt(ctx->ac.i32, 65535, 0), "")
};
ptr_parts[1] = LLVMBuildTrunc(ctx->ac.builder, ptr_parts[1], ctx->ac.i16, "");
ptr_parts[1] = LLVMBuildSExt(ctx->ac.builder, ptr_parts[1], ctx->ac.i32, "");
offset = LLVMBuildZExt(ctx->ac.builder, offset, ctx->ac.i64, "");
LLVMValueRef ptr = ac_build_gather_values(&ctx->ac, ptr_parts, 2);
ptr = LLVMBuildBitCast(ctx->ac.builder, ptr, ctx->ac.i64, "");
ptr = LLVMBuildAdd(ctx->ac.builder, ptr, offset, "");
ptr = LLVMBuildIntToPtr(ctx->ac.builder, ptr, LLVMPointerType(ctx->ac.i64, AC_ADDR_SPACE_GLOBAL), "");
LLVMValueRef result = ac_build_atomic_cmp_xchg(&ctx->ac, ptr, compare, exchange, "singlethread-one-as");
result = LLVMBuildExtractValue(ctx->ac.builder, result, 0, "");
if (ctx->abi->robust_buffer_access) {
ac_build_endif(&ctx->ac, -1);
LLVMBasicBlockRef incoming_blocks[2] = {
start_block,
then_block,
};
LLVMValueRef incoming_values[2] = {
LLVMConstInt(ctx->ac.i64, 0, 0),
result,
};
LLVMValueRef ret = LLVMBuildPhi(ctx->ac.builder, ctx->ac.i64, "");
LLVMAddIncoming(ret, incoming_values, incoming_blocks, 2);
return ret;
} else {
return result;
}
}
static LLVMValueRef visit_atomic_ssbo(struct ac_nir_context *ctx,
const nir_intrinsic_instr *instr)
{
LLVMTypeRef return_type = LLVMTypeOf(get_src(ctx, instr->src[2]));
const char *op;
char name[64], type[8];
LLVMValueRef params[6];
LLVMValueRef params[6], descriptor;
int arg_count = 0;
switch (instr->intrinsic) {
@@ -1712,13 +1743,22 @@ static LLVMValueRef visit_atomic_ssbo(struct ac_nir_context *ctx,
abort();
}
descriptor = ctx->abi->load_ssbo(ctx->abi,
get_src(ctx, instr->src[0]),
true);
if (instr->intrinsic == nir_intrinsic_ssbo_atomic_comp_swap &&
return_type == ctx->ac.i64) {
return emit_ssbo_comp_swap_64(ctx, descriptor,
get_src(ctx, instr->src[1]),
get_src(ctx, instr->src[2]),
get_src(ctx, instr->src[3]));
}
if (instr->intrinsic == nir_intrinsic_ssbo_atomic_comp_swap) {
params[arg_count++] = ac_llvm_extract_elem(&ctx->ac, get_src(ctx, instr->src[3]), 0);
}
params[arg_count++] = ac_llvm_extract_elem(&ctx->ac, get_src(ctx, instr->src[2]), 0);
params[arg_count++] = ctx->abi->load_ssbo(ctx->abi,
get_src(ctx, instr->src[0]),
true);
params[arg_count++] = descriptor;
if (HAVE_LLVM >= 0x900) {
/* XXX: The new raw/struct atomic intrinsics are buggy with
@@ -2399,7 +2439,7 @@ static void get_image_coords(struct ac_nir_context *ctx,
fmask_load_address[2],
sample_index,
get_sampler_desc(ctx, nir_instr_as_deref(instr->src[0].ssa->parent_instr),
AC_DESC_FMASK, &instr->instr, false, false));
AC_DESC_FMASK, &instr->instr, true, false));
}
if (count == 1 && !gfx9_1d) {
if (instr->src[1].ssa->num_components)
@@ -2638,6 +2678,27 @@ static LLVMValueRef visit_image_atomic(struct ac_nir_context *ctx,
atomic_name = "cmpswap";
atomic_subop = 0; /* not used */
break;
case nir_intrinsic_bindless_image_atomic_inc_wrap:
case nir_intrinsic_image_deref_atomic_inc_wrap: {
atomic_name = "inc";
atomic_subop = ac_atomic_inc_wrap;
/* ATOMIC_INC instruction does:
* value = (value + 1) % (data + 1)
* but we want:
* value = (value + 1) % data
* So replace 'data' by 'data - 1'.
*/
ctx->ssa_defs[instr->src[3].ssa->index] =
LLVMBuildSub(ctx->ac.builder,
ctx->ssa_defs[instr->src[3].ssa->index],
ctx->ac.i32_1, "");
break;
}
case nir_intrinsic_bindless_image_atomic_dec_wrap:
case nir_intrinsic_image_deref_atomic_dec_wrap:
atomic_name = "dec";
atomic_subop = ac_atomic_dec_wrap;
break;
default:
abort();
}
@@ -3041,6 +3102,9 @@ static LLVMValueRef barycentric_at_sample(struct ac_nir_context *ctx,
unsigned mode,
LLVMValueRef sample_id)
{
if (ctx->abi->interp_at_sample_force_center)
return barycentric_center(ctx, mode);
LLVMValueRef halfval = LLVMConstReal(ctx->ac.f32, 0.5f);
/* fetch sample ID */
@@ -3139,6 +3203,8 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
switch (instr->intrinsic) {
case nir_intrinsic_ballot:
result = ac_build_ballot(&ctx->ac, get_src(ctx, instr->src[0]));
if (ctx->ac.ballot_mask_bits > ctx->ac.wave_size)
result = LLVMBuildZExt(ctx->ac.builder, result, ctx->ac.iN_ballotmask, "");
break;
case nir_intrinsic_read_invocation:
result = ac_build_readlane(&ctx->ac, get_src(ctx, instr->src[0]),
@@ -3247,6 +3313,10 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
case nir_intrinsic_load_color1:
result = ctx->abi->color1;
break;
case nir_intrinsic_load_user_data_amd:
assert(LLVMTypeOf(ctx->abi->user_data) == ctx->ac.v4i32);
result = ctx->abi->user_data;
break;
case nir_intrinsic_load_instance_id:
result = ctx->abi->instance_id;
break;
@@ -3342,6 +3412,8 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
case nir_intrinsic_bindless_image_atomic_xor:
case nir_intrinsic_bindless_image_atomic_exchange:
case nir_intrinsic_bindless_image_atomic_comp_swap:
case nir_intrinsic_bindless_image_atomic_inc_wrap:
case nir_intrinsic_bindless_image_atomic_dec_wrap:
result = visit_image_atomic(ctx, instr, true);
break;
case nir_intrinsic_image_deref_atomic_add:
@@ -3352,6 +3424,8 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
case nir_intrinsic_image_deref_atomic_xor:
case nir_intrinsic_image_deref_atomic_exchange:
case nir_intrinsic_image_deref_atomic_comp_swap:
case nir_intrinsic_image_deref_atomic_inc_wrap:
case nir_intrinsic_image_deref_atomic_dec_wrap:
result = visit_image_atomic(ctx, instr, false);
break;
case nir_intrinsic_bindless_image_size:
@@ -3463,10 +3537,16 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
result = ctx->abi->load_tess_coord(ctx->abi);
break;
case nir_intrinsic_load_tess_level_outer:
result = ctx->abi->load_tess_level(ctx->abi, VARYING_SLOT_TESS_LEVEL_OUTER);
result = ctx->abi->load_tess_level(ctx->abi, VARYING_SLOT_TESS_LEVEL_OUTER, false);
break;
case nir_intrinsic_load_tess_level_inner:
result = ctx->abi->load_tess_level(ctx->abi, VARYING_SLOT_TESS_LEVEL_INNER);
result = ctx->abi->load_tess_level(ctx->abi, VARYING_SLOT_TESS_LEVEL_INNER, false);
break;
case nir_intrinsic_load_tess_level_outer_default:
result = ctx->abi->load_tess_level(ctx->abi, VARYING_SLOT_TESS_LEVEL_OUTER, true);
break;
case nir_intrinsic_load_tess_level_inner_default:
result = ctx->abi->load_tess_level(ctx->abi, VARYING_SLOT_TESS_LEVEL_INNER, true);
break;
case nir_intrinsic_load_patch_vertices_in:
result = ctx->abi->load_patch_vertices_in(ctx->abi);

View File

@@ -65,6 +65,7 @@ struct ac_shader_abi {
LLVMValueRef prim_mask;
LLVMValueRef color0;
LLVMValueRef color1;
LLVMValueRef user_data;
/* CS */
LLVMValueRef local_invocation_ids;
LLVMValueRef num_work_groups;
@@ -138,7 +139,8 @@ struct ac_shader_abi {
LLVMValueRef (*load_patch_vertices_in)(struct ac_shader_abi *abi);
LLVMValueRef (*load_tess_level)(struct ac_shader_abi *abi,
unsigned varying_id);
unsigned varying_id,
bool load_default_state);
LLVMValueRef (*load_ubo)(struct ac_shader_abi *abi, LLVMValueRef index);
@@ -203,11 +205,15 @@ struct ac_shader_abi {
/* Whether to clamp the shadow reference value to [0,1]on GFX8. Radeonsi currently
* uses it due to promoting D16 to D32, but radv needs it off. */
bool clamp_shadow_reference;
bool interp_at_sample_force_center;
/* Whether to workaround GFX9 ignoring the stride for the buffer size if IDXEN=0
* and LLVM optimizes an indexed load with constant index to IDXEN=0. */
bool gfx9_stride_size_workaround;
bool gfx9_stride_size_workaround_for_atomic;
/* Whether bounds checks are required */
bool robust_buffer_access;
};
#endif /* AC_SHADER_ABI_H */

View File

@@ -343,7 +343,7 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib,
AddrSurfInfoIn->flags.depth &&
surf_level->mode == RADEON_SURF_MODE_2D &&
level == 0) {
AddrHtileIn->flags.tcCompatible = AddrSurfInfoIn->flags.tcCompatible;
AddrHtileIn->flags.tcCompatible = AddrSurfInfoOut->tcCompatible;
AddrHtileIn->pitch = AddrSurfInfoOut->pitch;
AddrHtileIn->height = AddrSurfInfoOut->height;
AddrHtileIn->numSlices = AddrSurfInfoOut->depth;
@@ -777,19 +777,12 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
if (level > 0)
continue;
/* Check that we actually got a TC-compatible HTILE if
* we requested it (only for level 0, since we're not
* supporting HTILE on higher mip levels anyway). */
assert(AddrSurfInfoOut.tcCompatible ||
!AddrSurfInfoIn.flags.tcCompatible ||
AddrSurfInfoIn.flags.matchStencilTileCfg);
if (!AddrSurfInfoOut.tcCompatible) {
AddrSurfInfoIn.flags.tcCompatible = 0;
surf->flags &= ~RADEON_SURF_TC_COMPATIBLE_HTILE;
}
if (AddrSurfInfoIn.flags.matchStencilTileCfg) {
if (!AddrSurfInfoOut.tcCompatible) {
AddrSurfInfoIn.flags.tcCompatible = 0;
surf->flags &= ~RADEON_SURF_TC_COMPATIBLE_HTILE;
}
AddrSurfInfoIn.flags.matchStencilTileCfg = 0;
AddrSurfInfoIn.tileIndex = AddrSurfInfoOut.tileIndex;
stencil_tile_idx = AddrSurfInfoOut.stencilTileIdx;
@@ -1265,7 +1258,7 @@ static int gfx9_compute_miptree(ADDR_HANDLE addrlib,
return ret;
surf->u.gfx9.dcc_retile_map[index * 2] = addrout.addr;
if (addrout.addr > USHRT_MAX)
if (addrout.addr > UINT16_MAX)
surf->u.gfx9.dcc_retile_use_uint16 = false;
/* Compute dst DCC address */
@@ -1278,7 +1271,7 @@ static int gfx9_compute_miptree(ADDR_HANDLE addrlib,
return ret;
surf->u.gfx9.dcc_retile_map[index * 2 + 1] = addrout.addr;
if (addrout.addr > USHRT_MAX)
if (addrout.addr > UINT16_MAX)
surf->u.gfx9.dcc_retile_use_uint16 = false;
assert(index * 2 + 1 < surf->u.gfx9.dcc_retile_num_elements);

View File

@@ -97,6 +97,7 @@ enum radeon_family {
CHIP_VEGA20,
CHIP_RAVEN,
CHIP_RAVEN2,
CHIP_RENOIR,
CHIP_ARCTURUS,
CHIP_NAVI10,
CHIP_NAVI12,

View File

@@ -181,6 +181,7 @@
/* fix CP DMA before uncommenting: */
/*#define PKT3_EVENT_WRITE_EOS 0x48*/ /* not on GFX9 */
#define PKT3_RELEASE_MEM 0x49 /* GFX9+ [any ring] or GFX8 [compute ring only] */
#define PKT3_CONTEXT_REG_RMW 0x51 /* older firmware versions on older chips don't have this */
#define PKT3_ONE_REG_WRITE 0x57 /* not on CIK */
#define PKT3_ACQUIRE_MEM 0x58 /* new for CIK */
#define PKT3_REWIND 0x59 /* VI+ [any ring] or CIK [compute ring only] */

View File

@@ -153,12 +153,11 @@ libvulkan_radeon = shared_library(
],
link_with : [
libamd_common, libamdgpu_addrlib, libvulkan_wsi,
libmesa_util, libxmlconfig
],
dependencies : [
dep_llvm, dep_libdrm_amdgpu, dep_thread, dep_elf, dep_dl, dep_m,
dep_valgrind, radv_deps,
idep_nir, idep_vulkan_util, idep_amdgfxregs_h,
idep_mesautil, idep_nir, idep_vulkan_util, idep_amdgfxregs_h, idep_xmlconfig,
],
c_args : [c_vis_args, no_override_init_args, radv_flags],
cpp_args : [cpp_vis_args, radv_flags],

View File

@@ -54,7 +54,9 @@ enum {
static void radv_handle_image_transition(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageLayout src_layout,
bool src_render_loop,
VkImageLayout dst_layout,
bool dst_render_loop,
uint32_t src_family,
uint32_t dst_family,
const VkImageSubresourceRange *range,
@@ -990,13 +992,15 @@ radv_emit_rbplus_state(struct radv_cmd_buffer *cmd_buffer)
return;
struct radv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
unsigned sx_ps_downconvert = 0;
unsigned sx_blend_opt_epsilon = 0;
unsigned sx_blend_opt_control = 0;
if (!cmd_buffer->state.attachments || !subpass)
return;
for (unsigned i = 0; i < subpass->color_count; ++i) {
if (subpass->color_attachments[i].attachment == VK_ATTACHMENT_UNUSED) {
sx_blend_opt_control |= S_02875C_MRT0_COLOR_OPT_DISABLE(1) << (i * 4);
@@ -1005,7 +1009,7 @@ radv_emit_rbplus_state(struct radv_cmd_buffer *cmd_buffer)
}
int idx = subpass->color_attachments[i].attachment;
struct radv_color_buffer_info *cb = &framebuffer->attachments[idx].cb;
struct radv_color_buffer_info *cb = &cmd_buffer->state.attachments[idx].cb;
unsigned format = G_028C70_FORMAT(cb->cb_color_info);
unsigned swap = G_028C70_COMP_SWAP(cb->cb_color_info);
@@ -1284,16 +1288,16 @@ radv_emit_depth_bias(struct radv_cmd_buffer *cmd_buffer)
static void
radv_emit_fb_color_state(struct radv_cmd_buffer *cmd_buffer,
int index,
struct radv_attachment_info *att,
struct radv_color_buffer_info *cb,
struct radv_image_view *iview,
VkImageLayout layout)
VkImageLayout layout,
bool in_render_loop)
{
bool is_vi = cmd_buffer->device->physical_device->rad_info.chip_class >= GFX8;
struct radv_color_buffer_info *cb = &att->cb;
uint32_t cb_color_info = cb->cb_color_info;
struct radv_image *image = iview->image;
if (!radv_layout_dcc_compressed(image, layout,
if (!radv_layout_dcc_compressed(cmd_buffer->device, image, layout, in_render_loop,
radv_image_queue_family_mask(image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index))) {
@@ -1395,7 +1399,7 @@ static void
radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer,
struct radv_ds_buffer_info *ds,
struct radv_image *image, VkImageLayout layout,
bool requires_cond_exec)
bool in_render_loop, bool requires_cond_exec)
{
uint32_t db_z_info = ds->db_z_info;
uint32_t db_z_info_reg;
@@ -1404,7 +1408,7 @@ radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer,
!radv_image_is_tc_compat_htile(image))
return;
if (!radv_layout_has_htile(image, layout,
if (!radv_layout_has_htile(image, layout, in_render_loop,
radv_image_queue_family_mask(image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index))) {
@@ -1441,12 +1445,13 @@ static void
radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
struct radv_ds_buffer_info *ds,
struct radv_image *image,
VkImageLayout layout)
VkImageLayout layout,
bool in_render_loop)
{
uint32_t db_z_info = ds->db_z_info;
uint32_t db_stencil_info = ds->db_stencil_info;
if (!radv_layout_has_htile(image, layout,
if (!radv_layout_has_htile(image, layout, in_render_loop,
radv_image_queue_family_mask(image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index))) {
@@ -1514,7 +1519,7 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
}
/* Update the ZRANGE_PRECISION value for the TC-compat bug. */
radv_update_zrange_precision(cmd_buffer, ds, image, layout, true);
radv_update_zrange_precision(cmd_buffer, ds, image, layout, in_render_loop, true);
radeon_set_context_reg(cmd_buffer->cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
ds->pa_su_poly_offset_db_fmt_cntl);
@@ -1530,26 +1535,33 @@ radv_update_bound_fast_clear_ds(struct radv_cmd_buffer *cmd_buffer,
VkClearDepthStencilValue ds_clear_value,
VkImageAspectFlags aspects)
{
struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
struct radeon_cmdbuf *cs = cmd_buffer->cs;
struct radv_attachment_info *att;
uint32_t att_idx;
if (!framebuffer || !subpass)
if (!cmd_buffer->state.attachments || !subpass)
return;
if (!subpass->depth_stencil_attachment)
return;
att_idx = subpass->depth_stencil_attachment->attachment;
att = &framebuffer->attachments[att_idx];
if (att->attachment->image != image)
if (cmd_buffer->state.attachments[att_idx].iview->image != image)
return;
radeon_set_context_reg_seq(cs, R_028028_DB_STENCIL_CLEAR, 2);
radeon_emit(cs, ds_clear_value.stencil);
radeon_emit(cs, fui(ds_clear_value.depth));
if (aspects == (VK_IMAGE_ASPECT_DEPTH_BIT |
VK_IMAGE_ASPECT_STENCIL_BIT)) {
radeon_set_context_reg_seq(cs, R_028028_DB_STENCIL_CLEAR, 2);
radeon_emit(cs, ds_clear_value.stencil);
radeon_emit(cs, fui(ds_clear_value.depth));
} else if (aspects == VK_IMAGE_ASPECT_DEPTH_BIT) {
radeon_set_context_reg_seq(cs, R_02802C_DB_DEPTH_CLEAR, 1);
radeon_emit(cs, fui(ds_clear_value.depth));
} else {
assert(aspects == VK_IMAGE_ASPECT_STENCIL_BIT);
radeon_set_context_reg_seq(cs, R_028028_DB_STENCIL_CLEAR, 1);
radeon_emit(cs, ds_clear_value.stencil);
}
/* Update the ZRANGE_PRECISION value for the TC-compat bug. This is
* only needed when clearing Z to 0.0.
@@ -1557,9 +1569,10 @@ radv_update_bound_fast_clear_ds(struct radv_cmd_buffer *cmd_buffer,
if ((aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
ds_clear_value.depth == 0.0) {
VkImageLayout layout = subpass->depth_stencil_attachment->layout;
bool in_render_loop = subpass->depth_stencil_attachment->in_render_loop;
radv_update_zrange_precision(cmd_buffer, &att->ds, image,
layout, false);
radv_update_zrange_precision(cmd_buffer, &cmd_buffer->state.attachments[att_idx].ds, image,
layout, in_render_loop, false);
}
cmd_buffer->state.context_roll_without_scissor_emitted = true;
@@ -1780,21 +1793,18 @@ radv_update_bound_fast_clear_color(struct radv_cmd_buffer *cmd_buffer,
int cb_idx,
uint32_t color_values[2])
{
struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
struct radeon_cmdbuf *cs = cmd_buffer->cs;
struct radv_attachment_info *att;
uint32_t att_idx;
if (!framebuffer || !subpass)
if (!cmd_buffer->state.attachments || !subpass)
return;
att_idx = subpass->color_attachments[cb_idx].attachment;
if (att_idx == VK_ATTACHMENT_UNUSED)
return;
att = &framebuffer->attachments[att_idx];
if (att->attachment->image != image)
if (cmd_buffer->state.attachments[att_idx].iview->image != image)
return;
radeon_set_context_reg_seq(cs, R_028C8C_CB_COLOR0_CLEAR_WORD0 + cb_idx * 0x3c, 2);
@@ -1919,15 +1929,15 @@ radv_emit_framebuffer_state(struct radv_cmd_buffer *cmd_buffer)
}
int idx = subpass->color_attachments[i].attachment;
struct radv_attachment_info *att = &framebuffer->attachments[idx];
struct radv_image_view *iview = att->attachment;
struct radv_image_view *iview = cmd_buffer->state.attachments[idx].iview;
VkImageLayout layout = subpass->color_attachments[i].layout;
bool in_render_loop = subpass->color_attachments[i].in_render_loop;
radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs, att->attachment->bo);
radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs, iview->bo);
assert(att->attachment->aspect_mask & (VK_IMAGE_ASPECT_COLOR_BIT | VK_IMAGE_ASPECT_PLANE_0_BIT |
assert(iview->aspect_mask & (VK_IMAGE_ASPECT_COLOR_BIT | VK_IMAGE_ASPECT_PLANE_0_BIT |
VK_IMAGE_ASPECT_PLANE_1_BIT | VK_IMAGE_ASPECT_PLANE_2_BIT));
radv_emit_fb_color_state(cmd_buffer, i, att, iview, layout);
radv_emit_fb_color_state(cmd_buffer, i, &cmd_buffer->state.attachments[idx].cb, iview, layout, in_render_loop);
radv_load_color_clear_metadata(cmd_buffer, iview, i);
}
@@ -1935,21 +1945,21 @@ radv_emit_framebuffer_state(struct radv_cmd_buffer *cmd_buffer)
if (subpass->depth_stencil_attachment) {
int idx = subpass->depth_stencil_attachment->attachment;
VkImageLayout layout = subpass->depth_stencil_attachment->layout;
struct radv_attachment_info *att = &framebuffer->attachments[idx];
struct radv_image *image = att->attachment->image;
radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs, att->attachment->bo);
bool in_render_loop = subpass->depth_stencil_attachment->in_render_loop;
struct radv_image *image = cmd_buffer->state.attachments[idx].iview->image;
radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs, cmd_buffer->state.attachments[idx].iview->bo);
ASSERTED uint32_t queue_mask = radv_image_queue_family_mask(image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
/* We currently don't support writing decompressed HTILE */
assert(radv_layout_has_htile(image, layout, queue_mask) ==
radv_layout_is_htile_compressed(image, layout, queue_mask));
assert(radv_layout_has_htile(image, layout, in_render_loop, queue_mask) ==
radv_layout_is_htile_compressed(image, layout, in_render_loop, queue_mask));
radv_emit_fb_ds_state(cmd_buffer, &att->ds, image, layout);
radv_emit_fb_ds_state(cmd_buffer, &cmd_buffer->state.attachments[idx].ds, image, layout, in_render_loop);
if (att->ds.offset_scale != cmd_buffer->state.offset_scale) {
if (cmd_buffer->state.attachments[idx].ds.offset_scale != cmd_buffer->state.offset_scale) {
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS;
cmd_buffer->state.offset_scale = att->ds.offset_scale;
cmd_buffer->state.offset_scale = cmd_buffer->state.attachments[idx].ds.offset_scale;
}
radv_load_ds_clear_metadata(cmd_buffer, image);
} else {
@@ -2271,14 +2281,15 @@ radv_flush_constants(struct radv_cmd_buffer *cmd_buffer,
return;
radv_foreach_stage(stage, stages) {
if (!pipeline->shaders[stage])
shader = radv_get_shader(pipeline, stage);
if (!shader)
continue;
need_push_constants |= pipeline->shaders[stage]->info.info.loads_push_constants;
need_push_constants |= pipeline->shaders[stage]->info.info.loads_dynamic_offsets;
need_push_constants |= shader->info.info.loads_push_constants;
need_push_constants |= shader->info.info.loads_dynamic_offsets;
uint8_t base = pipeline->shaders[stage]->info.info.base_inline_push_consts;
uint8_t count = pipeline->shaders[stage]->info.info.num_inline_push_consts;
uint8_t base = shader->info.info.base_inline_push_consts;
uint8_t count = shader->info.info.num_inline_push_consts;
radv_emit_inline_push_consts(cmd_buffer, pipeline, stage,
AC_UD_INLINE_PUSH_CONSTANTS,
@@ -2847,7 +2858,7 @@ radv_get_attachment_sample_locations(struct radv_cmd_buffer *cmd_buffer,
{
struct radv_cmd_state *state = &cmd_buffer->state;
uint32_t subpass_id = radv_get_subpass_id(cmd_buffer);
struct radv_image_view *view = state->framebuffer->attachments[att_idx].attachment;
struct radv_image_view *view = state->attachments[att_idx].iview;
if (view->image->info.samples == 1)
return NULL;
@@ -2886,7 +2897,7 @@ static void radv_handle_subpass_image_transition(struct radv_cmd_buffer *cmd_buf
bool begin_subpass)
{
unsigned idx = att.attachment;
struct radv_image_view *view = cmd_buffer->state.framebuffer->attachments[idx].attachment;
struct radv_image_view *view = cmd_buffer->state.attachments[idx].iview;
struct radv_sample_locations_state *sample_locs;
VkImageSubresourceRange range;
range.aspectMask = 0;
@@ -2915,9 +2926,12 @@ static void radv_handle_subpass_image_transition(struct radv_cmd_buffer *cmd_buf
radv_handle_image_transition(cmd_buffer,
view->image,
cmd_buffer->state.attachments[idx].current_layout,
att.layout, 0, 0, &range, sample_locs);
cmd_buffer->state.attachments[idx].current_in_render_loop,
att.layout, att.in_render_loop,
0, 0, &range, sample_locs);
cmd_buffer->state.attachments[idx].current_layout = att.layout;
cmd_buffer->state.attachments[idx].current_in_render_loop = att.in_render_loop;
}
@@ -2940,7 +2954,6 @@ radv_cmd_state_setup_sample_locations(struct radv_cmd_buffer *cmd_buffer,
vk_find_struct_const(info->pNext,
RENDER_PASS_SAMPLE_LOCATIONS_BEGIN_INFO_EXT);
struct radv_cmd_state *state = &cmd_buffer->state;
struct radv_framebuffer *framebuffer = state->framebuffer;
if (!sample_locs) {
state->subpass_sample_locs = NULL;
@@ -2951,8 +2964,7 @@ radv_cmd_state_setup_sample_locations(struct radv_cmd_buffer *cmd_buffer,
const VkAttachmentSampleLocationsEXT *att_sample_locs =
&sample_locs->pAttachmentInitialSampleLocations[i];
uint32_t att_idx = att_sample_locs->attachmentIndex;
struct radv_attachment_info *att = &framebuffer->attachments[att_idx];
struct radv_image *image = att->attachment->image;
struct radv_image *image = cmd_buffer->state.attachments[att_idx].iview->image;
assert(vk_format_is_depth_or_stencil(image->vk_format));
@@ -3020,6 +3032,13 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer *cmd_buffer,
const VkRenderPassBeginInfo *info)
{
struct radv_cmd_state *state = &cmd_buffer->state;
const struct VkRenderPassAttachmentBeginInfoKHR *attachment_info = NULL;
if (info) {
attachment_info = vk_find_struct_const(info->pNext,
RENDER_PASS_ATTACHMENT_BEGIN_INFO_KHR);
}
if (pass->attachment_count == 0) {
state->attachments = NULL;
@@ -3069,6 +3088,20 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer *cmd_buffer,
state->attachments[i].current_layout = att->initial_layout;
state->attachments[i].sample_location.count = 0;
struct radv_image_view *iview;
if (attachment_info && attachment_info->attachmentCount > i) {
iview = radv_image_view_from_handle(attachment_info->pAttachments[i]);
} else {
iview = state->framebuffer->attachments[i];
}
state->attachments[i].iview = iview;
if (iview->aspect_mask & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT)) {
radv_initialise_ds_surface(cmd_buffer->device, &state->attachments[i].ds, iview);
} else {
radv_initialise_color_surface(cmd_buffer->device, &state->attachments[i].cb, iview);
}
}
return VK_SUCCESS;
@@ -3188,9 +3221,11 @@ VkResult radv_BeginCommandBuffer(
struct radv_subpass *subpass =
&cmd_buffer->state.pass->subpasses[pBeginInfo->pInheritanceInfo->subpass];
result = radv_cmd_state_setup_attachments(cmd_buffer, cmd_buffer->state.pass, NULL);
if (result != VK_SUCCESS)
return result;
if (cmd_buffer->state.framebuffer) {
result = radv_cmd_state_setup_attachments(cmd_buffer, cmd_buffer->state.pass, NULL);
if (result != VK_SUCCESS)
return result;
}
radv_cmd_buffer_set_subpass(cmd_buffer, subpass);
}
@@ -3300,7 +3335,7 @@ void radv_CmdBindIndexBuffer(
cmd_buffer->state.index_va = radv_buffer_get_va(index_buffer->bo);
cmd_buffer->state.index_va += index_buffer->offset + offset;
int index_size = radv_get_vgt_index_size(indexType);
int index_size = radv_get_vgt_index_size(vk_to_index_type(indexType));
cmd_buffer->state.max_index_count = (index_buffer->size - offset) / index_size;
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_INDEX_BUFFER;
radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs, index_buffer->bo);
@@ -3350,7 +3385,13 @@ void radv_CmdBindDescriptorSets(
for (unsigned i = 0; i < descriptorSetCount; ++i) {
unsigned idx = i + firstSet;
RADV_FROM_HANDLE(radv_descriptor_set, set, pDescriptorSets[i]);
radv_bind_descriptor_set(cmd_buffer, pipelineBindPoint, set, idx);
/* If the set is already bound we only need to update the
* (potentially changed) dynamic offsets. */
if (descriptors_state->sets[idx] != set ||
!(descriptors_state->valid & (1u << idx))) {
radv_bind_descriptor_set(cmd_buffer, pipelineBindPoint, set, idx);
}
for(unsigned j = 0; j < set->layout->dynamic_offset_count; ++j, ++dyn_idx) {
unsigned idx = j + layout->set[i + firstSet].dynamic_offset_start;
@@ -5051,7 +5092,9 @@ static void radv_initialize_htile(struct radv_cmd_buffer *cmd_buffer,
static void radv_handle_depth_image_transition(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageLayout src_layout,
bool src_render_loop,
VkImageLayout dst_layout,
bool dst_render_loop,
unsigned src_queue_mask,
unsigned dst_queue_mask,
const VkImageSubresourceRange *range,
@@ -5063,18 +5106,18 @@ static void radv_handle_depth_image_transition(struct radv_cmd_buffer *cmd_buffe
if (src_layout == VK_IMAGE_LAYOUT_UNDEFINED) {
uint32_t clear_value = vk_format_is_stencil(image->vk_format) ? 0xfffff30f : 0xfffc000f;
if (radv_layout_is_htile_compressed(image, dst_layout,
if (radv_layout_is_htile_compressed(image, dst_layout, dst_render_loop,
dst_queue_mask)) {
clear_value = 0;
}
radv_initialize_htile(cmd_buffer, image, range, clear_value);
} else if (!radv_layout_is_htile_compressed(image, src_layout, src_queue_mask) &&
radv_layout_is_htile_compressed(image, dst_layout, dst_queue_mask)) {
} else if (!radv_layout_is_htile_compressed(image, src_layout, src_render_loop, src_queue_mask) &&
radv_layout_is_htile_compressed(image, dst_layout, dst_render_loop, dst_queue_mask)) {
uint32_t clear_value = vk_format_is_stencil(image->vk_format) ? 0xfffff30f : 0xfffc000f;
radv_initialize_htile(cmd_buffer, image, range, clear_value);
} else if (radv_layout_is_htile_compressed(image, src_layout, src_queue_mask) &&
!radv_layout_is_htile_compressed(image, dst_layout, dst_queue_mask)) {
} else if (radv_layout_is_htile_compressed(image, src_layout, src_render_loop, src_queue_mask) &&
!radv_layout_is_htile_compressed(image, dst_layout, dst_render_loop, dst_queue_mask)) {
VkImageSubresourceRange local_range = *range;
local_range.aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT;
local_range.baseMipLevel = 0;
@@ -5178,7 +5221,9 @@ void radv_initialize_dcc(struct radv_cmd_buffer *cmd_buffer,
static void radv_init_color_image_metadata(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageLayout src_layout,
bool src_render_loop,
VkImageLayout dst_layout,
bool dst_render_loop,
unsigned src_queue_mask,
unsigned dst_queue_mask,
const VkImageSubresourceRange *range)
@@ -5202,7 +5247,8 @@ static void radv_init_color_image_metadata(struct radv_cmd_buffer *cmd_buffer,
uint32_t value = 0xffffffffu; /* Fully expanded mode. */
bool need_decompress_pass = false;
if (radv_layout_dcc_compressed(image, dst_layout,
if (radv_layout_dcc_compressed(cmd_buffer->device, image, dst_layout,
dst_render_loop,
dst_queue_mask)) {
value = 0x20202020u;
need_decompress_pass = true;
@@ -5228,14 +5274,17 @@ static void radv_init_color_image_metadata(struct radv_cmd_buffer *cmd_buffer,
static void radv_handle_color_image_transition(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageLayout src_layout,
bool src_render_loop,
VkImageLayout dst_layout,
bool dst_render_loop,
unsigned src_queue_mask,
unsigned dst_queue_mask,
const VkImageSubresourceRange *range)
{
if (src_layout == VK_IMAGE_LAYOUT_UNDEFINED) {
radv_init_color_image_metadata(cmd_buffer, image,
src_layout, dst_layout,
src_layout, src_render_loop,
dst_layout, dst_render_loop,
src_queue_mask, dst_queue_mask,
range);
return;
@@ -5244,18 +5293,18 @@ static void radv_handle_color_image_transition(struct radv_cmd_buffer *cmd_buffe
if (radv_dcc_enabled(image, range->baseMipLevel)) {
if (src_layout == VK_IMAGE_LAYOUT_PREINITIALIZED) {
radv_initialize_dcc(cmd_buffer, image, range, 0xffffffffu);
} else if (radv_layout_dcc_compressed(image, src_layout, src_queue_mask) &&
!radv_layout_dcc_compressed(image, dst_layout, dst_queue_mask)) {
} else if (radv_layout_dcc_compressed(cmd_buffer->device, image, src_layout, src_render_loop, src_queue_mask) &&
!radv_layout_dcc_compressed(cmd_buffer->device, image, dst_layout, dst_render_loop, dst_queue_mask)) {
radv_decompress_dcc(cmd_buffer, image, range);
} else if (radv_layout_can_fast_clear(image, src_layout, src_queue_mask) &&
!radv_layout_can_fast_clear(image, dst_layout, dst_queue_mask)) {
} else if (radv_layout_can_fast_clear(image, src_layout, src_render_loop, src_queue_mask) &&
!radv_layout_can_fast_clear(image, dst_layout, dst_render_loop, dst_queue_mask)) {
radv_fast_clear_flush_image_inplace(cmd_buffer, image, range);
}
} else if (radv_image_has_cmask(image) || radv_image_has_fmask(image)) {
bool fce_eliminate = false, fmask_expand = false;
if (radv_layout_can_fast_clear(image, src_layout, src_queue_mask) &&
!radv_layout_can_fast_clear(image, dst_layout, dst_queue_mask)) {
if (radv_layout_can_fast_clear(image, src_layout, src_render_loop, src_queue_mask) &&
!radv_layout_can_fast_clear(image, dst_layout, dst_render_loop, dst_queue_mask)) {
fce_eliminate = true;
}
@@ -5280,7 +5329,9 @@ static void radv_handle_color_image_transition(struct radv_cmd_buffer *cmd_buffe
static void radv_handle_image_transition(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageLayout src_layout,
bool src_render_loop,
VkImageLayout dst_layout,
bool dst_render_loop,
uint32_t src_family,
uint32_t dst_family,
const VkImageSubresourceRange *range,
@@ -5319,12 +5370,14 @@ static void radv_handle_image_transition(struct radv_cmd_buffer *cmd_buffer,
if (vk_format_is_depth(image->vk_format)) {
radv_handle_depth_image_transition(cmd_buffer, image,
src_layout, dst_layout,
src_layout, src_render_loop,
dst_layout, dst_render_loop,
src_queue_mask, dst_queue_mask,
range, sample_locs);
} else {
radv_handle_color_image_transition(cmd_buffer, image,
src_layout, dst_layout,
src_layout, src_render_loop,
dst_layout, dst_render_loop,
src_queue_mask, dst_queue_mask,
range);
}
@@ -5421,7 +5474,9 @@ radv_barrier(struct radv_cmd_buffer *cmd_buffer,
radv_handle_image_transition(cmd_buffer, image,
pImageMemoryBarriers[i].oldLayout,
false, /* Outside of a renderpass we are never in a renderloop */
pImageMemoryBarriers[i].newLayout,
false, /* Outside of a renderpass we are never in a renderloop */
pImageMemoryBarriers[i].srcQueueFamilyIndex,
pImageMemoryBarriers[i].dstQueueFamilyIndex,
&pImageMemoryBarriers[i].subresourceRange,
@@ -5946,6 +6001,8 @@ void radv_CmdWriteBufferMarkerAMD(
si_emit_cache_flush(cmd_buffer);
ASSERTED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 12);
if (!(pipelineStage & ~VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT)) {
radeon_emit(cs, PKT3(PKT3_COPY_DATA, 4, 0));
radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_IMM) |
@@ -5965,4 +6022,6 @@ void radv_CmdWriteBufferMarkerAMD(
va, marker,
cmd_buffer->gfx9_eop_bug_va);
}
assert(cmd_buffer->cs->cdw <= cdw_max);
}

View File

@@ -42,7 +42,7 @@ static inline unsigned radeon_check_space(struct radeon_winsys *ws,
static inline void radeon_set_config_reg_seq(struct radeon_cmdbuf *cs, unsigned reg, unsigned num)
{
assert(reg >= SI_CONTEXT_REG_OFFSET && reg < SI_CONFIG_REG_END);
assert(reg >= SI_CONFIG_REG_OFFSET && reg < SI_CONFIG_REG_END);
assert(cs->cdw + 2 + num <= cs->max_dw);
assert(num);
radeon_emit(cs, PKT3(PKT3_SET_CONFIG_REG, num, 0));

View File

@@ -503,9 +503,8 @@ radv_dump_shader(struct radv_pipeline *pipeline,
radv_print_spirv(shader->spirv, shader->spirv_size, f);
}
if (shader->nir) {
fprintf(f, "NIR:\n");
nir_print_shader(shader->nir, f);
if (shader->nir_string) {
fprintf(f, "NIR:\n%s\n", shader->nir_string);
}
fprintf(f, "LLVM IR:\n%s\n", shader->llvm_ir_string);

View File

@@ -53,6 +53,7 @@ enum {
RADV_DEBUG_NOBINNING = 0x800000,
RADV_DEBUG_NO_LOAD_STORE_OPT = 0x1000000,
RADV_DEBUG_NO_NGG = 0x2000000,
RADV_DEBUG_NO_SHADER_BALLOT = 0x4000000,
};
enum {
@@ -65,6 +66,8 @@ enum {
RADV_PERFTEST_SHADER_BALLOT = 0x40,
RADV_PERFTEST_TC_COMPAT_CMASK = 0x80,
RADV_PERFTEST_CS_WAVE_32 = 0x100,
RADV_PERFTEST_PS_WAVE_32 = 0x200,
RADV_PERFTEST_GE_WAVE_32 = 0x400,
};
bool

View File

@@ -1076,6 +1076,14 @@ void radv_update_descriptor_sets(
src_ptr += src_binding_layout->offset / 4;
dst_ptr += dst_binding_layout->offset / 4;
if (src_binding_layout->type == VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT) {
src_ptr += copyset->srcArrayElement / 4;
dst_ptr += copyset->dstArrayElement / 4;
memcpy(dst_ptr, src_ptr, copyset->descriptorCount);
continue;
}
src_ptr += src_binding_layout->size * copyset->srcArrayElement / 4;
dst_ptr += dst_binding_layout->size * copyset->dstArrayElement / 4;

View File

@@ -34,7 +34,6 @@
#include "radv_shader.h"
#include "radv_cs.h"
#include "util/disk_cache.h"
#include "util/strtod.h"
#include "vk_util.h"
#include <xf86drm.h>
#include <amdgpu.h>
@@ -173,12 +172,11 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
.heapIndex = vram_index,
};
}
if (gart_index >= 0) {
if (gart_index >= 0 && device->rad_info.has_dedicated_vram) {
device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_WRITE_COMBINE;
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
(device->rad_info.has_dedicated_vram ? 0 : VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT),
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
.heapIndex = gart_index,
};
}
@@ -191,6 +189,19 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
.heapIndex = visible_vram_index,
};
}
if (gart_index >= 0 && !device->rad_info.has_dedicated_vram) {
/* Put GTT after visible VRAM for GPUs without dedicated VRAM
* as they have identical property flags, and according to the
* spec, for types with identical flags, the one with greater
* performance must be given a lower index. */
device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_WRITE_COMBINE;
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
.heapIndex = gart_index,
};
}
if (gart_index >= 0) {
device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_CACHED;
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
@@ -348,7 +359,8 @@ radv_physical_device_init(struct radv_physical_device *device,
device->rbplus_allowed = device->rad_info.family == CHIP_STONEY ||
device->rad_info.family == CHIP_VEGA12 ||
device->rad_info.family == CHIP_RAVEN ||
device->rad_info.family == CHIP_RAVEN2;
device->rad_info.family == CHIP_RAVEN2 ||
device->rad_info.family == CHIP_RENOIR;
}
/* The mere presence of CLEAR_STATE in the IB causes random GPU hangs
@@ -379,16 +391,27 @@ radv_physical_device_init(struct radv_physical_device *device,
device->rad_info.me_fw_feature >= 41);
device->has_dcc_constant_encode = device->rad_info.family == CHIP_RAVEN2 ||
device->rad_info.family == CHIP_RENOIR ||
device->rad_info.chip_class >= GFX10;
device->use_shader_ballot = device->instance->perftest_flags & RADV_PERFTEST_SHADER_BALLOT;
device->use_shader_ballot = device->rad_info.chip_class >= GFX8 &&
device->instance->perftest_flags & RADV_PERFTEST_SHADER_BALLOT;
/* Determine the number of threads per wave for all stages. */
device->cs_wave_size = 64;
device->ps_wave_size = 64;
device->ge_wave_size = 64;
if (device->rad_info.chip_class >= GFX10) {
if (device->instance->perftest_flags & RADV_PERFTEST_CS_WAVE_32)
device->cs_wave_size = 32;
/* For pixel shaders, wave64 is recommanded. */
if (device->instance->perftest_flags & RADV_PERFTEST_PS_WAVE_32)
device->ps_wave_size = 32;
if (device->instance->perftest_flags & RADV_PERFTEST_GE_WAVE_32)
device->ge_wave_size = 32;
}
radv_physical_device_init_mem_types(device);
@@ -484,6 +507,7 @@ static const struct debug_control radv_debug_options[] = {
{"nobinning", RADV_DEBUG_NOBINNING},
{"noloadstoreopt", RADV_DEBUG_NO_LOAD_STORE_OPT},
{"nongg", RADV_DEBUG_NO_NGG},
{"noshaderballot", RADV_DEBUG_NO_SHADER_BALLOT},
{NULL, 0}
};
@@ -503,6 +527,8 @@ static const struct debug_control radv_perftest_options[] = {
{"shader_ballot", RADV_PERFTEST_SHADER_BALLOT},
{"tccompatcmask", RADV_PERFTEST_TC_COMPAT_CMASK},
{"cswave32", RADV_PERFTEST_CS_WAVE_32},
{"pswave32", RADV_PERFTEST_PS_WAVE_32},
{"gewave32", RADV_PERFTEST_GE_WAVE_32},
{NULL, 0}
};
@@ -540,6 +566,22 @@ radv_handle_per_app_options(struct radv_instance *instance,
*/
if (HAVE_LLVM < 0x900)
instance->debug_flags |= RADV_DEBUG_NO_LOAD_STORE_OPT;
} else if (!strcmp(name, "Wolfenstein: Youngblood")) {
if (!(instance->debug_flags & RADV_DEBUG_NO_SHADER_BALLOT)) {
/* Force enable VK_AMD_shader_ballot because it looks
* safe and it gives a nice boost (+20% on Vega 56 at
* this time).
*/
instance->perftest_flags |= RADV_PERFTEST_SHADER_BALLOT;
}
} else if (!strcmp(name, "Fledge")) {
/*
* Zero VRAM for "The Surge 2"
*
* This avoid a hang when when rendering any level. Likely
* uninitialized data in an indirect draw.
*/
instance->debug_flags |= RADV_DEBUG_ZERO_VRAM;
}
}
@@ -554,8 +596,10 @@ static int radv_get_instance_extension_index(const char *name)
static const char radv_dri_options_xml[] =
DRI_CONF_BEGIN
DRI_CONF_SECTION_QUALITY
DRI_CONF_SECTION_PERFORMANCE
DRI_CONF_ADAPTIVE_SYNC("true")
DRI_CONF_VK_X11_OVERRIDE_MIN_IMAGE_COUNT(0)
DRI_CONF_VK_X11_STRICT_IMAGE_COUNT("false")
DRI_CONF_SECTION_END
DRI_CONF_END;
@@ -564,7 +608,9 @@ static void radv_init_dri_options(struct radv_instance *instance)
driParseOptionInfo(&instance->available_dri_options, radv_dri_options_xml);
driParseConfigFiles(&instance->dri_options,
&instance->available_dri_options,
0, "radv", NULL);
0, "radv", NULL,
instance->engineName,
instance->engineVersion);
}
VkResult radv_CreateInstance(
@@ -585,6 +631,13 @@ VkResult radv_CreateInstance(
client_version = VK_API_VERSION_1_0;
}
const char *engine_name = NULL;
uint32_t engine_version = 0;
if (pCreateInfo->pApplicationInfo) {
engine_name = pCreateInfo->pApplicationInfo->pEngineName;
engine_version = pCreateInfo->pApplicationInfo->engineVersion;
}
instance = vk_zalloc2(&default_alloc, pAllocator, sizeof(*instance), 8,
VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
if (!instance)
@@ -628,7 +681,10 @@ VkResult radv_CreateInstance(
return vk_error(instance, result);
}
_mesa_locale_init();
instance->engineName = vk_strdup(&instance->alloc, engine_name,
VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
instance->engineVersion = engine_version;
glsl_type_singleton_init_or_ref();
VG(VALGRIND_CREATE_MEMPOOL(instance, 0, false));
@@ -654,10 +710,11 @@ void radv_DestroyInstance(
radv_physical_device_finish(instance->physicalDevices + i);
}
vk_free(&instance->alloc, instance->engineName);
VG(VALGRIND_DESTROY_MEMPOOL(instance));
glsl_type_singleton_decref();
_mesa_locale_fini();
driDestroyOptionCache(&instance->dri_options);
driDestroyOptionInfo(&instance->available_dri_options);
@@ -962,11 +1019,8 @@ void radv_GetPhysicalDeviceFeatures2(
case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_ATOMIC_INT64_FEATURES_KHR: {
VkPhysicalDeviceShaderAtomicInt64FeaturesKHR *features =
(VkPhysicalDeviceShaderAtomicInt64FeaturesKHR *)ext;
/* TODO: Enable this once the driver supports 64-bit
* compare&swap atomic operations.
*/
features->shaderBufferInt64Atomics = false;
features->shaderSharedInt64Atomics = false;
features->shaderBufferInt64Atomics = HAVE_LLVM >= 0x0900;
features->shaderSharedInt64Atomics = HAVE_LLVM >= 0x0900;
break;
}
case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_INLINE_UNIFORM_BLOCK_FEATURES_EXT: {
@@ -1002,6 +1056,18 @@ void radv_GetPhysicalDeviceFeatures2(
features->indexTypeUint8 = pdevice->rad_info.chip_class >= GFX8;
break;
}
case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGELESS_FRAMEBUFFER_FEATURES_KHR: {
VkPhysicalDeviceImagelessFramebufferFeaturesKHR *features =
(VkPhysicalDeviceImagelessFramebufferFeaturesKHR *)ext;
features->imagelessFramebuffer = true;
break;
}
case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PIPELINE_EXECUTABLE_PROPERTIES_FEATURES_KHR: {
VkPhysicalDevicePipelineExecutablePropertiesFeaturesKHR *features =
(VkPhysicalDevicePipelineExecutablePropertiesFeaturesKHR *)ext;
features->pipelineExecutableInfo = true;
break;
}
default:
break;
}
@@ -1009,6 +1075,24 @@ void radv_GetPhysicalDeviceFeatures2(
return radv_GetPhysicalDeviceFeatures(physicalDevice, &pFeatures->features);
}
static size_t
radv_max_descriptor_set_size()
{
/* make sure that the entire descriptor set is addressable with a signed
* 32-bit int. So the sum of all limits scaled by descriptor size has to
* be at most 2 GiB. the combined image & samples object count as one of
* both. This limit is for the pipeline layout, not for the set layout, but
* there is no set limit, so we just set a pipeline limit. I don't think
* any app is going to hit this soon. */
return ((1ull << 31) - 16 * MAX_DYNAMIC_BUFFERS
- MAX_INLINE_UNIFORM_BLOCK_SIZE * MAX_INLINE_UNIFORM_BLOCK_COUNT) /
(32 /* uniform buffer, 32 due to potential space wasted on alignment */ +
32 /* storage buffer, 32 due to potential space wasted on alignment */ +
32 /* sampler, largest when combined with image */ +
64 /* sampled image */ +
64 /* storage image */);
}
void radv_GetPhysicalDeviceProperties(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceProperties* pProperties)
@@ -1016,18 +1100,7 @@ void radv_GetPhysicalDeviceProperties(
RADV_FROM_HANDLE(radv_physical_device, pdevice, physicalDevice);
VkSampleCountFlags sample_counts = 0xf;
/* make sure that the entire descriptor set is addressable with a signed
* 32-bit int. So the sum of all limits scaled by descriptor size has to
* be at most 2 GiB. the combined image & samples object count as one of
* both. This limit is for the pipeline layout, not for the set layout, but
* there is no set limit, so we just set a pipeline limit. I don't think
* any app is going to hit this soon. */
size_t max_descriptor_set_size = ((1ull << 31) - 16 * MAX_DYNAMIC_BUFFERS) /
(32 /* uniform buffer, 32 due to potential space wasted on alignment */ +
32 /* storage buffer, 32 due to potential space wasted on alignment */ +
32 /* sampler, largest when combined with image */ +
64 /* sampled image */ +
64 /* storage image */);
size_t max_descriptor_set_size = radv_max_descriptor_set_size();
VkPhysicalDeviceLimits limits = {
.maxImageDimension1D = (1 << 14),
@@ -1101,7 +1174,7 @@ void radv_GetPhysicalDeviceProperties(
.viewportBoundsRange = { INT16_MIN, INT16_MAX },
.viewportSubPixelBits = 8,
.minMemoryMapAlignment = 4096, /* A page */
.minTexelBufferOffsetAlignment = 1,
.minTexelBufferOffsetAlignment = 4,
.minUniformBufferOffsetAlignment = 4,
.minStorageBufferOffsetAlignment = 4,
.minTexelOffset = -32,
@@ -1262,7 +1335,7 @@ void radv_GetPhysicalDeviceProperties2(
/* SGPR. */
properties->sgprsPerSimd =
ac_get_num_physical_sgprs(pdevice->rad_info.chip_class);
ac_get_num_physical_sgprs(&pdevice->rad_info);
properties->minSgprAllocation =
pdevice->rad_info.chip_class >= GFX8 ? 16 : 8;
properties->maxSgprAllocation =
@@ -1296,13 +1369,7 @@ void radv_GetPhysicalDeviceProperties2(
properties->robustBufferAccessUpdateAfterBind = false;
properties->quadDivergentImplicitLod = false;
size_t max_descriptor_set_size = ((1ull << 31) - 16 * MAX_DYNAMIC_BUFFERS -
MAX_INLINE_UNIFORM_BLOCK_SIZE * MAX_INLINE_UNIFORM_BLOCK_COUNT) /
(32 /* uniform buffer, 32 due to potential space wasted on alignment */ +
32 /* storage buffer, 32 due to potential space wasted on alignment */ +
32 /* sampler, largest when combined with image */ +
64 /* sampled image */ +
64 /* storage image */);
size_t max_descriptor_set_size = radv_max_descriptor_set_size();
properties->maxPerStageDescriptorUpdateAfterBindSamplers = max_descriptor_set_size;
properties->maxPerStageDescriptorUpdateAfterBindUniformBuffers = max_descriptor_set_size;
properties->maxPerStageDescriptorUpdateAfterBindStorageBuffers = max_descriptor_set_size;
@@ -1878,6 +1945,9 @@ VkResult radv_CreateDevice(
device->enabled_extensions.EXT_descriptor_indexing ||
device->enabled_extensions.EXT_buffer_device_address;
device->robust_buffer_access = pCreateInfo->pEnabledFeatures &&
pCreateInfo->pEnabledFeatures->robustBufferAccess;
mtx_init(&device->shader_slab_mutex, mtx_plain);
list_inithead(&device->shader_slabs);
@@ -1914,10 +1984,10 @@ VkResult radv_CreateDevice(
device->pbb_allowed = device->physical_device->rad_info.chip_class >= GFX9 &&
!(device->instance->debug_flags & RADV_DEBUG_NOBINNING);
/* Disabled and not implemented for now. */
device->dfsm_allowed = device->pbb_allowed &&
(device->physical_device->rad_info.family == CHIP_RAVEN ||
device->physical_device->rad_info.family == CHIP_RAVEN2);
device->physical_device->rad_info.family == CHIP_RAVEN2 ||
device->physical_device->rad_info.family == CHIP_RENOIR);
#ifdef ANDROID
device->always_use_syncobj = device->physical_device->rad_info.has_syncobj_wait_for_submit;
@@ -2598,7 +2668,8 @@ radv_get_preamble_cs(struct radv_queue *queue,
*initial_full_flush_preamble_cs = queue->initial_full_flush_preamble_cs;
*initial_preamble_cs = queue->initial_preamble_cs;
*continue_preamble_cs = queue->continue_preamble_cs;
if (!scratch_size && !compute_scratch_size && !esgs_ring_size && !gsvs_ring_size)
if (!scratch_size && !compute_scratch_size && !esgs_ring_size && !gsvs_ring_size &&
!needs_tess_rings && !needs_sample_positions)
*continue_preamble_cs = NULL;
return VK_SUCCESS;
}
@@ -4341,7 +4412,7 @@ radv_init_dcc_control_reg(struct radv_device *device,
S_028C78_INDEPENDENT_128B_BLOCKS(independent_128b_blocks);
}
static void
void
radv_initialise_color_surface(struct radv_device *device,
struct radv_color_buffer_info *cb,
struct radv_image_view *iview)
@@ -4400,15 +4471,15 @@ radv_initialise_color_surface(struct radv_device *device,
cb->cb_color_pitch = S_028C64_TILE_MAX(pitch_tile_max);
cb->cb_color_slice = S_028C68_TILE_MAX(slice_tile_max);
cb->cb_color_cmask_slice = iview->image->cmask.slice_tile_max;
cb->cb_color_cmask_slice = surf->u.legacy.cmask_slice_tile_max;
cb->cb_color_attrib |= S_028C74_TILE_MODE_INDEX(tile_mode_index);
if (radv_image_has_fmask(iview->image)) {
if (device->physical_device->rad_info.chip_class >= GFX7)
cb->cb_color_pitch |= S_028C64_FMASK_TILE_MAX(iview->image->fmask.pitch_in_pixels / 8 - 1);
cb->cb_color_attrib |= S_028C74_FMASK_TILE_MODE_INDEX(iview->image->fmask.tile_mode_index);
cb->cb_color_fmask_slice = S_028C88_TILE_MAX(iview->image->fmask.slice_tile_max);
cb->cb_color_pitch |= S_028C64_FMASK_TILE_MAX(surf->u.legacy.fmask.pitch_in_pixels / 8 - 1);
cb->cb_color_attrib |= S_028C74_FMASK_TILE_MODE_INDEX(surf->u.legacy.fmask.tiling_index);
cb->cb_color_fmask_slice = S_028C88_TILE_MAX(surf->u.legacy.fmask.slice_tile_max);
} else {
/* This must be set for fast clear to work without FMASK. */
if (device->physical_device->rad_info.chip_class >= GFX7)
@@ -4420,7 +4491,7 @@ radv_initialise_color_surface(struct radv_device *device,
/* CMASK variables */
va = radv_buffer_get_va(iview->bo) + iview->image->offset;
va += iview->image->cmask.offset;
va += iview->image->cmask_offset;
cb->cb_color_cmask = va >> 8;
va = radv_buffer_get_va(iview->bo) + iview->image->offset;
@@ -4449,9 +4520,9 @@ radv_initialise_color_surface(struct radv_device *device,
}
if (radv_image_has_fmask(iview->image)) {
va = radv_buffer_get_va(iview->bo) + iview->image->offset + iview->image->fmask.offset;
va = radv_buffer_get_va(iview->bo) + iview->image->offset + iview->image->fmask_offset;
cb->cb_color_fmask = va >> 8;
cb->cb_color_fmask |= iview->image->fmask.tile_swizzle;
cb->cb_color_fmask |= surf->fmask_tile_swizzle;
} else {
cb->cb_color_fmask = cb->cb_color_base;
}
@@ -4462,7 +4533,7 @@ radv_initialise_color_surface(struct radv_device *device,
format = radv_translate_colorformat(iview->vk_format);
if (format == V_028C70_COLOR_INVALID || ntype == ~0u)
radv_finishme("Illegal color\n");
swap = radv_translate_colorswap(iview->vk_format, FALSE);
swap = radv_translate_colorswap(iview->vk_format, false);
endian = radv_colorformat_endian_swap(format);
/* blend clamp should be set for all NORM/SRGB types */
@@ -4501,7 +4572,7 @@ radv_initialise_color_surface(struct radv_device *device,
if (radv_image_has_fmask(iview->image)) {
cb->cb_color_info |= S_028C70_COMPRESSION(1);
if (device->physical_device->rad_info.chip_class == GFX6) {
unsigned fmask_bankh = util_logbase2(iview->image->fmask.bank_height);
unsigned fmask_bankh = util_logbase2(surf->u.legacy.fmask.bankh);
cb->cb_color_attrib |= S_028C74_FMASK_BANK_HEIGHT(fmask_bankh);
}
@@ -4601,7 +4672,7 @@ radv_calc_decompress_on_z_planes(struct radv_device *device,
return max_zplanes;
}
static void
void
radv_initialise_ds_surface(struct radv_device *device,
struct radv_ds_buffer_info *ds,
struct radv_image_view *iview)
@@ -4795,11 +4866,15 @@ VkResult radv_CreateFramebuffer(
{
RADV_FROM_HANDLE(radv_device, device, _device);
struct radv_framebuffer *framebuffer;
const VkFramebufferAttachmentsCreateInfoKHR *imageless_create_info =
vk_find_struct_const(pCreateInfo->pNext,
FRAMEBUFFER_ATTACHMENTS_CREATE_INFO_KHR);
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO);
size_t size = sizeof(*framebuffer) +
sizeof(struct radv_attachment_info) * pCreateInfo->attachmentCount;
size_t size = sizeof(*framebuffer);
if (!imageless_create_info)
size += sizeof(struct radv_image_view*) * pCreateInfo->attachmentCount;
framebuffer = vk_alloc2(&device->alloc, pAllocator, size, 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (framebuffer == NULL)
@@ -4809,18 +4884,23 @@ VkResult radv_CreateFramebuffer(
framebuffer->width = pCreateInfo->width;
framebuffer->height = pCreateInfo->height;
framebuffer->layers = pCreateInfo->layers;
for (uint32_t i = 0; i < pCreateInfo->attachmentCount; i++) {
VkImageView _iview = pCreateInfo->pAttachments[i];
struct radv_image_view *iview = radv_image_view_from_handle(_iview);
framebuffer->attachments[i].attachment = iview;
if (iview->aspect_mask & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT)) {
radv_initialise_ds_surface(device, &framebuffer->attachments[i].ds, iview);
} else {
radv_initialise_color_surface(device, &framebuffer->attachments[i].cb, iview);
if (imageless_create_info) {
for (unsigned i = 0; i < imageless_create_info->attachmentImageInfoCount; ++i) {
const VkFramebufferAttachmentImageInfoKHR *attachment =
imageless_create_info->pAttachmentImageInfos + i;
framebuffer->width = MIN2(framebuffer->width, attachment->width);
framebuffer->height = MIN2(framebuffer->height, attachment->height);
framebuffer->layers = MIN2(framebuffer->layers, attachment->layerCount);
}
} else {
for (uint32_t i = 0; i < pCreateInfo->attachmentCount; i++) {
VkImageView _iview = pCreateInfo->pAttachments[i];
struct radv_image_view *iview = radv_image_view_from_handle(_iview);
framebuffer->attachments[i] = iview;
framebuffer->width = MIN2(framebuffer->width, iview->extent.width);
framebuffer->height = MIN2(framebuffer->height, iview->extent.height);
framebuffer->layers = MIN2(framebuffer->layers, radv_surface_max_layer_count(iview));
}
framebuffer->width = MIN2(framebuffer->width, iview->extent.width);
framebuffer->height = MIN2(framebuffer->height, iview->extent.height);
framebuffer->layers = MIN2(framebuffer->layers, radv_surface_max_layer_count(iview));
}
*pFramebuffer = radv_framebuffer_to_handle(framebuffer);

View File

@@ -75,15 +75,17 @@ EXTENSIONS = [
Extension('VK_KHR_get_physical_device_properties2', 1, True),
Extension('VK_KHR_get_surface_capabilities2', 1, 'RADV_HAS_SURFACE'),
Extension('VK_KHR_image_format_list', 1, True),
Extension('VK_KHR_imageless_framebuffer', 1, True),
Extension('VK_KHR_incremental_present', 1, 'RADV_HAS_SURFACE'),
Extension('VK_KHR_maintenance1', 1, True),
Extension('VK_KHR_maintenance2', 1, True),
Extension('VK_KHR_maintenance3', 1, True),
Extension('VK_KHR_pipeline_executable_properties', 1, True),
Extension('VK_KHR_push_descriptor', 1, True),
Extension('VK_KHR_relaxed_block_layout', 1, True),
Extension('VK_KHR_sampler_mirror_clamp_to_edge', 1, True),
Extension('VK_KHR_sampler_ycbcr_conversion', 1, True),
Extension('VK_KHR_shader_atomic_int64', 1, False),
Extension('VK_KHR_shader_atomic_int64', 1, 'HAVE_LLVM >= 0x0900'),
Extension('VK_KHR_shader_draw_parameters', 1, True),
Extension('VK_KHR_shader_float16_int8', 1, True),
Extension('VK_KHR_storage_buffer_storage_class', 1, True),

View File

@@ -1305,6 +1305,10 @@ get_external_image_format_properties(const VkPhysicalDeviceImageFormatInfo2 *pIm
VkExternalMemoryFeatureFlagBits flags = 0;
VkExternalMemoryHandleTypeFlags export_flags = 0;
VkExternalMemoryHandleTypeFlags compat_flags = 0;
if (pImageFormatInfo->flags & VK_IMAGE_CREATE_SPARSE_BINDING_BIT)
return;
switch (handleType) {
case VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT:
case VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT:
@@ -1381,14 +1385,9 @@ VkResult radv_GetPhysicalDeviceImageFormatProperties2(
* present and VkExternalImageFormatProperties will be ignored.
*/
if (external_info && external_info->handleType != 0) {
switch (external_info->handleType) {
case VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT:
case VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT:
case VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT:
get_external_image_format_properties(base_info, external_info->handleType,
&external_props->externalMemoryProperties);
break;
default:
get_external_image_format_properties(base_info, external_info->handleType,
&external_props->externalMemoryProperties);
if (!external_props->externalMemoryProperties.externalMemoryFeatures) {
/* From the Vulkan 1.0.97 spec:
*
* If handleType is not compatible with the [parameters] specified

View File

@@ -454,7 +454,8 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
unsigned plane_id,
unsigned base_level, unsigned first_level,
unsigned block_width, bool is_stencil,
bool is_storage_image, uint32_t *state)
bool is_storage_image, bool disable_compression,
uint32_t *state)
{
struct radv_image_plane *plane = &image->planes[plane_id];
uint64_t gpu_address = image->bo ? radv_buffer_get_va(image->bo) + image->offset : 0;
@@ -479,21 +480,23 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
if (chip_class >= GFX8) {
state[6] &= C_008F28_COMPRESSION_EN;
state[7] = 0;
if (!is_storage_image && radv_dcc_enabled(image, first_level)) {
if (!disable_compression && radv_dcc_enabled(image, first_level)) {
meta_va = gpu_address + image->dcc_offset;
if (chip_class <= GFX8)
meta_va += base_level_info->dcc_offset;
} else if (!is_storage_image &&
unsigned dcc_tile_swizzle = plane->surface.tile_swizzle << 8;
dcc_tile_swizzle &= plane->surface.dcc_alignment - 1;
meta_va |= dcc_tile_swizzle;
} else if (!disable_compression &&
radv_image_is_tc_compat_htile(image)) {
meta_va = gpu_address + image->htile_offset;
}
if (meta_va) {
state[6] |= S_008F28_COMPRESSION_EN(1);
if (chip_class <= GFX9) {
if (chip_class <= GFX9)
state[7] = meta_va >> 8;
state[7] |= plane->surface.tile_swizzle;
}
}
}
@@ -617,7 +620,7 @@ static unsigned gfx9_border_color_swizzle(const enum vk_swizzle swizzle[4])
return bc_swizzle;
}
static bool vi_alpha_is_on_msb(struct radv_device *device, VkFormat format)
bool vi_alpha_is_on_msb(struct radv_device *device, VkFormat format)
{
const struct vk_format_description *desc = vk_format_description(format);
@@ -691,7 +694,7 @@ gfx10_make_texture_descriptor(struct radv_device *device,
*/
state[4] = S_00A010_DEPTH(type == V_008F1C_SQ_RSRC_IMG_3D ? depth - 1 : last_layer) |
S_00A010_BASE_ARRAY(first_layer);
state[5] = S_00A014_ARRAY_PITCH(!!(type == V_008F1C_SQ_RSRC_IMG_3D)) |
state[5] = S_00A014_ARRAY_PITCH(0) |
S_00A014_MAX_MIP(image->info.samples > 1 ?
util_logbase2(image->info.samples) :
image->info.levels - 1) |
@@ -713,7 +716,7 @@ gfx10_make_texture_descriptor(struct radv_device *device,
assert(image->plane_count == 1);
va = gpu_address + image->offset + image->fmask.offset;
va = gpu_address + image->offset + image->fmask_offset;
switch (image->info.samples) {
case 2:
@@ -877,7 +880,7 @@ si_make_texture_descriptor(struct radv_device *device,
assert(image->plane_count == 1);
va = gpu_address + image->offset + image->fmask.offset;
va = gpu_address + image->offset + image->fmask_offset;
if (device->physical_device->rad_info.chip_class == GFX9) {
fmask_format = V_008F14_IMG_DATA_FORMAT_FMASK;
@@ -913,7 +916,7 @@ si_make_texture_descriptor(struct radv_device *device,
}
fmask_state[0] = va >> 8;
fmask_state[0] |= image->fmask.tile_swizzle;
fmask_state[0] |= image->planes[0].surface.fmask_tile_swizzle;
fmask_state[1] = S_008F14_BASE_ADDRESS_HI(va >> 40) |
S_008F14_DATA_FORMAT(fmask_format) |
S_008F14_NUM_FORMAT(num_format);
@@ -937,20 +940,20 @@ si_make_texture_descriptor(struct radv_device *device,
S_008F24_META_RB_ALIGNED(image->planes[0].surface.u.gfx9.cmask.rb_aligned);
if (radv_image_is_tc_compat_cmask(image)) {
va = gpu_address + image->offset + image->cmask.offset;
va = gpu_address + image->offset + image->cmask_offset;
fmask_state[5] |= S_008F24_META_DATA_ADDRESS(va >> 40);
fmask_state[6] |= S_008F28_COMPRESSION_EN(1);
fmask_state[7] |= va >> 8;
}
} else {
fmask_state[3] |= S_008F1C_TILING_INDEX(image->fmask.tile_mode_index);
fmask_state[3] |= S_008F1C_TILING_INDEX(image->planes[0].surface.u.legacy.fmask.tiling_index);
fmask_state[4] |= S_008F20_DEPTH(depth - 1) |
S_008F20_PITCH(image->fmask.pitch_in_pixels - 1);
S_008F20_PITCH(image->planes[0].surface.u.legacy.fmask.pitch_in_pixels - 1);
fmask_state[5] |= S_008F24_LAST_ARRAY(last_layer);
if (radv_image_is_tc_compat_cmask(image)) {
va = gpu_address + image->offset + image->cmask.offset;
va = gpu_address + image->offset + image->cmask_offset;
fmask_state[6] |= S_008F28_COMPRESSION_EN(1);
fmask_state[7] |= va >> 8;
@@ -1024,7 +1027,7 @@ radv_query_opaque_metadata(struct radv_device *device,
desc, NULL);
si_set_mutable_tex_desc_fields(device, image, &image->planes[0].surface.u.legacy.level[0], 0, 0, 0,
image->planes[0].surface.blk_w, false, false, desc);
image->planes[0].surface.blk_w, false, false, false, desc);
/* Clear the base address and set the relative DCC offset. */
desc[0] = 0;
@@ -1099,82 +1102,38 @@ radv_image_override_offset_stride(struct radv_device *device,
}
}
/* The number of samples can be specified independently of the texture. */
static void
radv_image_get_fmask_info(struct radv_device *device,
struct radv_image *image,
unsigned nr_samples,
struct radv_fmask_info *out)
{
if (device->physical_device->rad_info.chip_class >= GFX9) {
out->alignment = image->planes[0].surface.fmask_alignment;
out->size = image->planes[0].surface.fmask_size;
out->tile_swizzle = image->planes[0].surface.fmask_tile_swizzle;
return;
}
out->slice_tile_max = image->planes[0].surface.u.legacy.fmask.slice_tile_max;
out->tile_mode_index = image->planes[0].surface.u.legacy.fmask.tiling_index;
out->pitch_in_pixels = image->planes[0].surface.u.legacy.fmask.pitch_in_pixels;
out->slice_size = image->planes[0].surface.u.legacy.fmask.slice_size;
out->bank_height = image->planes[0].surface.u.legacy.fmask.bankh;
out->tile_swizzle = image->planes[0].surface.fmask_tile_swizzle;
out->alignment = image->planes[0].surface.fmask_alignment;
out->size = image->planes[0].surface.fmask_size;
assert(!out->tile_swizzle || !image->shareable);
}
static void
radv_image_alloc_fmask(struct radv_device *device,
struct radv_image *image)
{
radv_image_get_fmask_info(device, image, image->info.samples, &image->fmask);
unsigned fmask_alignment = image->planes[0].surface.fmask_alignment;
image->fmask.offset = align64(image->size, image->fmask.alignment);
image->size = image->fmask.offset + image->fmask.size;
image->alignment = MAX2(image->alignment, image->fmask.alignment);
}
static void
radv_image_get_cmask_info(struct radv_device *device,
struct radv_image *image,
struct radv_cmask_info *out)
{
assert(image->plane_count == 1);
if (device->physical_device->rad_info.chip_class >= GFX9) {
out->alignment = image->planes[0].surface.cmask_alignment;
out->size = image->planes[0].surface.cmask_size;
return;
}
out->slice_tile_max = image->planes[0].surface.u.legacy.cmask_slice_tile_max;
out->alignment = image->planes[0].surface.cmask_alignment;
out->slice_size = image->planes[0].surface.cmask_slice_size;
out->size = image->planes[0].surface.cmask_size;
image->fmask_offset = align64(image->size, fmask_alignment);
image->size = image->fmask_offset + image->planes[0].surface.fmask_size;
image->alignment = MAX2(image->alignment, fmask_alignment);
}
static void
radv_image_alloc_cmask(struct radv_device *device,
struct radv_image *image)
{
unsigned cmask_alignment = image->planes[0].surface.cmask_alignment;
unsigned cmask_size = image->planes[0].surface.cmask_size;
uint32_t clear_value_size = 0;
radv_image_get_cmask_info(device, image, &image->cmask);
if (!image->cmask.size)
if (!cmask_size)
return;
assert(image->cmask.alignment);
assert(cmask_alignment);
image->cmask.offset = align64(image->size, image->cmask.alignment);
image->cmask_offset = align64(image->size, cmask_alignment);
/* + 8 for storing the clear values */
if (!image->clear_value_offset) {
image->clear_value_offset = image->cmask.offset + image->cmask.size;
image->clear_value_offset = image->cmask_offset + cmask_size;
clear_value_size = 8;
}
image->size = image->cmask.offset + image->cmask.size + clear_value_size;
image->alignment = MAX2(image->alignment, image->cmask.alignment);
image->size = image->cmask_offset + cmask_size + clear_value_size;
image->alignment = MAX2(image->alignment, cmask_alignment);
}
static void
@@ -1442,8 +1401,8 @@ radv_image_view_make_descriptor(struct radv_image_view *iview,
struct radv_device *device,
VkFormat vk_format,
const VkComponentMapping *components,
bool is_storage_image, unsigned plane_id,
unsigned descriptor_plane_id)
bool is_storage_image, bool disable_compression,
unsigned plane_id, unsigned descriptor_plane_id)
{
struct radv_image *image = iview->image;
struct radv_image_plane *plane = &image->planes[plane_id];
@@ -1490,7 +1449,9 @@ radv_image_view_make_descriptor(struct radv_image_view *iview,
plane_id,
iview->base_mip,
iview->base_mip,
blk_w, is_stencil, is_storage_image, descriptor->plane_descriptors[descriptor_plane_id]);
blk_w, is_stencil, is_storage_image,
is_storage_image || disable_compression,
descriptor->plane_descriptors[descriptor_plane_id]);
}
static unsigned
@@ -1530,7 +1491,8 @@ radv_get_aspect_format(struct radv_image *image, VkImageAspectFlags mask)
void
radv_image_view_init(struct radv_image_view *iview,
struct radv_device *device,
const VkImageViewCreateInfo* pCreateInfo)
const VkImageViewCreateInfo* pCreateInfo,
const struct radv_image_view_extra_create_info* extra_create_info)
{
RADV_FROM_HANDLE(radv_image, image, pCreateInfo->image);
const VkImageSubresourceRange *range = &pCreateInfo->subresourceRange;
@@ -1630,15 +1592,23 @@ radv_image_view_init(struct radv_image_view *iview,
iview->base_mip = range->baseMipLevel;
iview->level_count = radv_get_levelCount(image, range);
bool disable_compression = extra_create_info ? extra_create_info->disable_compression: false;
for (unsigned i = 0; i < (iview->multiple_planes ? vk_format_get_plane_count(image->vk_format) : 1); ++i) {
VkFormat format = vk_format_get_plane_format(iview->vk_format, i);
radv_image_view_make_descriptor(iview, device, format, &pCreateInfo->components, false, iview->plane_id + i, i);
radv_image_view_make_descriptor(iview, device, format, &pCreateInfo->components, true, iview->plane_id + i, i);
radv_image_view_make_descriptor(iview, device, format,
&pCreateInfo->components,
false, disable_compression,
iview->plane_id + i, i);
radv_image_view_make_descriptor(iview, device,
format, &pCreateInfo->components,
true, disable_compression,
iview->plane_id + i, i);
}
}
bool radv_layout_has_htile(const struct radv_image *image,
VkImageLayout layout,
bool in_render_loop,
unsigned queue_mask)
{
if (radv_image_is_tc_compat_htile(image))
@@ -1652,6 +1622,7 @@ bool radv_layout_has_htile(const struct radv_image *image,
bool radv_layout_is_htile_compressed(const struct radv_image *image,
VkImageLayout layout,
bool in_render_loop,
unsigned queue_mask)
{
if (radv_image_is_tc_compat_htile(image))
@@ -1665,13 +1636,16 @@ bool radv_layout_is_htile_compressed(const struct radv_image *image,
bool radv_layout_can_fast_clear(const struct radv_image *image,
VkImageLayout layout,
bool in_render_loop,
unsigned queue_mask)
{
return layout == VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
}
bool radv_layout_dcc_compressed(const struct radv_image *image,
bool radv_layout_dcc_compressed(const struct radv_device *device,
const struct radv_image *image,
VkImageLayout layout,
bool in_render_loop,
unsigned queue_mask)
{
/* Don't compress compute transfer dst, as image stores are not supported. */
@@ -1804,7 +1778,7 @@ radv_CreateImageView(VkDevice _device,
if (view == NULL)
return vk_error(device->instance, VK_ERROR_OUT_OF_HOST_MEMORY);
radv_image_view_init(view, device, pCreateInfo);
radv_image_view_init(view, device, pCreateInfo, NULL);
*pView = radv_image_view_to_handle(view);

View File

@@ -28,8 +28,10 @@
class radv_llvm_per_thread_info {
public:
radv_llvm_per_thread_info(enum radeon_family arg_family,
enum ac_target_machine_options arg_tm_options)
: family(arg_family), tm_options(arg_tm_options), passes(NULL) {}
enum ac_target_machine_options arg_tm_options,
unsigned arg_wave_size)
: family(arg_family), tm_options(arg_tm_options),
wave_size(arg_wave_size), passes(NULL), passes_wave32(NULL) {}
~radv_llvm_per_thread_info()
{
@@ -47,19 +49,28 @@ public:
if (!passes)
return false;
if (llvm_info.tm_wave32) {
passes_wave32 = ac_create_llvm_passes(llvm_info.tm_wave32);
if (!passes_wave32)
return false;
}
return true;
}
bool compile_to_memory_buffer(LLVMModuleRef module,
char **pelf_buffer, size_t *pelf_size)
{
return ac_compile_module_to_elf(passes, module, pelf_buffer, pelf_size);
struct ac_compiler_passes *p = wave_size == 32 ? passes_wave32 : passes;
return ac_compile_module_to_elf(p, module, pelf_buffer, pelf_size);
}
bool is_same(enum radeon_family arg_family,
enum ac_target_machine_options arg_tm_options) {
enum ac_target_machine_options arg_tm_options,
unsigned arg_wave_size) {
if (arg_family == family &&
arg_tm_options == tm_options)
arg_tm_options == tm_options &&
arg_wave_size == wave_size)
return true;
return false;
}
@@ -67,7 +78,9 @@ public:
private:
enum radeon_family family;
enum ac_target_machine_options tm_options;
unsigned wave_size;
struct ac_compiler_passes *passes;
struct ac_compiler_passes *passes_wave32;
};
/* we have to store a linked list per thread due to the possiblity of multiple gpus being required */
@@ -99,17 +112,18 @@ bool radv_compile_to_elf(struct ac_llvm_compiler *info,
bool radv_init_llvm_compiler(struct ac_llvm_compiler *info,
bool thread_compiler,
enum radeon_family family,
enum ac_target_machine_options tm_options)
enum ac_target_machine_options tm_options,
unsigned wave_size)
{
if (thread_compiler) {
for (auto &I : radv_llvm_per_thread_list) {
if (I.is_same(family, tm_options)) {
if (I.is_same(family, tm_options, wave_size)) {
*info = I.llvm_info;
return true;
}
}
radv_llvm_per_thread_list.emplace_back(family, tm_options);
radv_llvm_per_thread_list.emplace_back(family, tm_options, wave_size);
radv_llvm_per_thread_info &tinfo = radv_llvm_per_thread_list.back();
if (!tinfo.init()) {

View File

@@ -660,7 +660,7 @@ void radv_CmdBlitImage(
.baseArrayLayer = dest_array_slice,
.layerCount = 1
},
});
}, NULL);
radv_image_view_init(&src_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
@@ -674,7 +674,7 @@ void radv_CmdBlitImage(
.baseArrayLayer = src_array_slice,
.layerCount = 1
},
});
}, NULL);
meta_emit_blit(cmd_buffer,
src_image, &src_iview, srcImageLayout,
src_offset_0, src_offset_1,

View File

@@ -79,7 +79,7 @@ create_iview(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = surf->layer,
.layerCount = 1
},
});
}, NULL);
}
static void
@@ -408,7 +408,7 @@ radv_meta_blit2d(struct radv_cmd_buffer *cmd_buffer,
unsigned num_rects,
struct radv_meta_blit2d_rect *rects)
{
bool use_3d = cmd_buffer->device->physical_device->rad_info.chip_class == GFX9 &&
bool use_3d = cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9 &&
(src_img && src_img->image->type == VK_IMAGE_TYPE_3D);
enum blit2d_src_type src_type = src_buf ? BLIT2D_SRC_TYPE_BUFFER :
use_3d ? BLIT2D_SRC_TYPE_IMAGE_3D : BLIT2D_SRC_TYPE_IMAGE;
@@ -698,7 +698,7 @@ radv_device_finish_meta_blit2d_state(struct radv_device *device)
state->blit2d_stencil_only_rp[j], &state->alloc);
}
for (unsigned log2_samples = 0; log2_samples < 1 + MAX_SAMPLES_LOG2; ++log2_samples) {
for (unsigned log2_samples = 0; log2_samples < MAX_SAMPLES_LOG2; ++log2_samples) {
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->blit2d[log2_samples].p_layouts[src],
@@ -1308,9 +1308,9 @@ VkResult
radv_device_init_meta_blit2d_state(struct radv_device *device, bool on_demand)
{
VkResult result;
bool create_3d = device->physical_device->rad_info.chip_class == GFX9;
bool create_3d = device->physical_device->rad_info.chip_class >= GFX9;
for (unsigned log2_samples = 0; log2_samples < 1 + MAX_SAMPLES_LOG2; log2_samples++) {
for (unsigned log2_samples = 0; log2_samples < MAX_SAMPLES_LOG2; log2_samples++) {
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
if (src == BLIT2D_SRC_TYPE_IMAGE_3D && !create_3d)
continue;

View File

@@ -135,7 +135,7 @@ radv_device_init_meta_itob_state(struct radv_device *device)
struct radv_shader_module cs_3d = { .nir = NULL };
cs.nir = build_nir_itob_compute_shader(device, false);
if (device->physical_device->rad_info.chip_class == GFX9)
if (device->physical_device->rad_info.chip_class >= GFX9)
cs_3d.nir = build_nir_itob_compute_shader(device, true);
/*
@@ -211,7 +211,7 @@ radv_device_init_meta_itob_state(struct radv_device *device)
if (result != VK_SUCCESS)
goto fail;
if (device->physical_device->rad_info.chip_class == GFX9) {
if (device->physical_device->rad_info.chip_class >= GFX9) {
VkPipelineShaderStageCreateInfo pipeline_shader_stage_3d = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
@@ -256,7 +256,7 @@ radv_device_finish_meta_itob_state(struct radv_device *device)
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->itob.pipeline, &state->alloc);
if (device->physical_device->rad_info.chip_class == GFX9)
if (device->physical_device->rad_info.chip_class >= GFX9)
radv_DestroyPipeline(radv_device_to_handle(device),
state->itob.pipeline_3d, &state->alloc);
}
@@ -361,7 +361,7 @@ radv_device_init_meta_btoi_state(struct radv_device *device)
struct radv_shader_module cs = { .nir = NULL };
struct radv_shader_module cs_3d = { .nir = NULL };
cs.nir = build_nir_btoi_compute_shader(device, false);
if (device->physical_device->rad_info.chip_class == GFX9)
if (device->physical_device->rad_info.chip_class >= GFX9)
cs_3d.nir = build_nir_btoi_compute_shader(device, true);
/*
* two descriptors one for the image being sampled
@@ -436,7 +436,7 @@ radv_device_init_meta_btoi_state(struct radv_device *device)
if (result != VK_SUCCESS)
goto fail;
if (device->physical_device->rad_info.chip_class == GFX9) {
if (device->physical_device->rad_info.chip_class >= GFX9) {
VkPipelineShaderStageCreateInfo pipeline_shader_stage_3d = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
@@ -785,7 +785,7 @@ radv_device_init_meta_itoi_state(struct radv_device *device)
struct radv_shader_module cs = { .nir = NULL };
struct radv_shader_module cs_3d = { .nir = NULL };
cs.nir = build_nir_itoi_compute_shader(device, false);
if (device->physical_device->rad_info.chip_class == GFX9)
if (device->physical_device->rad_info.chip_class >= GFX9)
cs_3d.nir = build_nir_itoi_compute_shader(device, true);
/*
* two descriptors one for the image being sampled
@@ -860,7 +860,7 @@ radv_device_init_meta_itoi_state(struct radv_device *device)
if (result != VK_SUCCESS)
goto fail;
if (device->physical_device->rad_info.chip_class == GFX9) {
if (device->physical_device->rad_info.chip_class >= GFX9) {
VkPipelineShaderStageCreateInfo pipeline_shader_stage_3d = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
@@ -904,7 +904,7 @@ radv_device_finish_meta_itoi_state(struct radv_device *device)
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->itoi.pipeline, &state->alloc);
if (device->physical_device->rad_info.chip_class == GFX9)
if (device->physical_device->rad_info.chip_class >= GFX9)
radv_DestroyPipeline(radv_device_to_handle(device),
state->itoi.pipeline_3d, &state->alloc);
}
@@ -1191,7 +1191,7 @@ radv_device_init_meta_cleari_state(struct radv_device *device)
struct radv_shader_module cs = { .nir = NULL };
struct radv_shader_module cs_3d = { .nir = NULL };
cs.nir = build_nir_cleari_compute_shader(device, false);
if (device->physical_device->rad_info.chip_class == GFX9)
if (device->physical_device->rad_info.chip_class >= GFX9)
cs_3d.nir = build_nir_cleari_compute_shader(device, true);
/*
@@ -1261,7 +1261,7 @@ radv_device_init_meta_cleari_state(struct radv_device *device)
goto fail;
if (device->physical_device->rad_info.chip_class == GFX9) {
if (device->physical_device->rad_info.chip_class >= GFX9) {
/* compute shader */
VkPipelineShaderStageCreateInfo pipeline_shader_stage_3d = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
@@ -1552,7 +1552,7 @@ create_iview(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = surf->layer,
.layerCount = 1
},
});
}, NULL);
}
static void
@@ -1706,7 +1706,7 @@ radv_meta_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
create_bview(cmd_buffer, dst->buffer, dst->offset, dst->format, &dst_view);
itob_bind_descriptors(cmd_buffer, &src_view, &dst_view);
if (device->physical_device->rad_info.chip_class == GFX9 &&
if (device->physical_device->rad_info.chip_class >= GFX9 &&
src->image->type == VK_IMAGE_TYPE_3D)
pipeline = cmd_buffer->device->meta_state.itob.pipeline_3d;
@@ -1875,7 +1875,7 @@ radv_meta_buffer_to_image_cs(struct radv_cmd_buffer *cmd_buffer,
create_iview(cmd_buffer, dst, &dst_view);
btoi_bind_descriptors(cmd_buffer, &src_view, &dst_view);
if (device->physical_device->rad_info.chip_class == GFX9 &&
if (device->physical_device->rad_info.chip_class >= GFX9 &&
dst->image->type == VK_IMAGE_TYPE_3D)
pipeline = cmd_buffer->device->meta_state.btoi.pipeline_3d;
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
@@ -2060,7 +2060,7 @@ radv_meta_image_to_image_cs(struct radv_cmd_buffer *cmd_buffer,
itoi_bind_descriptors(cmd_buffer, &src_view, &dst_view);
if (device->physical_device->rad_info.chip_class == GFX9 &&
if (device->physical_device->rad_info.chip_class >= GFX9 &&
(src->image->type == VK_IMAGE_TYPE_3D || dst->image->type == VK_IMAGE_TYPE_3D))
pipeline = cmd_buffer->device->meta_state.itoi.pipeline_3d;
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
@@ -2201,7 +2201,7 @@ radv_meta_clear_image_cs(struct radv_cmd_buffer *cmd_buffer,
create_iview(cmd_buffer, dst, &dst_iview);
cleari_bind_descriptors(cmd_buffer, &dst_iview);
if (device->physical_device->rad_info.chip_class == GFX9 &&
if (device->physical_device->rad_info.chip_class >= GFX9 &&
dst->image->type == VK_IMAGE_TYPE_3D)
pipeline = cmd_buffer->device->meta_state.cleari.pipeline_3d;

View File

@@ -367,10 +367,10 @@ emit_color_clear(struct radv_cmd_buffer *cmd_buffer,
{
struct radv_device *device = cmd_buffer->device;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
const struct radv_framebuffer *fb = cmd_buffer->state.framebuffer;
const uint32_t subpass_att = clear_att->colorAttachment;
const uint32_t pass_att = subpass->color_attachments[subpass_att].attachment;
const struct radv_image_view *iview = fb ? fb->attachments[pass_att].attachment : NULL;
const struct radv_image_view *iview = cmd_buffer->state.attachments ?
cmd_buffer->state.attachments[pass_att].iview : NULL;
uint32_t samples, samples_log2;
VkFormat format;
unsigned fs_key;
@@ -629,6 +629,7 @@ static bool depth_view_can_fast_clear(struct radv_cmd_buffer *cmd_buffer,
const struct radv_image_view *iview,
VkImageAspectFlags aspects,
VkImageLayout layout,
bool in_render_loop,
const VkClearRect *clear_rect,
VkClearDepthStencilValue clear_value)
{
@@ -651,7 +652,7 @@ static bool depth_view_can_fast_clear(struct radv_cmd_buffer *cmd_buffer,
iview->base_mip == 0 &&
iview->base_layer == 0 &&
iview->layer_count == iview->image->info.array_size &&
radv_layout_is_htile_compressed(iview->image, layout, queue_mask) &&
radv_layout_is_htile_compressed(iview->image, layout, in_render_loop, queue_mask) &&
radv_image_extent_compare(iview->image, &iview->extent))
return true;
return false;
@@ -664,10 +665,12 @@ pick_depthstencil_pipeline(struct radv_cmd_buffer *cmd_buffer,
int samples_log2,
VkImageAspectFlags aspects,
VkImageLayout layout,
bool in_render_loop,
const VkClearRect *clear_rect,
VkClearDepthStencilValue clear_value)
{
bool fast = depth_view_can_fast_clear(cmd_buffer, iview, aspects, layout, clear_rect, clear_value);
bool fast = depth_view_can_fast_clear(cmd_buffer, iview, aspects, layout,
in_render_loop, clear_rect, clear_value);
int index = DEPTH_CLEAR_SLOW;
VkPipeline *pipeline;
@@ -721,11 +724,11 @@ emit_depthstencil_clear(struct radv_cmd_buffer *cmd_buffer,
struct radv_device *device = cmd_buffer->device;
struct radv_meta_state *meta_state = &device->meta_state;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
const struct radv_framebuffer *fb = cmd_buffer->state.framebuffer;
const uint32_t pass_att = ds_att->attachment;
VkClearDepthStencilValue clear_value = clear_att->clearValue.depthStencil;
VkImageAspectFlags aspects = clear_att->aspectMask;
const struct radv_image_view *iview = fb ? fb->attachments[pass_att].attachment : NULL;
const struct radv_image_view *iview = cmd_buffer->state.attachments ?
cmd_buffer->state.attachments[pass_att].iview : NULL;
uint32_t samples, samples_log2;
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
@@ -763,6 +766,7 @@ emit_depthstencil_clear(struct radv_cmd_buffer *cmd_buffer,
samples_log2,
aspects,
ds_att->layout,
ds_att->in_render_loop,
clear_rect,
clear_value);
if (!pipeline)
@@ -780,7 +784,8 @@ emit_depthstencil_clear(struct radv_cmd_buffer *cmd_buffer,
pipeline);
if (depth_view_can_fast_clear(cmd_buffer, iview, aspects,
ds_att->layout, clear_rect, clear_value))
ds_att->layout, ds_att->in_render_loop,
clear_rect, clear_value))
radv_update_ds_clear_metadata(cmd_buffer, iview->image,
clear_value, aspects);
@@ -981,6 +986,7 @@ static bool
radv_can_fast_clear_depth(struct radv_cmd_buffer *cmd_buffer,
const struct radv_image_view *iview,
VkImageLayout image_layout,
bool in_render_loop,
VkImageAspectFlags aspects,
const VkClearRect *clear_rect,
const VkClearDepthStencilValue clear_value,
@@ -989,7 +995,10 @@ radv_can_fast_clear_depth(struct radv_cmd_buffer *cmd_buffer,
if (!radv_image_view_can_fast_clear(cmd_buffer->device, iview))
return false;
if (!radv_layout_is_htile_compressed(iview->image, image_layout, radv_image_queue_family_mask(iview->image, cmd_buffer->queue_family_index, cmd_buffer->queue_family_index)))
if (!radv_layout_is_htile_compressed(iview->image, image_layout, in_render_loop,
radv_image_queue_family_mask(iview->image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index)))
return false;
if (clear_rect->rect.offset.x || clear_rect->rect.offset.y ||
@@ -1330,15 +1339,18 @@ radv_clear_cmask(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
const VkImageSubresourceRange *range, uint32_t value)
{
uint64_t offset = image->offset + image->cmask.offset;
uint64_t offset = image->offset + image->cmask_offset;
uint64_t size;
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
/* TODO: clear layers. */
size = image->cmask.size;
size = image->planes[0].surface.cmask_size;
} else {
offset += image->cmask.slice_size * range->baseArrayLayer;
size = image->cmask.slice_size * radv_get_layerCount(image, range);
unsigned cmask_slice_size =
image->planes[0].surface.cmask_slice_size;
offset += cmask_slice_size * range->baseArrayLayer;
size = cmask_slice_size * radv_get_layerCount(image, range);
}
return radv_fill_buffer(cmd_buffer, image->bo, offset, size, value);
@@ -1350,7 +1362,7 @@ radv_clear_fmask(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
const VkImageSubresourceRange *range, uint32_t value)
{
uint64_t offset = image->offset + image->fmask.offset;
uint64_t offset = image->offset + image->fmask_offset;
uint64_t size;
/* MSAA images do not support mipmap levels. */
@@ -1359,10 +1371,14 @@ radv_clear_fmask(struct radv_cmd_buffer *cmd_buffer,
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
/* TODO: clear layers. */
size = image->fmask.size;
size = image->planes[0].surface.fmask_size;
} else {
offset += image->fmask.slice_size * range->baseArrayLayer;
size = image->fmask.slice_size * radv_get_layerCount(image, range);
unsigned fmask_slice_size =
image->planes[0].surface.u.legacy.fmask.slice_size;
offset += fmask_slice_size * range->baseArrayLayer;
size = fmask_slice_size * radv_get_layerCount(image, range);
}
return radv_fill_buffer(cmd_buffer, image->bo, offset, size, value);
@@ -1428,7 +1444,9 @@ enum {
RADV_DCC_CLEAR_SECONDARY_1 = 0x40404040U
};
static void vi_get_fast_clear_parameters(VkFormat format,
static void vi_get_fast_clear_parameters(struct radv_device *device,
VkFormat image_format,
VkFormat view_format,
const VkClearColorValue *clear_value,
uint32_t* reset_value,
bool *can_avoid_fast_clear_elim)
@@ -1437,18 +1455,20 @@ static void vi_get_fast_clear_parameters(VkFormat format,
int extra_channel;
bool main_value = false;
bool extra_value = false;
bool has_color = false;
bool has_alpha = false;
int i;
*can_avoid_fast_clear_elim = false;
*reset_value = RADV_DCC_CLEAR_REG;
const struct vk_format_description *desc = vk_format_description(format);
if (format == VK_FORMAT_B10G11R11_UFLOAT_PACK32 ||
format == VK_FORMAT_R5G6B5_UNORM_PACK16 ||
format == VK_FORMAT_B5G6R5_UNORM_PACK16)
const struct vk_format_description *desc = vk_format_description(view_format);
if (view_format == VK_FORMAT_B10G11R11_UFLOAT_PACK32 ||
view_format == VK_FORMAT_R5G6B5_UNORM_PACK16 ||
view_format == VK_FORMAT_B5G6R5_UNORM_PACK16)
extra_channel = -1;
else if (desc->layout == VK_FORMAT_LAYOUT_PLAIN) {
if (radv_translate_colorswap(format, false) <= 1)
if (vi_alpha_is_on_msb(device, view_format))
extra_channel = desc->nr_channels - 1;
else
extra_channel = 0;
@@ -1483,12 +1503,21 @@ static void vi_get_fast_clear_parameters(VkFormat format,
return;
}
if (index == extra_channel)
if (index == extra_channel) {
extra_value = values[i];
else
has_alpha = true;
} else {
main_value = values[i];
has_color = true;
}
}
/* If alpha isn't present, make it the same as color, and vice versa. */
if (!has_alpha)
extra_value = main_value;
else if (!has_color)
main_value = extra_value;
for (int i = 0; i < 4; ++i)
if (values[i] != main_value &&
desc->swizzle[i] - VK_SWIZZLE_X != extra_channel &&
@@ -1510,6 +1539,7 @@ static bool
radv_can_fast_clear_color(struct radv_cmd_buffer *cmd_buffer,
const struct radv_image_view *iview,
VkImageLayout image_layout,
bool in_render_loop,
const VkClearRect *clear_rect,
VkClearColorValue clear_value,
uint32_t view_mask)
@@ -1519,7 +1549,10 @@ radv_can_fast_clear_color(struct radv_cmd_buffer *cmd_buffer,
if (!radv_image_view_can_fast_clear(cmd_buffer->device, iview))
return false;
if (!radv_layout_can_fast_clear(iview->image, image_layout, radv_image_queue_family_mask(iview->image, cmd_buffer->queue_family_index, cmd_buffer->queue_family_index)))
if (!radv_layout_can_fast_clear(iview->image, image_layout, in_render_loop,
radv_image_queue_family_mask(iview->image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index)))
return false;
if (clear_rect->rect.offset.x || clear_rect->rect.offset.y ||
@@ -1544,7 +1577,9 @@ radv_can_fast_clear_color(struct radv_cmd_buffer *cmd_buffer,
bool can_avoid_fast_clear_elim;
uint32_t reset_value;
vi_get_fast_clear_parameters(iview->vk_format,
vi_get_fast_clear_parameters(cmd_buffer->device,
iview->image->vk_format,
iview->vk_format,
&clear_value, &reset_value,
&can_avoid_fast_clear_elim);
@@ -1616,7 +1651,9 @@ radv_fast_clear_color(struct radv_cmd_buffer *cmd_buffer,
bool can_avoid_fast_clear_elim;
bool need_decompress_pass = false;
vi_get_fast_clear_parameters(iview->vk_format,
vi_get_fast_clear_parameters(cmd_buffer->device,
iview->image->vk_format,
iview->vk_format,
&clear_value, &reset_value,
&can_avoid_fast_clear_elim);
@@ -1672,10 +1709,11 @@ emit_clear(struct radv_cmd_buffer *cmd_buffer,
return;
VkImageLayout image_layout = subpass->color_attachments[subpass_att].layout;
const struct radv_image_view *iview = fb ? fb->attachments[pass_att].attachment : NULL;
bool in_render_loop = subpass->color_attachments[subpass_att].in_render_loop;
const struct radv_image_view *iview = fb ? cmd_buffer->state.attachments[pass_att].iview : NULL;
VkClearColorValue clear_value = clear_att->clearValue.color;
if (radv_can_fast_clear_color(cmd_buffer, iview, image_layout,
if (radv_can_fast_clear_color(cmd_buffer, iview, image_layout, in_render_loop,
clear_rect, clear_value, view_mask)) {
radv_fast_clear_color(cmd_buffer, iview, clear_att,
subpass_att, pre_flush,
@@ -1693,15 +1731,16 @@ emit_clear(struct radv_cmd_buffer *cmd_buffer,
return;
VkImageLayout image_layout = ds_att->layout;
const struct radv_image_view *iview = fb ? fb->attachments[ds_att->attachment].attachment : NULL;
bool in_render_loop = ds_att->in_render_loop;
const struct radv_image_view *iview = fb ? cmd_buffer->state.attachments[ds_att->attachment].iview : NULL;
VkClearDepthStencilValue clear_value = clear_att->clearValue.depthStencil;
assert(aspects & (VK_IMAGE_ASPECT_DEPTH_BIT |
VK_IMAGE_ASPECT_STENCIL_BIT));
if (radv_can_fast_clear_depth(cmd_buffer, iview, image_layout,
aspects, clear_rect, clear_value,
view_mask)) {
in_render_loop, aspects, clear_rect,
clear_value, view_mask)) {
radv_fast_clear_depth(cmd_buffer, iview, clear_att,
pre_flush, post_flush);
} else {
@@ -1874,7 +1913,7 @@ radv_clear_image_layer(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = range->baseArrayLayer + layer,
.layerCount = 1
},
});
}, NULL);
VkFramebuffer fb;
radv_CreateFramebuffer(device_h,
@@ -1985,6 +2024,7 @@ radv_fast_clear_range(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkFormat format,
VkImageLayout image_layout,
bool in_render_loop,
const VkImageSubresourceRange *range,
const VkClearValue *clear_val)
{
@@ -2003,7 +2043,7 @@ radv_fast_clear_range(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = range->baseArrayLayer,
.layerCount = range->layerCount,
},
});
}, NULL);
VkClearRect clear_rect = {
.rect = {
@@ -2024,8 +2064,8 @@ radv_fast_clear_range(struct radv_cmd_buffer *cmd_buffer,
};
if (vk_format_is_color(format)) {
if (radv_can_fast_clear_color(cmd_buffer, &iview,
image_layout, &clear_rect,
if (radv_can_fast_clear_color(cmd_buffer, &iview, image_layout,
in_render_loop, &clear_rect,
clear_att.clearValue.color, 0)) {
radv_fast_clear_color(cmd_buffer, &iview, &clear_att,
clear_att.colorAttachment,
@@ -2034,8 +2074,9 @@ radv_fast_clear_range(struct radv_cmd_buffer *cmd_buffer,
}
} else {
if (radv_can_fast_clear_depth(cmd_buffer, &iview, image_layout,
range->aspectMask, &clear_rect,
clear_att.clearValue.depthStencil, 0)) {
in_render_loop,range->aspectMask,
&clear_rect, clear_att.clearValue.depthStencil,
0)) {
radv_fast_clear_depth(cmd_buffer, &iview, &clear_att,
NULL, NULL);
return true;
@@ -2085,7 +2126,7 @@ radv_cmd_clear_image(struct radv_cmd_buffer *cmd_buffer,
*/
if (!cs &&
radv_fast_clear_range(cmd_buffer, image, format,
image_layout, range,
image_layout, false, range,
&internal_clear_value)) {
continue;
}

View File

@@ -191,7 +191,7 @@ meta_copy_buffer_to_image(struct radv_cmd_buffer *cmd_buffer,
uint32_t queue_mask = radv_image_queue_family_mask(image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
bool compressed = radv_layout_dcc_compressed(image, layout, queue_mask);
bool compressed = radv_layout_dcc_compressed(cmd_buffer->device, image, layout, false, queue_mask);
if (compressed) {
radv_decompress_dcc(cmd_buffer, image, &(VkImageSubresourceRange) {
.aspectMask = pRegions[r].imageSubresource.aspectMask,
@@ -335,7 +335,7 @@ meta_copy_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
uint32_t queue_mask = radv_image_queue_family_mask(image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
bool compressed = radv_layout_dcc_compressed(image, layout, queue_mask);
bool compressed = radv_layout_dcc_compressed(cmd_buffer->device, image, layout, false, queue_mask);
if (compressed) {
radv_decompress_dcc(cmd_buffer, image, &(VkImageSubresourceRange) {
.aspectMask = pRegions[r].imageSubresource.aspectMask,
@@ -464,11 +464,11 @@ meta_copy_image(struct radv_cmd_buffer *cmd_buffer,
uint32_t dst_queue_mask = radv_image_queue_family_mask(dest_image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
bool dst_compressed = radv_layout_dcc_compressed(dest_image, dest_image_layout, dst_queue_mask);
bool dst_compressed = radv_layout_dcc_compressed(cmd_buffer->device, dest_image, dest_image_layout, false, dst_queue_mask);
uint32_t src_queue_mask = radv_image_queue_family_mask(src_image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
bool src_compressed = radv_layout_dcc_compressed(src_image, src_image_layout, src_queue_mask);
bool src_compressed = radv_layout_dcc_compressed(cmd_buffer->device, src_image, src_image_layout, false, src_queue_mask);
if (!src_compressed || radv_dcc_formats_compatible(b_src.format, b_dst.format)) {
b_src.format = b_dst.format;

View File

@@ -433,7 +433,7 @@ static void radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = subresourceRange->baseArrayLayer + layer,
.layerCount = 1,
},
});
}, NULL);
VkFramebuffer fb_h;

View File

@@ -590,7 +590,7 @@ radv_process_color_image_layer(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = range->baseArrayLayer + layer,
.layerCount = 1,
},
});
}, NULL);
VkFramebuffer fb_h;
radv_CreateFramebuffer(radv_device_to_handle(device),
@@ -782,7 +782,8 @@ radv_decompress_dcc_compute(struct radv_cmd_buffer *cmd_buffer,
const VkImageSubresourceRange *subresourceRange)
{
struct radv_meta_saved_state saved_state;
struct radv_image_view iview = {0};
struct radv_image_view load_iview = {0};
struct radv_image_view store_iview = {0};
struct radv_device *device = cmd_buffer->device;
/* This assumes the image is 2d with 1 layer */
@@ -819,7 +820,7 @@ radv_decompress_dcc_compute(struct radv_cmd_buffer *cmd_buffer,
subresourceRange->baseMipLevel + l);
for (uint32_t s = 0; s < radv_get_layerCount(image, subresourceRange); s++) {
radv_image_view_init(&iview, cmd_buffer->device,
radv_image_view_init(&load_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(image),
@@ -832,6 +833,22 @@ radv_decompress_dcc_compute(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = subresourceRange->baseArrayLayer + s,
.layerCount = 1
},
}, NULL);
radv_image_view_init(&store_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(image),
.viewType = VK_IMAGE_VIEW_TYPE_2D,
.format = image->vk_format,
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = subresourceRange->baseMipLevel + l,
.levelCount = 1,
.baseArrayLayer = subresourceRange->baseArrayLayer + s,
.layerCount = 1
},
}, &(struct radv_image_view_extra_create_info) {
.disable_compression = true
});
radv_meta_push_descriptor_set(cmd_buffer,
@@ -849,7 +866,7 @@ radv_decompress_dcc_compute(struct radv_cmd_buffer *cmd_buffer,
.pImageInfo = (VkDescriptorImageInfo[]) {
{
.sampler = VK_NULL_HANDLE,
.imageView = radv_image_view_to_handle(&iview),
.imageView = radv_image_view_to_handle(&load_iview),
.imageLayout = VK_IMAGE_LAYOUT_GENERAL,
},
}
@@ -863,7 +880,7 @@ radv_decompress_dcc_compute(struct radv_cmd_buffer *cmd_buffer,
.pImageInfo = (VkDescriptorImageInfo[]) {
{
.sampler = VK_NULL_HANDLE,
.imageView = radv_image_view_to_handle(&iview),
.imageView = radv_image_view_to_handle(&store_iview),
.imageLayout = VK_IMAGE_LAYOUT_GENERAL,
},
}

View File

@@ -139,7 +139,7 @@ radv_expand_fmask_image_inplace(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = subresourceRange->baseArrayLayer + l,
.layerCount = 1,
},
});
}, NULL);
radv_meta_push_descriptor_set(cmd_buffer,
VK_PIPELINE_BIND_POINT_COMPUTE,

View File

@@ -330,10 +330,12 @@ enum radv_resolve_method {
RESOLVE_FRAGMENT,
};
static void radv_pick_resolve_method_images(struct radv_image *src_image,
static void radv_pick_resolve_method_images(struct radv_device *device,
struct radv_image *src_image,
VkFormat src_format,
struct radv_image *dest_image,
VkImageLayout dest_image_layout,
bool dest_render_loop,
struct radv_cmd_buffer *cmd_buffer,
enum radv_resolve_method *method)
@@ -352,7 +354,8 @@ static void radv_pick_resolve_method_images(struct radv_image *src_image,
dest_image->info.array_size > 1)
*method = RESOLVE_COMPUTE;
if (radv_layout_dcc_compressed(dest_image, dest_image_layout, queue_mask)) {
if (radv_layout_dcc_compressed(device, dest_image, dest_image_layout,
dest_render_loop, queue_mask)) {
*method = RESOLVE_FRAGMENT;
} else if (dest_image->planes[0].surface.micro_tile_mode !=
src_image->planes[0].surface.micro_tile_mode) {
@@ -431,9 +434,10 @@ void radv_CmdResolveImage(
} else
resolve_method = RESOLVE_COMPUTE;
radv_pick_resolve_method_images(src_image, src_image->vk_format,
dest_image, dest_image_layout,
cmd_buffer, &resolve_method);
radv_pick_resolve_method_images(cmd_buffer->device, src_image,
src_image->vk_format, dest_image,
dest_image_layout, false, cmd_buffer,
&resolve_method);
if (resolve_method == RESOLVE_FRAGMENT) {
radv_meta_resolve_fragment_image(cmd_buffer,
@@ -549,7 +553,7 @@ void radv_CmdResolveImage(
.baseArrayLayer = src_base_layer + layer,
.layerCount = 1,
},
});
}, NULL);
struct radv_image_view dest_iview;
radv_image_view_init(&dest_iview, cmd_buffer->device,
@@ -565,7 +569,7 @@ void radv_CmdResolveImage(
.baseArrayLayer = dest_base_layer + layer,
.layerCount = 1,
},
});
}, NULL);
VkFramebuffer fb_h;
radv_CreateFramebuffer(device_h,
@@ -641,14 +645,16 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
struct radv_subpass_attachment src_att = *subpass->depth_stencil_attachment;
struct radv_subpass_attachment dst_att = *subpass->ds_resolve_attachment;
struct radv_image_view *src_iview =
cmd_buffer->state.framebuffer->attachments[src_att.attachment].attachment;
cmd_buffer->state.attachments[src_att.attachment].iview;
struct radv_image_view *dst_iview =
cmd_buffer->state.framebuffer->attachments[dst_att.attachment].attachment;
cmd_buffer->state.attachments[dst_att.attachment].iview;
radv_pick_resolve_method_images(src_iview->image,
radv_pick_resolve_method_images(cmd_buffer->device,
src_iview->image,
src_iview->vk_format,
dst_iview->image,
dst_att.layout,
dst_att.in_render_loop,
cmd_buffer,
&resolve_method);
@@ -694,12 +700,14 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
/* Make sure to not clear color attachments after resolves. */
cmd_buffer->state.attachments[dest_att.attachment].pending_clear_aspects = 0;
struct radv_image *dst_img = cmd_buffer->state.framebuffer->attachments[dest_att.attachment].attachment->image;
struct radv_image_view *src_iview= cmd_buffer->state.framebuffer->attachments[src_att.attachment].attachment;
struct radv_image *dst_img = cmd_buffer->state.attachments[dest_att.attachment].iview->image;
struct radv_image_view *src_iview= cmd_buffer->state.attachments[src_att.attachment].iview;
struct radv_image *src_img = src_iview->image;
radv_pick_resolve_method_images(src_img, src_iview->vk_format,
dst_img, dest_att.layout,
radv_pick_resolve_method_images(cmd_buffer->device, src_img,
src_iview->vk_format, dst_img,
dest_att.layout,
dest_att.in_render_loop,
cmd_buffer, &resolve_method);
if (resolve_method == RESOLVE_FRAGMENT) {
@@ -725,7 +733,7 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
if (dest_att.attachment == VK_ATTACHMENT_UNUSED)
continue;
struct radv_image_view *dest_iview = cmd_buffer->state.framebuffer->attachments[dest_att.attachment].attachment;
struct radv_image_view *dest_iview = cmd_buffer->state.attachments[dest_att.attachment].iview;
struct radv_image *dst_img = dest_iview->image;
if (radv_dcc_enabled(dst_img, dest_iview->base_mip)) {
@@ -787,8 +795,7 @@ radv_decompress_resolve_subpass_src(struct radv_cmd_buffer *cmd_buffer)
if (dest_att.attachment == VK_ATTACHMENT_UNUSED)
continue;
struct radv_image_view *src_iview =
fb->attachments[src_att.attachment].attachment;
struct radv_image_view *src_iview = cmd_buffer->state.attachments[src_att.attachment].iview;
struct radv_image *src_image = src_iview->image;
VkImageResolve region = {};
@@ -803,8 +810,7 @@ radv_decompress_resolve_subpass_src(struct radv_cmd_buffer *cmd_buffer)
if (subpass->ds_resolve_attachment) {
struct radv_subpass_attachment src_att = *subpass->depth_stencil_attachment;
struct radv_image_view *src_iview =
fb->attachments[src_att.attachment].attachment;
struct radv_image_view *src_iview = fb->attachments[src_att.attachment];
struct radv_image *src_image = src_iview->image;
VkImageResolve region = {};

View File

@@ -863,7 +863,7 @@ void radv_meta_resolve_compute_image(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = src_base_layer + layer,
.layerCount = 1,
},
});
}, NULL);
struct radv_image_view dest_iview;
radv_image_view_init(&dest_iview, cmd_buffer->device,
@@ -879,7 +879,7 @@ void radv_meta_resolve_compute_image(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = dest_base_layer + layer,
.layerCount = 1,
},
});
}, NULL);
emit_resolve(cmd_buffer,
&src_iview,
@@ -921,8 +921,8 @@ radv_cmd_buffer_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer)
if (dst_att.attachment == VK_ATTACHMENT_UNUSED)
continue;
struct radv_image_view *src_iview = fb->attachments[src_att.attachment].attachment;
struct radv_image_view *dst_iview = fb->attachments[dst_att.attachment].attachment;
struct radv_image_view *src_iview = cmd_buffer->state.attachments[src_att.attachment].iview;
struct radv_image_view *dst_iview = cmd_buffer->state.attachments[dst_att.attachment].iview;
VkImageResolve region = {
.extent = (VkExtent3D){ fb->width, fb->height, 0 },
@@ -989,9 +989,9 @@ radv_depth_stencil_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer,
struct radv_subpass_attachment dest_att = *subpass->ds_resolve_attachment;
struct radv_image_view *src_iview =
cmd_buffer->state.framebuffer->attachments[src_att.attachment].attachment;
cmd_buffer->state.attachments[src_att.attachment].iview;
struct radv_image_view *dst_iview =
cmd_buffer->state.framebuffer->attachments[dest_att.attachment].attachment;
cmd_buffer->state.attachments[dest_att.attachment].iview;
struct radv_image *src_image = src_iview->image;
struct radv_image *dst_image = dst_iview->image;
@@ -1011,7 +1011,7 @@ radv_depth_stencil_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = src_iview->base_layer + layer,
.layerCount = 1,
},
});
}, NULL);
struct radv_image_view tdst_iview;
radv_image_view_init(&tdst_iview, cmd_buffer->device,
@@ -1027,7 +1027,7 @@ radv_depth_stencil_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = dst_iview->base_layer + layer,
.layerCount = 1,
},
});
}, NULL);
emit_depth_stencil_resolve(cmd_buffer, &tsrc_iview, &tdst_iview,
&(VkOffset2D) { 0, 0 },

View File

@@ -535,7 +535,7 @@ create_depth_stencil_resolve_pipeline(struct radv_device *device,
.pAttachments = &(VkAttachmentDescription) {
.format = src_format,
.loadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
.storeOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
.storeOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
.stencilStoreOp = VK_ATTACHMENT_STORE_OP_STORE,
.initialLayout = VK_IMAGE_LAYOUT_GENERAL,
@@ -1050,7 +1050,7 @@ void radv_meta_resolve_fragment_image(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = src_base_layer + layer,
.layerCount = 1,
},
});
}, NULL);
struct radv_image_view dest_iview;
radv_image_view_init(&dest_iview, cmd_buffer->device,
@@ -1066,7 +1066,7 @@ void radv_meta_resolve_fragment_image(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = dest_base_layer + layer,
.layerCount = 1,
},
});
}, NULL);
VkFramebuffer fb;
@@ -1146,8 +1146,8 @@ radv_cmd_buffer_resolve_subpass_fs(struct radv_cmd_buffer *cmd_buffer)
if (dest_att.attachment == VK_ATTACHMENT_UNUSED)
continue;
struct radv_image_view *dest_iview = cmd_buffer->state.framebuffer->attachments[dest_att.attachment].attachment;
struct radv_image_view *src_iview = cmd_buffer->state.framebuffer->attachments[src_att.attachment].attachment;
struct radv_image_view *dest_iview = cmd_buffer->state.attachments[dest_att.attachment].iview;
struct radv_image_view *src_iview = cmd_buffer->state.attachments[src_att.attachment].iview;
struct radv_subpass resolve_subpass = {
.color_count = 1,
@@ -1201,10 +1201,10 @@ radv_depth_stencil_resolve_subpass_fs(struct radv_cmd_buffer *cmd_buffer,
struct radv_subpass_attachment dst_att = *subpass->ds_resolve_attachment;
struct radv_image_view *src_iview =
cmd_buffer->state.framebuffer->attachments[src_att.attachment].attachment;
cmd_buffer->state.attachments[src_att.attachment].iview;
struct radv_image *src_image = src_iview->image;
struct radv_image_view *dst_iview =
cmd_buffer->state.framebuffer->attachments[dst_att.attachment].attachment;
cmd_buffer->state.attachments[dst_att.attachment].iview;
struct radv_subpass resolve_subpass = {
.color_count = 0,
@@ -1228,7 +1228,7 @@ radv_depth_stencil_resolve_subpass_fs(struct radv_cmd_buffer *cmd_buffer,
.baseArrayLayer = 0,
.layerCount = 1,
},
});
}, NULL);
emit_depth_stencil_resolve(cmd_buffer, &tsrc_iview, dst_iview,
&(VkOffset2D) { 0, 0 },

View File

@@ -295,7 +295,7 @@ get_tcs_num_patches(struct radv_shader_context *ctx)
/* GFX6 bug workaround - limit LS-HS threadgroups to only one wave. */
if (ctx->options->chip_class == GFX6) {
unsigned one_wave = 64 / MAX2(num_tcs_input_cp, num_tcs_output_cp);
unsigned one_wave = ctx->options->wave_size / MAX2(num_tcs_input_cp, num_tcs_output_cp);
num_patches = MIN2(num_patches, one_wave);
}
return num_patches;
@@ -754,7 +754,7 @@ static void allocate_user_sgprs(struct radv_shader_context *ctx,
if (ctx->shader_info->info.loads_push_constants)
user_sgpr_count++;
if (ctx->streamout_buffers)
if (ctx->shader_info->info.so.num_outputs)
user_sgpr_count++;
uint32_t available_sgprs = ctx->options->chip_class >= GFX9 && stage != MESA_SHADER_COMPUTE ? 32 : 16;
@@ -1703,6 +1703,18 @@ load_tes_input(struct ac_shader_abi *abi,
return result;
}
static LLVMValueRef
radv_emit_fetch_64bit(struct radv_shader_context *ctx,
LLVMTypeRef type, LLVMValueRef a, LLVMValueRef b)
{
LLVMValueRef values[2] = {
ac_to_integer(&ctx->ac, a),
ac_to_integer(&ctx->ac, b),
};
LLVMValueRef result = ac_build_gather_values(&ctx->ac, values, 2);
return LLVMBuildBitCast(ctx->ac.builder, result, type, "");
}
static LLVMValueRef
load_gs_input(struct ac_shader_abi *abi,
unsigned location,
@@ -1731,6 +1743,14 @@ load_gs_input(struct ac_shader_abi *abi,
dw_addr = LLVMBuildAdd(ctx->ac.builder, dw_addr,
LLVMConstInt(ctx->ac.i32, param * 4 + i + const_index, 0), "");
value[i] = ac_lds_load(&ctx->ac, dw_addr);
if (ac_get_type_size(type) == 8) {
dw_addr = LLVMBuildAdd(ctx->ac.builder, dw_addr,
LLVMConstInt(ctx->ac.i32, param * 4 + i + const_index + 1, 0), "");
LLVMValueRef tmp = ac_lds_load(&ctx->ac, dw_addr);
value[i] = radv_emit_fetch_64bit(ctx, type, value[i], tmp);
}
} else {
LLVMValueRef soffset =
LLVMConstInt(ctx->ac.i32,
@@ -1742,6 +1762,21 @@ load_gs_input(struct ac_shader_abi *abi,
ctx->ac.i32_0,
vtx_offset, soffset,
0, ac_glc, true, false);
if (ac_get_type_size(type) == 8) {
soffset = LLVMConstInt(ctx->ac.i32,
(param * 4 + i + const_index + 1) * 256,
false);
LLVMValueRef tmp =
ac_build_buffer_load(&ctx->ac,
ctx->esgs_ring, 1,
ctx->ac.i32_0,
vtx_offset, soffset,
0, ac_glc, true, false);
value[i] = radv_emit_fetch_64bit(ctx, type, value[i], tmp);
}
}
if (ac_get_type_size(type) == 2) {
@@ -3038,7 +3073,8 @@ handle_es_outputs_post(struct radv_shader_context *ctx,
LLVMValueRef wave_idx = ac_unpack_param(&ctx->ac, ctx->merged_wave_info, 24, 4);
vertex_idx = LLVMBuildOr(ctx->ac.builder, vertex_idx,
LLVMBuildMul(ctx->ac.builder, wave_idx,
LLVMConstInt(ctx->ac.i32, 64, false), ""), "");
LLVMConstInt(ctx->ac.i32,
ctx->ac.wave_size, false), ""), "");
lds_base = LLVMBuildMul(ctx->ac.builder, vertex_idx,
LLVMConstInt(ctx->ac.i32, itemsize_dw, 0), "");
}
@@ -3140,7 +3176,7 @@ static LLVMValueRef get_thread_id_in_tg(struct radv_shader_context *ctx)
LLVMBuilderRef builder = ctx->ac.builder;
LLVMValueRef tmp;
tmp = LLVMBuildMul(builder, get_wave_id_in_tg(ctx),
LLVMConstInt(ctx->ac.i32, 64, false), "");
LLVMConstInt(ctx->ac.i32, ctx->ac.wave_size, false), "");
return LLVMBuildAdd(builder, tmp, ac_get_thread_id(&ctx->ac), "");
}
@@ -4190,7 +4226,7 @@ ac_setup_rings(struct radv_shader_context *ctx)
*/
LLVMTypeRef v2i64 = LLVMVectorType(ctx->ac.i64, 2);
uint64_t stream_offset = 0;
unsigned num_records = 64;
unsigned num_records = ctx->ac.wave_size;
LLVMValueRef base_ring;
base_ring =
@@ -4223,7 +4259,7 @@ ac_setup_rings(struct radv_shader_context *ctx)
ring = LLVMBuildInsertElement(ctx->ac.builder,
ring, tmp, ctx->ac.i32_0, "");
stream_offset += stride * 64;
stream_offset += stride * ctx->ac.wave_size;
ring = LLVMBuildBitCast(ctx->ac.builder, ring,
ctx->ac.v4i32, "");
@@ -4257,23 +4293,8 @@ radv_nir_get_max_workgroup_size(enum chip_class chip_class,
gl_shader_stage stage,
const struct nir_shader *nir)
{
switch (stage) {
case MESA_SHADER_TESS_CTRL:
return chip_class >= GFX7 ? 128 : 64;
case MESA_SHADER_GEOMETRY:
return chip_class >= GFX9 ? 128 : 64;
case MESA_SHADER_COMPUTE:
break;
default:
return 0;
}
if (!nir)
return chip_class >= GFX9 ? 128 : 64;
unsigned max_workgroup_size = nir->info.cs.local_size[0] *
nir->info.cs.local_size[1] *
nir->info.cs.local_size[2];
return max_workgroup_size;
const unsigned backup_sizes[] = {chip_class >= GFX9 ? 128 : 64, 1, 1};
return radv_get_max_workgroup_size(chip_class, stage, nir ? nir->info.cs.local_size : backup_sizes);
}
/* Fixup the HW not emitting the TCS regs if there are no HS threads. */
@@ -4317,15 +4338,6 @@ static void declare_esgs_ring(struct radv_shader_context *ctx)
LLVMSetAlignment(ctx->esgs_ring, 64 * 1024);
}
static uint8_t
radv_nir_shader_wave_size(struct nir_shader *const *shaders, int shader_count,
const struct radv_nir_compiler_options *options)
{
if (shaders[0]->info.stage == MESA_SHADER_COMPUTE)
return options->cs_wave_size;
return 64;
}
static
LLVMModuleRef ac_translate_nir_to_llvm(struct ac_llvm_compiler *ac_llvm,
struct nir_shader *const *shaders,
@@ -4342,11 +4354,9 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct ac_llvm_compiler *ac_llvm,
options->unsafe_math ? AC_FLOAT_MODE_UNSAFE_FP_MATH :
AC_FLOAT_MODE_DEFAULT;
uint8_t wave_size = radv_nir_shader_wave_size(shaders,
shader_count, options);
ac_llvm_context_init(&ctx.ac, ac_llvm, options->chip_class,
options->family, float_mode, wave_size);
options->family, float_mode, options->wave_size,
options->wave_size);
ctx.context = ctx.ac.context;
radv_nir_shader_info_init(&shader_info->info);
@@ -4386,6 +4396,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct ac_llvm_compiler *ac_llvm,
ctx.abi.load_resource = radv_load_resource;
ctx.abi.clamp_shadow_reference = false;
ctx.abi.gfx9_stride_size_workaround = ctx.ac.chip_class == GFX9 && HAVE_LLVM < 0x800;
ctx.abi.robust_buffer_access = options->robust_buffer_access;
/* Because the new raw/struct atomic intrinsics are buggy with LLVM 8,
* we fallback to the old intrinsics for atomic buffer image operations
@@ -4746,6 +4757,7 @@ radv_compile_nir_shader(struct ac_llvm_compiler *ac_llvm,
shader_info->gs.es_type = nir[0]->info.stage;
}
}
shader_info->info.wave_size = options->wave_size;
}
static void
@@ -4777,7 +4789,7 @@ ac_gs_copy_shader_emit(struct radv_shader_context *ctx)
LLVMBasicBlockRef bb;
unsigned offset;
if (!num_components)
if (stream > 0 && !num_components)
continue;
if (stream > 0 && !ctx->shader_info->info.so.num_outputs)
@@ -4858,7 +4870,7 @@ radv_compile_gs_copy_shader(struct ac_llvm_compiler *ac_llvm,
AC_FLOAT_MODE_DEFAULT;
ac_llvm_context_init(&ctx.ac, ac_llvm, options->chip_class,
options->family, float_mode, 64);
options->family, float_mode, 64, 64);
ctx.context = ctx.ac.context;
ctx.is_gs_copy_shader = true;

Some files were not shown because too many files have changed in this diff Show More