Compare commits

...

50 Commits

Author SHA1 Message Date
Dylan Baker
33595f88d6 VERSION: bump for 21.2.0 release 2021-08-04 11:47:36 -07:00
Dylan Baker
c0623dbe16 docs: clear new_features for 21.2.0 release 2021-08-04 11:47:19 -07:00
Dylan Baker
2eb92dec11 docs: add release notes for 21.2.0 2021-08-04 11:44:40 -07:00
Lionel Landwerlin
18b65515a6 intel/disasm: fix missing oword index decoding
Also switch to array of strings to show high/low dwords.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: daba2894ff ("intel/disasm: decode/describe more send messages")
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12183>
(cherry picked from commit 97be8e42e4)
2021-08-03 11:07:19 -07:00
Pierre Moreau
ecfa127381 clover/nir: Set constant buffer pointer size to host
The `argument::size` is supposed to represent the size of a pointer on
the host and not on the device (for which `argument::target_size`
exists).

v3: Use `sizeof(buf)` instead of `marg.size`. (Francisco Jerez)

Fixes: 7c6f1d3bf9 ("clover/nir: extract constant buffer into its own section")

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10256>
(cherry picked from commit b4e5bf0637)
2021-08-03 11:07:18 -07:00
Pierre Moreau
90645f3eea clover/spirv: Properly size 3-component vector args
This resolves clover returning `CL_INVALID_ARG_SIZE` whenever the OpenCL
CTS called `clSetKernelArg()` for 3-component vectors.

Fixes: 2147386505 ("clover/spirv: Add functions for parsing arguments, linking programs, etc.")

v2: Remove “api/clsetkernelarg/set kernel argument for cl_int3” from the
  expected fails for llvmpipe

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10256>
(cherry picked from commit a6c26a6ad9)
2021-08-03 11:07:17 -07:00
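
A minimal sketch of the sizing rule behind the clover/spirv fix above. In OpenCL, a 3-component vector occupies the same storage as the 4-component one, so `sizeof(cl_int3)` equals `sizeof(cl_int4)`. The helper name `vector_arg_size` is hypothetical and is not part of clover.

```python
def vector_arg_size(elem_size: int, components: int) -> int:
    """Host-side byte size of an OpenCL vector kernel argument (illustrative)."""
    if components not in (1, 2, 3, 4, 8, 16):
        raise ValueError("unsupported vector width")
    # 3-component vectors are stored and aligned like 4-component ones.
    padded = 4 if components == 3 else components
    return elem_size * padded


# A cl_int3 argument (4-byte elements) is as large as a cl_int4.
int3_size = vector_arg_size(4, 3)
int4_size = vector_arg_size(4, 4)
```

An argument check that compares the caller's size against `elem_size * 3` would reject the value the CTS passes, which is consistent with the `CL_INVALID_ARG_SIZE` failure described in the commit message.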
Erik Faye-Lund
432964005b d3d12: split up root parameter update and set
SRV descriptors can require state-transitions before it's legal to set
them on the command-list. We used to just set them right away, and get
away with it, because the validator didn't verify this because we used
to flag the parameters as volatile.

Now that we don't, we trigger validation errors when setting a root
parameter that needs a transition first.

So let's split up the logic a bit, so we can prepare the tables, then do
the transition, and finally set the tables. We do this for all tables
instead of just the SRVs, just because it makes the logic a bit easier to
follow. We leave root constants alone, because they will never require
this, and doing them late would just complicate things.

Fixes: 1208290558 ("d3d12: Sets all SRV descriptors as data-static")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12187>
(cherry picked from commit cd79351f02)
2021-08-03 11:07:17 -07:00
Juan A. Suarez Romero
a5dec10d83 gallium/hud: initialize query
Most of the drivers don't set up the maximum value in the query info. So
when later hud_pane_set_max_value() is invoked, we are using a rather
"random" number.

Turns out that in some 32bit cases, this random number is big enough
that `leftmost_digit` is 0 because DIV_ROUND_UP() overflows, aborting
with an assertion.

Fixes: c91cf7d7d2 ("gallium: implement a heads-up display module")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12181>
(cherry picked from commit 10541d1fad)
2021-08-03 11:07:16 -07:00
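
The overflow described in the gallium/hud commit above can be illustrated with a small sketch. It assumes the common `((a + b - 1) / b)` form of `DIV_ROUND_UP()` and emulates fixed-width unsigned arithmetic with a mask. The function name and the 64-bit width are assumptions made for illustration, not details taken from the HUD code.

```python
def div_round_up_wrapping(a: int, b: int, bits: int = 64) -> int:
    """Emulate ((a + b - 1) / b) on unsigned integers that wrap at `bits` bits."""
    mask = (1 << bits) - 1
    # The addition wraps around before the division happens.
    return ((a + b - 1) & mask) // b


# A sane value rounds up as expected.
small = div_round_up_wrapping(95, 10)
# An uninitialized, near-maximum value wraps and collapses to 0.
huge = div_round_up_wrapping((1 << 64) - 1, 10)
```

A result of 0 from a value that large is the kind of outcome that would trip an assertion expecting a non-zero leading digit, which is why initializing the query info avoids the problem at its source.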
Samuel Pitoiset
1d02d0743d radv: fix missing cache flushes when clearing HTILE levels on GFX10+
The driver should accumulate the cache flush bits because if it uses
CP DMA for clearing the last level, it won't flush.

Found by inspection.

Cc: 21.2 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12170>
(cherry picked from commit ad83c06a5f)
2021-08-03 11:07:16 -07:00
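
A rough sketch of why the radv HTILE fix above accumulates flush bits instead of overwriting them. The flag values and the function name are hypothetical; the only assumption carried over from the commit message is that the last clear may report no flush bits (the CP DMA case).

```python
FLUSH_A = 0x1  # hypothetical flush flag
FLUSH_B = 0x2  # hypothetical flush flag


def clear_levels(per_level_flush_bits, accumulate=True):
    """Combine the flush bits reported by each per-level clear."""
    flush_bits = 0
    for bits in per_level_flush_bits:
        if accumulate:
            flush_bits |= bits   # keep what earlier levels requested
        else:
            flush_bits = bits    # old behaviour: the last level wins
    return flush_bits


# The first level needs flushes, the last one reports none.
levels = [FLUSH_A | FLUSH_B, 0]
fixed = clear_levels(levels, accumulate=True)
buggy = clear_levels(levels, accumulate=False)
```

With plain assignment the flushes requested by earlier levels are silently dropped whenever the final level reports none.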
Samuel Pitoiset
5b4b4b9ef6 radv: fix selecting the first active CU when profiling with SQTT
Fixes: d26bcc0f5c ("radv: always select the first active CU when profiling with SQTT")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12167>
(cherry picked from commit ebea075feb)
2021-08-03 11:07:15 -07:00
Timothy Arceri
7deef80ef6 intel/compiler: make sure swizzle is applied to if condition
This fixes a hang in the following piglit test when GCM moves a
UBO load outside of the loop.

tests/shaders/ssa/fs-if-def-else-break.shader_test

The end NIR ends up looking like this:

	vec2 32 ssa_3 = intrinsic load_ubo (ssa_2, ssa_0) (0, 1073741824, 0, 0, 8)
	vec1 32 ssa_4 = mov ssa_3.x
	vec1 32 ssa_5 = inot ssa_3.y
	/* succs: block_1 */
	loop {
           ...
           if ssa_5 { }
        }

Fixes: 1edf67fc3f ("intel/fs: Generate if instructions with inverted conditions")

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
(cherry picked from commit a654e39f15)
2021-08-03 11:07:14 -07:00
Dylan Baker
1a1cf756d2 .pick_status.json: Update to 97be8e42e4 2021-08-03 11:07:12 -07:00
Dave Airlie
58bf0165ca crocus: add support for set alpha to one with blt.
This is ported from 965 and fixes
dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.*rebind_tex2d_rgb*

Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12164>
(cherry picked from commit 842b8c8965)
2021-08-02 13:30:09 -07:00
Dave Airlie
9fc8ae0cd5 intel/genxml: fix raster operation field in blt genxml
This field should be a uint, further changes on top of previous
ones in this area

Fixes: 4d80ec8fcf ("intel/genxml: fix raster op fields on gen4/5")
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12164>
(cherry picked from commit de625dddee)
2021-08-02 13:30:08 -07:00
Dave Airlie
5b99334ba3 crocus/gen45: fix mapping compressed textures
I don't think iris ever hits this path, but probably has the same bug if
it did.

Fixes texsubimage on gfx4 + gfx4.5

Fixes: 5bf6ec31cc ("crocus/gen4: restrict memcpy mapping to gen5")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12164>
(cherry picked from commit c12444ab88)
2021-08-02 13:30:08 -07:00
Yiwei Zhang
76a317170e venus: cache ahb backed buffer memory type bits requirement
To properly init buffer memory requirement for AHB, memory type bits
from dma_buf fd properties need to be masked. However, creating a test
AHB at buffer creation is too costly. This patch caches the ahb backed
buffer memory type bits at device creation time if the app is requesting
AHB extension.

Cc: 21.2 mesa-stable

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12171>
(cherry picked from commit e08960482a)
2021-08-02 13:30:07 -07:00
Lionel Landwerlin
97955560fd drm-shim: implement stat/fstat when xstat variants are not there
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 027095065d ("drm-shim: fix compile with glibc >= 2.33")
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12129>
(cherry picked from commit f86faee9f4)
2021-08-02 13:30:06 -07:00
Michel Zou
149473db82 meson: dont use missing dumpbin path
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: 21.2 mesa-stable
Closes #5142

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12139>
(cherry picked from commit 80160a67ab)
2021-08-02 13:30:04 -07:00
Icecream95
720645a5b3 pan/mdg: Analyze helper termination after scheduling
Similar to the fix in 6bf8e960fa ("pan/bi: Do helper termination
analysis on clauses")

Though apparently a "theoretical issue only", fixes artefacts in
DarkPlaces with both D3D9 and GL renderers.

Fixes: 9a7f0e268b ("pan/mdg: Use the helper invo analyze passes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12156>
(cherry picked from commit a2b37e9592)
2021-08-02 13:30:02 -07:00
Hoe Hao Cheng
7c2c2b9d2a zink: make codegen compatible with python 3.5
Fixes: f1432fd3 ("zink: generate extension infrastructure using a python script")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12137>
(cherry picked from commit 86250c7251)
2021-08-02 13:30:01 -07:00
Dylan Baker
075ec9e608 .pick_status.json: Update to 842b8c8965 2021-08-02 13:29:59 -07:00
Alyssa Rosenzweig
538e9f93c5 pan/bi: Remove incorrect errata workaround
This worked around a symptom of the underlying issue worked around in
the previous commit. This workaround is wrong in the sense of failing to
correct some broken code sequences and needlessly rejecting some working
code sequences.

total tuples in shared programs: 123770 -> 123630 (-0.11%)
tuples in affected programs: 9548 -> 9408 (-1.47%)
helped: 133
HURT: 0
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.05 x̃: 1
helped stats (rel) min: 0.42% max: 16.67% x̄: 4.07% x̃: 1.15%
95% mean confidence interval for tuples value: -1.09 -1.01
95% mean confidence interval for tuples %-change: -4.98% -3.17%
Tuples are helped.

total cycles in shared programs: 12114.83 -> 12114.50 (<.01%)
cycles in affected programs: 34.08 -> 33.75 (-0.98%)
helped: 9
HURT: 1
helped stats (abs) min: 0.04166599999999998 max: 0.04166700000000034 x̄: 0.04 x̃: 0
helped stats (rel) min: 0.72% max: 12.50% x̄: 2.99% x̃: 2.04%
HURT stats (abs)   min: 0.04166700000000034 max: 0.04166700000000034 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.62% max: 0.62% x̄: 0.62% x̃: 0.62%
95% mean confidence interval for cycles value: -0.05 -0.01
95% mean confidence interval for cycles %-change: -5.27% <.01%
Inconclusive result (%-change mean confidence interval includes 0).

total arith in shared programs: 4603.42 -> 4601.54 (-0.04%)
arith in affected programs: 50.50 -> 48.62 (-3.71%)
helped: 41
HURT: 1
helped stats (abs) min: 0.04166599999999998 max: 0.08333299999999999 x̄: 0.05 x̃: 0
helped stats (rel) min: 0.72% max: 33.33% x̄: 17.23% x̃: 13.33%
HURT stats (abs)   min: 0.04166700000000034 max: 0.04166700000000034 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.62% max: 0.62% x̄: 0.62% x̃: 0.62%
95% mean confidence interval for arith value: -0.05 -0.04
95% mean confidence interval for arith %-change: -20.93% -12.69%
Arith are helped.

total quadwords in shared programs: 110116 -> 110009 (-0.10%)
quadwords in affected programs: 7829 -> 7722 (-1.37%)
helped: 106
HURT: 0
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.01 x̃: 1
helped stats (rel) min: 0.49% max: 7.14% x̄: 1.91% x̃: 1.35%
95% mean confidence interval for quadwords value: -1.03 -0.99
95% mean confidence interval for quadwords %-change: -2.23% -1.59%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12114>
(cherry picked from commit be7d964ff0)
2021-07-30 10:13:30 -07:00
Alyssa Rosenzweig
2ca05ac293 pan/bi: Restrict swizzles on same cycle temporaries
Hand typed. We could generate this from the XML to avoid the repetition
but I think the cure is worse than the disease.

This fixes instruction encoding faults seen in conformance tests.

Only a single shader-db affected, and it was likely already broken...

quadwords HURT:   shaders/glmark/22-1.shader_test MESA_SHADER_FRAGMENT: 133 -> 135 (1.50%)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12114>
(cherry picked from commit 2cdf95703a)

Conflicts:
	src/panfrost/bifrost/bi_schedule.c
2021-07-30 10:13:29 -07:00
Pierre-Eric Pelloux-Prayer
5b17ed9781 amd/registers: fix fields conflict detection
The existing code handled the case where the new definition of the
same field was larger than the old one.
This commit adds a check to handle the reverse case: the new def
is smaller than the old one (= so writing using the merged macro
would affect the next fields).

The affected fields are:
* LGKM_CNT (in SQ_WAVE_IB_STS)
* DONUT_SPLIT (in VGT_TESS_DISTRIBUTION)
* HEAD_QUEUE (in GDS_GWS_RESOURCE)

DONUT_SPLIT is the only one used by radeonsi/radv.

Fixes: e6184b0892 ("amd/registers: scripts for processing register descriptions in JSON")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12063>
(cherry picked from commit 3914bd457b)
2021-07-30 09:59:20 -07:00
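
The two-way check described in the amd/registers commit above can be sketched in isolation. This is a simplified model, not the script's code: `Field`, `starts_inside` and the field layouts are hypothetical stand-ins for the register database objects.

```python
from collections import namedtuple

# A bit field occupying bits lo..hi (inclusive) of a register.
Field = namedtuple("Field", "name lo hi")


def starts_inside(tested, other_fields):
    """True if some field of the other definition starts inside `tested`,
    past its first bit, so writing `tested` would clobber it."""
    return any(tested.lo < f.lo <= tested.hi for f in other_fields)


def detect_conflict(field1, type1_fields, field2, type2_fields):
    """Check both directions: field2 against type1 and field1 against type2."""
    return (starts_inside(field2, type1_fields) or
            starts_inside(field1, type2_fields))


# Old definition: a 5-bit field. New definition: a 4-bit field
# followed directly by another field.
old_split = Field("SPLIT", 24, 28)
new_split = Field("SPLIT", 24, 27)
new_next = Field("NEXT", 28, 31)

conflict = detect_conflict(old_split, [old_split],
                           new_split, [new_split, new_next])
no_conflict = detect_conflict(old_split, [old_split],
                              new_split, [new_split])
```

Checking only one direction would miss the first case, because the wider field belongs to the older definition.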
Pierre-Eric Pelloux-Prayer
ce0e8e022d gallium/va: don't use key=NULL in hash tables
Add 1 to the key index otherwise we hit the following assert
in hash_table_insert:

   assert(!key_pointer_is_reserved(ht, key));

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12105>
(cherry picked from commit 2ea88d7cea)
2021-07-30 09:59:19 -07:00
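
A toy illustration of the gallium/va fix above: when integer indices are used as keys in a table that reserves the NULL (zero) key, offsetting every index by one keeps index 0 usable. `PointerKeyTable` and `key_for_index` are hypothetical names and only mimic the reserved-key assertion quoted in the commit message.

```python
class PointerKeyTable:
    """Toy table that, like a pointer-keyed hash table, reserves key 0."""

    def __init__(self):
        self._entries = {}

    def insert(self, key, value):
        # Mirrors the spirit of assert(!key_pointer_is_reserved(ht, key)).
        assert key != 0, "key 0 (NULL) is reserved"
        self._entries[key] = value

    def search(self, key):
        return self._entries.get(key)


def key_for_index(index: int) -> int:
    """Map a zero-based index to a non-zero key."""
    return index + 1


table = PointerKeyTable()
table.insert(key_for_index(0), "first surface")
found = table.search(key_for_index(0))
```

The same offset has to be applied on every lookup and removal, otherwise entries are stored under one key and searched under another.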
Dave Airlie
14f09b60e5 intel/fs: restrict max push length on older GPUs to a smaller amount
Fixes crash in dEQP-GLES2.functional.uniform_api.random.79

Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12093>
(cherry picked from commit c8783001c7)
2021-07-30 09:59:19 -07:00
Joshua Watt
ca5fbe8517 v3d, vc4: Fix dmabuf import for non-scanout buffers
Failure to create a buffer for scanout should not be fatal when
importing a buffer. Buffers allocated from a render-only device may not
be able to be scanned out directly but can still be used for other
rendering purposes (e.g. as a texture).

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12081>
(cherry picked from commit 7bcb223639)
2021-07-30 09:59:18 -07:00
Simon Ser
39ffd918cd lima: fail in get_handle(TYPE_KMS) without a scanout resource
The previous logic was returning a handle valid for the render-only
device if rsc->scanout was NULL. However the caller doesn't expect
this: the caller will use the handle with the KMS device.

Instead of returning a handle for the wrong device, fail if we don't
have one.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 47f000c170)
2021-07-30 09:59:13 -07:00
Simon Ser
4bb8e29a28 panfrost: fail in get_handle(TYPE_KMS) without a scanout resource
The previous logic was returning a handle valid for the render-only
device if rsc->scanout was NULL. However the caller doesn't expect
this: the caller will use the handle with the KMS device.

Instead of returning a handle for the wrong device, fail if we don't
have one.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 4c092947df)
2021-07-30 09:59:13 -07:00
Simon Ser
4524e8bff8 freedreno: fail in get_handle(TYPE_KMS) without a scanout resource
The previous logic was returning a handle valid for the render-only
device if rsc->scanout was NULL. However the caller doesn't expect
this: the caller will use the handle with the KMS device.

Instead of returning a handle for the wrong device, fail if we don't
have one.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 465eb7864b)
2021-07-30 09:59:12 -07:00
Simon Ser
bda17c7388 etnaviv: fail in get_handle(TYPE_KMS) without a scanout resource
The previous logic was returning a handle valid for the render-only
device if rsc->scanout was NULL. However the caller doesn't expect
this: the caller will use the handle with the KMS device.

Instead of returning a handle for the wrong device, fail if we don't
have one.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 9da901d2b2)
2021-07-30 09:59:12 -07:00
Simon Ser
1396ddcc4e etnaviv: fix renderonly check in etna_resource_alloc
When the driver hasn't been initialized via renderonly, screen->ro
will be NULL. This fixes a crash when passing USE_SCANOUT to etnaviv
when it's missing renderonly.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12074>
(cherry picked from commit 3b3cd51286)
2021-07-30 09:59:11 -07:00
Thomas H.P. Andersen
8b4a8972a8 nine: Fix assert in tx_src_param
A previous commit cleaned up the asserts but the last part of
this assert looks like it got mixed up. It should have allowed
param->rel for D3DSPR_INPUT if version is 3.0. Instead it does
&& on the enum value D3DSPR_ADDR which is of course always true,
with the version check. The result is that we miss input
validation with version 3.0.

Spotted by a compile warning

Fixes: 5974401a4a ("st/nine: Regroup param->rel tests")
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11880>
(cherry picked from commit 71a5bcb865)
2021-07-30 09:59:10 -07:00
Dylan Baker
4f7b4ba7f8 .pick_status.json: Update to 87b0962fef 2021-07-30 09:59:08 -07:00
Dylan Baker
940cb9ebe9 .pick_status.json: Mark 8cb795b477 as denominated 2021-07-29 10:04:30 -07:00
Lionel Landwerlin
f3a523a9be anv: fix submission batching with perf queries
If we have 2 command buffers back to back, one with a query pool, one
without, we don't want to retain the second query pool value (NULL).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0a7224f3ff ("anv: group as many command buffers into a single execbuf")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12107>
(cherry picked from commit b8e29e8936)
2021-07-29 09:05:43 -07:00
Erik Faye-Lund
f9107dbf71 lavapipe: do not mark unsupported tests as crashing
These were fixed previously, but due to the CI not really running all
tests any more, I didn't notice these fixes. Let's bring the expected
results up to date.

Fixes: 2e29857bb6 ("llvmpipe: only report supported shader-image formats")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12077>
(cherry picked from commit 0cfd1da8b3)
2021-07-29 09:05:43 -07:00
Juan A. Suarez Romero
ebcd657099 broadcom: remove v3dv3 from neon library
No need to build the simulator with NEON; and also v3dv3 simulator
is not for VC4, so don't inherit v3dv3 requirement when building vc4
driver.

Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5126
Fixes: d198e26a1e ("broadcom/common: move v3d_tiling to common")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12078>
(cherry picked from commit fe9d2d2046)
2021-07-29 09:05:42 -07:00
Lionel Landwerlin
72eeeba333 nir/lower_shader_calls: adding missing stack offset alignment
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8dfb240b1f ("nir: Add raytracing shader call lowering pass.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12112>
(cherry picked from commit 7e3bad0f8e)
2021-07-29 09:05:42 -07:00
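
The alignment step added by the nir/lower_shader_calls commit above rounds the stack offset up before it is used. A minimal sketch of that rounding, assuming a power-of-two alignment; the function name `align_up` is hypothetical and stands in for the `ALIGN()` macro seen in the diff.

```python
def align_up(value: int, alignment: int) -> int:
    """Round `value` up to the next multiple of a power-of-two `alignment`."""
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (value + alignment - 1) & ~(alignment - 1)


unaligned = align_up(20, 16)
already_aligned = align_up(32, 16)
```

Skipping this step would let a callee's stack start at an offset that does not meet the required alignment whenever the spilled values end on an odd boundary.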
Dylan Baker
947fd891bf freedreno/ir3: Add build id to the disassembler test
This is required (at least for me on x86) to get the tool to pass its
own test, otherwise it fails the build_id assertion.

Fixes: 1462b00391
       ("freedreno/ir3: Add a unit test for our disassembler.")

Acked-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12084>
(cherry picked from commit 097cf3952b)
2021-07-29 09:05:41 -07:00
Jesse Natalie
cfe3e2ff53 mesa/main: Check for fbo attachments when importing EGL images to textures
Fixes an assert when binding an fbo with a texture bound to one of its attachments,
if the texture was updated with an EGL image after it was bound.

Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11998>
(cherry picked from commit 3d64a97cf6)
2021-07-29 09:05:41 -07:00
Dylan Baker
ad0ba78934 .pick_status.json: Update to b8e29e8936 2021-07-29 09:05:11 -07:00
Connor Abbott
bfc6597375 ir3: Preserve gl_ViewportIndex in the binning shader
Fixes dEQP-VK.draw.shader_viewport_index.* with TU_DEBUG=forcebin.

Fixes: efff734220 ("turnip: multiViewport and VK_EXT_shader_viewport_index_layer")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12104>
(cherry picked from commit 7a14484bca)
2021-07-28 11:29:29 -07:00
Lionel Landwerlin
d06433f883 loader/dri3: create linear buffer with scanout support
If we have a different GPU dealing with display, we fallback to
exchanging linear buffers with the compositor. We should specify in
creating the linear buffer that this could be used for display.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4706
Cc: mesa-stable
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11942>
(cherry picked from commit f1a66e7c90)
2021-07-28 11:29:29 -07:00
Zhu Yuliang
5577fb807e gallium/vl: don't leak fd in vl_dri3_screen_create
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12092>
(cherry picked from commit b88fd3ccc0)
2021-07-28 11:29:28 -07:00
Erik Faye-Lund
6705d498f4 lavapipe: do not assert on more than 32 samplers
We can have more than 32 samplers, but the code below will assert in that
case. The return value is not used for samplers, so let's just return
zero early and be done with it.

Fixes: c18ff60087 ("lavapipe: emit correct textures_used for texture-arrays")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11845>
(cherry picked from commit bff8a948f7)
2021-07-28 11:29:28 -07:00
Chia-I Wu
d8402f2ff0 vulkan/wsi/x11: do not inherit last_present_mode
Under XWayland, the first present after a window resize is sometimes
completed with COPY (seems to happen when the previous present with the
old size is pending; not really sure).  The following presents are
completed with FLIP.

When a swapchain is created with an old swapchain, and
old_chain->last_present_mode is FLIP, chain->last_present_mode is set to
FLIP as well.  This causes the new swapchain to be marked
VK_SUBOPTIMAL_KHR, which is sticky, if the first present is completed
with COPY.

Instead of inheriting, treat each swapchain as independent.  We will
miss the case where an old swapchain is flipping but a new swapchain is
copying.  But swapchain reallocation normally happens in response to
present engine state change.  If the newly allocated swapchain is
copying, another reallocation is unlikely to fix that.

Fixes: 61309c2a72 ("vulkan/wsi/x11: Return VK_SUBOPTIMAL_KHR for X11")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12030>
(cherry picked from commit 206fe780d5)
2021-07-28 11:29:27 -07:00
Philipp Zabel
0e7985a7de etnaviv: fix gbm_bo_get_handle_for_plane for multiplanar images
Implement resource_get_param for PIPE_RESOURCE_PARAM_NPLANES and fix
resource_get_handle to walk to the correct linked resource for
multiplanar images, allowing gbm_bo_get_handle_for_plane to be called
with plane > 0.

This fixes an assert that is triggered when a wayland client tries
to send weston an NV12 dmabuf, for example:

  weston: .../mesa/src/gbm/backends/dri/gbm_dri.c:752: gbm_dri_bo_get_handle_for_plane: Assertion `plane == 0' failed.

Fixes: 788f6dc857 ('Revert "gallium/dri: fix dri2_from_planar for multiplanar images"')
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12037>
(cherry picked from commit 8ba44103b3)
2021-07-28 11:29:27 -07:00
Lepton Wu
1b886f20a7 gallium: Reset {d,r}Priv in dri_unbind_context
The code in dri_make_current just checks the value of the pointers
to decide whether to update texture_stamp. This is buggy since a newly
allocated drawable could share the same address as the previously
released drawable. Fix the stale pointer issue by always resetting
these pointers to NULL in dri_unbind_context.

v2:
   Move the reset codes to the end of the function.

Signed-off-by: Lepton Wu <lepton@chromium.org>
Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12050>
(cherry picked from commit 7ff30a0499)
2021-07-28 11:29:25 -07:00
Dylan Baker
b795fc4a28 .pick_status.json: Update to dff0d9911d 2021-07-28 11:29:23 -07:00
56 changed files with 8140 additions and 225 deletions

File diff suppressed because it is too large

@@ -1 +1 @@
-21.2.0-rc3
+21.2.0

5272
docs/relnotes/21.2.0.rst Normal file

File diff suppressed because it is too large

@@ -1,25 +0,0 @@
-zink supports GL_ARB_texture_filter_minmax, GL_ARB_shader_clock
-VK_EXT_provoking_vertex on RADV.
-VK_EXT_extended_dynamic_state2 on RADV.
-VK_EXT_global_priority_query on RADV.
-VK_EXT_physical_device_drm on RADV.
-VK_KHR_shader_subgroup_uniform_control_flow on Intel and RADV.
-VK_EXT_color_write_enable on RADV.
-32-bit x86 builds now default disable x87 math and use sse2.
-GL ES 3.1 on GT21x hardware.
-VK_EXT_acquire_drm_display on RADV and ANV.
-VK_EXT_vertex_input_dynamic_state on lavapipe
-wideLines on lavapipe
-VK_EXT_line_rasterization on lavapipe
-VK_EXT_multi_draw on ANV, lavapipe, and RADV
-VK_KHR_separate_depth_stencil_layouts on lavapipe
-VK_EXT_separate_stencil_usage on lavapipe
-VK_EXT_extended_dynamic_state2 on lavapipe
-NGG shader based primitive culling is now supported by RADV.
-Panfrost supports OpenGL ES 3.1
-New Asahi driver for the Apple M1
-GL_ARB_sample_locations on zink
-GL_ARB_sparse_buffer on zink
-GL_ARB_shader_group_vote on zink
-DRM format modifiers on zink
-freedreno+turnip: Initial support for a6xx gen4 (a660, a635)

@@ -2107,7 +2107,9 @@ pkg = import('pkgconfig')
 if host_machine.system() == 'windows'
   prog_dumpbin = find_program('dumpbin', required : false)
   with_symbols_check = prog_dumpbin.found() and with_tests
-  symbols_check_args = ['--dumpbin', prog_dumpbin.path()]
+  if with_symbols_check
+    symbols_check_args = ['--dumpbin', prog_dumpbin.path()]
+  endif
 else
   prog_nm = find_program('nm')
   with_symbols_check = with_tests

@@ -112,6 +112,23 @@ def get_chips_comment(chips, parent=None):
     return ', '.join(comment)
+def detect_conflict(regdb, field_in_type1, field_in_type2):
+    """
+    Returns False if field_in_type1 and field_in_type2 can be merged
+    into a single field = if writing to field_in_type1 bits won't
+    overwrite adjacent fields in type2, and the other way around.
+    """
+    for idx, type_refs in enumerate([field_in_type1.type_refs, field_in_type2.type_refs]):
+        ref = field_in_type2 if idx == 0 else field_in_type1
+        for type_ref in type_refs:
+            for field in regdb.register_type(type_ref).fields:
+                # If a different field in the other type starts in
+                # the tested field's bits[0, 1] interval
+                if (field.bits[0] > ref.bits[0] and
+                    field.bits[0] <= ref.bits[1]):
+                    return True
+    return False
 class HeaderWriter(object):
     def __init__(self, regdb, guard=None):
@@ -200,21 +217,10 @@ class HeaderWriter(object):
                 if prev.bits[0] != line.bits[0]:
                     continue
-                if prev.bits[1] < line.bits[1]:
+                if prev.bits[1] != line.bits[1]:
                     # Current line's field extends beyond the range of prev.
                     # Need to check for conflicts
-                    conflict = False
-                    for type_ref in prev.type_refs:
-                        for field in regdb.register_type(type_ref).fields:
-                            # The only possible conflict is for a prev field
-                            # that starts at a higher bit.
-                            if (field.bits[0] > line.bits[0] and
-                                field.bits[0] <= line.bits[1]):
-                                conflict = True
-                                break
-                        if conflict:
-                            break
-                    if conflict:
+                    if detect_conflict(regdb, prev, line):
                         continue
                 prev.bits[1] = max(prev.bits[1], line.bits[1])

@@ -1389,10 +1389,10 @@ radv_clear_htile(struct radv_cmd_buffer *cmd_buffer, const struct radv_image *im
       if (htile_mask == UINT_MAX) {
          /* Clear the whole HTILE buffer. */
-         flush_bits = radv_fill_buffer(cmd_buffer, image, image->bo, offset, size, value);
+         flush_bits |= radv_fill_buffer(cmd_buffer, image, image->bo, offset, size, value);
       } else {
          /* Only clear depth or stencil bytes in the HTILE buffer. */
-         flush_bits =
+         flush_bits |=
             clear_htile_mask(cmd_buffer, image, image->bo, offset, size, value, htile_mask);
       }
    }

@@ -625,8 +625,6 @@ radv_get_thread_trace(struct radv_queue *queue, struct ac_thread_trace *thread_t
                                        ? (first_active_cu / 2)
                                        : first_active_cu;
-      thread_trace_se.compute_unit = 0;
       thread_trace->traces[thread_trace->num_traces] = thread_trace_se;
       thread_trace->num_traces++;
    }

@@ -503,13 +503,13 @@ si_emit_graphics(struct radv_device *device, struct radeon_cmdbuf *cs)
    if (physical_device->rad_info.chip_class >= GFX9) {
       radeon_set_context_reg(cs, R_028B50_VGT_TESS_DISTRIBUTION,
                              S_028B50_ACCUM_ISOLINE(40) | S_028B50_ACCUM_TRI(30) |
-                                S_028B50_ACCUM_QUAD(24) | S_028B50_DONUT_SPLIT(24) |
+                                S_028B50_ACCUM_QUAD(24) | S_028B50_DONUT_SPLIT_GFX9(24) |
                                 S_028B50_TRAP_SPLIT(6));
    } else if (physical_device->rad_info.chip_class >= GFX8) {
       uint32_t vgt_tess_distribution;
       vgt_tess_distribution = S_028B50_ACCUM_ISOLINE(32) | S_028B50_ACCUM_TRI(11) |
-                              S_028B50_ACCUM_QUAD(11) | S_028B50_DONUT_SPLIT(16);
+                              S_028B50_ACCUM_QUAD(11) | S_028B50_DONUT_SPLIT_GFX81(16);
       if (physical_device->rad_info.family == CHIP_FIJI ||
           physical_device->rad_info.family >= CHIP_POLARIS10)

@@ -208,7 +208,7 @@ spec@ext_framebuffer_object@fbo-blending-formats,Fail
 spec@ext_framebuffer_object@fbo-blending-formats@GL_RGB10,Fail
 spec@ext_framebuffer_object@getteximage-formats init-by-clear-and-render,Fail
 spec@ext_framebuffer_object@getteximage-formats init-by-rendering,Fail
-spec@ext_image_dma_buf_import@ext_image_dma_buf_import-export-tex,Crash
+spec@ext_image_dma_buf_import@ext_image_dma_buf_import-export-tex,Fail
 spec@ext_packed_depth_stencil@texwrap formats bordercolor,Fail
 spec@ext_packed_depth_stencil@texwrap formats bordercolor@GL_DEPTH24_STENCIL8- border color only,Fail
 spec@ext_packed_depth_stencil@texwrap formats bordercolor-swizzled,Fail

@@ -65,7 +65,7 @@ libv3d_neon = static_library(
   ],
   c_args : [v3d_args, v3d_neon_c_args],
   gnu_symbol_visibility : 'hidden',
-  dependencies : [dep_v3dv3, dep_libdrm, dep_valgrind, idep_nir_headers],
+  dependencies : [dep_libdrm, dep_valgrind, idep_nir_headers],
 )
 libbroadcom_v3d = static_library(


@@ -459,6 +459,7 @@ spill_ssa_defs_and_lower_shader_calls(nir_shader *shader, uint32_t num_calls,
nir_builder *b = &before;
offset = ALIGN(offset, stack_alignment);
max_scratch_size = MAX2(max_scratch_size, offset);
/* First thing on the called shader's stack is the resume address


@@ -77,11 +77,18 @@ REAL_FUNCTION_POINTER(readdir64);
REAL_FUNCTION_POINTER(readlink);
REAL_FUNCTION_POINTER(realpath);
#if __GLIBC__ == 2 && __GLIBC_MINOR__ < 33
#define HAS_XSTAT __GLIBC__ == 2 && __GLIBC_MINOR__ < 33
#if HAS_XSTAT
REAL_FUNCTION_POINTER(__xstat);
REAL_FUNCTION_POINTER(__xstat64);
REAL_FUNCTION_POINTER(__fxstat);
REAL_FUNCTION_POINTER(__fxstat64);
#else
REAL_FUNCTION_POINTER(stat);
REAL_FUNCTION_POINTER(stat64);
REAL_FUNCTION_POINTER(fstat);
REAL_FUNCTION_POINTER(fstat64);
#endif
/* Full path of /dev/dri/renderD* */
@@ -209,11 +216,16 @@ init_shim(void)
GET_FUNCTION_POINTER(readlink);
GET_FUNCTION_POINTER(realpath);
#if __GLIBC__ == 2 && __GLIBC_MINOR__ < 33
#if HAS_XSTAT
GET_FUNCTION_POINTER(__xstat);
GET_FUNCTION_POINTER(__xstat64);
GET_FUNCTION_POINTER(__fxstat);
GET_FUNCTION_POINTER(__fxstat64);
#else
GET_FUNCTION_POINTER(stat);
GET_FUNCTION_POINTER(stat64);
GET_FUNCTION_POINTER(fstat);
GET_FUNCTION_POINTER(fstat64);
#endif
get_dri_render_node_minor();
@@ -278,7 +290,7 @@ PUBLIC int open(const char *path, int flags, ...)
}
PUBLIC int open64(const char*, int, ...) __attribute__((alias("open")));
#if __GLIBC__ == 2 && __GLIBC_MINOR__ < 33
#if HAS_XSTAT
/* Fakes stat to return character device stuff for our fake render node. */
PUBLIC int __xstat(int ver, const char *path, struct stat *st)
{
@@ -379,6 +391,106 @@ PUBLIC int __fxstat64(int ver, int fd, struct stat64 *st)
return 0;
}
#else
PUBLIC int stat(const char* path, struct stat* stat_buf)
{
init_shim();
/* Note: call real stat if we're in the process of probing for a free
* render node!
*/
if (render_node_minor == -1)
return real_stat(path, stat_buf);
/* Fool libdrm's probe of whether the /sys dir for this char dev is
* there.
*/
char *sys_dev_drm_dir;
nfasprintf(&sys_dev_drm_dir,
"/sys/dev/char/%d:%d/device/drm",
DRM_MAJOR, render_node_minor);
if (strcmp(path, sys_dev_drm_dir) == 0) {
free(sys_dev_drm_dir);
return 0;
}
free(sys_dev_drm_dir);
if (strcmp(path, render_node_path) != 0)
return real_stat(path, stat_buf);
memset(stat_buf, 0, sizeof(*stat_buf));
stat_buf->st_rdev = makedev(DRM_MAJOR, render_node_minor);
stat_buf->st_mode = S_IFCHR;
return 0;
}
PUBLIC int stat64(const char* path, struct stat64* stat_buf)
{
init_shim();
/* Note: call real stat if we're in the process of probing for a free
* render node!
*/
if (render_node_minor == -1)
return real_stat64(path, stat_buf);
/* Fool libdrm's probe of whether the /sys dir for this char dev is
* there.
*/
char *sys_dev_drm_dir;
nfasprintf(&sys_dev_drm_dir,
"/sys/dev/char/%d:%d/device/drm",
DRM_MAJOR, render_node_minor);
if (strcmp(path, sys_dev_drm_dir) == 0) {
free(sys_dev_drm_dir);
return 0;
}
free(sys_dev_drm_dir);
if (strcmp(path, render_node_path) != 0)
return real_stat64(path, stat_buf);
memset(stat_buf, 0, sizeof(*stat_buf));
stat_buf->st_rdev = makedev(DRM_MAJOR, render_node_minor);
stat_buf->st_mode = S_IFCHR;
return 0;
}
PUBLIC int fstat(int fd, struct stat* stat_buf)
{
init_shim();
struct shim_fd *shim_fd = drm_shim_fd_lookup(fd);
if (!shim_fd)
return real_fstat(fd, stat_buf);
memset(stat_buf, 0, sizeof(*stat_buf));
stat_buf->st_rdev = makedev(DRM_MAJOR, render_node_minor);
stat_buf->st_mode = S_IFCHR;
return 0;
}
PUBLIC int fstat64(int fd, struct stat64* stat_buf)
{
init_shim();
struct shim_fd *shim_fd = drm_shim_fd_lookup(fd);
if (!shim_fd)
return real_fstat64(fd, stat_buf);
memset(stat_buf, 0, sizeof(*stat_buf));
stat_buf->st_rdev = makedev(DRM_MAJOR, render_node_minor);
stat_buf->st_mode = S_IFCHR;
return 0;
}
#endif
/* Tracks if the opendir was on /dev/dri. */
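
The non-xstat wrappers above all end the same way: zero the buffer, then report a DRM character device. A minimal sketch of just that step, assuming Linux with glibc's `<sys/sysmacros.h>`. `fill_fake_render_node` and `EXAMPLE_DRM_MAJOR` are hypothetical names used only for illustration, not part of the shim.

```c
#define _DEFAULT_SOURCE 1    /* expose S_IFCHR even in strict ISO C modes */
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* makedev(), major(), minor() on Linux */

/* Hypothetical stand-in for DRM_MAJOR; 226 is the Linux DRM char major. */
#define EXAMPLE_DRM_MAJOR 226

/* Fill a stat buffer so that it looks like a DRM character device node.
 * This mirrors the memset/st_rdev/st_mode sequence in the hunk above.
 */
static int
fill_fake_render_node(struct stat *st, int render_minor)
{
   assert(st != NULL);
   memset(st, 0, sizeof(*st));
   st->st_rdev = makedev(EXAMPLE_DRM_MAJOR, render_minor);
   st->st_mode = S_IFCHR;
   return 0;
}
```

A caller can then check the result with `S_ISCHR(st.st_mode)` and `major(st.st_rdev)`, which is roughly what a device probe would look at.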


@@ -21,30 +21,7 @@ dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail,Fail
dEQP-VK.compute.basic.max_local_size_x,Crash
dEQP-VK.compute.basic.max_local_size_y,Crash
# shader_viewport and atomic_operations fails to reproduce on anholt's cheza,
# even with a failing caselist from CI.
dEQP-VK.draw.shader_viewport_index.fragment_shader_10,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_12,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_13,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_14,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_16,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_2,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_4,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_5,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_6,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_8,Fail
dEQP-VK.draw.shader_viewport_index.fragment_shader_9,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_10,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_11,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_13,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_14,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_15,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_2,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_3,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_5,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_6,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_7,Fail
dEQP-VK.draw.shader_viewport_index.vertex_shader_9,Fail
# only fails with TU_DEBUG=forcebin
dEQP-VK.glsl.atomic_operations.add_unsigned_geometry,Fail
dEQP-VK.glsl.atomic_operations.and_signed_geometry,Fail
dEQP-VK.glsl.atomic_operations.and_unsigned_geometry,Fail


@@ -3828,7 +3828,8 @@ static bool
output_slot_used_for_binning(gl_varying_slot slot)
{
return slot == VARYING_SLOT_POS || slot == VARYING_SLOT_PSIZ ||
slot == VARYING_SLOT_CLIP_DIST0 || slot == VARYING_SLOT_CLIP_DIST1;
slot == VARYING_SLOT_CLIP_DIST0 || slot == VARYING_SLOT_CLIP_DIST1 ||
slot == VARYING_SLOT_VIEWPORT;
}
static struct ir3_instruction *


@@ -128,6 +128,7 @@ test('ir3_disasm',
executable(
'ir3_disasm',
'tests/disasm.c',
link_args : [ld_args_build_id],
link_with: [libfreedreno_ir3, libir3decode],
dependencies: [idep_mesautil, idep_nir],
include_directories: [inc_freedreno, inc_include, inc_src, inc_gallium],


@@ -424,7 +424,7 @@ hud_driver_query_install(struct hud_batch_query_context **pbq,
struct hud_pane *pane, struct pipe_screen *screen,
const char *name)
{
struct pipe_driver_query_info query;
struct pipe_driver_query_info query = { 0 };
unsigned num_queries, i;
boolean found = FALSE;


@@ -852,6 +852,9 @@ vl_dri3_screen_create(Display *display, int screen)
scrn->base.set_back_texture_from_output = vl_dri3_screen_set_back_texture_from_output;
scrn->next_back = 1;
close(fd);
return &scrn->base;
no_context:


@@ -31,6 +31,69 @@
#if GFX_VER <= 5
static uint32_t
color_depth_for_cpp(int cpp)
{
switch (cpp) {
case 4: return COLOR_DEPTH__32bit;
case 2: return COLOR_DEPTH__565;
case 1: return COLOR_DEPTH__8bit;
default:
unreachable("not reached");
}
}
static void
blt_set_alpha_to_one(struct crocus_batch *batch,
struct crocus_resource *dst,
int x, int y, int width, int height)
{
const struct isl_format_layout *fmtl = isl_format_get_layout(dst->surf.format);
unsigned cpp = fmtl->bpb / 8;
uint32_t pitch = dst->surf.row_pitch_B;
if (dst->surf.tiling != ISL_TILING_LINEAR)
pitch /= 4;
/* We need to split the blit into chunks that each fit within the blitter's
* restrictions. We can't use a chunk size of 32768 because we need to
* ensure that src_tile_x + chunk_size fits. We choose 16384 because it's
* a nice round power of two, big enough that performance won't suffer, and
* small enough to guarantee everything fits.
*/
const uint32_t max_chunk_size = 16384;
for (uint32_t chunk_x = 0; chunk_x < width; chunk_x += max_chunk_size) {
for (uint32_t chunk_y = 0; chunk_y < height; chunk_y += max_chunk_size) {
const uint32_t chunk_w = MIN2(max_chunk_size, width - chunk_x);
const uint32_t chunk_h = MIN2(max_chunk_size, height - chunk_y);
uint32_t tile_x, tile_y, offset;
ASSERTED uint32_t z_offset_el, array_offset;
isl_tiling_get_intratile_offset_el(dst->surf.tiling,
cpp * 8, dst->surf.row_pitch_B,
dst->surf.array_pitch_el_rows,
chunk_x, chunk_y, 0, 0,
&offset,
&tile_x, &tile_y,
&z_offset_el, &array_offset);
assert(z_offset_el == 0);
assert(array_offset == 0);
crocus_emit_cmd(batch, GENX(XY_COLOR_BLT), xyblt) {
xyblt.TilingEnable = dst->surf.tiling != ISL_TILING_LINEAR;
xyblt.ColorDepth = color_depth_for_cpp(cpp);
xyblt.RasterOperation = 0xF0;
xyblt.DestinationPitch = pitch;
xyblt._32bppByteMask = 2;
xyblt.DestinationBaseAddress = rw_bo(dst->bo, offset);
xyblt.DestinationX1Coordinate = tile_x;
xyblt.DestinationY1Coordinate = tile_y;
xyblt.DestinationX2Coordinate = tile_x + chunk_w;
xyblt.DestinationY2Coordinate = tile_y + chunk_h;
xyblt.SolidPatternColor = 0xffffffff;
}
}
}
}
static bool validate_blit_for_blt(struct crocus_batch *batch,
const struct pipe_blit_info *info)
{
@@ -51,6 +114,17 @@ static bool validate_blit_for_blt(struct crocus_batch *batch,
if (info->dst.box.depth > 1 || info->src.box.depth > 1)
return false;
const struct util_format_description *desc =
util_format_description(info->src.format);
int i = util_format_get_first_non_void_channel(info->src.format);
if (i == -1)
return false;
/* can't do the alpha to 1 setting for these. */
if ((util_format_has_alpha1(info->src.format) &&
util_format_has_alpha(info->dst.format) &&
desc->channel[i].size > 8))
return false;
return true;
}
@@ -62,17 +136,6 @@ static inline int crocus_resource_blt_pitch(struct crocus_resource *res)
return pitch;
}
static uint32_t
color_depth_for_cpp(int cpp)
{
switch (cpp) {
case 4: return COLOR_DEPTH__32bit;
case 2: return COLOR_DEPTH__565;
case 1: return COLOR_DEPTH__8bit;
default:
unreachable("not reached");
}
}
static bool emit_copy_blt(struct crocus_batch *batch,
struct crocus_resource *src,
@@ -283,6 +346,10 @@ static bool crocus_emit_blt(struct crocus_batch *batch,
}
}
}
if (util_format_has_alpha1(src->base.b.format) &&
util_format_has_alpha(dst->base.b.format))
blt_set_alpha_to_one(batch, dst, 0, 0, src_width, src_height);
return true;
}
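
The chunking loop in blt_set_alpha_to_one() can be looked at in isolation. The sketch below only reproduces the splitting arithmetic (16384-sized tiles, with the last tile on each axis clipped by MIN2). `split_into_chunks`, `struct chunk` and `EXAMPLE_MIN2` are hypothetical names, and no blitter commands are emitted.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MIN2(a, b) ((a) < (b) ? (a) : (b))

struct chunk {
   uint32_t x, y, w, h;
};

/* Split a width x height rectangle into tiles no larger than max_chunk on
 * either axis. Writes up to `cap` tiles into `out` and returns how many
 * tiles the rectangle needs in total.
 */
static unsigned
split_into_chunks(uint32_t width, uint32_t height, uint32_t max_chunk,
                  struct chunk *out, unsigned cap)
{
   unsigned n = 0;

   assert(max_chunk > 0);
   for (uint32_t x = 0; x < width; x += max_chunk) {
      for (uint32_t y = 0; y < height; y += max_chunk) {
         if (n < cap) {
            out[n].x = x;
            out[n].y = y;
            out[n].w = EXAMPLE_MIN2(max_chunk, width - x);
            out[n].h = EXAMPLE_MIN2(max_chunk, height - y);
         }
         n++;
      }
   }
   return n;
}
```

For a 40000 x 100 rectangle and a chunk size of 16384 this yields three tiles, the last one starting at x = 32768 with a width of 7232.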


@@ -1542,12 +1542,17 @@ crocus_map_direct(struct crocus_transfer *map)
const unsigned cpp = fmtl->bpb / 8;
unsigned x0_el, y0_el;
assert(box->x % fmtl->bw == 0);
assert(box->y % fmtl->bh == 0);
get_image_offset_el(surf, xfer->level, box->z, &x0_el, &y0_el);
x0_el += box->x / fmtl->bw;
y0_el += box->y / fmtl->bh;
xfer->stride = isl_surf_get_row_pitch_B(surf);
xfer->layer_stride = isl_surf_get_array_pitch(surf);
map->ptr = ptr + (y0_el + box->y) * xfer->stride + (x0_el + box->x) * cpp;
map->ptr = ptr + y0_el * xfer->stride + x0_el * cpp;
}
}
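
The pointer fix above converts the box origin from pixels to format elements (blocks) before it is scaled by the stride and by cpp. A small sketch of that arithmetic, assuming a block-compressed format with bw x bh pixel blocks; `box_origin_offset_B` is a hypothetical helper, not a crocus function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of a box origin inside a mapped surface.
 *
 * x0_el/y0_el: offset of the level/layer, already in elements.
 * box_x/box_y: box origin in pixels.
 * bw/bh:       block width/height of the format in pixels.
 * cpp:         bytes per element (per block for compressed formats).
 */
static size_t
box_origin_offset_B(uint32_t x0_el, uint32_t y0_el,
                    uint32_t box_x, uint32_t box_y,
                    uint32_t bw, uint32_t bh,
                    uint32_t cpp, uint32_t stride_B)
{
   assert(bw > 0 && bh > 0);
   assert(box_x % bw == 0 && box_y % bh == 0);

   x0_el += box_x / bw;
   y0_el += box_y / bh;
   return (size_t)y0_el * stride_B + (size_t)x0_el * cpp;
}
```

With 4x4 blocks of 8 bytes, a 64-byte row pitch and a box at pixel (8, 4), the offset is 1 * 64 + 2 * 8 = 80 bytes. Using the pixel coordinates directly, as the removed line did, would give 4 * 64 + 8 * 8 = 320 bytes. For uncompressed formats (bw = bh = 1) both expressions agree.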


@@ -243,12 +243,17 @@ check_descriptors_left(struct d3d12_context *ctx)
return true;
}
static void
set_graphics_root_parameters(struct d3d12_context *ctx,
const struct pipe_draw_info *dinfo,
const struct pipe_draw_start_count_bias *draw)
#define MAX_DESCRIPTOR_TABLES (D3D12_GFX_SHADER_STAGES * 3)
static unsigned
update_graphics_root_parameters(struct d3d12_context *ctx,
const struct pipe_draw_info *dinfo,
const struct pipe_draw_start_count_bias *draw,
D3D12_GPU_DESCRIPTOR_HANDLE root_desc_tables[MAX_DESCRIPTOR_TABLES],
int root_desc_indices[MAX_DESCRIPTOR_TABLES])
{
unsigned num_params = 0;
unsigned num_root_desciptors = 0;
for (unsigned i = 0; i < D3D12_GFX_SHADER_STAGES; ++i) {
if (!ctx->gfx_stages[i])
@@ -260,16 +265,25 @@ set_graphics_root_parameters(struct d3d12_context *ctx,
assert(shader);
if (shader->num_cb_bindings > 0) {
if (dirty & D3D12_SHADER_DIRTY_CONSTBUF)
ctx->cmdlist->SetGraphicsRootDescriptorTable(num_params, fill_cbv_descriptors(ctx, shader, i));
if (dirty & D3D12_SHADER_DIRTY_CONSTBUF) {
assert(num_root_desciptors < MAX_DESCRIPTOR_TABLES);
root_desc_tables[num_root_desciptors] = fill_cbv_descriptors(ctx, shader, i);
root_desc_indices[num_root_desciptors++] = num_params;
}
num_params++;
}
if (shader->end_srv_binding > 0) {
if (dirty & D3D12_SHADER_DIRTY_SAMPLER_VIEWS)
ctx->cmdlist->SetGraphicsRootDescriptorTable(num_params, fill_srv_descriptors(ctx, shader, i));
if (dirty & D3D12_SHADER_DIRTY_SAMPLER_VIEWS) {
assert(num_root_desciptors < MAX_DESCRIPTOR_TABLES);
root_desc_tables[num_root_desciptors] = fill_srv_descriptors(ctx, shader, i);
root_desc_indices[num_root_desciptors++] = num_params;
}
num_params++;
if (dirty & D3D12_SHADER_DIRTY_SAMPLERS)
ctx->cmdlist->SetGraphicsRootDescriptorTable(num_params, fill_sampler_descriptors(ctx, shader_sel, i));
if (dirty & D3D12_SHADER_DIRTY_SAMPLERS) {
assert(num_root_desciptors < MAX_DESCRIPTOR_TABLES);
root_desc_tables[num_root_desciptors] = fill_sampler_descriptors(ctx, shader_sel, i);
root_desc_indices[num_root_desciptors++] = num_params;
}
num_params++;
}
/* TODO Don't always update state vars */
@@ -280,6 +294,7 @@ set_graphics_root_parameters(struct d3d12_context *ctx,
num_params++;
}
}
return num_root_desciptors;
}
static bool
@@ -580,7 +595,9 @@ d3d12_draw_vbo(struct pipe_context *pctx,
ctx->cmdlist->SetPipelineState(ctx->current_pso);
}
set_graphics_root_parameters(ctx, dinfo, &draws[0]);
D3D12_GPU_DESCRIPTOR_HANDLE root_desc_tables[MAX_DESCRIPTOR_TABLES];
int root_desc_indices[MAX_DESCRIPTOR_TABLES];
unsigned num_root_desciptors = update_graphics_root_parameters(ctx, dinfo, &draws[0], root_desc_tables, root_desc_indices);
bool need_zero_one_depth_range = d3d12_need_zero_one_depth_range(ctx);
if (need_zero_one_depth_range != ctx->need_zero_one_depth_range) {
@@ -718,6 +735,9 @@ d3d12_draw_vbo(struct pipe_context *pctx,
d3d12_apply_resource_states(ctx);
for (unsigned i = 0; i < num_root_desciptors; ++i)
ctx->cmdlist->SetGraphicsRootDescriptorTable(root_desc_indices[i], root_desc_tables[i]);
if (dinfo->index_size > 0)
ctx->cmdlist->DrawIndexedInstanced(draws[0].count, dinfo->instance_count,
draws[0].start, draws[0].index_bias,
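
The d3d12 hunks above turn an immediate "bind while iterating" loop into two phases: record (root index, descriptor handle) pairs first, then bind them after d3d12_apply_resource_states(). A plain C sketch of that record-then-apply pattern follows. All names are hypothetical, the handles are modelled as bare 64-bit integers, and the capacity of 16 is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_TABLES 16   /* arbitrary capacity for this sketch */

struct pending_tables {
   uint64_t handle[EXAMPLE_MAX_TABLES];
   int root_index[EXAMPLE_MAX_TABLES];
   unsigned count;
};

/* First phase: remember which root parameter needs which table. */
static void
record_table(struct pending_tables *p, int root_index, uint64_t handle)
{
   assert(p->count < EXAMPLE_MAX_TABLES);
   p->handle[p->count] = handle;
   p->root_index[p->count++] = root_index;
}

/* Second phase: apply every recorded binding to `bound`, a hypothetical
 * stand-in for the command list's root parameter slots.
 */
static unsigned
apply_tables(const struct pending_tables *p, uint64_t *bound,
             unsigned num_slots)
{
   for (unsigned i = 0; i < p->count; ++i) {
      assert(p->root_index[i] >= 0 &&
             (unsigned)p->root_index[i] < num_slots);
      bound[p->root_index[i]] = p->handle[i];
   }
   return p->count;
}
```

Separating the two phases lets other work, such as the resource state transitions in the hunk, run between recording and binding without changing which slots end up bound.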


@@ -272,7 +272,7 @@ etna_resource_alloc(struct pipe_screen *pscreen, unsigned layout,
size = setup_miptree(rsc, paddingX, paddingY, msaa_xscale, msaa_yscale);
if (unlikely(templat->bind & PIPE_BIND_SCANOUT) && screen->ro->kms_fd >= 0) {
if (unlikely(templat->bind & PIPE_BIND_SCANOUT) && screen->ro) {
struct pipe_resource scanout_templat = *templat;
struct winsys_handle handle;
@@ -580,9 +580,23 @@ etna_resource_get_handle(struct pipe_screen *pscreen,
struct pipe_resource *prsc,
struct winsys_handle *handle, unsigned usage)
{
struct etna_screen *screen = etna_screen(pscreen);
struct etna_resource *rsc = etna_resource(prsc);
struct renderonly_scanout *scanout;
if (handle->plane) {
struct pipe_resource *cur = prsc;
for (int i = 0; i < handle->plane; i++) {
cur = cur->next;
if (!cur)
return false;
}
rsc = etna_resource(cur);
}
/* Scanout is always attached to the base resource */
struct renderonly_scanout *scanout = rsc->scanout;
scanout = rsc->scanout;
handle->stride = rsc->levels[0].stride;
handle->offset = rsc->levels[0].offset;
@@ -594,8 +608,8 @@ etna_resource_get_handle(struct pipe_screen *pscreen,
if (handle->type == WINSYS_HANDLE_TYPE_SHARED) {
return etna_bo_get_name(rsc->bo, &handle->handle) == 0;
} else if (handle->type == WINSYS_HANDLE_TYPE_KMS) {
if (renderonly_get_handle(scanout, handle)) {
return true;
if (screen->ro) {
return renderonly_get_handle(scanout, handle);
} else {
handle->handle = etna_bo_handle(rsc->bo);
return true;
@@ -608,6 +622,27 @@ etna_resource_get_handle(struct pipe_screen *pscreen,
}
}
static bool
etna_resource_get_param(struct pipe_screen *pscreen,
struct pipe_context *pctx, struct pipe_resource *prsc,
unsigned plane, unsigned layer, unsigned level,
enum pipe_resource_param param,
unsigned usage, uint64_t *value)
{
switch (param) {
case PIPE_RESOURCE_PARAM_NPLANES: {
unsigned count = 0;
for (struct pipe_resource *cur = prsc; cur; cur = cur->next)
count++;
*value = count;
return true;
}
default:
return false;
}
}
void
etna_resource_used(struct etna_context *ctx, struct pipe_resource *prsc,
enum etna_resource_status status)
@@ -707,6 +742,7 @@ etna_resource_screen_init(struct pipe_screen *pscreen)
pscreen->resource_create_with_modifiers = etna_resource_create_modifiers;
pscreen->resource_from_handle = etna_resource_from_handle;
pscreen->resource_get_handle = etna_resource_get_handle;
pscreen->resource_get_param = etna_resource_get_param;
pscreen->resource_changed = etna_resource_changed;
pscreen->resource_destroy = etna_resource_destroy;
}
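
Both etnaviv additions above walk the `next` chain of a resource: once to reach plane N, once to count planes for PIPE_RESOURCE_PARAM_NPLANES. A self-contained sketch of the two walks, using a hypothetical `struct plane` in place of struct pipe_resource:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource whose planes are chained by `next`. */
struct plane {
   struct plane *next;
   int id;
};

/* Return plane `index` of the chain starting at `base`, or NULL if the
 * chain is shorter than that.
 */
static struct plane *
nth_plane(struct plane *base, unsigned index)
{
   struct plane *cur = base;

   assert(base != NULL);
   for (unsigned i = 0; i < index; i++) {
      cur = cur->next;
      if (!cur)
         return NULL;
   }
   return cur;
}

/* Count the planes in the chain, including the base resource. */
static unsigned
count_planes(const struct plane *base)
{
   unsigned count = 0;

   for (const struct plane *cur = base; cur; cur = cur->next)
      count++;
   return count;
}
```

For a two-plane chain, `nth_plane(base, 1)` returns the second plane and `nth_plane(base, 2)` returns NULL, which corresponds to the `return false` path in the hunk.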


@@ -803,15 +803,19 @@ fd_screen_bo_get_handle(struct pipe_screen *pscreen, struct fd_bo *bo,
struct renderonly_scanout *scanout, unsigned stride,
struct winsys_handle *whandle)
{
struct fd_screen *screen = fd_screen(pscreen);
whandle->stride = stride;
if (whandle->type == WINSYS_HANDLE_TYPE_SHARED) {
return fd_bo_get_name(bo, &whandle->handle) == 0;
} else if (whandle->type == WINSYS_HANDLE_TYPE_KMS) {
if (renderonly_get_handle(scanout, whandle))
if (screen->ro) {
return renderonly_get_handle(scanout, whandle);
} else {
whandle->handle = fd_bo_handle(bo);
return true;
whandle->handle = fd_bo_handle(bo);
return true;
}
} else if (whandle->type == WINSYS_HANDLE_TYPE_FD) {
whandle->handle = fd_bo_dmabuf(bo);
return true;


@@ -416,9 +416,8 @@ lima_resource_get_handle(struct pipe_screen *pscreen,
res->modifier_constant = true;
if (handle->type == WINSYS_HANDLE_TYPE_KMS && screen->ro &&
renderonly_get_handle(res->scanout, handle))
return true;
if (handle->type == WINSYS_HANDLE_TYPE_KMS && screen->ro)
return renderonly_get_handle(res->scanout, handle);
if (!lima_bo_export(res->bo, handle))
return false;


@@ -6,7 +6,6 @@ api/clenqueuemigratememobjects: skip
api/clgetextensionfunctionaddressforplatform: skip
api/clgetkernelarginfo: skip
api/cllinkprogram: skip
api/clsetkernelarg/set kernel argument for cl_int3: fail
interop/egl_khr_cl_event2: skip
program/build/include-directories: fail
program/build/math-intrinsics: fail


@@ -45,7 +45,7 @@ traces:
- path: gputest/furmark.trace
expectations:
- device: gl-panfrost-t860
checksum: 2bde9efdddd92c28d29f744e36a226e9
checksum: 6540f71b1c051ba82af2a25b93065f34
- path: gputest/triangle.trace
expectations:
- device: gl-panfrost-t860
@@ -55,7 +55,7 @@ traces:
- path: humus/Portals.trace
expectations:
- device: gl-panfrost-t860
checksum: f83da726bff354684a576effa74ef681
checksum: ad04db74ea70b7772719080f8a4c499b
- device: gl-panfrost-t760
# Wrong rendering, many elements are missing
checksum: 67db7302b28cb8e3e217cc79b672af79
@@ -213,7 +213,7 @@ traces:
- path: humus/AmbientAperture.trace
expectations:
- device: gl-panfrost-t860
checksum: 20492edd94ea94ba73013a4ee14285b7
checksum: e4c0b930ef99f14305e1ade7f1779c09
- path: humus/CelShading.trace
expectations:
- device: gl-panfrost-t860


@@ -144,13 +144,14 @@ panfrost_resource_get_handle(struct pipe_screen *pscreen,
if (handle->type == WINSYS_HANDLE_TYPE_SHARED) {
return false;
} else if (handle->type == WINSYS_HANDLE_TYPE_KMS) {
if (renderonly_get_handle(scanout, handle))
if (dev->ro) {
return renderonly_get_handle(scanout, handle);
} else {
handle->handle = rsrc->image.data.bo->gem_handle;
handle->stride = rsrc->image.layout.slices[0].line_stride;
handle->offset = rsrc->image.layout.slices[0].offset;
return true;
handle->handle = rsrc->image.data.bo->gem_handle;
handle->stride = rsrc->image.layout.slices[0].line_stride;
handle->offset = rsrc->image.layout.slices[0].offset;
return TRUE;
}
} else if (handle->type == WINSYS_HANDLE_TYPE_FD) {
if (scanout) {
struct drm_prime_handle args = {


@@ -5330,7 +5330,7 @@ void si_init_cs_preamble_state(struct si_context *sctx, bool uses_reg_shadowing)
unsigned vgt_tess_distribution;
vgt_tess_distribution = S_028B50_ACCUM_ISOLINE(32) | S_028B50_ACCUM_TRI(11) |
S_028B50_ACCUM_QUAD(11) | S_028B50_DONUT_SPLIT(16);
S_028B50_ACCUM_QUAD(11) | S_028B50_DONUT_SPLIT_GFX81(16);
/* Testing with Unigine Heaven extreme tesselation yielded best results
* with TRAP_SPLIT = 3.
@@ -5361,7 +5361,7 @@ void si_init_cs_preamble_state(struct si_context *sctx, bool uses_reg_shadowing)
si_pm4_set_reg(pm4, R_028B50_VGT_TESS_DISTRIBUTION,
S_028B50_ACCUM_ISOLINE(40) | S_028B50_ACCUM_TRI(30) | S_028B50_ACCUM_QUAD(24) |
S_028B50_DONUT_SPLIT(24) | S_028B50_TRAP_SPLIT(6));
S_028B50_DONUT_SPLIT_GFX9(24) | S_028B50_TRAP_SPLIT(6));
si_pm4_set_reg(pm4, R_028C48_PA_SC_BINNER_CNTL_1,
S_028C48_MAX_ALLOC_COUNT(sscreen->info.pbb_max_alloc_count - 1) |
S_028C48_MAX_PRIM_PER_BATCH(1023));


@@ -432,10 +432,11 @@ v3d_resource_get_handle(struct pipe_screen *pscreen,
return v3d_bo_flink(bo, &whandle->handle);
case WINSYS_HANDLE_TYPE_KMS:
if (screen->ro) {
assert(rsc->scanout);
bool ok = renderonly_get_handle(rsc->scanout, whandle);
whandle->stride = rsc->slices[0].stride;
return ok;
if (renderonly_get_handle(rsc->scanout, whandle)) {
whandle->stride = rsc->slices[0].stride;
return true;
}
return false;
}
whandle->handle = bo->handle;
return true;
@@ -928,10 +929,6 @@ v3d_resource_from_handle(struct pipe_screen *pscreen,
renderonly_create_gpu_import_for_resource(prsc,
screen->ro,
NULL);
if (!rsc->scanout) {
fprintf(stderr, "Failed to create scanout resource.\n");
goto fail;
}
}
if (rsc->tiled && whandle->stride != slice->stride) {


@@ -320,7 +320,6 @@ vc4_resource_get_handle(struct pipe_screen *pscreen,
return vc4_bo_flink(rsc->bo, &whandle->handle);
case WINSYS_HANDLE_TYPE_KMS:
if (screen->ro) {
assert(rsc->scanout);
return renderonly_get_handle(rsc->scanout, whandle);
}
whandle->handle = rsc->bo->handle;
@@ -689,8 +688,6 @@ vc4_resource_from_handle(struct pipe_screen *pscreen,
renderonly_create_gpu_import_for_resource(prsc,
screen->ro,
NULL);
if (!rsc->scanout)
goto fail;
}
if (rsc->tiled && whandle->stride != slice->stride) {


@@ -25,8 +25,8 @@ from xml.etree import ElementTree
from typing import List,Tuple
class Version:
device_version : Tuple[int, int, int] = (1,0,0)
struct_version : Tuple[int, int] = (1,0)
device_version = (1,0,0)
struct_version = (1,0)
def __init__(self, version, struct=()):
self.device_version = version
@@ -59,17 +59,17 @@ class Version:
+ '_' + struct)
class Extension:
name : str = None
alias : str = None
is_required : bool = False
is_nonstandard : bool = False
enable_conds : List[str] = None
core_since : Version = None
name = None
alias = None
is_required = False
is_nonstandard = False
enable_conds = None
core_since = None
# these are specific to zink_device_info.py:
has_properties : bool = False
has_features : bool = False
guard : bool = False
has_properties = False
has_features = False
guard = False
def __init__(self, name, alias="", required=False, nonstandard=False,
properties=False, features=False, conditions=None, guard=False,
@@ -143,16 +143,16 @@ Layer = Extension
class ExtensionRegistryEntry:
# type of extension - right now it's either "instance" or "device"
ext_type : str = ""
ext_type = ""
# the version in which the extension is promoted to core VK
promoted_in : Version = None
promoted_in = None
# functions added by the extension are referred to as "commands" in the registry
device_commands : List[str] = None
pdevice_commands : List[str] = None
instance_commands : List[str] = None
constants : List[str] = None
features_struct : str = None
properties_struct : str = None
device_commands = None
pdevice_commands = None
instance_commands = None
constants = None
features_struct = None
properties_struct = None
class ExtensionRegistry:
# key = extension name, value = registry entry


@@ -247,7 +247,7 @@ kernel::exec_context::bind(intrusive_ptr<command_queue> _q,
case module::argument::constant_buffer: {
auto arg = argument::create(marg);
cl_mem buf = kern._constant_buffers.at(&q->device()).get();
arg->set(q->device().address_bits() / 8, &buf);
arg->set(sizeof(buf), &buf);
arg->bind(*this, marg);
break;
}


@@ -324,8 +324,8 @@ clover_lower_nir(nir_shader *nir, std::vector<module::argument> &args,
"constant_buffer_addr");
constant_var->data.location = args.size();
args.emplace_back(module::argument::global,
pointer_bit_size / 8, pointer_bit_size / 8, pointer_bit_size / 8,
args.emplace_back(module::argument::global, sizeof(cl_mem),
pointer_bit_size / 8, pointer_bit_size / 8,
module::argument::zero_ext,
module::argument::constant_buffer);
}


@@ -330,9 +330,8 @@ namespace {
const auto elem_size = types_iter->second.size;
const auto elem_nbs = get<uint32_t>(inst, 3);
const auto size = elem_size * elem_nbs;
const auto align = elem_size * util_next_power_of_two(elem_nbs);
types[id] = { module::argument::scalar, size, size, align,
const auto size = elem_size * (elem_nbs != 3 ? elem_nbs : 4);
types[id] = { module::argument::scalar, size, size, size,
module::argument::zero_ext };
types[id].info.address_qualifier = CL_KERNEL_ARG_ADDRESS_PRIVATE;
break;
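
The size change above follows the OpenCL rule that a 3-component vector occupies the same storage as a 4-component one, so the argument size (and here also the alignment) is computed from 4 elements. A minimal sketch of the same expression; `vector_arg_size` is a hypothetical helper written in C rather than the C++ of the hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of a vector kernel argument with `num_components`
 * elements of `elem_size` bytes each. Three-component vectors are
 * padded to four components.
 */
static size_t
vector_arg_size(size_t elem_size, unsigned num_components)
{
   assert(num_components >= 1);
   return elem_size * (num_components != 3 ? num_components : 4);
}
```

For a `cl_int3` argument this gives 16 bytes, which matches `sizeof(cl_int3)` on the host and is the size clSetKernelArg() is called with.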


@@ -271,6 +271,8 @@ dri_unbind_context(__DRIcontext * cPriv)
stapi->make_current(stapi, NULL, NULL, NULL);
}
}
ctx->dPriv = NULL;
ctx->rPriv = NULL;
return GL_TRUE;
}


@@ -28,9 +28,6 @@ dEQP-VK.glsl.texture_functions.query.texturequerylod.sampler2darray_fixed_fragme
dEQP-VK.glsl.texture_functions.query.texturequerylod.sampler2darrayshadow_fragment,Fail
dEQP-VK.glsl.texture_functions.query.texturequerylod.sampler2dshadow_fragment,Fail
dEQP-VK.glsl.texture_functions.query.texturequerylod.usampler1darray_fragment,Fail
dEQP-VK.image.mismatched_formats.image_write.a8b8g8r8_srgb_pack32_with_rgb10a2,Crash
dEQP-VK.image.mismatched_formats.image_write.b8g8r8a8_srgb_with_rgba8,Crash
dEQP-VK.image.mismatched_formats.image_write.r8g8b8a8_srgb_with_rgb10a2,Crash
dEQP-VK.rasterization.primitives.static_stipple.rectangular_line_strip_wide,Fail
dEQP-VK.rasterization.primitives_multisample_4_bit.dynamic_stipple.line_strip_wide,Fail
dEQP-VK.texture.filtering.2d.combinations.linear_mipmap_linear.linear.clamp_to_edge.repeat,Fail


@@ -144,6 +144,9 @@ lower_vri_instr_tex_deref(nir_tex_instr *tex,
else
tex->texture_index = value;
if (deref_src_type == nir_tex_src_sampler_deref)
return 0;
if (deref_instr->deref_type == nir_deref_type_array) {
assert(glsl_type_is_array(var->type));
assert(value >= 0);


@@ -1010,7 +1010,7 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
struct ureg_dst tmp;
assert(!param->rel || (IS_VS && param->file == D3DSPR_CONST) ||
(D3DSPR_ADDR && tx->version.major == 3));
(param->file == D3DSPR_INPUT && tx->version.major == 3));
switch (param->file)
{
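
The assert change above replaces a bare constant operand with an actual comparison. A nonzero constant on the left of `&&` is always true, so the old clause only tested the shader version. The sketch below isolates that last clause. The enum and its values are illustrative assumptions rather than the real D3DSPR_* definitions, and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative register file values; assumed, not taken from the headers. */
enum example_file {
   EXAMPLE_FILE_TEMP = 0,
   EXAMPLE_FILE_INPUT = 1,
   EXAMPLE_FILE_CONST = 2,
   EXAMPLE_FILE_ADDR = 3,
};

/* Old shape: the constant is nonzero, so `file` is never consulted. */
static bool
old_rel_allowed(enum example_file file, unsigned major)
{
   (void)file;
   return EXAMPLE_FILE_ADDR && major == 3;
}

/* New shape: the register file is actually compared. */
static bool
new_rel_allowed(enum example_file file, unsigned major)
{
   return file == EXAMPLE_FILE_INPUT && major == 3;
}
```

With these definitions `old_rel_allowed(EXAMPLE_FILE_TEMP, 3)` is true while `new_rel_allowed(EXAMPLE_FILE_TEMP, 3)` is false. Some compilers may warn about a constant operand to `&&`, which is one way this kind of slip can be noticed.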


@@ -53,7 +53,7 @@ vlVaHandleVAEncPictureParameterBufferTypeH264(vlVaDriver *drv, vlVaContext *cont
context->coded_buf = coded_buf;
_mesa_hash_table_insert(context->desc.h264enc.frame_idx,
UINT_TO_PTR(h264->CurrPic.picture_id),
UINT_TO_PTR(h264->CurrPic.picture_id + 1),
UINT_TO_PTR(h264->frame_num));
if (h264->pic_fields.bits.idr_pic_flag == 1)
@@ -84,12 +84,12 @@ vlVaHandleVAEncSliceParameterBufferTypeH264(vlVaDriver *drv, vlVaContext *contex
if (h264->RefPicList0[i].picture_id != VA_INVALID_ID) {
if (context->desc.h264enc.ref_idx_l0 == VA_INVALID_ID)
context->desc.h264enc.ref_idx_l0 = PTR_TO_UINT(util_hash_table_get(context->desc.h264enc.frame_idx,
UINT_TO_PTR(h264->RefPicList0[i].picture_id)));
UINT_TO_PTR(h264->RefPicList0[i].picture_id + 1)));
}
if (h264->RefPicList1[i].picture_id != VA_INVALID_ID && h264->slice_type == 1) {
if (context->desc.h264enc.ref_idx_l1 == VA_INVALID_ID)
context->desc.h264enc.ref_idx_l1 = PTR_TO_UINT(util_hash_table_get(context->desc.h264enc.frame_idx,
UINT_TO_PTR(h264->RefPicList1[i].picture_id)));
UINT_TO_PTR(h264->RefPicList1[i].picture_id + 1)));
}
}
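
The hunks above add 1 to picture_id on both the insert and the lookup side. The diff does not say why. One plausible reason (an assumption here) is that a picture_id of 0 would otherwise turn into a NULL key, which pointer-keyed hash tables commonly reserve for empty slots. The sketch shows only the key encoding, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a 32-bit id as a pointer-sized key that is never NULL for the
 * ids used here (the all-ones invalid id is assumed to be filtered out
 * by the caller, as in the hunk).
 */
static void *
id_to_key(uint32_t id)
{
   return (void *)((uintptr_t)id + 1);
}

/* Inverse of id_to_key(). */
static uint32_t
key_to_id(const void *key)
{
   assert(key != NULL);
   return (uint32_t)((uintptr_t)key - 1);
}
```

The important property is that insert and lookup use the same encoding, which is why the change touches every place where the key is built.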


@@ -83,7 +83,7 @@ vlVaHandleVAEncPictureParameterBufferTypeHEVC(vlVaDriver *drv, vlVaContext *cont
context->desc.h265enc.pic.constrained_intra_pred_flag = h265->pic_fields.bits.constrained_intra_pred_flag;
_mesa_hash_table_insert(context->desc.h265enc.frame_idx,
UINT_TO_PTR(h265->decoded_curr_pic.picture_id),
UINT_TO_PTR(h265->decoded_curr_pic.picture_id + 1),
UINT_TO_PTR(context->desc.h265enc.frame_num));
return VA_STATUS_SUCCESS;
@@ -102,12 +102,12 @@ vlVaHandleVAEncSliceParameterBufferTypeHEVC(vlVaDriver *drv, vlVaContext *contex
if (h265->ref_pic_list0[i].picture_id != VA_INVALID_ID) {
if (context->desc.h265enc.ref_idx_l0 == VA_INVALID_ID)
context->desc.h265enc.ref_idx_l0 = PTR_TO_UINT(util_hash_table_get(context->desc.h265enc.frame_idx,
UINT_TO_PTR(h265->ref_pic_list0[i].picture_id)));
UINT_TO_PTR(h265->ref_pic_list0[i].picture_id + 1)));
}
if (h265->ref_pic_list1[i].picture_id != VA_INVALID_ID && h265->slice_type == 1) {
if (context->desc.h265enc.ref_idx_l1 == VA_INVALID_ID)
context->desc.h265enc.ref_idx_l1 = PTR_TO_UINT(util_hash_table_get(context->desc.h265enc.frame_idx,
UINT_TO_PTR(h265->ref_pic_list1[i].picture_id)));
UINT_TO_PTR(h265->ref_pic_list1[i].picture_id + 1)));
}
}


@@ -429,11 +429,12 @@ static const char *const dp_dc0_msg_type_gfx7[16] = {
[GFX7_DATAPORT_DC_UNTYPED_SURFACE_WRITE] = "DC untyped surface write",
};
static const int dp_oword_block_rw[8] = {
[BRW_DATAPORT_OWORD_BLOCK_1_OWORDLOW] = 1,
[BRW_DATAPORT_OWORD_BLOCK_2_OWORDS] = 2,
[BRW_DATAPORT_OWORD_BLOCK_4_OWORDS] = 4,
[BRW_DATAPORT_OWORD_BLOCK_8_OWORDS] = 8,
static const char *const dp_oword_block_rw[8] = {
[BRW_DATAPORT_OWORD_BLOCK_1_OWORDLOW] = "1-low",
[BRW_DATAPORT_OWORD_BLOCK_1_OWORDHIGH] = "1-high",
[BRW_DATAPORT_OWORD_BLOCK_2_OWORDS] = "2",
[BRW_DATAPORT_OWORD_BLOCK_4_OWORDS] = "4",
[BRW_DATAPORT_OWORD_BLOCK_8_OWORDS] = "8",
};
static const char *const dp_dc1_msg_type_hsw[32] = {
@@ -2307,8 +2308,8 @@ brw_disassemble_inst(FILE *file, const struct intel_device_info *devinfo,
case GFX7_DATAPORT_DC_OWORD_BLOCK_READ:
case GFX7_DATAPORT_DC_OWORD_BLOCK_WRITE: {
unsigned msg_ctrl = brw_dp_desc_msg_control(devinfo, imm_desc);
assert(dp_oword_block_rw[msg_ctrl & 7] > 0);
format(file, "owords = %d, aligned = %d",
assert(dp_oword_block_rw[msg_ctrl & 7]);
format(file, "owords = %s, aligned = %d",
dp_oword_block_rw[msg_ctrl & 7], (msg_ctrl >> 3) & 3);
break;
}
@@ -2369,8 +2370,8 @@ brw_disassemble_inst(FILE *file, const struct intel_device_info *devinfo,
break;
case GFX9_DATAPORT_DC_PORT1_A64_OWORD_BLOCK_WRITE:
case GFX9_DATAPORT_DC_PORT1_A64_OWORD_BLOCK_READ:
assert(dp_oword_block_rw[msg_ctrl & 7] > 0);
format(file, "owords = %d, aligned = %d",
assert(dp_oword_block_rw[msg_ctrl & 7]);
format(file, "owords = %s, aligned = %d",
dp_oword_block_rw[msg_ctrl & 7], (msg_ctrl >> 3) & 3);
break;
default:
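
Switching dp_oword_block_rw from int to string entries lets "one oword, low half" and "one oword, high half" be told apart, and lets a missing entry show up as NULL instead of 0. A reduced sketch of that table, assuming the control encodings 0 to 4 shown below. The enum names and values are placeholders, not the BRW_DATAPORT_* definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder encodings for the low three message control bits. */
enum {
   EX_OWORD_1_LOW = 0,
   EX_OWORD_1_HIGH = 1,
   EX_OWORD_2 = 2,
   EX_OWORD_4 = 3,
   EX_OWORD_8 = 4,
};

/* Designated initializers leave the unlisted slots (5..7) as NULL. */
static const char *const example_oword_names[8] = {
   [EX_OWORD_1_LOW] = "1-low",
   [EX_OWORD_1_HIGH] = "1-high",
   [EX_OWORD_2] = "2",
   [EX_OWORD_4] = "4",
   [EX_OWORD_8] = "8",
};

/* Decode the block size name, or NULL for an unknown encoding. */
static const char *
oword_block_name(unsigned msg_ctrl)
{
   return example_oword_names[msg_ctrl & 7];
}
```

In the old int table the "1-high" slot had no initializer, so it read back as 0 and tripped the `> 0` assert. With strings, the assert only has to check for a non-NULL entry.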


@@ -2656,16 +2656,23 @@ fs_visitor::assign_constant_locations()
/* Now that we know how many regular uniforms we'll push, reduce the
* UBO push ranges so we don't exceed the 3DSTATE_CONSTANT limits.
*/
/* For gen4/5:
* Only allow 16 registers (128 uniform components) as push constants.
*
* If changing this value, note the limitation about total_regs in
* brw_curbe.c/crocus_state.c
*/
const unsigned max_push_length = compiler->devinfo->ver < 6 ? 16 : 64;
unsigned push_length = DIV_ROUND_UP(stage_prog_data->nr_params, 8);
for (int i = 0; i < 4; i++) {
struct brw_ubo_range *range = &prog_data->ubo_ranges[i];
if (push_length + range->length > 64)
range->length = 64 - push_length;
if (push_length + range->length > max_push_length)
range->length = max_push_length - push_length;
push_length += range->length;
}
assert(push_length <= 64);
assert(push_length <= max_push_length);
}
bool
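
The loop in assign_constant_locations() trims the UBO push ranges so that the running total never exceeds the limit, which the hunk above makes depend on the hardware generation (16 registers below ver 6, 64 otherwise). A stand-alone sketch of the clamping, with a plain array of lengths in place of struct brw_ubo_range; `clamp_push_ranges` is a hypothetical name.

```c
#include <assert.h>

#define EXAMPLE_NUM_RANGES 4

/* Shrink each range so that push_length plus all ranges stays within
 * max_push_length. Returns the final total push length.
 */
static unsigned
clamp_push_ranges(unsigned lengths[EXAMPLE_NUM_RANGES],
                  unsigned push_length, unsigned max_push_length)
{
   assert(push_length <= max_push_length);

   for (int i = 0; i < EXAMPLE_NUM_RANGES; i++) {
      if (push_length + lengths[i] > max_push_length)
         lengths[i] = max_push_length - push_length;
      push_length += lengths[i];
   }

   assert(push_length <= max_push_length);
   return push_length;
}
```

Starting from 4 registers of regular uniforms and four ranges of 8, a limit of 16 leaves the ranges as 8, 4, 0, 0, while a limit of 64 leaves them untouched for a total of 36.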


@@ -359,6 +359,7 @@ fs_visitor::nir_emit_if(nir_if *if_stmt)
if (cond != NULL && cond->op == nir_op_inot) {
invert = true;
cond_reg = get_nir_src(cond->src[0].src);
cond_reg = offset(cond_reg, bld, cond->src[0].swizzle[0]);
} else {
invert = false;
cond_reg = get_nir_src(if_stmt->condition);


@@ -999,7 +999,7 @@
<field name="2D Command Opcode" start="22" end="28" type="uint" default="1"/>
<field name="Command Type" start="29" end="31" type="uint" default="2"/>
<field name="Destination Pitch" start="32" end="47" type="int"/>
<field name="Raster Operation" start="48" end="55" type="int"/>
<field name="Raster Operation" start="48" end="55" type="uint"/>
<field name="Color Depth" start="56" end="57" type="uint" prefix="COLOR_DEPTH">
<value name="8 bit" value="0"/>
<value name="565" value="1"/>


@@ -1009,7 +1009,7 @@
<field name="2D Command Opcode" start="22" end="28" type="uint" default="80"/>
<field name="Command Type" start="29" end="31" type="uint" default="2"/>
<field name="Destination Pitch" start="32" end="47" type="int"/>
<field name="Raster Operation" start="48" end="55" type="int"/>
<field name="Raster Operation" start="48" end="55" type="uint"/>
<field name="Color Depth" start="56" end="57" type="uint" prefix="COLOR_DEPTH">
<value name="8 bit" value="0"/>
<value name="565" value="1"/>


@@ -1087,7 +1087,7 @@
<field name="2D Command Opcode" start="22" end="28" type="uint" default="80"/>
<field name="Command Type" start="29" end="31" type="uint" default="2"/>
<field name="Destination Pitch" start="32" end="47" type="int"/>
<field name="Raster Operation" start="48" end="55" type="int"/>
<field name="Raster Operation" start="48" end="55" type="uint"/>
<field name="Color Depth" start="56" end="58" type="uint" prefix="COLOR_DEPTH">
<value name="8 bit" value="0"/>
<value name="565" value="1"/>


@@ -1220,7 +1220,12 @@ anv_queue_submit_add_cmd_buffer(struct anv_queue_submit *submit,
}
submit->cmd_buffers[submit->cmd_buffer_count++] = cmd_buffer;
submit->perf_query_pool = cmd_buffer->perf_query_pool;
/* Only update the perf_query_pool if there is one. We can decide to batch
* 2 command buffers if the second one doesn't use a query pool, but we
* can't drop the already chosen one.
*/
if (cmd_buffer->perf_query_pool)
submit->perf_query_pool = cmd_buffer->perf_query_pool;
submit->perf_query_pass = perf_pass;
return VK_SUCCESS;
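Note on the hunk above: the effect of the new guard is that a later command buffer without a query pool no longer clears the pool chosen by an earlier one. A minimal sketch under simplified, hypothetical types (`struct query_pool`, `struct cmd_buffer`, `struct submit` and `submit_add_cmd_buffer` are illustrations, not the anv definitions):

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the driver structs. */
struct query_pool { int id; };
struct cmd_buffer { struct query_pool *perf_query_pool; };
struct submit     { struct query_pool *perf_query_pool; };

/* Only overwrite the submit's pool when the incoming command buffer
 * actually has one, so an already chosen pool is never dropped. */
static void
submit_add_cmd_buffer(struct submit *submit, const struct cmd_buffer *cmd_buffer)
{
   if (cmd_buffer->perf_query_pool)
      submit->perf_query_pool = cmd_buffer->perf_query_pool;
}
```

Batching a buffer that uses a pool followed by one that does not leaves the first buffer's pool in place, which is the case the comment in the patch describes.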


@@ -1448,7 +1448,8 @@ dri3_alloc_render_buffer(struct loader_dri3_drawable *draw, unsigned int format,
dri3_linear_format_for_format(draw, format),
__DRI_IMAGE_USE_SHARE |
__DRI_IMAGE_USE_LINEAR |
__DRI_IMAGE_USE_BACKBUFFER,
__DRI_IMAGE_USE_BACKBUFFER |
__DRI_IMAGE_USE_SCANOUT,
buffer);
pixmap_buffer = linear_buffer_display_gpu;
}
@@ -1460,7 +1461,8 @@ dri3_alloc_render_buffer(struct loader_dri3_drawable *draw, unsigned int format,
dri3_linear_format_for_format(draw, format),
__DRI_IMAGE_USE_SHARE |
__DRI_IMAGE_USE_LINEAR |
__DRI_IMAGE_USE_BACKBUFFER,
__DRI_IMAGE_USE_BACKBUFFER |
__DRI_IMAGE_USE_SCANOUT,
buffer);
pixmap_buffer = buffer->linear_buffer;


@@ -3452,6 +3452,8 @@ egl_image_target_texture(struct gl_context *ctx,
if (tex_storage)
_mesa_set_texture_view_state(ctx, texObj, target, 1);
_mesa_update_fbo_texture(ctx, texObj, 0, 0);
_mesa_unlock_texture(ctx, texObj);
}


@@ -507,12 +507,6 @@ bi_can_iaddc(bi_instr *ins)
ASSERTED static bool
bi_can_fma(bi_instr *ins)
{
/* Errata: *V2F32_TO_V2F16 with distinct sources raises
* INSTR_INVALID_ENC under certain conditions */
if (ins->op == BI_OPCODE_V2F32_TO_V2F16 &&
!bi_is_word_equiv(ins->src[0], ins->src[1]))
return false;
/* +IADD.i32 -> *IADDC.i32 */
if (bi_can_iaddc(ins))
return true;
@@ -624,6 +618,72 @@ bi_reads_temps(bi_instr *ins, unsigned src)
}
}
static bool
bi_impacted_t_modifiers(bi_instr *I, unsigned src)
{
enum bi_swizzle swizzle = I->src[src].swizzle;
switch (I->op) {
case BI_OPCODE_F16_TO_F32:
case BI_OPCODE_F16_TO_S32:
case BI_OPCODE_F16_TO_U32:
case BI_OPCODE_MKVEC_V2I16:
case BI_OPCODE_S16_TO_F32:
case BI_OPCODE_S16_TO_S32:
case BI_OPCODE_U16_TO_F32:
case BI_OPCODE_U16_TO_U32:
return (swizzle != BI_SWIZZLE_H00);
case BI_OPCODE_BRANCH_F32:
case BI_OPCODE_LOGB_F32:
case BI_OPCODE_ILOGB_F32:
case BI_OPCODE_FADD_F32:
case BI_OPCODE_FCMP_F32:
case BI_OPCODE_FREXPE_F32:
case BI_OPCODE_FREXPM_F32:
case BI_OPCODE_FROUND_F32:
return (swizzle != BI_SWIZZLE_H01);
case BI_OPCODE_IADD_S32:
case BI_OPCODE_IADD_U32:
case BI_OPCODE_ISUB_S32:
case BI_OPCODE_ISUB_U32:
case BI_OPCODE_IADD_V4S8:
case BI_OPCODE_IADD_V4U8:
case BI_OPCODE_ISUB_V4S8:
case BI_OPCODE_ISUB_V4U8:
return (src == 1) && (swizzle != BI_SWIZZLE_H01);
case BI_OPCODE_S8_TO_F32:
case BI_OPCODE_S8_TO_S32:
case BI_OPCODE_U8_TO_F32:
case BI_OPCODE_U8_TO_U32:
return (swizzle != BI_SWIZZLE_B0000);
case BI_OPCODE_V2S8_TO_V2F16:
case BI_OPCODE_V2S8_TO_V2S16:
case BI_OPCODE_V2U8_TO_V2F16:
case BI_OPCODE_V2U8_TO_V2U16:
return (swizzle != BI_SWIZZLE_B0022);
case BI_OPCODE_IADD_V2S16:
case BI_OPCODE_IADD_V2U16:
case BI_OPCODE_ISUB_V2S16:
case BI_OPCODE_ISUB_V2U16:
return (src == 1) && (swizzle >= BI_SWIZZLE_H11);
#if 0
/* Restriction on IADD in 64-bit clauses on G72 */
case BI_OPCODE_IADD_S64:
case BI_OPCODE_IADD_U64:
return (src == 1) && (swizzle != BI_SWIZZLE_D0);
#endif
default:
return false;
}
}
ASSERTED static bool
bi_reads_t(bi_instr *ins, unsigned src)
{
@@ -640,6 +700,11 @@ bi_reads_t(bi_instr *ins, unsigned src)
if (src == 0 && bi_opcode_props[ins->op].sr_read)
return false;
/* Bifrost cores newer than Mali G71 have restrictions on swizzles on
* same-cycle temporaries. Check the list for these hazards. */
if (bi_impacted_t_modifiers(ins, src))
return false;
/* Descriptor must not come from a passthrough */
switch (ins->op) {
case BI_OPCODE_LD_CVT:


@@ -3173,13 +3173,15 @@ midgard_compile_shader_nir(nir_shader *nir,
/* Analyze now that the code is known but before scheduling creates
* pipeline registers which are harder to track */
mir_analyze_helper_terminate(ctx);
mir_analyze_helper_requirements(ctx);
/* Schedule! */
midgard_schedule_program(ctx);
mir_ra(ctx);
/* Analyze after scheduling since this is order-dependent */
mir_analyze_helper_terminate(ctx);
/* Emit flat binary from the instruction arrays. Iterate each block in
* sequence. Save instruction boundaries such that lookahead tags can
* be assigned easily */


@@ -1128,10 +1128,7 @@ vn_android_fix_buffer_create_info(
}
VkResult
vn_android_buffer_from_ahb(struct vn_device *dev,
const VkBufferCreateInfo *create_info,
const VkAllocationCallbacks *alloc,
struct vn_buffer **out_buf)
vn_android_init_ahb_buffer_memory_type_bits(struct vn_device *dev)
{
const uint32_t format = AHARDWAREBUFFER_FORMAT_BLOB;
/* ensure dma_buf_memory_type_bits covers host visible usage */
@@ -1142,7 +1139,6 @@ vn_android_buffer_from_ahb(struct vn_device *dev,
int dma_buf_fd = -1;
uint64_t alloc_size = 0;
uint32_t mem_type_bits = 0;
struct vn_android_buffer_create_info local_info;
VkResult result;
ahb = vn_android_ahb_allocate(4096, 1, 1, format, usage);
@@ -1164,6 +1160,20 @@ vn_android_buffer_from_ahb(struct vn_device *dev,
if (result != VK_SUCCESS)
return result;
dev->ahb_buffer_memory_type_bits = mem_type_bits;
return VK_SUCCESS;
}
VkResult
vn_android_buffer_from_ahb(struct vn_device *dev,
const VkBufferCreateInfo *create_info,
const VkAllocationCallbacks *alloc,
struct vn_buffer **out_buf)
{
struct vn_android_buffer_create_info local_info;
VkResult result;
create_info = vn_android_fix_buffer_create_info(create_info, &local_info);
result = vn_buffer_create(dev, create_info, alloc, out_buf);
if (result != VK_SUCCESS)
@@ -1174,7 +1184,7 @@ vn_android_buffer_from_ahb(struct vn_device *dev,
* properties.
*/
(*out_buf)->memory_requirements.memoryRequirements.memoryTypeBits &=
mem_type_bits;
dev->ahb_buffer_memory_type_bits;
assert((*out_buf)->memory_requirements.memoryRequirements.memoryTypeBits);


@@ -75,6 +75,9 @@ vn_android_buffer_from_ahb(struct vn_device *dev,
const VkAllocationCallbacks *alloc,
struct vn_buffer **out_buf);
VkResult
vn_android_init_ahb_buffer_memory_type_bits(struct vn_device *dev);
#else
static inline const VkNativeBufferANDROID *
@@ -157,6 +160,12 @@ vn_android_buffer_from_ahb(UNUSED struct vn_device *dev,
return VK_ERROR_OUT_OF_HOST_MEMORY;
}
static inline VkResult
vn_android_init_ahb_buffer_memory_type_bits(UNUSED struct vn_device *dev)
{
return VK_ERROR_FEATURE_NOT_PRESENT;
}
#endif /* ANDROID */
#endif /* VN_ANDROID_H */


@@ -3386,6 +3386,15 @@ vn_CreateDevice(VkPhysicalDevice physicalDevice,
mtx_init(&pool->mutex, mtx_plain);
}
if (dev->base.base.enabled_extensions
.ANDROID_external_memory_android_hardware_buffer) {
result = vn_android_init_ahb_buffer_memory_type_bits(dev);
if (result != VK_SUCCESS) {
vn_call_vkDestroyDevice(instance, dev_handle, NULL);
goto fail;
}
}
*pDevice = dev_handle;
if (pCreateInfo == &local_create_info)


@@ -139,6 +139,9 @@ struct vn_device {
uint32_t queue_count;
struct vn_device_memory_pool memory_pools[VK_MAX_MEMORY_TYPES];
/* cache memory type requirement for AHB backed VkBuffer */
uint32_t ahb_buffer_memory_type_bits;
};
VK_DEFINE_HANDLE_CASTS(vn_device,
base.base.base,


@@ -816,7 +816,7 @@ struct x11_swapchain {
bool has_present_queue;
bool has_acquire_queue;
VkResult status;
xcb_present_complete_mode_t last_present_mode;
bool copy_is_suboptimal;
struct wsi_queue present_queue;
struct wsi_queue acquire_queue;
pthread_t queue_manager;
@@ -932,25 +932,30 @@ x11_handle_dri3_present_event(struct x11_swapchain *chain,
}
VkResult result = VK_SUCCESS;
/* The winsys is now trying to flip directly and cannot due to our
* configuration. Request the user reallocate.
*/
switch (complete->mode) {
case XCB_PRESENT_COMPLETE_MODE_COPY:
if (chain->copy_is_suboptimal)
result = VK_SUBOPTIMAL_KHR;
break;
case XCB_PRESENT_COMPLETE_MODE_FLIP:
/* If we ever go from flipping to copying, the odds are very likely
* that we could reallocate in a more optimal way if we didn't have
* to care about scanout, so we always do this.
*/
chain->copy_is_suboptimal = true;
break;
#ifdef HAVE_DRI3_MODIFIERS
if (complete->mode == XCB_PRESENT_COMPLETE_MODE_SUBOPTIMAL_COPY &&
chain->last_present_mode != XCB_PRESENT_COMPLETE_MODE_SUBOPTIMAL_COPY)
case XCB_PRESENT_COMPLETE_MODE_SUBOPTIMAL_COPY:
/* The winsys is now trying to flip directly and cannot due to our
* configuration. Request the user reallocate.
*/
result = VK_SUBOPTIMAL_KHR;
break;
#endif
default:
break;
}
/* When we go from flipping to copying, the odds are very likely that
* we could reallocate in a more optimal way if we didn't have to care
* about scanout, so we always do this.
*/
if (complete->mode == XCB_PRESENT_COMPLETE_MODE_COPY &&
chain->last_present_mode == XCB_PRESENT_COMPLETE_MODE_FLIP)
result = VK_SUBOPTIMAL_KHR;
chain->last_present_mode = complete->mode;
return result;
}
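Note on the hunk above: the reworked handler replaces the remembered last completion mode with a single sticky flag. A minimal sketch of the resulting state machine, assuming hypothetical enum and struct names (`present_mode`, `present_result`, `chain_state`) in place of the XCB and Vulkan types:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the XCB completion modes and VkResult values. */
enum present_mode   { MODE_COPY, MODE_FLIP, MODE_SUBOPTIMAL_COPY };
enum present_result { RESULT_SUCCESS, RESULT_SUBOPTIMAL };

struct chain_state {
   bool copy_is_suboptimal; /* starts false for every new swapchain */
};

/* A copy is only reported as suboptimal once this swapchain has flipped
 * at least once; an explicit suboptimal copy is always reported. */
static enum present_result
handle_present_complete(struct chain_state *chain, enum present_mode mode)
{
   enum present_result result = RESULT_SUCCESS;

   switch (mode) {
   case MODE_COPY:
      if (chain->copy_is_suboptimal)
         result = RESULT_SUBOPTIMAL;
      break;
   case MODE_FLIP:
      chain->copy_is_suboptimal = true;
      break;
   case MODE_SUBOPTIMAL_COPY:
      result = RESULT_SUBOPTIMAL;
      break;
   default:
      break;
   }

   return result;
}
```

Because the flag is reset on swapchain creation rather than inherited, a surface that can only ever copy would stop returning the suboptimal result after one reallocation.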
@@ -1631,17 +1636,15 @@ x11_surface_create_swapchain(VkIcdSurfaceBase *icd_surface,
if (chain->extent.width != cur_width || chain->extent.height != cur_height)
chain->status = VK_SUBOPTIMAL_KHR;
/* If we are reallocating from an old swapchain, then we inherit its
* last completion mode, to ensure we don't get into reallocation
* cycles. If we are starting anew, we set 'COPY', as that is the only
* mode which provokes reallocation when anything changes, to make
* sure we have the most optimal allocation.
/* We used to inherit copy_is_suboptimal from pCreateInfo->oldSwapchain.
* When it was true, and when the next present was completed with copying,
* we would return VK_SUBOPTIMAL_KHR and hint the app to reallocate again
* for no good reason. If all following presents on the surface were
* completed with copying because of some surface state change, we would
* always return VK_SUBOPTIMAL_KHR no matter how many times the app had
* reallocated.
*/
VK_FROM_HANDLE(x11_swapchain, old_chain, pCreateInfo->oldSwapchain);
if (old_chain)
chain->last_present_mode = old_chain->last_present_mode;
else
chain->last_present_mode = XCB_PRESENT_COMPLETE_MODE_COPY;
chain->copy_is_suboptimal = false;
if (!wsi_device->sw)
if (!wsi_x11_check_dri3_compatible(wsi_device, conn))