Compare commits

39 Commits

Author SHA1 Message Date
Eric Engestrom
e658e900bb VERSION: bump to 20.1.0-rc2
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2020-05-06 21:49:41 +02:00
Marek Olšák
f7d67c99a6 radeonsi: fix compilation of monolithic PS
This was totally broken. Monolithic PS is only used if FBFETCH or
interpolateAtSample are used.

When the PS prolog was built, it overwrote ctx->main_fn.

Discovered by @eefano.

Fixes: 8832a88434 "radeonsi: move PS LLVM code into si_shader_llvm_ps.c"
Closes: #2814

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4918>
(cherry picked from commit 29da521280)
2020-05-06 19:32:39 +02:00
Danylo Piliaiev
b896c506b8 i965: Fix out-of-bounds access to brw_stage_state::surf_offset
../src/mesa/drivers/dri/i965/brw_wm_surface_state.c:1378:32: runtime error: index 3503345872 out of bounds for type 'uint32_t [149]'

brw_assign_common_binding_table_offsets has the following comment:
 "Unused groups are initialized to 0xd0d0d0d0 to make it obvious that they're
 unused but also make sure that addition of small offsets to them will
 trigger some of our asserts that surface indices are < BRW_MAX_SURFACES."

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4350>
(cherry picked from commit 784358bd6e)
2020-05-06 19:32:28 +02:00
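The index in the sanitizer report is the 0xd0d0d0d0 poison value from the quoted comment, read as an unsigned 32-bit integer. The Python sketch below only checks that arithmetic; `SURF_OFFSET_LEN` and `in_bounds` are names invented here for illustration, with the length 149 taken from the error message.

```python
# Poison value quoted in the brw_assign_common_binding_table_offsets comment.
POISON = 0xD0D0D0D0

# Array length from the sanitizer message ('uint32_t [149]').
# SURF_OFFSET_LEN is a hypothetical name used only in this sketch.
SURF_OFFSET_LEN = 149


def in_bounds(index, length=SURF_OFFSET_LEN):
    """Return True if index is a valid position in an array of `length` entries."""
    return 0 <= index < length


print(POISON)             # decimal form of the poison value
print(in_bounds(POISON))  # an unused group's offset lies far outside the array
```

Any binding-table group left at the poison value therefore cannot be used as an index without first checking that the group is actually in use.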
Samuel Pitoiset
fa1739113b radv: don't report error with other vendor DRM devices
Enumeration should just skip unsupported DRM devices.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4806>
(cherry picked from commit 8d993c9d2c)
2020-05-06 19:32:28 +02:00
Samuel Pitoiset
d4c1cb59c2 radv: report INITIALIZATION_FAILED when the amdgpu winsys init failed
The driver should be capable if it reaches the winsys initialization.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4806>
(cherry picked from commit f03abd5041)
2020-05-06 19:32:28 +02:00
Erik Faye-Lund
1ed51096ac zink: lower b2b to b2i
Zink requires 1-bit booleans, but this requirement was missed before
b2b1s started getting automatically inserted. Let's lower these away, to
avoid piglit regressions.

Fixes the following piglits:
- shaders@glsl-vs-if-bool
- spec@!opengl 2.0@vertex-program-two-side

Fixes: c217ee8d35 ("nir: Insert b2b1s around booleans in nir_lower_to")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2902
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4903>
(cherry picked from commit 7f6a491eec)
2020-05-06 19:32:28 +02:00
Dave Airlie
a36b7d8c97 llvmpipe/nir: free compute shader NIR
I forgot this in the last round.

Fixes: 18f896e55d (llvmpipe: add initial nir support)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4899>
(cherry picked from commit 870b6a6050)
2020-05-06 19:32:28 +02:00
Dave Airlie
ebb656bfb3 draw/tess: free tessellation control shader i/o memory.
Fixes: 0d02a7b8ca (draw: add main tessellation code)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4899>
(cherry picked from commit d1ad1be35a)
2020-05-06 19:32:28 +02:00
Rhys Perry
b4e46da708 nir: add missing group_memory_barrier handling
Totals from 2 (0.00% of 127638) affected shaders:
VGPRs: 164 -> 168 (+2.44%)
CodeSize: 18420 -> 18756 (+1.82%)
Instrs: 3658 -> 3700 (+1.15%)
Cycles: 82912 -> 83080 (+0.20%)
VMEM: 70 -> 69 (-1.43%)
PreVGPRs: 155 -> 168 (+8.39%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4889>
(cherry picked from commit a46aa3dc2e)
2020-05-06 19:32:28 +02:00
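The percentages in the shader-db totals above are plain relative changes, `(after - before) / before`. A minimal Python sketch that recomputes them from the before/after pairs in the commit message follows; `percent_change` and `format_row` are helper names made up for this illustration, not part of any shader-db tooling.

```python
# (name, before, after) triples copied from the commit message above.
TOTALS = [
    ("VGPRs", 164, 168),
    ("CodeSize", 18420, 18756),
    ("Instrs", 3658, 3700),
    ("Cycles", 82912, 83080),
    ("VMEM", 70, 69),
    ("PreVGPRs", 155, 168),
]


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def format_row(name, before, after):
    """Render one row in the same 'name: a -> b (+x.xx%)' layout as the report."""
    return "{}: {} -> {} ({:+.2f}%)".format(
        name, before, after, percent_change(before, after))


for row in TOTALS:
    print(format_row(*row))
```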
Pierre-Eric Pelloux-Prayer
f2a012f987 radeonsi: don't print gs_copy_shader stats for shaderdb
Fixes: dbc86fa3de ("radeonsi: dump shader stats when hitting the live cache")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4607>
(cherry picked from commit 547e81655a)
2020-05-06 19:32:28 +02:00
Pierre-Eric Pelloux-Prayer
a25234047f driconf: add force_integer_tex_nearest option
And enable it for "GRID Autosport" and "DIRT: Showdown" games.

CC: 20.1 <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1258
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4647>
(cherry picked from commit 403eb507f5)
2020-05-06 19:32:28 +02:00
Pierre-Eric Pelloux-Prayer
ae44a916ec mesa: add gl_context::ForceIntegerTexNearest
Some applications incorrectly use GL_LINEAR* values for integer textures.
copyimage.c already implemented a tolerance for such apps in prepare_target_err.

This commit adds a boolean that will treat GL_LINEAR* filters as
GL_NEAREST for integer textures.

CC: 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4647>
(cherry picked from commit 12fb7d7008)
2020-05-06 19:32:28 +02:00
Eric Engestrom
6486ac1a4c .pick_status.json: Update to 29da521280 2020-05-06 19:32:19 +02:00
Eric Engestrom
ad9b00ee4e .pick_status.json: Mark 3fac55ce0d as denominated 2020-05-06 19:09:48 +02:00
Marek Olšák
de3a2b29bc ac/surface: fix MSAA crash with FORCE_SWIZZLE_MODE on gfx9
Fixes: 3dc2ccc14c "ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE"

Closes: #2884

Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4862>
(cherry picked from commit c4cdef64ad)
2020-05-05 18:56:46 +02:00
Marek Olšák
12d23b4a08 Revert "ac: reassociate FP expressions for inexact instructions for radeonsi"
This reverts commit cf2f3c2753.

It breaks shadows in Unigine Superposition.

Fixes: cf2f3c2753

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4837>
(cherry picked from commit b97cc41aa2)
2020-05-05 18:56:46 +02:00
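Reassociation is risky because floating-point addition is not associative: regrouping the same operands can change the result. The Python sketch below shows the general effect with IEEE 754 doubles. It is not a reproduction of the Unigine Superposition failure, and the function names and operand values are chosen only for this illustration.

```python
def sum_left_to_right(a, b, c):
    """Evaluate (a + b) + c, the order written in the source expression."""
    return (a + b) + c


def sum_reassociated(a, b, c):
    """Evaluate a + (b + c), a grouping a reassociating optimizer may choose."""
    return a + (b + c)


# The large terms cancel exactly in the first grouping, but the small term
# is rounded away in the second because 1.0 is below the spacing of doubles
# near 1e20.
a, b, c = 1e20, -1e20, 1.0
print(sum_left_to_right(a, b, c))
print(sum_reassociated(a, b, c))
```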
Christian Gmeiner
33a086f44e etnaviv: do not use int filter when anisotropic filtering is used
The blob does not use this combination. This change moves the
decision if int filter gets used to state emit time.

Fixes: 7aaa0e5908 ("etnaviv: add anisotropic filter support")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4872>
(cherry picked from commit 89a41dae77)
2020-05-05 18:56:46 +02:00
Christian Gmeiner
00001525f5 etnaviv: fix SAMP_ANISOTROPY register value
This caused some serious problems like shredded output, ~1fps and GPU hangs.

Fixes: 7aaa0e5908 ("etnaviv: add anisotropic filter support")
Reported-by: Lukas F. Hartmann <lukas@mntmn.com>
Tested-by: Lukas F. Hartmann <lukas@mntmn.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4872>
(cherry picked from commit b38e51bd96)
2020-05-05 18:56:46 +02:00
Jason Ekstrand
86629193f5 vulkan: Allow destroying NULL debug report callbacks
Fixes: 086cfa5652 "anv: implementation of VK_EXT_debug_report extension"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4690>
(cherry picked from commit 9d10bde5a8)
2020-05-05 18:56:46 +02:00
Tapani Pälli
e1e22e38e7 st/mesa: destroy only own program variants when program is released
Earlier commit tried to achieve this but actually did more. This makes
sure the variants for other contexts continue to live.

Fixes: de3d7dbed5 ("mesa/st: release variants for active programs before unref")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2865
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4831>
(cherry picked from commit 46b3cb011f)
2020-05-05 18:56:45 +02:00
Pierre-Eric Pelloux-Prayer
4af564cb92 radeonsi: fix export count
Fixes: 17acff01a0 ("radeonsi: skip vs output optimizations for some outputs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2877
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4871>
(cherry picked from commit 7e7bb38bd8)
2020-05-05 18:56:45 +02:00
Eric Engestrom
7a93e75a41 .pick_status.json: Update to 5779694698 2020-05-05 18:56:45 +02:00
Marek Olšák
4e07d00fa5 Revert "ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it's always set"
This reverts commit f6d87ec8a9.

It breaks RADV.

Fixes: f6d87ec8a9 "ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it's always set"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4864>
(cherry picked from commit f1a40a26a9)
2020-05-05 18:56:45 +02:00
Bas Nieuwenhuizen
ec918aa04c radv: Extend tiling flags to 64-bit.
SCANOUT is bit 63 ....

Fixes: bfd9e7ff24 "radv: Use new scanout gfx9 metadata flag."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2879
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4859>
(cherry picked from commit df9629e593)
2020-05-05 18:56:45 +02:00
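A flag at bit 63 cannot survive in a 32-bit variable, which is why the tiling flags need a 64-bit type. The Python sketch below mimics the truncation with explicit masks; `SCANOUT_BIT`, `truncate_u32` and `truncate_u64` are hypothetical names for this illustration, and only the bit position comes from the commit message.

```python
# Bit position quoted in the commit message ("SCANOUT is bit 63").
SCANOUT_BIT = 63


def truncate_u32(value):
    """Keep only the low 32 bits, as storing into a uint32_t would."""
    return value & 0xFFFFFFFF


def truncate_u64(value):
    """Keep only the low 64 bits, as storing into a uint64_t would."""
    return value & 0xFFFFFFFFFFFFFFFF


flags = 1 << SCANOUT_BIT
print(hex(truncate_u32(flags)))  # the flag is lost in 32 bits
print(hex(truncate_u64(flags)))  # the flag is preserved in 64 bits
```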
Rhys Perry
afa6e8cc0b aco: add message to static_assert
static_assert without a message is only supported with C++17 and later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: c99107ece0
    ('aco: add explicit padding for all Instruction sub-structs')

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4850>
(cherry picked from commit b5f7b0ce19)
2020-05-05 18:56:45 +02:00
Rhys Perry
a63ca1776f aco: remove use of f-strings
f-strings require Python 3.6 but 3.5 is still maintained and used.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2839
Fixes: 2ab45f41 ("aco: implement sub-dword swaps")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4850>
(cherry picked from commit 8e02de4d7f)
2020-05-05 18:56:45 +02:00
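`str.format` with an explicit positional index is available on Python 3.5 and gives the same text as the f-string it replaces. A minimal sketch using the format string from this change is shown below; `TEMPLATE` and `sel_line` are hypothetical names wrapped around it for illustration.

```python
# Format string taken from the change; {0} is reused for every occurrence
# of the operand index, so a single argument to format() is enough.
TEMPLATE = ('instr->sel[{0}] = op{0}.op.bytes() == 2 ? sdwa_uword : '
            '(op{0}.op.bytes() == 1 ? sdwa_ubyte : sdwa_udword);\n')


def sel_line(i):
    """Return the generated C++ line for operand i without using an f-string."""
    return TEMPLATE.format(i)


for i in range(2):
    print(sel_line(i), end='')
```

This works here because the template has no literal braces. A template that did contain them would need `{{` and `}}` escapes, exactly as an f-string would.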
D Scott Phillips
263451f9c9 anv,iris: Fix input vertex max for tcs on gen12
gen12 does away with the single patch dispatch mode for tcs, and
increases some limits so that 8_patch mode can always work. Make the
necessary changes so we don't try to fall back to single patch mode.

Fixes KHR-GL46.tessellation_shader.single.max_patch_vertices and others

Fixes: 44754279ac ("intel/fs/gen12: Use TCS 8_PATCH mode.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4843>
(cherry picked from commit 65b05ebdda)
2020-05-05 18:56:45 +02:00
D Scott Phillips
3668e27ec3 intel/fs: Update location of Render Target Array Index for gen12
Render Target Array Index has moved from R0.0[26:16] to
R1.1[26:16] on gen12.

Fixes dEQP-VK.multiview.input_attachments.*

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4836>
(cherry picked from commit 7bd15135a6)
2020-05-05 18:56:45 +02:00
Tomeu Vizoso
5a7b5ea470 panfrost: Add Bifrost texture trampoline BO to batch
Fixes: d3eb23adb5 ("panfrost: Emit sampler descriptor on bifrost")
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4832>
(cherry picked from commit 3a81abf3b2)
2020-05-05 18:56:45 +02:00
Samuel Pitoiset
e2037aea0c ci: fix reporting the number of unexpected/flakes
`wc -l $file` returns the number of lines and the filename.

Fixes: b8c66aeb93 ("ci: Clean up some excessive use of pipes in dEQP results processing.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4829>
(cherry picked from commit cc2c3b41b8)
2020-05-05 18:56:45 +02:00
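The problem is that `wc -l FILE` prints the file name after the count, so the captured value is not a bare integer and a numeric test such as `-gt 0` cannot use it. The Python sketch below only models that output shape as GNU `wc` prints it (other implementations may pad the count with spaces). `wc_l_output` and `is_integer` are hypothetical helpers, and the file name is made up.

```python
def wc_l_output(text, filename=None):
    """Model the line `wc -l` prints: 'N' when reading stdin, 'N NAME' for a file."""
    count = text.count("\n")  # wc -l counts newline characters
    if filename is None:
        return str(count)
    return "{} {}".format(count, filename)


def is_integer(s):
    """Return True if s parses as an integer, as a numeric shell test requires."""
    try:
        int(s)
        return True
    except ValueError:
        return False


data = "test.a,Flake\ntest.b,Flake\n"
print(is_integer(wc_l_output(data, "flakes.txt")))  # file argument: count plus name
print(is_integer(wc_l_output(data)))                # piped through stdin: count only
```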
Marek Olšák
725f45bc63 radeonsi: revert an accidental change in si_clear_buffer
The change was in: 7b0b085c94

Fixes: 7b0b085c94 ("radeonsi: drop the negation from fmask_is_not_identity")

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
(cherry picked from commit bdd2f284d9)
2020-05-05 18:56:45 +02:00
Marek Olšák
16c3eca327 radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size
Rounding down the size fixes:
    KHR-GL45.enhanced_layouts.ssb_member_invalid_offset_alignment

Fixes: 03e2adc990

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
(cherry picked from commit e58dcc47c3)
2020-05-05 18:56:45 +02:00
Lionel Landwerlin
c98e895185 iris: don't assert on unfinished aux import in copy paths
After a resource is created the first command using it could be a copy
command.

In iris_state we finish the import on surface/view creation but we
don't do that for copies.

v2: Move finish call to gallium entrypoints (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2725
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4657>
(cherry picked from commit 612e35c8d9)
2020-05-05 18:56:45 +02:00
Andres Gomez
8c0ad1d2db gitlab-ci: update tracie README after changes in main script
v2:
  - Update the default location for the traces when there is no
    traces-db entry in the traces definition file (Alexandros).

Fixes: 90a39af5f6 ("ci: Drop the git dependency in tracie")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4640>
(cherry picked from commit 5e9ae40430)
2020-05-04 22:00:04 +02:00
Francisco Jerez
4e710b3c37 intel/ir: Update performance analysis parameters for memory fence codegen changes.
The SFID field of the SHADER_OPCODE_MEMORY_FENCE and
SHADER_OPCODE_INTERLOCK instructions now indicates the target function
of the memory fence.  Account the cycle-count cost to the right shared
unit.

Fixes: f858fa26b4 ("intel/fs,vec4: Pull stall logic for memory fences up into the IR")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4817>
(cherry picked from commit 0842758ec0)
2020-05-04 22:00:03 +02:00
Rob Clark
8229d22234 freedreno: fix buffer import
`rsc->layout.cpp` is zero until we `fd_resource_layout_init()`

Fixes: 5a8718f01b ("freedreno: Make the slice pitch be bytes, not pixels.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4818>
(cherry picked from commit a0fe98b478)
2020-05-04 22:00:02 +02:00
Bas Nieuwenhuizen
6236c97699 radv: Fix implicit sync with recent allocation changes.
The implicit sync flag gets set at the beginning of the function,
but I used = instead of |= later.

Fixes: bec9285027 "radv: Stop using memory type indices."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4814>
(cherry picked from commit 85fe0e551f)
2020-05-04 22:00:00 +02:00
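The bug class described here, assigning with `=` where `|=` was intended, silently drops any flag bits set earlier. A small Python sketch of the difference follows; the flag names and values are hypothetical and do not correspond to the real radv or amdgpu definitions.

```python
# Hypothetical flag bits, used only to illustrate the pattern.
FLAG_IMPLICIT_SYNC = 1 << 0
FLAG_OTHER = 1 << 1


def build_flags_buggy():
    """Set one flag early, then overwrite it with a plain assignment."""
    flags = FLAG_IMPLICIT_SYNC
    flags = FLAG_OTHER  # '=' discards the bit set above
    return flags


def build_flags_fixed():
    """Set one flag early, then OR in the second one."""
    flags = FLAG_IMPLICIT_SYNC
    flags |= FLAG_OTHER  # '|=' keeps the bit set above
    return flags


print(bool(build_flags_buggy() & FLAG_IMPLICIT_SYNC))
print(bool(build_flags_fixed() & FLAG_IMPLICIT_SYNC))
```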
Eric Engestrom
1b0e98c295 .pick_status.json: Update to af55bdd05d 2020-05-04 21:59:52 +02:00
Eric Engestrom
0865c5107f VERSION: bump to 20.1.0-rc1
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2020-04-30 00:09:58 +02:00
56 changed files with 3527 additions and 191 deletions

View File

@@ -259,7 +259,7 @@ if [ $DEQP_EXITCODE -ne 0 ]; then
cat $UNEXPECTED_RESULTSFILE.txt
fi
count=`wc -l $UNEXPECTED_RESULTSFILE.txt`
count=`cat $UNEXPECTED_RESULTSFILE.txt | wc -l`
# Re-run fails to detect flakes. But use a small threshold, if
# something was fundamentally broken, we don't want to re-run
@@ -267,7 +267,7 @@ if [ $DEQP_EXITCODE -ne 0 ]; then
else
grep ",Flake" $RESULTSFILE > $FLAKESFILE
count=`wc -l $FLAKESFILE`
count=`cat $FLAKESFILE | wc -l`
if [ $count -gt 0 ]; then
echo "Some flakes found (see cts-runner-flakes.txt in artifacts for full results):"
head -n 50 $FLAKESFILE

View File

@@ -28,8 +28,8 @@ traces:
checksum: ff827f7eb069afd87cc305a422cba939
```
The traces-db entry can be absent, in which case it is assumed that the
current directory is the traces-db directory.
The `traces-db` entry can be absent, in which case it is assumed that
the traces can be found in the `CWD/traces-db` directory.
Traces that don't have an expectation for the current device are skipped
during trace replay.
@@ -99,22 +99,17 @@ publisher.
Mesa traces CI uses a set of scripts to replay traces and check the output
against reference checksums.
The high level script [tracie.sh](.gitlab-ci/tracie/tracie.sh) accepts
a traces definition file and the type of traces
(apitrace/renderdoc/gfxreconstruct) to run:
The high level script [tracie.py](.gitlab-ci/tracie/tracie.py) accepts
a traces definition file and the name of the device to be tested:
tracie.sh .gitlab-ci/traces.yml renderdoc
tracie.py --file .gitlab-ci/traces.yml --device-name gl-vmware-llvmpipe
tracie.sh copies produced artifacts to the `$CI_PROJECT_DIR/result`
tracie.py copies the produced artifacts to the `$CI_PROJECT_DIR/result`
directory. By default, created images from traces are only stored in case of a
checksum mismatch. The `TRACIE_STORE_IMAGES` CI/environment variable can be set
to `1` to force storing images, e.g., to get a complete set of reference
images.
The `tracie.sh` script requires that the environment variable `DEVICE_NAME` is
properly set for the target machine, and matches the `device` field of the
relevant trace expectations in the used `traces.yml` file.
At a lower level the
[dump_trace_images.py](.gitlab-ci/tracie/dump_trace_images.py) script is
called, which replays a trace, dumping a set of images in the process. By

3152
.pick_status.json Normal file

File diff suppressed because it is too large

View File

@@ -1 +1 @@
20.1.0-devel
20.1.0-rc2

View File

@@ -651,8 +651,7 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
AddrSurfInfoIn.flags.cube = config->is_cube;
AddrSurfInfoIn.flags.display = get_display_flag(config, surf);
AddrSurfInfoIn.flags.pow2Pad = config->info.levels > 1;
AddrSurfInfoIn.flags.tcCompatible = info->chip_class >= GFX8 &&
AddrSurfInfoIn.flags.depth;
AddrSurfInfoIn.flags.tcCompatible = (surf->flags & RADEON_SURF_TC_COMPATIBLE_HTILE) != 0;
/* Only degrade the tile mode for space if TC-compatible HTILE hasn't been
* requested, because TC-compatible HTILE requires 2D tiling.
@@ -773,7 +772,6 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
surf->htile_size = 0;
surf->htile_slice_size = 0;
surf->htile_alignment = 1;
surf->tc_compatible_htile_allowed = AddrSurfInfoIn.flags.tcCompatible;
const bool only_stencil = (surf->flags & RADEON_SURF_SBUFFER) &&
!(surf->flags & RADEON_SURF_ZBUFFER);
@@ -790,11 +788,10 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
if (level > 0)
continue;
if (!AddrSurfInfoOut.tcCompatible)
if (!AddrSurfInfoOut.tcCompatible) {
AddrSurfInfoIn.flags.tcCompatible = 0;
if (!AddrSurfInfoOut.tcCompatible || !surf->htile_size)
surf->tc_compatible_htile_allowed = false;
surf->flags &= ~RADEON_SURF_TC_COMPATIBLE_HTILE;
}
if (AddrSurfInfoIn.flags.matchStencilTileCfg) {
AddrSurfInfoIn.flags.matchStencilTileCfg = 0;
@@ -940,7 +937,7 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
* TC-compatible HTILE even for levels where it's disabled by DB.
*/
if (surf->htile_size && config->info.levels > 1 &&
surf->tc_compatible_htile_allowed) {
surf->flags & RADEON_SURF_TC_COMPATIBLE_HTILE) {
/* MSAA can't occur with levels > 1, so ignore the sample count. */
const unsigned total_pixels = surf->surf_size / surf->bpe;
const unsigned htile_block_size = 8 * 8;
@@ -1569,12 +1566,14 @@ static int gfx9_compute_surface(ADDR_HANDLE addrlib,
AddrSurfInfoIn.bpp = surf->bpe * 8;
}
AddrSurfInfoIn.flags.color = !(surf->flags & RADEON_SURF_Z_OR_SBUFFER) &&
bool is_color_surface = !(surf->flags & RADEON_SURF_Z_OR_SBUFFER);
AddrSurfInfoIn.flags.color = is_color_surface &&
!(surf->flags & RADEON_SURF_NO_RENDER_TARGET);
AddrSurfInfoIn.flags.depth = (surf->flags & RADEON_SURF_ZBUFFER) != 0;
AddrSurfInfoIn.flags.display = get_display_flag(config, surf);
/* flags.texture currently refers to TC-compatible HTILE */
AddrSurfInfoIn.flags.texture = 1;
AddrSurfInfoIn.flags.texture = is_color_surface ||
surf->flags & RADEON_SURF_TC_COMPATIBLE_HTILE;
AddrSurfInfoIn.flags.opt4space = 1;
AddrSurfInfoIn.numMipLevels = config->info.levels;
@@ -1656,7 +1655,9 @@ static int gfx9_compute_surface(ADDR_HANDLE addrlib,
case RADEON_SURF_MODE_1D:
case RADEON_SURF_MODE_2D:
if (surf->flags & (RADEON_SURF_IMPORTED | RADEON_SURF_FORCE_SWIZZLE_MODE)) {
if (surf->flags & RADEON_SURF_IMPORTED ||
(info->chip_class >= GFX10 &&
surf->flags & RADEON_SURF_FORCE_SWIZZLE_MODE)) {
AddrSurfInfoIn.swizzleMode = surf->u.gfx9.surf.swizzle_mode;
break;
}
@@ -1714,7 +1715,6 @@ static int gfx9_compute_surface(ADDR_HANDLE addrlib,
}
surf->is_linear = surf->u.gfx9.surf.swizzle_mode == ADDR_SW_LINEAR;
surf->tc_compatible_htile_allowed = surf->htile_size != 0;
/* Query whether the surface is displayable. */
/* This is only useful for surfaces that are allocated without SCANOUT. */

View File

@@ -67,7 +67,7 @@ enum radeon_micro_mode {
/* bits 19 and 20 are reserved for libdrm_radeon, don't use them */
#define RADEON_SURF_FMASK (1 << 21)
#define RADEON_SURF_DISABLE_DCC (1 << 22)
/* gap */
#define RADEON_SURF_TC_COMPATIBLE_HTILE (1 << 23)
#define RADEON_SURF_IMPORTED (1 << 24)
/* gap */
#define RADEON_SURF_SHAREABLE (1 << 26)
@@ -194,7 +194,6 @@ struct radeon_surf {
unsigned has_stencil:1;
/* This might be true even if micro_tile_mode isn't displayable or rotated. */
unsigned is_displayable:1;
unsigned tc_compatible_htile_allowed:1;
/* Displayable, thin, depth, rotated. AKA D,S,Z,R swizzle modes. */
unsigned micro_tile_mode:3;
uint32_t flags;

View File

@@ -815,31 +815,31 @@ struct Instruction {
return false;
}
};
static_assert(sizeof(Instruction) == 16);
static_assert(sizeof(Instruction) == 16, "Unexpected padding");
struct SOPK_instruction : public Instruction {
uint16_t imm;
uint16_t padding;
};
static_assert(sizeof(SOPK_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(SOPK_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
struct SOPP_instruction : public Instruction {
uint32_t imm;
int block;
};
static_assert(sizeof(SOPP_instruction) == sizeof(Instruction) + 8);
static_assert(sizeof(SOPP_instruction) == sizeof(Instruction) + 8, "Unexpected padding");
struct SOPC_instruction : public Instruction {
};
static_assert(sizeof(SOPC_instruction) == sizeof(Instruction) + 0);
static_assert(sizeof(SOPC_instruction) == sizeof(Instruction) + 0, "Unexpected padding");
struct SOP1_instruction : public Instruction {
};
static_assert(sizeof(SOP1_instruction) == sizeof(Instruction) + 0);
static_assert(sizeof(SOP1_instruction) == sizeof(Instruction) + 0, "Unexpected padding");
struct SOP2_instruction : public Instruction {
};
static_assert(sizeof(SOP2_instruction) == sizeof(Instruction) + 0);
static_assert(sizeof(SOP2_instruction) == sizeof(Instruction) + 0, "Unexpected padding");
/**
* Scalar Memory Format:
@@ -861,19 +861,19 @@ struct SMEM_instruction : public Instruction {
bool disable_wqm : 1;
uint32_t padding: 19;
};
static_assert(sizeof(SMEM_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(SMEM_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
struct VOP1_instruction : public Instruction {
};
static_assert(sizeof(VOP1_instruction) == sizeof(Instruction) + 0);
static_assert(sizeof(VOP1_instruction) == sizeof(Instruction) + 0, "Unexpected padding");
struct VOP2_instruction : public Instruction {
};
static_assert(sizeof(VOP2_instruction) == sizeof(Instruction) + 0);
static_assert(sizeof(VOP2_instruction) == sizeof(Instruction) + 0, "Unexpected padding");
struct VOPC_instruction : public Instruction {
};
static_assert(sizeof(VOPC_instruction) == sizeof(Instruction) + 0);
static_assert(sizeof(VOPC_instruction) == sizeof(Instruction) + 0, "Unexpected padding");
struct VOP3A_instruction : public Instruction {
bool abs[3];
@@ -883,7 +883,7 @@ struct VOP3A_instruction : public Instruction {
bool clamp : 1;
uint32_t padding : 9;
};
static_assert(sizeof(VOP3A_instruction) == sizeof(Instruction) + 8);
static_assert(sizeof(VOP3A_instruction) == sizeof(Instruction) + 8, "Unexpected padding");
struct VOP3P_instruction : public Instruction {
bool neg_lo[3];
@@ -893,7 +893,7 @@ struct VOP3P_instruction : public Instruction {
bool clamp : 1;
uint32_t padding : 9;
};
static_assert(sizeof(VOP3P_instruction) == sizeof(Instruction) + 8);
static_assert(sizeof(VOP3P_instruction) == sizeof(Instruction) + 8, "Unexpected padding");
/**
* Data Parallel Primitives Format:
@@ -910,7 +910,7 @@ struct DPP_instruction : public Instruction {
bool bound_ctrl : 1;
uint32_t padding : 7;
};
static_assert(sizeof(DPP_instruction) == sizeof(Instruction) + 8);
static_assert(sizeof(DPP_instruction) == sizeof(Instruction) + 8, "Unexpected padding");
enum sdwa_sel : uint8_t {
/* masks */
@@ -968,14 +968,14 @@ struct SDWA_instruction : public Instruction {
uint8_t omod : 2; /* GFX9+ */
uint32_t padding : 4;
};
static_assert(sizeof(SDWA_instruction) == sizeof(Instruction) + 8);
static_assert(sizeof(SDWA_instruction) == sizeof(Instruction) + 8, "Unexpected padding");
struct Interp_instruction : public Instruction {
uint8_t attribute;
uint8_t component;
uint16_t padding;
};
static_assert(sizeof(Interp_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(Interp_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
/**
* Local and Global Data Sharing instructions
@@ -991,7 +991,7 @@ struct DS_instruction : public Instruction {
int8_t offset1;
bool gds;
};
static_assert(sizeof(DS_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(DS_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
/**
* Vector Memory Untyped-buffer Instructions
@@ -1016,7 +1016,7 @@ struct MUBUF_instruction : public Instruction {
uint8_t padding : 2;
barrier_interaction barrier;
};
static_assert(sizeof(MUBUF_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(MUBUF_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
/**
* Vector Memory Typed-buffer Instructions
@@ -1041,7 +1041,7 @@ struct MTBUF_instruction : public Instruction {
bool can_reorder : 1;
uint32_t padding : 25;
};
static_assert(sizeof(MTBUF_instruction) == sizeof(Instruction) + 8);
static_assert(sizeof(MTBUF_instruction) == sizeof(Instruction) + 8, "Unexpected padding");
/**
* Vector Memory Image Instructions
@@ -1070,7 +1070,7 @@ struct MIMG_instruction : public Instruction {
uint8_t padding : 1;
barrier_interaction barrier;
};
static_assert(sizeof(MIMG_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(MIMG_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
/**
* Flat/Scratch/Global Instructions
@@ -1091,7 +1091,7 @@ struct FLAT_instruction : public Instruction {
uint8_t padding : 1;
barrier_interaction barrier;
};
static_assert(sizeof(FLAT_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(FLAT_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
struct Export_instruction : public Instruction {
uint8_t enabled_mask;
@@ -1101,14 +1101,14 @@ struct Export_instruction : public Instruction {
bool valid_mask : 1;
uint32_t padding : 13;
};
static_assert(sizeof(Export_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(Export_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
struct Pseudo_instruction : public Instruction {
PhysReg scratch_sgpr; /* might not be valid if it's not needed */
bool tmp_in_scc;
uint8_t padding;
};
static_assert(sizeof(Pseudo_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(Pseudo_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
struct Pseudo_branch_instruction : public Instruction {
/* target[0] is the block index of the branch target.
@@ -1117,11 +1117,11 @@ struct Pseudo_branch_instruction : public Instruction {
*/
uint32_t target[2];
};
static_assert(sizeof(Pseudo_branch_instruction) == sizeof(Instruction) + 8);
static_assert(sizeof(Pseudo_branch_instruction) == sizeof(Instruction) + 8, "Unexpected padding");
struct Pseudo_barrier_instruction : public Instruction {
};
static_assert(sizeof(Pseudo_barrier_instruction) == sizeof(Instruction) + 0);
static_assert(sizeof(Pseudo_barrier_instruction) == sizeof(Instruction) + 0, "Unexpected padding");
enum ReduceOp : uint16_t {
iadd32, iadd64,
@@ -1157,7 +1157,7 @@ struct Pseudo_reduction_instruction : public Instruction {
ReduceOp reduce_op;
uint16_t cluster_size; // must be 0 for scans
};
static_assert(sizeof(Pseudo_reduction_instruction) == sizeof(Instruction) + 4);
static_assert(sizeof(Pseudo_reduction_instruction) == sizeof(Instruction) + 4, "Unexpected padding");
struct instr_deleter_functor {
void operator()(void* p) {

View File

@@ -153,7 +153,7 @@ class Format(Enum):
res = ''
if self == Format.SDWA:
for i in range(min(num_operands, 2)):
res += f'instr->sel[{i}] = op{i}.op.bytes() == 2 ? sdwa_uword : (op{i}.op.bytes() == 1 ? sdwa_ubyte : sdwa_udword);\n'
res += 'instr->sel[{0}] = op{0}.op.bytes() == 2 ? sdwa_uword : (op{0}.op.bytes() == 1 ? sdwa_ubyte : sdwa_udword);\n'.format(i)
res += 'instr->dst_sel = def0.bytes() == 2 ? sdwa_uword : (def0.bytes() == 1 ? sdwa_ubyte : sdwa_udword);\n'
res += 'instr->dst_preserve = true;'
return res

View File

@@ -3125,9 +3125,6 @@ void ac_optimize_vs_outputs(struct ac_llvm_context *ctx,
target -= V_008DFC_SQ_EXP_PARAM;
if ((1u << target) & skip_output_mask)
continue;
/* Parse the instruction. */
memset(&exp, 0, sizeof(exp));
exp.offset = target;
@@ -3151,12 +3148,13 @@ void ac_optimize_vs_outputs(struct ac_llvm_context *ctx,
}
/* Eliminate constant and duplicated PARAM exports. */
if (ac_eliminate_const_output(vs_output_param_offset,
num_outputs, &exp) ||
ac_eliminate_duplicated_output(ctx,
vs_output_param_offset,
num_outputs, &exports,
&exp)) {
if (!((1u << target) & skip_output_mask) &&
(ac_eliminate_const_output(vs_output_param_offset,
num_outputs, &exp) ||
ac_eliminate_duplicated_output(ctx,
vs_output_param_offset,
num_outputs, &exports,
&exp))) {
removed_any = true;
} else {
exports.exp[exports.num++] = exp;

@@ -101,11 +101,6 @@ LLVMBuilderRef ac_create_builder(LLVMContextRef ctx,
*/
flags.setAllowContract(); /* contract */
/* Allow reassociation transformations for floating-point
* instructions. This may dramatically change results.
*/
flags.setAllowReassoc(); /* reassoc */
llvm::unwrap(builder)->setFastMathFlags(flags);
break;
}
@@ -118,13 +113,11 @@ bool ac_disable_inexact_math(LLVMBuilderRef builder)
{
auto *b = llvm::unwrap(builder);
llvm::FastMathFlags flags = b->getFastMathFlags();
assert(flags.allowContract() == flags.allowReassoc());
if (!flags.allowContract())
return false;
flags.setAllowContract(false);
flags.setAllowReassoc(false);
b->setFastMathFlags(flags);
return true;
}
@@ -133,13 +126,11 @@ void ac_restore_inexact_math(LLVMBuilderRef builder, bool value)
{
auto *b = llvm::unwrap(builder);
llvm::FastMathFlags flags = b->getFastMathFlags();
assert(flags.allowContract() == flags.allowReassoc());
if (flags.allowContract() == value)
return;
flags.setAllowContract(value);
flags.setAllowReassoc(value);
b->setFastMathFlags(flags);
}

@@ -296,7 +296,7 @@ radv_physical_device_init(struct radv_physical_device *device,
}
if (!device->ws) {
result = vk_error(instance, VK_ERROR_INCOMPATIBLE_DRIVER);
result = vk_error(instance, VK_ERROR_INITIALIZATION_FAILED);
goto fail;
}
@@ -757,7 +757,7 @@ radv_enumerate_devices(struct radv_instance *instance)
{
/* TODO: Check for more devices ? */
drmDevicePtr devices[8];
VkResult result = VK_ERROR_INCOMPATIBLE_DRIVER;
VkResult result = VK_SUCCESS;
int max_devices;
instance->physicalDeviceCount = 0;
@@ -781,7 +781,7 @@ radv_enumerate_devices(struct radv_instance *instance)
radv_logi("Found %d drm nodes", max_devices);
if (max_devices < 1)
return vk_error(instance, VK_ERROR_INCOMPATIBLE_DRIVER);
return vk_error(instance, VK_SUCCESS);
for (unsigned i = 0; i < (unsigned)max_devices; i++) {
if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
@@ -792,14 +792,22 @@ radv_enumerate_devices(struct radv_instance *instance)
instance->physicalDeviceCount,
instance,
devices[i]);
if (result == VK_SUCCESS)
++instance->physicalDeviceCount;
else if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
/* Incompatible DRM device, skip. */
if (result == VK_ERROR_INCOMPATIBLE_DRIVER) {
result = VK_SUCCESS;
continue;
}
/* Error creating the physical device, report the error. */
if (result != VK_SUCCESS)
break;
++instance->physicalDeviceCount;
}
}
drmFreeDevices(devices, max_devices);
/* If we successfully enumerated any devices, call it success */
return result;
}
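The reworked loop in `radv_enumerate_devices` can be modelled roughly as follows. This is a Python sketch under simplifying assumptions: the `Result` enum and the `enumerate_devices` helper are hypothetical, and the list of per-device results stands in for the calls to `radv_physical_device_init` (the real loop also filters DRM nodes before probing them).

```python
from enum import Enum


class Result(Enum):
    SUCCESS = 0
    INCOMPATIBLE_DRIVER = 1
    INITIALIZATION_FAILED = 2


def enumerate_devices(probe_results):
    """Return (overall_result, device_count) for per-device probe results."""
    result = Result.SUCCESS
    count = 0
    for probed in probe_results:
        result = probed
        # Incompatible DRM device: skip it and keep going.
        if result is Result.INCOMPATIBLE_DRIVER:
            result = Result.SUCCESS
            continue
        # Any other failure is a real error: stop and report it.
        if result is not Result.SUCCESS:
            break
        count += 1
    return result, count
```

In this model, a machine with only other-vendor devices yields `SUCCESS` with zero devices, which matches the callers in the later hunks that now treat anything other than `VK_SUCCESS` as an error.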
@@ -813,8 +821,7 @@ VkResult radv_EnumeratePhysicalDevices(
if (instance->physicalDeviceCount < 0) {
result = radv_enumerate_devices(instance);
if (result != VK_SUCCESS &&
result != VK_ERROR_INCOMPATIBLE_DRIVER)
if (result != VK_SUCCESS)
return result;
}
@@ -840,8 +847,7 @@ VkResult radv_EnumeratePhysicalDeviceGroups(
if (instance->physicalDeviceCount < 0) {
result = radv_enumerate_devices(instance);
if (result != VK_SUCCESS &&
result != VK_ERROR_INCOMPATIBLE_DRIVER)
if (result != VK_SUCCESS)
return result;
}
@@ -5200,7 +5206,7 @@ static VkResult radv_alloc_memory(struct radv_device *device,
heap_index = device->physical_device->memory_properties.memoryTypes[pAllocateInfo->memoryTypeIndex].heapIndex;
domain = device->physical_device->memory_domains[pAllocateInfo->memoryTypeIndex];
flags = device->physical_device->memory_flags[pAllocateInfo->memoryTypeIndex];
flags |= device->physical_device->memory_flags[pAllocateInfo->memoryTypeIndex];
if (!dedicate_info && !import_info && (!export_info || !export_info->handleTypes)) {
flags |= RADEON_FLAG_NO_INTERPROCESS_SHARING;

@@ -437,8 +437,11 @@ radv_init_surface(struct radv_device *device,
unreachable("unhandled image type");
}
if (is_depth)
if (is_depth) {
surface->flags |= RADEON_SURF_ZBUFFER;
if (radv_use_tc_compat_htile_for_image(device, pCreateInfo, image_format))
surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;
}
if (is_stencil)
surface->flags |= RADEON_SURF_SBUFFER;
@@ -1348,8 +1351,6 @@ static void radv_image_disable_htile(struct radv_image *image)
{
for (unsigned i = 0; i < image->plane_count; ++i)
image->planes[i].surface.htile_size = 0;
image->tc_compatible_htile = false;
}
VkResult
@@ -1421,8 +1422,7 @@ radv_image_create_layout(struct radv_device *device,
/* Otherwise, try to enable HTILE for depth surfaces. */
if (radv_image_can_enable_htile(image) &&
!(device->instance->debug_flags & RADV_DEBUG_NO_HIZ)) {
if (!image->planes[0].surface.tc_compatible_htile_allowed)
image->tc_compatible_htile = false;
image->tc_compatible_htile = image->planes[0].surface.flags & RADEON_SURF_TC_COMPATIBLE_HTILE;
radv_image_alloc_htile(device, image);
} else {
radv_image_disable_htile(image);
@@ -1500,10 +1500,6 @@ radv_image_create(VkDevice _device,
image->info.surf_index = &device->image_mrt_offset_counter;
}
image->tc_compatible_htile =
radv_use_tc_compat_htile_for_image(device, create_info->vk_info,
image->vk_format);
for (unsigned plane = 0; plane < image->plane_count; ++plane) {
radv_init_surface(device, image, &image->planes[plane].surface, plane, pCreateInfo, format);
}

@@ -717,7 +717,7 @@ radv_amdgpu_winsys_bo_set_metadata(struct radeon_winsys_bo *_bo,
{
struct radv_amdgpu_winsys_bo *bo = radv_amdgpu_winsys_bo(_bo);
struct amdgpu_bo_metadata metadata = {0};
uint32_t tiling_flags = 0;
uint64_t tiling_flags = 0;
if (bo->ws->info.chip_class >= GFX9) {
tiling_flags |= AMDGPU_TILING_SET(SWIZZLE_MODE, md->u.gfx9.swizzle_mode);
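Widening `tiling_flags` from 32 to 64 bits matters as soon as any field is placed above bit 31. A small Python sketch of the effect, assuming a hypothetical one-bit field at bit 43 (the shift, the mask and the `set_field` helper are illustrative and are not taken from the amdgpu headers):

```python
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF

# Hypothetical field position above bit 31, for illustration only.
HYPOTHETICAL_SHIFT = 43
HYPOTHETICAL_MASK = 0x1


def set_field(flags, value, shift, mask, width_mask):
    """OR a bit field into flags, then truncate to the accumulator width."""
    return (flags | ((value & mask) << shift)) & width_mask


# A 32-bit accumulator silently drops the field; a 64-bit one keeps it.
as_u32 = set_field(0, 1, HYPOTHETICAL_SHIFT, HYPOTHETICAL_MASK, U32_MASK)
as_u64 = set_field(0, 1, HYPOTHETICAL_SHIFT, HYPOTHETICAL_MASK, U64_MASK)
```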

@@ -166,6 +166,7 @@ gather_vars_written(struct copy_prop_var_state *state,
nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
switch (intrin->intrinsic) {
case nir_intrinsic_control_barrier:
case nir_intrinsic_group_memory_barrier:
case nir_intrinsic_memory_barrier:
written->modes |= nir_var_shader_out |
nir_var_mem_ssbo |

@@ -133,6 +133,7 @@ remove_dead_write_vars_local(void *mem_ctx, nir_block *block)
nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
switch (intrin->intrinsic) {
case nir_intrinsic_control_barrier:
case nir_intrinsic_group_memory_barrier:
case nir_intrinsic_memory_barrier: {
clear_unused_for_modes(&unused_writes, nir_var_shader_out |
nir_var_mem_ssbo |

@@ -497,8 +497,11 @@ void draw_delete_tess_ctrl_shader(struct draw_context *draw,
}
assert(shader->variants_cached == 0);
align_free(dtcs->tcs_input);
align_free(dtcs->tcs_output);
}
#endif
if (dtcs->state.ir.nir)
ralloc_free(dtcs->state.ir.nir);
FREE(dtcs);

@@ -42,4 +42,5 @@ DRI_CONF_SECTION_MISCELLANEOUS
DRI_CONF_VS_POSITION_ALWAYS_INVARIANT("false")
DRI_CONF_ALLOW_RGB10_CONFIGS("true")
DRI_CONF_ALLOW_FP16_CONFIGS("false")
DRI_CONF_FORCE_INTEGER_TEX_NEAREST("false")
DRI_CONF_SECTION_END

@@ -268,9 +268,11 @@ translate_texture_format(enum pipe_format fmt)
}
bool
texture_use_int_filter(const struct pipe_sampler_view *so, bool tex_desc)
texture_use_int_filter(const struct pipe_sampler_view *sv,
const struct pipe_sampler_state *ss,
bool tex_desc)
{
switch (so->target) {
switch (sv->target) {
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
if (tex_desc)
@@ -282,16 +284,19 @@ texture_use_int_filter(const struct pipe_sampler_view *so, bool tex_desc)
}
/* only unorm formats can use int filter */
if (!util_format_is_unorm(so->format))
if (!util_format_is_unorm(sv->format))
return false;
if (util_format_is_srgb(so->format))
if (util_format_is_srgb(sv->format))
return false;
if (util_format_description(so->format)->layout == UTIL_FORMAT_LAYOUT_ASTC)
if (util_format_description(sv->format)->layout == UTIL_FORMAT_LAYOUT_ASTC)
return false;
switch (so->format) {
if (ss->max_anisotropy > 1)
return false;
switch (sv->format) {
/* apparently D16 can't use int filter but D24 can */
case PIPE_FORMAT_Z16_UNORM:
case PIPE_FORMAT_R10G10B10A2_UNORM:

@@ -39,7 +39,9 @@ uint32_t
translate_texture_format(enum pipe_format fmt);
bool
texture_use_int_filter(const struct pipe_sampler_view *so, bool tex_desc);
texture_use_int_filter(const struct pipe_sampler_view *sv,
const struct pipe_sampler_state *ss,
bool tex_desc);
bool
texture_format_needs_swiz(enum pipe_format fmt);

@@ -109,8 +109,7 @@ etna_create_sampler_state_desc(struct pipe_context *pipe,
cs->SAMP_LOD_BIAS =
VIVS_NTE_DESCRIPTOR_SAMP_LOD_BIAS_BIAS(etna_float_to_fixp88(ss->lod_bias)) |
COND(ss->lod_bias != 0.0, VIVS_NTE_DESCRIPTOR_SAMP_LOD_BIAS_ENABLE);
cs->SAMP_ANISOTROPY =
VIVS_NTE_DESCRIPTOR_SAMP_ANISOTROPY(COND(ansio, etna_log2_fixp88(ss->max_anisotropy)));
cs->SAMP_ANISOTROPY = COND(ansio, etna_log2_fixp88(ss->max_anisotropy));
return cs;
}
@@ -162,9 +161,6 @@ etna_create_sampler_view_desc(struct pipe_context *pctx, struct pipe_resource *p
if (util_format_is_srgb(so->format))
sv->SAMP_CTRL1 |= VIVS_NTE_DESCRIPTOR_SAMP_CTRL1_SRGB;
if (texture_use_int_filter(so, true))
sv->SAMP_CTRL0 |= VIVS_NTE_DESCRIPTOR_SAMP_CTRL0_INT_FILTER;
/* Create texture descriptor */
sv->bo = etna_bo_new(ctx->screen->dev, 0x100, DRM_ETNA_GEM_CACHE_WC);
if (!sv->bo)
@@ -294,6 +290,10 @@ etna_emit_texture_desc(struct etna_context *ctx)
if ((1 << x) & active_samplers) {
struct etna_sampler_state_desc *ss = etna_sampler_state_desc(ctx->sampler[x]);
struct etna_sampler_view_desc *sv = etna_sampler_view_desc(ctx->sampler_view[x]);
if (texture_use_int_filter(&sv->base, &ss->base, true))
sv->SAMP_CTRL0 |= VIVS_NTE_DESCRIPTOR_SAMP_CTRL0_INT_FILTER;
etna_set_state(stream, VIVS_NTE_DESCRIPTOR_TX_CTRL(x),
COND(sv->ts.enable, VIVS_NTE_DESCRIPTOR_TX_CTRL_TS_ENABLE) |
VIVS_NTE_DESCRIPTOR_TX_CTRL_TS_MODE(sv->ts.mode) |

@@ -232,8 +232,7 @@ etna_create_sampler_view_state(struct pipe_context *pctx, struct pipe_resource *
VIVS_TE_SAMPLER_LOG_SIZE_WIDTH(etna_log2_fixp55(res->base.width0)) |
VIVS_TE_SAMPLER_LOG_SIZE_HEIGHT(etna_log2_fixp55(base_height)) |
COND(util_format_is_srgb(so->format) && !astc, VIVS_TE_SAMPLER_LOG_SIZE_SRGB) |
COND(astc, VIVS_TE_SAMPLER_LOG_SIZE_ASTC) |
COND(texture_use_int_filter(so, false), VIVS_TE_SAMPLER_LOG_SIZE_INT_FILTER);
COND(astc, VIVS_TE_SAMPLER_LOG_SIZE_ASTC);
sv->TE_SAMPLER_3D_CONFIG =
VIVS_TE_SAMPLER_3D_CONFIG_DEPTH(base_depth) |
VIVS_TE_SAMPLER_3D_CONFIG_LOG_DEPTH(etna_log2_fixp55(base_depth));
@@ -335,6 +334,7 @@ etna_emit_texture_state(struct etna_context *ctx)
}
}
if (unlikely(dirty & (ETNA_DIRTY_SAMPLER_VIEWS))) {
struct etna_sampler_state *ss;
struct etna_sampler_view *sv;
for (int x = 0; x < VIVS_TE_SAMPLER__LEN; ++x) {
@@ -345,7 +345,12 @@ etna_emit_texture_state(struct etna_context *ctx)
}
for (int x = 0; x < VIVS_TE_SAMPLER__LEN; ++x) {
if ((1 << x) & active_samplers) {
ss = etna_sampler_state(ctx->sampler[x]);
sv = etna_sampler_view(ctx->sampler_view[x]);
if (texture_use_int_filter(&sv->base, &ss->base, false))
sv->TE_SAMPLER_LOG_SIZE |= VIVS_TE_SAMPLER_LOG_SIZE_INT_FILTER;
/*02080*/ EMIT_STATE(TE_SAMPLER_LOG_SIZE(x), sv->TE_SAMPLER_LOG_SIZE);
}
}

@@ -1092,7 +1092,6 @@ fd_resource_from_handle(struct pipe_screen *pscreen,
struct fd_resource *rsc = CALLOC_STRUCT(fd_resource);
struct fdl_slice *slice = fd_resource_slice(rsc, 0);
struct pipe_resource *prsc = &rsc->base;
uint32_t pitchalign = fd_screen(pscreen)->gmem_alignw * rsc->layout.cpp;
DBG("target=%d, format=%s, %ux%ux%u, array_size=%u, last_level=%u, "
"nr_samples=%u, usage=%u, bind=%x, flags=%x",
@@ -1124,6 +1123,8 @@ fd_resource_from_handle(struct pipe_screen *pscreen,
slice->offset = handle->offset;
slice->size0 = handle->stride * prsc->height0;
uint32_t pitchalign = fd_screen(pscreen)->gmem_alignw * rsc->layout.cpp;
if ((slice->pitch < align(prsc->width0 * rsc->layout.cpp, pitchalign)) ||
(slice->pitch & (pitchalign - 1)))
goto fail;
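The pitch check in this hunk can be sketched in Python as below. The helper names are hypothetical, and `align` is assumed to round up to a power-of-two boundary. The sketch also shows what happens when `cpp` is still zero, which is the case at the old declaration site because `rsc` has only just been zero-allocated there.

```python
def align(value, alignment):
    """Round value up to a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


def pitch_is_valid(pitch, width, cpp, gmem_alignw):
    """Model of the import check: pitch must be large enough and aligned."""
    pitchalign = gmem_alignw * cpp
    if pitch < align(width * cpp, pitchalign):
        return False
    if pitch & (pitchalign - 1):
        return False
    return True


# 100 pixels at 4 bytes each, 32-pixel alignment: 128-byte pitch alignment.
ok = pitch_is_valid(512, 100, 4, 32)
too_small = pitch_is_valid(400, 100, 4, 32)
misaligned = pitch_is_valid(576, 100, 4, 32)
# With cpp == 0 the mask covers every bit, so any non-zero pitch is rejected.
zero_cpp = pitch_is_valid(512, 100, 0, 32)
```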

@@ -362,6 +362,11 @@ iris_blit(struct pipe_context *ctx, const struct pipe_blit_info *info)
blorp_flags |= BLORP_BATCH_PREDICATE_ENABLE;
}
if (iris_resource_unfinished_aux_import(src_res))
iris_resource_finish_aux_import(ctx->screen, src_res);
if (iris_resource_unfinished_aux_import(dst_res))
iris_resource_finish_aux_import(ctx->screen, dst_res);
struct iris_format_info src_fmt =
iris_format_for_usage(devinfo, info->src.format,
ISL_SURF_USAGE_TEXTURE_BIT);
@@ -715,44 +720,52 @@ get_preferred_batch(struct iris_context *ice, struct iris_bo *bo)
*/
static void
iris_resource_copy_region(struct pipe_context *ctx,
struct pipe_resource *dst,
struct pipe_resource *p_dst,
unsigned dst_level,
unsigned dstx, unsigned dsty, unsigned dstz,
struct pipe_resource *src,
struct pipe_resource *p_src,
unsigned src_level,
const struct pipe_box *src_box)
{
struct iris_context *ice = (void *) ctx;
struct iris_screen *screen = (void *) ctx->screen;
struct iris_batch *batch = &ice->batches[IRIS_BATCH_RENDER];
struct iris_resource *src = (void *) p_src;
struct iris_resource *dst = (void *) p_dst;
if (iris_resource_unfinished_aux_import(src))
iris_resource_finish_aux_import(ctx->screen, src);
if (iris_resource_unfinished_aux_import(dst))
iris_resource_finish_aux_import(ctx->screen, dst);
/* Use MI_COPY_MEM_MEM for tiny (<= 16 byte, % 4) buffer copies. */
if (src->target == PIPE_BUFFER && dst->target == PIPE_BUFFER &&
if (p_src->target == PIPE_BUFFER && p_dst->target == PIPE_BUFFER &&
(src_box->width % 4 == 0) && src_box->width <= 16) {
struct iris_bo *dst_bo = iris_resource_bo(dst);
struct iris_bo *dst_bo = iris_resource_bo(p_dst);
batch = get_preferred_batch(ice, dst_bo);
iris_batch_maybe_flush(batch, 24 + 5 * (src_box->width / 4));
iris_emit_pipe_control_flush(batch,
"stall for MI_COPY_MEM_MEM copy_region",
PIPE_CONTROL_CS_STALL);
batch->screen->vtbl.copy_mem_mem(batch, dst_bo, dstx, iris_resource_bo(src),
src_box->x, src_box->width);
screen->vtbl.copy_mem_mem(batch, dst_bo, dstx, iris_resource_bo(p_src),
src_box->x, src_box->width);
return;
}
iris_copy_region(&ice->blorp, batch, dst, dst_level, dstx, dsty, dstz,
src, src_level, src_box);
iris_copy_region(&ice->blorp, batch, p_dst, dst_level, dstx, dsty, dstz,
p_src, src_level, src_box);
if (util_format_is_depth_and_stencil(dst->format) &&
util_format_has_stencil(util_format_description(src->format))) {
if (util_format_is_depth_and_stencil(p_dst->format) &&
util_format_has_stencil(util_format_description(p_src->format))) {
struct iris_resource *junk, *s_src_res, *s_dst_res;
iris_get_depth_stencil_resources(src, &junk, &s_src_res);
iris_get_depth_stencil_resources(dst, &junk, &s_dst_res);
iris_get_depth_stencil_resources(p_src, &junk, &s_src_res);
iris_get_depth_stencil_resources(p_dst, &junk, &s_dst_res);
iris_copy_region(&ice->blorp, batch, &s_dst_res->base, dst_level, dstx,
dsty, dstz, &s_src_res->base, src_level, src_box);
}
iris_flush_and_dirty_for_history(ice, batch, (struct iris_resource *) dst,
iris_flush_and_dirty_for_history(ice, batch, dst,
PIPE_CONTROL_RENDER_TARGET_FLUSH,
"cache history: post copy_region");
}

@@ -702,8 +702,12 @@ iris_clear_texture(struct pipe_context *ctx,
{
struct iris_context *ice = (void *) ctx;
struct iris_screen *screen = (void *) ctx->screen;
struct iris_resource *res = (void *) p_res;
const struct gen_device_info *devinfo = &screen->devinfo;
if (iris_resource_unfinished_aux_import(res))
iris_resource_finish_aux_import(ctx->screen, res);
if (util_format_is_depth_or_stencil(p_res->format)) {
const struct util_format_description *fmt_desc =
util_format_description(p_res->format);

@@ -1810,6 +1810,9 @@ iris_transfer_map(struct pipe_context *ctx,
struct iris_resource *res = (struct iris_resource *)resource;
struct isl_surf *surf = &res->surf;
if (iris_resource_unfinished_aux_import(res))
iris_resource_finish_aux_import(ctx->screen, res);
if (usage & PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE) {
/* Replace the backing storage with a fresh buffer for non-async maps */
if (!(usage & (PIPE_TRANSFER_UNSYNCHRONIZED |

@@ -4178,6 +4178,8 @@ iris_store_tcs_state(struct iris_context *ice,
* more than 2 times the number of instance count.
*/
assert((devinfo->max_tcs_threads / 2) > tcs_prog_data->instances);
hs.DispatchGRFStartRegisterForURBData = prog_data->dispatch_grf_start_reg & 0x1f;
hs.DispatchGRFStartRegisterForURBData5 = prog_data->dispatch_grf_start_reg >> 5;
#endif
hs.InstanceCount = tcs_prog_data->instances - 1;
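The two assignments above split the dispatch GRF start register into its low five bits and the remaining high bit. A Python sketch of that arithmetic, assuming the register number fits in six bits (the helper names are hypothetical):

```python
def split_grf_start(reg):
    """Split a register number into (low five bits, remaining high bits)."""
    low5 = reg & 0x1f
    high = reg >> 5
    return low5, high


def join_grf_start(low5, high):
    """Inverse of split_grf_start."""
    return (high << 5) | low5


# Register 37 is 0b100101: low five bits are 5, bit 5 is set.
example = split_grf_start(37)
round_trips = all(join_grf_start(*split_grf_start(r)) == r for r in range(64))
```

Without the mask, a value of 32 or more would not fit in a five-bit field, which is presumably why the original single assignment was not enough once larger start registers became possible.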

@@ -523,6 +523,8 @@ llvmpipe_delete_compute_state(struct pipe_context *pipe,
llvmpipe_remove_cs_shader_variant(llvmpipe, li->base);
li = next;
}
if (shader->base.ir.nir)
ralloc_free(shader->base.ir.nir);
tgsi_free_tokens(shader->base.tokens);
FREE(shader);
}

@@ -1272,10 +1272,16 @@ panfrost_emit_texture_descriptors(struct panfrost_batch *batch,
struct pipe_sampler_view *pview = &view->base;
struct panfrost_resource *rsrc = pan_resource(pview->texture);
/* Add the BOs to the job so they are retained until the job is done. */
panfrost_batch_add_bo(batch, rsrc->bo,
PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_READ |
panfrost_bo_access_for_stage(stage));
panfrost_batch_add_bo(batch, view->bifrost_bo,
PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_READ |
panfrost_bo_access_for_stage(stage));
memcpy(&descriptors[i], view->bifrost_descriptor, sizeof(*view->bifrost_descriptor));
}

@@ -299,7 +299,7 @@ void si_clear_buffer(struct si_context *sctx, struct pipe_resource *dst, uint64_
* about buffer placements.
*/
if (clear_value_size > 4 || (!force_cpdma && clear_value_size == 4 && offset % 4 == 0 &&
(size > 32 * 1024 || sctx->chip_class <= GFX9))) {
(size > 32 * 1024 || sctx->chip_class <= GFX8))) {
si_compute_do_clear_or_copy(sctx, dst, offset, NULL, 0, aligned_size, clear_value,
clear_value_size, coher);
} else {

View File

@@ -209,7 +209,8 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
case PIPE_CAP_MAX_SHADER_BUFFER_SIZE:
return MIN2(sscreen->info.max_alloc_size, INT_MAX);
/* Align it down to 256 bytes. I've chosen the number randomly. */
return ROUND_DOWN_TO(MIN2(sscreen->info.max_alloc_size, INT_MAX), 256);
case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
@@ -371,13 +372,6 @@ static int si_get_shader_param(struct pipe_screen *pscreen, enum pipe_shader_typ
return ir;
}
case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE: {
uint64_t max_const_buffer_size;
pscreen->get_compute_param(pscreen, PIPE_SHADER_IR_NIR,
PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE, &max_const_buffer_size);
return MIN2(max_const_buffer_size, INT_MAX);
}
default:
/* If compute shaders don't require a special value
* for this cap, we can return the same value we
@@ -404,7 +398,7 @@ static int si_get_shader_param(struct pipe_screen *pscreen, enum pipe_shader_typ
case PIPE_SHADER_CAP_MAX_TEMPS:
return 256; /* Max native temporaries. */
case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
return MIN2(sscreen->info.max_alloc_size, INT_MAX - 3); /* aligned to 4 */
return si_get_param(pscreen, PIPE_CAP_MAX_SHADER_BUFFER_SIZE);
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
return SI_NUM_CONST_BUFFERS;
case PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS:

@@ -905,6 +905,7 @@ void si_llvm_build_monolithic_ps(struct si_shader_context *ctx, struct si_shader
{
LLVMValueRef parts[3];
unsigned num_parts = 0, main_index;
LLVMValueRef main_fn = ctx->main_fn;
union si_shader_part_key prolog_key;
si_get_ps_prolog_key(shader, &prolog_key, false);
@@ -915,7 +916,7 @@ void si_llvm_build_monolithic_ps(struct si_shader_context *ctx, struct si_shader
}
main_index = num_parts;
parts[num_parts++] = ctx->main_fn;
parts[num_parts++] = main_fn;
union si_shader_part_key epilog_key;
si_get_ps_epilog_key(shader, &epilog_key);

@@ -2841,8 +2841,6 @@ static void *si_create_shader(struct pipe_context *ctx, const struct pipe_shader
si_shader_dump_stats_for_shader_db(sscreen, sel->main_shader_part_ngg, &sctx->debug);
if (sel->main_shader_part_ngg_es)
si_shader_dump_stats_for_shader_db(sscreen, sel->main_shader_part_ngg_es, &sctx->debug);
if (sel->gs_copy_shader)
si_shader_dump_stats_for_shader_db(sscreen, sel->gs_copy_shader, &sctx->debug);
}
return sel;
}

@@ -241,6 +241,8 @@ static int si_init_surface(struct si_screen *sscreen, struct radeon_surf *surfac
*/
if (sscreen->info.chip_class == GFX8)
bpe = 4;
flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;
}
if (is_stencil)
@@ -1186,8 +1188,7 @@ static struct si_texture *si_texture_create_object(struct pipe_screen *screen,
const struct radeon_surf *surface,
const struct si_texture *plane0,
struct pb_buffer *imported_buf, uint64_t offset,
uint64_t alloc_size, unsigned alignment,
bool tc_compatible_htile)
uint64_t alloc_size, unsigned alignment)
{
struct si_texture *tex;
struct si_resource *resource;
@@ -1206,8 +1207,8 @@ static struct si_texture *si_texture_create_object(struct pipe_screen *screen,
/* don't include stencil-only formats which we don't support for rendering */
tex->is_depth = util_format_has_depth(util_format_description(tex->buffer.b.b.format));
tex->surface = *surface;
tex->tc_compatible_htile = tex->surface.tc_compatible_htile_allowed &&
tc_compatible_htile;
tex->tc_compatible_htile =
tex->surface.htile_size != 0 && (tex->surface.flags & RADEON_SURF_TC_COMPATIBLE_HTILE);
/* TC-compatible HTILE:
* - GFX8 only supports Z32_FLOAT.
@@ -1568,8 +1569,7 @@ struct pipe_resource *si_texture_create(struct pipe_screen *screen,
for (unsigned i = 0; i < num_planes; i++) {
struct si_texture *tex =
si_texture_create_object(screen, &plane_templ[i], &surface[i], plane0, NULL,
plane_offset[i], total_size, max_alignment,
tc_compatible_htile);
plane_offset[i], total_size, max_alignment);
if (!tex) {
si_texture_reference(&plane0, NULL);
return NULL;
@@ -1641,7 +1641,7 @@ static struct pipe_resource *si_texture_from_winsys_buffer(struct si_screen *ssc
if (r)
return NULL;
tex = si_texture_create_object(&sscreen->b, templ, &surface, NULL, buf, offset, 0, 0, false);
tex = si_texture_create_object(&sscreen->b, templ, &surface, NULL, buf, offset, 0, 0);
if (!tex)
return NULL;

@@ -39,9 +39,21 @@ files_libzink = files(
'zink_surface.c',
)
zink_nir_algebraic_c = custom_target(
'zink_nir_algebraic.c',
input : 'nir_to_spirv/zink_nir_algebraic.py',
output : 'zink_nir_algebraic.c',
command : [
prog_python, '@INPUT@',
'-p', join_paths(meson.source_root(), 'src/compiler/nir/'),
],
capture : true,
depend_files : nir_algebraic_py,
)
libzink = static_library(
'zink',
files_libzink,
[files_libzink, zink_nir_algebraic_c],
c_args : c_vis_args,
include_directories : [inc_include, inc_src, inc_mapi, inc_mesa, inc_gallium, inc_gallium_aux],
dependencies: [dep_vulkan, idep_nir_headers],

@@ -46,4 +46,9 @@ spirv_shader_delete(struct spirv_shader *s);
uint32_t
zink_binding(gl_shader_stage stage, VkDescriptorType type, int index);
struct nir_shader;
bool
zink_nir_lower_b2b(struct nir_shader *shader);
#endif

@@ -0,0 +1,48 @@
#
# Copyright (C) 2020 Collabora Ltd.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
import argparse
import sys
lower_b2b = [
(('b2b32', 'a'), ('b2i32', 'a')),
(('b2b1', 'a'), ('i2b1', 'a')),
]
def main():
parser = argparse.ArgumentParser()
parser.add_argument('-p', '--import-path', required=True)
args = parser.parse_args()
sys.path.insert(0, args.import_path)
run()
def run():
import nir_algebraic # pylint: disable=import-error
print('#include "nir_to_spirv/nir_to_spirv.h"')
print(nir_algebraic.AlgebraicPass("zink_nir_lower_b2b",
lower_b2b).render())
if __name__ == '__main__':
main()

@@ -209,6 +209,7 @@ optimize_nir(struct nir_shader *s)
NIR_PASS(progress, s, nir_opt_algebraic);
NIR_PASS(progress, s, nir_opt_constant_folding);
NIR_PASS(progress, s, nir_opt_undef);
NIR_PASS(progress, s, zink_nir_lower_b2b);
} while (progress);
}

@@ -235,6 +235,7 @@ struct st_config_options
bool allow_glsl_cross_stage_interpolation_mismatch;
bool allow_glsl_layout_qualifier_on_function_parameters;
bool allow_draw_out_of_order;
bool force_integer_tex_nearest;
char *force_gl_vendor;
unsigned char config_options_sha1[20];
};

@@ -84,6 +84,8 @@ dri_fill_st_options(struct dri_screen *screen)
options->allow_higher_compat_version =
driQueryOptionb(optionCache, "allow_higher_compat_version");
options->glsl_zero_init = driQueryOptionb(optionCache, "glsl_zero_init");
options->force_integer_tex_nearest =
driQueryOptionb(optionCache, "force_integer_tex_nearest");
options->vs_position_always_invariant =
driQueryOptionb(optionCache, "vs_position_always_invariant");
options->force_glsl_abs_sqrt =

@@ -3162,7 +3162,15 @@ fs_visitor::nir_emit_gs_intrinsic(const fs_builder &bld,
static fs_reg
fetch_render_target_array_index(const fs_builder &bld)
{
if (bld.shader->devinfo->gen >= 6) {
if (bld.shader->devinfo->gen >= 12) {
/* The render target array index is provided in the thread payload as
* bits 26:16 of r1.1.
*/
const fs_reg idx = bld.vgrf(BRW_REGISTER_TYPE_UD);
bld.AND(idx, brw_uw1_reg(BRW_GENERAL_REGISTER_FILE, 1, 3),
brw_imm_uw(0x7ff));
return idx;
} else if (bld.shader->devinfo->gen >= 6) {
/* The render target array index is provided in the thread payload as
* bits 26:16 of r0.0.
*/
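The gen12 path reads a 16-bit word and masks it with `0x7ff`. Assuming the sub-register index of `brw_uw1_reg` counts 16-bit words, so that word 3 of `r1` is the upper half of dword `r1.1`, this is equivalent to extracting bits 26:16 of the dword. A Python sketch of that equivalence (the function names are hypothetical):

```python
def rt_index_from_dword(dword):
    """Extract bits 26:16 of a 32-bit payload dword directly."""
    return (dword >> 16) & 0x7ff


def rt_index_from_high_word(dword):
    """Same result via the upper 16-bit word, as the AND in the hunk does."""
    high_word = (dword >> 16) & 0xffff
    return high_word & 0x7ff


# Bits 31:27 and 15:0 carry unrelated data and must not leak into the index.
payload = (0x1f << 27) | (0x123 << 16) | 0xbeef
index = rt_index_from_dword(payload)
```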

@@ -934,11 +934,25 @@ namespace {
case SHADER_OPCODE_MEMORY_FENCE:
case SHADER_OPCODE_INTERLOCK:
if (devinfo->gen >= 7)
return calculate_desc(info, unit_dp_dc, 2, 0, 0, 30 /* XXX */, 0,
10 /* XXX */, 100 /* XXX */, 0, 0, 0, 0);
else
switch (info.sfid) {
case GEN6_SFID_DATAPORT_RENDER_CACHE:
if (devinfo->gen >= 7)
return calculate_desc(info, unit_dp_rc, 2, 0, 0, 30 /* XXX */, 0,
10 /* XXX */, 300 /* XXX */, 0, 0, 0, 0);
else
abort();
case GEN7_SFID_DATAPORT_DATA_CACHE:
case HSW_SFID_DATAPORT_DATA_CACHE_1:
if (devinfo->gen >= 7)
return calculate_desc(info, unit_dp_dc, 2, 0, 0, 30 /* XXX */, 0,
10 /* XXX */, 100 /* XXX */, 0, 0, 0, 0);
else
abort();
default:
abort();
}
case SHADER_OPCODE_GEN4_SCRATCH_READ:
case SHADER_OPCODE_GEN4_SCRATCH_WRITE:

@@ -394,7 +394,7 @@ brw_compile_tcs(const struct brw_compiler *compiler,
if (compiler->use_tcs_8_patch &&
nir->info.tess.tcs_vertices_out <= (devinfo->gen >= 12 ? 32 : 16) &&
2 + has_primitive_id + key->input_vertices <= 31) {
2 + has_primitive_id + key->input_vertices <= (devinfo->gen >= 12 ? 63 : 31)) {
/* 3DSTATE_HS imposes two constraints on using 8_PATCH mode. First, the
* "Instance" field limits the number of output vertices to [1, 16] on
* gen11 and below, or [1, 32] on gen12 and above. Secondly, the

@@ -1964,7 +1964,7 @@
<value name="9-12 Samplers" value="3"/>
<value name="13-16 Samplers" value="4"/>
</field>
<field name="Instance Count" start="64" end="67" type="uint"/>
<field name="Instance Count" start="64" end="68" type="uint"/>
<field name="Maximum Number of Threads" start="72" end="80" type="uint"/>
<field name="Statistics Enable" start="93" end="93" type="bool"/>
<field name="Enable" start="95" end="95" type="bool"/>

@@ -1624,7 +1624,12 @@ emit_3dstate_hs_te_ds(struct anv_graphics_pipeline *pipeline,
hs.VertexURBEntryReadLength = 0;
hs.VertexURBEntryReadOffset = 0;
hs.DispatchGRFStartRegisterForURBData =
tcs_prog_data->base.base.dispatch_grf_start_reg;
tcs_prog_data->base.base.dispatch_grf_start_reg & 0x1f;
#if GEN_GEN >= 12
hs.DispatchGRFStartRegisterForURBData5 =
tcs_prog_data->base.base.dispatch_grf_start_reg >> 5;
#endif
hs.PerThreadScratchSpace = get_scratch_space(tcs_bin);
hs.ScratchSpaceBasePointer =

@@ -1364,33 +1364,39 @@ brw_upload_ubo_surfaces(struct brw_context *brw, struct gl_program *prog,
prog->info.num_abos == 0))
return;
uint32_t *ubo_surf_offsets =
&stage_state->surf_offset[prog_data->binding_table.ubo_start];
if (prog->info.num_ubos) {
assert(prog_data->binding_table.ubo_start < BRW_MAX_SURFACES);
uint32_t *ubo_surf_offsets =
&stage_state->surf_offset[prog_data->binding_table.ubo_start];
for (int i = 0; i < prog->info.num_ubos; i++) {
struct gl_buffer_binding *binding =
&ctx->UniformBufferBindings[prog->sh.UniformBlocks[i]->Binding];
upload_buffer_surface(brw, binding, &ubo_surf_offsets[i],
ISL_FORMAT_R32G32B32A32_FLOAT, 0);
for (int i = 0; i < prog->info.num_ubos; i++) {
struct gl_buffer_binding *binding =
&ctx->UniformBufferBindings[prog->sh.UniformBlocks[i]->Binding];
upload_buffer_surface(brw, binding, &ubo_surf_offsets[i],
ISL_FORMAT_R32G32B32A32_FLOAT, 0);
}
}
uint32_t *ssbo_surf_offsets =
&stage_state->surf_offset[prog_data->binding_table.ssbo_start];
uint32_t *abo_surf_offsets = ssbo_surf_offsets + prog->info.num_ssbos;
if (prog->info.num_ssbos || prog->info.num_abos) {
assert(prog_data->binding_table.ssbo_start < BRW_MAX_SURFACES);
uint32_t *ssbo_surf_offsets =
&stage_state->surf_offset[prog_data->binding_table.ssbo_start];
uint32_t *abo_surf_offsets = ssbo_surf_offsets + prog->info.num_ssbos;
for (int i = 0; i < prog->info.num_abos; i++) {
struct gl_buffer_binding *binding =
&ctx->AtomicBufferBindings[prog->sh.AtomicBuffers[i]->Binding];
upload_buffer_surface(brw, binding, &abo_surf_offsets[i],
ISL_FORMAT_RAW, RELOC_WRITE);
}
for (int i = 0; i < prog->info.num_abos; i++) {
struct gl_buffer_binding *binding =
&ctx->AtomicBufferBindings[prog->sh.AtomicBuffers[i]->Binding];
upload_buffer_surface(brw, binding, &abo_surf_offsets[i],
ISL_FORMAT_RAW, RELOC_WRITE);
}
for (int i = 0; i < prog->info.num_ssbos; i++) {
struct gl_buffer_binding *binding =
&ctx->ShaderStorageBufferBindings[prog->sh.ShaderStorageBlocks[i]->Binding];
for (int i = 0; i < prog->info.num_ssbos; i++) {
struct gl_buffer_binding *binding =
&ctx->ShaderStorageBufferBindings[prog->sh.ShaderStorageBlocks[i]->Binding];
upload_buffer_surface(brw, binding, &ssbo_surf_offsets[i],
ISL_FORMAT_RAW, RELOC_WRITE);
upload_buffer_surface(brw, binding, &ssbo_surf_offsets[i],
ISL_FORMAT_RAW, RELOC_WRITE);
}
}
stage_state->push_constants_dirty = true;

@@ -3866,6 +3866,11 @@ struct gl_constants
*/
GLboolean GLSLZeroInit;
/**
* Treat integer textures using GL_LINEAR filters as GL_NEAREST.
*/
GLboolean ForceIntegerTexNearest;
/**
* Does the driver support real 32-bit integers? (Otherwise, integers are
* simulated via floats.)

@@ -122,7 +122,8 @@ _mesa_unlock_texture(struct gl_context *ctx, struct gl_texture_object *texObj)
/** Is the texture "complete" with respect to the given sampler state? */
static inline GLboolean
_mesa_is_texture_complete(const struct gl_texture_object *texObj,
const struct gl_sampler_object *sampler)
const struct gl_sampler_object *sampler,
bool linear_as_nearest_for_int_tex)
{
struct gl_texture_image *img = texObj->Image[0][texObj->BaseLevel];
bool isMultisample = img && img->NumSamples >= 2;
@@ -149,8 +150,16 @@ _mesa_is_texture_complete(const struct gl_texture_object *texObj,
(sampler->MagFilter != GL_NEAREST ||
(sampler->MinFilter != GL_NEAREST &&
sampler->MinFilter != GL_NEAREST_MIPMAP_NEAREST))) {
/* If the format is integer, only nearest filtering is allowed */
return GL_FALSE;
/* If the format is integer, only nearest filtering is allowed,
* but some applications (e.g. Grid Autosport) use the default
* filtering values.
*/
if (texObj->_IsIntegerFormat &&
linear_as_nearest_for_int_tex) {
/* Skip return */
} else {
return GL_FALSE;
}
}
/* Section 8.17 (texture completeness) of the OpenGL 4.6 core profile spec:
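The effect of the new linear_as_nearest_for_int_tex parameter on the integer-format filter rule can be sketched in isolation. This is a minimal, standalone illustration and not the Mesa code: the enum, the struct, and the function name sketch_integer_filter_ok are hypothetical stand-ins, and only the filter rule from the hunk above is modelled (the other completeness conditions are ignored).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL filter enums; not the real GL values. */
enum sketch_filter {
   SKETCH_NEAREST,
   SKETCH_LINEAR,
   SKETCH_NEAREST_MIPMAP_NEAREST,
   SKETCH_LINEAR_MIPMAP_LINEAR
};

/* Hypothetical reduced sampler: only the two fields the rule looks at. */
struct sketch_sampler {
   enum sketch_filter min_filter;
   enum sketch_filter mag_filter;
};

/* Returns true when the integer-format filter rule alone does not make
 * the texture incomplete. */
static bool
sketch_integer_filter_ok(bool is_integer_format,
                         const struct sketch_sampler *sampler,
                         bool linear_as_nearest_for_int_tex)
{
   /* Same shape as the condition in the hunk: anything other than
    * nearest filtering counts as "linear" here. */
   bool uses_linear =
      sampler->mag_filter != SKETCH_NEAREST ||
      (sampler->min_filter != SKETCH_NEAREST &&
       sampler->min_filter != SKETCH_NEAREST_MIPMAP_NEAREST);

   /* Integer texture with a non-nearest filter: incomplete unless the
    * workaround flag asks for the filter to be treated as nearest. */
   if (is_integer_format && uses_linear)
      return linear_as_nearest_for_int_tex;

   return true;
}
```

With the flag off, an integer texture left at linear filtering is still rejected; with the flag on, the same texture passes this particular rule.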

@@ -670,11 +670,13 @@ update_single_program_texture(struct gl_context *ctx, struct gl_program *prog,
texUnit->Sampler : &texObj->Sampler;
if (likely(texObj)) {
if (_mesa_is_texture_complete(texObj, sampler))
if (_mesa_is_texture_complete(texObj, sampler,
ctx->Const.ForceIntegerTexNearest))
return texObj;
_mesa_test_texobj_completeness(ctx, texObj);
if (_mesa_is_texture_complete(texObj, sampler))
if (_mesa_is_texture_complete(texObj, sampler,
ctx->Const.ForceIntegerTexNearest))
return texObj;
}
@@ -816,10 +818,12 @@ update_ff_texture_state(struct gl_context *ctx,
struct gl_sampler_object *sampler = texUnit->Sampler ?
texUnit->Sampler : &texObj->Sampler;
if (!_mesa_is_texture_complete(texObj, sampler)) {
if (!_mesa_is_texture_complete(texObj, sampler,
ctx->Const.ForceIntegerTexNearest)) {
_mesa_test_texobj_completeness(ctx, texObj);
}
if (_mesa_is_texture_complete(texObj, sampler)) {
if (_mesa_is_texture_complete(texObj, sampler,
ctx->Const.ForceIntegerTexNearest)) {
_mesa_reference_texobj(&texUnit->_Current, texObj);
complete = true;
break;

@@ -546,7 +546,8 @@ _mesa_GetTextureHandleARB_no_error(GLuint texture)
GET_CURRENT_CONTEXT(ctx);
texObj = _mesa_lookup_texture(ctx, texture);
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler))
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler,
ctx->Const.ForceIntegerTexNearest))
_mesa_test_texobj_completeness(ctx, texObj);
return get_texture_handle(ctx, texObj, &texObj->Sampler);
@@ -585,9 +586,11 @@ _mesa_GetTextureHandleARB(GLuint texture)
* GetTextureSamplerHandleARB if the texture object specified by <texture>
* is not complete."
*/
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler)) {
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler,
ctx->Const.ForceIntegerTexNearest)) {
_mesa_test_texobj_completeness(ctx, texObj);
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler)) {
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler,
ctx->Const.ForceIntegerTexNearest)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"glGetTextureHandleARB(incomplete texture)");
return 0;
@@ -614,7 +617,8 @@ _mesa_GetTextureSamplerHandleARB_no_error(GLuint texture, GLuint sampler)
texObj = _mesa_lookup_texture(ctx, texture);
sampObj = _mesa_lookup_samplerobj(ctx, sampler);
if (!_mesa_is_texture_complete(texObj, sampObj))
if (!_mesa_is_texture_complete(texObj, sampObj,
ctx->Const.ForceIntegerTexNearest))
_mesa_test_texobj_completeness(ctx, texObj);
return get_texture_handle(ctx, texObj, sampObj);
@@ -667,9 +671,11 @@ _mesa_GetTextureSamplerHandleARB(GLuint texture, GLuint sampler)
* GetTextureSamplerHandleARB if the texture object specified by <texture>
* is not complete."
*/
if (!_mesa_is_texture_complete(texObj, sampObj)) {
if (!_mesa_is_texture_complete(texObj, sampObj,
ctx->Const.ForceIntegerTexNearest)) {
_mesa_test_texobj_completeness(ctx, texObj);
if (!_mesa_is_texture_complete(texObj, sampObj)) {
if (!_mesa_is_texture_complete(texObj, sampObj,
ctx->Const.ForceIntegerTexNearest)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"glGetTextureSamplerHandleARB(incomplete texture)");
return 0;
@@ -786,7 +792,8 @@ _mesa_GetImageHandleARB_no_error(GLuint texture, GLint level, GLboolean layered,
GET_CURRENT_CONTEXT(ctx);
texObj = _mesa_lookup_texture(ctx, texture);
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler))
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler,
ctx->Const.ForceIntegerTexNearest))
_mesa_test_texobj_completeness(ctx, texObj);
return get_image_handle(ctx, texObj, level, layered, layer, format);
@@ -845,9 +852,11 @@ _mesa_GetImageHandleARB(GLuint texture, GLint level, GLboolean layered,
* <texture> is not a three-dimensional, one-dimensional array, two
* dimensional array, cube map, or cube map array texture."
*/
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler)) {
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler,
ctx->Const.ForceIntegerTexNearest)) {
_mesa_test_texobj_completeness(ctx, texObj);
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler)) {
if (!_mesa_is_texture_complete(texObj, &texObj->Sampler,
ctx->Const.ForceIntegerTexNearest)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"glGetImageHandleARB(incomplete texture)");
return 0;

@@ -114,9 +114,14 @@ st_convert_sampler(const struct st_context *st,
sampler->wrap_t = gl_wrap_xlate(msamp->WrapT);
sampler->wrap_r = gl_wrap_xlate(msamp->WrapR);
sampler->min_img_filter = gl_filter_to_img_filter(msamp->MinFilter);
if (texobj->_IsIntegerFormat && st->ctx->Const.ForceIntegerTexNearest) {
sampler->min_img_filter = gl_filter_to_img_filter(GL_NEAREST);
sampler->mag_img_filter = gl_filter_to_img_filter(GL_NEAREST);
} else {
sampler->min_img_filter = gl_filter_to_img_filter(msamp->MinFilter);
sampler->mag_img_filter = gl_filter_to_img_filter(msamp->MagFilter);
}
sampler->min_mip_filter = gl_filter_to_mip_filter(msamp->MinFilter);
sampler->mag_img_filter = gl_filter_to_img_filter(msamp->MagFilter);
if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB)
sampler->normalized_coords = 1;
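The filter selection in the hunk above can be sketched as a small standalone function. This is an illustrative reduction under stated assumptions, not the state tracker code: the enums, the sketch_img_filters struct, and sketch_pick_img_filters are hypothetical names, and sketch_to_img only stands in for gl_filter_to_img_filter for the two base filters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the application-side and driver-side filters. */
enum sketch_gl_filter { SKETCH_GL_NEAREST, SKETCH_GL_LINEAR };
enum sketch_img_filter { SKETCH_IMG_NEAREST, SKETCH_IMG_LINEAR };

struct sketch_img_filters {
   enum sketch_img_filter min_img;
   enum sketch_img_filter mag_img;
};

/* Stand-in for the GL-to-image-filter translation. */
static enum sketch_img_filter
sketch_to_img(enum sketch_gl_filter f)
{
   return f == SKETCH_GL_LINEAR ? SKETCH_IMG_LINEAR : SKETCH_IMG_NEAREST;
}

/* Picks the image filters: forced to nearest only when the texture is an
 * integer format AND the workaround option is enabled; otherwise the
 * application's own filters are translated unchanged. */
static struct sketch_img_filters
sketch_pick_img_filters(bool is_integer_format,
                        bool force_integer_tex_nearest,
                        enum sketch_gl_filter min_filter,
                        enum sketch_gl_filter mag_filter)
{
   struct sketch_img_filters out;

   if (is_integer_format && force_integer_tex_nearest) {
      out.min_img = SKETCH_IMG_NEAREST;
      out.mag_img = SKETCH_IMG_NEAREST;
   } else {
      out.min_img = sketch_to_img(min_filter);
      out.mag_img = sketch_to_img(mag_filter);
   }
   return out;
}
```

Note that in the hunk itself min_mip_filter is still derived from msamp->MinFilter in both branches; only the two image filters are overridden.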

@@ -1203,6 +1203,8 @@ void st_init_extensions(struct pipe_screen *screen,
consts->GLSLZeroInit = options->glsl_zero_init;
consts->ForceIntegerTexNearest = options->force_integer_tex_nearest;
consts->VendorOverride = options->force_gl_vendor;
consts->UniformBooleanTrue = consts->NativeIntegers ? ~0U : fui(1.0f);

@@ -70,6 +70,8 @@
#include "cso_cache/cso_context.h"
static void
destroy_program_variants(struct st_context *st, struct gl_program *target);
static void
set_affected_state_flags(uint64_t *states,
@@ -345,7 +347,7 @@ st_release_program(struct st_context *st, struct st_program **p)
if (!*p)
return;
st_release_variants(st, *p);
destroy_program_variants(st, &((*p)->Base));
st_reference_prog(st, p, NULL);
}

@@ -3712,7 +3712,8 @@ _swrast_choose_texture_sample_func( struct gl_context *ctx,
const struct gl_texture_object *t,
const struct gl_sampler_object *sampler)
{
if (!t || !_mesa_is_texture_complete(t, sampler)) {
if (!t || !_mesa_is_texture_complete(t, sampler,
ctx->Const.ForceIntegerTexNearest)) {
return null_sample_func;
}
else {

@@ -268,6 +268,16 @@ TODO: document the other workarounds.
<option name="glsl_zero_init" value="true" />
</application>
<application name="GRID Autosport" executable="GridAutosport">
<!-- https://gitlab.freedesktop.org/mesa/mesa/issues/1258 -->
<option name="force_integer_tex_nearest" value="true" />
</application>
<application name="DIRT: Showdown" executable="dirt.i386">
<!-- https://gitlab.freedesktop.org/mesa/mesa/issues/1258 -->
<option name="force_integer_tex_nearest" value="true" />
</application>
<!-- The GL thread whitelist is below, workarounds are above.
Keep it that way. -->

@@ -309,6 +309,11 @@ DRI_CONF_OPT_BEGIN_B(allow_fp16_configs, def) \
DRI_CONF_DESC(en,gettext("Allow exposure of visuals and fbconfigs with fp16 formats")) \
DRI_CONF_OPT_END
#define DRI_CONF_FORCE_INTEGER_TEX_NEAREST(def) \
DRI_CONF_OPT_BEGIN_B(force_integer_tex_nearest, def) \
DRI_CONF_DESC(en,gettext("Force integer textures to use nearest filtering")) \
DRI_CONF_OPT_END
/**
* \brief Initialization configuration options
*/

@@ -77,6 +77,9 @@ vk_destroy_debug_report_callback(struct vk_debug_report_instance *instance,
const VkAllocationCallbacks* pAllocator,
const VkAllocationCallbacks* instance_allocator)
{
if (_callback == VK_NULL_HANDLE)
return;
struct vk_debug_report_callback *callback =
(struct vk_debug_report_callback *)(uintptr_t)_callback;