Compare commits

...

56 Commits

Author SHA1 Message Date
Eric Engestrom
d6695f1641 VERSION: bump for 24.0.8 2024-05-22 18:48:39 +02:00
Eric Engestrom
8f3dfb0aaa docs: add release notes for 24.0.8 2024-05-22 18:48:26 +02:00
David Heidelberg
c15886ec43 winsys/i915: depends on intel_wa.h
Prevent a compilation failure due to the not-yet-generated intel_wa.h header.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11174
Cc: mesa-stable

Reviewed-by: Mark Janes <markjanes@swizzler.org>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29252>
(cherry picked from commit 08659a0baa)
2024-05-20 12:40:12 +02:00
Mike Blumenkrantz
8c913751a2 nir/linking: fix nir_assign_io_var_locations for scalarized dual blend
this would previously assign all scalar variables to the highest
driver location

cc: mesa-stable

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28753>
(cherry picked from commit ffe54ca293)
2024-05-20 12:40:12 +02:00
Mike Blumenkrantz
16c64c184c nir/lower_aaline: fix for scalarized outputs
this otherwise was broken

cc: mesa-stable

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28753>
(cherry picked from commit e28061c502)
2024-05-20 12:40:11 +02:00
Karol Herbst
d5675923b3 nir/lower_cl_images: set binding also for samplers
Fixes https://github.com/darktable-org/darktable/issues/16717 on radeonsi.

Fixes: 31ed24cec7 ("nir/lower_images: extract from clover")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29230>
(cherry picked from commit 564e569072)
2024-05-20 12:40:11 +02:00
David Rosca
0fd4f15601 radeonsi/vcn: Ensure at least one reference for H264 P/B frames
The original fix from

0f3370eede ("raseonsi/vcn: fix a h264 decoding issue")

would in some cases also trigger for I frames with interlaced streams.
Instead of checking used_for_reference_flags, use slice type and
only add one reference for P/B frames if needed.
This change still fixes playback of the sample from the original issue,
avoids the issue with interlaced streams, and also fixes the case where the
application provides no references at all.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11060
Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29055>
(cherry picked from commit 5f4a6b5b00)
2024-05-20 12:40:11 +02:00
David Rosca
c3f11a4011 radeonsi/vcn: Allow duplicate buffers in DPB
In case of missing frames (e.g. when decoding corrupted streams), there
will be duplicate buffers, and all of them need to be in the DPB to keep
the layout correct for decoding.

Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29055>
(cherry picked from commit 2ef3a34f1a)
2024-05-20 12:40:11 +02:00
David Rosca
ea6c84849d radeonsi/vcn: Ensure DPB has as many buffers as references
In case of corrupted streams (or application bugs), the number
of references may not equal the DPB size. This needs to be fixed by
filling the missing slots with dummy buffers.

Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29055>
(cherry picked from commit 47b6ca47d0)
2024-05-20 12:40:11 +02:00
David Rosca
93572f4e31 frontends/va: Store slice types for H264 decode
Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29055>
(cherry picked from commit 9837dab4bd)
2024-05-20 12:40:11 +02:00
Patrick Lerda
450ad166c0 r600: fix vertex state update clover regression
This change handles the case where "vertex_fetch_shader.cso" is null,
restoring the previous behavior in that specific case. This situation
occurs with clover.

For instance, this issue is triggered with "piglit/bin/cl-custom-buffer-flags":
==6467==ERROR: AddressSanitizer: SEGV on unknown address 0x00000000000c (pc 0x7ff92908fe6e bp 0x7ffe86ae5ad0 sp 0x7ffe86ae5a30 T0)
==6467==The signal is caused by a READ memory access.
==6467==Hint: address points to the zero page.
    #0 0x7ff92908fe6e in evergreen_emit_vertex_buffers ../src/gallium/drivers/r600/evergreen_state.c:2123
    #1 0x7ff92908444b in r600_emit_atom ../src/gallium/drivers/r600/r600_pipe.h:627
    #2 0x7ff92908444b in compute_emit_cs ../src/gallium/drivers/r600/evergreen_compute.c:798
    #3 0x7ff92908444b in evergreen_launch_grid ../src/gallium/drivers/r600/evergreen_compute.c:927
    #4 0x7ff9349f9350 in clover::kernel::launch(clover::command_queue&, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&) ../src/gallium/frontends/clover/core/kernel.cpp:105
    #5 0x7ff9349c331d in std::function<void (clover::event&)>::operator()(clover::event&) const /usr/include/c++/11.4.0/bits/std_function.h:590
    #6 0x7ff9349c331d in clover::event::trigger() ../src/gallium/frontends/clover/core/event.cpp:54
    #7 0x7ff9349c82f1 in clover::hard_event::hard_event(clover::command_queue&, unsigned int, clover::ref_vector<clover::event> const&, std::function<void (clover::event&)>) ../src/gallium/frontends/clover/core/event.cpp:138
    #8 0x7ff9348daa47 in create<clover::hard_event, clover::command_queue&, int, clover::ref_vector<clover::event>&, clEnqueueNDRangeKernel(cl_command_queue, cl_kernel, cl_uint, const size_t*, const size_t*, const size_t*, cl_uint, _cl_event* const*, _cl_event**)::<lambda(clover::event&)> > ../src/gallium/frontends/clover/util/pointer.hpp:241
    #9 0x7ff9348daa47 in clEnqueueNDRangeKernel ../src/gallium/frontends/clover/api/kernel.cpp:334

Fixes: 659b7eb2 ("r600: better tracking for vertex buffer emission")
Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10079
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29163>
(cherry picked from commit f8a1d9f787)
2024-05-20 12:40:11 +02:00
David Rosca
331d440811 radeonsi: Update buffer for other planes in si_alloc_resource
The buffer is shared with all planes, so it needs to be updated
in all other planes. This is already done in si_texture_create_object
when creating the buffer, but it was missing when reallocating
in si_texture_invalidate_storage.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11155
Cc: mesa-stable
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29216>
(cherry picked from commit c522848d5a)
2024-05-20 12:40:11 +02:00
Karol Herbst
1bf184747e rusticl/mesa/context: flush context before destruction
Drivers might still be busy and might not properly clean things up.

Fixes a rare crash on application exit with some drivers.

Fixes: 50e981a050 ("rusticl/mesa: add fencing support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29223>
(cherry picked from commit f1662e9bc9)
2024-05-20 12:40:11 +02:00
Karol Herbst
30ddd43e6d event: break long dependency chains on drop
This prevents stack overflows on drop without making it expensive to read
from dependencies (e.g. my attempt to use Weak instead).

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29190>
(cherry picked from commit 48c752d3e0)
2024-05-20 12:40:11 +02:00
Karol Herbst
11f595f5e7 Revert "rusticl/event: use Weak refs for dependencies"
I didn't like the solution and I _think_ it even introduced a potential
regression involving releasing failed events, causing dependents
to run and succeed regardless.

This reverts commit a45f199086.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29190>
(cherry picked from commit 2f1f98e846)
2024-05-20 12:40:11 +02:00
Lionel Landwerlin
eac0e52cdb nir/divergence: add missing load_printf_buffer_address
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25814>
(cherry picked from commit 8d336f069e)
2024-05-20 12:40:10 +02:00
Lionel Landwerlin
a0e1b5f436 anv: fix push constant subgroup_id location
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7c76125db2 ("anv: use 2 different buffers for surfaces/samplers in descriptor sets")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25814>
(cherry picked from commit 3716bd704f)
2024-05-20 12:40:10 +02:00
Yiwei Zhang
2c1cbf296e turnip: virtio: fix racy gem close for re-imported dma-buf
Similar to the prior fix for msm. On the dma-buf import path, tu_bo_init
could be outside the vma lock, but it is left inside for code simplicity.

Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29093>
(cherry picked from commit 43bb989070)
2024-05-20 12:40:10 +02:00
Yiwei Zhang
699ec9c4c7 turnip: virtio: fix iova leak upon found already imported dmabuf
There's a success path when the dmabuf is found where the iova isn't
cleaned up. This change defers the iova allocation until a lookup miss,
and also prepares for the later racy dmabuf re-import fix.

Also documented a potential leak on the error path, since without
refcounting it's impossible to tell whether a gem handle should be closed.

Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29093>
(cherry picked from commit 6ca192f586)
2024-05-20 12:40:10 +02:00
Yiwei Zhang
352d44ce5a turnip: virtio: fix error path in virtio_bo_init
Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29093>
(cherry picked from commit 585a87ae53)
2024-05-20 12:40:10 +02:00
David Rosca
69c7b25037 frontends/va: Only increment slice offset after first slice parameters
Fixes the slice offset when an app submits exactly one data buffer
followed by parameter buffers.

Fixes: 6746d4df6e ("frontends/va: Fix AV1 slice_data_offset with multiple slice data buffers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11133
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11138
Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29124>
(cherry picked from commit b33bb4077d)
2024-05-20 12:40:10 +02:00
Friedrich Vock
1aaec51f56 aco/spill: Insert p_start_linear_vgpr right after p_logical_end
If p_start_linear_vgpr allocates a VGPR that is already blocked, RA
will try moving the blocking VGPR somewhere else. If
p_start_linear_vgpr is inserted right before the branch, that move will
be inserted after exec has been overwritten, which might cause the move
to be skipped for some threads.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28041>
(cherry picked from commit 590ea76104)
2024-05-20 12:40:10 +02:00
Friedrich Vock
d3b8a28357 aco/tests: Insert p_logical_start/end in reduce_temp tests
Linear VGPR insertion will depend on a p_logical_end existing in the
blocks the VGPR is inserted in.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28041>
(cherry picked from commit 84c1870b65)
2024-05-20 12:40:10 +02:00
Marek Olšák
368892e9e2 util: shift the mask in BITSET_TEST_RANGE_INSIDE_WORD to be relative to b
so that users don't have to shift it at every use. It was supposed to be
like this from the beginning.

Fixes: fb994f44d9 - util: make BITSET_TEST_RANGE_INSIDE_WORD take a value to compare with

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29187>
(cherry picked from commit 5502ecd771)
2024-05-20 12:40:10 +02:00
Faith Ekstrand
ba4462df44 vulkan/wsi: Bind memory planes, not YCbCr planes.
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Fixes: f5433e4d6c ("vulkan/wsi: Add modifiers support to wsi_create_native_image")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10176
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795>
(cherry picked from commit 28342a581f)
2024-05-20 11:02:02 +02:00
Faith Ekstrand
115598022a nouveau/winsys: Add back nouveau_ws_bo_new_tiled()
This reverts commit ce1cccea98.  In this
new version, we also add a query for whether or not tiled BOs are
supported by nouveau.ko.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795>
(cherry picked from commit 3bb531d245)
2024-05-20 11:00:53 +02:00
Faith Ekstrand
20595e465b drm-uapi: Sync nouveau_drm.h
Taken from drm-misc-next-fixes:

    commit 959314c438caf1b62d787f02d54a193efda38880
    Author: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
    Date:   Thu May 9 23:43:52 2024 +0300

        drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795>
(cherry picked from commit 03c4a46fe5)
2024-05-20 11:00:53 +02:00
Faith Ekstrand
6eaf495a19 nouveau/winsys: Take a reference to BOs found in the cache
Fixes: c370260a8f ("nouveau/winsys: Add dma-buf import support")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795>
(cherry picked from commit 19b143b7bc)
2024-05-20 10:55:02 +02:00
David Heidelberg
5dbc8d493d freedreno/ci: move the disabled jobs from include to the main file
Accidentally moved.

Fixes: 9442571664 ("ci: separate hiden jobs to -inc.yml files")

Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29155>
(cherry picked from commit d9a0373a65)
2024-05-20 10:54:56 +02:00
Yiwei Zhang
0598097e8a turnip: msm: fix racy gem close for re-imported dma-buf
For dma-buf, if the import and finish occur back-to-back for the same
dma-buf, zombie vma cleanup will unexpectedly close the re-imported
dma-buf gem handle. This change fixes it by trying to resurrect from
zombie vmas on the dma-buf import path.

Fixes: 63904240f2 ("tu: Re-enable bufferDeviceAddressCaptureReplay")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29173>
(cherry picked from commit a1392394ba)
2024-05-20 10:54:18 +02:00
Yiwei Zhang
74802851d2 turnip: msm: clean up iova on error path
Fixes: e23c4fbd9b ("tu: Switch to userspace iova allocations if kernel supports it")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29173>
(cherry picked from commit 3909803849)
2024-05-20 10:52:20 +02:00
Patrick Lerda
028dc8957f clover: fix memory leak related to optimize
Indeed, the object returned by LLVMCreatePassBuilderOptions()
was not freed.

For instance, this issue is triggered with "piglit/bin/cl-api-build-program":
Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7f6b15abdf57 in operator new(unsigned long) (/usr/lib64/libasan.so.6+0xb2f57)
    #1 0x7f6afff6529e in LLVMCreatePassBuilderOptions llvm-18.1.5/lib/Passes/PassBuilderBindings.cpp:83
    #2 0x7f6b1186ee41 in optimize ../src/gallium/frontends/clover/llvm/invocation.cpp:521
    #3 0x7f6b1186ee41 in clover::llvm::link_program(std::vector<clover::binary, std::allocator<clover::binary> > const&, clover::device const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) ../src/gallium/frontends/clover/llvm/invocation.cpp:554
    #4 0x7f6b1150ce67 in link_program ../src/gallium/frontends/clover/core/compiler.hpp:78
    #5 0x7f6b1150ce67 in clover::program::link(clover::ref_vector<clover::device> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, clover::ref_vector<clover::program> const&) ../src/gallium/frontends/clover/core/program.cpp:78
    #6 0x7f6b11401a2b in clBuildProgram ../src/gallium/frontends/clover/api/program.cpp:283

Fixes: 2d4fe5f229 ("clover/llvm: move to modern pass manager.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29164>
(cherry picked from commit df39994d51)
2024-05-20 10:52:19 +02:00
Romain Naour
c291a73202 glxext: don't try zink if not enabled in mesa
Commit 7d9ea77b45 ("glx: add automatic zink fallback loading between hw and sw drivers")
added an automatic zink fallback even when the zink gallium driver is not
enabled at build time.

This leads to an unexpected error log when loading the drisw driver
while zink is not installed on the rootfs:

  MESA-LOADER: failed to open zink: /usr/lib/dri/zink_dri.so

Fixes: 7d9ea77b45 ("glx: add automatic zink fallback loading between hw and sw drivers")

Signed-off-by: Romain Naour <romain.naour@smile.fr>
Reviewed-by: Antoine Coutant <antoine.coutant@smile.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27478>
(cherry picked from commit 02ab51a61e)
2024-05-20 10:52:18 +02:00
Antoine Coutant
c2739fefe3 drisw: fix build without dri3
commit 1887368df4 ("glx/sw: check for modifier support in the kopper path")
added the dri3_priv.h header and the dri3_check_multibuffer() function to
drisw, which can be built without dri3.

Commit 4477139ec2 added a guard around the dri3_check_multibuffer()
function but not around the dri3_priv.h header.

Add a HAVE_DRI3 guard around the dri3_priv.h header.

Fixes: 1887368df4 ("glx/sw: check for modifier support in the kopper path")

v2: Remove the guard around dri3_check_multibuffer() function.

Signed-off-by: Romain Naour <romain.naour@smile.fr>
Signed-off-by: Antoine Coutant <antoine.coutant@smile.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27478>
(cherry picked from commit 3163b65ba7)
2024-05-20 10:52:17 +02:00
Eric Engestrom
e462e3cc39 .pick_status.json: Update to a31996ce5a 2024-05-20 10:51:15 +02:00
Mike Blumenkrantz
abf8b28b65 zink: clean up semaphore arrays on batch state destroy
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29152>
(cherry picked from commit 604573cf0a)
2024-05-13 11:12:34 +02:00
Konstantin Seurer
9eb14991f9 radv: Zero initialize capture replay group handles
radv_serialized_shader_arena_block is not tightly packed and using an
initializer list leaves the gaps uninitialized.

cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28961>
(cherry picked from commit 406dda70e7)
2024-05-13 11:12:33 +02:00
Konstantin Seurer
c64129a0bd radv: Remove arenas from capture_replay_arena_vas
Avoids a use-after-free when looking up an arena.

cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28961>
(cherry picked from commit df82221bb3)
2024-05-13 11:10:13 +02:00
Konstantin Seurer
ca6431d9d7 radv: Fix radv_shader_arena_block list corruption
Remove it from the previous list before adding it to a new one.

cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28961>
(cherry picked from commit e050abc961)
2024-05-13 11:10:12 +02:00
Bas Nieuwenhuizen
df810add64 radv: Use zerovram for Enshrouded.
Two users have now reported that zerovram fixes the hangs.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10500
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29149>
(cherry picked from commit 79cb884275)
2024-05-13 11:10:12 +02:00
Faith Ekstrand
d68141bd68 nvk/meta: Restore set_sizes[0]
Fixes: af3e7ba105 ("nvk: Stash descriptor set sizes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29147>
(cherry picked from commit c834644c4e)
2024-05-13 11:10:11 +02:00
Faith Ekstrand
0a312787cd nvk: Re-emit sample locations when rasterization samples changes
We need them for the case where explicit sample locations are not
enabled.  While we're at it, fix the case where rasterization_samples=0.
This can happen when rasterizer discard is enabled.  This fixes MSAA
resolves with NVK+Zink.  In particular, it fixes MSAA for the Unigine
Heaven and Valley benchmarks.

This also fixes all of the spec@arb_texture_float@multisample-formats
piglit tests.

Fixes: 41d094c2cc ("nvk: Support dynamic state for enabling sample locations")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10786
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29147>
(cherry picked from commit a160c2a14e)
2024-05-13 11:10:10 +02:00
Mike Blumenkrantz
ce7e1ca1fa frontends/dri: always init opencl_func_mutex in InitScreen hooks
this otherwise leads to a mismatch where some types of screen may have
the mutex initialized while others don't, in which case dri_release_screen()
will attempt to destroy an uninitialized mutex

cc: mesa-stable

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29021>
(cherry picked from commit bc15c95c7a)
2024-05-13 11:10:00 +02:00
Mike Blumenkrantz
254b300f6b frontends/dri: only release pipe when screen init fails
the caller (driCreateNewScreen3) will always call dri_destroy_screen()
when these functions return failure, so releasing the screen
is always wrong

cc: mesa-stable

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29021>
(cherry picked from commit a1225e81c9)
2024-05-13 11:09:58 +02:00
Eric Engestrom
6d23f70e79 .pick_status.json: Mark ae8fbe220a as denominated 2024-05-13 11:09:56 +02:00
Eric Engestrom
9343ede6a2 .pick_status.json: Update to e154f90aa9 2024-05-13 11:09:20 +02:00
Rhys Perry
5be42f6982 aco/waitcnt: fix DS/VMEM ordered writes when mixed
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
(cherry picked from commit 5b1b09ad42)
2024-05-10 22:56:59 +02:00
Mike Blumenkrantz
020d145f4a u_blitter: stop leaking saved blitter states on no-op blits
drivers expect blitter to clean up after itself

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29122>
(cherry picked from commit cd004defd4)
2024-05-10 22:53:08 +02:00
Mike Blumenkrantz
cb375dfe03 zink: add a batch ref for committed sparse resources
this ensures that the sparse commit will complete before the resource
is destroyed

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29123>
(cherry picked from commit 67a356742f)
2024-05-10 22:53:07 +02:00
Georg Lehmann
57198d2ca9 zink: use bitcasts instead of pack/unpack double opcodes
The pack/unpack double opcodes may flush denorms, and the nir ops are pure
bitcasts.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29125>
(cherry picked from commit 925fff229f)
2024-05-10 22:53:06 +02:00
Mike Blumenkrantz
90012f1f66 egl/x11: disable dri3 with LIBGL_KOPPER_DRI2=1 as expected
cc: mesa-stable

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29106>
(cherry picked from commit 568807cf88)
2024-05-10 22:52:51 +02:00
Karol Herbst
38485a49f4 rusticl/event: use Weak refs for dependencies
This fixes a potential stack overflow when the dep chain of events gets
too long and is dropped all at once.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29089>
(cherry picked from commit a45f199086)
2024-05-10 22:52:47 +02:00
Lionel Landwerlin
33d6e6f9a2 anv: fix ycbcr plane indexing with indirect descriptors
We need to add the plane index to compute the address from which to
load the descriptor (anv_sampled_image_descriptor in this case).

This was likely broken before we added direct descriptor support so
that gets a stable backport.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11125
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29111>
(cherry picked from commit 665cad6408)
2024-05-10 22:52:46 +02:00
José Expósito
6fc67be3a5 meson: Update proc_macro2 meson.build patch
Update the proc-macro2/meson.build to include the changes from v1.0.81.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11071
Signed-off-by: José Expósito <jexposit@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28923>
(cherry picked from commit 18c5315731)
2024-05-10 22:52:42 +02:00
Eric Engestrom
58dfa780c1 .pick_status.json: Update to 18c5315731 2024-05-10 22:52:36 +02:00
Eric Engestrom
4047c51834 docs: add sha256sum for 24.0.7 2024-05-08 16:29:01 +02:00
60 changed files with 7273 additions and 251 deletions

File diff suppressed because it is too large


@@ -1 +1 @@
-24.0.7
+24.0.8


@@ -3,6 +3,7 @@ Release Notes
The release notes summarize what's new or changed in each Mesa release.
- :doc:`24.0.8 release notes <relnotes/24.0.8>`
- :doc:`24.0.7 release notes <relnotes/24.0.7>`
- :doc:`24.0.6 release notes <relnotes/24.0.6>`
- :doc:`24.0.5 release notes <relnotes/24.0.5>`
@@ -415,6 +416,7 @@ The release notes summarize what's new or changed in each Mesa release.
:maxdepth: 1
:hidden:
24.0.8 <relnotes/24.0.8>
24.0.7 <relnotes/24.0.7>
24.0.6 <relnotes/24.0.6>
24.0.5 <relnotes/24.0.5>


@@ -19,7 +19,7 @@ SHA256 checksum
 ::
-TBD.
+7454425f1ed4a6f1b5b107e1672b30c88b22ea0efea000ae2c7d96db93f6c26a  mesa-24.0.7.tar.xz
New features

docs/relnotes/24.0.8.rst Normal file

@@ -0,0 +1,155 @@
Mesa 24.0.8 Release Notes / 2024-05-22
======================================
Mesa 24.0.8 is a bug fix release which fixes bugs found since the 24.0.7 release.
Mesa 24.0.8 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 24.0.8 implements the Vulkan 1.3 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA256 checksum
---------------
::
TBD.
New features
------------
- None
Bug fixes
---------
- [24.1-rc4] fatal error: intel/dev/intel_wa.h: No such file or directory
- vcn: rewinding attached video in Totem cause [mmhub] page fault
- When using amd gpu deinterlace, tv bt709 properties mapping to 2 chroma
- VCN decoding freezes the whole system
- [RDNA2 [AV1] [VAAPI] hw decoding glitches in Thorium 123.0.6312.133 after https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28960
- WSI: Support VK_IMAGE_ASPECT_MEMORY_PLANE_i_BIT_EXT for DRM Modifiers in Vulkan
- radv: Enshrouded GPU hang on RX 6800
- NVK Zink: Wrong color in Unigine Valley benchmark
- [anv] FINISHME: support YUV colorspace with DRM format modifiers
- 24.0.6: build fails
Changes
-------
Antoine Coutant (1):
- drisw: fix build without dri3
Bas Nieuwenhuizen (1):
- radv: Use zerovram for Enshrouded.
David Heidelberg (2):
- freedreno/ci: move the disabled jobs from include to the main file
- winsys/i915: depends on intel_wa.h
David Rosca (6):
- frontends/va: Only increment slice offset after first slice parameters
- radeonsi: Update buffer for other planes in si_alloc_resource
- frontends/va: Store slice types for H264 decode
- radeonsi/vcn: Ensure DPB has as many buffers as references
- radeonsi/vcn: Allow duplicate buffers in DPB
- radeonsi/vcn: Ensure at least one reference for H264 P/B frames
Eric Engestrom (5):
- docs: add sha256sum for 24.0.7
- .pick_status.json: Update to 18c53157318d6c8e572062f6bb768dfb621a55fd
- .pick_status.json: Update to e154f90aa9e71cc98375866c3ab24c4e08e66cb7
- .pick_status.json: Mark ae8fbe220ae67ffdce662c26bc4a634d475c0389 as denominated
- .pick_status.json: Update to a31996ce5a6b7eb3b324b71eb9e9c45173953c50
Faith Ekstrand (6):
- nvk: Re-emit sample locations when rasterization samples changes
- nvk/meta: Restore set_sizes[0]
- nouveau/winsys: Take a reference to BOs found in the cache
- drm-uapi: Sync nouveau_drm.h
- nouveau/winsys: Add back nouveau_ws_bo_new_tiled()
- vulkan/wsi: Bind memory planes, not YCbCr planes.
Friedrich Vock (2):
- aco/tests: Insert p_logical_start/end in reduce_temp tests
- aco/spill: Insert p_start_linear_vgpr right after p_logical_end
Georg Lehmann (1):
- zink: use bitcasts instead of pack/unpack double opcodes
José Expósito (1):
- meson: Update proc_macro2 meson.build patch
Karol Herbst (5):
- rusticl/event: use Weak refs for dependencies
- Revert "rusticl/event: use Weak refs for dependencies"
- event: break long dependency chains on drop
- rusticl/mesa/context: flush context before destruction
- nir/lower_cl_images: set binding also for samplers
Konstantin Seurer (3):
- radv: Fix radv_shader_arena_block list corruption
- radv: Remove arenas from capture_replay_arena_vas
- radv: Zero initialize capture replay group handles
Lionel Landwerlin (3):
- anv: fix ycbcr plane indexing with indirect descriptors
- anv: fix push constant subgroup_id location
- nir/divergence: add missing load_printf_buffer_address
Marek Olšák (1):
- util: shift the mask in BITSET_TEST_RANGE_INSIDE_WORD to be relative to b
Mike Blumenkrantz (8):
- egl/x11: disable dri3 with LIBGL_KOPPER_DRI2=1 as expected
- zink: add a batch ref for committed sparse resources
- u_blitter: stop leaking saved blitter states on no-op blits
- frontends/dri: only release pipe when screen init fails
- frontends/dri: always init opencl_func_mutex in InitScreen hooks
- zink: clean up semaphore arrays on batch state destroy
- nir/lower_aaline: fix for scalarized outputs
- nir/linking: fix nir_assign_io_var_locations for scalarized dual blend
Patrick Lerda (2):
- clover: fix memory leak related to optimize
- r600: fix vertex state update clover regression
Rhys Perry (1):
- aco/waitcnt: fix DS/VMEM ordered writes when mixed
Romain Naour (1):
- glxext: don't try zink if not enabled in mesa
Yiwei Zhang (5):
- turnip: msm: clean up iova on error path
- turnip: msm: fix racy gem close for re-imported dma-buf
- turnip: virtio: fix error path in virtio_bo_init
- turnip: virtio: fix iova leak upon found already imported dmabuf
- turnip: virtio: fix racy gem close for re-imported dma-buf


@@ -54,11 +54,42 @@ extern "C" {
*/
#define NOUVEAU_GETPARAM_EXEC_PUSH_MAX 17
/*
* NOUVEAU_GETPARAM_VRAM_BAR_SIZE - query bar size
*
* Query the VRAM BAR size.
*/
#define NOUVEAU_GETPARAM_VRAM_BAR_SIZE 18
/*
* NOUVEAU_GETPARAM_VRAM_USED
*
* Get remaining VRAM size.
*/
#define NOUVEAU_GETPARAM_VRAM_USED 19
/*
* NOUVEAU_GETPARAM_HAS_VMA_TILEMODE
*
* Query whether tile mode and PTE kind are accepted with VM allocs or not.
*/
#define NOUVEAU_GETPARAM_HAS_VMA_TILEMODE 20
struct drm_nouveau_getparam {
__u64 param;
__u64 value;
};
/*
* Those are used to support selecting the main engine used on Kepler.
* This goes into drm_nouveau_channel_alloc::tt_ctxdma_handle
*/
#define NOUVEAU_FIFO_ENGINE_GR 0x01
#define NOUVEAU_FIFO_ENGINE_VP 0x02
#define NOUVEAU_FIFO_ENGINE_PPP 0x04
#define NOUVEAU_FIFO_ENGINE_BSP 0x08
#define NOUVEAU_FIFO_ENGINE_CE 0x30
struct drm_nouveau_channel_alloc {
__u32 fb_ctxdma_handle;
__u32 tt_ctxdma_handle;
@@ -81,6 +112,18 @@ struct drm_nouveau_channel_free {
__s32 channel;
};
struct drm_nouveau_notifierobj_alloc {
__u32 channel;
__u32 handle;
__u32 size;
__u32 offset;
};
struct drm_nouveau_gpuobj_free {
__s32 channel;
__u32 handle;
};
#define NOUVEAU_GEM_DOMAIN_CPU (1 << 0)
#define NOUVEAU_GEM_DOMAIN_VRAM (1 << 1)
#define NOUVEAU_GEM_DOMAIN_GART (1 << 2)
@@ -238,34 +281,32 @@ struct drm_nouveau_vm_init {
struct drm_nouveau_vm_bind_op {
/**
* @op: the operation type
*
* Supported values:
*
* %DRM_NOUVEAU_VM_BIND_OP_MAP - Map a GEM object to the GPU's VA
* space. Optionally, the &DRM_NOUVEAU_VM_BIND_SPARSE flag can be
* passed to instruct the kernel to create sparse mappings for the
* given range.
*
* %DRM_NOUVEAU_VM_BIND_OP_UNMAP - Unmap an existing mapping in the
* GPU's VA space. If the region the mapping is located in is a
* sparse region, new sparse mappings are created where the unmapped
* (memory backed) mapping was mapped previously. To remove a sparse
* region the &DRM_NOUVEAU_VM_BIND_SPARSE must be set.
*/
__u32 op;
/**
* @DRM_NOUVEAU_VM_BIND_OP_MAP:
*
* Map a GEM object to the GPU's VA space. Optionally, the
* &DRM_NOUVEAU_VM_BIND_SPARSE flag can be passed to instruct the kernel to
* create sparse mappings for the given range.
*/
#define DRM_NOUVEAU_VM_BIND_OP_MAP 0x0
/**
* @DRM_NOUVEAU_VM_BIND_OP_UNMAP:
*
* Unmap an existing mapping in the GPU's VA space. If the region the mapping
* is located in is a sparse region, new sparse mappings are created where the
* unmapped (memory backed) mapping was mapped previously. To remove a sparse
* region the &DRM_NOUVEAU_VM_BIND_SPARSE must be set.
*/
#define DRM_NOUVEAU_VM_BIND_OP_UNMAP 0x1
/**
* @flags: the flags for a &drm_nouveau_vm_bind_op
*
* Supported values:
*
* %DRM_NOUVEAU_VM_BIND_SPARSE - Indicates that an allocated VA
* space region should be sparse.
*/
__u32 flags;
/**
* @DRM_NOUVEAU_VM_BIND_SPARSE:
*
* Indicates that an allocated VA space region should be sparse.
*/
#define DRM_NOUVEAU_VM_BIND_SPARSE (1 << 8)
/**
* @handle: the handle of the DRM GEM object to map
@@ -301,17 +342,17 @@ struct drm_nouveau_vm_bind {
__u32 op_count;
/**
* @flags: the flags for a &drm_nouveau_vm_bind ioctl
*
* Supported values:
*
* %DRM_NOUVEAU_VM_BIND_RUN_ASYNC - Indicates that the given VM_BIND
* operation should be executed asynchronously by the kernel.
*
* If this flag is not supplied the kernel executes the associated
* operations synchronously and doesn't accept any &drm_nouveau_sync
* objects.
*/
__u32 flags;
/**
* @DRM_NOUVEAU_VM_BIND_RUN_ASYNC:
*
* Indicates that the given VM_BIND operation should be executed asynchronously
* by the kernel.
*
* If this flag is not supplied the kernel executes the associated operations
* synchronously and doesn't accept any &drm_nouveau_sync objects.
*/
#define DRM_NOUVEAU_VM_BIND_RUN_ASYNC 0x1
/**
* @wait_count: the number of wait &drm_nouveau_syncs


@@ -10,7 +10,4 @@ dEQP-VK.draw.renderpass.multi_draw.mosaic.indexed_mixed.max_draws.stride_extra_1
dEQP-VK.pipeline.*line_stipple_enable
dEQP-VK.pipeline.*line_stipple_params
# New CTS flakes in 1.3.6.3
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.(single|multi)threaded_compilation.*_check_(all|capture_replay)_handles
dEQP-VK.query_pool.statistics_query.host_query_reset.geometry_shader_invocations.secondary.32bits_triangle_list_clear_depth


@@ -1,5 +0,0 @@
# New CTS flakes in 1.3.6.3
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.multithreaded_compilation.*_check_all_handles
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.multithreaded_compilation.*_check_capture_replay_handles
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.singlethreaded_compilation.*_check_all_handles
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.singlethreaded_compilation.*_check_capture_replay_handles


@@ -428,18 +428,20 @@ check_instr(wait_ctx& ctx, wait_imm& wait, alu_delay_info& delay, Instruction* i
if (it == ctx.gpr_map.end())
continue;
wait_imm reg_imm = it->second.imm;
/* Vector Memory reads and writes return in the order they were issued */
uint8_t vmem_type = get_vmem_type(instr);
if (vmem_type && ((it->second.events & vm_events) == event_vmem) &&
it->second.vmem_types == vmem_type)
continue;
reg_imm.vm = wait_imm::unset_counter;
/* LDS reads and writes return in the order they were issued. same for GDS */
if (instr->isDS() &&
(it->second.events & lgkm_events) == (instr->ds().gds ? event_gds : event_lds))
continue;
reg_imm.lgkm = wait_imm::unset_counter;
wait.combine(it->second.imm);
wait.combine(reg_imm);
}
}
}


@@ -118,11 +118,16 @@ setup_reduce_temp(Program* program)
* would insert at the end instead of using this one. */
} else {
assert(last_top_level_block_idx < block.index);
/* insert before the branch at last top level block */
/* insert after p_logical_end of the last top-level block */
std::vector<aco_ptr<Instruction>>& instructions =
program->blocks[last_top_level_block_idx].instructions;
instructions.insert(std::next(instructions.begin(), instructions.size() - 1),
std::move(create));
auto insert_point =
std::find_if(instructions.rbegin(), instructions.rend(),
[](const auto& iter) {
return iter->opcode == aco_opcode::p_logical_end;
})
.base();
instructions.insert(insert_point, std::move(create));
inserted_at = last_top_level_block_idx;
}
}
@@ -161,8 +166,13 @@ setup_reduce_temp(Program* program)
assert(last_top_level_block_idx < block.index);
std::vector<aco_ptr<Instruction>>& instructions =
program->blocks[last_top_level_block_idx].instructions;
instructions.insert(std::next(instructions.begin(), instructions.size() - 1),
std::move(create));
auto insert_point =
std::find_if(instructions.rbegin(), instructions.rend(),
[](const auto& iter) {
return iter->opcode == aco_opcode::p_logical_end;
})
.base();
instructions.insert(insert_point, std::move(create));
vtmp_inserted_at = last_top_level_block_idx;
}
}
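Both hunks above switch from "insert before the block's final branch" to searching backwards for `p_logical_end` with reverse iterators. The subtlety is that `reverse_iterator::base()` points one element *past* the match, which is exactly the forward position that inserts directly after it. A minimal sketch of that idiom, with plain ints standing in for instructions:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Insert `value` immediately after the last occurrence of `marker`,
// mirroring how the hunks above place instructions after p_logical_end.
static std::vector<int> insert_after_marker(std::vector<int> v, int marker, int value)
{
   // Search backwards for the marker. base() of a reverse iterator that
   // points at the match is the forward iterator one past it, so
   // insert() there lands right after the marker (and before any
   // trailing branch). If the marker is absent, rend().base() == begin()
   // and the value is inserted at the front.
   auto insert_point =
      std::find_if(v.rbegin(), v.rend(),
                   [marker](int instr) { return instr == marker; })
         .base();
   v.insert(insert_point, value);
   return v;
}
```

Here 42 plays the role of `p_logical_end` and 99 the trailing branch: inserting 7 yields `{1, 2, 42, 7, 99}`, i.e. after the marker but before the branch.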


@@ -1845,10 +1845,16 @@ assign_spill_slots(spill_ctx& ctx, unsigned spills_to_vgpr)
instructions.emplace_back(std::move(create));
} else {
assert(last_top_level_block_idx < block.index);
/* insert before the branch at last top level block */
/* insert after p_logical_end of the last top-level block */
std::vector<aco_ptr<Instruction>>& block_instrs =
ctx.program->blocks[last_top_level_block_idx].instructions;
block_instrs.insert(std::prev(block_instrs.end()), std::move(create));
auto insert_point =
std::find_if(block_instrs.rbegin(), block_instrs.rend(),
[](const auto& iter) {
return iter->opcode == aco_opcode::p_logical_end;
})
.base();
block_instrs.insert(insert_point, std::move(create));
}
}
@@ -1885,10 +1891,16 @@ assign_spill_slots(spill_ctx& ctx, unsigned spills_to_vgpr)
instructions.emplace_back(std::move(create));
} else {
assert(last_top_level_block_idx < block.index);
/* insert before the branch at last top level block */
/* insert after p_logical_end of the last top-level block */
std::vector<aco_ptr<Instruction>>& block_instrs =
ctx.program->blocks[last_top_level_block_idx].instructions;
block_instrs.insert(std::prev(block_instrs.end()), std::move(create));
auto insert_point =
std::find_if(block_instrs.rbegin(), block_instrs.rend(),
[](const auto& iter) {
return iter->opcode == aco_opcode::p_logical_end;
})
.base();
block_instrs.insert(insert_point, std::move(create));
}
}


@@ -53,3 +53,71 @@ BEGIN_TEST(insert_waitcnt.ds_ordered_count)
finish_waitcnt_test();
END_TEST
BEGIN_TEST(insert_waitcnt.waw.mixed_vmem_lds.vmem)
if (!setup_cs(NULL, GFX10))
return;
Definition def_v4(PhysReg(260), v1);
Operand op_v0(PhysReg(256), v1);
Operand desc0(PhysReg(0), s4);
//>> BB0
//! /* logical preds: / linear preds: / kind: top-level, */
//! v1: %0:v[4] = buffer_load_dword %0:s[0-3], %0:v[0], 0
bld.mubuf(aco_opcode::buffer_load_dword, def_v4, desc0, op_v0, Operand::zero(), 0, false);
//>> BB1
//! /* logical preds: / linear preds: / kind: */
//! v1: %0:v[4] = ds_read_b32 %0:v[0]
bld.reset(program->create_and_insert_block());
bld.ds(aco_opcode::ds_read_b32, def_v4, op_v0);
bld.reset(program->create_and_insert_block());
program->blocks[2].linear_preds.push_back(0);
program->blocks[2].linear_preds.push_back(1);
program->blocks[2].logical_preds.push_back(0);
program->blocks[2].logical_preds.push_back(1);
//>> BB2
//! /* logical preds: BB0, BB1, / linear preds: BB0, BB1, / kind: uniform, */
//! s_waitcnt lgkmcnt(0)
//! v1: %0:v[4] = buffer_load_dword %0:s[0-3], %0:v[0], 0
bld.mubuf(aco_opcode::buffer_load_dword, def_v4, desc0, op_v0, Operand::zero(), 0, false);
finish_waitcnt_test();
END_TEST
BEGIN_TEST(insert_waitcnt.waw.mixed_vmem_lds.lds)
if (!setup_cs(NULL, GFX10))
return;
Definition def_v4(PhysReg(260), v1);
Operand op_v0(PhysReg(256), v1);
Operand desc0(PhysReg(0), s4);
//>> BB0
//! /* logical preds: / linear preds: / kind: top-level, */
//! v1: %0:v[4] = buffer_load_dword %0:s[0-3], %0:v[0], 0
bld.mubuf(aco_opcode::buffer_load_dword, def_v4, desc0, op_v0, Operand::zero(), 0, false);
//>> BB1
//! /* logical preds: / linear preds: / kind: */
//! v1: %0:v[4] = ds_read_b32 %0:v[0]
bld.reset(program->create_and_insert_block());
bld.ds(aco_opcode::ds_read_b32, def_v4, op_v0);
bld.reset(program->create_and_insert_block());
program->blocks[2].linear_preds.push_back(0);
program->blocks[2].linear_preds.push_back(1);
program->blocks[2].logical_preds.push_back(0);
program->blocks[2].logical_preds.push_back(1);
//>> BB2
//! /* logical preds: BB0, BB1, / linear preds: BB0, BB1, / kind: uniform, */
//! s_waitcnt vmcnt(0)
//! v1: %0:v[4] = ds_read_b32 %0:v[0]
bld.ds(aco_opcode::ds_read_b32, def_v4, op_v0);
finish_waitcnt_test();
END_TEST


@@ -41,6 +41,11 @@ BEGIN_TEST(setup_reduce_temp.divergent_if_phi)
if (!setup_cs("s2 v1", GFX9))
return;
//>> p_logical_start
//>> p_logical_end
bld.pseudo(aco_opcode::p_logical_start);
bld.pseudo(aco_opcode::p_logical_end);
//>> lv1: %lv = p_start_linear_vgpr
emit_divergent_if_else(
program.get(), bld, Operand(inputs[0]),


@@ -1320,9 +1320,6 @@ radv_DestroyDevice(VkDevice _device, const VkAllocationCallbacks *pAllocator)
if (!device)
return;
if (device->capture_replay_arena_vas)
_mesa_hash_table_u64_destroy(device->capture_replay_arena_vas);
radv_device_finish_perf_counter_lock_cs(device);
if (device->perf_counter_bo)
device->ws->buffer_destroy(device->ws, device->perf_counter_bo);
@@ -1372,6 +1369,8 @@ radv_DestroyDevice(VkDevice _device, const VkAllocationCallbacks *pAllocator)
radv_finish_trace(device);
radv_destroy_shader_arenas(device);
if (device->capture_replay_arena_vas)
_mesa_hash_table_u64_destroy(device->capture_replay_arena_vas);
radv_sqtt_finish(device);


@@ -943,8 +943,12 @@ radv_GetRayTracingCaptureReplayShaderGroupHandlesKHR(VkDevice device, VkPipeline
uint32_t recursive_shader = rt_pipeline->groups[firstGroup + i].recursive_shader;
if (recursive_shader != VK_SHADER_UNUSED_KHR) {
struct radv_shader *shader = rt_pipeline->stages[recursive_shader].shader;
if (shader)
data[i].recursive_shader_alloc = radv_serialize_shader_arena_block(shader->alloc);
if (shader) {
data[i].recursive_shader_alloc.offset = shader->alloc->offset;
data[i].recursive_shader_alloc.size = shader->alloc->size;
data[i].recursive_shader_alloc.arena_va = shader->alloc->arena->bo->va;
data[i].recursive_shader_alloc.arena_size = shader->alloc->arena->size;
}
}
data[i].non_recursive_idx = rt_pipeline->groups[firstGroup + i].handle.any_hit_index;
}


@@ -982,6 +982,7 @@ alloc_block_obj(struct radv_device *device)
static void
free_block_obj(struct radv_device *device, union radv_shader_arena_block *block)
{
list_del(&block->pool);
list_add(&block->pool, &device->shader_block_obj_pool);
}
@@ -1267,7 +1268,6 @@ radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_bloc
remove_hole(free_list, hole_prev);
hole_prev->size += hole->size;
list_del(&hole->list);
free_block_obj(device, hole);
hole = hole_prev;
@@ -1280,7 +1280,6 @@ radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_bloc
hole_next->offset -= hole->size;
hole_next->size += hole->size;
list_del(&hole->list);
free_block_obj(device, hole);
hole = hole_next;
@@ -1293,6 +1292,18 @@ radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_bloc
radv_rmv_log_bo_destroy(device, arena->bo);
device->ws->buffer_destroy(device->ws, arena->bo);
list_del(&arena->list);
if (device->capture_replay_arena_vas) {
struct hash_entry *arena_entry = NULL;
hash_table_foreach (device->capture_replay_arena_vas->table, entry) {
if (entry->data == arena) {
arena_entry = entry;
break;
}
}
_mesa_hash_table_remove(device->capture_replay_arena_vas->table, arena_entry);
}
free(arena);
} else if (free_list) {
add_hole(free_list, hole);
@@ -1301,18 +1312,6 @@ radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_bloc
mtx_unlock(&device->shader_arena_mutex);
}
struct radv_serialized_shader_arena_block
radv_serialize_shader_arena_block(union radv_shader_arena_block *block)
{
struct radv_serialized_shader_arena_block serialized_block = {
.offset = block->offset,
.size = block->size,
.arena_va = block->arena->bo->va,
.arena_size = block->arena->size,
};
return serialized_block;
}
union radv_shader_arena_block *
radv_replay_shader_arena_block(struct radv_device *device, const struct radv_serialized_shader_arena_block *src,
void *ptr)
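The first hunk above adds a `list_del()` before `free_block_obj()`'s `list_add()`: an intrusive list node that may still be linked must be unlinked before it joins another list, or both lists end up corrupted. A reduced sketch with Mesa-style semantics (`list_init`/`list_add`/`list_del` here are local stand-ins, not the real `util/list.h`):

```cpp
#include <cassert>

// Minimal circular doubly-linked intrusive list, modeled on Mesa's
// util/list.h conventions: an empty head points at itself, and
// list_add() inserts the item right after the head.
struct node { node* prev; node* next; };

static void list_init(node* head) { head->prev = head->next = head; }

static void list_del(node* n)
{
   n->prev->next = n->next;
   n->next->prev = n->prev;
}

static void list_add(node* n, node* head)
{
   n->next = head->next;
   n->prev = head;
   head->next->prev = n;
   head->next = n;
}

static unsigned list_length(node* head)
{
   unsigned len = 0;
   for (node* n = head->next; n != head; n = n->next)
      len++;
   return len;
}
```

Moving a node to a second list without the `list_del()` would leave the first list's neighbors still pointing at the node, which is exactly the block-pool corruption the commit message describes.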


@@ -775,8 +775,6 @@ union radv_shader_arena_block *radv_replay_shader_arena_block(struct radv_device
const struct radv_serialized_shader_arena_block *src,
void *ptr);
struct radv_serialized_shader_arena_block radv_serialize_shader_arena_block(union radv_shader_arena_block *block);
void radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_block *alloc);
struct radv_shader *radv_create_trap_handler_shader(struct radv_device *device);


@@ -216,6 +216,7 @@ visit_intrinsic(nir_shader *shader, nir_intrinsic_instr *instr)
case nir_intrinsic_load_rasterization_primitive_amd:
case nir_intrinsic_load_global_constant_uniform_block_intel:
case nir_intrinsic_cmat_length:
case nir_intrinsic_load_printf_buffer_address:
is_divergent = false;
break;


@@ -1492,7 +1492,7 @@ nir_assign_io_var_locations(nir_shader *shader, nir_variable_mode mode,
unsigned *size, gl_shader_stage stage)
{
unsigned location = 0;
unsigned assigned_locations[VARYING_SLOT_TESS_MAX];
unsigned assigned_locations[VARYING_SLOT_TESS_MAX][2];
uint64_t processed_locs[2] = { 0 };
struct exec_list io_vars;
@@ -1584,7 +1584,7 @@ nir_assign_io_var_locations(nir_shader *shader, nir_variable_mode mode,
if (processed) {
/* TODO handle overlapping per-view variables */
assert(!var->data.per_view);
unsigned driver_location = assigned_locations[var->data.location];
unsigned driver_location = assigned_locations[var->data.location][var->data.index];
var->data.driver_location = driver_location;
/* An array may be packed such that is crosses multiple other arrays
@@ -1605,7 +1605,7 @@ nir_assign_io_var_locations(nir_shader *shader, nir_variable_mode mode,
unsigned num_unallocated_slots = last_slot_location - location;
unsigned first_unallocated_slot = var_size - num_unallocated_slots;
for (unsigned i = first_unallocated_slot; i < var_size; i++) {
assigned_locations[var->data.location + i] = location;
assigned_locations[var->data.location + i][var->data.index] = location;
location++;
}
}
@@ -1613,7 +1613,7 @@ nir_assign_io_var_locations(nir_shader *shader, nir_variable_mode mode,
}
for (unsigned i = 0; i < var_size; i++) {
assigned_locations[var->data.location + i] = location + i;
assigned_locations[var->data.location + i][var->data.index] = location + i;
}
var->data.driver_location = location;
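The change above widens `assigned_locations` to a second dimension keyed by `var->data.index`, so the two dual-source-blend outputs that share a location no longer clobber each other's driver location once scalarized. A reduced sketch of the bookkeeping, using hypothetical fixed-size tables rather than the real NIR data structures:

```cpp
#include <cassert>

// Toy model: each output variable has a location, a dual-source blend
// index (data.index), and a driver_location to be assigned. Variables
// sharing (location, index) must get the same driver location;
// variables differing only in index must get distinct ones — which a
// 1-D table keyed only on location cannot express.
struct io_var { unsigned location; unsigned index; unsigned driver_location; };

static void assign_locations(io_var* vars, unsigned count)
{
   unsigned assigned[8][2] = {};
   bool processed[8][2] = {};
   unsigned next = 0;
   for (unsigned i = 0; i < count; i++) {
      io_var* v = &vars[i];
      if (processed[v->location][v->index]) {
         // Already seen this (location, index) pair — reuse its slot.
         v->driver_location = assigned[v->location][v->index];
         continue;
      }
      assigned[v->location][v->index] = next;
      processed[v->location][v->index] = true;
      v->driver_location = next++;
   }
}
```

With the old 1-D table, the scalarized index-1 output would overwrite the entry for index 0, and every later lookup at that location would return the highest driver location — the bug the commit message describes.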


@@ -161,6 +161,7 @@ nir_lower_cl_images(nir_shader *shader, bool lower_image_derefs, bool lower_samp
assert(var->data.location > last_loc);
last_loc = var->data.location;
var->data.driver_location = num_samplers++;
var->data.binding = var->data.driver_location;
} else {
/* CL shouldn't have any sampled images */
assert(!glsl_type_is_sampler(var->type));


@@ -1518,7 +1518,8 @@ dri2_initialize_x11_swrast(_EGLDisplay *disp)
*/
dri2_dpy->driver_name = strdup(disp->Options.Zink ? "zink" : "swrast");
if (disp->Options.Zink &&
!debug_get_bool_option("LIBGL_DRI3_DISABLE", false))
!debug_get_bool_option("LIBGL_DRI3_DISABLE", false) &&
!debug_get_bool_option("LIBGL_KOPPER_DRI2", false))
dri3_x11_connect(dri2_dpy);
if (!dri2_load_driver_swrast(disp))
goto cleanup;


@@ -292,25 +292,6 @@
tags:
- google-freedreno-db410c
# New jobs. Leave it as manual for now.
.a306_piglit:
extends:
- .piglit-test
- .a306-test
- .google-freedreno-manual-rules
variables:
HWCI_START_XORG: 1
# Something happened and now this hangchecks and doesn't recover. Unknown when
# it started.
.a306_piglit_gl:
extends:
- .a306_piglit
variables:
PIGLIT_PROFILES: quick_gl
BM_KERNEL_EXTRA_ARGS: "msm.num_hw_submissions=1"
FDO_CI_CONCURRENT: 3
# 8 devices (2023-04-15)
.a530-test:
extends:


@@ -10,6 +10,25 @@ a306_gl:
FDO_CI_CONCURRENT: 6
parallel: 5
# New jobs. Leave it as manual for now.
.a306_piglit:
extends:
- .piglit-test
- .a306-test
- .google-freedreno-manual-rules
variables:
HWCI_START_XORG: 1
# Something happened and now this hangchecks and doesn't recover. Unknown when
# it started.
.a306_piglit_gl:
extends:
- .a306_piglit
variables:
PIGLIT_PROFILES: quick_gl
BM_KERNEL_EXTRA_ARGS: "msm.num_hw_submissions=1"
FDO_CI_CONCURRENT: 3
a306_piglit_shader:
extends:
- .a306_piglit


@@ -20,5 +20,8 @@ IncludeCategories:
- Regex: '.*'
Priority: 1
ForEachMacros:
- u_vector_foreach
SpaceAfterCStyleCast: true
SpaceBeforeCpp11BracedList: true


@@ -15,12 +15,12 @@
struct tu_u_trace_syncobj;
struct vdrm_bo;
enum tu_bo_alloc_flags
{
enum tu_bo_alloc_flags {
TU_BO_ALLOC_NO_FLAGS = 0,
TU_BO_ALLOC_ALLOW_DUMP = 1 << 0,
TU_BO_ALLOC_GPU_READ_ONLY = 1 << 1,
TU_BO_ALLOC_REPLAYABLE = 1 << 2,
TU_BO_ALLOC_DMABUF = 1 << 4,
};
/* Define tu_timeline_sync type based on drm syncobj for a point type


@@ -321,44 +321,68 @@ tu_free_zombie_vma_locked(struct tu_device *dev, bool wait)
last_signaled_fence = vma->fence;
}
/* Ensure that internal kernel's vma is freed. */
struct drm_msm_gem_info req = {
.handle = vma->gem_handle,
.info = MSM_INFO_SET_IOVA,
.value = 0,
};
if (vma->gem_handle) {
/* Ensure that internal kernel's vma is freed. */
struct drm_msm_gem_info req = {
.handle = vma->gem_handle,
.info = MSM_INFO_SET_IOVA,
.value = 0,
};
int ret =
drmCommandWriteRead(dev->fd, DRM_MSM_GEM_INFO, &req, sizeof(req));
if (ret < 0) {
mesa_loge("MSM_INFO_SET_IOVA(0) failed! %d (%s)", ret,
strerror(errno));
return VK_ERROR_UNKNOWN;
int ret =
drmCommandWriteRead(dev->fd, DRM_MSM_GEM_INFO, &req, sizeof(req));
if (ret < 0) {
mesa_loge("MSM_INFO_SET_IOVA(0) failed! %d (%s)", ret,
strerror(errno));
return VK_ERROR_UNKNOWN;
}
tu_gem_close(dev, vma->gem_handle);
util_vma_heap_free(&dev->vma, vma->iova, vma->size);
}
tu_gem_close(dev, vma->gem_handle);
util_vma_heap_free(&dev->vma, vma->iova, vma->size);
u_vector_remove(&dev->zombie_vmas);
}
return VK_SUCCESS;
}
static bool
tu_restore_from_zombie_vma_locked(struct tu_device *dev,
uint32_t gem_handle,
uint64_t *iova)
{
struct tu_zombie_vma *vma;
u_vector_foreach (vma, &dev->zombie_vmas) {
if (vma->gem_handle == gem_handle) {
*iova = vma->iova;
/* mark to skip later gem and iova cleanup */
vma->gem_handle = 0;
return true;
}
}
return false;
}
static VkResult
msm_allocate_userspace_iova(struct tu_device *dev,
uint32_t gem_handle,
uint64_t size,
uint64_t client_iova,
enum tu_bo_alloc_flags flags,
uint64_t *iova)
msm_allocate_userspace_iova_locked(struct tu_device *dev,
uint32_t gem_handle,
uint64_t size,
uint64_t client_iova,
enum tu_bo_alloc_flags flags,
uint64_t *iova)
{
VkResult result;
mtx_lock(&dev->vma_mutex);
*iova = 0;
if ((flags & TU_BO_ALLOC_DMABUF) &&
tu_restore_from_zombie_vma_locked(dev, gem_handle, iova))
return VK_SUCCESS;
tu_free_zombie_vma_locked(dev, false);
result = tu_allocate_userspace_iova(dev, size, client_iova, flags, iova);
@@ -372,8 +396,6 @@ msm_allocate_userspace_iova(struct tu_device *dev,
result = tu_allocate_userspace_iova(dev, size, client_iova, flags, iova);
}
mtx_unlock(&dev->vma_mutex);
if (result != VK_SUCCESS)
return result;
@@ -386,6 +408,7 @@ msm_allocate_userspace_iova(struct tu_device *dev,
int ret =
drmCommandWriteRead(dev->fd, DRM_MSM_GEM_INFO, &req, sizeof(req));
if (ret < 0) {
util_vma_heap_free(&dev->vma, *iova, size);
mesa_loge("MSM_INFO_SET_IOVA failed! %d (%s)", ret, strerror(errno));
return VK_ERROR_OUT_OF_HOST_MEMORY;
}
@@ -420,8 +443,8 @@ tu_bo_init(struct tu_device *dev,
assert(!client_iova || dev->physical_device->has_set_iova);
if (dev->physical_device->has_set_iova) {
result = msm_allocate_userspace_iova(dev, gem_handle, size, client_iova,
flags, &iova);
result = msm_allocate_userspace_iova_locked(dev, gem_handle, size,
client_iova, flags, &iova);
} else {
result = tu_allocate_kernel_iova(dev, gem_handle, &iova);
}
@@ -445,6 +468,8 @@ tu_bo_init(struct tu_device *dev,
if (!new_ptr) {
dev->bo_count--;
mtx_unlock(&dev->bo_mutex);
if (dev->physical_device->has_set_iova)
util_vma_heap_free(&dev->vma, iova, size);
tu_gem_close(dev, gem_handle);
return VK_ERROR_OUT_OF_HOST_MEMORY;
}
@@ -506,6 +531,20 @@ tu_bo_set_kernel_name(struct tu_device *dev, struct tu_bo *bo, const char *name)
}
}
static inline void
msm_vma_lock(struct tu_device *dev)
{
if (dev->physical_device->has_set_iova)
mtx_lock(&dev->vma_mutex);
}
static inline void
msm_vma_unlock(struct tu_device *dev)
{
if (dev->physical_device->has_set_iova)
mtx_unlock(&dev->vma_mutex);
}
static VkResult
msm_bo_init(struct tu_device *dev,
struct tu_bo **out_bo,
@@ -541,9 +580,15 @@ msm_bo_init(struct tu_device *dev,
struct tu_bo* bo = tu_device_lookup_bo(dev, req.handle);
assert(bo && bo->gem_handle == 0);
assert(!(flags & TU_BO_ALLOC_DMABUF));
msm_vma_lock(dev);
VkResult result =
tu_bo_init(dev, bo, req.handle, size, client_iova, flags, name);
msm_vma_unlock(dev);
if (result != VK_SUCCESS)
memset(bo, 0, sizeof(*bo));
else
@@ -591,11 +636,13 @@ msm_bo_init_dmabuf(struct tu_device *dev,
* to happen in parallel.
*/
u_rwlock_wrlock(&dev->dma_bo_lock);
msm_vma_lock(dev);
uint32_t gem_handle;
int ret = drmPrimeFDToHandle(dev->fd, prime_fd,
&gem_handle);
if (ret) {
msm_vma_unlock(dev);
u_rwlock_wrunlock(&dev->dma_bo_lock);
return vk_error(dev, VK_ERROR_INVALID_EXTERNAL_HANDLE);
}
@@ -604,6 +651,7 @@ msm_bo_init_dmabuf(struct tu_device *dev,
if (bo->refcnt != 0) {
p_atomic_inc(&bo->refcnt);
msm_vma_unlock(dev);
u_rwlock_wrunlock(&dev->dma_bo_lock);
*out_bo = bo;
@@ -611,13 +659,14 @@ msm_bo_init_dmabuf(struct tu_device *dev,
}
VkResult result =
tu_bo_init(dev, bo, gem_handle, size, 0, TU_BO_ALLOC_NO_FLAGS, "dmabuf");
tu_bo_init(dev, bo, gem_handle, size, 0, TU_BO_ALLOC_DMABUF, "dmabuf");
if (result != VK_SUCCESS)
memset(bo, 0, sizeof(*bo));
else
*out_bo = bo;
msm_vma_unlock(dev);
u_rwlock_wrunlock(&dev->dma_bo_lock);
return result;
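The `tu_restore_from_zombie_vma_locked` helper added above handles re-importing a dma-buf whose BO is still on the zombie list: the old iova is reused, and `gem_handle` is zeroed so the deferred cleanup pass skips that entry instead of closing the GEM handle out from under the new import. A reduced sketch of the pattern, with `std::vector` standing in for `u_vector`:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy zombie entry: a BO pending deferred destruction. Real entries
// also carry a fence; that is omitted here.
struct zombie_vma { uint32_t gem_handle; uint64_t iova; };

// If the handle is still on the zombie list, hand its iova back to the
// new import and zero gem_handle so the cleanup pass skips this entry.
static bool restore_from_zombie(std::vector<zombie_vma>& zombies,
                                uint32_t gem_handle, uint64_t* iova)
{
   for (zombie_vma& vma : zombies) {
      if (vma.gem_handle == gem_handle) {
         *iova = vma.iova;
         vma.gem_handle = 0; /* mark: skip later gem close + iova free */
         return true;
      }
   }
   return false;
}
```

Zeroing the handle rather than removing the entry keeps the list stable for the in-flight cleanup walk; real GEM handles are nonzero, so 0 can safely serve as the "already reclaimed" marker.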


@@ -412,14 +412,16 @@ tu_free_zombie_vma_locked(struct tu_device *dev, bool wait)
last_signaled_fence = vma->fence;
}
set_iova(dev, vma->res_id, 0);
u_vector_remove(&dev->zombie_vmas);
struct tu_zombie_vma *vma2 = (struct tu_zombie_vma *)
u_vector_add(&vdev->zombie_vmas_stage_2);
if (vma->gem_handle) {
set_iova(dev, vma->res_id, 0);
*vma2 = *vma;
struct tu_zombie_vma *vma2 =
(struct tu_zombie_vma *) u_vector_add(&vdev->zombie_vmas_stage_2);
*vma2 = *vma;
}
}
/* And _then_ close the GEM handles: */
@@ -434,19 +436,44 @@ tu_free_zombie_vma_locked(struct tu_device *dev, bool wait)
return VK_SUCCESS;
}
static bool
tu_restore_from_zombie_vma_locked(struct tu_device *dev,
uint32_t gem_handle,
uint64_t *iova)
{
struct tu_zombie_vma *vma;
u_vector_foreach (vma, &dev->zombie_vmas) {
if (vma->gem_handle == gem_handle) {
*iova = vma->iova;
/* mark to skip later vdrm bo and iova cleanup */
vma->gem_handle = 0;
return true;
}
}
return false;
}
static VkResult
virtio_allocate_userspace_iova(struct tu_device *dev,
uint64_t size,
uint64_t client_iova,
enum tu_bo_alloc_flags flags,
uint64_t *iova)
virtio_allocate_userspace_iova_locked(struct tu_device *dev,
uint32_t gem_handle,
uint64_t size,
uint64_t client_iova,
enum tu_bo_alloc_flags flags,
uint64_t *iova)
{
VkResult result;
mtx_lock(&dev->vma_mutex);
*iova = 0;
if (flags & TU_BO_ALLOC_DMABUF) {
assert(gem_handle);
if (tu_restore_from_zombie_vma_locked(dev, gem_handle, iova))
return VK_SUCCESS;
}
tu_free_zombie_vma_locked(dev, false);
result = tu_allocate_userspace_iova(dev, size, client_iova, flags, iova);
@@ -460,8 +487,6 @@ virtio_allocate_userspace_iova(struct tu_device *dev,
result = tu_allocate_userspace_iova(dev, size, client_iova, flags, iova);
}
mtx_unlock(&dev->vma_mutex);
return result;
}
@@ -571,12 +596,8 @@ virtio_bo_init(struct tu_device *dev,
.size = size,
};
VkResult result;
result = virtio_allocate_userspace_iova(dev, size, client_iova,
flags, &req.iova);
if (result != VK_SUCCESS) {
return result;
}
uint32_t res_id;
struct tu_bo *bo;
if (mem_property & VK_MEMORY_PROPERTY_HOST_CACHED_BIT) {
if (mem_property & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) {
@@ -601,6 +622,16 @@ virtio_bo_init(struct tu_device *dev,
if (flags & TU_BO_ALLOC_GPU_READ_ONLY)
req.flags |= MSM_BO_GPU_READONLY;
assert(!(flags & TU_BO_ALLOC_DMABUF));
mtx_lock(&dev->vma_mutex);
result = virtio_allocate_userspace_iova_locked(dev, 0, size, client_iova,
flags, &req.iova);
mtx_unlock(&dev->vma_mutex);
if (result != VK_SUCCESS)
return result;
/* tunneled cmds are processed separately on host side,
* before the renderer->get_blob() callback.. the blob_id
* is used to link the created bo to the get_blob() call
@@ -611,27 +642,28 @@ virtio_bo_init(struct tu_device *dev,
vdrm_bo_create(vdev->vdrm, size, blob_flags, req.blob_id, &req.hdr);
if (!handle) {
util_vma_heap_free(&dev->vma, req.iova, size);
return vk_error(dev, VK_ERROR_OUT_OF_DEVICE_MEMORY);
result = VK_ERROR_OUT_OF_DEVICE_MEMORY;
goto fail;
}
uint32_t res_id = vdrm_handle_to_res_id(vdev->vdrm, handle);
struct tu_bo* bo = tu_device_lookup_bo(dev, res_id);
res_id = vdrm_handle_to_res_id(vdev->vdrm, handle);
bo = tu_device_lookup_bo(dev, res_id);
assert(bo && bo->gem_handle == 0);
bo->res_id = res_id;
result = tu_bo_init(dev, bo, handle, size, req.iova, flags, name);
if (result != VK_SUCCESS)
if (result != VK_SUCCESS) {
memset(bo, 0, sizeof(*bo));
else
*out_bo = bo;
goto fail;
}
*out_bo = bo;
/* We don't use bo->name here because for the !TU_DEBUG=bo case bo->name is NULL. */
tu_bo_set_kernel_name(dev, bo, name);
if (result == VK_SUCCESS &&
(mem_property & VK_MEMORY_PROPERTY_HOST_CACHED_BIT) &&
if ((mem_property & VK_MEMORY_PROPERTY_HOST_CACHED_BIT) &&
!(mem_property & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)) {
tu_bo_map(dev, bo);
@@ -644,6 +676,12 @@ virtio_bo_init(struct tu_device *dev,
tu_sync_cache_bo(dev, bo, 0, VK_WHOLE_SIZE, TU_MEM_SYNC_CACHE_TO_GPU);
}
return VK_SUCCESS;
fail:
mtx_lock(&dev->vma_mutex);
util_vma_heap_free(&dev->vma, req.iova, size);
mtx_unlock(&dev->vma_mutex);
return result;
}
@@ -666,11 +704,6 @@ virtio_bo_init_dmabuf(struct tu_device *dev,
/* iova allocation needs to consider the object's *real* size: */
size = real_size;
uint64_t iova;
result = virtio_allocate_userspace_iova(dev, size, 0, TU_BO_ALLOC_NO_FLAGS, &iova);
if (result != VK_SUCCESS)
return result;
/* Importing the same dmabuf several times would yield the same
* gem_handle. Thus there could be a race when destroying
* BO and importing the same dmabuf from different threads.
@@ -678,8 +711,10 @@ virtio_bo_init_dmabuf(struct tu_device *dev,
* to happen in parallel.
*/
u_rwlock_wrlock(&dev->dma_bo_lock);
mtx_lock(&dev->vma_mutex);
uint32_t handle, res_id;
uint64_t iova;
handle = vdrm_dmabuf_to_handle(vdrm, prime_fd);
if (!handle) {
@@ -689,6 +724,7 @@ virtio_bo_init_dmabuf(struct tu_device *dev,
res_id = vdrm_handle_to_res_id(vdrm, handle);
if (!res_id) {
/* XXX gem_handle potentially leaked here since no refcnt */
result = vk_error(dev, VK_ERROR_INVALID_EXTERNAL_HANDLE);
goto out_unlock;
}
@@ -702,21 +738,25 @@ virtio_bo_init_dmabuf(struct tu_device *dev,
goto out_unlock;
}
result = tu_bo_init(dev, bo, handle, size, iova,
TU_BO_ALLOC_NO_FLAGS, "dmabuf");
if (result != VK_SUCCESS)
memset(bo, 0, sizeof(*bo));
else
*out_bo = bo;
out_unlock:
u_rwlock_wrunlock(&dev->dma_bo_lock);
result = virtio_allocate_userspace_iova_locked(dev, handle, size, 0,
TU_BO_ALLOC_DMABUF, &iova);
if (result != VK_SUCCESS) {
mtx_lock(&dev->vma_mutex);
util_vma_heap_free(&dev->vma, iova, size);
mtx_unlock(&dev->vma_mutex);
vdrm_bo_close(dev->vdev->vdrm, handle);
goto out_unlock;
}
result =
tu_bo_init(dev, bo, handle, size, iova, TU_BO_ALLOC_NO_FLAGS, "dmabuf");
if (result != VK_SUCCESS) {
util_vma_heap_free(&dev->vma, iova, size);
memset(bo, 0, sizeof(*bo));
} else {
*out_bo = bo;
}
out_unlock:
mtx_unlock(&dev->vma_mutex);
u_rwlock_wrunlock(&dev->dma_bo_lock);
return result;
}


@@ -177,6 +177,9 @@ lower_aaline_instr(nir_builder *b, nir_instr *instr, void *data)
return false;
if (var->data.location < FRAG_RESULT_DATA0 && var->data.location != FRAG_RESULT_COLOR)
return false;
uint32_t mask = nir_intrinsic_write_mask(intrin) << var->data.location_frac;
if (!(mask & BITFIELD_BIT(3)))
return false;
nir_def *out_input = intrin->src[1].ssa;
b->cursor = nir_before_instr(instr);
@@ -223,12 +226,10 @@ lower_aaline_instr(nir_builder *b, nir_instr *instr, void *data)
tmp = nir_fmul(b, nir_channel(b, tmp, 0),
nir_fmin(b, nir_channel(b, tmp, 1), max));
tmp = nir_fmul(b, nir_channel(b, out_input, 3), tmp);
tmp = nir_fmul(b, nir_channel(b, out_input, out_input->num_components - 1), tmp);
nir_def *out = nir_vec4(b, nir_channel(b, out_input, 0),
nir_channel(b, out_input, 1),
nir_channel(b, out_input, 2),
tmp);
nir_def *out = nir_vector_insert_imm(b, out_input, tmp,
out_input->num_components - 1);
nir_src_rewrite(&intrin->src[1], out);
return true;
}
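The first hunk above bails out unless the store actually writes alpha: with scalarized outputs a store covers only some components, so its write mask must be shifted by the store's starting component (`location_frac`) before testing bit 3. A small sketch of that check:

```cpp
#include <cassert>

// Does this (possibly scalarized) output store touch the alpha channel?
// write_mask is relative to the store's first component, so shift it by
// location_frac to express it in whole-vec4 component space first.
static bool store_writes_alpha(unsigned write_mask, unsigned location_frac)
{
   unsigned mask = write_mask << location_frac;
   return (mask & (1u << 3)) != 0;
}
```

A full vec4 store (`mask 0xf, frac 0`) and a lone scalar store of `.w` (`mask 0x1, frac 3`) both pass; an `.xyz`-only store does not, so the pass correctly skips it.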


@@ -2014,6 +2014,7 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
unsigned dst_sample)
{
struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
unsigned count = 0;
struct pipe_context *pipe = ctx->base.pipe;
enum pipe_texture_target src_target = src->target;
unsigned src_samples = src->texture->nr_samples;
@@ -2038,7 +2039,7 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
/* Return if there is nothing to do. */
if (!dst_has_color && !dst_has_depth && !dst_has_stencil) {
return;
goto out;
}
bool is_scaled = dstbox->width != abs(srcbox->width) ||
@@ -2170,7 +2171,6 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
}
/* Set samplers. */
unsigned count = 0;
if (src_has_depth && src_has_stencil &&
(dst_has_color || (dst_has_depth && dst_has_stencil))) {
/* Setup two samplers, one for depth and the other one for stencil. */
@@ -2223,7 +2223,8 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
do_blits(ctx, dst, dstbox, src, src_width0, src_height0,
srcbox, dst_has_depth || dst_has_stencil, use_txf, sample0_only,
dst_sample);
util_blitter_unset_running_flag(blitter);
out:
util_blitter_restore_vertex_states(blitter);
util_blitter_restore_fragment_states(blitter);
util_blitter_restore_textures_internal(blitter, count);
@@ -2232,7 +2233,6 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
pipe->set_scissor_states(pipe, 0, 1, &ctx->base.saved_scissor);
}
util_blitter_restore_render_cond(blitter);
util_blitter_unset_running_flag(blitter);
}
void
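The fix above hoists `count` to the top of the function and turns the early `return` into `goto out`, so the no-op path still restores the saved blitter state instead of leaking it. The hoist matters: any variable read at the cleanup label must be initialized before the first `goto` that can jump over the code that would otherwise set it. A reduced sketch of the pattern (`g_saved_states` is a hypothetical stand-in for the saved state, not real u_blitter API):

```cpp
#include <cassert>

static int g_saved_states = 0; // >0 means saved state is leaked

static void save_states(void)            { g_saved_states++; }
static void restore_states(unsigned count) { (void)count; g_saved_states--; }

// Shared-cleanup shape of util_blitter_blit_generic after the fix:
// every exit path, including the "nothing to do" one, reaches `out`.
static void blit(bool nothing_to_do)
{
   save_states();
   unsigned count = 0;  /* initialized before any `goto out` */
   if (nothing_to_do)
      goto out;         /* previously a bare return -> state leak */
   count = 2;           /* samplers bound on the real blit path */
out:
   restore_states(count);
}
```

With the old early `return`, the no-op path left `g_saved_states` at 1 forever; routing it through `out` balances every `save_states()` with a `restore_states()`.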


@@ -2137,7 +2137,8 @@ static void evergreen_emit_vertex_buffers(struct r600_context *rctx,
{
struct radeon_cmdbuf *cs = &rctx->b.gfx.cs;
struct r600_fetch_shader *shader = (struct r600_fetch_shader*)rctx->vertex_fetch_shader.cso;
uint32_t dirty_mask = state->dirty_mask & shader->buffer_mask;
uint32_t buffer_mask = shader ? shader->buffer_mask : ~0;
uint32_t dirty_mask = state->dirty_mask & buffer_mask;
while (dirty_mask) {
struct pipe_vertex_buffer *vb;
@@ -2176,7 +2177,7 @@ static void evergreen_emit_vertex_buffers(struct r600_context *rctx,
radeon_emit(cs, radeon_add_to_buffer_list(&rctx->b, &rctx->b.gfx, rbuffer,
RADEON_USAGE_READ | RADEON_PRIO_VERTEX_BUFFER));
}
state->dirty_mask &= ~shader->buffer_mask;
state->dirty_mask &= ~buffer_mask;
}
static void evergreen_fs_emit_vertex_buffers(struct r600_context *rctx, struct r600_atom * atom)


@@ -239,15 +239,13 @@ static rvcn_dec_message_avc_t get_h264_msg(struct radeon_decoder *dec,
}
}
/* if reference picture exists, however no reference picture found at the end
curr_pic_ref_frame_num == 0, which is not reasonable, should be corrected. */
if (result.used_for_reference_flags && (result.curr_pic_ref_frame_num == 0)) {
for (i = 0; i < ARRAY_SIZE(result.ref_frame_list); i++) {
result.ref_frame_list[i] = pic->ref[i] ?
(uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base) : 0xff;
if (result.ref_frame_list[i] != 0xff) {
/* need at least one reference for P/B frames */
if (result.curr_pic_ref_frame_num == 0 && pic->slice_parameter.slice_info_present) {
for (i = 0; i < pic->slice_count; i++) {
if (pic->slice_parameter.slice_type[i] % 5 != 2) {
result.curr_pic_ref_frame_num++;
result.non_existing_frame_flags &= ~(1 << i);
result.ref_frame_list[0] = 0;
result.non_existing_frame_flags &= ~1;
break;
}
}
@@ -279,7 +277,8 @@ static rvcn_dec_message_avc_t get_h264_msg(struct radeon_decoder *dec,
dec->ref_codec.bts = CODEC_8_BITS;
dec->ref_codec.index = result.decoded_pic_idx;
dec->ref_codec.ref_size = 16;
memset(dec->ref_codec.ref_list, 0xff, sizeof(dec->ref_codec.ref_list));
dec->ref_codec.num_refs = result.curr_pic_ref_frame_num;
STATIC_ASSERT(sizeof(dec->ref_codec.ref_list) == sizeof(result.ref_frame_list));
memcpy(dec->ref_codec.ref_list, result.ref_frame_list, sizeof(result.ref_frame_list));
}
@@ -292,7 +291,7 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
struct pipe_h265_picture_desc *pic)
{
rvcn_dec_message_hevc_t result;
unsigned i, j;
unsigned i, j, num_refs = 0;
memset(&result, 0, sizeof(result));
result.sps_info_flags = 0;
@@ -413,9 +412,10 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
result.poc_list[i] = pic->PicOrderCntVal[i];
if (ref)
if (ref) {
ref_pic = (uintptr_t)vl_video_buffer_get_associated_data(ref, &dec->base);
else
num_refs++;
} else
ref_pic = 0x7F;
result.ref_pic_list[i] = ref_pic;
}
@@ -469,7 +469,8 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
CODEC_10_BITS : CODEC_8_BITS;
dec->ref_codec.index = result.curr_idx;
dec->ref_codec.ref_size = 15;
memset(dec->ref_codec.ref_list, 0x7f, sizeof(dec->ref_codec.ref_list));
dec->ref_codec.num_refs = num_refs;
STATIC_ASSERT(sizeof(dec->ref_codec.ref_list) == sizeof(result.ref_pic_list));
memcpy(dec->ref_codec.ref_list, result.ref_pic_list, sizeof(result.ref_pic_list));
}
return result;
@@ -507,7 +508,7 @@ static rvcn_dec_message_vp9_t get_vp9_msg(struct radeon_decoder *dec,
struct pipe_vp9_picture_desc *pic)
{
rvcn_dec_message_vp9_t result;
unsigned i ,j;
unsigned i, j, num_refs = 0;
memset(&result, 0, sizeof(result));
@@ -641,9 +642,13 @@ static rvcn_dec_message_vp9_t get_vp9_msg(struct radeon_decoder *dec,
get_current_pic_index(dec, target, &result.curr_pic_idx);
for (i = 0; i < 8; i++) {
result.ref_frame_map[i] =
(pic->ref[i]) ? (uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base)
: 0x7f;
uintptr_t ref_frame;
if (pic->ref[i]) {
ref_frame = (uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base);
num_refs++;
} else
ref_frame = 0x7f;
result.ref_frame_map[i] = ref_frame;
}
result.frame_refs[0] = result.ref_frame_map[pic->picture_parameter.pic_fields.last_ref_frame];
@@ -669,6 +674,7 @@ static rvcn_dec_message_vp9_t get_vp9_msg(struct radeon_decoder *dec,
CODEC_10_BITS : CODEC_8_BITS;
dec->ref_codec.index = result.curr_pic_idx;
dec->ref_codec.ref_size = 8;
dec->ref_codec.num_refs = num_refs;
memset(dec->ref_codec.ref_list, 0x7f, sizeof(dec->ref_codec.ref_list));
memcpy(dec->ref_codec.ref_list, result.ref_frame_map, sizeof(result.ref_frame_map));
}
@@ -959,7 +965,7 @@ static rvcn_dec_message_av1_t get_av1_msg(struct radeon_decoder *dec,
struct pipe_av1_picture_desc *pic)
{
rvcn_dec_message_av1_t result;
unsigned i, j;
unsigned i, j, num_refs = 0;
uint16_t tile_count = pic->picture_parameter.tile_cols * pic->picture_parameter.tile_rows;
memset(&result, 0, sizeof(result));
@@ -1151,9 +1157,13 @@ static rvcn_dec_message_av1_t get_av1_msg(struct radeon_decoder *dec,
result.order_hint_bits = pic->picture_parameter.order_hint_bits_minus_1 + 1;
for (i = 0; i < NUM_AV1_REFS; ++i) {
result.ref_frame_map[i] =
(pic->ref[i]) ? (uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base)
: 0x7f;
uintptr_t ref_frame;
if (pic->ref[i]) {
ref_frame = (uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base);
num_refs++;
} else
ref_frame = 0x7f;
result.ref_frame_map[i] = ref_frame;
}
for (i = 0; i < NUM_AV1_REFS_PER_FRAME; ++i)
result.frame_refs[i] = result.ref_frame_map[pic->picture_parameter.ref_frame_idx[i]];
@@ -1300,6 +1310,7 @@ static rvcn_dec_message_av1_t get_av1_msg(struct radeon_decoder *dec,
dec->ref_codec.bts = pic->picture_parameter.bit_depth_idx ? CODEC_10_BITS : CODEC_8_BITS;
dec->ref_codec.index = result.curr_pic_idx;
dec->ref_codec.ref_size = 8;
dec->ref_codec.num_refs = num_refs;
memset(dec->ref_codec.ref_list, 0x7f, sizeof(dec->ref_codec.ref_list));
memcpy(dec->ref_codec.ref_list, result.ref_frame_map, sizeof(result.ref_frame_map));
}
@@ -1816,6 +1827,7 @@ static unsigned rvcn_dec_dynamic_dpb_t2_message(struct radeon_decoder *dec, rvcn
size = size * 3 / 2;
list_for_each_entry_safe(struct rvcn_dec_dynamic_dpb_t2, d, &dec->dpb_ref_list, list) {
bool found = false;
for (i = 0; i < dec->ref_codec.ref_size; ++i) {
if (((dec->ref_codec.ref_list[i] & 0x7f) != 0x7f) && (d->index == (dec->ref_codec.ref_list[i] & 0x7f))) {
if (!dummy)
@@ -1829,10 +1841,10 @@ static unsigned rvcn_dec_dynamic_dpb_t2_message(struct radeon_decoder *dec, rvcn
dynamic_dpb_t2->dpbAddrLo[i] = addr;
dynamic_dpb_t2->dpbAddrHi[i] = addr >> 32;
++dynamic_dpb_t2->dpbArraySize;
break;
found = true;
}
}
if (i == dec->ref_codec.ref_size) {
if (!found) {
if (d->dpb.res->b.b.width0 * d->dpb.res->b.b.height0 != size) {
list_del(&d->list);
list_addtail(&d->list, &dec->dpb_unref_list);
@@ -1887,6 +1899,23 @@ static unsigned rvcn_dec_dynamic_dpb_t2_message(struct radeon_decoder *dec, rvcn
list_addtail(&dpb->list, &dec->dpb_ref_list);
}
if (dynamic_dpb_t2->dpbArraySize < dec->ref_codec.num_refs) {
struct rvcn_dec_dynamic_dpb_t2 *d =
list_first_entry(&dec->dpb_ref_list, struct rvcn_dec_dynamic_dpb_t2, list);
addr = dec->ws->buffer_get_virtual_address(d->dpb.res->buf);
if (!addr && dummy)
addr = dec->ws->buffer_get_virtual_address(dummy->dpb.res->buf);
assert(addr);
for (i = 0; i < dec->ref_codec.num_refs; ++i) {
if (dynamic_dpb_t2->dpbAddrLo[i] || dynamic_dpb_t2->dpbAddrHi[i])
continue;
dynamic_dpb_t2->dpbAddrLo[i] = addr;
dynamic_dpb_t2->dpbAddrHi[i] = addr >> 32;
++dynamic_dpb_t2->dpbArraySize;
}
assert(dynamic_dpb_t2->dpbArraySize == dec->ref_codec.num_refs);
}
dec->ws->cs_add_buffer(&dec->cs, dpb->dpb.res->buf,
RADEON_USAGE_READWRITE | RADEON_USAGE_SYNCHRONIZED, RADEON_DOMAIN_VRAM);
addr = dec->ws->buffer_get_virtual_address(dpb->dpb.res->buf);


@@ -120,6 +120,7 @@ struct radeon_decoder {
} bts;
uint8_t index;
unsigned ref_size;
unsigned num_refs;
uint8_t ref_list[16];
} ref_codec;


@@ -176,6 +176,15 @@ bool si_alloc_resource(struct si_screen *sscreen, struct si_resource *res)
util_range_set_empty(&res->valid_buffer_range);
res->TC_L2_dirty = false;
if (res->b.b.target != PIPE_BUFFER && !(res->b.b.flags & SI_RESOURCE_AUX_PLANE)) {
/* The buffer is shared with other planes. */
struct si_resource *plane = (struct si_resource *)res->b.b.next;
for (; plane; plane = (struct si_resource *)plane->b.b.next) {
radeon_bo_reference(sscreen->ws, &plane->buf, res->buf);
plane->gpu_address = res->gpu_address;
}
}
/* Print debug information. */
if (sscreen->debug_flags & DBG(VM) && res->b.b.target == PIPE_BUFFER) {
fprintf(stderr, "VM start=0x%" PRIX64 " end=0x%" PRIX64 " | Buffer %" PRIu64 " bytes | Flags: ",


@@ -1933,13 +1933,7 @@ emit_alu(struct ntv_context *ctx, nir_alu_instr *alu)
result = emit_builtin_unop(ctx, GLSLstd450PackHalf2x16, get_def_type(ctx, &alu->def, nir_type_uint), src[0]);
break;
case nir_op_unpack_64_2x32:
assert(nir_op_infos[alu->op].num_inputs == 1);
result = emit_builtin_unop(ctx, GLSLstd450UnpackDouble2x32, get_def_type(ctx, &alu->def, nir_type_uint), src[0]);
break;
BUILTIN_UNOPF(nir_op_unpack_half_2x16, GLSLstd450UnpackHalf2x16)
BUILTIN_UNOPF(nir_op_pack_64_2x32, GLSLstd450PackDouble2x32)
#undef BUILTIN_UNOP
#undef BUILTIN_UNOPF
@@ -2125,9 +2119,11 @@ emit_alu(struct ntv_context *ctx, nir_alu_instr *alu)
/* those are all simple bitcasts, we could do better, but it doesn't matter */
case nir_op_pack_32_4x8:
case nir_op_pack_32_2x16:
case nir_op_pack_64_2x32:
case nir_op_pack_64_4x16:
case nir_op_unpack_32_4x8:
case nir_op_unpack_32_2x16:
case nir_op_unpack_64_2x32:
case nir_op_unpack_64_4x16: {
result = emit_bitcast(ctx, dest_type, src[0]);
break;


@@ -309,6 +309,11 @@ zink_batch_state_destroy(struct zink_screen *screen, struct zink_batch_state *bs
util_dynarray_fini(&bs->bindless_releases[0]);
util_dynarray_fini(&bs->bindless_releases[1]);
util_dynarray_fini(&bs->acquires);
util_dynarray_fini(&bs->signal_semaphores);
util_dynarray_fini(&bs->wait_semaphores);
util_dynarray_fini(&bs->wait_semaphore_stages);
util_dynarray_fini(&bs->fd_wait_semaphores);
util_dynarray_fini(&bs->fd_wait_semaphore_stages);
util_dynarray_fini(&bs->acquire_flags);
unsigned num_mfences = util_dynarray_num_elements(&bs->fence.mfences, void *);
struct zink_tc_fence **mfence = bs->fence.mfences.data;


@@ -4846,8 +4846,11 @@ zink_resource_commit(struct pipe_context *pctx, struct pipe_resource *pres, unsi
VkSemaphore sem = VK_NULL_HANDLE;
bool ret = zink_bo_commit(ctx, res, level, box, commit, &sem);
if (ret) {
if (sem)
if (sem) {
zink_batch_add_wait_semaphore(&ctx->batch, sem);
zink_batch_reference_resource_rw(&ctx->batch, res, true);
ctx->batch.has_work = true;
}
} else {
check_device_lost(ctx);
}


@@ -513,6 +513,7 @@ namespace {
LLVMRunPasses(wrap(&mod), opt_str, tm, opts);
LLVMDisposeTargetMachine(tm);
LLVMDisposePassBuilderOptions(opts);
}
std::unique_ptr<Module>


@@ -2385,7 +2385,7 @@ dri2_init_screen(struct dri_screen *screen)
pscreen = pipe_loader_create_screen(screen->dev);
if (!pscreen)
goto fail;
return NULL;
dri_init_options(screen);
screen->throttle = pscreen->get_param(pscreen, PIPE_CAP_THROTTLE);
@@ -2419,7 +2419,7 @@ dri2_init_screen(struct dri_screen *screen)
return configs;
fail:
dri_release_screen(screen);
pipe_loader_release(&screen->dev, 1);
return NULL;
}


@@ -546,6 +546,8 @@ drisw_init_screen(struct dri_screen *screen)
struct pipe_screen *pscreen = NULL;
const struct drisw_loader_funcs *lf = &drisw_lf;
(void) mtx_init(&screen->opencl_func_mutex, mtx_plain);
screen->swrast_no_present = debug_get_option_swrast_no_present();
if (loader->base.version >= 4) {
@@ -565,7 +567,7 @@ drisw_init_screen(struct dri_screen *screen)
pscreen = pipe_loader_create_screen(screen->dev);
if (!pscreen)
goto fail;
return NULL;
dri_init_options(screen);
configs = dri_init_screen(screen, pscreen);
@@ -593,7 +595,7 @@ drisw_init_screen(struct dri_screen *screen)
return configs;
fail:
dri_release_screen(screen);
pipe_loader_release(&screen->dev, 1);
return NULL;
}


@@ -115,6 +115,8 @@ kopper_init_screen(struct dri_screen *screen)
const __DRIconfig **configs;
struct pipe_screen *pscreen = NULL;
(void) mtx_init(&screen->opencl_func_mutex, mtx_plain);
if (!screen->kopper_loader) {
fprintf(stderr, "mesa: Kopper interface not found!\n"
" Ensure the versions of %s built with this version of Zink are\n"
@@ -134,7 +136,7 @@ kopper_init_screen(struct dri_screen *screen)
pscreen = pipe_loader_create_screen(screen->dev);
if (!pscreen)
goto fail;
return NULL;
dri_init_options(screen);
screen->unwrapped_screen = trace_screen_unwrap(pscreen);
@@ -167,7 +169,7 @@ kopper_init_screen(struct dri_screen *screen)
return configs;
fail:
dri_release_screen(screen);
pipe_loader_release(&screen->dev, 1);
return NULL;
}


@@ -10,6 +10,7 @@ use mesa_rust_util::static_assert;
use rusticl_opencl_gen::*;
use std::collections::HashSet;
use std::mem;
use std::slice;
use std::sync::Arc;
use std::sync::Condvar;
@@ -272,6 +273,27 @@ impl Event {
}
}
impl Drop for Event {
// implement drop in order to prevent stack overflows of long dependency chains.
//
// This abuses the fact that `Arc::into_inner` only succeeds when there is one strong reference
// so we turn a recursive drop chain into a drop list for events having no other references.
fn drop(&mut self) {
if self.deps.is_empty() {
return;
}
let mut deps_list = vec![mem::take(&mut self.deps)];
while let Some(deps) = deps_list.pop() {
for dep in deps {
if let Some(mut dep) = Arc::into_inner(dep) {
deps_list.push(mem::take(&mut dep.deps));
}
}
}
}
}
// TODO worker thread per device
// Condvar to wait on new events to work on
// notify condvar when flushing queue events to worker


@@ -658,6 +658,7 @@ impl PipeContext {
impl Drop for PipeContext {
fn drop(&mut self) {
self.flush().wait();
unsafe {
self.pipe.as_ref().destroy.unwrap()(self.pipe.as_ptr());
}


@@ -1033,7 +1033,8 @@ vlVaRenderPicture(VADriverContextP ctx, VAContextID context_id, VABufferID *buff
case VASliceDataBufferType:
vaStatus = handleVASliceDataBufferType(context, buf);
slice_offset += buf->size;
if (slice_idx)
slice_offset += buf->size;
break;
case VAProcPipelineParameterBufferType:


@@ -186,6 +186,7 @@ void vlVaHandleSliceParameterBufferH264(vlVaContext *context, vlVaBuffer *buf)
assert(context->desc.h264.slice_count < max_pipe_h264_slices);
context->desc.h264.slice_parameter.slice_info_present = true;
context->desc.h264.slice_parameter.slice_type[context->desc.h264.slice_count] = h264->slice_type;
context->desc.h264.slice_parameter.slice_data_size[context->desc.h264.slice_count] = h264->slice_data_size;
context->desc.h264.slice_parameter.slice_data_offset[context->desc.h264.slice_count] = h264->slice_data_offset;


@@ -411,6 +411,7 @@ struct pipe_h264_picture_desc
{
bool slice_info_present;
uint32_t slice_count;
uint8_t slice_type[128];
uint32_t slice_data_size[128];
uint32_t slice_data_offset[128];
enum pipe_slice_buffer_placement_type slice_data_flag[128];


@@ -28,5 +28,5 @@ libi915drm = static_library(
inc_include, inc_src, inc_gallium, inc_gallium_aux, inc_gallium_drivers
],
link_with : [libintel_common],
dependencies : [dep_libdrm, dep_libdrm_intel],
dependencies : [dep_libdrm, dep_libdrm_intel, idep_intel_dev_wa],
)


@@ -32,7 +32,9 @@
#include <dlfcn.h>
#include "dri_common.h"
#include "drisw_priv.h"
#ifdef HAVE_DRI3
#include "dri3_priv.h"
#endif
#include <X11/extensions/shmproto.h>
#include <assert.h>
#include <vulkan/vulkan_core.h>


@@ -908,9 +908,11 @@ __glXInitialize(Display * dpy)
#endif /* HAVE_DRI3 */
if (!debug_get_bool_option("LIBGL_DRI2_DISABLE", false))
dpyPriv->dri2Display = dri2CreateDisplay(dpy);
#if defined(HAVE_ZINK)
if (!dpyPriv->dri3Display && !dpyPriv->dri2Display)
try_zink = !debug_get_bool_option("LIBGL_KOPPER_DISABLE", false) &&
!getenv("GALLIUM_DRIVER");
#endif /* HAVE_ZINK */
}
#endif /* GLX_USE_DRM */
if (glx_direct)


@@ -743,7 +743,7 @@ build_desc_addr_for_res_index(nir_builder *b,
static nir_def *
build_desc_addr_for_binding(nir_builder *b,
unsigned set, unsigned binding,
nir_def *array_index,
nir_def *array_index, unsigned plane,
const struct apply_pipeline_layout_state *state)
{
const struct anv_descriptor_set_binding_layout *bind_layout =
@@ -759,6 +759,10 @@ build_desc_addr_for_binding(nir_builder *b,
array_index,
bind_layout->descriptor_surface_stride),
bind_layout->descriptor_surface_offset);
if (plane != 0) {
desc_offset = nir_iadd_imm(
b, desc_offset, plane * bind_layout->descriptor_data_surface_size);
}
return nir_vec4(b, nir_unpack_64_2x32_split_x(b, set_addr),
nir_unpack_64_2x32_split_y(b, set_addr),
@@ -766,14 +770,21 @@ build_desc_addr_for_binding(nir_builder *b,
desc_offset);
}
case nir_address_format_32bit_index_offset:
case nir_address_format_32bit_index_offset: {
nir_def *desc_offset =
nir_iadd_imm(b,
nir_imul_imm(b,
array_index,
bind_layout->descriptor_surface_stride),
bind_layout->descriptor_surface_offset);
if (plane != 0) {
desc_offset = nir_iadd_imm(
b, desc_offset, plane * bind_layout->descriptor_data_surface_size);
}
return nir_vec2(b,
nir_imm_int(b, state->set[set].desc_offset),
nir_iadd_imm(b,
nir_imul_imm(b,
array_index,
bind_layout->descriptor_surface_stride),
bind_layout->descriptor_surface_offset));
desc_offset);
}
default:
unreachable("Unhandled address format");
@@ -827,7 +838,8 @@ build_surface_index_for_binding(nir_builder *b,
set_offset = nir_imm_int(b, 0xdeaddead);
nir_def *desc_addr =
build_desc_addr_for_binding(b, set, binding, array_index, state);
build_desc_addr_for_binding(b, set, binding, array_index,
plane, state);
surface_index =
build_load_descriptor_mem(b, desc_addr, 0, 1, 32, state);
@@ -908,7 +920,8 @@ build_sampler_handle_for_binding(nir_builder *b,
set_offset = nir_imm_int(b, 0xdeaddead);
nir_def *desc_addr =
build_desc_addr_for_binding(b, set, binding, array_index, state);
build_desc_addr_for_binding(b, set, binding, array_index,
plane, state);
/* This is anv_sampled_image_descriptor, the sampler handle is always
* in component 1.
@@ -1384,7 +1397,8 @@ lower_load_accel_struct_desc(nir_builder *b,
struct res_index_defs res = unpack_res_index(b, res_index);
nir_def *desc_addr =
build_desc_addr_for_binding(b, set, binding, res.array_index, state);
build_desc_addr_for_binding(b, set, binding, res.array_index,
0 /* plane */, state);
/* Acceleration structure descriptors are always uint64_t */
nir_def *desc = build_load_descriptor_mem(b, desc_addr, 0, 1, 64, state);
@@ -1613,7 +1627,7 @@ lower_image_size_intrinsic(nir_builder *b, nir_intrinsic_instr *intrin,
}
nir_def *desc_addr = build_desc_addr_for_binding(
b, set, binding, array_index, state);
b, set, binding, array_index, 0 /* plane */, state);
b->cursor = nir_after_instr(&intrin->instr);


@@ -3180,6 +3180,12 @@ struct anv_push_constants {
/** Dynamic offsets for dynamic UBOs and SSBOs */
uint32_t dynamic_offsets[MAX_DYNAMIC_BUFFERS];
/* Robust access pushed registers. */
uint64_t push_reg_mask[MESA_SHADER_STAGES];
/** Ray query globals (RT_DISPATCH_GLOBALS) */
uint64_t ray_query_globals;
union {
struct {
/** Dynamic MSAA value */
@@ -3200,16 +3206,12 @@ struct anv_push_constants {
*
* This is never set by software but is implicitly filled out when
* uploading the push constants for compute shaders.
*
* This *MUST* be the last field of the anv_push_constants structure.
*/
uint32_t subgroup_id;
} cs;
};
/* Robust access pushed registers. */
uint64_t push_reg_mask[MESA_SHADER_STAGES];
/** Ray query globals (RT_DISPATCH_GLOBALS) */
uint64_t ray_query_globals;
};
struct anv_surface_state {


@@ -1438,13 +1438,15 @@ nvk_flush_ms_state(struct nvk_cmd_buffer *cmd)
});
}
if (BITSET_TEST(dyn->dirty, MESA_VK_DYNAMIC_MS_SAMPLE_LOCATIONS) ||
if (BITSET_TEST(dyn->dirty, MESA_VK_DYNAMIC_MS_RASTERIZATION_SAMPLES) ||
BITSET_TEST(dyn->dirty, MESA_VK_DYNAMIC_MS_SAMPLE_LOCATIONS) ||
BITSET_TEST(dyn->dirty, MESA_VK_DYNAMIC_MS_SAMPLE_LOCATIONS_ENABLE)) {
const struct vk_sample_locations_state *sl;
if (dyn->ms.sample_locations_enable) {
sl = dyn->ms.sample_locations;
} else {
sl = vk_standard_sample_locations_state(dyn->ms.rasterization_samples);
const uint32_t samples = MAX2(1, dyn->ms.rasterization_samples);
sl = vk_standard_sample_locations_state(samples);
}
for (uint32_t i = 0; i < sl->per_pixel; i++) {


@@ -130,6 +130,7 @@ nvk_meta_end(struct nvk_cmd_buffer *cmd,
{
if (save->desc0) {
cmd->state.gfx.descriptors.sets[0] = save->desc0;
cmd->state.gfx.descriptors.set_sizes[0] = save->desc0->size;
cmd->state.gfx.descriptors.root.sets[0] = nvk_descriptor_set_addr(save->desc0);
cmd->state.gfx.descriptors.sets_dirty |= BITFIELD_BIT(0);
cmd->state.gfx.descriptors.push_dirty &= ~BITFIELD_BIT(0);


@@ -10,6 +10,9 @@
#include <sys/mman.h>
#include <xf86drm.h>
#include "nvidia/classes/cl9097.h"
#include "nvidia/classes/clc597.h"
static void
bo_bind(struct nouveau_ws_device *dev,
uint32_t handle, uint64_t addr,
@@ -170,9 +173,10 @@ nouveau_ws_bo_new_mapped(struct nouveau_ws_device *dev,
}
static struct nouveau_ws_bo *
nouveau_ws_bo_new_locked(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
enum nouveau_ws_bo_flags flags)
nouveau_ws_bo_new_tiled_locked(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
uint8_t pte_kind, uint16_t tile_mode,
enum nouveau_ws_bo_flags flags)
{
struct drm_nouveau_gem_new req = {};
@@ -205,6 +209,9 @@ nouveau_ws_bo_new_locked(struct nouveau_ws_device *dev,
if (flags & NOUVEAU_WS_BO_NO_SHARE)
req.info.domain |= NOUVEAU_GEM_DOMAIN_NO_SHARE;
req.info.tile_flags = (uint32_t)pte_kind << 8;
req.info.tile_mode = tile_mode;
req.info.size = size;
req.align = align;
@@ -242,19 +249,29 @@ fail_gem_new:
}
struct nouveau_ws_bo *
nouveau_ws_bo_new(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
enum nouveau_ws_bo_flags flags)
nouveau_ws_bo_new_tiled(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
uint8_t pte_kind, uint16_t tile_mode,
enum nouveau_ws_bo_flags flags)
{
struct nouveau_ws_bo *bo;
simple_mtx_lock(&dev->bos_lock);
bo = nouveau_ws_bo_new_locked(dev, size, align, flags);
bo = nouveau_ws_bo_new_tiled_locked(dev, size, align,
pte_kind, tile_mode, flags);
simple_mtx_unlock(&dev->bos_lock);
return bo;
}
struct nouveau_ws_bo *
nouveau_ws_bo_new(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
enum nouveau_ws_bo_flags flags)
{
return nouveau_ws_bo_new_tiled(dev, size, align, 0, 0, flags);
}
static struct nouveau_ws_bo *
nouveau_ws_bo_from_dma_buf_locked(struct nouveau_ws_device *dev, int fd)
{
@@ -265,8 +282,11 @@ nouveau_ws_bo_from_dma_buf_locked(struct nouveau_ws_device *dev, int fd)
struct hash_entry *entry =
_mesa_hash_table_search(dev->bos, (void *)(uintptr_t)handle);
if (entry != NULL)
return entry->data;
if (entry != NULL) {
struct nouveau_ws_bo *bo = entry->data;
nouveau_ws_bo_ref(bo);
return bo;
}
/*
* If we got here, no BO exists for the retrieved handle. If we error


@@ -68,6 +68,11 @@ struct nouveau_ws_bo *nouveau_ws_bo_new_mapped(struct nouveau_ws_device *,
enum nouveau_ws_bo_flags,
enum nouveau_ws_bo_map_flags map_flags,
void **map_out);
struct nouveau_ws_bo *nouveau_ws_bo_new_tiled(struct nouveau_ws_device *,
uint64_t size, uint64_t align,
uint8_t pte_kind,
uint16_t tile_mode,
enum nouveau_ws_bo_flags);
struct nouveau_ws_bo *nouveau_ws_bo_from_dma_buf(struct nouveau_ws_device *,
int fd);
void nouveau_ws_bo_destroy(struct nouveau_ws_bo *);


@@ -380,3 +380,13 @@ nouveau_ws_device_destroy(struct nouveau_ws_device *device)
close(device->fd);
FREE(device);
}
bool
nouveau_ws_device_has_tiled_bo(struct nouveau_ws_device *device)
{
uint64_t has = 0;
if (nouveau_ws_param(device->fd, NOUVEAU_GETPARAM_HAS_VMA_TILEMODE, &has))
return false;
return has != 0;
}


@@ -64,6 +64,8 @@ struct nouveau_ws_device {
struct nouveau_ws_device *nouveau_ws_device_new(struct _drmDevice *drm_device);
void nouveau_ws_device_destroy(struct nouveau_ws_device *);
bool nouveau_ws_device_has_tiled_bo(struct nouveau_ws_device *device);
#ifdef __cplusplus
}
#endif


@@ -208,5 +208,9 @@ Application bugs worked around in this file:
<application name="Half-Life Alyx" application_name_match="hlvr">
<option name="dual_color_blend_by_location" value="true" />
</application>
<application name="Enshrouded" executable="enshrouded.exe">
<option name="radv_zero_vram" value="true"/>
</application>
</device>
</driconf>


@@ -209,7 +209,8 @@ __bitset_shl(BITSET_WORD *x, unsigned amount, unsigned n)
*/
#define BITSET_TEST_RANGE_INSIDE_WORD(x, b, e, mask) \
(BITSET_BITWORD(b) == BITSET_BITWORD(e) ? \
(((x)[BITSET_BITWORD(b)] & BITSET_RANGE(b, e)) == mask) : \
(((x)[BITSET_BITWORD(b)] & BITSET_RANGE(b, e)) == \
(((BITSET_WORD)mask) << (b % BITSET_WORDBITS))) : \
(assert (!"BITSET_TEST_RANGE: bit range crosses word boundary"), 0))
#define BITSET_SET_RANGE_INSIDE_WORD(x, b, e) \
(BITSET_BITWORD(b) == BITSET_BITWORD(e) ? \


@@ -532,7 +532,7 @@ wsi_create_native_image_mem(const struct wsi_swapchain *chain,
for (uint32_t p = 0; p < image->num_planes; p++) {
const VkImageSubresource image_subresource = {
.aspectMask = VK_IMAGE_ASPECT_PLANE_0_BIT << p,
.aspectMask = VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT_EXT << p,
.mipLevel = 0,
.arrayLayer = 0,
};


@@ -41,6 +41,15 @@ endif
if rc.version().version_compare('< 1.57')
rust_args += ['--cfg', 'no_is_available']
endif
if rc.version().version_compare('< 1.66')
rust_args += ['--cfg', 'no_source_text']
endif
if rc.version().version_compare('< 1.79')
rust_args += [
'--cfg', 'no_literal_byte_character',
'--cfg', 'no_literal_c_string',
]
endif
u_ind = subproject('unicode-ident').get_variable('lib')