Compare commits

56 Commits
mesa-24.0. ... mesa-24.0.

SHA1:

d6695f1641
8f3dfb0aaa
c15886ec43
8c913751a2
16c64c184c
d5675923b3
0fd4f15601
c3f11a4011
ea6c84849d
93572f4e31
450ad166c0
331d440811
1bf184747e
30ddd43e6d
11f595f5e7
eac0e52cdb
a0e1b5f436
2c1cbf296e
699ec9c4c7
352d44ce5a
69c7b25037
1aaec51f56
d3b8a28357
368892e9e2
ba4462df44
115598022a
20595e465b
6eaf495a19
5dbc8d493d
0598097e8a
74802851d2
028dc8957f
c291a73202
c2739fefe3
e462e3cc39
abf8b28b65
9eb14991f9
c64129a0bd
ca6431d9d7
df810add64
d68141bd68
0a312787cd
ce7e1ca1fa
254b300f6b
6d23f70e79
9343ede6a2
5be42f6982
020d145f4a
cb375dfe03
57198d2ca9
90012f1f66
38485a49f4
33d6e6f9a2
6fc67be3a5
58dfa780c1
4047c51834

.pick_status.json (6490 lines changed; file diff suppressed because it is too large)
@@ -3,6 +3,7 @@ Release Notes

The release notes summarize what's new or changed in each Mesa release.

- :doc:`24.0.8 release notes <relnotes/24.0.8>`
- :doc:`24.0.7 release notes <relnotes/24.0.7>`
- :doc:`24.0.6 release notes <relnotes/24.0.6>`
- :doc:`24.0.5 release notes <relnotes/24.0.5>`

@@ -415,6 +416,7 @@ The release notes summarize what's new or changed in each Mesa release.

:maxdepth: 1
:hidden:

24.0.8 <relnotes/24.0.8>
24.0.7 <relnotes/24.0.7>
24.0.6 <relnotes/24.0.6>
24.0.5 <relnotes/24.0.5>

@@ -19,7 +19,7 @@ SHA256 checksum

::

TBD.
7454425f1ed4a6f1b5b107e1672b30c88b22ea0efea000ae2c7d96db93f6c26a mesa-24.0.7.tar.xz

New features

docs/relnotes/24.0.8.rst (new file, 155 lines)
@@ -0,0 +1,155 @@

Mesa 24.0.8 Release Notes / 2024-05-22
======================================

Mesa 24.0.8 is a bug fix release which fixes bugs found since the 24.0.7 release.

Mesa 24.0.8 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.

Mesa 24.0.8 implements the Vulkan 1.3 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
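Because the reported version depends on the driver, applications query it at runtime rather than assuming 4.6 / 1.3. The following is a minimal illustrative sketch (not part of the release notes): it assumes a current OpenGL context has already been created and a VkPhysicalDevice has already been enumerated, and uses only standard GL/Vulkan entry points.

```cpp
// Read back the versions a driver actually reports.
// Assumes a current GL context and an enumerated VkPhysicalDevice;
// context/instance creation is omitted for brevity.
#include <cstdio>
#include <GL/gl.h>
#include <GL/glext.h>       // GL_MAJOR_VERSION/GL_MINOR_VERSION on older gl.h
#include <vulkan/vulkan.h>  // VK_API_VERSION_* macros need reasonably recent headers

void print_gl_version()
{
   // GL_VERSION works on any context; the integer queries need a 3.0+ context.
   printf("GL_VERSION: %s\n", (const char *)glGetString(GL_VERSION));

   GLint major = 0, minor = 0;
   glGetIntegerv(GL_MAJOR_VERSION, &major);
   glGetIntegerv(GL_MINOR_VERSION, &minor);
   printf("Reported context version: %d.%d\n", major, minor);
}

void print_vk_version(VkPhysicalDevice phys_dev)
{
   VkPhysicalDeviceProperties props;
   vkGetPhysicalDeviceProperties(phys_dev, &props);
   printf("apiVersion: %u.%u.%u\n",
          VK_API_VERSION_MAJOR(props.apiVersion),
          VK_API_VERSION_MINOR(props.apiVersion),
          VK_API_VERSION_PATCH(props.apiVersion));
}
```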
SHA256 checksum
---------------

::

   TBD.


New features
------------

- None


Bug fixes
---------

- [24.1-rc4] fatal error: intel/dev/intel_wa.h: No such file or directory
- vcn: rewinding attached video in Totem cause [mmhub] page fault
- When using amd gpu deinterlace, tv bt709 properties mapping to 2 chroma
- VCN decoding freezes the whole system
- [RDNA2 [AV1] [VAAPI] hw decoding glitches in Thorium 123.0.6312.133 after https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28960
- WSI: Support VK_IMAGE_ASPECT_MEMORY_PLANE_i_BIT_EXT for DRM Modifiers in Vulkan
- radv: Enshrouded GPU hang on RX 6800
- NVK Zink: Wrong color in Unigine Valley benchmark
- [anv] FINISHME: support YUV colorspace with DRM format modifiers
- 24.0.6: build fails


Changes
-------

Antoine Coutant (1):

- drisw: fix build without dri3

Bas Nieuwenhuizen (1):

- radv: Use zerovram for Enshrouded.

David Heidelberg (2):

- freedreno/ci: move the disabled jobs from include to the main file
- winsys/i915: depends on intel_wa.h

David Rosca (6):

- frontends/va: Only increment slice offset after first slice parameters
- radeonsi: Update buffer for other planes in si_alloc_resource
- frontends/va: Store slice types for H264 decode
- radeonsi/vcn: Ensure DPB has as many buffers as references
- radeonsi/vcn: Allow duplicate buffers in DPB
- radeonsi/vcn: Ensure at least one reference for H264 P/B frames

Eric Engestrom (5):

- docs: add sha256sum for 24.0.7
- .pick_status.json: Update to 18c53157318d6c8e572062f6bb768dfb621a55fd
- .pick_status.json: Update to e154f90aa9e71cc98375866c3ab24c4e08e66cb7
- .pick_status.json: Mark ae8fbe220ae67ffdce662c26bc4a634d475c0389 as denominated
- .pick_status.json: Update to a31996ce5a6b7eb3b324b71eb9e9c45173953c50

Faith Ekstrand (6):

- nvk: Re-emit sample locations when rasterization samples changes
- nvk/meta: Restore set_sizes[0]
- nouveau/winsys: Take a reference to BOs found in the cache
- drm-uapi: Sync nouveau_drm.h
- nouveau/winsys: Add back nouveau_ws_bo_new_tiled()
- vulkan/wsi: Bind memory planes, not YCbCr planes.

Friedrich Vock (2):

- aco/tests: Insert p_logical_start/end in reduce_temp tests
- aco/spill: Insert p_start_linear_vgpr right after p_logical_end

Georg Lehmann (1):

- zink: use bitcasts instead of pack/unpack double opcodes

José Expósito (1):

- meson: Update proc_macro2 meson.build patch

Karol Herbst (5):

- rusticl/event: use Weak refs for dependencies
- Revert "rusticl/event: use Weak refs for dependencies"
- event: break long dependency chains on drop
- rusticl/mesa/context: flush context before destruction
- nir/lower_cl_images: set binding also for samplers

Konstantin Seurer (3):

- radv: Fix radv_shader_arena_block list corruption
- radv: Remove arenas from capture_replay_arena_vas
- radv: Zero initialize capture replay group handles

Lionel Landwerlin (3):

- anv: fix ycbcr plane indexing with indirect descriptors
- anv: fix push constant subgroup_id location
- nir/divergence: add missing load_printf_buffer_address

Marek Olšák (1):

- util: shift the mask in BITSET_TEST_RANGE_INSIDE_WORD to be relative to b

Mike Blumenkrantz (8):

- egl/x11: disable dri3 with LIBGL_KOPPER_DRI2=1 as expected
- zink: add a batch ref for committed sparse resources
- u_blitter: stop leaking saved blitter states on no-op blits
- frontends/dri: only release pipe when screen init fails
- frontends/dri: always init opencl_func_mutex in InitScreen hooks
- zink: clean up semaphore arrays on batch state destroy
- nir/lower_aaline: fix for scalarized outputs
- nir/linking: fix nir_assign_io_var_locations for scalarized dual blend

Patrick Lerda (2):

- clover: fix memory leak related to optimize
- r600: fix vertex state update clover regression

Rhys Perry (1):

- aco/waitcnt: fix DS/VMEM ordered writes when mixed

Romain Naour (1):

- glxext: don't try zink if not enabled in mesa

Yiwei Zhang (5):

- turnip: msm: clean up iova on error path
- turnip: msm: fix racy gem close for re-imported dma-buf
- turnip: virtio: fix error path in virtio_bo_init
- turnip: virtio: fix iova leak upon found already imported dmabuf
- turnip: virtio: fix racy gem close for re-imported dma-buf
@@ -54,11 +54,42 @@ extern "C" {
 */
#define NOUVEAU_GETPARAM_EXEC_PUSH_MAX 17

/*
 * NOUVEAU_GETPARAM_VRAM_BAR_SIZE - query bar size
 *
 * Query the VRAM BAR size.
 */
#define NOUVEAU_GETPARAM_VRAM_BAR_SIZE 18

/*
 * NOUVEAU_GETPARAM_VRAM_USED
 *
 * Get remaining VRAM size.
 */
#define NOUVEAU_GETPARAM_VRAM_USED 19

/*
 * NOUVEAU_GETPARAM_HAS_VMA_TILEMODE
 *
 * Query whether tile mode and PTE kind are accepted with VM allocs or not.
 */
#define NOUVEAU_GETPARAM_HAS_VMA_TILEMODE 20

struct drm_nouveau_getparam {
	__u64 param;
	__u64 value;
};

/*
 * Those are used to support selecting the main engine used on Kepler.
 * This goes into drm_nouveau_channel_alloc::tt_ctxdma_handle
 */
#define NOUVEAU_FIFO_ENGINE_GR 0x01
#define NOUVEAU_FIFO_ENGINE_VP 0x02
#define NOUVEAU_FIFO_ENGINE_PPP 0x04
#define NOUVEAU_FIFO_ENGINE_BSP 0x08
#define NOUVEAU_FIFO_ENGINE_CE 0x30

struct drm_nouveau_channel_alloc {
	__u32 fb_ctxdma_handle;
	__u32 tt_ctxdma_handle;

@@ -81,6 +112,18 @@ struct drm_nouveau_channel_free {
	__s32 channel;
};

struct drm_nouveau_notifierobj_alloc {
	__u32 channel;
	__u32 handle;
	__u32 size;
	__u32 offset;
};

struct drm_nouveau_gpuobj_free {
	__s32 channel;
	__u32 handle;
};

#define NOUVEAU_GEM_DOMAIN_CPU (1 << 0)
#define NOUVEAU_GEM_DOMAIN_VRAM (1 << 1)
#define NOUVEAU_GEM_DOMAIN_GART (1 << 2)

@@ -238,34 +281,32 @@ struct drm_nouveau_vm_init {
struct drm_nouveau_vm_bind_op {
	/**
	 * @op: the operation type
	 *
	 * Supported values:
	 *
	 * %DRM_NOUVEAU_VM_BIND_OP_MAP - Map a GEM object to the GPU's VA
	 * space. Optionally, the &DRM_NOUVEAU_VM_BIND_SPARSE flag can be
	 * passed to instruct the kernel to create sparse mappings for the
	 * given range.
	 *
	 * %DRM_NOUVEAU_VM_BIND_OP_UNMAP - Unmap an existing mapping in the
	 * GPU's VA space. If the region the mapping is located in is a
	 * sparse region, new sparse mappings are created where the unmapped
	 * (memory backed) mapping was mapped previously. To remove a sparse
	 * region the &DRM_NOUVEAU_VM_BIND_SPARSE must be set.
	 */
	__u32 op;
	/**
	 * @DRM_NOUVEAU_VM_BIND_OP_MAP:
	 *
	 * Map a GEM object to the GPU's VA space. Optionally, the
	 * &DRM_NOUVEAU_VM_BIND_SPARSE flag can be passed to instruct the kernel to
	 * create sparse mappings for the given range.
	 */
#define DRM_NOUVEAU_VM_BIND_OP_MAP 0x0
	/**
	 * @DRM_NOUVEAU_VM_BIND_OP_UNMAP:
	 *
	 * Unmap an existing mapping in the GPU's VA space. If the region the mapping
	 * is located in is a sparse region, new sparse mappings are created where the
	 * unmapped (memory backed) mapping was mapped previously. To remove a sparse
	 * region the &DRM_NOUVEAU_VM_BIND_SPARSE must be set.
	 */
#define DRM_NOUVEAU_VM_BIND_OP_UNMAP 0x1
	/**
	 * @flags: the flags for a &drm_nouveau_vm_bind_op
	 *
	 * Supported values:
	 *
	 * %DRM_NOUVEAU_VM_BIND_SPARSE - Indicates that an allocated VA
	 * space region should be sparse.
	 */
	__u32 flags;
	/**
	 * @DRM_NOUVEAU_VM_BIND_SPARSE:
	 *
	 * Indicates that an allocated VA space region should be sparse.
	 */
#define DRM_NOUVEAU_VM_BIND_SPARSE (1 << 8)
	/**
	 * @handle: the handle of the DRM GEM object to map

@@ -301,17 +342,17 @@ struct drm_nouveau_vm_bind {
	__u32 op_count;
	/**
	 * @flags: the flags for a &drm_nouveau_vm_bind ioctl
	 *
	 * Supported values:
	 *
	 * %DRM_NOUVEAU_VM_BIND_RUN_ASYNC - Indicates that the given VM_BIND
	 * operation should be executed asynchronously by the kernel.
	 *
	 * If this flag is not supplied the kernel executes the associated
	 * operations synchronously and doesn't accept any &drm_nouveau_sync
	 * objects.
	 */
	__u32 flags;
	/**
	 * @DRM_NOUVEAU_VM_BIND_RUN_ASYNC:
	 *
	 * Indicates that the given VM_BIND operation should be executed asynchronously
	 * by the kernel.
	 *
	 * If this flag is not supplied the kernel executes the associated operations
	 * synchronously and doesn't accept any &drm_nouveau_sync objects.
	 */
#define DRM_NOUVEAU_VM_BIND_RUN_ASYNC 0x1
	/**
	 * @wait_count: the number of wait &drm_nouveau_syncs
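The synced header above adds three new getparam queries (VRAM BAR size, VRAM used, VMA tile-mode support). Userspace reads any of these through the existing DRM_IOCTL_NOUVEAU_GETPARAM ioctl. A minimal sketch using libdrm is shown below; it assumes headers new enough to carry the added defines and, for brevity, that the first render node belongs to nouveau.

```cpp
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>
#include <xf86drm.h>       // drmIoctl(), from libdrm
#include <nouveau_drm.h>   // the uAPI header shown above (libdrm include path)

// Query one NOUVEAU_GETPARAM_* value; returns 0 on success.
static int nouveau_getparam(int fd, uint64_t param, uint64_t *value)
{
   struct drm_nouveau_getparam gp = {};
   gp.param = param;
   int ret = drmIoctl(fd, DRM_IOCTL_NOUVEAU_GETPARAM, &gp);
   if (ret == 0)
      *value = gp.value;
   return ret;
}

int main()
{
   // Assumption: renderD128 is a nouveau device; real code would enumerate
   // DRM devices and check the driver name first.
   int fd = open("/dev/dri/renderD128", O_RDWR | O_CLOEXEC);
   if (fd < 0)
      return 1;

   uint64_t bar_size;
   if (nouveau_getparam(fd, NOUVEAU_GETPARAM_VRAM_BAR_SIZE, &bar_size) == 0)
      printf("VRAM BAR size: %llu bytes\n", (unsigned long long)bar_size);

   close(fd);
   return 0;
}
```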
@@ -10,7 +10,4 @@ dEQP-VK.draw.renderpass.multi_draw.mosaic.indexed_mixed.max_draws.stride_extra_1
dEQP-VK.pipeline.*line_stipple_enable
dEQP-VK.pipeline.*line_stipple_params

# New CTS flakes in 1.3.6.3
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.(single|multi)threaded_compilation.*_check_(all|capture_replay)_handles

dEQP-VK.query_pool.statistics_query.host_query_reset.geometry_shader_invocations.secondary.32bits_triangle_list_clear_depth

@@ -1,5 +0,0 @@
# New CTS flakes in 1.3.6.3
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.multithreaded_compilation.*_check_all_handles
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.multithreaded_compilation.*_check_capture_replay_handles
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.singlethreaded_compilation.*_check_all_handles
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.singlethreaded_compilation.*_check_capture_replay_handles
@@ -428,18 +428,20 @@ check_instr(wait_ctx& ctx, wait_imm& wait, alu_delay_info& delay, Instruction* i
if (it == ctx.gpr_map.end())
continue;

wait_imm reg_imm = it->second.imm;

/* Vector Memory reads and writes return in the order they were issued */
uint8_t vmem_type = get_vmem_type(instr);
if (vmem_type && ((it->second.events & vm_events) == event_vmem) &&
it->second.vmem_types == vmem_type)
continue;
reg_imm.vm = wait_imm::unset_counter;

/* LDS reads and writes return in the order they were issued. same for GDS */
if (instr->isDS() &&
(it->second.events & lgkm_events) == (instr->ds().gds ? event_gds : event_lds))
continue;
reg_imm.lgkm = wait_imm::unset_counter;

wait.combine(it->second.imm);
wait.combine(reg_imm);
}
}
}
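The hunk above stops check_instr() from skipping a register outright (`continue`) when only one kind of ordering guarantee applies; instead it clears just the counter covered by that guarantee in a copy of the pending wait and combines the rest, so a mixed DS/VMEM write-after-write still gets the other counter's wait. A simplified, self-contained sketch of that idea (hypothetical types, not ACO's real data structures) might look like this:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical simplified model: each counter tracks the wait value that is
// still required before the register is safe to overwrite.
struct PendingWait {
   static constexpr unsigned unset = ~0u; // "no wait needed on this counter"
   unsigned vm   = unset;  // vector-memory counter
   unsigned lgkm = unset;  // LDS/GDS counter

   // Merging waits keeps the stricter (smaller) value per counter.
   void combine(const PendingWait &other)
   {
      vm = std::min(vm, other.vm);
      lgkm = std::min(lgkm, other.lgkm);
   }
};

// Old behaviour (sketched): if the VMEM in-order guarantee applied, the whole
// register was skipped, silently dropping a still-required lgkm wait.
// New behaviour: clear only the guaranteed counter and let the caller
// combine whatever remains into the instruction's wait.
PendingWait waits_for_write(PendingWait pending,
                            bool vmem_order_guaranteed,
                            bool lds_order_guaranteed)
{
   PendingWait reg = pending;
   if (vmem_order_guaranteed)
      reg.vm = PendingWait::unset;   // in-order VMEM: no vm wait needed
   if (lds_order_guaranteed)
      reg.lgkm = PendingWait::unset; // in-order LDS/GDS: no lgkm wait needed
   return reg;
}
```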
@@ -118,11 +118,16 @@ setup_reduce_temp(Program* program)
* would insert at the end instead of using this one. */
} else {
assert(last_top_level_block_idx < block.index);
/* insert before the branch at last top level block */
/* insert after p_logical_end of the last top-level block */
std::vector<aco_ptr<Instruction>>& instructions =
program->blocks[last_top_level_block_idx].instructions;
instructions.insert(std::next(instructions.begin(), instructions.size() - 1),
std::move(create));
auto insert_point =
std::find_if(instructions.rbegin(), instructions.rend(),
[](const auto& iter) {
return iter->opcode == aco_opcode::p_logical_end;
})
.base();
instructions.insert(insert_point, std::move(create));
inserted_at = last_top_level_block_idx;
}
}

@@ -161,8 +166,13 @@ setup_reduce_temp(Program* program)
assert(last_top_level_block_idx < block.index);
std::vector<aco_ptr<Instruction>>& instructions =
program->blocks[last_top_level_block_idx].instructions;
instructions.insert(std::next(instructions.begin(), instructions.size() - 1),
std::move(create));
auto insert_point =
std::find_if(instructions.rbegin(), instructions.rend(),
[](const auto& iter) {
return iter->opcode == aco_opcode::p_logical_end;
})
.base();
instructions.insert(insert_point, std::move(create));
vtmp_inserted_at = last_top_level_block_idx;
}
}
@@ -1845,10 +1845,16 @@ assign_spill_slots(spill_ctx& ctx, unsigned spills_to_vgpr)
instructions.emplace_back(std::move(create));
} else {
assert(last_top_level_block_idx < block.index);
/* insert before the branch at last top level block */
/* insert after p_logical_end of the last top-level block */
std::vector<aco_ptr<Instruction>>& block_instrs =
ctx.program->blocks[last_top_level_block_idx].instructions;
block_instrs.insert(std::prev(block_instrs.end()), std::move(create));
auto insert_point =
std::find_if(block_instrs.rbegin(), block_instrs.rend(),
[](const auto& iter) {
return iter->opcode == aco_opcode::p_logical_end;
})
.base();
block_instrs.insert(insert_point, std::move(create));
}
}

@@ -1885,10 +1891,16 @@ assign_spill_slots(spill_ctx& ctx, unsigned spills_to_vgpr)
instructions.emplace_back(std::move(create));
} else {
assert(last_top_level_block_idx < block.index);
/* insert before the branch at last top level block */
/* insert after p_logical_end of the last top-level block */
std::vector<aco_ptr<Instruction>>& block_instrs =
ctx.program->blocks[last_top_level_block_idx].instructions;
block_instrs.insert(std::prev(block_instrs.end()), std::move(create));
auto insert_point =
std::find_if(block_instrs.rbegin(), block_instrs.rend(),
[](const auto& iter) {
return iter->opcode == aco_opcode::p_logical_end;
})
.base();
block_instrs.insert(insert_point, std::move(create));
}
}
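Both the setup_reduce_temp and assign_spill_slots hunks above switch from inserting the new instruction just before the block's last instruction to inserting it immediately after p_logical_end, using a reverse-iterator search plus .base(). That is a general C++ idiom; here is a small self-contained sketch of it, with plain ints standing in for instructions:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Insert `value` immediately after the last element matching `pred`.
// find_if over rbegin()/rend() finds the last match; .base() on the reverse
// iterator is a forward iterator one past that element, i.e. exactly the
// "right after the match" insertion point.
template <typename T, typename Pred>
void insert_after_last_match(std::vector<T> &v, const T &value, Pred pred)
{
   auto rit = std::find_if(v.rbegin(), v.rend(), pred);
   auto insert_point = rit.base(); // == v.begin() if nothing matched
   v.insert(insert_point, value);
}

int main()
{
   // Stand-in for a block: ..., p_logical_end (=42), branch (=99).
   std::vector<int> block = {1, 2, 42, 99};
   insert_after_last_match(block, 7, [](int i) { return i == 42; });
   assert((block == std::vector<int>{1, 2, 42, 7, 99}));
   return 0;
}
```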
@@ -53,3 +53,71 @@ BEGIN_TEST(insert_waitcnt.ds_ordered_count)
finish_waitcnt_test();
END_TEST

BEGIN_TEST(insert_waitcnt.waw.mixed_vmem_lds.vmem)
if (!setup_cs(NULL, GFX10))
return;

Definition def_v4(PhysReg(260), v1);
Operand op_v0(PhysReg(256), v1);
Operand desc0(PhysReg(0), s4);

//>> BB0
//! /* logical preds: / linear preds: / kind: top-level, */
//! v1: %0:v[4] = buffer_load_dword %0:s[0-3], %0:v[0], 0
bld.mubuf(aco_opcode::buffer_load_dword, def_v4, desc0, op_v0, Operand::zero(), 0, false);

//>> BB1
//! /* logical preds: / linear preds: / kind: */
//! v1: %0:v[4] = ds_read_b32 %0:v[0]
bld.reset(program->create_and_insert_block());
bld.ds(aco_opcode::ds_read_b32, def_v4, op_v0);

bld.reset(program->create_and_insert_block());
program->blocks[2].linear_preds.push_back(0);
program->blocks[2].linear_preds.push_back(1);
program->blocks[2].logical_preds.push_back(0);
program->blocks[2].logical_preds.push_back(1);

//>> BB2
//! /* logical preds: BB0, BB1, / linear preds: BB0, BB1, / kind: uniform, */
//! s_waitcnt lgkmcnt(0)
//! v1: %0:v[4] = buffer_load_dword %0:s[0-3], %0:v[0], 0
bld.mubuf(aco_opcode::buffer_load_dword, def_v4, desc0, op_v0, Operand::zero(), 0, false);

finish_waitcnt_test();
END_TEST

BEGIN_TEST(insert_waitcnt.waw.mixed_vmem_lds.lds)
if (!setup_cs(NULL, GFX10))
return;

Definition def_v4(PhysReg(260), v1);
Operand op_v0(PhysReg(256), v1);
Operand desc0(PhysReg(0), s4);

//>> BB0
//! /* logical preds: / linear preds: / kind: top-level, */
//! v1: %0:v[4] = buffer_load_dword %0:s[0-3], %0:v[0], 0
bld.mubuf(aco_opcode::buffer_load_dword, def_v4, desc0, op_v0, Operand::zero(), 0, false);

//>> BB1
//! /* logical preds: / linear preds: / kind: */
//! v1: %0:v[4] = ds_read_b32 %0:v[0]
bld.reset(program->create_and_insert_block());
bld.ds(aco_opcode::ds_read_b32, def_v4, op_v0);

bld.reset(program->create_and_insert_block());
program->blocks[2].linear_preds.push_back(0);
program->blocks[2].linear_preds.push_back(1);
program->blocks[2].logical_preds.push_back(0);
program->blocks[2].logical_preds.push_back(1);

//>> BB2
//! /* logical preds: BB0, BB1, / linear preds: BB0, BB1, / kind: uniform, */
//! s_waitcnt vmcnt(0)
//! v1: %0:v[4] = ds_read_b32 %0:v[0]
bld.ds(aco_opcode::ds_read_b32, def_v4, op_v0);

finish_waitcnt_test();
END_TEST
@@ -41,6 +41,11 @@ BEGIN_TEST(setup_reduce_temp.divergent_if_phi)
if (!setup_cs("s2 v1", GFX9))
return;

//>> p_logical_start
//>> p_logical_end
bld.pseudo(aco_opcode::p_logical_start);
bld.pseudo(aco_opcode::p_logical_end);

//>> lv1: %lv = p_start_linear_vgpr
emit_divergent_if_else(
program.get(), bld, Operand(inputs[0]),
@@ -1320,9 +1320,6 @@ radv_DestroyDevice(VkDevice _device, const VkAllocationCallbacks *pAllocator)
if (!device)
return;

if (device->capture_replay_arena_vas)
_mesa_hash_table_u64_destroy(device->capture_replay_arena_vas);

radv_device_finish_perf_counter_lock_cs(device);
if (device->perf_counter_bo)
device->ws->buffer_destroy(device->ws, device->perf_counter_bo);

@@ -1372,6 +1369,8 @@ radv_DestroyDevice(VkDevice _device, const VkAllocationCallbacks *pAllocator)
radv_finish_trace(device);

radv_destroy_shader_arenas(device);
if (device->capture_replay_arena_vas)
_mesa_hash_table_u64_destroy(device->capture_replay_arena_vas);

radv_sqtt_finish(device);
@@ -943,8 +943,12 @@ radv_GetRayTracingCaptureReplayShaderGroupHandlesKHR(VkDevice device, VkPipeline
uint32_t recursive_shader = rt_pipeline->groups[firstGroup + i].recursive_shader;
if (recursive_shader != VK_SHADER_UNUSED_KHR) {
struct radv_shader *shader = rt_pipeline->stages[recursive_shader].shader;
if (shader)
data[i].recursive_shader_alloc = radv_serialize_shader_arena_block(shader->alloc);
if (shader) {
data[i].recursive_shader_alloc.offset = shader->alloc->offset;
data[i].recursive_shader_alloc.size = shader->alloc->size;
data[i].recursive_shader_alloc.arena_va = shader->alloc->arena->bo->va;
data[i].recursive_shader_alloc.arena_size = shader->alloc->arena->size;
}
}
data[i].non_recursive_idx = rt_pipeline->groups[firstGroup + i].handle.any_hit_index;
}
@@ -982,6 +982,7 @@ alloc_block_obj(struct radv_device *device)
static void
free_block_obj(struct radv_device *device, union radv_shader_arena_block *block)
{
list_del(&block->pool);
list_add(&block->pool, &device->shader_block_obj_pool);
}

@@ -1267,7 +1268,6 @@ radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_bloc
remove_hole(free_list, hole_prev);

hole_prev->size += hole->size;
list_del(&hole->list);
free_block_obj(device, hole);

hole = hole_prev;

@@ -1280,7 +1280,6 @@ radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_bloc
hole_next->offset -= hole->size;
hole_next->size += hole->size;
list_del(&hole->list);
free_block_obj(device, hole);

hole = hole_next;

@@ -1293,6 +1292,18 @@ radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_bloc
radv_rmv_log_bo_destroy(device, arena->bo);
device->ws->buffer_destroy(device->ws, arena->bo);
list_del(&arena->list);

if (device->capture_replay_arena_vas) {
struct hash_entry *arena_entry = NULL;
hash_table_foreach (device->capture_replay_arena_vas->table, entry) {
if (entry->data == arena) {
arena_entry = entry;
break;
}
}
_mesa_hash_table_remove(device->capture_replay_arena_vas->table, arena_entry);
}

free(arena);
} else if (free_list) {
add_hole(free_list, hole);

@@ -1301,18 +1312,6 @@ radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_bloc
mtx_unlock(&device->shader_arena_mutex);
}

struct radv_serialized_shader_arena_block
radv_serialize_shader_arena_block(union radv_shader_arena_block *block)
{
struct radv_serialized_shader_arena_block serialized_block = {
.offset = block->offset,
.size = block->size,
.arena_va = block->arena->bo->va,
.arena_size = block->arena->size,
};
return serialized_block;
}

union radv_shader_arena_block *
radv_replay_shader_arena_block(struct radv_device *device, const struct radv_serialized_shader_arena_block *src,
void *ptr)
@@ -775,8 +775,6 @@ union radv_shader_arena_block *radv_replay_shader_arena_block(struct radv_device
const struct radv_serialized_shader_arena_block *src,
void *ptr);

struct radv_serialized_shader_arena_block radv_serialize_shader_arena_block(union radv_shader_arena_block *block);

void radv_free_shader_memory(struct radv_device *device, union radv_shader_arena_block *alloc);

struct radv_shader *radv_create_trap_handler_shader(struct radv_device *device);
@@ -216,6 +216,7 @@ visit_intrinsic(nir_shader *shader, nir_intrinsic_instr *instr)
case nir_intrinsic_load_rasterization_primitive_amd:
case nir_intrinsic_load_global_constant_uniform_block_intel:
case nir_intrinsic_cmat_length:
case nir_intrinsic_load_printf_buffer_address:
is_divergent = false;
break;
@@ -1492,7 +1492,7 @@ nir_assign_io_var_locations(nir_shader *shader, nir_variable_mode mode,
unsigned *size, gl_shader_stage stage)
{
unsigned location = 0;
unsigned assigned_locations[VARYING_SLOT_TESS_MAX];
unsigned assigned_locations[VARYING_SLOT_TESS_MAX][2];
uint64_t processed_locs[2] = { 0 };

struct exec_list io_vars;

@@ -1584,7 +1584,7 @@ nir_assign_io_var_locations(nir_shader *shader, nir_variable_mode mode,
if (processed) {
/* TODO handle overlapping per-view variables */
assert(!var->data.per_view);
unsigned driver_location = assigned_locations[var->data.location];
unsigned driver_location = assigned_locations[var->data.location][var->data.index];
var->data.driver_location = driver_location;

/* An array may be packed such that is crosses multiple other arrays

@@ -1605,7 +1605,7 @@ nir_assign_io_var_locations(nir_shader *shader, nir_variable_mode mode,
unsigned num_unallocated_slots = last_slot_location - location;
unsigned first_unallocated_slot = var_size - num_unallocated_slots;
for (unsigned i = first_unallocated_slot; i < var_size; i++) {
assigned_locations[var->data.location + i] = location;
assigned_locations[var->data.location + i][var->data.index] = location;
location++;
}
}

@@ -1613,7 +1613,7 @@ nir_assign_io_var_locations(nir_shader *shader, nir_variable_mode mode,
}

for (unsigned i = 0; i < var_size; i++) {
assigned_locations[var->data.location + i] = location + i;
assigned_locations[var->data.location + i][var->data.index] = location + i;
}

var->data.driver_location = location;
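The hunks above widen assigned_locations so it is keyed by both the varying location and var->data.index. With dual-source blending the two fragment outputs share a single location and differ only in the index (0 for the color output, 1 for the SRC1 output), so a table keyed by location alone made them collide once outputs were scalarized. A minimal stand-alone sketch of the keying (hypothetical names, not the NIR data structures):

```cpp
#include <cassert>

// Hypothetical miniature of the fix: driver locations keyed by
// (location, dual_source_index) instead of location alone.
constexpr unsigned MAX_SLOTS = 64;

struct OutputVar {
   unsigned location;        // e.g. a FRAG_RESULT_* slot
   unsigned index;           // 0 = color output, 1 = dual-source (SRC1) output
   unsigned driver_location; // filled in by assign_locations()
};

void assign_locations(OutputVar *vars, unsigned count)
{
   unsigned assigned[MAX_SLOTS][2];
   bool used[MAX_SLOTS][2] = {};
   unsigned next = 0;

   for (unsigned i = 0; i < count; i++) {
      OutputVar &v = vars[i];
      if (!used[v.location][v.index]) {
         assigned[v.location][v.index] = next++;
         used[v.location][v.index] = true;
      }
      v.driver_location = assigned[v.location][v.index];
   }
}

int main()
{
   // Dual-source blending: same location, different index -> distinct slots.
   OutputVar vars[] = { {0, 0, 0}, {0, 1, 0} };
   assign_locations(vars, 2);
   assert(vars[0].driver_location != vars[1].driver_location);
   return 0;
}
```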
@@ -161,6 +161,7 @@ nir_lower_cl_images(nir_shader *shader, bool lower_image_derefs, bool lower_samp
assert(var->data.location > last_loc);
last_loc = var->data.location;
var->data.driver_location = num_samplers++;
var->data.binding = var->data.driver_location;
} else {
/* CL shouldn't have any sampled images */
assert(!glsl_type_is_sampler(var->type));
@@ -1518,7 +1518,8 @@ dri2_initialize_x11_swrast(_EGLDisplay *disp)
*/
dri2_dpy->driver_name = strdup(disp->Options.Zink ? "zink" : "swrast");
if (disp->Options.Zink &&
!debug_get_bool_option("LIBGL_DRI3_DISABLE", false))
!debug_get_bool_option("LIBGL_DRI3_DISABLE", false) &&
!debug_get_bool_option("LIBGL_KOPPER_DRI2", false))
dri3_x11_connect(dri2_dpy);
if (!dri2_load_driver_swrast(disp))
goto cleanup;
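The change above makes the swrast/zink X11 path honour LIBGL_KOPPER_DRI2=1 (in addition to LIBGL_DRI3_DISABLE) before attempting a DRI3 connection. Mesa reads such switches through its debug_get_bool_option() helper; a simplified stand-in for that kind of boolean environment check could look like the sketch below (hypothetical helper, not Mesa's implementation):

```cpp
#include <cstdlib>
#include <cstring>

// Simplified boolean env-var lookup: unset -> default, otherwise accept the
// usual truthy spellings ("1", "true", "yes", "y").
static bool env_get_bool(const char *name, bool default_value)
{
   const char *value = std::getenv(name);
   if (!value)
      return default_value;
   return !std::strcmp(value, "1") || !std::strcmp(value, "true") ||
          !std::strcmp(value, "yes") || !std::strcmp(value, "y");
}

// Usage mirroring the patched condition: only connect DRI3 when neither
// opt-out variable is set.
bool want_dri3()
{
   return !env_get_bool("LIBGL_DRI3_DISABLE", false) &&
          !env_get_bool("LIBGL_KOPPER_DRI2", false);
}
```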
@@ -292,25 +292,6 @@
tags:
- google-freedreno-db410c

# New jobs. Leave it as manual for now.
.a306_piglit:
extends:
- .piglit-test
- .a306-test
- .google-freedreno-manual-rules
variables:
HWCI_START_XORG: 1

# Something happened and now this hangchecks and doesn't recover. Unkown when
# it started.
.a306_piglit_gl:
extends:
- .a306_piglit
variables:
PIGLIT_PROFILES: quick_gl
BM_KERNEL_EXTRA_ARGS: "msm.num_hw_submissions=1"
FDO_CI_CONCURRENT: 3

# 8 devices (2023-04-15)
.a530-test:
extends:

@@ -10,6 +10,25 @@ a306_gl:
FDO_CI_CONCURRENT: 6
parallel: 5

# New jobs. Leave it as manual for now.
.a306_piglit:
extends:
- .piglit-test
- .a306-test
- .google-freedreno-manual-rules
variables:
HWCI_START_XORG: 1

# Something happened and now this hangchecks and doesn't recover. Unkown when
# it started.
.a306_piglit_gl:
extends:
- .a306_piglit
variables:
PIGLIT_PROFILES: quick_gl
BM_KERNEL_EXTRA_ARGS: "msm.num_hw_submissions=1"
FDO_CI_CONCURRENT: 3

a306_piglit_shader:
extends:
- .a306_piglit
@@ -20,5 +20,8 @@ IncludeCategories:
- Regex: '.*'
Priority: 1

ForEachMacros:
- u_vector_foreach

SpaceAfterCStyleCast: true
SpaceBeforeCpp11BracedList: true
@@ -15,12 +15,12 @@
struct tu_u_trace_syncobj;
struct vdrm_bo;

enum tu_bo_alloc_flags
{
enum tu_bo_alloc_flags {
TU_BO_ALLOC_NO_FLAGS = 0,
TU_BO_ALLOC_ALLOW_DUMP = 1 << 0,
TU_BO_ALLOC_GPU_READ_ONLY = 1 << 1,
TU_BO_ALLOC_REPLAYABLE = 1 << 2,
TU_BO_ALLOC_DMABUF = 1 << 4,
};

/* Define tu_timeline_sync type based on drm syncobj for a point type
@@ -321,44 +321,68 @@ tu_free_zombie_vma_locked(struct tu_device *dev, bool wait)
|
||||
last_signaled_fence = vma->fence;
|
||||
}
|
||||
|
||||
/* Ensure that internal kernel's vma is freed. */
|
||||
struct drm_msm_gem_info req = {
|
||||
.handle = vma->gem_handle,
|
||||
.info = MSM_INFO_SET_IOVA,
|
||||
.value = 0,
|
||||
};
|
||||
if (vma->gem_handle) {
|
||||
/* Ensure that internal kernel's vma is freed. */
|
||||
struct drm_msm_gem_info req = {
|
||||
.handle = vma->gem_handle,
|
||||
.info = MSM_INFO_SET_IOVA,
|
||||
.value = 0,
|
||||
};
|
||||
|
||||
int ret =
|
||||
drmCommandWriteRead(dev->fd, DRM_MSM_GEM_INFO, &req, sizeof(req));
|
||||
if (ret < 0) {
|
||||
mesa_loge("MSM_INFO_SET_IOVA(0) failed! %d (%s)", ret,
|
||||
strerror(errno));
|
||||
return VK_ERROR_UNKNOWN;
|
||||
int ret =
|
||||
drmCommandWriteRead(dev->fd, DRM_MSM_GEM_INFO, &req, sizeof(req));
|
||||
if (ret < 0) {
|
||||
mesa_loge("MSM_INFO_SET_IOVA(0) failed! %d (%s)", ret,
|
||||
strerror(errno));
|
||||
return VK_ERROR_UNKNOWN;
|
||||
}
|
||||
|
||||
tu_gem_close(dev, vma->gem_handle);
|
||||
|
||||
util_vma_heap_free(&dev->vma, vma->iova, vma->size);
|
||||
}
|
||||
|
||||
tu_gem_close(dev, vma->gem_handle);
|
||||
|
||||
util_vma_heap_free(&dev->vma, vma->iova, vma->size);
|
||||
u_vector_remove(&dev->zombie_vmas);
|
||||
}
|
||||
|
||||
return VK_SUCCESS;
|
||||
}
|
||||
|
||||
static bool
|
||||
tu_restore_from_zombie_vma_locked(struct tu_device *dev,
|
||||
uint32_t gem_handle,
|
||||
uint64_t *iova)
|
||||
{
|
||||
struct tu_zombie_vma *vma;
|
||||
u_vector_foreach (vma, &dev->zombie_vmas) {
|
||||
if (vma->gem_handle == gem_handle) {
|
||||
*iova = vma->iova;
|
||||
|
||||
/* mark to skip later gem and iova cleanup */
|
||||
vma->gem_handle = 0;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
static VkResult
|
||||
msm_allocate_userspace_iova(struct tu_device *dev,
|
||||
uint32_t gem_handle,
|
||||
uint64_t size,
|
||||
uint64_t client_iova,
|
||||
enum tu_bo_alloc_flags flags,
|
||||
uint64_t *iova)
|
||||
msm_allocate_userspace_iova_locked(struct tu_device *dev,
|
||||
uint32_t gem_handle,
|
||||
uint64_t size,
|
||||
uint64_t client_iova,
|
||||
enum tu_bo_alloc_flags flags,
|
||||
uint64_t *iova)
|
||||
{
|
||||
VkResult result;
|
||||
|
||||
mtx_lock(&dev->vma_mutex);
|
||||
|
||||
*iova = 0;
|
||||
|
||||
if ((flags & TU_BO_ALLOC_DMABUF) &&
|
||||
tu_restore_from_zombie_vma_locked(dev, gem_handle, iova))
|
||||
return VK_SUCCESS;
|
||||
|
||||
tu_free_zombie_vma_locked(dev, false);
|
||||
|
||||
result = tu_allocate_userspace_iova(dev, size, client_iova, flags, iova);
|
||||
@@ -372,8 +396,6 @@ msm_allocate_userspace_iova(struct tu_device *dev,
|
||||
result = tu_allocate_userspace_iova(dev, size, client_iova, flags, iova);
|
||||
}
|
||||
|
||||
mtx_unlock(&dev->vma_mutex);
|
||||
|
||||
if (result != VK_SUCCESS)
|
||||
return result;
|
||||
|
||||
@@ -386,6 +408,7 @@ msm_allocate_userspace_iova(struct tu_device *dev,
|
||||
int ret =
|
||||
drmCommandWriteRead(dev->fd, DRM_MSM_GEM_INFO, &req, sizeof(req));
|
||||
if (ret < 0) {
|
||||
util_vma_heap_free(&dev->vma, *iova, size);
|
||||
mesa_loge("MSM_INFO_SET_IOVA failed! %d (%s)", ret, strerror(errno));
|
||||
return VK_ERROR_OUT_OF_HOST_MEMORY;
|
||||
}
|
||||
@@ -420,8 +443,8 @@ tu_bo_init(struct tu_device *dev,
|
||||
assert(!client_iova || dev->physical_device->has_set_iova);
|
||||
|
||||
if (dev->physical_device->has_set_iova) {
|
||||
result = msm_allocate_userspace_iova(dev, gem_handle, size, client_iova,
|
||||
flags, &iova);
|
||||
result = msm_allocate_userspace_iova_locked(dev, gem_handle, size,
|
||||
client_iova, flags, &iova);
|
||||
} else {
|
||||
result = tu_allocate_kernel_iova(dev, gem_handle, &iova);
|
||||
}
|
||||
@@ -445,6 +468,8 @@ tu_bo_init(struct tu_device *dev,
|
||||
if (!new_ptr) {
|
||||
dev->bo_count--;
|
||||
mtx_unlock(&dev->bo_mutex);
|
||||
if (dev->physical_device->has_set_iova)
|
||||
util_vma_heap_free(&dev->vma, iova, size);
|
||||
tu_gem_close(dev, gem_handle);
|
||||
return VK_ERROR_OUT_OF_HOST_MEMORY;
|
||||
}
|
||||
@@ -506,6 +531,20 @@ tu_bo_set_kernel_name(struct tu_device *dev, struct tu_bo *bo, const char *name)
|
||||
}
|
||||
}
|
||||
|
||||
static inline void
|
||||
msm_vma_lock(struct tu_device *dev)
|
||||
{
|
||||
if (dev->physical_device->has_set_iova)
|
||||
mtx_lock(&dev->vma_mutex);
|
||||
}
|
||||
|
||||
static inline void
|
||||
msm_vma_unlock(struct tu_device *dev)
|
||||
{
|
||||
if (dev->physical_device->has_set_iova)
|
||||
mtx_unlock(&dev->vma_mutex);
|
||||
}
|
||||
|
||||
static VkResult
|
||||
msm_bo_init(struct tu_device *dev,
|
||||
struct tu_bo **out_bo,
|
||||
@@ -541,9 +580,15 @@ msm_bo_init(struct tu_device *dev,
|
||||
struct tu_bo* bo = tu_device_lookup_bo(dev, req.handle);
|
||||
assert(bo && bo->gem_handle == 0);
|
||||
|
||||
assert(!(flags & TU_BO_ALLOC_DMABUF));
|
||||
|
||||
msm_vma_lock(dev);
|
||||
|
||||
VkResult result =
|
||||
tu_bo_init(dev, bo, req.handle, size, client_iova, flags, name);
|
||||
|
||||
msm_vma_unlock(dev);
|
||||
|
||||
if (result != VK_SUCCESS)
|
||||
memset(bo, 0, sizeof(*bo));
|
||||
else
|
||||
@@ -591,11 +636,13 @@ msm_bo_init_dmabuf(struct tu_device *dev,
|
||||
* to happen in parallel.
|
||||
*/
|
||||
u_rwlock_wrlock(&dev->dma_bo_lock);
|
||||
msm_vma_lock(dev);
|
||||
|
||||
uint32_t gem_handle;
|
||||
int ret = drmPrimeFDToHandle(dev->fd, prime_fd,
|
||||
&gem_handle);
|
||||
if (ret) {
|
||||
msm_vma_unlock(dev);
|
||||
u_rwlock_wrunlock(&dev->dma_bo_lock);
|
||||
return vk_error(dev, VK_ERROR_INVALID_EXTERNAL_HANDLE);
|
||||
}
|
||||
@@ -604,6 +651,7 @@ msm_bo_init_dmabuf(struct tu_device *dev,
|
||||
|
||||
if (bo->refcnt != 0) {
|
||||
p_atomic_inc(&bo->refcnt);
|
||||
msm_vma_unlock(dev);
|
||||
u_rwlock_wrunlock(&dev->dma_bo_lock);
|
||||
|
||||
*out_bo = bo;
|
||||
@@ -611,13 +659,14 @@ msm_bo_init_dmabuf(struct tu_device *dev,
|
||||
}
|
||||
|
||||
VkResult result =
|
||||
tu_bo_init(dev, bo, gem_handle, size, 0, TU_BO_ALLOC_NO_FLAGS, "dmabuf");
|
||||
tu_bo_init(dev, bo, gem_handle, size, 0, TU_BO_ALLOC_DMABUF, "dmabuf");
|
||||
|
||||
if (result != VK_SUCCESS)
|
||||
memset(bo, 0, sizeof(*bo));
|
||||
else
|
||||
*out_bo = bo;
|
||||
|
||||
msm_vma_unlock(dev);
|
||||
u_rwlock_wrunlock(&dev->dma_bo_lock);
|
||||
|
||||
return result;
|
||||
|
@@ -412,14 +412,16 @@ tu_free_zombie_vma_locked(struct tu_device *dev, bool wait)
|
||||
last_signaled_fence = vma->fence;
|
||||
}
|
||||
|
||||
set_iova(dev, vma->res_id, 0);
|
||||
|
||||
u_vector_remove(&dev->zombie_vmas);
|
||||
|
||||
struct tu_zombie_vma *vma2 = (struct tu_zombie_vma *)
|
||||
u_vector_add(&vdev->zombie_vmas_stage_2);
|
||||
if (vma->gem_handle) {
|
||||
set_iova(dev, vma->res_id, 0);
|
||||
|
||||
*vma2 = *vma;
|
||||
struct tu_zombie_vma *vma2 =
|
||||
(struct tu_zombie_vma *) u_vector_add(&vdev->zombie_vmas_stage_2);
|
||||
|
||||
*vma2 = *vma;
|
||||
}
|
||||
}
|
||||
|
||||
/* And _then_ close the GEM handles: */
|
||||
@@ -434,19 +436,44 @@ tu_free_zombie_vma_locked(struct tu_device *dev, bool wait)
|
||||
return VK_SUCCESS;
|
||||
}
|
||||
|
||||
static bool
|
||||
tu_restore_from_zombie_vma_locked(struct tu_device *dev,
|
||||
uint32_t gem_handle,
|
||||
uint64_t *iova)
|
||||
{
|
||||
struct tu_zombie_vma *vma;
|
||||
u_vector_foreach (vma, &dev->zombie_vmas) {
|
||||
if (vma->gem_handle == gem_handle) {
|
||||
*iova = vma->iova;
|
||||
|
||||
/* mark to skip later vdrm bo and iova cleanup */
|
||||
vma->gem_handle = 0;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
static VkResult
|
||||
virtio_allocate_userspace_iova(struct tu_device *dev,
|
||||
uint64_t size,
|
||||
uint64_t client_iova,
|
||||
enum tu_bo_alloc_flags flags,
|
||||
uint64_t *iova)
|
||||
virtio_allocate_userspace_iova_locked(struct tu_device *dev,
|
||||
uint32_t gem_handle,
|
||||
uint64_t size,
|
||||
uint64_t client_iova,
|
||||
enum tu_bo_alloc_flags flags,
|
||||
uint64_t *iova)
|
||||
{
|
||||
VkResult result;
|
||||
|
||||
mtx_lock(&dev->vma_mutex);
|
||||
|
||||
*iova = 0;
|
||||
|
||||
if (flags & TU_BO_ALLOC_DMABUF) {
|
||||
assert(gem_handle);
|
||||
|
||||
if (tu_restore_from_zombie_vma_locked(dev, gem_handle, iova))
|
||||
return VK_SUCCESS;
|
||||
}
|
||||
|
||||
tu_free_zombie_vma_locked(dev, false);
|
||||
|
||||
result = tu_allocate_userspace_iova(dev, size, client_iova, flags, iova);
|
||||
@@ -460,8 +487,6 @@ virtio_allocate_userspace_iova(struct tu_device *dev,
|
||||
result = tu_allocate_userspace_iova(dev, size, client_iova, flags, iova);
|
||||
}
|
||||
|
||||
mtx_unlock(&dev->vma_mutex);
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
@@ -571,12 +596,8 @@ virtio_bo_init(struct tu_device *dev,
|
||||
.size = size,
|
||||
};
|
||||
VkResult result;
|
||||
|
||||
result = virtio_allocate_userspace_iova(dev, size, client_iova,
|
||||
flags, &req.iova);
|
||||
if (result != VK_SUCCESS) {
|
||||
return result;
|
||||
}
|
||||
uint32_t res_id;
|
||||
struct tu_bo *bo;
|
||||
|
||||
if (mem_property & VK_MEMORY_PROPERTY_HOST_CACHED_BIT) {
|
||||
if (mem_property & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) {
|
||||
@@ -601,6 +622,16 @@ virtio_bo_init(struct tu_device *dev,
|
||||
if (flags & TU_BO_ALLOC_GPU_READ_ONLY)
|
||||
req.flags |= MSM_BO_GPU_READONLY;
|
||||
|
||||
assert(!(flags & TU_BO_ALLOC_DMABUF));
|
||||
|
||||
mtx_lock(&dev->vma_mutex);
|
||||
result = virtio_allocate_userspace_iova_locked(dev, 0, size, client_iova,
|
||||
flags, &req.iova);
|
||||
mtx_unlock(&dev->vma_mutex);
|
||||
|
||||
if (result != VK_SUCCESS)
|
||||
return result;
|
||||
|
||||
/* tunneled cmds are processed separately on host side,
|
||||
* before the renderer->get_blob() callback.. the blob_id
|
||||
* is used to link the created bo to the get_blob() call
|
||||
@@ -611,27 +642,28 @@ virtio_bo_init(struct tu_device *dev,
|
||||
vdrm_bo_create(vdev->vdrm, size, blob_flags, req.blob_id, &req.hdr);
|
||||
|
||||
if (!handle) {
|
||||
util_vma_heap_free(&dev->vma, req.iova, size);
|
||||
return vk_error(dev, VK_ERROR_OUT_OF_DEVICE_MEMORY);
|
||||
result = VK_ERROR_OUT_OF_DEVICE_MEMORY;
|
||||
goto fail;
|
||||
}
|
||||
|
||||
uint32_t res_id = vdrm_handle_to_res_id(vdev->vdrm, handle);
|
||||
struct tu_bo* bo = tu_device_lookup_bo(dev, res_id);
|
||||
res_id = vdrm_handle_to_res_id(vdev->vdrm, handle);
|
||||
bo = tu_device_lookup_bo(dev, res_id);
|
||||
assert(bo && bo->gem_handle == 0);
|
||||
|
||||
bo->res_id = res_id;
|
||||
|
||||
result = tu_bo_init(dev, bo, handle, size, req.iova, flags, name);
|
||||
if (result != VK_SUCCESS)
|
||||
if (result != VK_SUCCESS) {
|
||||
memset(bo, 0, sizeof(*bo));
|
||||
else
|
||||
*out_bo = bo;
|
||||
goto fail;
|
||||
}
|
||||
|
||||
*out_bo = bo;
|
||||
|
||||
/* We don't use bo->name here because for the !TU_DEBUG=bo case bo->name is NULL. */
|
||||
tu_bo_set_kernel_name(dev, bo, name);
|
||||
|
||||
if (result == VK_SUCCESS &&
|
||||
(mem_property & VK_MEMORY_PROPERTY_HOST_CACHED_BIT) &&
|
||||
if ((mem_property & VK_MEMORY_PROPERTY_HOST_CACHED_BIT) &&
|
||||
!(mem_property & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)) {
|
||||
tu_bo_map(dev, bo);
|
||||
|
||||
@@ -644,6 +676,12 @@ virtio_bo_init(struct tu_device *dev,
|
||||
tu_sync_cache_bo(dev, bo, 0, VK_WHOLE_SIZE, TU_MEM_SYNC_CACHE_TO_GPU);
|
||||
}
|
||||
|
||||
return VK_SUCCESS;
|
||||
|
||||
fail:
|
||||
mtx_lock(&dev->vma_mutex);
|
||||
util_vma_heap_free(&dev->vma, req.iova, size);
|
||||
mtx_unlock(&dev->vma_mutex);
|
||||
return result;
|
||||
}
|
||||
|
||||
@@ -666,11 +704,6 @@ virtio_bo_init_dmabuf(struct tu_device *dev,
|
||||
/* iova allocation needs to consider the object's *real* size: */
|
||||
size = real_size;
|
||||
|
||||
uint64_t iova;
|
||||
result = virtio_allocate_userspace_iova(dev, size, 0, TU_BO_ALLOC_NO_FLAGS, &iova);
|
||||
if (result != VK_SUCCESS)
|
||||
return result;
|
||||
|
||||
/* Importing the same dmabuf several times would yield the same
|
||||
* gem_handle. Thus there could be a race when destroying
|
||||
* BO and importing the same dmabuf from different threads.
|
||||
@@ -678,8 +711,10 @@ virtio_bo_init_dmabuf(struct tu_device *dev,
|
||||
* to happen in parallel.
|
||||
*/
|
||||
u_rwlock_wrlock(&dev->dma_bo_lock);
|
||||
mtx_lock(&dev->vma_mutex);
|
||||
|
||||
uint32_t handle, res_id;
|
||||
uint64_t iova;
|
||||
|
||||
handle = vdrm_dmabuf_to_handle(vdrm, prime_fd);
|
||||
if (!handle) {
|
||||
@@ -689,6 +724,7 @@ virtio_bo_init_dmabuf(struct tu_device *dev,
|
||||
|
||||
res_id = vdrm_handle_to_res_id(vdrm, handle);
|
||||
if (!res_id) {
|
||||
/* XXX gem_handle potentially leaked here since no refcnt */
|
||||
result = vk_error(dev, VK_ERROR_INVALID_EXTERNAL_HANDLE);
|
||||
goto out_unlock;
|
||||
}
|
||||
@@ -702,21 +738,25 @@ virtio_bo_init_dmabuf(struct tu_device *dev,
|
||||
goto out_unlock;
|
||||
}
|
||||
|
||||
result = tu_bo_init(dev, bo, handle, size, iova,
|
||||
TU_BO_ALLOC_NO_FLAGS, "dmabuf");
|
||||
if (result != VK_SUCCESS)
|
||||
memset(bo, 0, sizeof(*bo));
|
||||
else
|
||||
*out_bo = bo;
|
||||
|
||||
out_unlock:
|
||||
u_rwlock_wrunlock(&dev->dma_bo_lock);
|
||||
result = virtio_allocate_userspace_iova_locked(dev, handle, size, 0,
|
||||
TU_BO_ALLOC_DMABUF, &iova);
|
||||
if (result != VK_SUCCESS) {
|
||||
mtx_lock(&dev->vma_mutex);
|
||||
util_vma_heap_free(&dev->vma, iova, size);
|
||||
mtx_unlock(&dev->vma_mutex);
|
||||
vdrm_bo_close(dev->vdev->vdrm, handle);
|
||||
goto out_unlock;
|
||||
}
|
||||
|
||||
result =
|
||||
tu_bo_init(dev, bo, handle, size, iova, TU_BO_ALLOC_NO_FLAGS, "dmabuf");
|
||||
if (result != VK_SUCCESS) {
|
||||
util_vma_heap_free(&dev->vma, iova, size);
|
||||
memset(bo, 0, sizeof(*bo));
|
||||
} else {
|
||||
*out_bo = bo;
|
||||
}
|
||||
|
||||
out_unlock:
|
||||
mtx_unlock(&dev->vma_mutex);
|
||||
u_rwlock_wrunlock(&dev->dma_bo_lock);
|
||||
return result;
|
||||
}
|
||||
|
||||
|
@@ -177,6 +177,9 @@ lower_aaline_instr(nir_builder *b, nir_instr *instr, void *data)
return false;
if (var->data.location < FRAG_RESULT_DATA0 && var->data.location != FRAG_RESULT_COLOR)
return false;
uint32_t mask = nir_intrinsic_write_mask(intrin) << var->data.location_frac;
if (!(mask & BITFIELD_BIT(3)))
return false;

nir_def *out_input = intrin->src[1].ssa;
b->cursor = nir_before_instr(instr);

@@ -223,12 +226,10 @@ lower_aaline_instr(nir_builder *b, nir_instr *instr, void *data)
tmp = nir_fmul(b, nir_channel(b, tmp, 0),
nir_fmin(b, nir_channel(b, tmp, 1), max));
tmp = nir_fmul(b, nir_channel(b, out_input, 3), tmp);
tmp = nir_fmul(b, nir_channel(b, out_input, out_input->num_components - 1), tmp);

nir_def *out = nir_vec4(b, nir_channel(b, out_input, 0),
nir_channel(b, out_input, 1),
nir_channel(b, out_input, 2),
tmp);
nir_def *out = nir_vector_insert_imm(b, out_input, tmp,
out_input->num_components - 1);
nir_src_rewrite(&intrin->src[1], out);
return true;
}
@@ -2014,6 +2014,7 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
unsigned dst_sample)
{
struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
unsigned count = 0;
struct pipe_context *pipe = ctx->base.pipe;
enum pipe_texture_target src_target = src->target;
unsigned src_samples = src->texture->nr_samples;

@@ -2038,7 +2039,7 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
/* Return if there is nothing to do. */
if (!dst_has_color && !dst_has_depth && !dst_has_stencil) {
return;
goto out;
}

bool is_scaled = dstbox->width != abs(srcbox->width) ||

@@ -2170,7 +2171,6 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
}

/* Set samplers. */
unsigned count = 0;
if (src_has_depth && src_has_stencil &&
(dst_has_color || (dst_has_depth && dst_has_stencil))) {
/* Setup two samplers, one for depth and the other one for stencil. */

@@ -2223,7 +2223,8 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
do_blits(ctx, dst, dstbox, src, src_width0, src_height0,
srcbox, dst_has_depth || dst_has_stencil, use_txf, sample0_only,
dst_sample);

util_blitter_unset_running_flag(blitter);
out:
util_blitter_restore_vertex_states(blitter);
util_blitter_restore_fragment_states(blitter);
util_blitter_restore_textures_internal(blitter, count);

@@ -2232,7 +2233,6 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
pipe->set_scissor_states(pipe, 0, 1, &ctx->base.saved_scissor);
}
util_blitter_restore_render_cond(blitter);
util_blitter_unset_running_flag(blitter);
}

void
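The u_blitter hunks above replace the early `return;` on a no-op blit with `goto out;`, so the states saved at the start of the blit are always restored (and the running flag cleared) on every exit path instead of being leaked. A tiny sketch of that "restore on every path" shape, using a placeholder struct rather than the real u_blitter API:

```cpp
#include <cstdio>

// Placeholder blitter: only models "states were saved and must be restored".
struct Blitter {
   bool states_saved = false;
   bool no_op = false;
};

static void restore_states(Blitter *b) { b->states_saved = false; }
static void do_blit(Blitter *) { std::puts("blitting"); }

// Every exit path funnels through `out:` so the saved states are always
// released, even when the blit turns out to be a no-op.
void blit_generic(Blitter *b)
{
   b->states_saved = true; // stands in for the caller's save calls

   if (b->no_op) {
      // Previously an early `return;` here leaked the saved states.
      goto out;
   }

   do_blit(b);

out:
   restore_states(b);
}
```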
@@ -2137,7 +2137,8 @@ static void evergreen_emit_vertex_buffers(struct r600_context *rctx,
{
struct radeon_cmdbuf *cs = &rctx->b.gfx.cs;
struct r600_fetch_shader *shader = (struct r600_fetch_shader*)rctx->vertex_fetch_shader.cso;
uint32_t dirty_mask = state->dirty_mask & shader->buffer_mask;
uint32_t buffer_mask = shader ? shader->buffer_mask : ~0;
uint32_t dirty_mask = state->dirty_mask & buffer_mask;

while (dirty_mask) {
struct pipe_vertex_buffer *vb;

@@ -2176,7 +2177,7 @@ static void evergreen_emit_vertex_buffers(struct r600_context *rctx,
radeon_emit(cs, radeon_add_to_buffer_list(&rctx->b, &rctx->b.gfx, rbuffer,
RADEON_USAGE_READ | RADEON_PRIO_VERTEX_BUFFER));
}
state->dirty_mask &= ~shader->buffer_mask;
state->dirty_mask &= ~buffer_mask;
}

static void evergreen_fs_emit_vertex_buffers(struct r600_context *rctx, struct r600_atom * atom)
@@ -239,15 +239,13 @@ static rvcn_dec_message_avc_t get_h264_msg(struct radeon_decoder *dec,
|
||||
}
|
||||
}
|
||||
|
||||
/* if reference picture exists, however no reference picture found at the end
|
||||
curr_pic_ref_frame_num == 0, which is not reasonable, should be corrected. */
|
||||
if (result.used_for_reference_flags && (result.curr_pic_ref_frame_num == 0)) {
|
||||
for (i = 0; i < ARRAY_SIZE(result.ref_frame_list); i++) {
|
||||
result.ref_frame_list[i] = pic->ref[i] ?
|
||||
(uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base) : 0xff;
|
||||
if (result.ref_frame_list[i] != 0xff) {
|
||||
/* need at least one reference for P/B frames */
|
||||
if (result.curr_pic_ref_frame_num == 0 && pic->slice_parameter.slice_info_present) {
|
||||
for (i = 0; i < pic->slice_count; i++) {
|
||||
if (pic->slice_parameter.slice_type[i] % 5 != 2) {
|
||||
result.curr_pic_ref_frame_num++;
|
||||
result.non_existing_frame_flags &= ~(1 << i);
|
||||
result.ref_frame_list[0] = 0;
|
||||
result.non_existing_frame_flags &= ~1;
|
||||
break;
|
||||
}
|
||||
}
|
||||
@@ -279,7 +277,8 @@ static rvcn_dec_message_avc_t get_h264_msg(struct radeon_decoder *dec,
|
||||
dec->ref_codec.bts = CODEC_8_BITS;
|
||||
dec->ref_codec.index = result.decoded_pic_idx;
|
||||
dec->ref_codec.ref_size = 16;
|
||||
memset(dec->ref_codec.ref_list, 0xff, sizeof(dec->ref_codec.ref_list));
|
||||
dec->ref_codec.num_refs = result.curr_pic_ref_frame_num;
|
||||
STATIC_ASSERT(sizeof(dec->ref_codec.ref_list) == sizeof(result.ref_frame_list));
|
||||
memcpy(dec->ref_codec.ref_list, result.ref_frame_list, sizeof(result.ref_frame_list));
|
||||
}
|
||||
|
||||
@@ -292,7 +291,7 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
|
||||
struct pipe_h265_picture_desc *pic)
|
||||
{
|
||||
rvcn_dec_message_hevc_t result;
|
||||
unsigned i, j;
|
||||
unsigned i, j, num_refs = 0;
|
||||
|
||||
memset(&result, 0, sizeof(result));
|
||||
result.sps_info_flags = 0;
|
||||
@@ -413,9 +412,10 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
|
||||
|
||||
result.poc_list[i] = pic->PicOrderCntVal[i];
|
||||
|
||||
if (ref)
|
||||
if (ref) {
|
||||
ref_pic = (uintptr_t)vl_video_buffer_get_associated_data(ref, &dec->base);
|
||||
else
|
||||
num_refs++;
|
||||
} else
|
||||
ref_pic = 0x7F;
|
||||
result.ref_pic_list[i] = ref_pic;
|
||||
}
|
||||
@@ -469,7 +469,8 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
|
||||
CODEC_10_BITS : CODEC_8_BITS;
|
||||
dec->ref_codec.index = result.curr_idx;
|
||||
dec->ref_codec.ref_size = 15;
|
||||
memset(dec->ref_codec.ref_list, 0x7f, sizeof(dec->ref_codec.ref_list));
|
||||
dec->ref_codec.num_refs = num_refs;
|
||||
STATIC_ASSERT(sizeof(dec->ref_codec.ref_list) == sizeof(result.ref_pic_list));
|
||||
memcpy(dec->ref_codec.ref_list, result.ref_pic_list, sizeof(result.ref_pic_list));
|
||||
}
|
||||
return result;
|
||||
@@ -507,7 +508,7 @@ static rvcn_dec_message_vp9_t get_vp9_msg(struct radeon_decoder *dec,
|
||||
struct pipe_vp9_picture_desc *pic)
|
||||
{
|
||||
rvcn_dec_message_vp9_t result;
|
||||
unsigned i ,j;
|
||||
unsigned i, j, num_refs = 0;
|
||||
|
||||
memset(&result, 0, sizeof(result));
|
||||
|
||||
@@ -641,9 +642,13 @@ static rvcn_dec_message_vp9_t get_vp9_msg(struct radeon_decoder *dec,
|
||||
get_current_pic_index(dec, target, &result.curr_pic_idx);
|
||||
|
||||
for (i = 0; i < 8; i++) {
|
||||
result.ref_frame_map[i] =
|
||||
(pic->ref[i]) ? (uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base)
|
||||
: 0x7f;
|
||||
uintptr_t ref_frame;
|
||||
if (pic->ref[i]) {
|
||||
ref_frame = (uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base);
|
||||
num_refs++;
|
||||
} else
|
||||
ref_frame = 0x7f;
|
||||
result.ref_frame_map[i] = ref_frame;
|
||||
}
|
||||
|
||||
result.frame_refs[0] = result.ref_frame_map[pic->picture_parameter.pic_fields.last_ref_frame];
|
||||
@@ -669,6 +674,7 @@ static rvcn_dec_message_vp9_t get_vp9_msg(struct radeon_decoder *dec,
|
||||
CODEC_10_BITS : CODEC_8_BITS;
|
||||
dec->ref_codec.index = result.curr_pic_idx;
|
||||
dec->ref_codec.ref_size = 8;
|
||||
dec->ref_codec.num_refs = num_refs;
|
||||
memset(dec->ref_codec.ref_list, 0x7f, sizeof(dec->ref_codec.ref_list));
|
||||
memcpy(dec->ref_codec.ref_list, result.ref_frame_map, sizeof(result.ref_frame_map));
|
||||
}
|
||||
@@ -959,7 +965,7 @@ static rvcn_dec_message_av1_t get_av1_msg(struct radeon_decoder *dec,
struct pipe_av1_picture_desc *pic)
{
rvcn_dec_message_av1_t result;
unsigned i, j;
unsigned i, j, num_refs = 0;
uint16_t tile_count = pic->picture_parameter.tile_cols * pic->picture_parameter.tile_rows;

memset(&result, 0, sizeof(result));

@@ -1151,9 +1157,13 @@ static rvcn_dec_message_av1_t get_av1_msg(struct radeon_decoder *dec,
result.order_hint_bits = pic->picture_parameter.order_hint_bits_minus_1 + 1;

for (i = 0; i < NUM_AV1_REFS; ++i) {
result.ref_frame_map[i] =
(pic->ref[i]) ? (uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base)
: 0x7f;
uintptr_t ref_frame;
if (pic->ref[i]) {
ref_frame = (uintptr_t)vl_video_buffer_get_associated_data(pic->ref[i], &dec->base);
num_refs++;
} else
ref_frame = 0x7f;
result.ref_frame_map[i] = ref_frame;
}
for (i = 0; i < NUM_AV1_REFS_PER_FRAME; ++i)
result.frame_refs[i] = result.ref_frame_map[pic->picture_parameter.ref_frame_idx[i]];

@@ -1300,6 +1310,7 @@ static rvcn_dec_message_av1_t get_av1_msg(struct radeon_decoder *dec,
dec->ref_codec.bts = pic->picture_parameter.bit_depth_idx ? CODEC_10_BITS : CODEC_8_BITS;
dec->ref_codec.index = result.curr_pic_idx;
dec->ref_codec.ref_size = 8;
dec->ref_codec.num_refs = num_refs;
memset(dec->ref_codec.ref_list, 0x7f, sizeof(dec->ref_codec.ref_list));
memcpy(dec->ref_codec.ref_list, result.ref_frame_map, sizeof(result.ref_frame_map));
}

@@ -1816,6 +1827,7 @@ static unsigned rvcn_dec_dynamic_dpb_t2_message(struct radeon_decoder *dec, rvcn
size = size * 3 / 2;

list_for_each_entry_safe(struct rvcn_dec_dynamic_dpb_t2, d, &dec->dpb_ref_list, list) {
bool found = false;
for (i = 0; i < dec->ref_codec.ref_size; ++i) {
if (((dec->ref_codec.ref_list[i] & 0x7f) != 0x7f) && (d->index == (dec->ref_codec.ref_list[i] & 0x7f))) {
if (!dummy)

@@ -1829,10 +1841,10 @@ static unsigned rvcn_dec_dynamic_dpb_t2_message(struct radeon_decoder *dec, rvcn
dynamic_dpb_t2->dpbAddrLo[i] = addr;
dynamic_dpb_t2->dpbAddrHi[i] = addr >> 32;
++dynamic_dpb_t2->dpbArraySize;
break;
found = true;
}
}
if (i == dec->ref_codec.ref_size) {
if (!found) {
if (d->dpb.res->b.b.width0 * d->dpb.res->b.b.height0 != size) {
list_del(&d->list);
list_addtail(&d->list, &dec->dpb_unref_list);

@@ -1887,6 +1899,23 @@ static unsigned rvcn_dec_dynamic_dpb_t2_message(struct radeon_decoder *dec, rvcn
list_addtail(&dpb->list, &dec->dpb_ref_list);
}

if (dynamic_dpb_t2->dpbArraySize < dec->ref_codec.num_refs) {
struct rvcn_dec_dynamic_dpb_t2 *d =
list_first_entry(&dec->dpb_ref_list, struct rvcn_dec_dynamic_dpb_t2, list);
addr = dec->ws->buffer_get_virtual_address(d->dpb.res->buf);
if (!addr && dummy)
addr = dec->ws->buffer_get_virtual_address(dummy->dpb.res->buf);
assert(addr);
for (i = 0; i < dec->ref_codec.num_refs; ++i) {
if (dynamic_dpb_t2->dpbAddrLo[i] || dynamic_dpb_t2->dpbAddrHi[i])
continue;
dynamic_dpb_t2->dpbAddrLo[i] = addr;
dynamic_dpb_t2->dpbAddrHi[i] = addr >> 32;
++dynamic_dpb_t2->dpbArraySize;
}
assert(dynamic_dpb_t2->dpbArraySize == dec->ref_codec.num_refs);
}

dec->ws->cs_add_buffer(&dec->cs, dpb->dpb.res->buf,
RADEON_USAGE_READWRITE | RADEON_USAGE_SYNCHRONIZED, RADEON_DOMAIN_VRAM);
addr = dec->ws->buffer_get_virtual_address(dpb->dpb.res->buf);

@@ -120,6 +120,7 @@ struct radeon_decoder {
} bts;
uint8_t index;
unsigned ref_size;
unsigned num_refs;
uint8_t ref_list[16];
} ref_codec;
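
The hunks above change how the AV1 decoder tracks references: instead of only writing the 0x7f "unused" sentinel into ref_frame_map, the loop now counts how many slots actually hold a picture (num_refs), and the dynamic DPB message later pads its address table until exactly that many entries are populated. A minimal standalone sketch of that bookkeeping, with invented names, sizes and addresses (not the driver code itself):

/* Sketch: count valid reference slots, then pad an address table up to that
 * count. All names and values here are illustrative. */
#include <stdio.h>
#include <stdint.h>
#include <assert.h>

#define NUM_REFS    8
#define UNUSED_SLOT 0x7f

int main(void)
{
   static int pic_a, pic_b, pic_c;
   /* Pretend only slots 0, 2 and 3 reference a real picture. */
   const void *pictures[NUM_REFS] = { &pic_a, NULL, &pic_b, &pic_c };

   uint8_t ref_map[NUM_REFS];
   unsigned num_refs = 0;

   for (unsigned i = 0; i < NUM_REFS; i++) {
      if (pictures[i]) {
         ref_map[i] = (uint8_t)i;   /* stand-in for the buffer's index */
         num_refs++;
      } else {
         ref_map[i] = UNUSED_SLOT;
      }
   }

   /* Address table: some entries were bound by an earlier matching pass. */
   uint64_t dpb_addr[NUM_REFS] = { 0x1000, 0, 0x2000, 0 };
   const uint64_t fallback = 0xdead0000;
   unsigned array_size = 0;

   for (unsigned i = 0; i < NUM_REFS; i++)
      if (dpb_addr[i])
         array_size++;

   /* Fill any still-empty entries with a fallback address so the table
    * advertises exactly num_refs usable slots. */
   if (array_size < num_refs) {
      for (unsigned i = 0; i < num_refs; i++) {
         if (dpb_addr[i])
            continue;
         dpb_addr[i] = fallback;
         array_size++;
      }
   }
   assert(array_size == num_refs);

   printf("num_refs = %u\n", num_refs);
   return 0;
}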
@@ -176,6 +176,15 @@ bool si_alloc_resource(struct si_screen *sscreen, struct si_resource *res)
util_range_set_empty(&res->valid_buffer_range);
res->TC_L2_dirty = false;

if (res->b.b.target != PIPE_BUFFER && !(res->b.b.flags & SI_RESOURCE_AUX_PLANE)) {
/* The buffer is shared with other planes. */
struct si_resource *plane = (struct si_resource *)res->b.b.next;
for (; plane; plane = (struct si_resource *)plane->b.b.next) {
radeon_bo_reference(sscreen->ws, &plane->buf, res->buf);
plane->gpu_address = res->gpu_address;
}
}

/* Print debug information. */
if (sscreen->debug_flags & DBG(VM) && res->b.b.target == PIPE_BUFFER) {
fprintf(stderr, "VM start=0x%" PRIX64 " end=0x%" PRIX64 " | Buffer %" PRIu64 " bytes | Flags: ",
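
The hunk above walks the chain of linked planes after the main buffer has been (re)allocated, so every plane references the same buffer and picks up its new GPU address. A toy version of that propagation, with made-up types and a simple refcount (not the radeonsi code):

/* Sketch: point every linked plane at the newly allocated buffer. */
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

struct toy_buf { int refcount; uint64_t gpu_address; };

struct toy_resource {
   struct toy_resource *next;   /* next plane in the chain */
   struct toy_buf *buf;
   uint64_t gpu_address;
};

static void buf_reference(struct toy_buf **dst, struct toy_buf *src)
{
   if (src)
      src->refcount++;
   if (*dst && --(*dst)->refcount == 0)
      free(*dst);
   *dst = src;
}

int main(void)
{
   struct toy_buf *buf = calloc(1, sizeof(*buf));
   buf->refcount = 1;
   buf->gpu_address = 0x100000;

   struct toy_resource plane1 = { 0 }, plane0 = { .next = &plane1 };
   plane0.buf = buf;
   plane0.gpu_address = buf->gpu_address;

   /* Propagate the new allocation to the other planes. */
   for (struct toy_resource *p = plane0.next; p; p = p->next) {
      buf_reference(&p->buf, plane0.buf);
      p->gpu_address = plane0.gpu_address;
   }

   printf("plane1 address: 0x%" PRIx64 ", refcount %d\n",
          plane1.gpu_address, buf->refcount);
   return 0;
}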
@@ -1933,13 +1933,7 @@ emit_alu(struct ntv_context *ctx, nir_alu_instr *alu)
result = emit_builtin_unop(ctx, GLSLstd450PackHalf2x16, get_def_type(ctx, &alu->def, nir_type_uint), src[0]);
break;

case nir_op_unpack_64_2x32:
assert(nir_op_infos[alu->op].num_inputs == 1);
result = emit_builtin_unop(ctx, GLSLstd450UnpackDouble2x32, get_def_type(ctx, &alu->def, nir_type_uint), src[0]);
break;

BUILTIN_UNOPF(nir_op_unpack_half_2x16, GLSLstd450UnpackHalf2x16)
BUILTIN_UNOPF(nir_op_pack_64_2x32, GLSLstd450PackDouble2x32)
#undef BUILTIN_UNOP
#undef BUILTIN_UNOPF

@@ -2125,9 +2119,11 @@ emit_alu(struct ntv_context *ctx, nir_alu_instr *alu)
/* those are all simple bitcasts, we could do better, but it doesn't matter */
case nir_op_pack_32_4x8:
case nir_op_pack_32_2x16:
case nir_op_pack_64_2x32:
case nir_op_pack_64_4x16:
case nir_op_unpack_32_4x8:
case nir_op_unpack_32_2x16:
case nir_op_unpack_64_2x32:
case nir_op_unpack_64_4x16: {
result = emit_bitcast(ctx, dest_type, src[0]);
break;
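
The change above moves nir_op_pack_64_2x32/unpack_64_2x32 from the GLSL Pack/UnpackDouble2x32 builtins into the plain-bitcast group. The reason this is legal is that the two 32-bit halves and the 64-bit value are the same bits viewed through different types. A small standalone demonstration of that equivalence (plain C, not the SPIR-V emitter):

/* Sketch: pack/unpack of 2x32 <-> 64 is a pure reinterpretation of bits. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <assert.h>

int main(void)
{
   uint32_t halves[2] = { 0xdeadbeefu, 0x12345678u };
   uint64_t packed;

   /* "pack_64_2x32": reinterpret the uvec2 as a single 64-bit value. */
   memcpy(&packed, halves, sizeof(packed));

   /* "unpack_64_2x32": reinterpret the 64-bit value as two 32-bit words. */
   uint32_t unpacked[2];
   memcpy(unpacked, &packed, sizeof(unpacked));

   assert(unpacked[0] == halves[0] && unpacked[1] == halves[1]);
   printf("packed = 0x%016llx\n", (unsigned long long)packed);
   return 0;
}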
@@ -309,6 +309,11 @@ zink_batch_state_destroy(struct zink_screen *screen, struct zink_batch_state *bs
util_dynarray_fini(&bs->bindless_releases[0]);
util_dynarray_fini(&bs->bindless_releases[1]);
util_dynarray_fini(&bs->acquires);
util_dynarray_fini(&bs->signal_semaphores);
util_dynarray_fini(&bs->wait_semaphores);
util_dynarray_fini(&bs->wait_semaphore_stages);
util_dynarray_fini(&bs->fd_wait_semaphores);
util_dynarray_fini(&bs->fd_wait_semaphore_stages);
util_dynarray_fini(&bs->acquire_flags);
unsigned num_mfences = util_dynarray_num_elements(&bs->fence.mfences, void *);
struct zink_tc_fence **mfence = bs->fence.mfences.data;
@@ -4846,8 +4846,11 @@ zink_resource_commit(struct pipe_context *pctx, struct pipe_resource *pres, unsi
VkSemaphore sem = VK_NULL_HANDLE;
bool ret = zink_bo_commit(ctx, res, level, box, commit, &sem);
if (ret) {
if (sem)
if (sem) {
zink_batch_add_wait_semaphore(&ctx->batch, sem);
zink_batch_reference_resource_rw(&ctx->batch, res, true);
ctx->batch.has_work = true;
}
} else {
check_device_lost(ctx);
}
@@ -513,6 +513,7 @@ namespace {
LLVMRunPasses(wrap(&mod), opt_str, tm, opts);

LLVMDisposeTargetMachine(tm);
LLVMDisposePassBuilderOptions(opts);
}

std::unique_ptr<Module>
@@ -2385,7 +2385,7 @@ dri2_init_screen(struct dri_screen *screen)
pscreen = pipe_loader_create_screen(screen->dev);

if (!pscreen)
goto fail;
return NULL;

dri_init_options(screen);
screen->throttle = pscreen->get_param(pscreen, PIPE_CAP_THROTTLE);

@@ -2419,7 +2419,7 @@ dri2_init_screen(struct dri_screen *screen)
return configs;

fail:
dri_release_screen(screen);
pipe_loader_release(&screen->dev, 1);

return NULL;
}
@@ -546,6 +546,8 @@ drisw_init_screen(struct dri_screen *screen)
struct pipe_screen *pscreen = NULL;
const struct drisw_loader_funcs *lf = &drisw_lf;

(void) mtx_init(&screen->opencl_func_mutex, mtx_plain);

screen->swrast_no_present = debug_get_option_swrast_no_present();

if (loader->base.version >= 4) {

@@ -565,7 +567,7 @@ drisw_init_screen(struct dri_screen *screen)
pscreen = pipe_loader_create_screen(screen->dev);

if (!pscreen)
goto fail;
return NULL;

dri_init_options(screen);
configs = dri_init_screen(screen, pscreen);

@@ -593,7 +595,7 @@ drisw_init_screen(struct dri_screen *screen)

return configs;
fail:
dri_release_screen(screen);
pipe_loader_release(&screen->dev, 1);
return NULL;
}
@@ -115,6 +115,8 @@ kopper_init_screen(struct dri_screen *screen)
const __DRIconfig **configs;
struct pipe_screen *pscreen = NULL;

(void) mtx_init(&screen->opencl_func_mutex, mtx_plain);

if (!screen->kopper_loader) {
fprintf(stderr, "mesa: Kopper interface not found!\n"
" Ensure the versions of %s built with this version of Zink are\n"

@@ -134,7 +136,7 @@ kopper_init_screen(struct dri_screen *screen)
pscreen = pipe_loader_create_screen(screen->dev);

if (!pscreen)
goto fail;
return NULL;

dri_init_options(screen);
screen->unwrapped_screen = trace_screen_unwrap(pscreen);

@@ -167,7 +169,7 @@ kopper_init_screen(struct dri_screen *screen)

return configs;
fail:
dri_release_screen(screen);
pipe_loader_release(&screen->dev, 1);
return NULL;
}
@@ -10,6 +10,7 @@ use mesa_rust_util::static_assert;
use rusticl_opencl_gen::*;

use std::collections::HashSet;
use std::mem;
use std::slice;
use std::sync::Arc;
use std::sync::Condvar;

@@ -272,6 +273,27 @@ impl Event {
}
}

impl Drop for Event {
// implement drop in order to prevent stack overflows of long dependency chains.
//
// This abuses the fact that `Arc::into_inner` only succeeds when there is one strong reference
// so we turn a recursive drop chain into a drop list for events having no other references.
fn drop(&mut self) {
if self.deps.is_empty() {
return;
}

let mut deps_list = vec![mem::take(&mut self.deps)];
while let Some(deps) = deps_list.pop() {
for dep in deps {
if let Some(mut dep) = Arc::into_inner(dep) {
deps_list.push(mem::take(&mut dep.deps));
}
}
}
}
}

// TODO worker thread per device
// Condvar to wait on new events to work on
// notify condvar when flushing queue events to worker
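
The Rust hunk above replaces an implicit recursive drop of an event's dependency chain with an explicit work list, so releasing one event cannot recurse once per dependency and blow the stack. The same idea expressed in C with toy refcounted nodes (illustrative only, not the rusticl code):

/* Sketch: release a long dependency chain iteratively instead of
 * recursively, using an intrusive work list. */
#include <stdio.h>
#include <stdlib.h>

struct node {
   int refcount;
   struct node *dep;       /* the dependency this node keeps alive */
   struct node *work_next; /* linkage for the iterative free list */
};

static void node_unref(struct node *n)
{
   struct node *worklist = NULL;

   if (!n || --n->refcount > 0)
      return;
   n->work_next = worklist;
   worklist = n;

   /* Instead of recursing into the dependency's destructor, queue any
    * dependency whose refcount also dropped to zero and keep looping. */
   while (worklist) {
      struct node *cur = worklist;
      worklist = cur->work_next;

      struct node *dep = cur->dep;
      free(cur);

      if (dep && --dep->refcount == 0) {
         dep->work_next = worklist;
         worklist = dep;
      }
   }
}

int main(void)
{
   /* Build a long chain: node[i] depends on node[i + 1]. */
   enum { N = 1000000 };
   struct node *head = NULL, *prev = NULL;
   for (int i = 0; i < N; i++) {
      struct node *n = calloc(1, sizeof(*n));
      n->refcount = 1;
      if (prev)
         prev->dep = n;
      else
         head = n;
      prev = n;
   }

   node_unref(head);  /* frees the whole chain without deep recursion */
   printf("chain released\n");
   return 0;
}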
@@ -658,6 +658,7 @@ impl PipeContext {

impl Drop for PipeContext {
fn drop(&mut self) {
self.flush().wait();
unsafe {
self.pipe.as_ref().destroy.unwrap()(self.pipe.as_ptr());
}
@@ -1033,7 +1033,8 @@ vlVaRenderPicture(VADriverContextP ctx, VAContextID context_id, VABufferID *buff

case VASliceDataBufferType:
vaStatus = handleVASliceDataBufferType(context, buf);
slice_offset += buf->size;
if (slice_idx)
slice_offset += buf->size;
break;

case VAProcPipelineParameterBufferType:
@@ -186,6 +186,7 @@ void vlVaHandleSliceParameterBufferH264(vlVaContext *context, vlVaBuffer *buf)
assert(context->desc.h264.slice_count < max_pipe_h264_slices);

context->desc.h264.slice_parameter.slice_info_present = true;
context->desc.h264.slice_parameter.slice_type[context->desc.h264.slice_count] = h264->slice_type;
context->desc.h264.slice_parameter.slice_data_size[context->desc.h264.slice_count] = h264->slice_data_size;
context->desc.h264.slice_parameter.slice_data_offset[context->desc.h264.slice_count] = h264->slice_data_offset;
@@ -411,6 +411,7 @@ struct pipe_h264_picture_desc
{
bool slice_info_present;
uint32_t slice_count;
uint8_t slice_type[128];
uint32_t slice_data_size[128];
uint32_t slice_data_offset[128];
enum pipe_slice_buffer_placement_type slice_data_flag[128];
@@ -28,5 +28,5 @@ libi915drm = static_library(
inc_include, inc_src, inc_gallium, inc_gallium_aux, inc_gallium_drivers
],
link_with : [libintel_common],
dependencies : [dep_libdrm, dep_libdrm_intel],
dependencies : [dep_libdrm, dep_libdrm_intel, idep_intel_dev_wa],
)
@@ -32,7 +32,9 @@
#include <dlfcn.h>
#include "dri_common.h"
#include "drisw_priv.h"
#ifdef HAVE_DRI3
#include "dri3_priv.h"
#endif
#include <X11/extensions/shmproto.h>
#include <assert.h>
#include <vulkan/vulkan_core.h>
@@ -908,9 +908,11 @@ __glXInitialize(Display * dpy)
#endif /* HAVE_DRI3 */
if (!debug_get_bool_option("LIBGL_DRI2_DISABLE", false))
dpyPriv->dri2Display = dri2CreateDisplay(dpy);
#if defined(HAVE_ZINK)
if (!dpyPriv->dri3Display && !dpyPriv->dri2Display)
try_zink = !debug_get_bool_option("LIBGL_KOPPER_DISABLE", false) &&
!getenv("GALLIUM_DRIVER");
#endif /* HAVE_ZINK */
}
#endif /* GLX_USE_DRM */
if (glx_direct)
@@ -743,7 +743,7 @@ build_desc_addr_for_res_index(nir_builder *b,
static nir_def *
build_desc_addr_for_binding(nir_builder *b,
unsigned set, unsigned binding,
nir_def *array_index,
nir_def *array_index, unsigned plane,
const struct apply_pipeline_layout_state *state)
{
const struct anv_descriptor_set_binding_layout *bind_layout =

@@ -759,6 +759,10 @@ build_desc_addr_for_binding(nir_builder *b,
array_index,
bind_layout->descriptor_surface_stride),
bind_layout->descriptor_surface_offset);
if (plane != 0) {
desc_offset = nir_iadd_imm(
b, desc_offset, plane * bind_layout->descriptor_data_surface_size);
}

return nir_vec4(b, nir_unpack_64_2x32_split_x(b, set_addr),
nir_unpack_64_2x32_split_y(b, set_addr),

@@ -766,14 +770,21 @@ build_desc_addr_for_binding(nir_builder *b,
desc_offset);
}

case nir_address_format_32bit_index_offset:
case nir_address_format_32bit_index_offset: {
nir_def *desc_offset =
nir_iadd_imm(b,
nir_imul_imm(b,
array_index,
bind_layout->descriptor_surface_stride),
bind_layout->descriptor_surface_offset);
if (plane != 0) {
desc_offset = nir_iadd_imm(
b, desc_offset, plane * bind_layout->descriptor_data_surface_size);
}
return nir_vec2(b,
nir_imm_int(b, state->set[set].desc_offset),
nir_iadd_imm(b,
nir_imul_imm(b,
array_index,
bind_layout->descriptor_surface_stride),
bind_layout->descriptor_surface_offset));
desc_offset);
}

default:
unreachable("Unhandled address format");

@@ -827,7 +838,8 @@ build_surface_index_for_binding(nir_builder *b,
set_offset = nir_imm_int(b, 0xdeaddead);

nir_def *desc_addr =
build_desc_addr_for_binding(b, set, binding, array_index, state);
build_desc_addr_for_binding(b, set, binding, array_index,
plane, state);

surface_index =
build_load_descriptor_mem(b, desc_addr, 0, 1, 32, state);

@@ -908,7 +920,8 @@ build_sampler_handle_for_binding(nir_builder *b,
set_offset = nir_imm_int(b, 0xdeaddead);

nir_def *desc_addr =
build_desc_addr_for_binding(b, set, binding, array_index, state);
build_desc_addr_for_binding(b, set, binding, array_index,
plane, state);

/* This is anv_sampled_image_descriptor, the sampler handle is always
* in component 1.

@@ -1384,7 +1397,8 @@ lower_load_accel_struct_desc(nir_builder *b,

struct res_index_defs res = unpack_res_index(b, res_index);
nir_def *desc_addr =
build_desc_addr_for_binding(b, set, binding, res.array_index, state);
build_desc_addr_for_binding(b, set, binding, res.array_index,
0 /* plane */, state);

/* Acceleration structure descriptors are always uint64_t */
nir_def *desc = build_load_descriptor_mem(b, desc_addr, 0, 1, 64, state);

@@ -1613,7 +1627,7 @@ lower_image_size_intrinsic(nir_builder *b, nir_intrinsic_instr *intrin,
}

nir_def *desc_addr = build_desc_addr_for_binding(
b, set, binding, array_index, state);
b, set, binding, array_index, 0 /* plane */, state);

b->cursor = nir_after_instr(&intrin->instr);
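
The hunks above thread a plane index through build_desc_addr_for_binding(), so the byte offset of a descriptor becomes binding offset + array index times stride + plane times per-plane size. A tiny sketch of that arithmetic with invented numbers (the real layout lives in anv's binding tables, not here):

/* Sketch: descriptor offset including a multi-plane term. */
#include <stdio.h>
#include <stdint.h>

static uint32_t desc_offset(uint32_t binding_offset, uint32_t stride,
                            uint32_t array_index, uint32_t plane,
                            uint32_t per_plane_size)
{
   return binding_offset + array_index * stride + plane * per_plane_size;
}

int main(void)
{
   /* e.g. a two-plane sampled image with 64-byte per-plane descriptors */
   const uint32_t binding_offset = 256, stride = 128, per_plane = 64;

   for (uint32_t plane = 0; plane < 2; plane++)
      printf("element 3, plane %u -> offset %u\n",
             plane, desc_offset(binding_offset, stride, 3, plane, per_plane));
   return 0;
}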
@@ -3180,6 +3180,12 @@ struct anv_push_constants {
/** Dynamic offsets for dynamic UBOs and SSBOs */
uint32_t dynamic_offsets[MAX_DYNAMIC_BUFFERS];

/* Robust access pushed registers. */
uint64_t push_reg_mask[MESA_SHADER_STAGES];

/** Ray query globals (RT_DISPATCH_GLOBALS) */
uint64_t ray_query_globals;

union {
struct {
/** Dynamic MSAA value */

@@ -3200,16 +3206,12 @@ struct anv_push_constants {
*
* This is never set by software but is implicitly filled out when
* uploading the push constants for compute shaders.
*
* This *MUST* be the last field of the anv_push_constants structure.
*/
uint32_t subgroup_id;
} cs;
};

/* Robust access pushed registers. */
uint64_t push_reg_mask[MESA_SHADER_STAGES];

/** Ray query globals (RT_DISPATCH_GLOBALS) */
uint64_t ray_query_globals;
};

struct anv_surface_state {
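
The struct reshuffle above exists because subgroup_id is filled in implicitly when compute push constants are uploaded, so it has to remain the very last field; fields placed after the union broke that invariant. That kind of invariant can be checked at compile time with offsetof. The struct below is a simplified stand-in with made-up field sizes, not anv_push_constants itself:

/* Sketch (C11): assert that subgroup_id is the last byte of the struct. */
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

struct push_constants_sketch {
   uint32_t dynamic_offsets[16];
   uint64_t push_reg_mask[8];
   uint64_t ray_query_globals;
   union {
      struct {
         uint32_t base_work_group_id[3];
         uint32_t subgroup_id;   /* implicitly uploaded; must stay last */
      } cs;
   };
};

/* Fails to compile if anything is ever added after the union holding
 * subgroup_id (assuming no trailing padding for these field sizes). */
static_assert(offsetof(struct push_constants_sketch, cs.subgroup_id) +
              sizeof(uint32_t) == sizeof(struct push_constants_sketch),
              "subgroup_id must be the last field");

int main(void) { return 0; }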
@@ -1438,13 +1438,15 @@ nvk_flush_ms_state(struct nvk_cmd_buffer *cmd)
});
}

if (BITSET_TEST(dyn->dirty, MESA_VK_DYNAMIC_MS_SAMPLE_LOCATIONS) ||
if (BITSET_TEST(dyn->dirty, MESA_VK_DYNAMIC_MS_RASTERIZATION_SAMPLES) ||
BITSET_TEST(dyn->dirty, MESA_VK_DYNAMIC_MS_SAMPLE_LOCATIONS) ||
BITSET_TEST(dyn->dirty, MESA_VK_DYNAMIC_MS_SAMPLE_LOCATIONS_ENABLE)) {
const struct vk_sample_locations_state *sl;
if (dyn->ms.sample_locations_enable) {
sl = dyn->ms.sample_locations;
} else {
sl = vk_standard_sample_locations_state(dyn->ms.rasterization_samples);
const uint32_t samples = MAX2(1, dyn->ms.rasterization_samples);
sl = vk_standard_sample_locations_state(samples);
}

for (uint32_t i = 0; i < sl->per_pixel; i++) {
@@ -130,6 +130,7 @@ nvk_meta_end(struct nvk_cmd_buffer *cmd,
{
if (save->desc0) {
cmd->state.gfx.descriptors.sets[0] = save->desc0;
cmd->state.gfx.descriptors.set_sizes[0] = save->desc0->size;
cmd->state.gfx.descriptors.root.sets[0] = nvk_descriptor_set_addr(save->desc0);
cmd->state.gfx.descriptors.sets_dirty |= BITFIELD_BIT(0);
cmd->state.gfx.descriptors.push_dirty &= ~BITFIELD_BIT(0);
@@ -10,6 +10,9 @@
#include <sys/mman.h>
#include <xf86drm.h>

#include "nvidia/classes/cl9097.h"
#include "nvidia/classes/clc597.h"

static void
bo_bind(struct nouveau_ws_device *dev,
uint32_t handle, uint64_t addr,

@@ -170,9 +173,10 @@ nouveau_ws_bo_new_mapped(struct nouveau_ws_device *dev,
}

static struct nouveau_ws_bo *
nouveau_ws_bo_new_locked(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
enum nouveau_ws_bo_flags flags)
nouveau_ws_bo_new_tiled_locked(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
uint8_t pte_kind, uint16_t tile_mode,
enum nouveau_ws_bo_flags flags)
{
struct drm_nouveau_gem_new req = {};

@@ -205,6 +209,9 @@ nouveau_ws_bo_new_locked(struct nouveau_ws_device *dev,
if (flags & NOUVEAU_WS_BO_NO_SHARE)
req.info.domain |= NOUVEAU_GEM_DOMAIN_NO_SHARE;

req.info.tile_flags = (uint32_t)pte_kind << 8;
req.info.tile_mode = tile_mode;

req.info.size = size;
req.align = align;

@@ -242,19 +249,29 @@ fail_gem_new:
}

struct nouveau_ws_bo *
nouveau_ws_bo_new(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
enum nouveau_ws_bo_flags flags)
nouveau_ws_bo_new_tiled(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
uint8_t pte_kind, uint16_t tile_mode,
enum nouveau_ws_bo_flags flags)
{
struct nouveau_ws_bo *bo;

simple_mtx_lock(&dev->bos_lock);
bo = nouveau_ws_bo_new_locked(dev, size, align, flags);
bo = nouveau_ws_bo_new_tiled_locked(dev, size, align,
pte_kind, tile_mode, flags);
simple_mtx_unlock(&dev->bos_lock);

return bo;
}

struct nouveau_ws_bo *
nouveau_ws_bo_new(struct nouveau_ws_device *dev,
uint64_t size, uint64_t align,
enum nouveau_ws_bo_flags flags)
{
return nouveau_ws_bo_new_tiled(dev, size, align, 0, 0, flags);
}

static struct nouveau_ws_bo *
nouveau_ws_bo_from_dma_buf_locked(struct nouveau_ws_device *dev, int fd)
{

@@ -265,8 +282,11 @@ nouveau_ws_bo_from_dma_buf_locked(struct nouveau_ws_device *dev, int fd)

struct hash_entry *entry =
_mesa_hash_table_search(dev->bos, (void *)(uintptr_t)handle);
if (entry != NULL)
return entry->data;
if (entry != NULL) {
struct nouveau_ws_bo *bo = entry->data;
nouveau_ws_bo_ref(bo);
return bo;
}

/*
* If we got here, no BO exists for the retrieved handle. If we error
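
One of the hunks above fixes the dma-buf import path: when the handle is already present in the hash table, the BO must be handed back with an extra reference rather than as a bare pointer, otherwise two owners share one refcount and the first release frees it under the other. A generic sketch of that pattern with a toy cache (not the winsys code):

/* Sketch: a cache hit must take a new reference before returning. */
#include <stdio.h>
#include <stdlib.h>

struct cached_obj { int handle; int refcount; };

#define CACHE_SIZE 8
static struct cached_obj *cache[CACHE_SIZE];

static struct cached_obj *lookup_or_create(int handle)
{
   struct cached_obj **slot = &cache[handle % CACHE_SIZE];

   if (*slot && (*slot)->handle == handle) {
      /* Cache hit: hand out a new reference, never the bare pointer. */
      (*slot)->refcount++;
      return *slot;
   }

   struct cached_obj *obj = calloc(1, sizeof(*obj));
   obj->handle = handle;
   obj->refcount = 1;
   *slot = obj;
   return obj;
}

static void release(struct cached_obj *obj)
{
   if (--obj->refcount == 0) {
      cache[obj->handle % CACHE_SIZE] = NULL;
      free(obj);
   }
}

int main(void)
{
   struct cached_obj *a = lookup_or_create(3);  /* creates, refcount 1 */
   struct cached_obj *b = lookup_or_create(3);  /* hit, refcount 2     */

   release(a);  /* refcount back to 1, 'b' is still valid */
   printf("handle %d still has refcount %d\n", b->handle, b->refcount);
   release(b);
   return 0;
}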
@@ -68,6 +68,11 @@ struct nouveau_ws_bo *nouveau_ws_bo_new_mapped(struct nouveau_ws_device *,
enum nouveau_ws_bo_flags,
enum nouveau_ws_bo_map_flags map_flags,
void **map_out);
struct nouveau_ws_bo *nouveau_ws_bo_new_tiled(struct nouveau_ws_device *,
uint64_t size, uint64_t align,
uint8_t pte_kind,
uint16_t tile_mode,
enum nouveau_ws_bo_flags);
struct nouveau_ws_bo *nouveau_ws_bo_from_dma_buf(struct nouveau_ws_device *,
int fd);
void nouveau_ws_bo_destroy(struct nouveau_ws_bo *);
@@ -380,3 +380,13 @@ nouveau_ws_device_destroy(struct nouveau_ws_device *device)
close(device->fd);
FREE(device);
}

bool
nouveau_ws_device_has_tiled_bo(struct nouveau_ws_device *device)
{
uint64_t has = 0;
if (nouveau_ws_param(device->fd, NOUVEAU_GETPARAM_HAS_VMA_TILEMODE, &has))
return false;

return has != 0;
}
@@ -64,6 +64,8 @@ struct nouveau_ws_device {
struct nouveau_ws_device *nouveau_ws_device_new(struct _drmDevice *drm_device);
void nouveau_ws_device_destroy(struct nouveau_ws_device *);

bool nouveau_ws_device_has_tiled_bo(struct nouveau_ws_device *device);

#ifdef __cplusplus
}
#endif
@@ -208,5 +208,9 @@ Application bugs worked around in this file:
<application name="Half-Life Alyx" application_name_match="hlvr">
<option name="dual_color_blend_by_location" value="true" />
</application>

<application name="Enshrouded" executable="enshrouded.exe">
<option name="radv_zero_vram" value="true"/>
</application>
</device>
</driconf>
@@ -209,7 +209,8 @@ __bitset_shl(BITSET_WORD *x, unsigned amount, unsigned n)
*/
#define BITSET_TEST_RANGE_INSIDE_WORD(x, b, e, mask) \
(BITSET_BITWORD(b) == BITSET_BITWORD(e) ? \
(((x)[BITSET_BITWORD(b)] & BITSET_RANGE(b, e)) == mask) : \
(((x)[BITSET_BITWORD(b)] & BITSET_RANGE(b, e)) == \
(((BITSET_WORD)mask) << (b % BITSET_WORDBITS))) : \
(assert (!"BITSET_TEST_RANGE: bit range crosses word boundary"), 0))
#define BITSET_SET_RANGE_INSIDE_WORD(x, b, e) \
(BITSET_BITWORD(b) == BITSET_BITWORD(e) ? \
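
The bitset fix above matters whenever the tested range does not start at bit 0 of a word: the caller's mask is expressed relative to the start of the range, so it has to be shifted into place before the comparison. A small demonstration with plain integers (not the Mesa macros, and using a fixed 32-bit word for clarity):

/* Sketch: why the range-relative mask must be shifted by the start bit. */
#include <stdio.h>
#include <stdint.h>
#include <assert.h>

int main(void)
{
   const unsigned b = 5;              /* test bits 5..7 of the word        */
   const uint32_t word = 0x5u << b;   /* those bits currently hold 0b101   */
   const uint32_t range = 0x7u << b;  /* stand-in for BITSET_RANGE(5, 7)   */
   const uint32_t mask = 0x5;         /* caller's range-relative mask      */

   /* Old comparison: only correct when b happens to be 0. */
   int old_result = (word & range) == mask;

   /* Fixed comparison: shift the mask up to the range's position. */
   int new_result = (word & range) == (mask << (b % 32));

   printf("old: %d, fixed: %d\n", old_result, new_result);
   assert(!old_result && new_result);
   return 0;
}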
@@ -532,7 +532,7 @@ wsi_create_native_image_mem(const struct wsi_swapchain *chain,

for (uint32_t p = 0; p < image->num_planes; p++) {
const VkImageSubresource image_subresource = {
.aspectMask = VK_IMAGE_ASPECT_PLANE_0_BIT << p,
.aspectMask = VK_IMAGE_ASPECT_MEMORY_PLANE_0_BIT_EXT << p,
.mipLevel = 0,
.arrayLayer = 0,
};
@@ -41,6 +41,15 @@ endif
if rc.version().version_compare('< 1.57')
rust_args += ['--cfg', 'no_is_available']
endif
if rc.version().version_compare('< 1.66')
rust_args += ['--cfg', 'no_source_text']
endif
if rc.version().version_compare('< 1.79')
rust_args += [
'--cfg', 'no_literal_byte_character',
'--cfg', 'no_literal_c_string',
]
endif

u_ind = subproject('unicode-ident').get_variable('lib')