Compare commits


28 Commits

Author SHA1 Message Date
Dylan Baker
716fc5280a VERSION: bump version for 22.0.0-rc2 2022-02-09 09:43:34 -08:00
Lionel Landwerlin
2e1387c752 anv: fix conditional render for vkCmdDrawIndirectByteCountEXT
We just forgot about conditional render for this entry point.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2be89cbd82 ("anv: Implement vkCmdDrawIndirectByteCountEXT")
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14891>
(cherry picked from commit 93a90fc85d)
2022-02-08 09:23:53 -08:00
Lionel Landwerlin
a910e58ad8 intel/nir: fix shader call lowering
We're replacing a generic instruction with an Intel-specific one, so we
need to remove the previous instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c5a42e4010 ("intel/fs: fix shader call lowering pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
(cherry picked from commit 39f6cd5d79)
2022-02-08 09:23:53 -08:00
Lionel Landwerlin
54f49993d1 intel/fs: don't set allow_sample_mask for CS intrinsics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 77486db867 ("intel/fs: Disable sample mask predication for scratch stores")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
(cherry picked from commit c89024e446)
2022-02-08 09:23:52 -08:00
Dylan Baker
2b282fb3b5 .pick_status.json: Update to 5e9df85b1a 2022-02-08 09:23:49 -08:00
Dave Airlie
4e67d2aad4 crocus: find correct relocation target for the bo.
If we have batches a and b, and writing to batch b causes batch a
to flush, all the bo->index values get reset, and we end up submitting
a -1 to the kernel.

Look up the bo index when creating relocations.

Fixes a crash seen in KHR-GL46.compute_shader.pipeline-post-fs
and a trace from Wasteland 3.

Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14905>
(cherry picked from commit 37c3be6947)

Conflicts:
	src/gallium/drivers/crocus/ci/crocus-hsw-flakes.txt

I've deleted this file (the original commit removed an entry from it),
since it doesn't exist on the 22.0 branch and CI isn't run there.
2022-02-07 21:51:26 -08:00
Mike Blumenkrantz
8f5fb1eb10 zink: min/max blit region in coverage functions
these regions might not have the coords in the correct order, which will
cause them to fail intersection tests, resulting in clears that are never
applied
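The fix described above amounts to normalizing a region's coordinates before running interval-intersection tests. A minimal standalone sketch in C (struct and function names are illustrative, not zink's actual code), assuming a flipped blit encodes its flip as reversed coordinates:

```c
#include <assert.h>

/* A blit region; a "negative dimension" blit has x0 > x1 or y0 > y1. */
struct region {
   int x0, y0, x1, y1;
};

/* Reorder coords so x0 <= x1 and y0 <= y1; without this, a flipped
 * region fails the intersection test below and coverage is missed. */
static void
normalize_region(struct region *r)
{
   if (r->x0 > r->x1) { int t = r->x0; r->x0 = r->x1; r->x1 = t; }
   if (r->y0 > r->y1) { int t = r->y0; r->y0 = r->y1; r->y1 = t; }
}

/* Half-open interval overlap test, valid only on normalized regions. */
static int
regions_intersect(const struct region *a, const struct region *b)
{
   return a->x0 < b->x1 && b->x0 < a->x1 &&
          a->y0 < b->y1 && b->y0 < a->y1;
}
```

With the flipped region left unnormalized, regions_intersect() returns false for every partner, which is exactly the "clears that are never applied" symptom.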

cc: mesa-stable

fixes:
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_all_buffer_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_color_and_depth_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_color_and_stencil_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_linear_filter_color_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_magnifying_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_minifying_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_missing_buffers_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_nearest_filter_color_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_dimensions_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_height_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_width_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_scissor_blit

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14867>
(cherry picked from commit 388f23eabe)
2022-02-07 21:49:43 -08:00
Mike Blumenkrantz
4587268d2b zink: reject invalid draws
cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14859>
(cherry picked from commit b656ab75a6)
2022-02-07 21:49:43 -08:00
Mike Blumenkrantz
a04818a500 zink: fix PIPE_CAP_TGSI_BALLOT export conditional
this requires VK_EXT_shader_subgroup_ballot

cc: mesa-stable

fixes (lavapipe):
KHR-GL46.shader_ballot_tests.ShaderBallotAvailability
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionRead

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14858>
(cherry picked from commit e38c13830f)
2022-02-07 21:49:42 -08:00
Rhys Perry
59b2c1ddde radv: fix R_02881C_PA_CL_VS_OUT_CNTL with mixed cull/clip distances
Matches radeonsi.

Seems Vulkan CTS doesn't really test cull distances. Removing
VARYING_SLOT_CULL_DIST0/VARYING_SLOT_CULL_DIST1 variables doesn't break
any of dEQP-VK.clipping.*, except for tests which read the variables in
the fragment shader.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5984
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14882>
(cherry picked from commit 7ddad1b93a)
2022-02-07 21:49:42 -08:00
Daniel Stone
2ce020120a egl/wayland: Reset buffer age when destroying buffers
A buffer age of 0 means that the buffer is uninitialised or has unknown
content. We rely on the buffer age initially being 0 through zalloc when
the surface is first created; when they are first used for a swap, we
set their age to 1, and then we increment the age of every buffer in the
chain with a non-zero age when we swap.

Now that we can release buffers, both through dmabuf-feedback as well as
detecting when we're using a deeper swapchain than the compositor needs,
make sure to reset their age as they are released. Without doing this,
the age will stay as it was before it was released and be incremented,
returning the wrong age to the user the first time a previously-released
buffer slot has been reused.
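The bookkeeping described above can be sketched standalone (types and names here are illustrative, not the actual egl/wayland code): swapping sets the presented buffer's age to 1 and bumps every other buffer with a non-zero age, so releasing a slot must drop its age back to 0.

```c
#include <assert.h>

#define NUM_BUFFERS 4

struct color_buffer {
   int locked;      /* held by the compositor */
   unsigned age;    /* 0 = uninitialised / unknown content */
};

/* Present buffer `idx`: it becomes the newest (age 1), and every other
 * buffer that already has known content grows one frame older. */
static void
swap_buffers(struct color_buffer *bufs, int idx)
{
   for (int i = 0; i < NUM_BUFFERS; i++)
      if (i != idx && bufs[i].age > 0)
         bufs[i].age++;
   bufs[idx].age = 1;
}

/* The fix: a released slot no longer tracks valid content, so its age
 * must return to 0 instead of continuing to be incremented. */
static void
release_buffer(struct color_buffer *buf)
{
   buf->locked = 0;
   buf->age = 0;
}
```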

Signed-off-by: Daniel Stone <daniels@collabora.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5977
Fixes: 22d796feb8 ("egl/wayland: break double/tripple buffering feedback loops")
Fixes: b5848b2dac ("egl/wayland: use surface dma-buf feedback to allocate surface buffers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14873>
(cherry picked from commit 3da8300562)
2022-02-07 21:49:41 -08:00
Samuel Pitoiset
ba2d22e95f Revert "radv: re-apply "Do not access set layout during vkCmdBindDescriptorSets.""
The most famous RADV revert of the past months. This was an issue in
RADV and not a use-after-free (descriptor set layouts can be destroyed
at almost any time).

This reverts commit b775aaff1e.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14621>
(cherry picked from commit 9ea4029f9f)
2022-02-07 21:49:15 -08:00
Charmaine Lee
5ff5f3cbf7 mesa: fix misaligned pointer returned by dlist_alloc
In cases where the to-be-allocated node size with padding exceeds BLOCK_SIZE
but without padding doesn't, a new block is not created and no padding is done
to the previous instruction, causing a misaligned pointer to be returned.

v2: Per Ilia Mirkin's suggestion, remove the extra condition in the first
    if statement, let it unconditionally pad the last instruction if needed.
    The updated currentPos will then be taken into account in the
    block size checking.
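The sizing rule can be illustrated in isolation (constants and helper names here are made up for the example, not Mesa's actual display-list code): the requested node size has to be padded to the instruction alignment before comparing against the remaining block space, otherwise a request whose padded size crosses the block boundary gets carved out of the current block at a misaligned offset.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 256   /* illustrative block size */
#define ALIGNMENT    8   /* illustrative instruction alignment */

/* Round `v` up to the next multiple of power-of-two `a`. */
static uintptr_t
align_up(uintptr_t v, uintptr_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Returns 1 iff a request of `size` bytes, once padded, still fits in
 * the current block starting at `current_pos`.  Checking the unpadded
 * size here is the bug pattern described above. */
static int
fits_in_block(uintptr_t current_pos, uintptr_t size)
{
   return current_pos + align_up(size, ALIGNMENT) <= BLOCK_SIZE;
}
```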

This fixes a crash seen with the lightsmark and Optuma apitraces.

Fixes: 05605d7f53 ("mesa: remove display list OPCODE_NOP")

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Tested-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14871>
(cherry picked from commit 945a1e0b8c)
2022-02-07 21:36:05 -08:00
Neha Bhende
5a7a564d7c svga: store shared_mem_size in svga_compute_shader instead of svga_context
When a new context was created, shared_mem_size was getting overwritten.
This fixes glretrace failures seen with the manhattan, aztec and
BASS2_intro apitraces.

Fixes: 247c61f2d0 ('svga: Add support for compute shader, shader buffers and image views')

Tested with glretrace, piglit

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit dd6793ec9218782b1b716a87582d7219bae4e75f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14870>
(cherry picked from commit 9230b28533)
2022-02-07 21:36:04 -08:00
Mike Blumenkrantz
2c7d0e1b49 zink: use scanout obj when returning resource param info
embarrassing typo since the base obj has no modifier data available

cc: mesa-stable

fixes #5980

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14875>
(cherry picked from commit 960e72417f)
2022-02-07 21:36:04 -08:00
Danylo Piliaiev
83eef372a2 turnip: Unconditionally remove descriptor set from pool's list on free
We didn't remove the desc set from the pool's list if the pool was
host_memory_base. On the other hand, there is no point in removing the
desc set from the list in DestroyDescriptorPool/ResetDescriptorPool.

Fixes: da7a4751 ("turnip: Drop references to layout of all sets on pool reset/destruction")

Fixes cts tests:
 dEQP-VK.api.buffer_marker.graphics.default_mem.bottom_of_pipe.memory_dep.draw
 dEQP-VK.api.buffer_marker.graphics.default_mem.bottom_of_pipe.memory_dep.dispatch

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14855>
(cherry picked from commit 183bc15bdb)
2022-02-07 21:36:03 -08:00
Kenneth Graunke
0a1f166f4d iris: Make an iris_foreach_batch macro that skips unsupported batches
IRIS_BATCH_BLITTER isn't supported prior to Tigerlake; in general,
batches may not be supported on all hardware.  In most cases, querying
them is harmless (if useless): they reference nothing, have no commands
to flush, and so on.  However, the fence code does need to know that
certain batches don't exist, so it can avoid adding inter-batch fences
involving them.

This patch introduces a new iris_foreach_batch() iterator macro that
walks over all batches that are actually supported on the platform,
while skipping the others.  It provides a central place to update should
we add or reorder more batches in the future.

Fixes various tests in the piglit.spec.ext_external_objects.* category.

Thanks to Tapani Pälli for catching this.

Fixes: a90a1f15 ("iris: Create an IRIS_BATCH_BLITTER for using the BLT command streamer")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14834>
(cherry picked from commit fd0e4aedeb)
2022-02-07 21:36:03 -08:00
Jesse Natalie
68242654f8 microsoft/compiler: Only treat tess level location as special if it's a patch constant
Fixes: a550c059 ("microsoft/compiler: For load_input from DS, use loadPatchConstant")
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14837>
(cherry picked from commit ce6dbbabf9)
2022-02-07 21:36:02 -08:00
Jesse Natalie
c7bd1f0720 microsoft/compiler: Only prep phis for the current function
Fixes: 41af9620 ("microsoft/compiler: Emit all NIR functions into the DXIL module")
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14837>
(cherry picked from commit 0c711dc823)
2022-02-07 21:36:02 -08:00
Mike Blumenkrantz
88762cf59b zink: add VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT for query binds
required by spec

cc: mesa-stable

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14853>
(cherry picked from commit 1e96542390)
2022-02-07 21:36:01 -08:00
Dylan Baker
6420dc86cf .pick_status.json: Update to 8335fdfeaf 2022-02-07 21:35:59 -08:00
Rhys Perry
a58a01050c aco: don't encode src2 for v_writelane_b32_e64
Encoding src2 doesn't cause issues for print_asm() because we have a
workaround there, but it does for RGP and it seems the developers are not
interested in fixing it.

https://github.com/GPUOpen-Tools/radeon_gpu_profiler/issues/61

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14832>
(cherry picked from commit 0447a2303f)
2022-02-03 10:32:02 -08:00
Pierre-Eric Pelloux-Prayer
b6e296f823 radeonsi: limit loop unrolling for LLVM < 13
Without this change LLVM 12 hits this error:

"""
LLVM ERROR: Error while trying to spill SGPR0_SGPR1 from class SReg_64:
Cannot scavenge register without an emergency spill slot!
"""

when running glcts KHR-GL46.arrays_of_arrays_gl.AtomicUsage test.

Fixes: 9ff086052a ("radeonsi: unroll loops of up to 128 iterations")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14848>
(cherry picked from commit eaa87b1a46)
2022-02-03 10:32:02 -08:00
Iago Toral Quiroga
fabb6b5c5e broadcom/compiler: fix offset alignment for ldunifa when skipping
The intention was to align the address to 4 bytes (32-bit), not
16 bytes.

Fixes: bdb6201ea1 ("broadcom/compiler: use ldunifa with unaligned constant offset")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14830>
(cherry picked from commit 0a8449b07c)
2022-02-03 10:32:01 -08:00
Mike Blumenkrantz
0ec3de0563 llvmpipe: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
this cap is broken

cc: mesa-stable

fixes:
GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUnifor

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14835>
(cherry picked from commit 9a75392cd8)
2022-02-03 10:32:01 -08:00
Mike Blumenkrantz
b2be43a192 zink: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
this cap is broken

cc: mesa-stable

fixes:
GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14835>
(cherry picked from commit 9a38dab2d1)
2022-02-03 10:32:00 -08:00
Dylan Baker
9e17fcbed2 .pick_status.json: Update to 0447a2303f 2022-02-03 10:31:57 -08:00
Dylan Baker
c69a870f86 VERSION: bump for 22.0.0-rc1 release 2022-02-02 15:15:29 -08:00
35 changed files with 1907 additions and 128 deletions

.pick_status.json: diff suppressed because it is too large (1730 lines)


@@ -1 +1 @@
-22.0.0-devel
+22.0.0-rc2


@@ -625,6 +625,10 @@ emit_instruction(asm_context& ctx, std::vector<uint32_t>& out, Instruction* inst
    encoding = 0;
    if (instr->opcode == aco_opcode::v_interp_mov_f32) {
       encoding = 0x3 & instr->operands[0].constantValue();
+   } else if (instr->opcode == aco_opcode::v_writelane_b32_e64) {
+      encoding |= instr->operands[0].physReg() << 0;
+      encoding |= instr->operands[1].physReg() << 9;
+      /* Encoding src2 works fine with hardware but breaks some disassemblers. */
    } else {
       for (unsigned i = 0; i < instr->operands.size(); i++)
          encoding |= instr->operands[i].physReg() << (i * 9);


@@ -271,12 +271,6 @@ std::pair<bool, size_t>
 disasm_instr(chip_class chip, LLVMDisasmContextRef disasm, uint32_t* binary, unsigned exec_size,
              size_t pos, char* outline, unsigned outline_size)
 {
-   /* mask out src2 on v_writelane_b32 */
-   if (((chip == GFX8 || chip == GFX9) && (binary[pos] & 0xffff8000) == 0xd28a0000) ||
-       (chip >= GFX10 && (binary[pos] & 0xffff8000) == 0xd7610000)) {
-      binary[pos + 1] = binary[pos + 1] & 0xF803FFFF;
-   }
-
    size_t l =
       LLVMDisasmInstruction(disasm, (uint8_t*)&binary[pos], (exec_size - pos) * sizeof(uint32_t),
                             pos * 4, outline, outline_size);


@@ -4727,6 +4727,7 @@ radv_bind_descriptor_set(struct radv_cmd_buffer *cmd_buffer, VkPipelineBindPoint
    radv_set_descriptor_set(cmd_buffer, bind_point, set, idx);
    assert(set);
+   assert(!(set->header.layout->flags & VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR));
    if (!cmd_buffer->device->use_global_bo_list) {
       for (unsigned j = 0; j < set->header.buffer_count; ++j)
@@ -4764,7 +4765,7 @@ radv_CmdBindDescriptorSets(VkCommandBuffer commandBuffer, VkPipelineBindPoint pi
       radv_bind_descriptor_set(cmd_buffer, pipelineBindPoint, set, set_idx);
    }
-   for (unsigned j = 0; j < layout->set[set_idx].dynamic_offset_count; ++j, ++dyn_idx) {
+   for (unsigned j = 0; j < set->header.layout->dynamic_offset_count; ++j, ++dyn_idx) {
       unsigned idx = j + layout->set[i + firstSet].dynamic_offset_start;
       uint32_t *dst = descriptors_state->dynamic_buffers + idx * 4;
       assert(dyn_idx < dynamicOffsetCount);
@@ -4790,7 +4791,7 @@ radv_CmdBindDescriptorSets(VkCommandBuffer commandBuffer, VkPipelineBindPoint pi
          }
       }
-      cmd_buffer->push_constant_stages |= layout->set[set_idx].dynamic_offset_stages;
+      cmd_buffer->push_constant_stages |= set->header.layout->dynamic_shader_stages;
    }
 }
 }


@@ -496,16 +496,11 @@ radv_CreatePipelineLayout(VkDevice _device, const VkPipelineLayoutCreateInfo *pC
       layout->set[set].layout = set_layout;
       layout->set[set].dynamic_offset_start = dynamic_offset_count;
-      layout->set[set].dynamic_offset_count = 0;
-      layout->set[set].dynamic_offset_stages = 0;
       for (uint32_t b = 0; b < set_layout->binding_count; b++) {
-         layout->set[set].dynamic_offset_count +=
-            set_layout->binding[b].array_size * set_layout->binding[b].dynamic_offset_count;
-         layout->set[set].dynamic_offset_stages |= set_layout->dynamic_shader_stages;
+         dynamic_offset_count += set_layout->binding[b].array_size * set_layout->binding[b].dynamic_offset_count;
+         dynamic_shader_stages |= set_layout->dynamic_shader_stages;
       }
-      dynamic_offset_count += layout->set[set].dynamic_offset_count;
-      dynamic_shader_stages |= layout->set[set].dynamic_offset_stages;
       /* Hash the entire set layout except for the vk_object_base. The
        * rest of the set layout is carefully constructed to not have


@@ -89,9 +89,7 @@ struct radv_pipeline_layout {
    struct {
       struct radv_descriptor_set_layout *layout;
       uint32_t size;
-      uint16_t dynamic_offset_start;
-      uint16_t dynamic_offset_count;
-      VkShaderStageFlags dynamic_offset_stages;
+      uint32_t dynamic_offset_start;
    } set[MAX_SETS];
    uint32_t num_sets;


@@ -4773,7 +4773,7 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf *ctx_cs, struct radeon_cmdbuf
                           S_02881C_VS_OUT_MISC_SIDE_BUS_ENA(misc_vec_ena) |
                           S_02881C_VS_OUT_CCDIST0_VEC_ENA((total_mask & 0x0f) != 0) |
                           S_02881C_VS_OUT_CCDIST1_VEC_ENA((total_mask & 0xf0) != 0) |
-                          cull_dist_mask << 8 | clip_dist_mask);
+                          total_mask << 8 | clip_dist_mask);
    if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
       radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF, outinfo->writes_viewport_index);
@@ -4911,7 +4911,7 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf *ctx_cs, struct radeon_cmdbuf
                           S_02881C_VS_OUT_MISC_SIDE_BUS_ENA(misc_vec_ena) |
                           S_02881C_VS_OUT_CCDIST0_VEC_ENA((total_mask & 0x0f) != 0) |
                           S_02881C_VS_OUT_CCDIST1_VEC_ENA((total_mask & 0xf0) != 0) |
-                          cull_dist_mask << 8 | clip_dist_mask);
+                          total_mask << 8 | clip_dist_mask);
    radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN,
                           S_028A84_PRIMITIVEID_EN(es_enable_prim_id) |


@@ -3034,7 +3034,7 @@ ntq_emit_load_ubo_unifa(struct v3d_compile *c, nir_intrinsic_instr *instr)
          * alignment and skip over unused elements in result.
          */
         value_skips = (const_offset % 4) / (bit_size / 8);
-        const_offset &= ~0xf;
+        const_offset &= ~0x3;
      }
   }


@@ -256,6 +256,7 @@ wl_buffer_release(void *data, struct wl_buffer *buffer)
          wl_buffer_destroy(buffer);
          dri2_surf->color_buffers[i].wl_release = false;
          dri2_surf->color_buffers[i].wl_buffer = NULL;
+         dri2_surf->color_buffers[i].age = 0;
       }
       dri2_surf->color_buffers[i].locked = false;
@@ -863,6 +864,7 @@ dri2_wl_release_buffers(struct dri2_egl_surface *dri2_surf)
       dri2_surf->color_buffers[i].dri_image = NULL;
       dri2_surf->color_buffers[i].linear_copy = NULL;
       dri2_surf->color_buffers[i].data = NULL;
+      dri2_surf->color_buffers[i].age = 0;
    }
    if (dri2_dpy->dri2)
@@ -1145,6 +1147,7 @@ update_buffers(struct dri2_egl_surface *dri2_surf)
          dri2_surf->color_buffers[i].wl_buffer = NULL;
          dri2_surf->color_buffers[i].dri_image = NULL;
          dri2_surf->color_buffers[i].linear_copy = NULL;
+         dri2_surf->color_buffers[i].age = 0;
       }
    }
@@ -2342,6 +2345,7 @@ swrast_update_buffers(struct dri2_egl_surface *dri2_surf)
                 dri2_surf->color_buffers[i].data_size);
          dri2_surf->color_buffers[i].wl_buffer = NULL;
          dri2_surf->color_buffers[i].data = NULL;
+         dri2_surf->color_buffers[i].age = 0;
       }
    }


@@ -589,8 +589,6 @@ tu_descriptor_set_destroy(struct tu_device *device,
       }
    }
-   list_del(&set->pool_link);
-
    vk_object_free(&device->vk, NULL, set);
 }
@@ -814,8 +812,10 @@ tu_FreeDescriptorSets(VkDevice _device,
    for (uint32_t i = 0; i < count; i++) {
       TU_FROM_HANDLE(tu_descriptor_set, set, pDescriptorSets[i]);
-      if (set)
+      if (set) {
          tu_descriptor_set_layout_unref(device, set->layout);
+         list_del(&set->pool_link);
+      }
       if (set && !pool->host_memory_base)
          tu_descriptor_set_destroy(device, pool, set, true);


@@ -132,8 +132,10 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
       return 1;
    case PIPE_SHADER_CAP_FP16:
    case PIPE_SHADER_CAP_FP16_DERIVATIVES:
-   case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
       return lp_has_fp16();
+   //enabling this breaks GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform
+   case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
+      return 0;
    case PIPE_SHADER_CAP_INT64_ATOMICS:
       return 0;
    case PIPE_SHADER_CAP_INT16:


@@ -263,21 +263,30 @@ crocus_init_batch(struct crocus_context *ice,
    crocus_batch_reset(batch);
 }
-static struct drm_i915_gem_exec_object2 *
-find_validation_entry(struct crocus_batch *batch, struct crocus_bo *bo)
+static int
+find_exec_index(struct crocus_batch *batch, struct crocus_bo *bo)
 {
    unsigned index = READ_ONCE(bo->index);
    if (index < batch->exec_count && batch->exec_bos[index] == bo)
-      return &batch->validation_list[index];
+      return index;
    /* May have been shared between multiple active batches */
    for (index = 0; index < batch->exec_count; index++) {
       if (batch->exec_bos[index] == bo)
-         return &batch->validation_list[index];
+         return index;
    }
-   return NULL;
+   return -1;
+}
+
+static struct drm_i915_gem_exec_object2 *
+find_validation_entry(struct crocus_batch *batch, struct crocus_bo *bo)
+{
+   int index = find_exec_index(batch, bo);
+   if (index == -1)
+      return NULL;
+   return &batch->validation_list[index];
 }
 static void
@@ -409,7 +418,7 @@ emit_reloc(struct crocus_batch *batch,
       (struct drm_i915_gem_relocation_entry) {
          .offset = offset,
          .delta = target_offset,
-         .target_handle = target->index,
+         .target_handle = find_exec_index(batch, target),
          .presumed_offset = entry->offset,
       };


@@ -181,13 +181,13 @@ iris_init_batch(struct iris_context *ice,
    struct iris_batch *batch = &ice->batches[name];
    struct iris_screen *screen = (void *) ice->ctx.screen;
-   /* Note: ctx_id, exec_flags and has_engines_context fields are initialized
-    * at an earlier phase when contexts are created.
+   /* Note: screen, ctx_id, exec_flags and has_engines_context fields are
+    * initialized at an earlier phase when contexts are created.
     *
-    * Ref: iris_init_engines_context(), iris_init_non_engine_contexts()
+    * See iris_init_batches(), which calls either iris_init_engines_context()
+    * or iris_init_non_engine_contexts().
     */
-   batch->screen = screen;
    batch->dbg = &ice->dbg;
    batch->reset = &ice->reset;
    batch->state_sizes = ice->state.sizes;
@@ -214,11 +214,12 @@ iris_init_batch(struct iris_context *ice,
    batch->cache.render = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
                                                  _mesa_key_pointer_equal);
+   batch->num_other_batches = 0;
    memset(batch->other_batches, 0, sizeof(batch->other_batches));
-   for (int i = 0, j = 0; i < IRIS_BATCH_COUNT; i++) {
-      if (i != name)
-         batch->other_batches[j++] = &ice->batches[i];
+   iris_foreach_batch(ice, other_batch) {
+      if (batch != other_batch)
+         batch->other_batches[batch->num_other_batches++] = other_batch;
    }
    if (INTEL_DEBUG(DEBUG_ANY)) {
@@ -250,8 +251,7 @@ iris_init_non_engine_contexts(struct iris_context *ice, int priority)
 {
    struct iris_screen *screen = (void *) ice->ctx.screen;
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-      struct iris_batch *batch = &ice->batches[i];
+   iris_foreach_batch(ice, batch) {
       batch->ctx_id = iris_create_hw_context(screen->bufmgr);
       batch->exec_flags = I915_EXEC_RENDER;
       batch->has_engines_context = false;
@@ -315,8 +315,8 @@ iris_init_engines_context(struct iris_context *ice, int priority)
    struct iris_screen *screen = (void *) ice->ctx.screen;
    iris_hw_context_set_priority(screen->bufmgr, engines_ctx, priority);
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-      struct iris_batch *batch = &ice->batches[i];
+   iris_foreach_batch(ice, batch) {
+      unsigned i = batch - &ice->batches[0];
       batch->ctx_id = engines_ctx;
       batch->exec_flags = i;
       batch->has_engines_context = true;
@@ -328,10 +328,14 @@ iris_init_engines_context(struct iris_context *ice, int priority)
 void
 iris_init_batches(struct iris_context *ice, int priority)
 {
+   /* We have to do this early for iris_foreach_batch() to work */
+   for (int i = 0; i < IRIS_BATCH_COUNT; i++)
+      ice->batches[i].screen = (void *) ice->ctx.screen;
+
    if (!iris_init_engines_context(ice, priority))
       iris_init_non_engine_contexts(ice, priority);
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++)
-      iris_init_batch(ice, (enum iris_batch_name) i);
+   iris_foreach_batch(ice, batch)
+      iris_init_batch(ice, batch - &ice->batches[0]);
 }
 static int
@@ -400,7 +404,7 @@ flush_for_cross_batch_dependencies(struct iris_batch *batch,
    * it had already referenced, we may need to flush other batches in order
    * to correctly synchronize them.
    */
-   for (int b = 0; b < ARRAY_SIZE(batch->other_batches); b++) {
+   for (int b = 0; b < batch->num_other_batches; b++) {
       struct iris_batch *other_batch = batch->other_batches[b];
       int other_index = find_exec_index(other_batch, bo);
@@ -598,8 +602,8 @@ iris_destroy_batches(struct iris_context *ice)
                                  ice->batches[0].ctx_id);
    }
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++)
-      iris_batch_free(&ice->batches[i]);
+   iris_foreach_batch(ice, batch)
+      iris_batch_free(batch);
 }
 /**
@@ -726,10 +730,10 @@ replace_kernel_ctx(struct iris_batch *batch)
       int new_ctx = iris_create_engines_context(ice, priority);
       if (new_ctx < 0)
          return false;
-      for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-         ice->batches[i].ctx_id = new_ctx;
+      iris_foreach_batch(ice, bat) {
+         bat->ctx_id = new_ctx;
          /* Notify the context that state must be re-initialized. */
-         iris_lost_context_state(&ice->batches[i]);
+         iris_lost_context_state(bat);
       }
       iris_destroy_kernel_context(bufmgr, old_ctx);
    } else {
@@ -810,6 +814,7 @@ update_bo_syncobjs(struct iris_batch *batch, struct iris_bo *bo, bool write)
 {
    struct iris_screen *screen = batch->screen;
    struct iris_bufmgr *bufmgr = screen->bufmgr;
+   struct iris_context *ice = batch->ice;
    /* Make sure bo->deps is big enough */
    if (screen->id >= bo->deps_size) {
@@ -838,7 +843,9 @@ update_bo_syncobjs(struct iris_batch *batch, struct iris_bo *bo, bool write)
    * have come from a different context, and apps don't like it when we don't
    * do inter-context tracking.
    */
-   for (unsigned i = 0; i < IRIS_BATCH_COUNT; i++) {
+   iris_foreach_batch(ice, batch_i) {
+      unsigned i = batch_i->name;
       /* If the bo is being written to by others, wait for them. */
       if (bo_deps->write_syncobjs[i])
         move_syncobj_to_batch(batch, &bo_deps->write_syncobjs[i],


@@ -136,6 +136,7 @@ struct iris_batch {
    /** List of other batches which we might need to flush to use a BO */
    struct iris_batch *other_batches[IRIS_BATCH_COUNT - 1];
+   unsigned num_other_batches;
    struct {
       /**
@@ -382,4 +383,9 @@ iris_batch_mark_reset_sync(struct iris_batch *batch)
 const char *
 iris_batch_name_to_string(enum iris_batch_name name);
+#define iris_foreach_batch(ice, batch) \
+   for (struct iris_batch *batch = &ice->batches[0]; \
+        batch <= &ice->batches[((struct iris_screen *)ice->ctx.screen)->devinfo.ver >= 12 ? IRIS_BATCH_BLITTER : IRIS_BATCH_COMPUTE]; \
+        ++batch)
+
 #endif


@@ -114,9 +114,9 @@ iris_border_color_pool_reserve(struct iris_context *ice, unsigned count)
    if (remaining_entries < count) {
       /* It's safe to flush because we're called outside of state upload. */
-      for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-         if (iris_batch_references(&ice->batches[i], pool->bo))
-            iris_batch_flush(&ice->batches[i]);
+      iris_foreach_batch(ice, batch) {
+         if (iris_batch_references(batch, pool->bo))
+            iris_batch_flush(batch);
       }
       iris_reset_border_color_pool(pool, pool->bo->bufmgr);


@@ -98,12 +98,12 @@ iris_get_device_reset_status(struct pipe_context *ctx)
    /* Check the reset status of each batch's hardware context, and take the
     * worst status (if one was guilty, proclaim guilt).
     */
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
+   iris_foreach_batch(ice, batch) {
       /* This will also recreate the hardware contexts as necessary, so any
        * future queries will show no resets. We only want to report once.
        */
       enum pipe_reset_status batch_reset =
-         iris_batch_check_for_reset(&ice->batches[i]);
+         iris_batch_check_for_reset(batch);
       if (batch_reset == PIPE_NO_RESET)
          continue;


@@ -263,8 +263,8 @@ iris_fence_flush(struct pipe_context *ctx,
    iris_flush_dirty_dmabufs(ice);
    if (!deferred) {
-      for (unsigned i = 0; i < IRIS_BATCH_COUNT; i++)
-         iris_batch_flush(&ice->batches[i]);
+      iris_foreach_batch(ice, batch)
+         iris_batch_flush(batch);
    }
    if (flags & PIPE_FLUSH_END_OF_FRAME) {
@@ -286,8 +286,8 @@ iris_fence_flush(struct pipe_context *ctx,
    if (deferred)
       fence->unflushed_ctx = ctx;
-   for (unsigned b = 0; b < IRIS_BATCH_COUNT; b++) {
-      struct iris_batch *batch = &ice->batches[b];
+   iris_foreach_batch(ice, batch) {
+      unsigned b = batch->name;
       if (deferred && iris_batch_bytes_used(batch) > 0) {
          struct iris_fine_fence *fine =
@@ -339,9 +339,7 @@ iris_fence_await(struct pipe_context *ctx,
       if (iris_fine_fence_signaled(fine))
          continue;
-      for (unsigned b = 0; b < IRIS_BATCH_COUNT; b++) {
-         struct iris_batch *batch = &ice->batches[b];
-
+      iris_foreach_batch(ice, batch) {
         /* We're going to make any future work in this batch wait for our
          * fence to have gone by. But any currently queued work doesn't
          * need to wait. Flush the batch now, so it can happen sooner.
@@ -402,14 +400,14 @@ iris_fence_finish(struct pipe_screen *p_screen,
    * that it matches first.
    */
   if (ctx && ctx == fence->unflushed_ctx) {
-      for (unsigned i = 0; i < IRIS_BATCH_COUNT; i++) {
-         struct iris_fine_fence *fine = fence->fine[i];
+      iris_foreach_batch(ice, batch) {
+         struct iris_fine_fence *fine = fence->fine[batch->name];
          if (iris_fine_fence_signaled(fine))
            continue;
-         if (fine->syncobj == iris_batch_get_signal_syncobj(&ice->batches[i]))
-            iris_batch_flush(&ice->batches[i]);
+         if (fine->syncobj == iris_batch_get_signal_syncobj(batch))
+            iris_batch_flush(batch);
      }
      /* The fence is no longer deferred. */
@@ -595,7 +593,7 @@ iris_fence_signal(struct pipe_context *ctx,
    if (ctx == fence->unflushed_ctx)
       return;
-   for (unsigned b = 0; b < IRIS_BATCH_COUNT; b++) {
+   iris_foreach_batch(ice, batch) {
       for (unsigned i = 0; i < ARRAY_SIZE(fence->fine); i++) {
          struct iris_fine_fence *fine = fence->fine[i];
@@ -603,9 +601,8 @@ iris_fence_signal(struct pipe_context *ctx,
          if (iris_fine_fence_signaled(fine))
            continue;
-         ice->batches[b].contains_fence_signal = true;
-         iris_batch_add_syncobj(&ice->batches[b], fine->syncobj,
-                                I915_EXEC_FENCE_SIGNAL);
+         batch->contains_fence_signal = true;
+         iris_batch_add_syncobj(batch, fine->syncobj, I915_EXEC_FENCE_SIGNAL);
      }
   }
}


@@ -357,11 +357,10 @@ iris_memory_barrier(struct pipe_context *ctx, unsigned flags)
                 PIPE_CONTROL_TILE_CACHE_FLUSH;
    }
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-      if (ice->batches[i].contains_draw) {
-         iris_batch_maybe_flush(&ice->batches[i], 24);
-         iris_emit_pipe_control_flush(&ice->batches[i], "API: memory barrier",
-                                      bits);
+   iris_foreach_batch(ice, batch) {
+      if (batch->contains_draw) {
+         iris_batch_maybe_flush(batch, 24);
+         iris_emit_pipe_control_flush(batch, "API: memory barrier", bits);
       }
    }
 }


@@ -1404,9 +1404,9 @@ iris_flush_resource(struct pipe_context *ctx, struct pipe_resource *resource)
* sure to get rid of any compression that a consumer wouldn't know how
* to handle.
*/
for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
if (iris_batch_references(&ice->batches[i], res->bo))
iris_batch_flush(&ice->batches[i]);
iris_foreach_batch(ice, batch) {
if (iris_batch_references(batch, res->bo))
iris_batch_flush(batch);
}
iris_resource_disable_aux(res);
@@ -1741,8 +1741,8 @@ resource_is_busy(struct iris_context *ice,
{
bool busy = iris_bo_busy(res->bo);
for (int i = 0; i < IRIS_BATCH_COUNT; i++)
busy |= iris_batch_references(&ice->batches[i], res->bo);
iris_foreach_batch(ice, batch)
busy |= iris_batch_references(batch, res->bo);
return busy;
}
@@ -2339,9 +2339,9 @@ iris_transfer_map(struct pipe_context *ctx,
}
if (!(usage & PIPE_MAP_UNSYNCHRONIZED)) {
for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
if (iris_batch_references(&ice->batches[i], res->bo))
iris_batch_flush(&ice->batches[i]);
iris_foreach_batch(ice, batch) {
if (iris_batch_references(batch, res->bo))
iris_batch_flush(batch);
}
}
@@ -2384,8 +2384,7 @@ iris_transfer_flush_region(struct pipe_context *ctx,
}
if (history_flush & ~PIPE_CONTROL_CS_STALL) {
for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
struct iris_batch *batch = &ice->batches[i];
iris_foreach_batch(ice, batch) {
if (batch->contains_draw || batch->cache.render->entries) {
iris_batch_maybe_flush(batch, 24);
iris_emit_pipe_control_flush(batch,
@@ -2474,9 +2473,9 @@ iris_texture_subdata(struct pipe_context *ctx,
iris_resource_access_raw(ice, res, level, box->z, box->depth, true);
for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
if (iris_batch_references(&ice->batches[i], res->bo))
iris_batch_flush(&ice->batches[i]);
iris_foreach_batch(ice, batch) {
if (iris_batch_references(batch, res->bo))
iris_batch_flush(batch);
}
uint8_t *dst = iris_bo_map(&ice->dbg, res->bo, MAP_WRITE | MAP_RAW);


@@ -1072,8 +1072,8 @@ void si_init_screen_get_functions(struct si_screen *sscreen)
.has_udot_4x8 = sscreen->info.has_accelerated_dot_product,
.has_dot_2x16 = sscreen->info.has_accelerated_dot_product,
.optimize_sample_mask_in = true,
.max_unroll_iterations = 128,
.max_unroll_iterations_aggressive = 128,
.max_unroll_iterations = LLVM_VERSION_MAJOR >= 13 ? 128 : 32,
.max_unroll_iterations_aggressive = LLVM_VERSION_MAJOR >= 13 ? 128 : 32,
.use_interpolated_input_intrinsics = true,
.lower_uniforms_to_ubo = true,
.support_16bit_alu = sscreen->options.fp16,


@@ -374,7 +374,6 @@ struct svga_state
struct pipe_resource *indirect;
} grid_info;
unsigned shared_mem_size;
};
struct svga_prescale {


@@ -61,7 +61,7 @@ svga_create_compute_state(struct pipe_context *pipe,
cs->base.id = svga->debug.shader_id++;
svga->curr.shared_mem_size = templ->req_local_mem;
cs->shared_mem_size = templ->req_local_mem;
SVGA_STATS_TIME_POP(svga_sws(svga));
return cs;


@@ -380,6 +380,7 @@ struct svga_tes_shader
struct svga_compute_shader
{
struct svga_shader base;
unsigned shared_mem_size;
};


@@ -80,7 +80,7 @@ make_cs_key(struct svga_context *svga,
key->cs.grid_size[0] = svga->curr.grid_info.size[0];
key->cs.grid_size[1] = svga->curr.grid_info.size[1];
key->cs.grid_size[2] = svga->curr.grid_info.size[2];
key->cs.mem_size = svga->curr.shared_mem_size;
key->cs.mem_size = cs->shared_mem_size;
if (svga->curr.grid_info.indirect && cs->base.info.uses_grid_size) {
struct pipe_transfer *transfer = NULL;


@@ -363,12 +363,18 @@ bool
zink_blit_region_fills(struct u_rect region, unsigned width, unsigned height)
{
struct u_rect intersect = {0, width, 0, height};
struct u_rect r = {
MIN2(region.x0, region.x1),
MAX2(region.x0, region.x1),
MIN2(region.y0, region.y1),
MAX2(region.y0, region.y1),
};
if (!u_rect_test_intersection(&region, &intersect))
if (!u_rect_test_intersection(&r, &intersect))
/* is this even a thing? */
return false;
u_rect_find_intersection(&region, &intersect);
u_rect_find_intersection(&r, &intersect);
if (intersect.x0 != 0 || intersect.y0 != 0 ||
intersect.x1 != width || intersect.y1 != height)
return false;
@@ -379,11 +385,23 @@ zink_blit_region_fills(struct u_rect region, unsigned width, unsigned height)
bool
zink_blit_region_covers(struct u_rect region, struct u_rect covers)
{
struct u_rect r = {
MIN2(region.x0, region.x1),
MAX2(region.x0, region.x1),
MIN2(region.y0, region.y1),
MAX2(region.y0, region.y1),
};
struct u_rect c = {
MIN2(covers.x0, covers.x1),
MAX2(covers.x0, covers.x1),
MIN2(covers.y0, covers.y1),
MAX2(covers.y0, covers.y1),
};
struct u_rect intersect;
if (!u_rect_test_intersection(&region, &covers))
if (!u_rect_test_intersection(&r, &c))
return false;
u_rect_union(&intersect, &region, &covers);
return intersect.x0 == covers.x0 && intersect.y0 == covers.y0 &&
intersect.x1 == covers.x1 && intersect.y1 == covers.y1;
u_rect_union(&intersect, &r, &c);
return intersect.x0 == c.x0 && intersect.y0 == c.y0 &&
intersect.x1 == c.x1 && intersect.y1 == c.y1;
}


@@ -487,6 +487,9 @@ zink_draw(struct pipe_context *pctx,
struct pipe_vertex_state *vstate,
uint32_t partial_velem_mask)
{
if (!dindirect && (!draws[0].count || !dinfo->instance_count))
return;
struct zink_context *ctx = zink_context(pctx);
struct zink_screen *screen = zink_screen(pctx->screen);
struct zink_rasterizer_state *rast_state = ctx->rast_state;


@@ -172,6 +172,9 @@ create_bci(struct zink_screen *screen, const struct pipe_resource *templ, unsign
if (bind & PIPE_BIND_SHADER_IMAGE)
bci.usage |= VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT;
if (bind & PIPE_BIND_QUERY_BUFFER)
bci.usage |= VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT;
if (templ->flags & PIPE_RESOURCE_FLAG_SPARSE)
bci.flags |= VK_BUFFER_CREATE_SPARSE_BINDING_BIT;
return bci;
@@ -1036,7 +1039,7 @@ zink_resource_get_param(struct pipe_screen *pscreen, struct pipe_context *pctx,
switch (param) {
case PIPE_RESOURCE_PARAM_NPLANES:
if (screen->info.have_EXT_image_drm_format_modifier)
*value = pscreen->get_dmabuf_modifier_planes(pscreen, res->obj->modifier, pres->format);
*value = pscreen->get_dmabuf_modifier_planes(pscreen, obj->modifier, pres->format);
else
*value = 1;
break;
@@ -1066,7 +1069,7 @@ zink_resource_get_param(struct pipe_screen *pscreen, struct pipe_context *pctx,
}
case PIPE_RESOURCE_PARAM_MODIFIER: {
*value = res->obj->modifier;
*value = obj->modifier;
break;
}


@@ -468,7 +468,7 @@ zink_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
return 1;
case PIPE_CAP_TGSI_BALLOT:
return screen->vk_version >= VK_MAKE_VERSION(1,2,0) && screen->info.props11.subgroupSize <= 64;
return screen->info.have_vulkan12 && screen->info.have_EXT_shader_subgroup_ballot && screen->info.props11.subgroupSize <= 64;
case PIPE_CAP_SAMPLE_SHADING:
return screen->info.feats.features.sampleRateShading;
@@ -935,8 +935,10 @@ zink_get_shader_param(struct pipe_screen *pscreen,
return 0; /* not implemented */
case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
return screen->info.feats11.uniformAndStorageBuffer16BitAccess ||
(screen->info.have_KHR_16bit_storage && screen->info.storage_16bit_feats.uniformAndStorageBuffer16BitAccess);
//enabling this breaks GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform
//return screen->info.feats11.uniformAndStorageBuffer16BitAccess ||
//(screen->info.have_KHR_16bit_storage && screen->info.storage_16bit_feats.uniformAndStorageBuffer16BitAccess);
return 0;
case PIPE_SHADER_CAP_FP16_DERIVATIVES:
return 0; //spirv requires 32bit derivative srcs and dests
case PIPE_SHADER_CAP_FP16:


@@ -3945,7 +3945,10 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GFX7_BTI_SLM);
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
/* No point in masking with sample mask, here we're handling compute
* intrinsics.
*/
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
fs_reg data = get_nir_src(instr->src[0]);
data.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);


@@ -137,6 +137,8 @@ lower_shader_calls_instr(struct nir_builder *b, nir_instr *instr, void *data)
switch (call->intrinsic) {
case nir_intrinsic_rt_trace_ray: {
b->cursor = nir_instr_remove(instr);
store_resume_addr(b, call);
nir_ssa_def *as_addr = call->src[0].ssa;
@@ -217,6 +219,8 @@ lower_shader_calls_instr(struct nir_builder *b, nir_instr *instr, void *data)
}
case nir_intrinsic_rt_execute_callable: {
b->cursor = nir_instr_remove(instr);
store_resume_addr(b, call);
nir_ssa_def *sbt_offset32 =


@@ -4371,6 +4371,9 @@ void genX(CmdDrawIndirectByteCountEXT)(
genX(cmd_buffer_flush_state)(cmd_buffer);
if (cmd_buffer->state.conditional_render_enabled)
genX(cmd_emit_conditional_render_predicate)(cmd_buffer);
if (vs_prog_data->uses_firstvertex ||
vs_prog_data->uses_baseinstance)
emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
@@ -4405,6 +4408,7 @@ void genX(CmdDrawIndirectByteCountEXT)(
anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
prim.IndirectParameterEnable = true;
prim.PredicateEnable = cmd_buffer->state.conditional_render_enabled;
prim.VertexAccessType = SEQUENTIAL;
prim.PrimitiveTopologyType = cmd_buffer->state.gfx.primitive_topology;
}


@@ -1506,8 +1506,7 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes, bool align8)
/* If this node needs to start on an 8-byte boundary, pad the last node. */
if (sizeof(void *) == 8 && align8 &&
ctx->ListState.CurrentPos % 2 == 1 &&
ctx->ListState.CurrentPos + 1 + numNodes + contNodes <= BLOCK_SIZE) {
ctx->ListState.CurrentPos % 2 == 1) {
Node *last = ctx->ListState.CurrentBlock + ctx->ListState.CurrentPos -
ctx->ListState.LastInstSize;
last->InstSize++;


@@ -3094,8 +3094,9 @@ emit_store_output_via_intrinsic(struct ntd_context *ctx, nir_intrinsic_instr *in
* generation, so muck with them here too.
*/
nir_io_semantics semantics = nir_intrinsic_io_semantics(intr);
bool is_tess_level = semantics.location == VARYING_SLOT_TESS_LEVEL_INNER ||
semantics.location == VARYING_SLOT_TESS_LEVEL_OUTER;
bool is_tess_level = is_patch_constant &&
(semantics.location == VARYING_SLOT_TESS_LEVEL_INNER ||
semantics.location == VARYING_SLOT_TESS_LEVEL_OUTER);
const struct dxil_value *row = NULL;
const struct dxil_value *col = NULL;
@@ -3198,8 +3199,9 @@ emit_load_input_via_intrinsic(struct ntd_context *ctx, nir_intrinsic_instr *intr
* generation, so muck with them here too.
*/
nir_io_semantics semantics = nir_intrinsic_io_semantics(intr);
bool is_tess_level = semantics.location == VARYING_SLOT_TESS_LEVEL_INNER ||
semantics.location == VARYING_SLOT_TESS_LEVEL_OUTER;
bool is_tess_level = is_patch_constant &&
(semantics.location == VARYING_SLOT_TESS_LEVEL_INNER ||
semantics.location == VARYING_SLOT_TESS_LEVEL_OUTER);
const struct dxil_value *row = NULL;
const struct dxil_value *comp = NULL;
@@ -5011,7 +5013,7 @@ sort_uniforms_by_binding_and_remove_structs(nir_shader *s)
}
static void
prepare_phi_values(struct ntd_context *ctx)
prepare_phi_values(struct ntd_context *ctx, nir_function_impl *impl)
{
/* PHI nodes are difficult to get right when tracking the types:
* Since the incoming sources are linked to blocks, we can't bitcast
@@ -5020,19 +5022,15 @@ prepare_phi_values(struct ntd_context *ctx)
* value has a different type then the one expected by the phi node.
* We choose int as default, because it supports more bit sizes.
*/
nir_foreach_function(function, ctx->shader) {
if (function->impl) {
nir_foreach_block(block, function->impl) {
nir_foreach_instr(instr, block) {
if (instr->type == nir_instr_type_phi) {
nir_phi_instr *ir = nir_instr_as_phi(instr);
unsigned bitsize = nir_dest_bit_size(ir->dest);
const struct dxil_value *dummy = dxil_module_get_int_const(&ctx->mod, 0, bitsize);
nir_foreach_phi_src(src, ir) {
for(unsigned int i = 0; i < ir->dest.ssa.num_components; ++i)
store_ssa_def(ctx, src->src.ssa, i, dummy);
}
}
nir_foreach_block(block, impl) {
nir_foreach_instr(instr, block) {
if (instr->type == nir_instr_type_phi) {
nir_phi_instr *ir = nir_instr_as_phi(instr);
unsigned bitsize = nir_dest_bit_size(ir->dest);
const struct dxil_value *dummy = dxil_module_get_int_const(&ctx->mod, 0, bitsize);
nir_foreach_phi_src(src, ir) {
for(unsigned int i = 0; i < ir->dest.ssa.num_components; ++i)
store_ssa_def(ctx, src->src.ssa, i, dummy);
}
}
}
@@ -5163,7 +5161,7 @@ emit_function(struct ntd_context *ctx, nir_function *func)
if (!ctx->phis)
return false;
prepare_phi_values(ctx);
prepare_phi_values(ctx, impl);
if (!emit_scratch(ctx))
return false;