Compare commits
41 Commits
mesa-21.0. ... mesa-21.0.

Commit SHA1s (the author and date columns were empty in the export):

- 7419e553db
- ebe8cfc3ec
- 759ce9f053
- a3a2783237
- b96b1db389
- 882d47fae4
- 1665f478ac
- 6a0f0a34fe
- 11585bb003
- 816fd2cf5f
- a1328ea781
- c5c7d6a05a
- 99a47874de
- ed60dec381
- d30cea2b9b
- 5bcbe14854
- da38b604e3
- 540172fa43
- ca86b94e55
- fe9e25b29a
- b6123cd4d5
- f5444d504a
- 5de93ffed8
- 9a439ebcac
- f1ec9335a8
- 4e4962b464
- bca2aa6e48
- d4e0e7c0f0
- ffd661d50b
- a6a79fb31e
- 090239c244
- 2ac46f95bd
- f0b620307e
- 8d32c55d93
- 2733a9c712
- 3260a85b5c
- aa8bff051e
- 8d9ec9cd11
- e37442f1b8
- 770b0185ab
- 63267e018d
.pick_status.json (5252 lines changed) — file diff suppressed because it is too large.
@@ -8,10 +8,9 @@ VK-GL-CTS, on the shared GitLab runners provided by `freedesktop
 
 Software architecture
 ---------------------
 
-The Docker containers are rebuilt from the debian-install.sh script
-when DEBIAN_TAG is changed in .gitlab-ci.yml, and
-debian-test-install.sh when DEBIAN_ARM64_TAG is changed in
-.gitlab-ci.yml. The resulting images are around 500MB, and are
+The Docker containers are rebuilt using the shell scripts under
+.gitlab-ci/container/ when the FDO_DISTRIBUTION_TAG changes in
+.gitlab-ci.yml. The resulting images are around 1 GB, and are
 expected to change approximately weekly (though an individual
 developer working on them may produce many more images while trying to
 come up with a working MR!).
@@ -19,7 +19,7 @@ SHA256 checksum
 
 ::
 
-    TBD.
+    379fc984459394f2ab2d84049efdc3a659869dc1328ce72ef0598506611712bb  mesa-21.0.1.tar.xz
 
 
 New features
docs/relnotes/21.0.2.rst (new file, 135 lines)

@@ -0,0 +1,135 @@
Mesa 21.0.2 Release Notes / 2021-04-07
======================================

Mesa 21.0.2 is a bug fix release which fixes bugs found since the 21.0.1 release.

Mesa 21.0.2 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.

Mesa 21.0.2 implements the Vulkan 1.2 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.

SHA256 checksum
---------------

::

    TBD.


New features
------------

- None


Bug fixes
---------

- warning: xnack 'Off' was requested for a processor that does not support it! [AMD VEGAM with LLVM 12.0.0]
- Clover doesn't work for kmsro drivers
- util cpu detection breaks on 128-core AMD machines
- ACO error with GCN 1 GPU
- kmsro advertises EGL_MESA_device_software


Changes
-------

Adrian Ratiu (1):

- docs: docker: minor stale documentation fix

Bas Nieuwenhuizen (1):

- radv: Flush caches for shader read operations.

Boyuan Zhang (1):

- frontend/va/image: add pipe flush for vlVaPutImage

Charmaine Lee (1):

- gallivm: increase size of texture target enum bitfield

Dave Airlie (3):

- lavapipe: fix templated descriptor updates
- util: rework AMD cpu L3 cache affinity code.
- drisw: move zink down the list below the sw drivers.

Dylan Baker (9):

- docs: Add 21.0.1 hashes
- .pick_status.json: Update to 9be24c89c8c298069eaa3ff600ba556b9a4557e9
- .pick_status.json: Update to 8e43abcd2c29366d77fff804a7845b61fb97ca5c
- .pick_status.json: Mark 75951a44ee9f25d29865f3dd60cdf3b8ce3f7f0c as backported
- .pick_status.json: Update to a7c0cf500b335069bfe480c947b26052335f897e
- .pick_status.json: Update to ee14bec09a92e4363ef916d00d4d9baecfb09fa9
- .pick_status.json: Update to 3c64c090e0d2250d7ee880550f8cbeac0052c8d9
- .pick_status.json: Update to fb5615af40a5878b127827f80f4185df63933f34
- .pick_status.json: Update to 1e0a69afa72c61e5f5841db3e5e7f6bb846a0fab

Erik Faye-Lund (1):

- compiler/glsl: avoid null-pointer deref

Gert Wollny (1):

- r600: don't set an index_bias for indirect draw calls

Icecream95 (2):

- panfrost: Disable early-z when alpha test is used
- pipe-loader,gallium/drm: Fix the kmsro pipe_loader target

Lionel Landwerlin (1):

- intel/fs/copy_prop: check stride constraints with actual final type

Marek Olšák (2):

- ac/llvm: don't set unsupported xnack options to fix LLVM crashes on gfx6-8
- radeonsi: disable sparse buffers on gfx7-8

Michel Dänzer (2):

- intel/tools: Use subprocess.Popen to read output directly from a pipe
- Revert "glsl/test: Don't run whitespace tests in parallel"

Mike Blumenkrantz (5):

- util/set: stop leaking u32 key sets which pass a mem ctx
- lavapipe: use the passed offset for CmdCopyQueryPoolResults
- util/bitscan: add u_foreach_bit macros
- lavapipe: fix CmdCopyQueryPoolResults for partial pipeline statistics queries
- lavapipe: fix array texture region copies

Pierre-Eric Pelloux-Prayer (3):

- mesa/st: fix lower_tex_src_plane in multiple samplers scenario
- nir/lower_tex: ignore texture_index if tex_instr has deref src
- mesa/st: fix st_nir_lower_tex_src_plane arguments

Rhys Perry (1):

- aco: implement image_deref_samples

Simon Ser (3):

- egl: fix software flag in _eglAddDevice call on DRM
- egl: only take render nodes into account when listing DRM devices
- Revert "egl: Don't add hardware device if there is no render node v2."

Tapani Pälli (1):

- iris: clamp PointWidth in 3DSTATE_SF like i965 does

Tony Wasserka (1):

- aco/isel: Don't emit unsupported i16<->f16 conversion opcodes on GFX6/7
@@ -2444,11 +2444,24 @@ void visit_alu_instr(isel_context *ctx, nir_alu_instr *instr)
    case nir_op_i2f16: {
       assert(dst.regClass() == v2b);
       Temp src = get_alu_src(ctx, instr->src[0]);
-      if (instr->src[0].src.ssa->bit_size == 8)
-         src = convert_int(ctx, bld, src, 8, 16, true);
-      else if (instr->src[0].src.ssa->bit_size == 64)
+      const unsigned input_size = instr->src[0].src.ssa->bit_size;
+      if (input_size <= 16) {
+         /* Expand integer to the size expected by the uint→float converter used below */
+         unsigned target_size = (ctx->program->chip_class >= GFX8 ? 16 : 32);
+         if (input_size != target_size) {
+            src = convert_int(ctx, bld, src, input_size, target_size, true);
+         }
+      } else if (input_size == 64) {
         src = convert_int(ctx, bld, src, 64, 32, false);
-      bld.vop1(aco_opcode::v_cvt_f16_i16, Definition(dst), src);
+      }
+
+      if (ctx->program->chip_class >= GFX8) {
+         bld.vop1(aco_opcode::v_cvt_f16_i16, Definition(dst), src);
+      } else {
+         /* GFX7 and earlier do not support direct f16⟷i16 conversions */
+         src = bld.vop1(aco_opcode::v_cvt_f32_i32, bld.def(v1), src);
+         bld.vop1(aco_opcode::v_cvt_f16_f32, Definition(dst), src);
+      }
       break;
    }
    case nir_op_i2f32: {
@@ -2483,11 +2496,24 @@ void visit_alu_instr(isel_context *ctx, nir_alu_instr *instr)
    case nir_op_u2f16: {
       assert(dst.regClass() == v2b);
       Temp src = get_alu_src(ctx, instr->src[0]);
-      if (instr->src[0].src.ssa->bit_size == 8)
-         src = convert_int(ctx, bld, src, 8, 16, false);
-      else if (instr->src[0].src.ssa->bit_size == 64)
+      const unsigned input_size = instr->src[0].src.ssa->bit_size;
+      if (input_size <= 16) {
+         /* Expand integer to the size expected by the uint→float converter used below */
+         unsigned target_size = (ctx->program->chip_class >= GFX8 ? 16 : 32);
+         if (input_size != target_size) {
+            src = convert_int(ctx, bld, src, input_size, target_size, false);
+         }
+      } else if (input_size == 64) {
         src = convert_int(ctx, bld, src, 64, 32, false);
-      bld.vop1(aco_opcode::v_cvt_f16_u16, Definition(dst), src);
+      }
+
+      if (ctx->program->chip_class >= GFX8) {
+         bld.vop1(aco_opcode::v_cvt_f16_u16, Definition(dst), src);
+      } else {
+         /* GFX7 and earlier do not support direct f16⟷u16 conversions */
+         src = bld.vop1(aco_opcode::v_cvt_f32_u32, bld.def(v1), src);
+         bld.vop1(aco_opcode::v_cvt_f16_f32, Definition(dst), src);
+      }
       break;
    }
    case nir_op_u2f32: {
@@ -2524,22 +2550,46 @@ void visit_alu_instr(isel_context *ctx, nir_alu_instr *instr)
    }
    case nir_op_f2i8:
    case nir_op_f2i16: {
-      if (instr->src[0].src.ssa->bit_size == 16)
-         emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_i16_f16, dst);
-      else if (instr->src[0].src.ssa->bit_size == 32)
+      if (instr->src[0].src.ssa->bit_size == 16) {
+         if (ctx->program->chip_class >= GFX8) {
+            emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_i16_f16, dst);
+         } else {
+            /* GFX7 and earlier do not support direct f16⟷i16 conversions */
+            Temp tmp = bld.tmp(v1);
+            emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_f32_f16, tmp);
+            tmp = bld.vop1(aco_opcode::v_cvt_i32_f32, bld.def(v1), tmp);
+            tmp = convert_int(ctx, bld, tmp, 32, 16, false, (dst.type() == RegType::sgpr) ? Temp() : dst);
+            if (dst.type() == RegType::sgpr) {
+               bld.pseudo(aco_opcode::p_as_uniform, Definition(dst), tmp);
+            }
+         }
+      } else if (instr->src[0].src.ssa->bit_size == 32) {
         emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_i32_f32, dst);
-      else
+      } else {
         emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_i32_f64, dst);
+      }
       break;
    }
    case nir_op_f2u8:
    case nir_op_f2u16: {
-      if (instr->src[0].src.ssa->bit_size == 16)
-         emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_u16_f16, dst);
-      else if (instr->src[0].src.ssa->bit_size == 32)
+      if (instr->src[0].src.ssa->bit_size == 16) {
+         if (ctx->program->chip_class >= GFX8) {
+            emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_u16_f16, dst);
+         } else {
+            /* GFX7 and earlier do not support direct f16⟷u16 conversions */
+            Temp tmp = bld.tmp(v1);
+            emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_f32_f16, tmp);
+            tmp = bld.vop1(aco_opcode::v_cvt_u32_f32, bld.def(v1), tmp);
+            tmp = convert_int(ctx, bld, tmp, 32, 16, false, (dst.type() == RegType::sgpr) ? Temp() : dst);
+            if (dst.type() == RegType::sgpr) {
+               bld.pseudo(aco_opcode::p_as_uniform, Definition(dst), tmp);
+            }
+         }
+      } else if (instr->src[0].src.ssa->bit_size == 32) {
         emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_u32_f32, dst);
-      else
+      } else {
         emit_vop1_instruction(ctx, instr, aco_opcode::v_cvt_u32_f64, dst);
+      }
       break;
    }
    case nir_op_f2i32: {
@@ -6456,6 +6506,37 @@ void visit_image_size(isel_context *ctx, nir_intrinsic_instr *instr)
    emit_split_vector(ctx, dst, instr->dest.ssa.num_components);
 }
 
+void get_image_samples(isel_context *ctx, Definition dst, Temp resource)
+{
+   Builder bld(ctx->program, ctx->block);
+
+   Temp dword3 = emit_extract_vector(ctx, resource, 3, s1);
+   Temp samples_log2 = bld.sop2(aco_opcode::s_bfe_u32, bld.def(s1), bld.def(s1, scc), dword3, Operand(16u | 4u<<16));
+   Temp samples = bld.sop2(aco_opcode::s_lshl_b32, bld.def(s1), bld.def(s1, scc), Operand(1u), samples_log2);
+   Temp type = bld.sop2(aco_opcode::s_bfe_u32, bld.def(s1), bld.def(s1, scc), dword3, Operand(28u | 4u<<16 /* offset=28, width=4 */));
+
+   Operand default_sample = Operand(1u);
+   if (ctx->options->robust_buffer_access) {
+      /* Extract the second dword of the descriptor, if it's
+       * all zero, then it's a null descriptor.
+       */
+      Temp dword1 = emit_extract_vector(ctx, resource, 1, s1);
+      Temp is_non_null_descriptor = bld.sopc(aco_opcode::s_cmp_gt_u32, bld.def(s1, scc), dword1, Operand(0u));
+      default_sample = Operand(is_non_null_descriptor);
+   }
+
+   Temp is_msaa = bld.sopc(aco_opcode::s_cmp_ge_u32, bld.def(s1, scc), type, Operand(14u));
+   bld.sop2(aco_opcode::s_cselect_b32, dst, samples, default_sample, bld.scc(is_msaa));
+}
+
+void visit_image_samples(isel_context *ctx, nir_intrinsic_instr *instr)
+{
+   Builder bld(ctx->program, ctx->block);
+   Temp dst = get_ssa_temp(ctx, &instr->dest.ssa);
+   Temp resource = get_sampler_desc(ctx, nir_instr_as_deref(instr->src[0].ssa->parent_instr), ACO_DESC_IMAGE, NULL, true, false);
+   get_image_samples(ctx, Definition(dst), resource);
+}
+
 void visit_load_ssbo(isel_context *ctx, nir_intrinsic_instr *instr)
 {
    Builder bld(ctx->program, ctx->block);
@@ -8060,6 +8141,9 @@ void visit_intrinsic(isel_context *ctx, nir_intrinsic_instr *instr)
    case nir_intrinsic_image_deref_size:
       visit_image_size(ctx, instr);
       break;
+   case nir_intrinsic_image_deref_samples:
+      visit_image_samples(ctx, instr);
+      break;
    case nir_intrinsic_load_ssbo:
       visit_load_ssbo(ctx, instr);
       break;
@@ -9006,25 +9090,7 @@ void visit_tex(isel_context *ctx, nir_tex_instr *instr)
       return get_buffer_size(ctx, resource, get_ssa_temp(ctx, &instr->dest.ssa), true);
 
    if (instr->op == nir_texop_texture_samples) {
-      Temp dword3 = emit_extract_vector(ctx, resource, 3, s1);
-
-      Temp samples_log2 = bld.sop2(aco_opcode::s_bfe_u32, bld.def(s1), bld.def(s1, scc), dword3, Operand(16u | 4u<<16));
-      Temp samples = bld.sop2(aco_opcode::s_lshl_b32, bld.def(s1), bld.def(s1, scc), Operand(1u), samples_log2);
-      Temp type = bld.sop2(aco_opcode::s_bfe_u32, bld.def(s1), bld.def(s1, scc), dword3, Operand(28u | 4u<<16 /* offset=28, width=4 */));
-
-      Operand default_sample = Operand(1u);
-      if (ctx->options->robust_buffer_access) {
-         /* Extract the second dword of the descriptor, if it's
-          * all zero, then it's a null descriptor.
-          */
-         Temp dword1 = emit_extract_vector(ctx, resource, 1, s1);
-         Temp is_non_null_descriptor = bld.sopc(aco_opcode::s_cmp_gt_u32, bld.def(s1, scc), dword1, Operand(0u));
-         default_sample = Operand(is_non_null_descriptor);
-      }
-
-      Temp is_msaa = bld.sopc(aco_opcode::s_cmp_ge_u32, bld.def(s1, scc), type, Operand(14u));
-      bld.sop2(aco_opcode::s_cselect_b32, Definition(get_ssa_temp(ctx, &instr->dest.ssa)),
-               samples, default_sample, bld.scc(is_msaa));
+      get_image_samples(ctx, Definition(get_ssa_temp(ctx, &instr->dest.ssa)), resource);
       return;
    }
 
@@ -799,6 +799,7 @@ void init_context(isel_context *ctx, nir_shader *shader)
    case nir_intrinsic_read_invocation:
    case nir_intrinsic_first_invocation:
    case nir_intrinsic_ballot:
+   case nir_intrinsic_image_deref_samples:
       type = RegType::sgpr;
       break;
    case nir_intrinsic_load_sample_id:
@@ -194,13 +194,11 @@ static LLVMTargetMachineRef ac_create_target_machine(enum radeon_family family,
    const char *triple = (tm_options & AC_TM_SUPPORTS_SPILL) ? "amdgcn-mesa-mesa3d" : "amdgcn--";
    LLVMTargetRef target = ac_get_llvm_target(triple);
 
-   snprintf(features, sizeof(features), "+DumpCode%s%s%s%s%s",
+   snprintf(features, sizeof(features), "+DumpCode%s%s%s",
             LLVM_VERSION_MAJOR >= 11 ? "" : ",-fp32-denormals,+fp64-denormals",
             family >= CHIP_NAVI10 && !(tm_options & AC_TM_WAVE32)
                ? ",+wavefrontsize64,-wavefrontsize32"
                : "",
-            family <= CHIP_NAVI14 && tm_options & AC_TM_FORCE_ENABLE_XNACK ? ",+xnack" : "",
-            family <= CHIP_NAVI14 && tm_options & AC_TM_FORCE_DISABLE_XNACK ? ",-xnack" : "",
             tm_options & AC_TM_PROMOTE_ALLOCA_TO_SCRATCH ? ",-promote-alloca" : "");
 
    LLVMTargetMachineRef tm =
@@ -62,8 +62,6 @@ enum ac_func_attr
 enum ac_target_machine_options
 {
    AC_TM_SUPPORTS_SPILL = (1 << 0),
-   AC_TM_FORCE_ENABLE_XNACK = (1 << 1),
-   AC_TM_FORCE_DISABLE_XNACK = (1 << 2),
    AC_TM_PROMOTE_ALLOCA_TO_SCRATCH = (1 << 3),
    AC_TM_CHECK_IR = (1 << 4),
    AC_TM_ENABLE_GLOBAL_ISEL = (1 << 5),
@@ -114,7 +114,9 @@ radv_expand_fmask_image_inplace(struct radv_cmd_buffer *cmd_buffer,
    radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
                         VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
 
-   cmd_buffer->state.flush_bits |= radv_dst_access_flush(cmd_buffer, VK_ACCESS_SHADER_WRITE_BIT, image);
+   cmd_buffer->state.flush_bits |=
+      radv_dst_access_flush(cmd_buffer, VK_ACCESS_SHADER_READ_BIT |
+                                        VK_ACCESS_SHADER_WRITE_BIT, image);
 
    for (unsigned l = 0; l < radv_get_layerCount(image, subresourceRange); l++) {
       struct radv_image_view iview;
@@ -120,9 +120,10 @@ remove_struct_derefs_prep(nir_deref_instr **p, char **name,
 
 static void
 record_images_used(struct shader_info *info,
-                   nir_deref_instr *deref)
+                   nir_intrinsic_instr *instr)
 {
-   nir_variable *var = nir_deref_instr_get_variable(deref);
+   nir_variable *var =
+      nir_deref_instr_get_variable(nir_src_as_deref(instr->src[0]));
 
    /* Structs have been lowered already, so get_aoa_size is sufficient. */
    const unsigned size =
@@ -302,7 +303,7 @@ lower_intrinsic(nir_intrinsic_instr *instr,
    nir_deref_instr *deref =
       lower_deref(b, state, nir_src_as_deref(instr->src[0]));
 
-   record_images_used(&state->shader->info, deref);
+   record_images_used(&state->shader->info, instr);
 
    /* don't lower bindless: */
    if (!deref)
@@ -86,13 +86,6 @@ if with_any_opengl and with_tests and host_machine.system() != 'windows'
     modes += ['valgrind']
   endif
 
-  # For some unfathomable reason, three out of these four tests often time out
-  # when running within CI. On the assumption that there is some
-  # parallelisation badness happening rather than the non-UNIX tests entering
-  # infinite loops, try just marking them as serial-only.
-  #
-  # This should have a negligible impact on runtime since they are quick to
-  # execute.
   foreach m : modes
     test(
       'glcpp test (@0@)'.format(m),
@@ -104,7 +97,6 @@ if with_any_opengl and with_tests and host_machine.system() != 'windows'
       ],
       suite : ['compiler', 'glcpp'],
       timeout: 60,
-      is_parallel: false,
     )
   endforeach
 endif
@@ -287,16 +287,17 @@ static void
 convert_yuv_to_rgb(nir_builder *b, nir_tex_instr *tex,
                    nir_ssa_def *y, nir_ssa_def *u, nir_ssa_def *v,
                    nir_ssa_def *a,
-                   const nir_lower_tex_options *options)
+                   const nir_lower_tex_options *options,
+                   unsigned texture_index)
 {
 
    const float *offset_vals;
    const nir_const_value_3_4 *m;
    assert((options->bt709_external & options->bt2020_external) == 0);
-   if (options->bt709_external & (1 << tex->texture_index)) {
+   if (options->bt709_external & (1u << texture_index)) {
       m = &bt709_csc_coeffs;
       offset_vals = bt709_csc_offsets;
-   } else if (options->bt2020_external & (1 << tex->texture_index)) {
+   } else if (options->bt2020_external & (1u << texture_index)) {
       m = &bt2020_csc_coeffs;
       offset_vals = bt2020_csc_offsets;
    } else {
@@ -327,7 +328,8 @@ convert_yuv_to_rgb(nir_builder *b, nir_tex_instr *tex,
 
 static void
 lower_y_uv_external(nir_builder *b, nir_tex_instr *tex,
-                    const nir_lower_tex_options *options)
+                    const nir_lower_tex_options *options,
+                    unsigned texture_index)
 {
    b->cursor = nir_after_instr(&tex->instr);
 
@@ -339,12 +341,14 @@ lower_y_uv_external(nir_builder *b, nir_tex_instr *tex,
                       nir_channel(b, uv, 0),
                       nir_channel(b, uv, 1),
                       nir_imm_float(b, 1.0f),
-                      options);
+                      options,
+                      texture_index);
 }
 
 static void
 lower_y_u_v_external(nir_builder *b, nir_tex_instr *tex,
-                     const nir_lower_tex_options *options)
+                     const nir_lower_tex_options *options,
+                     unsigned texture_index)
 {
    b->cursor = nir_after_instr(&tex->instr);
 
@@ -357,12 +361,14 @@ lower_y_u_v_external(nir_builder *b, nir_tex_instr *tex,
                       nir_channel(b, u, 0),
                       nir_channel(b, v, 0),
                       nir_imm_float(b, 1.0f),
-                      options);
+                      options,
+                      texture_index);
 }
 
 static void
 lower_yx_xuxv_external(nir_builder *b, nir_tex_instr *tex,
-                       const nir_lower_tex_options *options)
+                       const nir_lower_tex_options *options,
+                       unsigned texture_index)
 {
    b->cursor = nir_after_instr(&tex->instr);
 
@@ -374,12 +380,14 @@ lower_yx_xuxv_external(nir_builder *b, nir_tex_instr *tex,
                       nir_channel(b, xuxv, 1),
                       nir_channel(b, xuxv, 3),
                       nir_imm_float(b, 1.0f),
-                      options);
+                      options,
+                      texture_index);
 }
 
 static void
 lower_xy_uxvx_external(nir_builder *b, nir_tex_instr *tex,
-                       const nir_lower_tex_options *options)
+                       const nir_lower_tex_options *options,
+                       unsigned texture_index)
 {
    b->cursor = nir_after_instr(&tex->instr);
 
@@ -391,12 +399,14 @@ lower_xy_uxvx_external(nir_builder *b, nir_tex_instr *tex,
                       nir_channel(b, uxvx, 0),
                       nir_channel(b, uxvx, 2),
                       nir_imm_float(b, 1.0f),
-                      options);
+                      options,
+                      texture_index);
 }
 
 static void
 lower_ayuv_external(nir_builder *b, nir_tex_instr *tex,
-                    const nir_lower_tex_options *options)
+                    const nir_lower_tex_options *options,
+                    unsigned texture_index)
 {
    b->cursor = nir_after_instr(&tex->instr);
 
@@ -407,12 +417,14 @@ lower_ayuv_external(nir_builder *b, nir_tex_instr *tex,
                       nir_channel(b, ayuv, 1),
                       nir_channel(b, ayuv, 0),
                       nir_channel(b, ayuv, 3),
-                      options);
+                      options,
+                      texture_index);
 }
 
 static void
 lower_xyuv_external(nir_builder *b, nir_tex_instr *tex,
-                    const nir_lower_tex_options *options)
+                    const nir_lower_tex_options *options,
+                    unsigned texture_index)
 {
    b->cursor = nir_after_instr(&tex->instr);
 
@@ -423,12 +435,14 @@ lower_xyuv_external(nir_builder *b, nir_tex_instr *tex,
                       nir_channel(b, xyuv, 1),
                       nir_channel(b, xyuv, 0),
                       nir_imm_float(b, 1.0f),
-                      options);
+                      options,
+                      texture_index);
 }
 
 static void
 lower_yuv_external(nir_builder *b, nir_tex_instr *tex,
-                   const nir_lower_tex_options *options)
+                   const nir_lower_tex_options *options,
+                   unsigned texture_index)
 {
    b->cursor = nir_after_instr(&tex->instr);
 
@@ -439,7 +453,8 @@ lower_yuv_external(nir_builder *b, nir_tex_instr *tex,
                       nir_channel(b, yuv, 1),
                       nir_channel(b, yuv, 2),
                       nir_imm_float(b, 1.0f),
-                      options);
+                      options,
+                      texture_index);
 }
 
 /*
@@ -1052,38 +1067,45 @@ nir_lower_tex_block(nir_block *block, nir_builder *b,
          progress = true;
       }
 
-      if ((1 << tex->texture_index) & options->lower_y_uv_external) {
-         lower_y_uv_external(b, tex, options);
+      unsigned texture_index = tex->texture_index;
+      int tex_index = nir_tex_instr_src_index(tex, nir_tex_src_texture_deref);
+      if (tex_index >= 0) {
+         nir_deref_instr *deref = nir_src_as_deref(tex->src[tex_index].src);
+         texture_index = nir_deref_instr_get_variable(deref)->data.binding;
+      }
+
+      if ((1u << texture_index) & options->lower_y_uv_external) {
+         lower_y_uv_external(b, tex, options, texture_index);
          progress = true;
       }
 
-      if ((1 << tex->texture_index) & options->lower_y_u_v_external) {
-         lower_y_u_v_external(b, tex, options);
+      if ((1u << texture_index) & options->lower_y_u_v_external) {
+         lower_y_u_v_external(b, tex, options, texture_index);
         progress = true;
       }
 
-      if ((1 << tex->texture_index) & options->lower_yx_xuxv_external) {
-         lower_yx_xuxv_external(b, tex, options);
+      if ((1u << texture_index) & options->lower_yx_xuxv_external) {
+         lower_yx_xuxv_external(b, tex, options, texture_index);
         progress = true;
       }
 
-      if ((1 << tex->texture_index) & options->lower_xy_uxvx_external) {
-         lower_xy_uxvx_external(b, tex, options);
+      if ((1u << texture_index) & options->lower_xy_uxvx_external) {
+         lower_xy_uxvx_external(b, tex, options, texture_index);
         progress = true;
       }
 
-      if ((1 << tex->texture_index) & options->lower_ayuv_external) {
-         lower_ayuv_external(b, tex, options);
+      if ((1u << texture_index) & options->lower_ayuv_external) {
+         lower_ayuv_external(b, tex, options, texture_index);
         progress = true;
       }
 
-      if ((1 << tex->texture_index) & options->lower_xyuv_external) {
-         lower_xyuv_external(b, tex, options);
+      if ((1u << texture_index) & options->lower_xyuv_external) {
+         lower_xyuv_external(b, tex, options, texture_index);
         progress = true;
       }
 
-      if ((1 << tex->texture_index) & options->lower_yuv_external) {
-         lower_yuv_external(b, tex, options);
+      if ((1u << texture_index) & options->lower_yuv_external) {
+         lower_yuv_external(b, tex, options, texture_index);
         progress = true;
       }
 
@@ -1097,7 +1119,7 @@ nir_lower_tex_block(nir_block *block, nir_builder *b,
          progress = true;
       }
 
-      if (((1 << tex->texture_index) & options->swizzle_result) &&
+      if (((1u << texture_index) & options->swizzle_result) &&
           !nir_tex_instr_is_query(tex) &&
           !(tex->is_shadow && tex->is_new_style_shadow)) {
          swizzle_result(b, tex, options->swizzles[tex->texture_index]);
@@ -1105,7 +1127,7 @@ nir_lower_tex_block(nir_block *block, nir_builder *b,
       }
 
       /* should be after swizzle so we know which channels are rgb: */
-      if (((1 << tex->texture_index) & options->lower_srgb) &&
+      if (((1u << texture_index) & options->lower_srgb) &&
          !nir_tex_instr_is_query(tex) && !tex->is_shadow) {
          linearize_srgb_result(b, tex);
          progress = true;
@@ -718,7 +718,7 @@ dri2_initialize_drm(_EGLDisplay *disp)
       goto cleanup;
    }
 
-   dev = _eglAddDevice(dri2_dpy->fd, disp->Options.ForceSoftware);
+   dev = _eglAddDevice(dri2_dpy->fd, dri2_dpy->gbm_dri->software);
    if (!dev) {
       err = "DRI2: failed to find EGLDevice";
       goto cleanup;
@@ -109,9 +109,9 @@ static int
 _eglAddDRMDevice(drmDevicePtr device, _EGLDevice **out_dev)
 {
    _EGLDevice *dev;
-   const int wanted_nodes = 1 << DRM_NODE_RENDER | 1 << DRM_NODE_PRIMARY;
 
-   if ((device->available_nodes & wanted_nodes) != wanted_nodes)
+   if ((device->available_nodes & (1 << DRM_NODE_PRIMARY |
+                                   1 << DRM_NODE_RENDER)) == 0)
       return -1;
 
    dev = _eglGlobal.DeviceList;
@@ -274,6 +274,9 @@ _eglRefreshDeviceList(void)
 
    num_devs = drmGetDevices2(0, devices, ARRAY_SIZE(devices));
    for (int i = 0; i < num_devs; i++) {
+      if (!(devices[i]->available_nodes & (1 << DRM_NODE_RENDER)))
+         continue;
+
       ret = _eglAddDRMDevice(devices[i], NULL);
 
       /* Device is not added - error or already present */
@@ -169,7 +169,7 @@ struct lp_static_texture_state
    unsigned swizzle_a:3;
 
    /* pipe_texture's state */
-   enum pipe_texture_target target:4;   /**< PIPE_TEXTURE_* */
+   enum pipe_texture_target target:5;   /**< PIPE_TEXTURE_* */
    unsigned pot_width:1;   /**< is the width a power of two? */
    unsigned pot_height:1;
    unsigned pot_depth:1;
@@ -60,6 +60,15 @@ const struct drm_driver_descriptor descriptor_name = { \
 
 #endif
 
+#ifdef GALLIUM_KMSRO_ONLY
+#undef GALLIUM_V3D
+#undef GALLIUM_VC4
+#undef GALLIUM_FREEDRENO
+#undef GALLIUM_ETNAVIV
+#undef GALLIUM_PANFROST
+#undef GALLIUM_LIMA
+#endif
+
 #ifdef GALLIUM_I915
 #include "i915/drm/i915_drm_public.h"
 #include "i915/i915_public.h"
@@ -81,9 +81,6 @@ sw_screen_create(struct sw_winsys *winsys)
    UNUSED bool only_sw = env_var_as_boolean("LIBGL_ALWAYS_SOFTWARE", false);
    const char *drivers[] = {
       debug_get_option("GALLIUM_DRIVER", ""),
-#if defined(GALLIUM_ZINK)
-      only_sw ? "" : "zink",
-#endif
 #if defined(GALLIUM_D3D12)
       only_sw ? "" : "d3d12",
 #endif
@@ -95,6 +92,9 @@ sw_screen_create(struct sw_winsys *winsys)
 #endif
 #if defined(GALLIUM_SWR)
       "swr",
 #endif
+#if defined(GALLIUM_ZINK)
+      only_sw ? "" : "zink",
+#endif
    };
 
@@ -86,9 +86,6 @@ sw_screen_create(struct sw_winsys *winsys)
    UNUSED bool only_sw = env_var_as_boolean("LIBGL_ALWAYS_SOFTWARE", false);
    const char *drivers[] = {
       debug_get_option("GALLIUM_DRIVER", ""),
-#if defined(GALLIUM_ZINK)
-      only_sw ? "" : "zink",
-#endif
 #if defined(GALLIUM_D3D12)
       only_sw ? "" : "d3d12",
 #endif
@@ -100,6 +97,9 @@ sw_screen_create(struct sw_winsys *winsys)
 #endif
 #if defined(GALLIUM_SWR)
       "swr",
 #endif
+#if defined(GALLIUM_ZINK)
+      only_sw ? "" : "zink",
+#endif
    };
 
@@ -1761,7 +1761,7 @@ iris_create_rasterizer_state(struct pipe_context *ctx,
       sf.SmoothPointEnable = (state->point_smooth || state->multisample) &&
                              !state->point_quad_rasterization;
       sf.PointWidthSource = state->point_size_per_vertex ? Vertex : State;
-      sf.PointWidth = state->point_size;
+      sf.PointWidth = CLAMP(state->point_size, 0.125f, 255.875f);
 
       if (state->flatshade_first) {
          sf.TriangleFanProvokingVertexSelect = 1;
@@ -436,7 +436,8 @@ panfrost_prepare_midgard_fs_state(struct panfrost_context *ctx,
    } else {
       /* Reasons to disable early-Z from a shader perspective */
       bool late_z = fs->can_discard || fs->writes_global ||
-                    fs->writes_depth || fs->writes_stencil;
+                    fs->writes_depth || fs->writes_stencil ||
+                    (zsa->alpha_func != MALI_FUNC_ALWAYS);
 
       /* If either depth or stencil is enabled, discard matters */
       bool zs_enabled =
@@ -2210,7 +2210,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
       }
       index_bias = info->index_bias;
    } else {
-      index_bias = draws[0].start;
+      index_bias = indirect ? 0 : draws[0].start;
    }
 
    /* Set the index offset and primitive restart. */
@@ -229,7 +229,9 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
       return LLVM_VERSION_MAJOR < 9 && !sscreen->info.has_unaligned_shader_loads;

    case PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE:
-      return sscreen->info.has_sparse_vm_mappings ? RADEON_SPARSE_PAGE_SIZE : 0;
+      /* Gfx8 (Polaris11) hangs, so don't enable this on Gfx8 and older chips. */
+      return sscreen->info.chip_class >= GFX9 &&
+             sscreen->info.has_sparse_vm_mappings ? RADEON_SPARSE_PAGE_SIZE : 0;

    case PIPE_CAP_UMA:
    case PIPE_CAP_PREFER_IMM_ARRAYS_AS_CONSTBUF:
@@ -140,8 +140,6 @@ void si_init_compiler(struct si_screen *sscreen, struct ac_llvm_compiler *compil

    enum ac_target_machine_options tm_options =
       (sscreen->debug_flags & DBG(GISEL) ? AC_TM_ENABLE_GLOBAL_ISEL : 0) |
-      (sscreen->info.chip_class <= GFX8 ? AC_TM_FORCE_DISABLE_XNACK :
-       sscreen->info.chip_class <= GFX10 ? AC_TM_FORCE_ENABLE_XNACK : 0) |
       (!sscreen->llvm_has_working_vgpr_indexing ? AC_TM_PROMOTE_ALLOCA_TO_SCRATCH : 0) |
       (sscreen->debug_flags & DBG(CHECK_IR) ? AC_TM_CHECK_IR : 0) |
       (create_low_opt_compiler ? AC_TM_CREATE_LOW_OPT : 0);
@@ -567,11 +567,12 @@ void lvp_UpdateDescriptorSetWithTemplate(VkDevice _device,
       struct lvp_descriptor *desc =
          &set->descriptors[bind_layout->descriptor_index];
       for (j = 0; j < entry->descriptorCount; ++j) {
+         unsigned idx = j + entry->dstArrayElement;
          switch (entry->descriptorType) {
          case VK_DESCRIPTOR_TYPE_SAMPLER: {
            LVP_FROM_HANDLE(lvp_sampler, sampler,
                            *(VkSampler *)pSrc);
-           desc[j] = (struct lvp_descriptor) {
+           desc[idx] = (struct lvp_descriptor) {
               .type = VK_DESCRIPTOR_TYPE_SAMPLER,
               .info.sampler = sampler,
            };
@@ -579,7 +580,7 @@ void lvp_UpdateDescriptorSetWithTemplate(VkDevice _device,
         }
         case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER: {
            VkDescriptorImageInfo *info = (VkDescriptorImageInfo *)pSrc;
-           desc[j] = (struct lvp_descriptor) {
+           desc[idx] = (struct lvp_descriptor) {
               .type = entry->descriptorType,
               .info.iview = lvp_image_view_from_handle(info->imageView),
               .info.sampler = lvp_sampler_from_handle(info->sampler),
@@ -591,7 +592,7 @@ void lvp_UpdateDescriptorSetWithTemplate(VkDevice _device,
         case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT: {
            LVP_FROM_HANDLE(lvp_image_view, iview,
                            ((VkDescriptorImageInfo *)pSrc)->imageView);
-           desc[j] = (struct lvp_descriptor) {
+           desc[idx] = (struct lvp_descriptor) {
               .type = entry->descriptorType,
               .info.iview = iview,
            };
@@ -601,7 +602,7 @@ void lvp_UpdateDescriptorSetWithTemplate(VkDevice _device,
         case VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER: {
            LVP_FROM_HANDLE(lvp_buffer_view, bview,
                            *(VkBufferView *)pSrc);
-           desc[j] = (struct lvp_descriptor) {
+           desc[idx] = (struct lvp_descriptor) {
               .type = entry->descriptorType,
               .info.buffer_view = bview,
            };
@@ -613,7 +614,7 @@ void lvp_UpdateDescriptorSetWithTemplate(VkDevice _device,
         case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC:
         case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC: {
            VkDescriptorBufferInfo *info = (VkDescriptorBufferInfo *)pSrc;
-           desc[j] = (struct lvp_descriptor) {
+           desc[idx] = (struct lvp_descriptor) {
               .type = entry->descriptorType,
               .info.offset = info->offset,
               .info.buffer = lvp_buffer_from_handle(info->buffer),
@@ -1722,16 +1722,24 @@ static void handle_copy_image(struct lvp_cmd_buffer_entry *cmd,
       struct pipe_box src_box;
       src_box.x = copycmd->regions[i].srcOffset.x;
       src_box.y = copycmd->regions[i].srcOffset.y;
-      src_box.z = copycmd->regions[i].srcOffset.z + copycmd->regions[i].srcSubresource.baseArrayLayer;
       src_box.width = copycmd->regions[i].extent.width;
       src_box.height = copycmd->regions[i].extent.height;
-      src_box.depth = copycmd->regions[i].extent.depth;
+      if (copycmd->src->bo->target == PIPE_TEXTURE_3D) {
+         src_box.depth = copycmd->regions[i].extent.depth;
+         src_box.z = copycmd->regions[i].srcOffset.z;
+      } else {
+         src_box.depth = copycmd->regions[i].srcSubresource.layerCount;
+         src_box.z = copycmd->regions[i].srcSubresource.baseArrayLayer;
+      }
+
+      unsigned dstz = copycmd->dst->bo->target == PIPE_TEXTURE_3D ?
+                      copycmd->regions[i].dstOffset.z :
+                      copycmd->regions[i].dstSubresource.baseArrayLayer;
       state->pctx->resource_copy_region(state->pctx, copycmd->dst->bo,
                                         copycmd->regions[i].dstSubresource.mipLevel,
                                         copycmd->regions[i].dstOffset.x,
                                         copycmd->regions[i].dstOffset.y,
-                                        copycmd->regions[i].dstOffset.z + copycmd->regions[i].dstSubresource.baseArrayLayer,
+                                        dstz,
                                         copycmd->src->bo,
                                         copycmd->regions[i].srcSubresource.mipLevel,
                                         &src_box);
@@ -2096,7 +2104,7 @@ static void handle_copy_query_pool_results(struct lvp_cmd_buffer_entry *cmd,
    struct lvp_query_pool *pool = copycmd->pool;

    for (unsigned i = copycmd->first_query; i < copycmd->first_query + copycmd->query_count; i++) {
-      unsigned offset = copycmd->dst->offset + (copycmd->stride * (i - copycmd->first_query));
+      unsigned offset = copycmd->dst_offset + copycmd->dst->offset + (copycmd->stride * (i - copycmd->first_query));
       if (pool->queries[i]) {
          if (copycmd->flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
             state->pctx->get_query_result_resource(state->pctx,
@@ -2106,21 +2114,35 @@ static void handle_copy_query_pool_results(struct lvp_cmd_buffer_entry *cmd,
                                                    -1,
                                                    copycmd->dst->bo,
                                                    offset + (copycmd->flags & VK_QUERY_RESULT_64_BIT ? 8 : 4));
-         state->pctx->get_query_result_resource(state->pctx,
-                                                pool->queries[i],
-                                                copycmd->flags & VK_QUERY_RESULT_WAIT_BIT,
-                                                copycmd->flags & VK_QUERY_RESULT_64_BIT ? PIPE_QUERY_TYPE_U64 : PIPE_QUERY_TYPE_U32,
-                                                0,
-                                                copycmd->dst->bo,
-                                                offset);
+         if (pool->type == VK_QUERY_TYPE_PIPELINE_STATISTICS) {
+            unsigned num_results = 0;
+            unsigned result_size = copycmd->flags & VK_QUERY_RESULT_64_BIT ? 8 : 4;
+            u_foreach_bit(bit, pool->pipeline_stats)
+               state->pctx->get_query_result_resource(state->pctx,
+                                                      pool->queries[i],
+                                                      copycmd->flags & VK_QUERY_RESULT_WAIT_BIT,
+                                                      copycmd->flags & VK_QUERY_RESULT_64_BIT ? PIPE_QUERY_TYPE_U64 : PIPE_QUERY_TYPE_U32,
+                                                      bit,
+                                                      copycmd->dst->bo,
+                                                      offset + num_results++ * result_size);
+         } else {
+            state->pctx->get_query_result_resource(state->pctx,
+                                                   pool->queries[i],
+                                                   copycmd->flags & VK_QUERY_RESULT_WAIT_BIT,
+                                                   copycmd->flags & VK_QUERY_RESULT_64_BIT ? PIPE_QUERY_TYPE_U64 : PIPE_QUERY_TYPE_U32,
+                                                   0,
+                                                   copycmd->dst->bo,
+                                                   offset);
+         }
       } else {
          /* if no queries emitted yet, just reset the buffer to 0 so avail is reported correctly */
          if (copycmd->flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT) {
            struct pipe_transfer *src_t;
            uint32_t *map;

-           struct pipe_box box = {};
-           box.width = copycmd->stride * copycmd->query_count;
+           struct pipe_box box = {0};
+           box.x = offset;
+           box.width = copycmd->stride;
            box.height = 1;
            box.depth = 1;
            map = state->pctx->transfer_map(state->pctx,
@@ -696,6 +696,7 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, VAImageID image,
          }
       }
    }
+   drv->pipe->flush(drv->pipe, NULL, 0);
    mtx_unlock(&drv->mutex);

    return VA_STATUS_SUCCESS;
@@ -2,3 +2,5 @@
 #include "target-helpers/inline_debug_helper.h"
 #include "frontend/drm_driver.h"
 #include "kmsro/drm/kmsro_drm_public.h"
+#define GALLIUM_KMSRO_ONLY
+#include "target-helpers/drm_helper.h"
@@ -486,10 +486,13 @@ dri_screen_create_sw(struct gbm_dri_device *dri)
       return -errno;

    ret = dri_screen_create_dri2(dri, driver_name);
-   if (ret == 0)
+   if (ret != 0)
+      ret = dri_screen_create_swrast(dri);
+   if (ret != 0)
       return ret;

-   return dri_screen_create_swrast(dri);
+   dri->software = true;
+   return 0;
 }

 static const struct gbm_dri_visual gbm_dri_visuals_table[] = {
@@ -63,6 +63,7 @@ struct gbm_dri_device {

    void *driver;
    char *driver_name; /* Name of the DRI module, without the _dri suffix */
+   bool software; /* A software driver was loaded */

    __DRIscreen *screen;
    __DRIcontext *context;
@@ -367,7 +367,8 @@ is_logic_op(enum opcode opcode)
 }

 static bool
-can_take_stride(fs_inst *inst, unsigned arg, unsigned stride,
+can_take_stride(fs_inst *inst, brw_reg_type dst_type,
+                unsigned arg, unsigned stride,
                 const gen_device_info *devinfo)
 {
    if (stride > 4)
@@ -377,9 +378,9 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned stride,
     * of the corresponding channel of the destination, and the provided stride
     * would break this restriction.
     */
-   if (has_dst_aligned_region_restriction(devinfo, inst) &&
+   if (has_dst_aligned_region_restriction(devinfo, inst, dst_type) &&
        !(type_sz(inst->src[arg].type) * stride ==
-         type_sz(inst->dst.type) * inst->dst.stride ||
+         type_sz(dst_type) * inst->dst.stride ||
          stride == 0))
       return false;

@@ -528,10 +529,15 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
    if (instruction_requires_packed_data(inst) && entry_stride != 1)
       return false;

+   const brw_reg_type dst_type = (has_source_modifiers &&
+                                  entry->dst.type != inst->src[arg].type) ?
+                                 entry->dst.type : inst->dst.type;
+
    /* Bail if the result of composing both strides would exceed the
     * hardware limit.
     */
-   if (!can_take_stride(inst, arg, entry_stride * inst->src[arg].stride,
+   if (!can_take_stride(inst, dst_type, arg,
+                        entry_stride * inst->src[arg].stride,
                         devinfo))
       return false;
@@ -549,7 +549,8 @@ is_unordered(const fs_inst *inst)
  */
 static inline bool
 has_dst_aligned_region_restriction(const gen_device_info *devinfo,
-                                   const fs_inst *inst)
+                                   const fs_inst *inst,
+                                   brw_reg_type dst_type)
 {
    const brw_reg_type exec_type = get_exec_type(inst);
    /* Even though the hardware spec claims that "integer DWord multiply"
@@ -563,13 +564,20 @@ has_dst_aligned_region_restriction(const gen_device_info *devinfo,
       (inst->opcode == BRW_OPCODE_MAD &&
        MIN2(type_sz(inst->src[1].type), type_sz(inst->src[2].type)) >= 4));

-   if (type_sz(inst->dst.type) > 4 || type_sz(exec_type) > 4 ||
+   if (type_sz(dst_type) > 4 || type_sz(exec_type) > 4 ||
        (type_sz(exec_type) == 4 && is_dword_multiply))
       return devinfo->is_cherryview || gen_device_info_is_9lp(devinfo);
    else
       return false;
 }

+static inline bool
+has_dst_aligned_region_restriction(const gen_device_info *devinfo,
+                                   const fs_inst *inst)
+{
+   return has_dst_aligned_region_restriction(devinfo, inst, inst->dst.type);
+}
+
 /**
  * Return whether the LOAD_PAYLOAD instruction is a plain copy of bits from
  * the specified register file into a VGRF.
@@ -7,7 +7,6 @@ import os
 import pathlib
 import subprocess
 import sys
-import tempfile

 # The meson version handles windows paths better, but if it's not available
 # fall back to shlex
@@ -37,18 +36,17 @@ success = True
 for asm_file in args.gen_folder.glob('*.asm'):
     expected_file = asm_file.stem + '.expected'
     expected_path = args.gen_folder / expected_file
-    out_path = tempfile.NamedTemporaryFile()

     try:
         command = i965_asm + [
             '--type', 'hex',
             '--gen', args.gen_name,
-            '--output', out_path.name,
             asm_file
         ]
-        subprocess.run(command,
-                       stdout=subprocess.DEVNULL,
-                       stderr=subprocess.STDOUT)
+        with subprocess.Popen(command,
+                              stdout=subprocess.PIPE,
+                              stderr=subprocess.DEVNULL) as cmd:
+            lines_after = [line.decode('ascii') for line in cmd.stdout.readlines()]
     except OSError as e:
         if e.errno == errno.ENOEXEC:
             print('Skipping due to inability to run host binaries.',
@@ -58,7 +56,6 @@ for asm_file in args.gen_folder.glob('*.asm'):

     with expected_path.open() as f:
         lines_before = f.readlines()
-    lines_after = [line.decode('ascii') for line in out_path]

     diff = ''.join(difflib.unified_diff(lines_before, lines_after,
                                         expected_file, asm_file.stem + '.out'))
@@ -139,7 +139,7 @@ lower_tex_src_plane_block(nir_builder *b, lower_tex_src_state *state, nir_block
    if (tex_index >= 0 && samp_index >= 0) {
       b->cursor = nir_before_instr(&tex->instr);

-      nir_variable* samp = find_sampler(state, plane[0].i32);
+      nir_variable* samp = find_sampler(state, tex->sampler_index);
       assert(samp);

       nir_deref_instr *tex_deref_instr = nir_build_deref_var(b, samp);
@@ -1321,7 +1321,7 @@ st_create_fp_variant(struct st_context *st,
                   key->external.lower_yuv)) {
       NIR_PASS_V(state.ir.nir, st_nir_lower_tex_src_plane,
                  ~stfp->Base.SamplersUsed,
-                 key->external.lower_nv12 || key->external.lower_xy_uxvx ||
+                 key->external.lower_nv12 | key->external.lower_xy_uxvx |
                  key->external.lower_yx_xuxv,
                  key->external.lower_iyuv);
       finalize = true;
@@ -104,6 +104,11 @@ u_bit_scan(unsigned *mask)
    return i;
 }

+#define u_foreach_bit(b, dword) \
+   for (uint32_t __dword = (dword), b; \
+        ((b) = ffs(__dword) - 1, __dword); \
+        __dword &= ~(1 << (b)))
+
 static inline int
 u_bit_scan64(uint64_t *mask)
 {
@@ -112,6 +117,11 @@ u_bit_scan64(uint64_t *mask)
    return i;
 }

+#define u_foreach_bit64(b, dword) \
+   for (uint64_t __dword = (dword), b; \
+        ((b) = ffsll(__dword) - 1, __dword); \
+        __dword &= ~(1ull << (b)))
+
 /* Determine if an unsigned value is a power of two.
  *
  * \note
@@ -165,7 +165,7 @@ key_u32_equals(const void *a, const void *b)
 struct set *
 _mesa_set_create_u32_keys(void *mem_ctx)
 {
-   return _mesa_set_create(NULL, key_u32_hash, key_u32_equals);
+   return _mesa_set_create(mem_ctx, key_u32_hash, key_u32_equals);
 }

 struct set *
@@ -444,20 +444,14 @@ get_cpu_topology(void)
        util_cpu_caps.family < CPU_AMD_LAST) {
       uint32_t regs[4];

-      /* Query the L3 cache count. */
-      cpuid_count(0x8000001D, 3, regs);
-      unsigned cache_level = (regs[0] >> 5) & 0x7;
-      unsigned cores_per_L3 = ((regs[0] >> 14) & 0xfff) + 1;
-
-      if (cache_level != 3 || cores_per_L3 == util_cpu_caps.nr_cpus)
-         return;
-
       uint32_t saved_mask[UTIL_MAX_CPUS / 32] = {0};
       uint32_t mask[UTIL_MAX_CPUS / 32] = {0};
-      uint32_t allowed_mask[UTIL_MAX_CPUS / 32] = {0};
-      uint32_t apic_id[UTIL_MAX_CPUS];
       bool saved = false;

+      uint32_t L3_found[UTIL_MAX_CPUS] = {0};
+      uint32_t num_L3_caches = 0;
+      util_affinity_mask *L3_affinity_masks = NULL;
+
       /* Query APIC IDs from each CPU core.
        *
        * An APIC ID is a logical ID of the CPU with respect to the cache
@@ -484,39 +478,58 @@ get_cpu_topology(void)
                                               !saved ? saved_mask : NULL,
                                               util_cpu_caps.num_cpu_mask_bits)) {
             saved = true;
-            allowed_mask[i / 32] |= cpu_bit;

             /* Query the APIC ID of the current core. */
             cpuid(0x00000001, regs);
-            apic_id[i] = regs[1] >> 24;
+            unsigned apic_id = regs[1] >> 24;
+
+            /* Query the total core count for the CPU */
+            uint32_t core_count = 1;
+            if (regs[3] & (1 << 28))
+               core_count = (regs[1] >> 16) & 0xff;
+
+            core_count = util_next_power_of_two(core_count);
+
+            /* Query the L3 cache count. */
+            cpuid_count(0x8000001D, 3, regs);
+            unsigned cache_level = (regs[0] >> 5) & 0x7;
+            unsigned cores_per_L3 = ((regs[0] >> 14) & 0xfff) + 1;
+
+            if (cache_level != 3)
+               continue;
+
+            unsigned local_core_id = apic_id & (core_count - 1);
+            unsigned phys_id = (apic_id & ~(core_count - 1)) >> util_logbase2(core_count);
+            unsigned local_l3_cache_index = local_core_id / util_next_power_of_two(cores_per_L3);
+#define L3_ID(p, i) (p << 16 | i << 1 | 1);
+
+            unsigned l3_id = L3_ID(phys_id, local_l3_cache_index);
+            int idx = -1;
+            for (unsigned c = 0; c < num_L3_caches; c++) {
+               if (L3_found[c] == l3_id) {
+                  idx = c;
+                  break;
+               }
+            }
+            if (idx == -1) {
+               idx = num_L3_caches;
+               L3_found[num_L3_caches++] = l3_id;
+               L3_affinity_masks = realloc(L3_affinity_masks, sizeof(util_affinity_mask) * num_L3_caches);
+               if (!L3_affinity_masks)
+                  return;
+               memset(&L3_affinity_masks[num_L3_caches - 1], 0, sizeof(util_affinity_mask));
+            }
+            util_cpu_caps.cpu_to_L3[i] = idx;
+            L3_affinity_masks[idx][i / 32] |= cpu_bit;
          }
          mask[i / 32] = 0;
       }

+      util_cpu_caps.num_L3_caches = num_L3_caches;
+      util_cpu_caps.L3_affinity_mask = L3_affinity_masks;
+
       if (saved) {
-
-         /* We succeeded in using at least one CPU. */
-         util_cpu_caps.num_L3_caches = util_cpu_caps.nr_cpus / cores_per_L3;
-         util_cpu_caps.cores_per_L3 = cores_per_L3;
-         util_cpu_caps.L3_affinity_mask = calloc(sizeof(util_affinity_mask),
-                                                 util_cpu_caps.num_L3_caches);
-
-         for (unsigned i = 0; i < util_cpu_caps.nr_cpus && i < UTIL_MAX_CPUS;
-              i++) {
-            uint32_t cpu_bit = 1u << (i % 32);
-
-            if (allowed_mask[i / 32] & cpu_bit) {
-               /* Each APIC ID bit represents a topology level, so we need
-                * to round up to the next power of two.
-                */
-               unsigned L3_index = apic_id[i] /
-                                   util_next_power_of_two(cores_per_L3);
-
-               util_cpu_caps.L3_affinity_mask[L3_index][i / 32] |= cpu_bit;
-               util_cpu_caps.cpu_to_L3[i] = L3_index;
-            }
-         }
-
          if (debug_get_option_dump_cpu()) {
            fprintf(stderr, "CPU <-> L3 cache mapping:\n");
            for (unsigned i = 0; i < util_cpu_caps.num_L3_caches; i++) {