Compare commits

...

16 Commits

Author SHA1 Message Date
Dylan Baker
ea1d5faa96 bump version to 18.1.0-rc3 2018-05-04 10:57:29 -07:00
Neil Roberts
d90d4e61e2 spirv: Apply OriginUpperLeft to FragCoord
This behaviour was changed in 1e5b09f42f. The commit message
for that says it is just a “tidy up” so my assumption is that the
behaviour change was a mistake. It’s a little hard to decipher looking
at the diff, but the previous code before that patch was:

  if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
     nir_var->data.origin_upper_left = b->origin_upper_left;

  if (builtin == SpvBuiltInFragCoord)
     nir_var->data.pixel_center_integer = b->pixel_center_integer;

After the patch the code was:

  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     /* fallthrough */
  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     break;

Before the patch origin_upper_left affected both builtins and
pixel_center_integer only affected FragCoord. After the patch
origin_upper_left only affects SamplePosition and pixel_center_integer
affects both variables.

This patch tries to restore the previous behaviour by changing the
code to:

  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     /* fallthrough */
  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     break;

This change will be important for ARB_gl_spirv which is meant to
support OriginLowerLeft.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: 1e5b09f42f "spirv: Tidy some repeated if checks..."
(cherry picked from commit e17d0ccbbd)
2018-05-03 10:56:08 -07:00
Deepak Rawat
cd1435aa9d egl/x11: Send invalidate to driver on copy_region path in swap_buffer
Similar to swap_available path send invalidate to the driver because
egl/X11 is not watching for for server's invalidate events. The
dri2_copy_region path is trigerred when server supports DRI2 version
minor 1.

Tested with piglit egl tests for regression.

V2: Move invalidate from dri2_copy_region to swap_buffer common.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 9a21c96126)
2018-05-03 10:55:51 -07:00
Jose Maria Casanova Crespo
0d15a443fa intel/compiler: fix brw_imm_w for negative 16-bit integers
16-bit immediates need to replicate the 16-bit immediate value
in both words of the 32-bit value. This needs to be careful
to avoid sign-extension, which the previous implementation was
not handling properly.

For example, with the previous implementation, storing the value
-3 would generate imm.d = 0xfffffffd due to signed integer sign
extension, which is not correct. Instead, we should cast to
uint16_t, which gives us the correct result: imm.ud = 0xfffdfffd.

We only had a couple of cases hitting this path in the driver
until now, one with value -1, which would work since all bits are
one in this case, and another with value -2 in brw_clip_tri(),
which would hit the aforementioned issue (this case only affects
gen4 although we are not aware of whether this was causing an
actual bug somewhere).

v2: Make explicit uint32_t casting for left shift (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f0e6dacee5)
2018-05-03 10:55:43 -07:00
Jose Maria Casanova Crespo
1e5c3fa29b intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate
From Intel Skylake PRM, vol 07, "Immediate" section (page 768):

"For a word, unsigned word, or half-float immediate data,
software must replicate the same 16-bit immediate value to both
the lower word and the high word of the 32-bit immediate field
in a GEN instruction."

This fixes the int16/uint16 negate and abs immediates that weren't
taking into account the replication in lower and upper words.

v2: Integer cases are different to Float cases. (Jason Ekstrand)
    Included reference to PRM (Jose Maria Casanova)
v3: Make explicit uint32_t casting for left shift (Jason Ekstrand)
    Split half float implementation. (Jason Ekstrand)
    Fix brw_abs_immediate (Jose Maria Casanova)

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 2a76f03c90)
2018-05-03 10:55:34 -07:00
Bas Nieuwenhuizen
57aebd4283 radv: Don't check the incoming apiVersion on CreateInstance.
This fixes

dEQP-VK.api.device_init.create_instance_invalid_api_version

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 467c562a29)
2018-05-03 10:55:26 -07:00
Bas Nieuwenhuizen
e334caa4be radv: Allow vkEnumerateInstanceVersion ProcAddr without instance.
Apparently the somewhere between 1.1.70 and 1.1.73 the loader started
depending on this. The loader then creates a 1.0 instance, which gets
into funny situation because we have a 1.1 device.

No idea how to do line wrapping in Mako though, my random guesses
did not work.

CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 9267ff9883)
2018-05-03 10:55:19 -07:00
Nanley Chery
b3f3d605c8 i965/tex_image: Avoid the ASTC LDR workaround on gen9lp
Both the internal documentation and the results of testing this in the
CI suggest that this is unnecessary. Add the fixes tag because this
reduces an internal benchmark's startup time by about 17 seconds
(reported by Eero).

Fixes: 710b1d2e66 "i965/tex_image: Flush certain subnormal ASTC channel values"
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3e56e4642f)
2018-05-02 11:23:54 -07:00
Jason Ekstrand
c760bbff20 anv: Allow lookup of vkEnumerateInstanceVersion without an instance
Fixes: cbab2d1da5
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d216ffc604)
2018-05-02 11:23:49 -07:00
Matthew Nicholls
63953cc0fb radv: fix multisample image copies
Previously before fb077b0728, the LOD parameter was being used in place of the
sample index, which would only copy the first sample to all samples in the
destination image. After that multisample image copies wouldn't copy anything
from my observations.

This fixes some copy_and_blit CTS tests.

v3.1: - set lod to 0 for nir_txf_ms (Samuel)
v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel)
    - updated commit description (Samuel)

Fix this properly by copying each sample in a separate radv_CmdDraw and using a
pipeline with the correct rasterizationSamples for the destination image.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 97d57ef917)
2018-05-02 11:23:32 -07:00
Samuel Pitoiset
4cf3a2b064 radv: compute the number of subpass attachments correctly
Only count color attachments twice if resolves are used, also
account for the depth stencil attachment if present.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit d8db5986ce)
2018-05-01 14:19:55 -07:00
Andres Rodriguez
2fe5a43995 radv/winsys: fix leaking resources from bo's imported by fd
A bo's ref_count was not being initialized when imported from an fd.
Therefore, we would fail to free the resource during VkFreeMemory().

This patch fixes applications like hifi VR in threaded mode, which
perform frequent imports/releases of IPC shared memory.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f56e22e496)
2018-05-01 14:19:49 -07:00
Leo Liu
7a1f220b26 st/omx/enc: fix blit setup for YUV LoadImage
The blit here involves scaling since it's copying from I8 format to R8G8 format.
Half of source will be filtered out with PIPE_TEX_FILTER_NEAREST instruction, it
looks that GPU always uses the second half as source. Currently we use "1" as
the start point of x for R, then causing 1 source pixel of U component shift to
right. So "-1" should be the start point for U component.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1c5f4f4e17)
2018-04-30 09:22:20 -07:00
Juan A. Suarez Romero
171753ff5d autotools, meson: bump up required VA version
Due using a new VP9 config we use, required VA API 0.39

Fixes: 413c5ca372 ("travis: update libva required version")
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4d449c94e4)
2018-04-30 09:22:13 -07:00
Marek Olšák
7d6ed8d0dd radeonsi/gfx9: workaround for INTERP with indirect indexing
and clean up the conditions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6d19120da8)
2018-04-30 09:22:08 -07:00
Marek Olšák
66f64177b2 util/u_queue: fix a deadlock in util_queue_finish
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 7083ac7290)
2018-04-30 09:22:03 -07:00
19 changed files with 258 additions and 155 deletions

View File

@@ -1 +1 @@
18.1.0-rc2
18.1.0-rc3

View File

@@ -86,7 +86,7 @@ DRI2PROTO_REQUIRED=2.8
GLPROTO_REQUIRED=1.4.14
LIBOMXIL_BELLAGIO_REQUIRED=0.0
LIBOMXIL_TIZONIA_REQUIRED=0.10.0
LIBVA_REQUIRED=0.38.0
LIBVA_REQUIRED=0.39.0
VDPAU_REQUIRED=1.1
WAYLAND_REQUIRED=1.11
WAYLAND_PROTOCOLS_REQUIRED=1.8

View File

@@ -584,7 +584,7 @@ endif
with_gallium_va = _va == 'true'
dep_va = null_dep
if with_gallium_va
dep_va = dependency('libva', version : '>= 0.38.0')
dep_va = dependency('libva', version : '>= 0.39.0')
dep_va_headers = declare_dependency(
compile_args : run_command(prog_pkgconfig, ['libva', '--cflags']).stdout().split()
)

View File

@@ -463,15 +463,6 @@ VkResult radv_CreateInstance(
client_version = VK_MAKE_VERSION(1, 0, 0);
}
if (VK_MAKE_VERSION(1, 0, 0) > client_version ||
client_version > VK_MAKE_VERSION(1, 1, 0xfff)) {
return vk_errorf(VK_ERROR_INCOMPATIBLE_DRIVER,
"Client requested version %d.%d.%d",
VK_VERSION_MAJOR(client_version),
VK_VERSION_MINOR(client_version),
VK_VERSION_PATCH(client_version));
}
instance = vk_zalloc2(&default_alloc, pAllocator, sizeof(*instance), 8,
VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
if (!instance)

View File

@@ -205,7 +205,7 @@ radv_entrypoint_is_enabled(int index, uint32_t core_version,
% if not e.device_command:
if (device) return false;
% endif
% if e.name == 'vkCreateInstance' or e.name == 'vkEnumerateInstanceExtensionProperties' or e.name == 'vkEnumerateInstanceLayerProperties':
% if e.name == 'vkCreateInstance' or e.name == 'vkEnumerateInstanceExtensionProperties' or e.name == 'vkEnumerateInstanceLayerProperties' or e.name == 'vkEnumerateInstanceVersion':
return !device;
% elif e.core_version:
return instance && ${e.core_version.c_vk_version()} <= core_version;

View File

@@ -100,7 +100,8 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_buffer *src_buf,
struct blit2d_src_temps *tmp,
enum blit2d_src_type src_type, VkFormat depth_format,
VkImageAspectFlagBits aspects)
VkImageAspectFlagBits aspects,
uint32_t log2_samples)
{
struct radv_device *device = cmd_buffer->device;
@@ -108,7 +109,7 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
create_bview(cmd_buffer, src_buf, &tmp->bview, depth_format);
radv_meta_push_descriptor_set(cmd_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
0, /* set */
1, /* descriptorWriteCount */
(VkWriteDescriptorSet[]) {
@@ -123,7 +124,7 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
});
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 16, 4,
&src_buf->pitch);
} else {
@@ -131,12 +132,12 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
if (src_type == BLIT2D_SRC_TYPE_IMAGE_3D)
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 16, 4,
&src_img->layer);
radv_meta_push_descriptor_set(cmd_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
0, /* set */
1, /* descriptorWriteCount */
(VkWriteDescriptorSet[]) {
@@ -190,10 +191,11 @@ blit2d_bind_dst(struct radv_cmd_buffer *cmd_buffer,
static void
bind_pipeline(struct radv_cmd_buffer *cmd_buffer,
enum blit2d_src_type src_type, unsigned fs_key)
enum blit2d_src_type src_type, unsigned fs_key,
uint32_t log2_samples)
{
VkPipeline pipeline =
cmd_buffer->device->meta_state.blit2d.pipelines[src_type][fs_key];
cmd_buffer->device->meta_state.blit2d[log2_samples].pipelines[src_type][fs_key];
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
@@ -201,10 +203,11 @@ bind_pipeline(struct radv_cmd_buffer *cmd_buffer,
static void
bind_depth_pipeline(struct radv_cmd_buffer *cmd_buffer,
enum blit2d_src_type src_type)
enum blit2d_src_type src_type,
uint32_t log2_samples)
{
VkPipeline pipeline =
cmd_buffer->device->meta_state.blit2d.depth_only_pipeline[src_type];
cmd_buffer->device->meta_state.blit2d[log2_samples].depth_only_pipeline[src_type];
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
@@ -212,10 +215,11 @@ bind_depth_pipeline(struct radv_cmd_buffer *cmd_buffer,
static void
bind_stencil_pipeline(struct radv_cmd_buffer *cmd_buffer,
enum blit2d_src_type src_type)
enum blit2d_src_type src_type,
uint32_t log2_samples)
{
VkPipeline pipeline =
cmd_buffer->device->meta_state.blit2d.stencil_only_pipeline[src_type];
cmd_buffer->device->meta_state.blit2d[log2_samples].stencil_only_pipeline[src_type];
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
@@ -227,7 +231,8 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_buffer *src_buf,
struct radv_meta_blit2d_surf *dst,
unsigned num_rects,
struct radv_meta_blit2d_rect *rects, enum blit2d_src_type src_type)
struct radv_meta_blit2d_rect *rects, enum blit2d_src_type src_type,
uint32_t log2_samples)
{
struct radv_device *device = cmd_buffer->device;
@@ -241,7 +246,7 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
else if (aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT)
depth_format = vk_format_depth_only(dst->image->vk_format);
struct blit2d_src_temps src_temps;
blit2d_bind_src(cmd_buffer, src_img, src_buf, &src_temps, src_type, depth_format, aspect_mask);
blit2d_bind_src(cmd_buffer, src_img, src_buf, &src_temps, src_type, depth_format, aspect_mask, log2_samples);
struct blit2d_dst_temps dst_temps;
blit2d_bind_dst(cmd_buffer, dst, rects[r].dst_x + rects[r].width,
@@ -255,7 +260,7 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_VERTEX_BIT, 0, 16,
vertex_push_constants);
@@ -266,7 +271,7 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.render_passes[fs_key][dst_layout],
.renderPass = device->meta_state.blit2d_render_passes[fs_key][dst_layout],
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
@@ -277,13 +282,13 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_pipeline(cmd_buffer, src_type, fs_key);
bind_pipeline(cmd_buffer, src_type, fs_key, log2_samples);
} else if (aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT) {
enum radv_blit_ds_layout ds_layout = radv_meta_blit_ds_to_type(dst->current_layout);
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.depth_only_rp[ds_layout],
.renderPass = device->meta_state.blit2d_depth_only_rp[ds_layout],
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
@@ -294,14 +299,14 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_depth_pipeline(cmd_buffer, src_type);
bind_depth_pipeline(cmd_buffer, src_type, log2_samples);
} else if (aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
enum radv_blit_ds_layout ds_layout = radv_meta_blit_ds_to_type(dst->current_layout);
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.stencil_only_rp[ds_layout],
.renderPass = device->meta_state.blit2d_stencil_only_rp[ds_layout],
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
@@ -312,7 +317,7 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_stencil_pipeline(cmd_buffer, src_type);
bind_stencil_pipeline(cmd_buffer, src_type, log2_samples);
} else
unreachable("Processing blit2d with multiple aspects.");
@@ -332,7 +337,24 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
if (log2_samples > 0) {
for (uint32_t sample = 0; sample < src_img->image->info.samples; sample++) {
uint32_t sample_mask = 1 << sample;
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 20, 4,
&sample);
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 24, 4,
&sample_mask);
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
}
}
else
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
radv_CmdEndRenderPass(radv_cmd_buffer_to_handle(cmd_buffer));
/* At the point where we emit the draw call, all data from the
@@ -358,7 +380,8 @@ radv_meta_blit2d(struct radv_cmd_buffer *cmd_buffer,
enum blit2d_src_type src_type = src_buf ? BLIT2D_SRC_TYPE_BUFFER :
use_3d ? BLIT2D_SRC_TYPE_IMAGE_3D : BLIT2D_SRC_TYPE_IMAGE;
radv_meta_blit2d_normal_dst(cmd_buffer, src_img, src_buf, dst,
num_rects, rects, src_type);
num_rects, rects, src_type,
src_img ? util_logbase2(src_img->image->info.samples) : 0);
}
static nir_shader *
@@ -421,13 +444,14 @@ build_nir_vertex_shader(void)
typedef nir_ssa_def* (*texel_fetch_build_func)(struct nir_builder *,
struct radv_device *,
nir_ssa_def *, bool);
nir_ssa_def *, bool, bool);
static nir_ssa_def *
build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
nir_ssa_def *tex_pos, bool is_3d)
nir_ssa_def *tex_pos, bool is_3d, bool is_multisampled)
{
enum glsl_sampler_dim dim = is_3d ? GLSL_SAMPLER_DIM_3D : GLSL_SAMPLER_DIM_2D;
enum glsl_sampler_dim dim =
is_3d ? GLSL_SAMPLER_DIM_3D : is_multisampled ? GLSL_SAMPLER_DIM_MS : GLSL_SAMPLER_DIM_2D;
const struct glsl_type *sampler_type =
glsl_sampler_type(dim, false, false, GLSL_TYPE_UINT);
nir_variable *sampler = nir_variable_create(b->shader, nir_var_uniform,
@@ -436,6 +460,7 @@ build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
sampler->data.binding = 0;
nir_ssa_def *tex_pos_3d = NULL;
nir_intrinsic_instr *sample_idx = NULL;
if (is_3d) {
nir_intrinsic_instr *layer = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(layer, 16);
@@ -451,13 +476,26 @@ build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
chans[2] = &layer->dest.ssa;
tex_pos_3d = nir_vec(b, chans, 3);
}
nir_tex_instr *tex = nir_tex_instr_create(b->shader, 2);
if (is_multisampled) {
sample_idx = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(sample_idx, 20);
nir_intrinsic_set_range(sample_idx, 4);
sample_idx->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
sample_idx->num_components = 1;
nir_ssa_dest_init(&sample_idx->instr, &sample_idx->dest, 1, 32, "sample_idx");
nir_builder_instr_insert(b, &sample_idx->instr);
}
nir_tex_instr *tex = nir_tex_instr_create(b->shader, is_multisampled ? 3 : 2);
tex->sampler_dim = dim;
tex->op = nir_texop_txf;
tex->op = is_multisampled ? nir_texop_txf_ms : nir_texop_txf;
tex->src[0].src_type = nir_tex_src_coord;
tex->src[0].src = nir_src_for_ssa(is_3d ? tex_pos_3d : tex_pos);
tex->src[1].src_type = nir_tex_src_lod;
tex->src[1].src = nir_src_for_ssa(nir_imm_int(b, 0));
tex->src[1].src_type = is_multisampled ? nir_tex_src_ms_index : nir_tex_src_lod;
tex->src[1].src = nir_src_for_ssa(is_multisampled ? &sample_idx->dest.ssa : nir_imm_int(b, 0));
if (is_multisampled) {
tex->src[2].src_type = nir_tex_src_lod;
tex->src[2].src = nir_src_for_ssa(nir_imm_int(b, 0));
}
tex->dest_type = nir_type_uint;
tex->is_array = false;
tex->coord_components = is_3d ? 3 : 2;
@@ -473,7 +511,7 @@ build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
static nir_ssa_def *
build_nir_buffer_fetch(struct nir_builder *b, struct radv_device *device,
nir_ssa_def *tex_pos, bool is_3d)
nir_ssa_def *tex_pos, bool is_3d, bool is_multisampled)
{
const struct glsl_type *sampler_type =
glsl_sampler_type(GLSL_SAMPLER_DIM_BUF, false, false, GLSL_TYPE_UINT);
@@ -519,9 +557,31 @@ static const VkPipelineVertexInputStateCreateInfo normal_vi_create_info = {
.vertexAttributeDescriptionCount = 0,
};
static void
build_nir_store_sample_mask(struct nir_builder *b)
{
nir_intrinsic_instr *sample_mask = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(sample_mask, 24);
nir_intrinsic_set_range(sample_mask, 4);
sample_mask->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
sample_mask->num_components = 1;
nir_ssa_dest_init(&sample_mask->instr, &sample_mask->dest, 1, 32, "sample_mask");
nir_builder_instr_insert(b, &sample_mask->instr);
const struct glsl_type *sample_mask_out_type = glsl_uint_type();
nir_variable *sample_mask_out =
nir_variable_create(b->shader, nir_var_shader_out,
sample_mask_out_type, "sample_mask_out");
sample_mask_out->data.location = FRAG_RESULT_SAMPLE_MASK;
nir_store_var(b, sample_mask_out, &sample_mask->dest.ssa, 0x1);
}
static nir_shader *
build_nir_copy_fragment_shader(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name, bool is_3d,
bool is_multisampled)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -538,11 +598,15 @@ build_nir_copy_fragment_shader(struct radv_device *device,
vec4, "f_color");
color_out->data.location = FRAG_RESULT_DATA0;
if (is_multisampled) {
build_nir_store_sample_mask(&b);
}
nir_ssa_def *pos_int = nir_f2i32(&b, nir_load_var(&b, tex_pos_in));
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d, is_multisampled);
nir_store_var(&b, color_out, color, 0xf);
return b.shader;
@@ -550,7 +614,8 @@ build_nir_copy_fragment_shader(struct radv_device *device,
static nir_shader *
build_nir_copy_fragment_shader_depth(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name, bool is_3d,
bool is_multisampled)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -567,11 +632,15 @@ build_nir_copy_fragment_shader_depth(struct radv_device *device,
vec4, "f_color");
color_out->data.location = FRAG_RESULT_DEPTH;
if (is_multisampled) {
build_nir_store_sample_mask(&b);
}
nir_ssa_def *pos_int = nir_f2i32(&b, nir_load_var(&b, tex_pos_in));
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d, is_multisampled);
nir_store_var(&b, color_out, color, 0x1);
return b.shader;
@@ -579,7 +648,8 @@ build_nir_copy_fragment_shader_depth(struct radv_device *device,
static nir_shader *
build_nir_copy_fragment_shader_stencil(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name, bool is_3d,
bool is_multisampled)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -596,11 +666,15 @@ build_nir_copy_fragment_shader_stencil(struct radv_device *device,
vec4, "f_color");
color_out->data.location = FRAG_RESULT_STENCIL;
if (is_multisampled) {
build_nir_store_sample_mask(&b);
}
nir_ssa_def *pos_int = nir_f2i32(&b, nir_load_var(&b, tex_pos_in));
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d, is_multisampled);
nir_store_var(&b, color_out, color, 0x1);
return b.shader;
@@ -614,45 +688,48 @@ radv_device_finish_meta_blit2d_state(struct radv_device *device)
for(unsigned j = 0; j < NUM_META_FS_KEYS; ++j) {
for (unsigned k = 0; k < RADV_META_DST_LAYOUT_COUNT; ++k) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.render_passes[j][k],
&state->alloc);
state->blit2d_render_passes[j][k],
&state->alloc);
}
}
for (enum radv_blit_ds_layout j = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; j < RADV_BLIT_DS_LAYOUT_COUNT; j++) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.depth_only_rp[j], &state->alloc);
state->blit2d_depth_only_rp[j], &state->alloc);
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.stencil_only_rp[j], &state->alloc);
state->blit2d_stencil_only_rp[j], &state->alloc);
}
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->blit2d.p_layouts[src],
&state->alloc);
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
state->blit2d.ds_layouts[src],
&state->alloc);
for (unsigned log2_samples = 0; log2_samples < 1 + MAX_SAMPLES_LOG2; ++log2_samples) {
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->blit2d[log2_samples].p_layouts[src],
&state->alloc);
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
state->blit2d[log2_samples].ds_layouts[src],
&state->alloc);
for (unsigned j = 0; j < NUM_META_FS_KEYS; ++j) {
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d[log2_samples].pipelines[src][j],
&state->alloc);
}
for (unsigned j = 0; j < NUM_META_FS_KEYS; ++j) {
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d.pipelines[src][j],
state->blit2d[log2_samples].depth_only_pipeline[src],
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d[log2_samples].stencil_only_pipeline[src],
&state->alloc);
}
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d.depth_only_pipeline[src],
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d.stencil_only_pipeline[src],
&state->alloc);
}
}
static VkResult
blit2d_init_color_pipeline(struct radv_device *device,
enum blit2d_src_type src_type,
VkFormat format)
VkFormat format,
uint32_t log2_samples)
{
VkResult result;
unsigned fs_key = radv_format_meta_fs_key(format);
@@ -681,7 +758,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D, log2_samples > 0);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -705,7 +782,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
};
for (unsigned dst_layout = 0; dst_layout < RADV_META_DST_LAYOUT_COUNT; ++dst_layout) {
if (!device->meta_state.blit2d.render_passes[fs_key][dst_layout]) {
if (!device->meta_state.blit2d_render_passes[fs_key][dst_layout]) {
VkImageLayout layout = radv_meta_dst_layout_to_layout(dst_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
@@ -737,7 +814,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.render_passes[fs_key][dst_layout]);
}, &device->meta_state.alloc, &device->meta_state.blit2d_render_passes[fs_key][dst_layout]);
}
}
@@ -765,7 +842,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.rasterizationSamples = 1 << log2_samples,
.sampleShadingEnable = false,
.pSampleMask = (VkSampleMask[]) { UINT32_MAX },
},
@@ -796,8 +873,8 @@ blit2d_init_color_pipeline(struct radv_device *device,
},
},
.flags = 0,
.layout = device->meta_state.blit2d.p_layouts[src_type],
.renderPass = device->meta_state.blit2d.render_passes[fs_key][0],
.layout = device->meta_state.blit2d[log2_samples].p_layouts[src_type],
.renderPass = device->meta_state.blit2d_render_passes[fs_key][0],
.subpass = 0,
};
@@ -809,7 +886,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&vk_pipeline_info, &radv_pipeline_info,
&device->meta_state.alloc,
&device->meta_state.blit2d.pipelines[src_type][fs_key]);
&device->meta_state.blit2d[log2_samples].pipelines[src_type][fs_key]);
ralloc_free(vs.nir);
@@ -820,7 +897,8 @@ blit2d_init_color_pipeline(struct radv_device *device,
static VkResult
blit2d_init_depth_only_pipeline(struct radv_device *device,
enum blit2d_src_type src_type)
enum blit2d_src_type src_type,
uint32_t log2_samples)
{
VkResult result;
const char *name;
@@ -847,7 +925,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
const VkPipelineVertexInputStateCreateInfo *vi_create_info;
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader_depth(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader_depth(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D, log2_samples > 0);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -871,7 +949,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
};
for (enum radv_blit_ds_layout ds_layout = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; ds_layout < RADV_BLIT_DS_LAYOUT_COUNT; ds_layout++) {
if (!device->meta_state.blit2d.depth_only_rp[ds_layout]) {
if (!device->meta_state.blit2d_depth_only_rp[ds_layout]) {
VkImageLayout layout = radv_meta_blit_ds_to_layout(ds_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
@@ -899,7 +977,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.depth_only_rp[ds_layout]);
}, &device->meta_state.alloc, &device->meta_state.blit2d_depth_only_rp[ds_layout]);
}
}
@@ -927,7 +1005,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.rasterizationSamples = 1 << log2_samples,
.sampleShadingEnable = false,
.pSampleMask = (VkSampleMask[]) { UINT32_MAX },
},
@@ -958,8 +1036,8 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
},
},
.flags = 0,
.layout = device->meta_state.blit2d.p_layouts[src_type],
.renderPass = device->meta_state.blit2d.depth_only_rp[0],
.layout = device->meta_state.blit2d[log2_samples].p_layouts[src_type],
.renderPass = device->meta_state.blit2d_depth_only_rp[0],
.subpass = 0,
};
@@ -971,7 +1049,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&vk_pipeline_info, &radv_pipeline_info,
&device->meta_state.alloc,
&device->meta_state.blit2d.depth_only_pipeline[src_type]);
&device->meta_state.blit2d[log2_samples].depth_only_pipeline[src_type]);
ralloc_free(vs.nir);
@@ -982,7 +1060,8 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
static VkResult
blit2d_init_stencil_only_pipeline(struct radv_device *device,
enum blit2d_src_type src_type)
enum blit2d_src_type src_type,
uint32_t log2_samples)
{
VkResult result;
const char *name;
@@ -1009,7 +1088,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
const VkPipelineVertexInputStateCreateInfo *vi_create_info;
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader_stencil(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader_stencil(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D, log2_samples > 0);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -1033,7 +1112,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
};
for (enum radv_blit_ds_layout ds_layout = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; ds_layout < RADV_BLIT_DS_LAYOUT_COUNT; ds_layout++) {
if (!device->meta_state.blit2d.stencil_only_rp[ds_layout]) {
if (!device->meta_state.blit2d_stencil_only_rp[ds_layout]) {
VkImageLayout layout = radv_meta_blit_ds_to_layout(ds_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
@@ -1061,7 +1140,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.stencil_only_rp[ds_layout]);
}, &device->meta_state.alloc, &device->meta_state.blit2d_stencil_only_rp[ds_layout]);
}
}
@@ -1089,7 +1168,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.rasterizationSamples = 1 << log2_samples,
.sampleShadingEnable = false,
.pSampleMask = (VkSampleMask[]) { UINT32_MAX },
},
@@ -1136,8 +1215,8 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
},
},
.flags = 0,
.layout = device->meta_state.blit2d.p_layouts[src_type],
.renderPass = device->meta_state.blit2d.stencil_only_rp[0],
.layout = device->meta_state.blit2d[log2_samples].p_layouts[src_type],
.renderPass = device->meta_state.blit2d_stencil_only_rp[0],
.subpass = 0,
};
@@ -1149,7 +1228,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&vk_pipeline_info, &radv_pipeline_info,
&device->meta_state.alloc,
&device->meta_state.blit2d.stencil_only_pipeline[src_type]);
&device->meta_state.blit2d[log2_samples].stencil_only_pipeline[src_type]);
ralloc_free(vs.nir);
@@ -1175,15 +1254,16 @@ static VkFormat pipeline_formats[] = {
static VkResult
meta_blit2d_create_pipe_layout(struct radv_device *device,
int idx)
int idx,
uint32_t log2_samples)
{
VkResult result;
VkDescriptorType desc_type = (idx == BLIT2D_SRC_TYPE_BUFFER) ? VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER : VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE;
const VkPushConstantRange push_constant_ranges[] = {
{VK_SHADER_STAGE_VERTEX_BIT, 0, 16},
{VK_SHADER_STAGE_FRAGMENT_BIT, 16, 4},
{VK_SHADER_STAGE_FRAGMENT_BIT, 16, 12},
};
int num_push_constant_range = (idx != BLIT2D_SRC_TYPE_IMAGE) ? 2 : 1;
int num_push_constant_range = (idx != BLIT2D_SRC_TYPE_IMAGE || log2_samples > 0) ? 2 : 1;
result = radv_CreateDescriptorSetLayout(radv_device_to_handle(device),
&(VkDescriptorSetLayoutCreateInfo) {
@@ -1199,7 +1279,7 @@ meta_blit2d_create_pipe_layout(struct radv_device *device,
.pImmutableSamplers = NULL
},
}
}, &device->meta_state.alloc, &device->meta_state.blit2d.ds_layouts[idx]);
}, &device->meta_state.alloc, &device->meta_state.blit2d[log2_samples].ds_layouts[idx]);
if (result != VK_SUCCESS)
goto fail;
@@ -1207,11 +1287,11 @@ meta_blit2d_create_pipe_layout(struct radv_device *device,
&(VkPipelineLayoutCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.blit2d.ds_layouts[idx],
.pSetLayouts = &device->meta_state.blit2d[log2_samples].ds_layouts[idx],
.pushConstantRangeCount = num_push_constant_range,
.pPushConstantRanges = push_constant_ranges,
},
&device->meta_state.alloc, &device->meta_state.blit2d.p_layouts[idx]);
&device->meta_state.alloc, &device->meta_state.blit2d[log2_samples].p_layouts[idx]);
if (result != VK_SUCCESS)
goto fail;
return VK_SUCCESS;
@@ -1225,27 +1305,33 @@ radv_device_init_meta_blit2d_state(struct radv_device *device)
VkResult result;
bool create_3d = device->physical_device->rad_info.chip_class >= GFX9;
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
if (src == BLIT2D_SRC_TYPE_IMAGE_3D && !create_3d)
continue;
for (unsigned log2_samples = 0; log2_samples < 1 + MAX_SAMPLES_LOG2; log2_samples++) {
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
if (src == BLIT2D_SRC_TYPE_IMAGE_3D && !create_3d)
continue;
result = meta_blit2d_create_pipe_layout(device, src);
if (result != VK_SUCCESS)
goto fail;
/* Don't need to handle copies between buffers and multisample images. */
if (src == BLIT2D_SRC_TYPE_BUFFER && log2_samples > 0)
continue;
for (unsigned j = 0; j < ARRAY_SIZE(pipeline_formats); ++j) {
result = blit2d_init_color_pipeline(device, src, pipeline_formats[j]);
result = meta_blit2d_create_pipe_layout(device, src, log2_samples);
if (result != VK_SUCCESS)
goto fail;
for (unsigned j = 0; j < ARRAY_SIZE(pipeline_formats); ++j) {
result = blit2d_init_color_pipeline(device, src, pipeline_formats[j], log2_samples);
if (result != VK_SUCCESS)
goto fail;
}
result = blit2d_init_depth_only_pipeline(device, src, log2_samples);
if (result != VK_SUCCESS)
goto fail;
result = blit2d_init_stencil_only_pipeline(device, src, log2_samples);
if (result != VK_SUCCESS)
goto fail;
}
result = blit2d_init_depth_only_pipeline(device, src);
if (result != VK_SUCCESS)
goto fail;
result = blit2d_init_stencil_only_pipeline(device, src);
if (result != VK_SUCCESS)
goto fail;
}
return VK_SUCCESS;

View File

@@ -87,8 +87,8 @@ VkResult radv_CreateRenderPass(
subpass_attachment_count +=
desc->inputAttachmentCount +
desc->colorAttachmentCount +
/* Count colorAttachmentCount again for resolve_attachments */
desc->colorAttachmentCount;
(desc->pResolveAttachments ? desc->colorAttachmentCount : 0) +
(desc->pDepthStencilAttachment != NULL);
}
if (subpass_attachment_count) {

View File

@@ -465,18 +465,18 @@ struct radv_meta_state {
} blit;
struct {
VkRenderPass render_passes[NUM_META_FS_KEYS][RADV_META_DST_LAYOUT_COUNT];
VkPipelineLayout p_layouts[5];
VkDescriptorSetLayout ds_layouts[5];
VkPipeline pipelines[5][NUM_META_FS_KEYS];
VkPipelineLayout p_layouts[3];
VkDescriptorSetLayout ds_layouts[3];
VkPipeline pipelines[3][NUM_META_FS_KEYS];
VkPipeline depth_only_pipeline[5];
VkRenderPass depth_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkPipeline depth_only_pipeline[3];
VkPipeline stencil_only_pipeline[5];
} blit2d[1 + MAX_SAMPLES_LOG2];
VkRenderPass stencil_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkPipeline stencil_only_pipeline[3];
} blit2d;
VkRenderPass blit2d_render_passes[NUM_META_FS_KEYS][RADV_META_DST_LAYOUT_COUNT];
VkRenderPass blit2d_depth_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkRenderPass blit2d_stencil_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
struct {
VkPipelineLayout img_p_layout;

View File

@@ -501,6 +501,7 @@ radv_amdgpu_winsys_bo_from_fd(struct radeon_winsys *_ws,
bo->size = result.alloc_size;
bo->is_shared = true;
bo->ws = ws;
bo->ref_count = 1;
radv_amdgpu_add_buffer_to_global_list(bo);
return (struct radeon_winsys_bo *)bo;
error_va_map:

View File

@@ -1419,11 +1419,11 @@ apply_var_decoration(struct vtn_builder *b, nir_variable *nir_var,
case SpvBuiltInTessLevelInner:
nir_var->data.compact = true;
break;
case SpvBuiltInSamplePosition:
nir_var->data.origin_upper_left = b->origin_upper_left;
/* fallthrough */
case SpvBuiltInFragCoord:
nir_var->data.pixel_center_integer = b->pixel_center_integer;
/* fallthrough */
case SpvBuiltInSamplePosition:
nir_var->data.origin_upper_left = b->origin_upper_left;
break;
default:
break;

View File

@@ -864,19 +864,22 @@ dri2_x11_swap_buffers_msc(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw,
if (draw->Type == EGL_PIXMAP_BIT || draw->Type == EGL_PBUFFER_BIT)
return 0;
if (draw->SwapBehavior == EGL_BUFFER_PRESERVED || !dri2_dpy->swap_available)
return dri2_copy_region(drv, disp, draw, dri2_surf->region) ? 0 : -1;
if (draw->SwapBehavior == EGL_BUFFER_PRESERVED || !dri2_dpy->swap_available) {
swap_count = dri2_copy_region(drv, disp, draw, dri2_surf->region) ? 0 : -1;
} else {
dri2_flush_drawable_for_swapbuffers(disp, draw);
dri2_flush_drawable_for_swapbuffers(disp, draw);
cookie = xcb_dri2_swap_buffers_unchecked(dri2_dpy->conn,
dri2_surf->drawable, msc_hi,
msc_lo, divisor_hi, divisor_lo,
remainder_hi, remainder_lo);
cookie = xcb_dri2_swap_buffers_unchecked(dri2_dpy->conn, dri2_surf->drawable,
msc_hi, msc_lo, divisor_hi, divisor_lo, remainder_hi, remainder_lo);
reply = xcb_dri2_swap_buffers_reply(dri2_dpy->conn, cookie, NULL);
reply = xcb_dri2_swap_buffers_reply(dri2_dpy->conn, cookie, NULL);
if (reply) {
swap_count = (((int64_t)reply->swap_hi) << 32) | reply->swap_lo;
free(reply);
if (reply) {
swap_count = (((int64_t)reply->swap_hi) << 32) | reply->swap_lo;
free(reply);
}
}
/* Since we aren't watching for the server's invalidate events like we're

View File

@@ -477,12 +477,19 @@ static int si_get_shader_param(struct pipe_screen* pscreen,
case PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR:
/* TODO: Indirect indexing of GS inputs is unimplemented. */
return shader != PIPE_SHADER_GEOMETRY &&
(sscreen->llvm_has_working_vgpr_indexing ||
/* TCS and TES load inputs directly from LDS or
* offchip memory, so indirect indexing is trivial. */
shader == PIPE_SHADER_TESS_CTRL ||
shader == PIPE_SHADER_TESS_EVAL);
if (shader == PIPE_SHADER_GEOMETRY)
return 0;
if (shader == PIPE_SHADER_VERTEX &&
!sscreen->llvm_has_working_vgpr_indexing)
return 0;
/* TCS and TES load inputs directly from LDS or offchip
* memory, so indirect indexing is always supported.
* PS has to support indirect indexing, because we can't
* lower that to TEMPs for INTERP instructions.
*/
return 1;
case PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR:
return sscreen->llvm_has_working_vgpr_indexing ||

View File

@@ -353,7 +353,7 @@ OMX_ERRORTYPE enc_LoadImage_common(vid_enc_PrivateType * priv, OMX_VIDEO_PORTDEF
blit.src.resource = inp->resource;
blit.src.format = inp->resource->format;
blit.src.box.x = 0;
blit.src.box.x = -1;
blit.src.box.y = def->nFrameHeight;
blit.src.box.width = def->nFrameWidth;
blit.src.box.height = def->nFrameHeight / 2 ;
@@ -367,11 +367,11 @@ OMX_ERRORTYPE enc_LoadImage_common(vid_enc_PrivateType * priv, OMX_VIDEO_PORTDEF
blit.dst.box.depth = 1;
blit.filter = PIPE_TEX_FILTER_NEAREST;
blit.mask = PIPE_MASK_G;
blit.mask = PIPE_MASK_R;
priv->s_pipe->blit(priv->s_pipe, &blit);
blit.src.box.x = 1;
blit.mask = PIPE_MASK_R;
blit.src.box.x = 0;
blit.mask = PIPE_MASK_G;
priv->s_pipe->blit(priv->s_pipe, &blit);
priv->s_pipe->flush(priv->s_pipe, NULL, 0);

View File

@@ -705,7 +705,7 @@ static inline struct brw_reg
brw_imm_w(int16_t w)
{
struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_W);
imm.d = w | (w << 16);
imm.ud = (uint16_t)w | (uint32_t)(uint16_t)w << 16;
return imm;
}

View File

@@ -580,9 +580,11 @@ brw_negate_immediate(enum brw_reg_type type, struct brw_reg *reg)
reg->d = -reg->d;
return true;
case BRW_REGISTER_TYPE_W:
case BRW_REGISTER_TYPE_UW:
reg->d = -(int16_t)reg->ud;
case BRW_REGISTER_TYPE_UW: {
uint16_t value = -(int16_t)reg->ud;
reg->ud = value | (uint32_t)value << 16;
return true;
}
case BRW_REGISTER_TYPE_F:
reg->f = -reg->f;
return true;
@@ -618,9 +620,11 @@ brw_abs_immediate(enum brw_reg_type type, struct brw_reg *reg)
case BRW_REGISTER_TYPE_D:
reg->d = abs(reg->d);
return true;
case BRW_REGISTER_TYPE_W:
reg->d = abs((int16_t)reg->ud);
case BRW_REGISTER_TYPE_W: {
uint16_t value = abs((int16_t)reg->ud);
reg->ud = value | (uint32_t)value << 16;
return true;
}
case BRW_REGISTER_TYPE_F:
reg->f = fabsf(reg->f);
return true;

View File

@@ -1179,6 +1179,7 @@ PFN_vkVoidFunction anv_GetInstanceProcAddr(
LOOKUP_ANV_ENTRYPOINT(EnumerateInstanceExtensionProperties);
LOOKUP_ANV_ENTRYPOINT(EnumerateInstanceLayerProperties);
LOOKUP_ANV_ENTRYPOINT(EnumerateInstanceVersion);
LOOKUP_ANV_ENTRYPOINT(CreateInstance);
#undef LOOKUP_ANV_ENTRYPOINT

View File

@@ -927,7 +927,7 @@ intelCompressedTexSubImage(struct gl_context *ctx, GLuint dims,
!_mesa_is_srgb_format(gl_format);
struct brw_context *brw = (struct brw_context*) ctx;
const struct gen_device_info *devinfo = &brw->screen->devinfo;
if (devinfo->gen == 9 && is_linear_astc)
if (devinfo->gen == 9 && !gen_device_info_is_9lp(devinfo) && is_linear_astc)
flush_astc_denorms(ctx, dims, texImage,
xoffset, yoffset, zoffset,
width, height, depth);

View File

@@ -311,6 +311,7 @@ util_queue_init(struct util_queue *queue,
goto fail;
(void) mtx_init(&queue->lock, mtx_plain);
(void) mtx_init(&queue->finish_lock, mtx_plain);
queue->num_queued = 0;
cnd_init(&queue->has_queued_cond);
@@ -398,6 +399,7 @@ util_queue_destroy(struct util_queue *queue)
cnd_destroy(&queue->has_space_cond);
cnd_destroy(&queue->has_queued_cond);
mtx_destroy(&queue->finish_lock);
mtx_destroy(&queue->lock);
free(queue->jobs);
free(queue->threads);
@@ -529,6 +531,12 @@ util_queue_finish(struct util_queue *queue)
util_barrier_init(&barrier, queue->num_threads);
/* If 2 threads were adding jobs for 2 different barries at the same time,
* a deadlock would happen, because 1 barrier requires that all threads
* wait for it exclusively.
*/
mtx_lock(&queue->finish_lock);
for (unsigned i = 0; i < queue->num_threads; ++i) {
util_queue_fence_init(&fences[i]);
util_queue_add_job(queue, &barrier, &fences[i], util_queue_finish_execute, NULL);
@@ -538,6 +546,7 @@ util_queue_finish(struct util_queue *queue)
util_queue_fence_wait(&fences[i]);
util_queue_fence_destroy(&fences[i]);
}
mtx_unlock(&queue->finish_lock);
util_barrier_destroy(&barrier);

View File

@@ -200,6 +200,7 @@ struct util_queue_job {
/* Put this into your context. */
struct util_queue {
const char *name;
mtx_t finish_lock; /* only for util_queue_finish */
mtx_t lock;
cnd_t has_queued_cond;
cnd_t has_space_cond;