Compare commits
39 Commits
mesa-20.1.
...
mesa-20.1.
0a443eb1ad
bc6fd91e68
e1f6000b54
30b256c21e
71b3582ec1
fc21ef6b66
d74c2e743d
0dbec6b964
db4a29d078
4bff9ca691
9dcc7d4d41
3a8ba8ecb3
7e3ed26c28
79bed11bdd
80c6955c23
02f2b9fa7b
083b992f9d
520d023bfb
14c7f4740e
53356f8972
7590165899
46762687a0
ef29f3758e
45a937e040
5fedabe34b
a4f2c6face
077d2a8068
78df8e5e38
0f61e68ede
11ebe27d97
80da07288b
d99fe9f86f
b8534f4771
819be690c0
c2c53b9e63
d226595210
c0d443656f
ed94f8f266
e9ec84ad66
.pick_status.json (4172 changes)
File diff suppressed because it is too large
@@ -36,7 +36,7 @@ depends on the particular driver being used.
 
 <h2>SHA256 checksum</h2>
 <pre>
-TBD.
+df21351494f7caaec5a3ccc16f14f15512e98d2ecde178bba1d134edc899b961  mesa-20.1.8.tar.xz
 </pre>
 
docs/relnotes/20.1.9.html (new file, 140 lines)
@@ -0,0 +1,140 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>

<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>

<iframe src="../contents.html"></iframe>
<div class="content">

<h1>Mesa 20.1.9 Release Notes / 2020-09-30</h1>

<p>
Mesa 20.1.9 is a bug fix release which fixes bugs found since the 20.1.8 release.
</p>
<p>
Mesa 20.1.9 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 20.1.9 implements the Vulkan 1.2 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>

<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>

<h2>New features</h2>

<ul>
<li>None</li>
</ul>

<h2>Bug fixes</h2>

<ul>
<li>Horizon Zero Dawn graphics corruption with radv</li>
<li>Running Amber test leads to VK_DEVICE_LOST</li>
<li>[spirv-fuzz] Shader generates a wrong image</li>
<li>anv: dEQP-VK.robustness.robustness2.* failures on gen12</li>
<li>[RADV] Problems reading primitive ID in fragment shader after tessellation</li>
<li>Substance Painter 6.1.3 black glitches on Radeon RX570</li>
<li>vkCmdCopyImage broadcasts subsample 0 of MSAA src into all subsamples of dst on RADV</li>
</ul>

<h2>Changes</h2>

<ul>
<p>Bas Nieuwenhuizen (3):</p>
<li> amd/common: Cache intra-tile addresses for retile map.</li>
<li> ac/surface: Fix depth import on GFX6-GFX8.</li>
<li> st/mesa: Deal with empty textures/buffers in semaphore wait/signal.</li>
<p></p>
<p>Christian Gmeiner (1):</p>
<li> etnaviv: simplify linear stride implementation</li>
<p></p>
<p>Connor Abbott (1):</p>
<li> nir/lower_io_arrays: Fix xfb_offset bug</li>
<p></p>
<p>Danylo Piliaiev (4):</p>
<li> nir/lower_io: Eliminate oob writes and return zero for oob reads</li>
<li> nir/large_constants: Eliminate out-of-bounds writes to large constants</li>
<li> nir/lower_samplers: Clamp out-of-bounds access to array of samplers</li>
<li> intel/fs: Disable sample mask predication for scratch stores</li>
<p></p>
<p>Dylan Baker (1):</p>
<li> meson/anv: Use variable that checks for --build-id</li>
<p></p>
<p>Eric Engestrom (9):</p>
<li> docs/relnotes: add sha256 sums to 20.1.8</li>
<li> .pick_status.json: Update to d74fe47101995d2659b1e59495d2f77b9dc14f3d</li>
<li> .pick_status.json: Update to c669db0b503c10faf2d1c67c9340d7222b4f946e</li>
<li> .pick_status.json: Update to a3543adc2628461818cfa691a7f547af7bc6f0fb</li>
<li> .pick_status.json: Mark 802d3611dcec8102ef75fe2461340c2997af931e as denominated</li>
<li> .pick_status.json: Mark e98c7a66347a05fc166c377ab1abb77955aff775 as denominated</li>
<li> .pick_status.json: Mark 6b1a56b908e702c06f55c63b19b695a47f607456 as denominated</li>
<li> .pick_status.json: Mark 89401e58672e1251b954662f0f776a6e9bce6df8 as denominated</li>
<li> .pick_status.json: Update to efaea653b5766427701817ab06c319902a148ee9</li>
<p></p>
<p>Erik Faye-Lund (2):</p>
<li> mesa: handle GL_FRONT after translating to it</li>
<li> st/mesa: use roundf instead of floorf for lod-bias rounding</li>
<p></p>
<p>Jason Ekstrand (2):</p>
<li> intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP</li>
<li> nir/liveness: Consider if uses in nir_ssa_defs_interfere</li>
<p></p>
<p>Jesse Natalie (1):</p>
<li> glsl_type: Add packed to structure type comparison for hash map</li>
<p></p>
<p>Karol Herbst (1):</p>
<li> spirv: extract switch parsing into its own function</li>
<p></p>
<p>Lionel Landwerlin (1):</p>
<li> intel/compiler: fixup Gen12 workaround for array sizes</li>
<p></p>
<p>Marek Olšák (1):</p>
<li> radeonsi: fix indirect dispatches with variable block sizes</li>
<p></p>
<p>Nanley Chery (1):</p>
<li> blorp: Ensure aligned HIZ_CCS_WT partial clears</li>
<p></p>
<p>Pierre-Eric Pelloux-Prayer (3):</p>
<li> mesa: fix glUniform* when a struct contains a bindless sampler</li>
<li> gallium/vl: do not call transfer_unmap if transfer is NULL</li>
<li> gallium/vl: add chroma_format arg to vl_video_buffer functions</li>
<p></p>
<p>Pierre-Loup A. Griffais (2):</p>
<li> radv: fix null descriptor for dynamic buffers</li>
<li> radv: fix vertex buffer null descriptors</li>
<p></p>
<p>Rhys Perry (2):</p>
<li> radv: initialize with expanded cmask if the destination layout needs it</li>
<li> radv,aco: fix reading primitive ID in FS after TES</li>
<p></p>
<p>Samuel Pitoiset (2):</p>
<li> radv: fix transform feedback crashes if pCounterBufferOffsets is NULL</li>
<li> spirv: fix emitting switch cases that directly jump to the merge block</li>
<p></p>
<p></p>
</ul>

</div>
</body>
</html>
@@ -61,6 +61,7 @@ struct ac_addrlib {
 	 */
 	simple_mtx_t dcc_retile_map_lock;
 	struct hash_table *dcc_retile_maps;
+	struct hash_table *dcc_retile_tile_indices;
 };
 
 struct dcc_retile_map_key {
@@ -89,6 +90,156 @@ static void dcc_retile_map_free(struct hash_entry *entry)
 	free(entry->data);
 }
 
+struct dcc_retile_tile_key {
+	enum radeon_family family;
+	unsigned bpp;
+	unsigned swizzle_mode;
+	bool rb_aligned;
+	bool pipe_aligned;
+};
+
+struct dcc_retile_tile_data {
+	unsigned tile_width_log2;
+	unsigned tile_height_log2;
+	uint16_t *data;
+};
+
+static uint32_t dcc_retile_tile_hash_key(const void *key)
+{
+	return _mesa_hash_data(key, sizeof(struct dcc_retile_tile_key));
+}
+
+static bool dcc_retile_tile_keys_equal(const void *a, const void *b)
+{
+	return memcmp(a, b, sizeof(struct dcc_retile_tile_key)) == 0;
+}
+
+static void dcc_retile_tile_free(struct hash_entry *entry)
+{
+	free((void*)entry->key);
+	free(((struct dcc_retile_tile_data*)entry->data)->data);
+	free(entry->data);
+}
+
+/* Assumes dcc_retile_map_lock is taken. */
+static const struct dcc_retile_tile_data *
+ac_compute_dcc_retile_tile_indices(struct ac_addrlib *addrlib,
+                                   const struct radeon_info *info,
+                                   unsigned bpp, unsigned swizzle_mode,
+                                   bool rb_aligned, bool pipe_aligned)
+{
+	struct dcc_retile_tile_key key = (struct dcc_retile_tile_key) {
+		.family = info->family,
+		.bpp = bpp,
+		.swizzle_mode = swizzle_mode,
+		.rb_aligned = rb_aligned,
+		.pipe_aligned = pipe_aligned
+	};
+
+	struct hash_entry *entry = _mesa_hash_table_search(addrlib->dcc_retile_tile_indices, &key);
+	if (entry)
+		return entry->data;
+
+	ADDR2_COMPUTE_DCCINFO_INPUT din = {0};
+	ADDR2_COMPUTE_DCCINFO_OUTPUT dout = {0};
+	din.size = sizeof(ADDR2_COMPUTE_DCCINFO_INPUT);
+	dout.size = sizeof(ADDR2_COMPUTE_DCCINFO_OUTPUT);
+
+	din.dccKeyFlags.pipeAligned = pipe_aligned;
+	din.dccKeyFlags.rbAligned = rb_aligned;
+	din.resourceType = ADDR_RSRC_TEX_2D;
+	din.swizzleMode = swizzle_mode;
+	din.bpp = bpp;
+	din.unalignedWidth = 1;
+	din.unalignedHeight = 1;
+	din.numSlices = 1;
+	din.numFrags = 1;
+	din.numMipLevels = 1;
+
+	ADDR_E_RETURNCODE ret = Addr2ComputeDccInfo(addrlib->handle, &din, &dout);
+	if (ret != ADDR_OK)
+		return NULL;
+
+	ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT addrin = {0};
+	addrin.size = sizeof(addrin);
+	addrin.swizzleMode = swizzle_mode;
+	addrin.resourceType = ADDR_RSRC_TEX_2D;
+	addrin.bpp = bpp;
+	addrin.numSlices = 1;
+	addrin.numMipLevels = 1;
+	addrin.numFrags = 1;
+	addrin.pitch = dout.pitch;
+	addrin.height = dout.height;
+	addrin.compressBlkWidth = dout.compressBlkWidth;
+	addrin.compressBlkHeight = dout.compressBlkHeight;
+	addrin.compressBlkDepth = dout.compressBlkDepth;
+	addrin.metaBlkWidth = dout.metaBlkWidth;
+	addrin.metaBlkHeight = dout.metaBlkHeight;
+	addrin.metaBlkDepth = dout.metaBlkDepth;
+	addrin.dccKeyFlags.pipeAligned = pipe_aligned;
+	addrin.dccKeyFlags.rbAligned = rb_aligned;
+
+	unsigned w = dout.metaBlkWidth / dout.compressBlkWidth;
+	unsigned h = dout.metaBlkHeight / dout.compressBlkHeight;
+	uint16_t *indices = malloc(w * h * sizeof (uint16_t));
+	if (!indices)
+		return NULL;
+
+	ADDR2_COMPUTE_DCC_ADDRFROMCOORD_OUTPUT addrout = {};
+	addrout.size = sizeof(addrout);
+
+	for (unsigned y = 0; y < h; ++y) {
+		addrin.y = y * dout.compressBlkHeight;
+		for (unsigned x = 0; x < w; ++x) {
+			addrin.x = x * dout.compressBlkWidth;
+			addrout.addr = 0;
+
+			if (Addr2ComputeDccAddrFromCoord(addrlib->handle, &addrin, &addrout) != ADDR_OK) {
+				free(indices);
+				return NULL;
+			}
+			indices[y * w + x] = addrout.addr;
+		}
+	}
+
+	struct dcc_retile_tile_data *data = calloc(1, sizeof(*data));
+	if (!data) {
+		free(indices);
+		return NULL;
+	}
+
+	data->tile_width_log2 = util_logbase2(w);
+	data->tile_height_log2 = util_logbase2(h);
+	data->data = indices;
+
+	struct dcc_retile_tile_key *heap_key = mem_dup(&key, sizeof(key));
+	if (!heap_key) {
+		free(data);
+		free(indices);
+		return NULL;
+	}
+
+	entry = _mesa_hash_table_insert(addrlib->dcc_retile_tile_indices, heap_key, data);
+	if (!entry) {
+		free(heap_key);
+		free(data);
+		free(indices);
+	}
+	return data;
+}
+
+static uint32_t ac_compute_retile_tile_addr(const struct dcc_retile_tile_data *tile,
+                                            unsigned stride, unsigned x, unsigned y)
+{
+	unsigned x_mask = (1u << tile->tile_width_log2) - 1;
+	unsigned y_mask = (1u << tile->tile_height_log2) - 1;
+	unsigned tile_size_log2 = tile->tile_width_log2 + tile->tile_height_log2;
+
+	unsigned base = ((y >> tile->tile_height_log2) * stride + (x >> tile->tile_width_log2)) << tile_size_log2;
+	unsigned offset_in_tile = tile->data[((y & y_mask) << tile->tile_width_log2) + (x & x_mask)];
+	return base + offset_in_tile;
+}
+
 static uint32_t *ac_compute_dcc_retile_map(struct ac_addrlib *addrlib,
                                            const struct radeon_info *info,
                                            unsigned retile_width, unsigned retile_height,
@@ -120,11 +271,17 @@ static uint32_t *ac_compute_dcc_retile_map(struct ac_addrlib *addrlib,
 		return map;
 	}
 
-	ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT addrin;
-	memcpy(&addrin, in, sizeof(*in));
-
-	ADDR2_COMPUTE_DCC_ADDRFROMCOORD_OUTPUT addrout = {};
-	addrout.size = sizeof(addrout);
+	const struct dcc_retile_tile_data *src_tile =
+		ac_compute_dcc_retile_tile_indices(addrlib, info, in->bpp,
+						   in->swizzleMode,
+						   rb_aligned, pipe_aligned);
+	const struct dcc_retile_tile_data *dst_tile =
+		ac_compute_dcc_retile_tile_indices(addrlib, info, in->bpp,
+						   in->swizzleMode, false, false);
+	if (!src_tile || !dst_tile) {
+		simple_mtx_unlock(&addrlib->dcc_retile_map_lock);
+		return NULL;
+	}
 
 	void *dcc_retile_map = malloc(dcc_retile_map_size);
 	if (!dcc_retile_map) {
@@ -133,47 +290,27 @@ static uint32_t *ac_compute_dcc_retile_map(struct ac_addrlib *addrlib,
 	}
 
 	unsigned index = 0;
+	unsigned w = DIV_ROUND_UP(retile_width, in->compressBlkWidth);
+	unsigned h = DIV_ROUND_UP(retile_height, in->compressBlkHeight);
+	unsigned src_stride = DIV_ROUND_UP(w, 1u << src_tile->tile_width_log2);
+	unsigned dst_stride = DIV_ROUND_UP(w, 1u << dst_tile->tile_width_log2);
 
-	for (unsigned y = 0; y < retile_height; y += in->compressBlkHeight) {
-		addrin.y = y;
-
-		for (unsigned x = 0; x < retile_width; x += in->compressBlkWidth) {
-			addrin.x = x;
-
-			/* Compute src DCC address */
-			addrin.dccKeyFlags.pipeAligned = pipe_aligned;
-			addrin.dccKeyFlags.rbAligned = rb_aligned;
-			addrout.addr = 0;
-
-			if (Addr2ComputeDccAddrFromCoord(addrlib->handle, &addrin, &addrout) != ADDR_OK) {
-				simple_mtx_unlock(&addrlib->dcc_retile_map_lock);
-				return NULL;
-			}
-
-			if (use_uint16)
-				((uint16_t*)dcc_retile_map)[index * 2] = addrout.addr;
-			else
-				((uint32_t*)dcc_retile_map)[index * 2] = addrout.addr;
-
-			/* Compute dst DCC address */
-			addrin.dccKeyFlags.pipeAligned = 0;
-			addrin.dccKeyFlags.rbAligned = 0;
-			addrout.addr = 0;
-
-			if (Addr2ComputeDccAddrFromCoord(addrlib->handle, &addrin, &addrout) != ADDR_OK) {
-				simple_mtx_unlock(&addrlib->dcc_retile_map_lock);
-				return NULL;
-			}
+	for (unsigned y = 0; y < h; ++y) {
+		for (unsigned x = 0; x < w; ++x) {
+			unsigned src_addr = ac_compute_retile_tile_addr(src_tile, src_stride, x, y);
+			unsigned dst_addr = ac_compute_retile_tile_addr(dst_tile, dst_stride, x, y);
 
-			if (use_uint16)
-				((uint16_t*)dcc_retile_map)[index * 2 + 1] = addrout.addr;
-			else
-				((uint32_t*)dcc_retile_map)[index * 2 + 1] = addrout.addr;
+			if (use_uint16) {
+				((uint16_t*)dcc_retile_map)[2 * index] = src_addr;
+				((uint16_t*)dcc_retile_map)[2 * index + 1] = dst_addr;
+			} else {
+				((uint32_t*)dcc_retile_map)[2 * index] = src_addr;
+				((uint32_t*)dcc_retile_map)[2 * index + 1] = dst_addr;
+			}
 
 			assert(index * 2 + 1 < dcc_retile_num_elements);
-			index++;
+			++index;
 		}
 	}
 
 	/* Fill the remaining pairs with the last one (for the compute shader). */
 	for (unsigned i = index * 2; i < dcc_retile_num_elements; i++) {
 		if (use_uint16)
@@ -276,6 +413,8 @@ struct ac_addrlib *ac_addrlib_create(const struct radeon_info *info,
 	simple_mtx_init(&addrlib->dcc_retile_map_lock, mtx_plain);
 	addrlib->dcc_retile_maps = _mesa_hash_table_create(NULL, dcc_retile_map_hash_key,
 							   dcc_retile_map_keys_equal);
+	addrlib->dcc_retile_tile_indices = _mesa_hash_table_create(NULL, dcc_retile_tile_hash_key,
+								   dcc_retile_tile_keys_equal);
 	return addrlib;
 }
 
@@ -284,6 +423,7 @@ void ac_addrlib_destroy(struct ac_addrlib *addrlib)
 	AddrDestroy(addrlib->handle);
 	simple_mtx_destroy(&addrlib->dcc_retile_map_lock);
 	_mesa_hash_table_destroy(addrlib->dcc_retile_maps, dcc_retile_map_free);
+	_mesa_hash_table_destroy(addrlib->dcc_retile_tile_indices, dcc_retile_tile_free);
 	free(addrlib);
 }
 
@@ -872,7 +1012,8 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
 
 	/* Set preferred macrotile parameters. This is usually required
 	 * for shared resources. This is for 2D tiling only. */
-	if (AddrSurfInfoIn.tileMode >= ADDR_TM_2D_TILED_THIN1 &&
+	if (!(surf->flags & RADEON_SURF_Z_OR_SBUFFER) &&
+	    AddrSurfInfoIn.tileMode >= ADDR_TM_2D_TILED_THIN1 &&
 	    surf->u.legacy.bankw && surf->u.legacy.bankh &&
 	    surf->u.legacy.mtilea && surf->u.legacy.tile_split) {
 		/* If any of these parameters are incorrect, the calculation
@@ -9753,7 +9753,10 @@ static void create_vs_exports(isel_context *ctx)
 
 	if (outinfo->export_prim_id && !(ctx->stage & hw_ngg_gs)) {
 		ctx->outputs.mask[VARYING_SLOT_PRIMITIVE_ID] |= 0x1;
-		ctx->outputs.temps[VARYING_SLOT_PRIMITIVE_ID * 4u] = get_arg(ctx, ctx->args->vs_prim_id);
+		if (ctx->stage & sw_tes)
+			ctx->outputs.temps[VARYING_SLOT_PRIMITIVE_ID * 4u] = get_arg(ctx, ctx->args->ac.tes_patch_id);
+		else
+			ctx->outputs.temps[VARYING_SLOT_PRIMITIVE_ID * 4u] = get_arg(ctx, ctx->args->vs_prim_id);
 	}
 
 	if (ctx->options->key.has_multiview_view_index) {
@@ -2486,8 +2486,10 @@ radv_flush_vertex_descriptors(struct radv_cmd_buffer *cmd_buffer,
 		uint32_t stride = cmd_buffer->state.pipeline->binding_stride[i];
 		unsigned num_records;
 
-		if (!buffer)
+		if (!buffer) {
+			memset(desc, 0, 4 * 4);
 			continue;
+		}
 
 		va = radv_buffer_get_va(buffer->bo);
 
@@ -3619,22 +3621,27 @@ void radv_CmdBindDescriptorSets(
 			assert(dyn_idx < dynamicOffsetCount);
 
 			struct radv_descriptor_range *range = set->dynamic_descriptors + j;
-			uint64_t va = range->va + pDynamicOffsets[dyn_idx];
-			dst[0] = va;
-			dst[1] = S_008F04_BASE_ADDRESS_HI(va >> 32);
-			dst[2] = no_dynamic_bounds ? 0xffffffffu : range->size;
-			dst[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
-				 S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
-				 S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
-				 S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
 
-			if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX10) {
-				dst[3] |= S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
-					  S_008F0C_OOB_SELECT(V_008F0C_OOB_SELECT_RAW) |
-					  S_008F0C_RESOURCE_LEVEL(1);
+			if (!range->va) {
+				memset(dst, 0, 4 * 4);
 			} else {
-				dst[3] |= S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
-					  S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
+				uint64_t va = range->va + pDynamicOffsets[dyn_idx];
+				dst[0] = va;
+				dst[1] = S_008F04_BASE_ADDRESS_HI(va >> 32);
+				dst[2] = no_dynamic_bounds ? 0xffffffffu : range->size;
+				dst[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
+					 S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
+					 S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
+					 S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
+
+				if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX10) {
+					dst[3] |= S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
+						  S_008F0C_OOB_SELECT(V_008F0C_OOB_SELECT_RAW) |
+						  S_008F0C_RESOURCE_LEVEL(1);
+				} else {
+					dst[3] |= S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
+						  S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
+				}
 			}
 
 			cmd_buffer->push_constant_stages |=
@@ -5517,8 +5524,16 @@ static void radv_init_color_image_metadata(struct radv_cmd_buffer *cmd_buffer,
 	if (radv_image_has_cmask(image)) {
 		uint32_t value = 0xffffffffu; /* Fully expanded mode. */
 
-		/* TODO: clarify this. */
-		if (radv_image_has_fmask(image)) {
-			/* TODO: clarify why 0xccccccccu is used. */
+		/* If CMASK isn't updated with the new layout, we should use the
+		 * fully expanded mode so that the image is read correctly if
+		 * CMASK is used (such as when transitioning to a compressed
+		 * layout).
+		 */
+		if (radv_image_has_fmask(image) &&
+		    radv_layout_can_fast_clear(image, dst_layout,
+					       dst_render_loop, dst_queue_mask)) {
 			value = 0xccccccccu;
 		}
 
@@ -6163,8 +6178,12 @@ radv_emit_streamout_begin(struct radv_cmd_buffer *cmd_buffer,
 			/* The array of counter buffers is optional. */
 			RADV_FROM_HANDLE(radv_buffer, buffer, pCounterBuffers[counter_buffer_idx]);
 			uint64_t va = radv_buffer_get_va(buffer->bo);
+			uint64_t counter_buffer_offset = 0;
 
-			va += buffer->offset + pCounterBufferOffsets[counter_buffer_idx];
+			if (pCounterBufferOffsets)
+				counter_buffer_offset = pCounterBufferOffsets[counter_buffer_idx];
+
+			va += buffer->offset + counter_buffer_offset;
 
 			/* Append */
 			radeon_emit(cs, PKT3(PKT3_STRMOUT_BUFFER_UPDATE, 4, 0));

@@ -6227,9 +6246,13 @@ gfx10_emit_streamout_begin(struct radv_cmd_buffer *cmd_buffer,
 
 		if (append) {
 			RADV_FROM_HANDLE(radv_buffer, buffer, pCounterBuffers[counter_buffer_idx]);
+			uint64_t counter_buffer_offset = 0;
+
+			if (pCounterBufferOffsets)
+				counter_buffer_offset = pCounterBufferOffsets[counter_buffer_idx];
 
 			va += radv_buffer_get_va(buffer->bo);
-			va += buffer->offset + pCounterBufferOffsets[counter_buffer_idx];
+			va += buffer->offset + counter_buffer_offset;
 
 			radv_cs_add_buffer(cmd_buffer->device->ws, cs, buffer->bo);
 		}

@@ -6292,8 +6315,12 @@ radv_emit_streamout_end(struct radv_cmd_buffer *cmd_buffer,
 			/* The array of counters buffer is optional. */
 			RADV_FROM_HANDLE(radv_buffer, buffer, pCounterBuffers[counter_buffer_idx]);
 			uint64_t va = radv_buffer_get_va(buffer->bo);
+			uint64_t counter_buffer_offset = 0;
 
-			va += buffer->offset + pCounterBufferOffsets[counter_buffer_idx];
+			if (pCounterBufferOffsets)
+				counter_buffer_offset = pCounterBufferOffsets[counter_buffer_idx];
+
+			va += buffer->offset + counter_buffer_offset;
 
 			radeon_emit(cs, PKT3(PKT3_STRMOUT_BUFFER_UPDATE, 4, 0));
 			radeon_emit(cs, STRMOUT_SELECT_BUFFER(i) |

@@ -6344,8 +6371,12 @@ gfx10_emit_streamout_end(struct radv_cmd_buffer *cmd_buffer,
 			/* The array of counters buffer is optional. */
 			RADV_FROM_HANDLE(radv_buffer, buffer, pCounterBuffers[counter_buffer_idx]);
 			uint64_t va = radv_buffer_get_va(buffer->bo);
+			uint64_t counter_buffer_offset = 0;
 
-			va += buffer->offset + pCounterBufferOffsets[counter_buffer_idx];
+			if (pCounterBufferOffsets)
+				counter_buffer_offset = pCounterBufferOffsets[counter_buffer_idx];
+
+			va += buffer->offset + counter_buffer_offset;
 
 			si_cs_emit_write_event_eop(cs,
 						   cmd_buffer->device->physical_device->rad_info.chip_class,
@@ -928,8 +928,10 @@ static void write_dynamic_buffer_descriptor(struct radv_device *device,
 	uint64_t va;
 	unsigned size;
 
-	if (!buffer)
+	if (!buffer) {
+		range->va = 0;
 		return;
+	}
 
 	va = radv_buffer_get_va(buffer->bo);
 	size = buffer_info->range;
@@ -1987,8 +1987,12 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
 		outputs[noutput].slot_name = VARYING_SLOT_PRIMITIVE_ID;
 		outputs[noutput].slot_index = 0;
 		outputs[noutput].usage_mask = 0x1;
-		outputs[noutput].values[0] =
-			ac_get_arg(&ctx->ac, ctx->args->vs_prim_id);
+		if (ctx->stage == MESA_SHADER_TESS_EVAL)
+			outputs[noutput].values[0] =
+				ac_get_arg(&ctx->ac, ctx->args->ac.tes_patch_id);
+		else
+			outputs[noutput].values[0] =
+				ac_get_arg(&ctx->ac, ctx->args->vs_prim_id);
 		for (unsigned j = 1; j < 4; j++)
 			outputs[noutput].values[j] = ctx->ac.f32_0;
 		noutput++;
@@ -1087,6 +1087,9 @@ glsl_type::record_compare(const glsl_type *b, bool match_name,
    if (this->interface_row_major != b->interface_row_major)
       return false;
 
+   if (this->packed != b->packed)
+      return false;
+
    /* From the GLSL 4.20 specification (Sec 4.2):
     *
     *     "Structures must have the same name, sequence of type names, and
@@ -250,6 +250,15 @@ search_for_use_after_instr(nir_instr *start, nir_ssa_def *def)
          return true;
       node = node->next;
    }
 
+   /* Uses in an if condition are considered to be in the block immediately
+    * preceding the if, so we also need to check the following if condition,
+    * if any.
+    */
+   nir_if *following_if = nir_block_get_following_if(start->block);
+   if (following_if && following_if->condition.is_ssa &&
+       following_if->condition.ssa == def)
+      return true;
+
    return false;
 }
 
@@ -649,6 +649,37 @@ nir_lower_io_block(nir_block *block,
                          mode == nir_var_shader_out ||
                          var->data.bindless;
 
+      if (nir_deref_instr_is_known_out_of_bounds(deref)) {
+         /* Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
+          *
+          *    In the subsections described above for array, vector, matrix and
+          *    structure accesses, any out-of-bounds access produced undefined
+          *    behavior....
+          *    Out-of-bounds reads return undefined values, which
+          *    include values from other variables of the active program or zero.
+          *    Out-of-bounds writes may be discarded or overwrite
+          *    other variables of the active program.
+          *
+          * GL_KHR_robustness and GL_ARB_robustness encourage us to return zero
+          * for reads.
+          *
+          * Otherwise get_io_offset would return out-of-bound offset which may
+          * result in out-of-bound loading/storing of inputs/outputs,
+          * that could cause issues in drivers down the line.
+          */
+         if (intrin->intrinsic != nir_intrinsic_store_deref) {
+            nir_ssa_def *zero =
+               nir_imm_zero(b, intrin->dest.ssa.num_components,
+                            intrin->dest.ssa.bit_size);
+            nir_ssa_def_rewrite_uses(&intrin->dest.ssa,
+                                     nir_src_for_ssa(zero));
+         }
+
+         nir_instr_remove(&intrin->instr);
+         progress = true;
+         continue;
+      }
+
       offset = get_io_offset(b, deref, per_vertex ? &vertex_index : NULL,
                              state->type_size, &component_offset,
                              bindless_type_size);
@@ -61,7 +61,7 @@ get_io_offset(nir_builder *b, nir_deref_instr *deref, nir_variable *var,
          unsigned size = glsl_count_attribute_slots((*p)->type, false);
          offset += size * index;
 
-         xfb_offset += index * glsl_get_component_slots((*p)->type) * 4;
+         *xfb_offset += index * glsl_get_component_slots((*p)->type) * 4;
 
          unsigned num_elements = glsl_type_is_array((*p)->type) ?
             glsl_get_aoa_size((*p)->type) : 1;
@@ -47,7 +47,27 @@ lower_tex_src_to_offset(nir_builder *b,
 
    if (nir_src_is_const(deref->arr.index) && index == NULL) {
       /* We're still building a direct index */
-      base_index += nir_src_as_uint(deref->arr.index) * array_elements;
+      unsigned index_in_array = nir_src_as_uint(deref->arr.index);
+
+      /* Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
+       *
+       *    In the subsections described above for array, vector, matrix and
+       *    structure accesses, any out-of-bounds access produced undefined
+       *    behavior.... Out-of-bounds reads return undefined values, which
+       *    include values from other variables of the active program or zero.
+       *
+       * Robustness extensions suggest to return zero on out-of-bounds
+       * accesses, however it's not applicable to the arrays of samplers,
+       * so just clamp the index.
+       *
+       * Otherwise instr->sampler_index or instr->texture_index would be out
+       * of bounds, and they are used as an index to arrays of driver state.
+       */
+      if (index_in_array < glsl_array_size(parent->type)) {
+         base_index += index_in_array * array_elements;
+      } else {
+         base_index = glsl_array_size(parent->type) - 1;
+      }
    } else {
       if (index == NULL) {
          /* We used to be direct but not anymore */
@@ -118,8 +118,11 @@ handle_constant_store(void *mem_ctx, struct var_info *info,
       info->constant_data = rzalloc_size(mem_ctx, var_size);
    }
 
-   char *dst = (char *)info->constant_data +
-               nir_deref_instr_get_const_offset(deref, size_align);
+   const unsigned offset = nir_deref_instr_get_const_offset(deref, size_align);
+   if (offset >= info->constant_data_size)
+      return;
+
+   char *dst = (char *)info->constant_data + offset;
 
    for (unsigned i = 0; i < num_components; i++) {
       if (!(writemask & (1 << i)))
@@ -608,6 +608,74 @@ vtn_add_cfg_work_item(struct vtn_builder *b,
    list_addtail(&work->link, work_list);
 }
 
+/* returns the default block */
+static void
+vtn_parse_switch(struct vtn_builder *b,
+                 struct vtn_switch *swtch,
+                 const uint32_t *branch,
+                 struct list_head *case_list)
+{
+   const uint32_t *branch_end = branch + (branch[0] >> SpvWordCountShift);
+
+   struct vtn_value *sel_val = vtn_untyped_value(b, branch[1]);
+   vtn_fail_if(!sel_val->type ||
+               sel_val->type->base_type != vtn_base_type_scalar,
+               "Selector of OpSwitch must have a type of OpTypeInt");
+
+   nir_alu_type sel_type =
+      nir_get_nir_type_for_glsl_type(sel_val->type->type);
+   vtn_fail_if(nir_alu_type_get_base_type(sel_type) != nir_type_int &&
+               nir_alu_type_get_base_type(sel_type) != nir_type_uint,
+               "Selector of OpSwitch must have a type of OpTypeInt");
+
+   struct hash_table *block_to_case = _mesa_pointer_hash_table_create(b);
+
+   bool is_default = true;
+   const unsigned bitsize = nir_alu_type_get_type_size(sel_type);
+   for (const uint32_t *w = branch + 2; w < branch_end;) {
+      uint64_t literal = 0;
+      if (!is_default) {
+         if (bitsize <= 32) {
+            literal = *(w++);
+         } else {
+            assert(bitsize == 64);
+            literal = vtn_u64_literal(w);
+            w += 2;
+         }
+      }
+      struct vtn_block *case_block = vtn_block(b, *(w++));
+
+      struct hash_entry *case_entry =
+         _mesa_hash_table_search(block_to_case, case_block);
+
+      struct vtn_case *cse;
+      if (case_entry) {
+         cse = case_entry->data;
+      } else {
+         cse = rzalloc(b, struct vtn_case);
+
+         cse->node.type = vtn_cf_node_type_case;
+         cse->node.parent = swtch ? &swtch->node : NULL;
+         cse->block = case_block;
+         list_inithead(&cse->body);
+         util_dynarray_init(&cse->values, b);
+
+         list_addtail(&cse->node.link, case_list);
+         _mesa_hash_table_insert(block_to_case, case_block, cse);
+      }
+
+      if (is_default) {
+         cse->is_default = true;
+      } else {
+         util_dynarray_append(&cse->values, uint64_t, literal);
+      }
+
+      is_default = false;
+   }
+
+   _mesa_hash_table_destroy(block_to_case, NULL);
+}
+
 /* Processes a block and returns the next block to process or NULL if we've
  * reached the end of the construct.
  */
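The vtn_parse_switch() function added in the hunk above walks the OpSwitch operand words: the first target after the selector is the default label and carries no literal, and each following case is a (literal, label) pair, where a 64-bit selector makes the literal span two 32-bit words (low word first, as vtn_u64_literal() assembles it). Below is a minimal standalone C sketch of that walk; the names are hypothetical and it assumes the caller passes only the words starting at the default label, not Mesa's real data structures.

```c
#include <stdint.h>
#include <stddef.h>

/* Assemble a 64-bit literal from two 32-bit words, low word first,
 * matching how vtn_u64_literal() reads SPIR-V operands. */
static uint64_t u64_literal(const uint32_t *w)
{
   return ((uint64_t)w[1] << 32) | w[0];
}

/* Walk the OpSwitch targets: w[0] is the default label (no literal);
 * every following case is a (literal, label) pair.  Returns the number
 * of non-default cases and writes their literals to out_literals. */
static size_t parse_switch_cases(const uint32_t *w, size_t count,
                                 unsigned bitsize, uint64_t *out_literals)
{
   size_t n = 0;
   size_t i = 1; /* skip the default label word */
   while (i < count) {
      if (bitsize <= 32) {
         out_literals[n++] = w[i++];
      } else {
         out_literals[n++] = u64_literal(&w[i]);
         i += 2;
      }
      i++; /* skip this case's label word */
   }
   return n;
}
```

The same cursor-advance pattern (one word for narrow literals, two for 64-bit) is what the `w += 2` in the real function implements.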
@@ -812,17 +880,6 @@ vtn_process_block(struct vtn_builder *b,
    }
 
    case SpvOpSwitch: {
-      struct vtn_value *sel_val = vtn_untyped_value(b, block->branch[1]);
-      vtn_fail_if(!sel_val->type ||
-                  sel_val->type->base_type != vtn_base_type_scalar,
-                  "Selector of OpSwitch must have a type of OpTypeInt");
-
-      nir_alu_type sel_type =
-         nir_get_nir_type_for_glsl_type(sel_val->type->type);
-      vtn_fail_if(nir_alu_type_get_base_type(sel_type) != nir_type_int &&
-                  nir_alu_type_get_base_type(sel_type) != nir_type_uint,
-                  "Selector of OpSwitch must have a type of OpTypeInt");
-
       struct vtn_switch *swtch = rzalloc(b, struct vtn_switch);
 
       swtch->node.type = vtn_cf_node_type_switch;
@@ -843,82 +900,39 @@ vtn_process_block(struct vtn_builder *b,
       }
 
       /* First, we go through and record all of the cases. */
-      const uint32_t *branch_end =
-         block->branch + (block->branch[0] >> SpvWordCountShift);
+      vtn_parse_switch(b, swtch, block->branch, &swtch->cases);
 
-      struct hash_table *block_to_case = _mesa_pointer_hash_table_create(b);
+      /* Gather the branch types for the switch */
+      vtn_foreach_cf_node(case_node, &swtch->cases) {
+         struct vtn_case *cse = vtn_cf_node_as_case(case_node);
 
-      bool is_default = true;
-      const unsigned bitsize = nir_alu_type_get_type_size(sel_type);
-      for (const uint32_t *w = block->branch + 2; w < branch_end;) {
-         uint64_t literal = 0;
-         if (!is_default) {
-            if (bitsize <= 32) {
-               literal = *(w++);
-            } else {
-               assert(bitsize == 64);
-               literal = vtn_u64_literal(w);
-               w += 2;
-            }
-         }
+         cse->type = vtn_handle_branch(b, &swtch->node, cse->block);
+         switch (cse->type) {
+         case vtn_branch_type_none:
+            /* This is a "real" cases which has stuff in it */
+            vtn_fail_if(cse->block->switch_case != NULL,
+                        "OpSwitch has a case which is also in another "
+                        "OpSwitch construct");
+            cse->block->switch_case = cse;
+            vtn_add_cfg_work_item(b, work_list, &cse->node,
+                                  &cse->body, cse->block);
+            break;
 
-         struct vtn_block *case_block = vtn_block(b, *(w++));
+         case vtn_branch_type_switch_break:
+         case vtn_branch_type_loop_break:
+         case vtn_branch_type_loop_continue:
+            /* Switch breaks as well as loop breaks and continues can be
+             * used to break out of a switch construct or as direct targets
+             * of the OpSwitch.
+             */
+            break;
 
-         struct hash_entry *case_entry =
-            _mesa_hash_table_search(block_to_case, case_block);
-
-         struct vtn_case *cse;
-         if (case_entry) {
-            cse = case_entry->data;
-         } else {
-            cse = rzalloc(b, struct vtn_case);
-
-            cse->node.type = vtn_cf_node_type_case;
-            cse->node.parent = &swtch->node;
-            list_inithead(&cse->body);
-            util_dynarray_init(&cse->values, b);
-
-            cse->type = vtn_handle_branch(b, &swtch->node, case_block);
-            switch (cse->type) {
-            case vtn_branch_type_none:
-               /* This is a "real" cases which has stuff in it */
-               vtn_fail_if(case_block->switch_case != NULL,
-                           "OpSwitch has a case which is also in another "
-                           "OpSwitch construct");
-               case_block->switch_case = cse;
-               vtn_add_cfg_work_item(b, work_list, &cse->node,
-                                     &cse->body, case_block);
-               break;
-
-            case vtn_branch_type_switch_break:
-            case vtn_branch_type_loop_break:
-            case vtn_branch_type_loop_continue:
-               /* Switch breaks as well as loop breaks and continues can be
-                * used to break out of a switch construct or as direct targets
-                * of the OpSwitch.
-                */
-               break;
-
-            default:
-               vtn_fail("Target of OpSwitch is not a valid structured exit "
-                        "from the switch construct.");
-            }
-
-            list_addtail(&cse->node.link, &swtch->cases);
-
-            _mesa_hash_table_insert(block_to_case, case_block, cse);
-         }
-
-         if (is_default) {
-            cse->is_default = true;
-         } else {
-            util_dynarray_append(&cse->values, uint64_t, literal);
-         }
-
-         is_default = false;
+         default:
+            vtn_fail("Target of OpSwitch is not a valid structured exit "
+                     "from the switch construct.");
+         }
       }
 
-      _mesa_hash_table_destroy(block_to_case, NULL);
-
       return swtch->break_block;
    }
 
@@ -1271,6 +1285,13 @@ vtn_emit_cf_list(struct vtn_builder *b, struct list_head *cf_list,
          vtn_foreach_cf_node(case_node, &vtn_switch->cases) {
             struct vtn_case *cse = vtn_cf_node_as_case(case_node);
 
+            /* If this case jumps directly to the break block, we don't have
+             * to handle the case as the body is empty and doesn't fall
+             * through.
+             */
+            if (cse->block == vtn_switch->break_block)
+               continue;
+
             /* Figure out the condition */
             nir_ssa_def *cond =
                vtn_switch_case_condition(b, vtn_switch, sel, cse);
@@ -185,6 +185,8 @@ struct vtn_if {
 struct vtn_case {
    struct vtn_cf_node node;
 
+   struct vtn_block *block;
+
    enum vtn_branch_type type;
    struct list_head body;
 
@@ -769,7 +769,8 @@ vl_mpeg12_end_frame(struct pipe_video_codec *decoder,
 
    vl_vb_unmap(&buf->vertex_stream, dec->context);
 
-   dec->context->transfer_unmap(dec->context, buf->tex_transfer);
+   if (buf->tex_transfer)
+      dec->context->transfer_unmap(dec->context, buf->tex_transfer);
 
    vb[0] = dec->quads;
    vb[1] = dec->pos;
@@ -982,28 +983,28 @@ init_idct(struct vl_mpeg12_decoder *dec, const struct format_config* format_conf
       nr_of_idct_render_targets = 1;
 
    formats[0] = formats[1] = formats[2] = format_config->idct_source_format;
-   assert(pipe_format_to_chroma_format(formats[0]) == dec->base.chroma_format);
    memset(&templat, 0, sizeof(templat));
    templat.width = dec->base.width / 4;
    templat.height = dec->base.height;
    dec->idct_source = vl_video_buffer_create_ex
    (
       dec->context, &templat,
-      formats, 1, 1, PIPE_USAGE_DEFAULT
+      formats, 1, 1, PIPE_USAGE_DEFAULT,
+      PIPE_VIDEO_CHROMA_FORMAT_420
    );
 
    if (!dec->idct_source)
       goto error_idct_source;
 
    formats[0] = formats[1] = formats[2] = format_config->mc_source_format;
-   assert(pipe_format_to_chroma_format(formats[0]) == dec->base.chroma_format);
    memset(&templat, 0, sizeof(templat));
    templat.width = dec->base.width / nr_of_idct_render_targets;
    templat.height = dec->base.height / 4;
    dec->mc_source = vl_video_buffer_create_ex
    (
       dec->context, &templat,
-      formats, nr_of_idct_render_targets, 1, PIPE_USAGE_DEFAULT
+      formats, nr_of_idct_render_targets, 1, PIPE_USAGE_DEFAULT,
+      PIPE_VIDEO_CHROMA_FORMAT_420
    );
 
    if (!dec->mc_source)
@@ -1054,9 +1055,10 @@ init_mc_source_widthout_idct(struct vl_mpeg12_decoder *dec, const struct format_
    dec->mc_source = vl_video_buffer_create_ex
    (
       dec->context, &templat,
-      formats, 1, 1, PIPE_USAGE_DEFAULT
+      formats, 1, 1, PIPE_USAGE_DEFAULT,
+      PIPE_VIDEO_CHROMA_FORMAT_420
    );
 
    return dec->mc_source != NULL;
 }
 
@@ -85,7 +85,8 @@ vl_video_buffer_template(struct pipe_resource *templ,
                          const struct pipe_video_buffer *tmpl,
                          enum pipe_format resource_format,
                          unsigned depth, unsigned array_size,
-                         unsigned usage, unsigned plane)
+                         unsigned usage, unsigned plane,
+                         enum pipe_video_chroma_format chroma_format)
 {
    assert(0);
 }
@@ -352,11 +352,13 @@ vl_vb_unmap(struct vl_vertex_buffer *buffer, struct pipe_context *pipe)
    assert(buffer && pipe);
 
    for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
-      pipe_buffer_unmap(pipe, buffer->ycbcr[i].transfer);
+      if (buffer->ycbcr[i].transfer)
+         pipe_buffer_unmap(pipe, buffer->ycbcr[i].transfer);
    }
 
    for (i = 0; i < VL_MAX_REF_FRAMES; ++i) {
-      pipe_buffer_unmap(pipe, buffer->mv[i].transfer);
+      if (buffer->mv[i].transfer)
+         pipe_buffer_unmap(pipe, buffer->mv[i].transfer);
    }
 }
 
@@ -169,7 +169,8 @@ vl_video_buffer_template(struct pipe_resource *templ,
                          const struct pipe_video_buffer *tmpl,
                          enum pipe_format resource_format,
                          unsigned depth, unsigned array_size,
-                         unsigned usage, unsigned plane)
+                         unsigned usage, unsigned plane,
+                         enum pipe_video_chroma_format chroma_format)
 {
    unsigned height = tmpl->height;
 
@@ -188,7 +189,7 @@ vl_video_buffer_template(struct pipe_resource *templ,
    templ->usage = usage;
 
    vl_video_buffer_adjust_size(&templ->width0, &height, plane,
-                               pipe_format_to_chroma_format(tmpl->buffer_format), false);
+                               chroma_format, false);
    templ->height0 = height;
 }
 
@@ -372,7 +373,8 @@ vl_video_buffer_create(struct pipe_context *pipe,
    result = vl_video_buffer_create_ex
    (
       pipe, &templat, resource_formats,
-      1, tmpl->interlaced ? 2 : 1, PIPE_USAGE_DEFAULT
+      1, tmpl->interlaced ? 2 : 1, PIPE_USAGE_DEFAULT,
+      pipe_format_to_chroma_format(templat.buffer_format)
    );
 
@@ -386,7 +388,8 @@ struct pipe_video_buffer *
 vl_video_buffer_create_ex(struct pipe_context *pipe,
                           const struct pipe_video_buffer *tmpl,
                           const enum pipe_format resource_formats[VL_NUM_COMPONENTS],
-                          unsigned depth, unsigned array_size, unsigned usage)
+                          unsigned depth, unsigned array_size, unsigned usage,
+                          enum pipe_video_chroma_format chroma_format)
 {
    struct pipe_resource res_tmpl;
    struct pipe_resource *resources[VL_NUM_COMPONENTS];
@@ -396,7 +399,8 @@ vl_video_buffer_create_ex(struct pipe_context *pipe,
 
    memset(resources, 0, sizeof resources);
 
-   vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[0], depth, array_size, usage, 0);
+   vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[0], depth, array_size,
+                            usage, 0, chroma_format);
    resources[0] = pipe->screen->resource_create(pipe->screen, &res_tmpl);
    if (!resources[0])
       goto error;
@@ -406,7 +410,8 @@ vl_video_buffer_create_ex(struct pipe_context *pipe,
       return vl_video_buffer_create_ex2(pipe, tmpl, resources);
    }
 
-   vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[1], depth, array_size, usage, 1);
+   vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[1], depth, array_size,
+                            usage, 1, chroma_format);
    resources[1] = pipe->screen->resource_create(pipe->screen, &res_tmpl);
    if (!resources[1])
       goto error;
@@ -414,7 +419,8 @@ vl_video_buffer_create_ex(struct pipe_context *pipe,
    if (resource_formats[2] == PIPE_FORMAT_NONE)
       return vl_video_buffer_create_ex2(pipe, tmpl, resources);
 
-   vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[2], depth, array_size, usage, 2);
+   vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[2], depth, array_size,
+                            usage, 2, chroma_format);
    resources[2] = pipe->screen->resource_create(pipe->screen, &res_tmpl);
    if (!resources[2])
       goto error;
@@ -119,7 +119,8 @@ vl_video_buffer_template(struct pipe_resource *templ,
                          const struct pipe_video_buffer *templat,
                          enum pipe_format resource_format,
                          unsigned depth, unsigned array_size,
-                         unsigned usage, unsigned plane);
+                         unsigned usage, unsigned plane,
+                         enum pipe_video_chroma_format chroma_format);
 
 /**
  * creates a video buffer, can be used as a standard implementation for pipe->create_video_buffer
@@ -135,7 +136,8 @@ struct pipe_video_buffer *
 vl_video_buffer_create_ex(struct pipe_context *pipe,
                           const struct pipe_video_buffer *templat,
                           const enum pipe_format resource_formats[VL_NUM_COMPONENTS],
-                          unsigned depth, unsigned array_size, unsigned usage);
+                          unsigned depth, unsigned array_size, unsigned usage,
+                          enum pipe_video_chroma_format chroma_format);
 
 /**
  * even more extended create function, provide the pipe_resource for each plane
@@ -68,7 +68,7 @@ struct etna_sampler_view {
    uint32_t TE_SAMPLER_SIZE;
    uint32_t TE_SAMPLER_LOG_SIZE;
    uint32_t TE_SAMPLER_ASTC0;
-   uint32_t TE_SAMPLER_LINEAR_STRIDE[VIVS_TE_SAMPLER_LINEAR_STRIDE__LEN];
+   uint32_t TE_SAMPLER_LINEAR_STRIDE; /* only LOD0 */
    struct etna_reloc TE_SAMPLER_LOD_ADDR[VIVS_TE_SAMPLER_LOD_ADDR__LEN];
    unsigned min_lod, max_lod; /* 5.5 fixp */
 
@@ -211,12 +211,11 @@ etna_create_sampler_view_state(struct pipe_context *pctx, struct pipe_resource *
    if (res->layout == ETNA_LAYOUT_LINEAR && !util_format_is_compressed(so->format)) {
       sv->TE_SAMPLER_CONFIG0 |= VIVS_TE_SAMPLER_CONFIG0_ADDRESSING_MODE(TEXTURE_ADDRESSING_MODE_LINEAR);
 
-      for (int lod = 0; lod <= res->base.last_level; ++lod)
-         sv->TE_SAMPLER_LINEAR_STRIDE[lod] = res->levels[lod].stride;
-
+      assert(res->base.last_level == 0);
+      sv->TE_SAMPLER_LINEAR_STRIDE = res->levels[0].stride;
    } else {
       sv->TE_SAMPLER_CONFIG0 |= VIVS_TE_SAMPLER_CONFIG0_ADDRESSING_MODE(TEXTURE_ADDRESSING_MODE_TILED);
-      memset(&sv->TE_SAMPLER_LINEAR_STRIDE, 0, sizeof(sv->TE_SAMPLER_LINEAR_STRIDE));
+      sv->TE_SAMPLER_LINEAR_STRIDE = 0;
    }
 
    sv->TE_SAMPLER_CONFIG1 |= COND(ext, VIVS_TE_SAMPLER_CONFIG1_FORMAT_EXT(format)) |
@@ -406,12 +405,11 @@ etna_emit_texture_state(struct etna_context *ctx)
       }
    }
    if (unlikely(dirty & (ETNA_DIRTY_SAMPLER_VIEWS))) {
-      for (int y = 0; y < VIVS_TE_SAMPLER_LINEAR_STRIDE__LEN; ++y) {
-         for (int x = 0; x < VIVS_TE_SAMPLER__LEN; ++x) {
-            if ((1 << x) & active_samplers) {
-               struct etna_sampler_view *sv = etna_sampler_view(ctx->sampler_view[x]);
-               /*02C00*/ EMIT_STATE(TE_SAMPLER_LINEAR_STRIDE(x, y), sv->TE_SAMPLER_LINEAR_STRIDE[y]);
-            }
+      /* only LOD0 is valid for this register */
+      for (int x = 0; x < VIVS_TE_SAMPLER__LEN; ++x) {
+         if ((1 << x) & active_samplers) {
+            struct etna_sampler_view *sv = etna_sampler_view(ctx->sampler_view[x]);
+            /*02C00*/ EMIT_STATE(TE_SAMPLER_LINEAR_STRIDE(0, x), sv->TE_SAMPLER_LINEAR_STRIDE);
         }
       }
    }
@@ -66,6 +66,8 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
    struct pipe_video_buffer template;
    struct pipe_resource templ;
    unsigned i, array_size;
+   enum pipe_video_chroma_format chroma_format =
+      pipe_format_to_chroma_format(tmpl->buffer_format);
 
    assert(pipe);
 
@@ -77,7 +79,8 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
    template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH);
    template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT);
 
-   vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_DEFAULT, 0);
+   vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size,
+                            PIPE_USAGE_DEFAULT, 0, chroma_format);
    if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced || !R600_UVD_ENABLE_TILING)
       templ.bind = PIPE_BIND_LINEAR;
    resources[0] = (struct r600_texture *)
@@ -86,7 +89,8 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
       goto error;
 
    if (resource_formats[1] != PIPE_FORMAT_NONE) {
-      vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_DEFAULT, 1);
+      vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size,
+                               PIPE_USAGE_DEFAULT, 1, chroma_format);
      if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced || !R600_UVD_ENABLE_TILING)
         templ.bind = PIPE_BIND_LINEAR;
      resources[1] = (struct r600_texture *)
@@ -96,7 +100,8 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
   }
 
   if (resource_formats[2] != PIPE_FORMAT_NONE) {
-      vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_DEFAULT, 2);
+      vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size,
+                               PIPE_USAGE_DEFAULT, 2, chroma_format);
      if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced || !R600_UVD_ENABLE_TILING)
         templ.bind = PIPE_BIND_LINEAR;
      resources[2] = (struct r600_texture *)
@@ -677,27 +677,26 @@ static void si_setup_nir_user_data(struct si_context *sctx, const struct pipe_gr
                          12 * sel->info.uses_grid_size;
    unsigned cs_user_data_reg = block_size_reg + 12 * program->reads_variable_block_size;
 
-   if (info->indirect) {
-      if (sel->info.uses_grid_size) {
+   if (sel->info.uses_grid_size) {
+      if (info->indirect) {
          for (unsigned i = 0; i < 3; ++i) {
            si_cp_copy_data(sctx, sctx->gfx_cs, COPY_DATA_REG, NULL, (grid_size_reg >> 2) + i,
                            COPY_DATA_SRC_MEM, si_resource(info->indirect),
                            info->indirect_offset + 4 * i);
        }
-      }
-   } else {
-      if (sel->info.uses_grid_size) {
+      } else {
         radeon_set_sh_reg_seq(cs, grid_size_reg, 3);
         radeon_emit(cs, info->grid[0]);
         radeon_emit(cs, info->grid[1]);
         radeon_emit(cs, info->grid[2]);
      }
-      if (program->reads_variable_block_size) {
-         radeon_set_sh_reg_seq(cs, block_size_reg, 3);
-         radeon_emit(cs, info->block[0]);
-         radeon_emit(cs, info->block[1]);
-         radeon_emit(cs, info->block[2]);
-      }
   }
 
+   if (program->reads_variable_block_size) {
+      radeon_set_sh_reg_seq(cs, block_size_reg, 3);
+      radeon_emit(cs, info->block[0]);
+      radeon_emit(cs, info->block[1]);
+      radeon_emit(cs, info->block[2]);
+   }
+
    if (program->num_cs_user_data_dwords) {
@@ -834,11 +834,12 @@ blorp_can_hiz_clear_depth(const struct gen_device_info *devinfo,
       const bool unaligned = (slice_x0 + x0) % 16 || (slice_y0 + y0) % 8 ||
                              (max_x1_y1 ? haligned_x1 % 16 || valigned_y1 % 8 :
                               x1 % 16 || y1 % 8);
-      const bool alignment_used = surf->levels > 1 ||
-                                  surf->logical_level0_px.depth > 1 ||
-                                  surf->logical_level0_px.array_len > 1;
+      const bool partial_clear = x0 > 0 || y0 > 0 || !max_x1_y1;
+      const bool multislice_surf = surf->levels > 1 ||
+                                   surf->logical_level0_px.depth > 1 ||
+                                   surf->logical_level0_px.array_len > 1;
 
-      if (unaligned && alignment_used)
+      if (unaligned && (partial_clear || multislice_surf))
          return false;
    }
 
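The blorp change above replaces the single `alignment_used` test with two separate conditions: an unaligned clear rectangle now disqualifies the HiZ fast clear only when the clear is partial or the surface has multiple miplevels or slices. A standalone C sketch of the corrected predicate follows; the helper name and flattened parameters are hypothetical, not Mesa code.

```c
#include <stdbool.h>

/* Returns whether a HiZ depth clear is still allowed under the corrected
 * rule: unaligned rectangles are only a problem for partial clears or for
 * surfaces with more than one level/slice. */
static bool can_hiz_clear(bool unaligned, unsigned x0, unsigned y0,
                          bool max_x1_y1, unsigned levels,
                          unsigned depth, unsigned array_len)
{
   const bool partial_clear = x0 > 0 || y0 > 0 || !max_x1_y1;
   const bool multislice_surf = levels > 1 || depth > 1 || array_len > 1;

   return !(unaligned && (partial_clear || multislice_surf));
}
```

Under the old rule an unaligned full clear of a single-slice, single-level surface was rejected whenever any mip/array alignment existed; under the new rule it is permitted.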
@@ -901,6 +901,11 @@ enum surface_logical_srcs {
    SURFACE_LOGICAL_SRC_IMM_DIMS,
    /** Per-opcode immediate argument. For atomics, this is the atomic opcode */
    SURFACE_LOGICAL_SRC_IMM_ARG,
+   /**
+    * Some instructions with side-effects should not be predicated on
+    * sample mask, e.g. lowered stores to scratch.
+    */
+   SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK,
 
    SURFACE_LOGICAL_NUM_SRCS
 };
@@ -5462,7 +5462,10 @@ lower_surface_logical_send(const fs_builder &bld, fs_inst *inst)
    const fs_reg &surface_handle = inst->src[SURFACE_LOGICAL_SRC_SURFACE_HANDLE];
    const UNUSED fs_reg &dims = inst->src[SURFACE_LOGICAL_SRC_IMM_DIMS];
    const fs_reg &arg = inst->src[SURFACE_LOGICAL_SRC_IMM_ARG];
+   const fs_reg &allow_sample_mask =
+      inst->src[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK];
    assert(arg.file == IMM);
+   assert(allow_sample_mask.file == IMM);
 
    /* We must have exactly one of surface and surface_handle */
    assert((surface.file == BAD_FILE) != (surface_handle.file == BAD_FILE));
@@ -5486,8 +5489,9 @@ lower_surface_logical_send(const fs_builder &bld, fs_inst *inst)
               surface.ud == GEN8_BTI_STATELESS_NON_COHERENT);
 
    const bool has_side_effects = inst->has_side_effects();
-   fs_reg sample_mask = has_side_effects ? sample_mask_reg(bld) :
-                        fs_reg(brw_imm_d(0xffff));
+   fs_reg sample_mask = allow_sample_mask.ud ? sample_mask_reg(bld) :
+                        fs_reg(brw_imm_d(0xffff));
 
    /* From the BDW PRM Volume 7, page 147:
     *
@@ -3767,6 +3767,7 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
       srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(surface);
       srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
       srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(1); /* num components */
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
 
       /* Read the 3 GLuint components of gl_NumWorkGroups */
       for (unsigned i = 0; i < 3; i++) {
@@ -3804,6 +3805,7 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
       srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GEN7_BTI_SLM);
       srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[0]);
       srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
 
       /* Make dest unsigned because that's what the temporary will be */
       dest.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);
@@ -3840,6 +3842,7 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
       srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GEN7_BTI_SLM);
       srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
       srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
 
       fs_reg data = get_nir_src(instr->src[0]);
       data.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);
@@ -4123,6 +4126,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
       if (instr->intrinsic == nir_intrinsic_image_load ||
           instr->intrinsic == nir_intrinsic_bindless_image_load) {
          srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(instr->num_components);
+         srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
          fs_inst *inst =
             bld.emit(SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL,
                      dest, srcs, SURFACE_LOGICAL_NUM_SRCS);
@@ -4131,6 +4135,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
                  instr->intrinsic == nir_intrinsic_bindless_image_store) {
          srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(instr->num_components);
          srcs[SURFACE_LOGICAL_SRC_DATA] = get_nir_src(instr->src[3]);
+         srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
          bld.emit(SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL,
                   fs_reg(), srcs, SURFACE_LOGICAL_NUM_SRCS);
       } else {
@@ -4153,6 +4158,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
             data = tmp;
          }
          srcs[SURFACE_LOGICAL_SRC_DATA] = data;
+         srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
 
          bld.emit(SHADER_OPCODE_TYPED_ATOMIC_LOGICAL,
                   dest, srcs, SURFACE_LOGICAL_NUM_SRCS);
@@ -4210,6 +4216,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
      srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
      srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
      srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(instr->num_components);
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
 
      fs_inst *inst =
         bld.emit(SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL,
@@ -4229,6 +4236,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
      srcs[SURFACE_LOGICAL_SRC_DATA] = get_nir_src(instr->src[2]);
      srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
      srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(instr->num_components);
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
 
      bld.emit(SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL,
               fs_reg(), srcs, SURFACE_LOGICAL_NUM_SRCS);
@@ -4643,6 +4651,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
         get_nir_ssbo_intrinsic_index(bld, instr);
      srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
      srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
 
      /* Make dest unsigned because that's what the temporary will be */
      dest.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);
@@ -4682,6 +4691,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
         get_nir_ssbo_intrinsic_index(bld, instr);
      srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[2]);
      srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
 
      fs_reg data = get_nir_src(instr->src[0]);
      data.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);
@@ -4820,6 +4830,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
 
      srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
      srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(bit_size);
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
      const fs_reg nir_addr = get_nir_src(instr->src[0]);
 
      /* Make dest unsigned because that's what the temporary will be */
@@ -4865,6 +4876,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
 
      srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
      srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(bit_size);
+      /**
+       * While this instruction has side-effects, it should not be predicated
+       * on sample mask, because otherwise fs helper invocations would
+       * load undefined values from scratch memory. And scratch memory
+       * load-stores are produced from operations without side-effects, thus
+       * they should not have different behaviour in the helper invocations.
+       */
+      srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
      const fs_reg nir_addr = get_nir_src(instr->src[1]);
 
      fs_reg data = get_nir_src(instr->src[0]);
@@ -5316,6 +5335,7 @@ fs_visitor::nir_emit_ssbo_atomic(const fs_builder &bld,
    srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
    srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
    srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(op);
+   srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
 
    fs_reg data;
    if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
@@ -5351,6 +5371,7 @@ fs_visitor::nir_emit_ssbo_atomic_float(const fs_builder &bld,
    srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
    srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
    srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(op);
+   srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
 
    fs_reg data = get_nir_src(instr->src[2]);
    if (op == BRW_AOP_FCMPWR) {
@@ -5379,6 +5400,7 @@ fs_visitor::nir_emit_shared_atomic(const fs_builder &bld,
    srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GEN7_BTI_SLM);
    srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
    srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(op);
+   srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
 
    fs_reg data;
    if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
@@ -5420,6 +5442,7 @@ fs_visitor::nir_emit_shared_atomic_float(const fs_builder &bld,
    srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GEN7_BTI_SLM);
    srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
    srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(op);
+   srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
 
    fs_reg data = get_nir_src(instr->src[1]);
    if (op == BRW_AOP_FCMPWR) {
@@ -77,6 +77,7 @@ namespace {
       case BRW_OPCODE_DO:
       case SHADER_OPCODE_UNDEF:
       case FS_OPCODE_PLACEHOLDER_HALT:
+      case FS_OPCODE_SCHEDULING_FENCE:
          return 0;
       default:
          /* Note that the following is inaccurate for virtual instructions
|
||||
b.cursor = nir_after_instr(instr);
|
||||
|
||||
nir_ssa_def *components[4];
|
||||
/* OR all the sizes for all components but the last. */
|
||||
nir_ssa_def *or_components = nir_imm_int(&b, 0);
|
||||
for (int i = 0; i < image_size->num_components; i++) {
|
||||
if (i == (image_size->num_components - 1)) {
|
||||
components[i] = nir_imax(&b, nir_channel(&b, image_size, i),
|
||||
nir_imm_int(&b, 1));
|
||||
nir_ssa_def *null_or_size[2] = {
|
||||
nir_imm_int(&b, 0),
|
||||
nir_imax(&b, nir_channel(&b, image_size, i),
|
||||
nir_imm_int(&b, 1)),
|
||||
};
|
||||
nir_ssa_def *vec2_null_or_size = nir_vec(&b, null_or_size, 2);
|
||||
|
||||
/* Using the ORed sizes select either the element 0 or 1
|
||||
* from this vec2. For NULL textures which have a size of
|
||||
* 0x0x0, we'll select the first element which is 0 and for
|
||||
* the rest MAX(depth, 1).
|
||||
*/
|
||||
components[i] =
|
||||
nir_vector_extract(&b, vec2_null_or_size,
|
||||
nir_imin(&b, or_components,
|
||||
nir_imm_int(&b, 1)));
|
||||
} else {
|
||||
components[i] = nir_channel(&b, image_size, i);
|
||||
or_components = nir_ior(&b, components[i], or_components);
|
||||
}
|
||||
}
|
||||
nir_ssa_def *image_size_replacement =
|
||||
|
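The brw_nir hunk above special-cases NULL surfaces that report a 0x0x0 size: it ORs together every size component except the last, then uses min(or, 1) to index a two-element vector {0, max(last, 1)}, so a genuinely empty image keeps an array length of 0 while everything else is clamped to at least 1. The same selection logic written in plain C (a hypothetical helper, not NIR builder code):

```c
/* Clamp the last size component to >= 1 unless every other component is
 * zero (a NULL surface), in which case 0 is preserved. */
static int clamp_last_component(const int *size, int num_components)
{
   int or_components = 0;
   for (int i = 0; i < num_components - 1; i++)
      or_components |= size[i];

   int last = size[num_components - 1];
   int null_or_size[2] = { 0, last > 1 ? last : 1 };
   int idx = or_components < 1 ? or_components : 1; /* min(or, 1) */
   return null_or_size[idx];
}
```

Branchless selection like this matters in the real pass because NIR builds SSA values, not control flow, for per-channel computation.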
@@ -203,7 +203,7 @@ libvulkan_intel = shared_library(
     idep_nir, idep_genxml, idep_vulkan_util, idep_mesautil, idep_xmlconfig,
   ],
   c_args : anv_flags,
-  link_args : ['-Wl,--build-id=sha1', ld_args_bsymbolic, ld_args_gc_sections],
+  link_args : [ld_args_build_id, ld_args_bsymbolic, ld_args_gc_sections],
   install : true,
 )
@@ -343,6 +343,7 @@ get_fb0_attachment(struct gl_context *ctx, struct gl_framebuffer *fb,
    }
 
    switch (attachment) {
+   case GL_FRONT:
    case GL_FRONT_LEFT:
       /* Front buffers can be allocated on the first use, but
        * glGetFramebufferAttachmentParameteriv must work even if that
@@ -1043,10 +1043,12 @@ copy_uniforms_to_storage(gl_constant_value *storage,
                          const unsigned offset, const unsigned components,
                          enum glsl_base_type basicType)
 {
-   if (!uni->type->is_boolean() && !uni->is_bindless) {
+   bool copy_as_uint64 = uni->is_bindless &&
+      (uni->type->is_sampler() || uni->type->is_image());
+
+   if (!uni->type->is_boolean() && !copy_as_uint64) {
       memcpy(storage, values,
              sizeof(storage[0]) * components * count * size_mul);
-   } else if (uni->is_bindless) {
+   } else if (copy_as_uint64) {
       const union gl_constant_value *src =
          (const union gl_constant_value *) values;
       GLuint64 *dst = (GLuint64 *)&storage->i;
@@ -132,7 +132,7 @@ st_convert_sampler(const struct st_context *st,
     * levels.
     */
    sampler->lod_bias = CLAMP(sampler->lod_bias, -16, 16);
-   sampler->lod_bias = floorf(sampler->lod_bias * 256) / 256;
+   sampler->lod_bias = roundf(sampler->lod_bias * 256) / 256;
 
    sampler->min_lod = MAX2(msamp->MinLod, 0.0f);
    sampler->max_lod = msamp->MaxLod;
@@ -109,7 +109,8 @@ st_server_wait_semaphore(struct gl_context *ctx,
         continue;
 
      bufObj = st_buffer_object(bufObjs[i]);
-      pipe->flush_resource(pipe, bufObj->buffer);
+      if (bufObj->buffer)
+         pipe->flush_resource(pipe, bufObj->buffer);
   }
 
   for (unsigned i = 0; i < numTextureBarriers; i++) {
@@ -117,7 +118,8 @@ st_server_wait_semaphore(struct gl_context *ctx,
         continue;
 
      texObj = st_texture_object(texObjs[i]);
-      pipe->flush_resource(pipe, texObj->pt);
+      if (texObj->pt)
+         pipe->flush_resource(pipe, texObj->pt);
   }
 }
 
@@ -141,7 +143,8 @@ st_server_signal_semaphore(struct gl_context *ctx,
         continue;
 
      bufObj = st_buffer_object(bufObjs[i]);
-      pipe->flush_resource(pipe, bufObj->buffer);
+      if (bufObj->buffer)
+         pipe->flush_resource(pipe, bufObj->buffer);
   }
 
   for (unsigned i = 0; i < numTextureBarriers; i++) {
@@ -149,7 +152,8 @@ st_server_signal_semaphore(struct gl_context *ctx,
         continue;
 
      texObj = st_texture_object(texObjs[i]);
-      pipe->flush_resource(pipe, texObj->pt);
+      if (texObj->pt)
+         pipe->flush_resource(pipe, texObj->pt);
   }
 
   /* The driver is allowed to flush during fence_server_signal, be prepared */