Compare commits


39 Commits

Author SHA1 Message Date
Eric Engestrom
0a443eb1ad VERSION: bump to release 20.1.9 2020-09-30 20:37:42 +02:00
Eric Engestrom
bc6fd91e68 docs: add release notes for 20.1.9 2020-09-30 20:33:53 +02:00
Connor Abbott
e1f6000b54 nir/lower_io_arrays: Fix xfb_offset bug
I noticed this once I started gathering xfb_info after
nir_lower_io_arrays_to_elements_no_indirect.

Fixes: b2bbd978d0 ("nir: fix lowering arrays to elements for XFB outputs")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6514>
(cherry picked from commit 5a88db682e)
2020-09-30 11:37:10 +02:00
Erik Faye-Lund
30b256c21e st/mesa: use roundf instead of floorf for lod-bias rounding
There's no good reason not to use a symmetric rounding mode here. This
fixes the following GL CTS case for me:

GTF-GL33.gtf21.GL3Tests.texture_lod_bias.texture_lod_bias_all

Fixes: 132b69c4ed ("st/mesa: round lod_bias to a multiple of 1/256")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6892>
(cherry picked from commit 7685c37bf4)
2020-09-30 11:37:10 +02:00
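
A minimal sketch of the rounding change above, not the actual st/mesa code: floorf() biases every value downward (asymmetric around zero), while roundf() snaps to the nearest 1/256 step in both directions.

 #include <math.h>

 /* Quantize a LOD bias to a multiple of 1/256 (illustration only). */
 static float quantize_lod_bias_floor(float bias) { return floorf(bias * 256.0f) / 256.0f; }
 static float quantize_lod_bias_round(float bias) { return roundf(bias * 256.0f) / 256.0f; }
 /* e.g. bias = -0.001f: the floorf variant yields -1/256, the roundf variant yields 0. */
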
Pierre-Eric Pelloux-Prayer
71b3582ec1 gallium/vl: add chroma_format arg to vl_video_buffer functions
vl_mpeg12_decoder needs to override the chroma_format value so that the
correct size is calculated (chroma_format is used by vl_video_buffer_adjust_size).

I'm not sure why the override is needed, but it is required for correct MPEG decoding.

Fixes: 24f2b0a856 ("gallium/video: remove pipe_video_buffer.chroma_format")
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6817>
(cherry picked from commit 2584d48b2c)
2020-09-29 22:11:46 +02:00
Pierre-Eric Pelloux-Prayer
fc21ef6b66 gallium/vl: do not call transfer_unmap if transfer is NULL
CC: mesa-stable
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6817>
(cherry picked from commit b121b1b8b8)
2020-09-29 22:11:42 +02:00
Eric Engestrom
d74c2e743d .pick_status.json: Update to efaea653b5 2020-09-29 22:11:28 +02:00
Eric Engestrom
0dbec6b964 .pick_status.json: Mark 89401e5867 as denominated 2020-09-28 23:04:07 +02:00
Samuel Pitoiset
db4a29d078 spirv: fix emitting switch cases that directly jump to the merge block
As shown in the valid SPIR-V below, if one switch case statement
directly jumps to the merge block, it has no branches at all and
we have to reset the fall variable. Otherwise, it creates an
unintentional fallthrough.

       OpSelectionMerge %97 None
       OpSwitch %96 %97 1 %99 2 %100
%100 = OpLabel
%102 = OpAccessChain %_ptr_StorageBuffer_v4float %86 %uint_0 %uint_37
%103 = OpLoad %v4float %102
%104 = OpBitcast %v4uint %103
%105 = OpCompositeExtract %uint %104 0
%106 = OpShiftLeftLogical %uint %105 %uint_1
       OpBranch %97
 %99 = OpLabel
       OpBranch %97
 %97 = OpLabel
%107 = OpPhi %uint %uint_4 %75 %uint_5 %99 %106 %100

This fixes serious corruption in Horizon Zero Dawn.

v2: Changed the code to skip the entire if-block instead of resetting
    the fallthrough variable.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3460
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6590>
(cherry picked from commit 57fba85da4)
2020-09-28 18:23:20 +02:00
Karol Herbst
4bff9ca691 spirv: extract switch parsing into its own function
v2 (Jason Ekstrand):
 - Construct a list of vtn_case objects

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401>
(cherry picked from commit 467b90fcc4)
2020-09-28 18:23:20 +02:00
Eric Engestrom
9dcc7d4d41 .pick_status.json: Mark 6b1a56b908 as denominated 2020-09-28 17:00:59 +02:00
Eric Engestrom
3a8ba8ecb3 .pick_status.json: Mark e98c7a6634 as denominated 2020-09-28 17:00:59 +02:00
Eric Engestrom
7e3ed26c28 .pick_status.json: Mark 802d3611dc as denominated 2020-09-28 17:00:59 +02:00
Danylo Piliaiev
79bed11bdd intel/fs: Disable sample mask predication for scratch stores
Scratch stores are lowered to instructions with side effects; however,
they should still be enabled in FS helper invocations, since they are
produced from operations which don't imply side effects.

To fix this, we move the decision of whether sample mask predication
is enabled to the point where the logical BRW instructions are created.

GLSL example of the issue:

 int tmp[1024];
 ...
 do {
   // changes to tmp
 } while (some_condition(tmp))

If `tmp` is lowered to scratch memory, `some_condition` would be
undefined if the scratch write is predicated on the sample mask, making
it possible for the while loop to become infinite and hang the GPU.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3256
Fixes: 53bfcdeecf
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 77486db867)
2020-09-28 17:00:59 +02:00
Dylan Baker
80c6955c23 meson/anv: Use variable that checks for --build-id
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6819>
(cherry picked from commit 465460943a)
2020-09-27 11:13:04 +02:00
Nanley Chery
02f2b9fa7b blorp: Ensure aligned HIZ_CCS_WT partial clears
Fixes: 5425fcf2cb ("intel/blorp: Satisfy HIZ_CCS fast-clear alignments")
Reported-by: Sagar Ghuge <sagar.ghuge@intel.com>
Tested-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6854>
(cherry picked from commit 7f3e881c6c)
2020-09-27 11:11:33 +02:00
Jason Ekstrand
083b992f9d nir/liveness: Consider if uses in nir_ssa_defs_interfere
Fixes: f86902e75d "nir: Add an SSA-based liveness analysis pass"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3428
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Yevhenii Kharchenko <yevhenii.kharchenko@globallogic.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6824>
(cherry picked from commit 0206fb3941)
2020-09-27 11:11:31 +02:00
Marek Olšák
520d023bfb radeonsi: fix indirect dispatches with variable block sizes
The block size input was uninitialized.

Fixes: 77c81164bc "radeonsi: support ARB_compute_variable_group_size"

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6782>
(cherry picked from commit 8be46d6558)
2020-09-27 11:10:03 +02:00
Christian Gmeiner
14c7f4740e etnaviv: simplify linear stride implementation
As documented in the galcore kernel driver, "only LOD0 is valid
for this register". This makes sense, as NTE's LINEAR_STRIDE is
only capable of storing one linear stride value per sampler.
This fixes linear textures in sampler slot != 0.

Fixes: 34458c1cf6 ("etnaviv: add linear sampling support")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Michael Tretter <m.tretter@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3285>
(cherry picked from commit a7e3cc7a0e)
2020-09-27 11:10:02 +02:00
Erik Faye-Lund
53356f8972 mesa: handle GL_FRONT after translating to it
Without this, we end up throwing errors on code along these lines when
rendering using single-buffering:

GLint att;
glGetIntegerv(GL_READ_BUFFER, &att);
glGetFramebufferAttachmentParameteriv(GL_READ_FRAMEBUFFER, att, ...);

This is because we internally translate GL_BACK (which is what
glGetIntegerv returned) to GL_FRONT, which we don't handle in the
Desktop GL case. So let's start handling it.

This fixes the GTF-GL33.gtf21.GL2FixedTests.buffer_color.blend_color
test for me.

Fixes: e6ca6e587e ("mesa: Handle pbuffers in desktop GL framebuffer attachment queries")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6815>
(cherry picked from commit 9e13a16c97)
2020-09-27 11:09:59 +02:00
Eric Engestrom
7590165899 .pick_status.json: Update to a3543adc26 2020-09-27 11:09:31 +02:00
Danylo Piliaiev
46762687a0 nir/lower_samplers: Clamp out-of-bounds access to array of samplers
Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:

"In the subsections described above for array, vector, matrix and
 structure accesses, any out-of-bounds access produced undefined
 behavior.... Out-of-bounds reads return undefined values, which
 include values from other variables of the active program or zero."

Robustness extensions suggest returning zero on out-of-bounds
accesses; however, that is not applicable to arrays of samplers,
so just clamp the index.

Otherwise instr->sampler_index or instr->texture_index would be out
of bounds, and they are used as indices into arrays of driver state.

E.g. this fixes a dereference like the following in nir_lower_tex.c:
 if (options->lower_tex_packing[tex->sampler_index] !=

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
(cherry picked from commit f2b17dec12)
2020-09-23 20:58:12 +02:00
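
A standalone illustration of the clamping policy described above (not the actual NIR pass; the real change is in the corresponding hunk further down this page):

 /* Keep a constant sampler-array index inside the declared bounds so
  * that instr->sampler_index never indexes driver state out of range. */
 static unsigned clamp_sampler_index(unsigned index, unsigned array_size)
 {
    return index < array_size ? index : array_size - 1;
 }
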
Danylo Piliaiev
ef29f3758e nir/large_constants: Eliminate out-of-bounds writes to large constants
Out-of-bounds writes could be eliminated per spec:

Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:

"In the subsections described above for array, vector, matrix and
 structure accesses, any out-of-bounds access produced undefined
 behavior.... Out-of-bounds writes may be discarded or overwrite
 other variables of the active program."

Fixes: 1235850522
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
(cherry picked from commit 0ba82f78a5)
2020-09-23 20:58:10 +02:00
Danylo Piliaiev
45a937e040 nir/lower_io: Eliminate oob writes and return zero for oob reads
Out-of-bounds writes could be eliminated per spec:

Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:

 "In the subsections described above for array, vector, matrix and
  structure accesses, any out-of-bounds access produced undefined
  behavior....
  Out-of-bounds writes may be discarded or overwrite
  other variables of the active program.
  Out-of-bounds reads return undefined values, which
  include values from other variables of the active program or zero."

GL_KHR_robustness and GL_ARB_robustness encourage us to return zero
for reads.

Otherwise get_io_offset would return an out-of-bounds offset, which may
result in out-of-bounds loading/storing of inputs/outputs and could
cause issues in drivers down the line.

E.g. this fixes a dereference like the following in brw_nir.c:
 int vue_slot = vue_map->varying_to_slot[intrin->const_index[0]];

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
(cherry picked from commit 66669eb529)
2020-09-23 20:58:08 +02:00
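
A standalone illustration of that policy on a plain array (not the NIR pass itself; the real change is in the nir_lower_io hunk further down this page): out-of-bounds writes are discarded and out-of-bounds reads return zero.

 static float robust_read(const float *arr, unsigned len, unsigned idx)
 {
    return idx < len ? arr[idx] : 0.0f;   /* oob read -> zero */
 }

 static void robust_write(float *arr, unsigned len, unsigned idx, float value)
 {
    if (idx < len)                        /* oob write -> discarded */
       arr[idx] = value;
 }
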
Bas Nieuwenhuizen
5fedabe34b st/mesa: Deal with empty textures/buffers in semaphore wait/signal.
The actual texture might not have been created yet.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3257
CC: mesa-stable
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6788>
(cherry picked from commit ade72e677b)
2020-09-23 20:58:06 +02:00
Lionel Landwerlin
a4f2c6face intel/compiler: fixup Gen12 workaround for array sizes
We didn't handle the case of NULL images/textures for which we should
return 0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 397ff2976b ("intel: Implement Gen12 workaround for array textures of size 1")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3522
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6729>
(cherry picked from commit cc3bf00cc2)
2020-09-23 20:58:04 +02:00
Samuel Pitoiset
077d2a8068 radv: fix transform feedback crashes if pCounterBufferOffsets is NULL
From the Vulkan 1.2.154 spec:
    "If pCounterBufferOffsets is NULL, then it is assumed the
     offsets are zero."

Fix new CTS
dEQP-VK.transform_feedback.simple.backward_dependency_no_offset_array.

CC: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6798>
(cherry picked from commit 2b99e15d0a)
2020-09-23 20:58:02 +02:00
Rhys Perry
78df8e5e38 radv,aco: fix reading primitive ID in FS after TES
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3530
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6760>
(cherry picked from commit 2228835fb5)
2020-09-23 20:58:00 +02:00
Bas Nieuwenhuizen
0f61e68ede ac/surface: Fix depth import on GFX6-GFX8.
Let's just do depth interop imports by convention between radv and
radeonsi for now. The only thing using this should be Vulkan interop
anyway.

CC: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6617>
(cherry picked from commit ecc19e9819)
2020-09-23 20:56:27 +02:00
Jason Ekstrand
11ebe27d97 intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP
It's not really unordered in the sense that it can still stall on
ordered things and we don't need a SYNC_NOP for that because it is a
SYNC_NOP.  However, it also doesn't count when computing instruction
distances.

Fixes: 18e72ee210 "intel/fs: Add FS_OPCODE_SCHEDULING_FENCE"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6781>
(cherry picked from commit f63ffc18e7)
2020-09-23 20:46:57 +02:00
Jesse Natalie
80da07288b glsl_type: Add packed to structure type comparison for hash map
Fixes: 659f333b3a "glsl: add packed for struct types"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6767>
(cherry picked from commit 9aa86eb61a)
2020-09-23 20:46:52 +02:00
Pierre-Loup A. Griffais
d99fe9f86f radv: fix vertex buffer null descriptors
Fixes: 0f1ead7b53 "radv: handle NULL vertex bindings"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6773>
(cherry picked from commit 7b4eaac6a9)
2020-09-23 20:45:18 +02:00
Pierre-Loup A. Griffais
b8534f4771 radv: fix null descriptor for dynamic buffers
Fixes: c1ef225d18 "radv: handle NULL descriptors"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6772>
(cherry picked from commit ec13622ff4)
2020-09-23 20:45:16 +02:00
Pierre-Eric Pelloux-Prayer
819be690c0 mesa: fix glUniform* when a struct contains a bindless sampler
Small example from #3271:

layout (bindless_sampler) uniform;
struct SamplerSparse {
  sampler2D tex;
  vec4 size;
  [...]
};
uniform SamplerSparse foo;

'foo' will be marked as bindless but we should only take the assign-as-GLuint64 path for 'tex'.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3271
Fixes: 990c8d15ac ("mesa: fix setting uniform variables for bindless samplers/images")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6730>
(cherry picked from commit 090fc593b4)
2020-09-23 20:41:21 +02:00
Eric Engestrom
c2c53b9e63 .pick_status.json: Update to c669db0b50 2020-09-23 20:40:51 +02:00
Rhys Perry
d226595210 radv: initialize with expanded cmask if the destination layout needs it
If radv_layout_can_fast_clear() is false, 028C70_COMPRESSION is unset when
the image is rendered to and CMASK isn't updated. This appears to cause
FMASK to be ignored and the 0th sample to always be used.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3449
Fixes: 7b21ce401f ('radv: disable FMASK compression when drawing with GENERAL layout')

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6745>
(cherry picked from commit 85cc2950a0)
2020-09-17 22:03:29 +02:00
Bas Nieuwenhuizen
c0d443656f amd/common: Cache intra-tile addresses for retile map.
However complicated DCC addressing is, it is still based on tiles.
If we have the intra-tile offsets + tile dimensions we can expand
that to the full image ourselves.

Behavior around ~1080p on a 2500U:

old:
  30-60 ms on every miss

new:
  5 ms initially (miss in the tile cache)
  <0.5 ms afterwards

The most common case is that the tile cache only contains data for
2 tiles, which for Raven/Renoir/Navi14 will be 4 KiB each, so the
size increase is fairly modest.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5865>
(cherry picked from commit a37aeb128d)
2020-09-17 22:03:29 +02:00
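
A minimal sketch of the tile-based expansion described above, assuming power-of-two tile dimensions; it mirrors the ac_compute_retile_tile_addr helper added in the ac_surface hunk further down this page. (x, y) is split into a tile index and an intra-tile coordinate, and the intra-tile DCC offset comes from the cached table.

 #include <stdint.h>

 static uint32_t retile_addr(const uint16_t *tile_offsets,
                             unsigned tile_w_log2, unsigned tile_h_log2,
                             unsigned stride_in_tiles, unsigned x, unsigned y)
 {
    unsigned in_x = x & ((1u << tile_w_log2) - 1);   /* intra-tile coordinates */
    unsigned in_y = y & ((1u << tile_h_log2) - 1);
    unsigned tile = (y >> tile_h_log2) * stride_in_tiles + (x >> tile_w_log2);
    unsigned tile_size_log2 = tile_w_log2 + tile_h_log2;
    /* base of the tile plus the cached offset within it */
    return ((uint32_t)tile << tile_size_log2) +
           tile_offsets[(in_y << tile_w_log2) + in_x];
 }
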
Eric Engestrom
ed94f8f266 .pick_status.json: Update to d74fe47101 2020-09-17 22:03:29 +02:00
Eric Engestrom
e9ec84ad66 docs/relnotes: add sha256 sums to 20.1.8 2020-09-16 19:42:34 +02:00
36 changed files with 4863 additions and 222 deletions

File diff suppressed because it is too large.

@@ -1 +1 @@
20.1.8
20.1.9


@@ -36,7 +36,7 @@ depends on the particular driver being used.
<h2>SHA256 checksum</h2>
<pre>
TBD.
df21351494f7caaec5a3ccc16f14f15512e98d2ecde178bba1d134edc899b961 mesa-20.1.8.tar.xz
</pre>

docs/relnotes/20.1.9.html (new file, 140 lines)

@@ -0,0 +1,140 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 20.1.9 Release Notes / 2020-09-30</h1>
<p>
Mesa 20.1.9 is a bug fix release which fixes bugs found since the 20.1.8 release.
</p>
<p>
Mesa 20.1.9 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 20.1.9 implements the Vulkan 1.2 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>Horizon Zero Dawn graphics corruption with radv</li>
<li>Running Amber test leads to VK_DEVICE_LOST</li>
<li>[spirv-fuzz] Shader generates a wrong image</li>
<li>anv: dEQP-VK.robustness.robustness2.* failures on gen12</li>
<li>[RADV] Problems reading primitive ID in fragment shader after tessellation</li>
<li>Substance Painter 6.1.3 black glitches on Radeon RX570</li>
<li>vkCmdCopyImage broadcasts subsample 0 of MSAA src into all subsamples of dst on RADV</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Bas Nieuwenhuizen (3):</p>
<li> amd/common: Cache intra-tile addresses for retile map.</li>
<li> ac/surface: Fix depth import on GFX6-GFX8.</li>
<li> st/mesa: Deal with empty textures/buffers in semaphore wait/signal.</li>
<p></p>
<p>Christian Gmeiner (1):</p>
<li> etnaviv: simplify linear stride implementation</li>
<p></p>
<p>Connor Abbott (1):</p>
<li> nir/lower_io_arrays: Fix xfb_offset bug</li>
<p></p>
<p>Danylo Piliaiev (4):</p>
<li> nir/lower_io: Eliminate oob writes and return zero for oob reads</li>
<li> nir/large_constants: Eliminate out-of-bounds writes to large constants</li>
<li> nir/lower_samplers: Clamp out-of-bounds access to array of samplers</li>
<li> intel/fs: Disable sample mask predication for scratch stores</li>
<p></p>
<p>Dylan Baker (1):</p>
<li> meson/anv: Use variable that checks for --build-id</li>
<p></p>
<p>Eric Engestrom (9):</p>
<li> docs/relnotes: add sha256 sums to 20.1.8</li>
<li> .pick_status.json: Update to d74fe47101995d2659b1e59495d2f77b9dc14f3d</li>
<li> .pick_status.json: Update to c669db0b503c10faf2d1c67c9340d7222b4f946e</li>
<li> .pick_status.json: Update to a3543adc2628461818cfa691a7f547af7bc6f0fb</li>
<li> .pick_status.json: Mark 802d3611dcec8102ef75fe2461340c2997af931e as denominated</li>
<li> .pick_status.json: Mark e98c7a66347a05fc166c377ab1abb77955aff775 as denominated</li>
<li> .pick_status.json: Mark 6b1a56b908e702c06f55c63b19b695a47f607456 as denominated</li>
<li> .pick_status.json: Mark 89401e58672e1251b954662f0f776a6e9bce6df8 as denominated</li>
<li> .pick_status.json: Update to efaea653b5766427701817ab06c319902a148ee9</li>
<p></p>
<p>Erik Faye-Lund (2):</p>
<li> mesa: handle GL_FRONT after translating to it</li>
<li> st/mesa: use roundf instead of floorf for lod-bias rounding</li>
<p></p>
<p>Jason Ekstrand (2):</p>
<li> intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP</li>
<li> nir/liveness: Consider if uses in nir_ssa_defs_interfere</li>
<p></p>
<p>Jesse Natalie (1):</p>
<li> glsl_type: Add packed to structure type comparison for hash map</li>
<p></p>
<p>Karol Herbst (1):</p>
<li> spirv: extract switch parsing into its own function</li>
<p></p>
<p>Lionel Landwerlin (1):</p>
<li> intel/compiler: fixup Gen12 workaround for array sizes</li>
<p></p>
<p>Marek Olšák (1):</p>
<li> radeonsi: fix indirect dispatches with variable block sizes</li>
<p></p>
<p>Nanley Chery (1):</p>
<li> blorp: Ensure aligned HIZ_CCS_WT partial clears</li>
<p></p>
<p>Pierre-Eric Pelloux-Prayer (3):</p>
<li> mesa: fix glUniform* when a struct contains a bindless sampler</li>
<li> gallium/vl: do not call transfer_unmap if transfer is NULL</li>
<li> gallium/vl: add chroma_format arg to vl_video_buffer functions</li>
<p></p>
<p>Pierre-Loup A. Griffais (2):</p>
<li> radv: fix null descriptor for dynamic buffers</li>
<li> radv: fix vertex buffer null descriptors</li>
<p></p>
<p>Rhys Perry (2):</p>
<li> radv: initialize with expanded cmask if the destination layout needs it</li>
<li> radv,aco: fix reading primitive ID in FS after TES</li>
<p></p>
<p>Samuel Pitoiset (2):</p>
<li> radv: fix transform feedback crashes if pCounterBufferOffsets is NULL</li>
<li> spirv: fix emitting switch cases that directly jump to the merge block</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>


@@ -61,6 +61,7 @@ struct ac_addrlib {
*/
simple_mtx_t dcc_retile_map_lock;
struct hash_table *dcc_retile_maps;
struct hash_table *dcc_retile_tile_indices;
};
struct dcc_retile_map_key {
@@ -89,6 +90,156 @@ static void dcc_retile_map_free(struct hash_entry *entry)
free(entry->data);
}
struct dcc_retile_tile_key {
enum radeon_family family;
unsigned bpp;
unsigned swizzle_mode;
bool rb_aligned;
bool pipe_aligned;
};
struct dcc_retile_tile_data {
unsigned tile_width_log2;
unsigned tile_height_log2;
uint16_t *data;
};
static uint32_t dcc_retile_tile_hash_key(const void *key)
{
return _mesa_hash_data(key, sizeof(struct dcc_retile_tile_key));
}
static bool dcc_retile_tile_keys_equal(const void *a, const void *b)
{
return memcmp(a, b, sizeof(struct dcc_retile_tile_key)) == 0;
}
static void dcc_retile_tile_free(struct hash_entry *entry)
{
free((void*)entry->key);
free(((struct dcc_retile_tile_data*)entry->data)->data);
free(entry->data);
}
/* Assumes dcc_retile_map_lock is taken. */
static const struct dcc_retile_tile_data *
ac_compute_dcc_retile_tile_indices(struct ac_addrlib *addrlib,
const struct radeon_info *info,
unsigned bpp, unsigned swizzle_mode,
bool rb_aligned, bool pipe_aligned)
{
struct dcc_retile_tile_key key = (struct dcc_retile_tile_key) {
.family = info->family,
.bpp = bpp,
.swizzle_mode = swizzle_mode,
.rb_aligned = rb_aligned,
.pipe_aligned = pipe_aligned
};
struct hash_entry *entry = _mesa_hash_table_search(addrlib->dcc_retile_tile_indices, &key);
if (entry)
return entry->data;
ADDR2_COMPUTE_DCCINFO_INPUT din = {0};
ADDR2_COMPUTE_DCCINFO_OUTPUT dout = {0};
din.size = sizeof(ADDR2_COMPUTE_DCCINFO_INPUT);
dout.size = sizeof(ADDR2_COMPUTE_DCCINFO_OUTPUT);
din.dccKeyFlags.pipeAligned = pipe_aligned;
din.dccKeyFlags.rbAligned = rb_aligned;
din.resourceType = ADDR_RSRC_TEX_2D;
din.swizzleMode = swizzle_mode;
din.bpp = bpp;
din.unalignedWidth = 1;
din.unalignedHeight = 1;
din.numSlices = 1;
din.numFrags = 1;
din.numMipLevels = 1;
ADDR_E_RETURNCODE ret = Addr2ComputeDccInfo(addrlib->handle, &din, &dout);
if (ret != ADDR_OK)
return NULL;
ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT addrin = {0};
addrin.size = sizeof(addrin);
addrin.swizzleMode = swizzle_mode;
addrin.resourceType = ADDR_RSRC_TEX_2D;
addrin.bpp = bpp;
addrin.numSlices = 1;
addrin.numMipLevels = 1;
addrin.numFrags = 1;
addrin.pitch = dout.pitch;
addrin.height = dout.height;
addrin.compressBlkWidth = dout.compressBlkWidth;
addrin.compressBlkHeight = dout.compressBlkHeight;
addrin.compressBlkDepth = dout.compressBlkDepth;
addrin.metaBlkWidth = dout.metaBlkWidth;
addrin.metaBlkHeight = dout.metaBlkHeight;
addrin.metaBlkDepth = dout.metaBlkDepth;
addrin.dccKeyFlags.pipeAligned = pipe_aligned;
addrin.dccKeyFlags.rbAligned = rb_aligned;
unsigned w = dout.metaBlkWidth / dout.compressBlkWidth;
unsigned h = dout.metaBlkHeight / dout.compressBlkHeight;
uint16_t *indices = malloc(w * h * sizeof (uint16_t));
if (!indices)
return NULL;
ADDR2_COMPUTE_DCC_ADDRFROMCOORD_OUTPUT addrout = {};
addrout.size = sizeof(addrout);
for (unsigned y = 0; y < h; ++y) {
addrin.y = y * dout.compressBlkHeight;
for (unsigned x = 0; x < w; ++x) {
addrin.x = x * dout.compressBlkWidth;
addrout.addr = 0;
if (Addr2ComputeDccAddrFromCoord(addrlib->handle, &addrin, &addrout) != ADDR_OK) {
free(indices);
return NULL;
}
indices[y * w + x] = addrout.addr;
}
}
struct dcc_retile_tile_data *data = calloc(1, sizeof(*data));
if (!data) {
free(indices);
return NULL;
}
data->tile_width_log2 = util_logbase2(w);
data->tile_height_log2 = util_logbase2(h);
data->data = indices;
struct dcc_retile_tile_key *heap_key = mem_dup(&key, sizeof(key));
if (!heap_key) {
free(data);
free(indices);
return NULL;
}
entry = _mesa_hash_table_insert(addrlib->dcc_retile_tile_indices, heap_key, data);
if (!entry) {
free(heap_key);
free(data);
free(indices);
}
return data;
}
static uint32_t ac_compute_retile_tile_addr(const struct dcc_retile_tile_data *tile,
unsigned stride, unsigned x, unsigned y)
{
unsigned x_mask = (1u << tile->tile_width_log2) - 1;
unsigned y_mask = (1u << tile->tile_height_log2) - 1;
unsigned tile_size_log2 = tile->tile_width_log2 + tile->tile_height_log2;
unsigned base = ((y >> tile->tile_height_log2) * stride + (x >> tile->tile_width_log2)) << tile_size_log2;
unsigned offset_in_tile = tile->data[((y & y_mask) << tile->tile_width_log2) + (x & x_mask)];
return base + offset_in_tile;
}
static uint32_t *ac_compute_dcc_retile_map(struct ac_addrlib *addrlib,
const struct radeon_info *info,
unsigned retile_width, unsigned retile_height,
@@ -120,11 +271,17 @@ static uint32_t *ac_compute_dcc_retile_map(struct ac_addrlib *addrlib,
return map;
}
ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT addrin;
memcpy(&addrin, in, sizeof(*in));
ADDR2_COMPUTE_DCC_ADDRFROMCOORD_OUTPUT addrout = {};
addrout.size = sizeof(addrout);
const struct dcc_retile_tile_data *src_tile =
ac_compute_dcc_retile_tile_indices(addrlib, info, in->bpp,
in->swizzleMode,
rb_aligned, pipe_aligned);
const struct dcc_retile_tile_data *dst_tile =
ac_compute_dcc_retile_tile_indices(addrlib, info, in->bpp,
in->swizzleMode, false, false);
if (!src_tile || !dst_tile) {
simple_mtx_unlock(&addrlib->dcc_retile_map_lock);
return NULL;
}
void *dcc_retile_map = malloc(dcc_retile_map_size);
if (!dcc_retile_map) {
@@ -133,47 +290,27 @@ static uint32_t *ac_compute_dcc_retile_map(struct ac_addrlib *addrlib,
}
unsigned index = 0;
unsigned w = DIV_ROUND_UP(retile_width, in->compressBlkWidth);
unsigned h = DIV_ROUND_UP(retile_height, in->compressBlkHeight);
unsigned src_stride = DIV_ROUND_UP(w, 1u << src_tile->tile_width_log2);
unsigned dst_stride = DIV_ROUND_UP(w, 1u << dst_tile->tile_width_log2);
for (unsigned y = 0; y < retile_height; y += in->compressBlkHeight) {
addrin.y = y;
for (unsigned y = 0; y < h; ++y) {
for (unsigned x = 0; x < w; ++x) {
unsigned src_addr = ac_compute_retile_tile_addr(src_tile, src_stride, x, y);
unsigned dst_addr = ac_compute_retile_tile_addr(dst_tile, dst_stride, x, y);
for (unsigned x = 0; x < retile_width; x += in->compressBlkWidth) {
addrin.x = x;
/* Compute src DCC address */
addrin.dccKeyFlags.pipeAligned = pipe_aligned;
addrin.dccKeyFlags.rbAligned = rb_aligned;
addrout.addr = 0;
if (Addr2ComputeDccAddrFromCoord(addrlib->handle, &addrin, &addrout) != ADDR_OK) {
simple_mtx_unlock(&addrlib->dcc_retile_map_lock);
return NULL;
if (use_uint16) {
((uint16_t*)dcc_retile_map)[2 * index] = src_addr;
((uint16_t*)dcc_retile_map)[2 * index + 1] = dst_addr;
} else {
((uint32_t*)dcc_retile_map)[2 * index] = src_addr;
((uint32_t*)dcc_retile_map)[2 * index + 1] = dst_addr;
}
if (use_uint16)
((uint16_t*)dcc_retile_map)[index * 2] = addrout.addr;
else
((uint32_t*)dcc_retile_map)[index * 2] = addrout.addr;
/* Compute dst DCC address */
addrin.dccKeyFlags.pipeAligned = 0;
addrin.dccKeyFlags.rbAligned = 0;
addrout.addr = 0;
if (Addr2ComputeDccAddrFromCoord(addrlib->handle, &addrin, &addrout) != ADDR_OK) {
simple_mtx_unlock(&addrlib->dcc_retile_map_lock);
return NULL;
}
if (use_uint16)
((uint16_t*)dcc_retile_map)[index * 2 + 1] = addrout.addr;
else
((uint32_t*)dcc_retile_map)[index * 2 + 1] = addrout.addr;
assert(index * 2 + 1 < dcc_retile_num_elements);
index++;
++index;
}
}
/* Fill the remaining pairs with the last one (for the compute shader). */
for (unsigned i = index * 2; i < dcc_retile_num_elements; i++) {
if (use_uint16)
@@ -276,6 +413,8 @@ struct ac_addrlib *ac_addrlib_create(const struct radeon_info *info,
simple_mtx_init(&addrlib->dcc_retile_map_lock, mtx_plain);
addrlib->dcc_retile_maps = _mesa_hash_table_create(NULL, dcc_retile_map_hash_key,
dcc_retile_map_keys_equal);
addrlib->dcc_retile_tile_indices = _mesa_hash_table_create(NULL, dcc_retile_tile_hash_key,
dcc_retile_tile_keys_equal);
return addrlib;
}
@@ -284,6 +423,7 @@ void ac_addrlib_destroy(struct ac_addrlib *addrlib)
AddrDestroy(addrlib->handle);
simple_mtx_destroy(&addrlib->dcc_retile_map_lock);
_mesa_hash_table_destroy(addrlib->dcc_retile_maps, dcc_retile_map_free);
_mesa_hash_table_destroy(addrlib->dcc_retile_tile_indices, dcc_retile_tile_free);
free(addrlib);
}
@@ -872,7 +1012,8 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
/* Set preferred macrotile parameters. This is usually required
* for shared resources. This is for 2D tiling only. */
if (AddrSurfInfoIn.tileMode >= ADDR_TM_2D_TILED_THIN1 &&
if (!(surf->flags & RADEON_SURF_Z_OR_SBUFFER) &&
AddrSurfInfoIn.tileMode >= ADDR_TM_2D_TILED_THIN1 &&
surf->u.legacy.bankw && surf->u.legacy.bankh &&
surf->u.legacy.mtilea && surf->u.legacy.tile_split) {
/* If any of these parameters are incorrect, the calculation


@@ -9753,7 +9753,10 @@ static void create_vs_exports(isel_context *ctx)
if (outinfo->export_prim_id && !(ctx->stage & hw_ngg_gs)) {
ctx->outputs.mask[VARYING_SLOT_PRIMITIVE_ID] |= 0x1;
ctx->outputs.temps[VARYING_SLOT_PRIMITIVE_ID * 4u] = get_arg(ctx, ctx->args->vs_prim_id);
if (ctx->stage & sw_tes)
ctx->outputs.temps[VARYING_SLOT_PRIMITIVE_ID * 4u] = get_arg(ctx, ctx->args->ac.tes_patch_id);
else
ctx->outputs.temps[VARYING_SLOT_PRIMITIVE_ID * 4u] = get_arg(ctx, ctx->args->vs_prim_id);
}
if (ctx->options->key.has_multiview_view_index) {


@@ -2486,8 +2486,10 @@ radv_flush_vertex_descriptors(struct radv_cmd_buffer *cmd_buffer,
uint32_t stride = cmd_buffer->state.pipeline->binding_stride[i];
unsigned num_records;
if (!buffer)
if (!buffer) {
memset(desc, 0, 4 * 4);
continue;
}
va = radv_buffer_get_va(buffer->bo);
@@ -3619,22 +3621,27 @@ void radv_CmdBindDescriptorSets(
assert(dyn_idx < dynamicOffsetCount);
struct radv_descriptor_range *range = set->dynamic_descriptors + j;
uint64_t va = range->va + pDynamicOffsets[dyn_idx];
dst[0] = va;
dst[1] = S_008F04_BASE_ADDRESS_HI(va >> 32);
dst[2] = no_dynamic_bounds ? 0xffffffffu : range->size;
dst[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX10) {
dst[3] |= S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
S_008F0C_OOB_SELECT(V_008F0C_OOB_SELECT_RAW) |
S_008F0C_RESOURCE_LEVEL(1);
if (!range->va) {
memset(dst, 0, 4 * 4);
} else {
dst[3] |= S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
uint64_t va = range->va + pDynamicOffsets[dyn_idx];
dst[0] = va;
dst[1] = S_008F04_BASE_ADDRESS_HI(va >> 32);
dst[2] = no_dynamic_bounds ? 0xffffffffu : range->size;
dst[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX10) {
dst[3] |= S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
S_008F0C_OOB_SELECT(V_008F0C_OOB_SELECT_RAW) |
S_008F0C_RESOURCE_LEVEL(1);
} else {
dst[3] |= S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
}
}
cmd_buffer->push_constant_stages |=
@@ -5517,8 +5524,16 @@ static void radv_init_color_image_metadata(struct radv_cmd_buffer *cmd_buffer,
if (radv_image_has_cmask(image)) {
uint32_t value = 0xffffffffu; /* Fully expanded mode. */
/* TODO: clarify this. */
if (radv_image_has_fmask(image)) {
/* TODO: clarify why 0xccccccccu is used. */
/* If CMASK isn't updated with the new layout, we should use the
* fully expanded mode so that the image is read correctly if
* CMASK is used (such as when transitioning to a compressed
* layout).
*/
if (radv_image_has_fmask(image) &&
radv_layout_can_fast_clear(image, dst_layout,
dst_render_loop, dst_queue_mask)) {
value = 0xccccccccu;
}
@@ -6163,8 +6178,12 @@ radv_emit_streamout_begin(struct radv_cmd_buffer *cmd_buffer,
/* The array of counter buffers is optional. */
RADV_FROM_HANDLE(radv_buffer, buffer, pCounterBuffers[counter_buffer_idx]);
uint64_t va = radv_buffer_get_va(buffer->bo);
uint64_t counter_buffer_offset = 0;
va += buffer->offset + pCounterBufferOffsets[counter_buffer_idx];
if (pCounterBufferOffsets)
counter_buffer_offset = pCounterBufferOffsets[counter_buffer_idx];
va += buffer->offset + counter_buffer_offset;
/* Append */
radeon_emit(cs, PKT3(PKT3_STRMOUT_BUFFER_UPDATE, 4, 0));
@@ -6227,9 +6246,13 @@ gfx10_emit_streamout_begin(struct radv_cmd_buffer *cmd_buffer,
if (append) {
RADV_FROM_HANDLE(radv_buffer, buffer, pCounterBuffers[counter_buffer_idx]);
uint64_t counter_buffer_offset = 0;
if (pCounterBufferOffsets)
counter_buffer_offset = pCounterBufferOffsets[counter_buffer_idx];
va += radv_buffer_get_va(buffer->bo);
va += buffer->offset + pCounterBufferOffsets[counter_buffer_idx];
va += buffer->offset + counter_buffer_offset;
radv_cs_add_buffer(cmd_buffer->device->ws, cs, buffer->bo);
}
@@ -6292,8 +6315,12 @@ radv_emit_streamout_end(struct radv_cmd_buffer *cmd_buffer,
/* The array of counters buffer is optional. */
RADV_FROM_HANDLE(radv_buffer, buffer, pCounterBuffers[counter_buffer_idx]);
uint64_t va = radv_buffer_get_va(buffer->bo);
uint64_t counter_buffer_offset = 0;
va += buffer->offset + pCounterBufferOffsets[counter_buffer_idx];
if (pCounterBufferOffsets)
counter_buffer_offset = pCounterBufferOffsets[counter_buffer_idx];
va += buffer->offset + counter_buffer_offset;
radeon_emit(cs, PKT3(PKT3_STRMOUT_BUFFER_UPDATE, 4, 0));
radeon_emit(cs, STRMOUT_SELECT_BUFFER(i) |
@@ -6344,8 +6371,12 @@ gfx10_emit_streamout_end(struct radv_cmd_buffer *cmd_buffer,
/* The array of counters buffer is optional. */
RADV_FROM_HANDLE(radv_buffer, buffer, pCounterBuffers[counter_buffer_idx]);
uint64_t va = radv_buffer_get_va(buffer->bo);
uint64_t counter_buffer_offset = 0;
va += buffer->offset + pCounterBufferOffsets[counter_buffer_idx];
if (pCounterBufferOffsets)
counter_buffer_offset = pCounterBufferOffsets[counter_buffer_idx];
va += buffer->offset + counter_buffer_offset;
si_cs_emit_write_event_eop(cs,
cmd_buffer->device->physical_device->rad_info.chip_class,


@@ -928,8 +928,10 @@ static void write_dynamic_buffer_descriptor(struct radv_device *device,
uint64_t va;
unsigned size;
if (!buffer)
if (!buffer) {
range->va = 0;
return;
}
va = radv_buffer_get_va(buffer->bo);
size = buffer_info->range;


@@ -1987,8 +1987,12 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
outputs[noutput].slot_name = VARYING_SLOT_PRIMITIVE_ID;
outputs[noutput].slot_index = 0;
outputs[noutput].usage_mask = 0x1;
outputs[noutput].values[0] =
ac_get_arg(&ctx->ac, ctx->args->vs_prim_id);
if (ctx->stage == MESA_SHADER_TESS_EVAL)
outputs[noutput].values[0] =
ac_get_arg(&ctx->ac, ctx->args->ac.tes_patch_id);
else
outputs[noutput].values[0] =
ac_get_arg(&ctx->ac, ctx->args->vs_prim_id);
for (unsigned j = 1; j < 4; j++)
outputs[noutput].values[j] = ctx->ac.f32_0;
noutput++;


@@ -1087,6 +1087,9 @@ glsl_type::record_compare(const glsl_type *b, bool match_name,
if (this->interface_row_major != b->interface_row_major)
return false;
if (this->packed != b->packed)
return false;
/* From the GLSL 4.20 specification (Sec 4.2):
*
* "Structures must have the same name, sequence of type names, and


@@ -250,6 +250,15 @@ search_for_use_after_instr(nir_instr *start, nir_ssa_def *def)
return true;
node = node->next;
}
/* If uses are considered to be in the block immediately preceding the if
* so we need to also check the following if condition, if any.
*/
nir_if *following_if = nir_block_get_following_if(start->block);
if (following_if && following_if->condition.is_ssa &&
following_if->condition.ssa == def)
return true;
return false;
}


@@ -649,6 +649,37 @@ nir_lower_io_block(nir_block *block,
mode == nir_var_shader_out ||
var->data.bindless;
if (nir_deref_instr_is_known_out_of_bounds(deref)) {
/* Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
*
* In the subsections described above for array, vector, matrix and
* structure accesses, any out-of-bounds access produced undefined
* behavior....
* Out-of-bounds reads return undefined values, which
* include values from other variables of the active program or zero.
* Out-of-bounds writes may be discarded or overwrite
* other variables of the active program.
*
* GL_KHR_robustness and GL_ARB_robustness encourage us to return zero
* for reads.
*
* Otherwise get_io_offset would return out-of-bound offset which may
* result in out-of-bound loading/storing of inputs/outputs,
* that could cause issues in drivers down the line.
*/
if (intrin->intrinsic != nir_intrinsic_store_deref) {
nir_ssa_def *zero =
nir_imm_zero(b, intrin->dest.ssa.num_components,
intrin->dest.ssa.bit_size);
nir_ssa_def_rewrite_uses(&intrin->dest.ssa,
nir_src_for_ssa(zero));
}
nir_instr_remove(&intrin->instr);
progress = true;
continue;
}
offset = get_io_offset(b, deref, per_vertex ? &vertex_index : NULL,
state->type_size, &component_offset,
bindless_type_size);


@@ -61,7 +61,7 @@ get_io_offset(nir_builder *b, nir_deref_instr *deref, nir_variable *var,
unsigned size = glsl_count_attribute_slots((*p)->type, false);
offset += size * index;
xfb_offset += index * glsl_get_component_slots((*p)->type) * 4;
*xfb_offset += index * glsl_get_component_slots((*p)->type) * 4;
unsigned num_elements = glsl_type_is_array((*p)->type) ?
glsl_get_aoa_size((*p)->type) : 1;


@@ -47,7 +47,27 @@ lower_tex_src_to_offset(nir_builder *b,
if (nir_src_is_const(deref->arr.index) && index == NULL) {
/* We're still building a direct index */
base_index += nir_src_as_uint(deref->arr.index) * array_elements;
unsigned index_in_array = nir_src_as_uint(deref->arr.index);
/* Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:
*
* In the subsections described above for array, vector, matrix and
* structure accesses, any out-of-bounds access produced undefined
* behavior.... Out-of-bounds reads return undefined values, which
* include values from other variables of the active program or zero.
*
* Robustness extensions suggest to return zero on out-of-bounds
* accesses, however it's not applicable to the arrays of samplers,
* so just clamp the index.
*
* Otherwise instr->sampler_index or instr->texture_index would be out
* of bounds, and they are used as an index to arrays of driver state.
*/
if (index_in_array < glsl_array_size(parent->type)) {
base_index += index_in_array * array_elements;
} else {
base_index = glsl_array_size(parent->type) - 1;
}
} else {
if (index == NULL) {
/* We used to be direct but not anymore */


@@ -118,8 +118,11 @@ handle_constant_store(void *mem_ctx, struct var_info *info,
info->constant_data = rzalloc_size(mem_ctx, var_size);
}
char *dst = (char *)info->constant_data +
nir_deref_instr_get_const_offset(deref, size_align);
const unsigned offset = nir_deref_instr_get_const_offset(deref, size_align);
if (offset >= info->constant_data_size)
return;
char *dst = (char *)info->constant_data + offset;
for (unsigned i = 0; i < num_components; i++) {
if (!(writemask & (1 << i)))


@@ -608,6 +608,74 @@ vtn_add_cfg_work_item(struct vtn_builder *b,
list_addtail(&work->link, work_list);
}
/* returns the default block */
static void
vtn_parse_switch(struct vtn_builder *b,
struct vtn_switch *swtch,
const uint32_t *branch,
struct list_head *case_list)
{
const uint32_t *branch_end = branch + (branch[0] >> SpvWordCountShift);
struct vtn_value *sel_val = vtn_untyped_value(b, branch[1]);
vtn_fail_if(!sel_val->type ||
sel_val->type->base_type != vtn_base_type_scalar,
"Selector of OpSwitch must have a type of OpTypeInt");
nir_alu_type sel_type =
nir_get_nir_type_for_glsl_type(sel_val->type->type);
vtn_fail_if(nir_alu_type_get_base_type(sel_type) != nir_type_int &&
nir_alu_type_get_base_type(sel_type) != nir_type_uint,
"Selector of OpSwitch must have a type of OpTypeInt");
struct hash_table *block_to_case = _mesa_pointer_hash_table_create(b);
bool is_default = true;
const unsigned bitsize = nir_alu_type_get_type_size(sel_type);
for (const uint32_t *w = branch + 2; w < branch_end;) {
uint64_t literal = 0;
if (!is_default) {
if (bitsize <= 32) {
literal = *(w++);
} else {
assert(bitsize == 64);
literal = vtn_u64_literal(w);
w += 2;
}
}
struct vtn_block *case_block = vtn_block(b, *(w++));
struct hash_entry *case_entry =
_mesa_hash_table_search(block_to_case, case_block);
struct vtn_case *cse;
if (case_entry) {
cse = case_entry->data;
} else {
cse = rzalloc(b, struct vtn_case);
cse->node.type = vtn_cf_node_type_case;
cse->node.parent = swtch ? &swtch->node : NULL;
cse->block = case_block;
list_inithead(&cse->body);
util_dynarray_init(&cse->values, b);
list_addtail(&cse->node.link, case_list);
_mesa_hash_table_insert(block_to_case, case_block, cse);
}
if (is_default) {
cse->is_default = true;
} else {
util_dynarray_append(&cse->values, uint64_t, literal);
}
is_default = false;
}
_mesa_hash_table_destroy(block_to_case, NULL);
}
/* Processes a block and returns the next block to process or NULL if we've
* reached the end of the construct.
*/
@@ -812,17 +880,6 @@ vtn_process_block(struct vtn_builder *b,
}
case SpvOpSwitch: {
struct vtn_value *sel_val = vtn_untyped_value(b, block->branch[1]);
vtn_fail_if(!sel_val->type ||
sel_val->type->base_type != vtn_base_type_scalar,
"Selector of OpSwitch must have a type of OpTypeInt");
nir_alu_type sel_type =
nir_get_nir_type_for_glsl_type(sel_val->type->type);
vtn_fail_if(nir_alu_type_get_base_type(sel_type) != nir_type_int &&
nir_alu_type_get_base_type(sel_type) != nir_type_uint,
"Selector of OpSwitch must have a type of OpTypeInt");
struct vtn_switch *swtch = rzalloc(b, struct vtn_switch);
swtch->node.type = vtn_cf_node_type_switch;
@@ -843,82 +900,39 @@ vtn_process_block(struct vtn_builder *b,
}
/* First, we go through and record all of the cases. */
const uint32_t *branch_end =
block->branch + (block->branch[0] >> SpvWordCountShift);
vtn_parse_switch(b, swtch, block->branch, &swtch->cases);
struct hash_table *block_to_case = _mesa_pointer_hash_table_create(b);
/* Gather the branch types for the switch */
vtn_foreach_cf_node(case_node, &swtch->cases) {
struct vtn_case *cse = vtn_cf_node_as_case(case_node);
bool is_default = true;
const unsigned bitsize = nir_alu_type_get_type_size(sel_type);
for (const uint32_t *w = block->branch + 2; w < branch_end;) {
uint64_t literal = 0;
if (!is_default) {
if (bitsize <= 32) {
literal = *(w++);
} else {
assert(bitsize == 64);
literal = vtn_u64_literal(w);
w += 2;
}
cse->type = vtn_handle_branch(b, &swtch->node, cse->block);
switch (cse->type) {
case vtn_branch_type_none:
/* This is a "real" cases which has stuff in it */
vtn_fail_if(cse->block->switch_case != NULL,
"OpSwitch has a case which is also in another "
"OpSwitch construct");
cse->block->switch_case = cse;
vtn_add_cfg_work_item(b, work_list, &cse->node,
&cse->body, cse->block);
break;
case vtn_branch_type_switch_break:
case vtn_branch_type_loop_break:
case vtn_branch_type_loop_continue:
/* Switch breaks as well as loop breaks and continues can be
* used to break out of a switch construct or as direct targets
* of the OpSwitch.
*/
break;
default:
vtn_fail("Target of OpSwitch is not a valid structured exit "
"from the switch construct.");
}
struct vtn_block *case_block = vtn_block(b, *(w++));
struct hash_entry *case_entry =
_mesa_hash_table_search(block_to_case, case_block);
struct vtn_case *cse;
if (case_entry) {
cse = case_entry->data;
} else {
cse = rzalloc(b, struct vtn_case);
cse->node.type = vtn_cf_node_type_case;
cse->node.parent = &swtch->node;
list_inithead(&cse->body);
util_dynarray_init(&cse->values, b);
cse->type = vtn_handle_branch(b, &swtch->node, case_block);
switch (cse->type) {
case vtn_branch_type_none:
/* This is a "real" cases which has stuff in it */
vtn_fail_if(case_block->switch_case != NULL,
"OpSwitch has a case which is also in another "
"OpSwitch construct");
case_block->switch_case = cse;
vtn_add_cfg_work_item(b, work_list, &cse->node,
&cse->body, case_block);
break;
case vtn_branch_type_switch_break:
case vtn_branch_type_loop_break:
case vtn_branch_type_loop_continue:
/* Switch breaks as well as loop breaks and continues can be
* used to break out of a switch construct or as direct targets
* of the OpSwitch.
*/
break;
default:
vtn_fail("Target of OpSwitch is not a valid structured exit "
"from the switch construct.");
}
list_addtail(&cse->node.link, &swtch->cases);
_mesa_hash_table_insert(block_to_case, case_block, cse);
}
if (is_default) {
cse->is_default = true;
} else {
util_dynarray_append(&cse->values, uint64_t, literal);
}
is_default = false;
}
_mesa_hash_table_destroy(block_to_case, NULL);
return swtch->break_block;
}
@@ -1271,6 +1285,13 @@ vtn_emit_cf_list(struct vtn_builder *b, struct list_head *cf_list,
vtn_foreach_cf_node(case_node, &vtn_switch->cases) {
struct vtn_case *cse = vtn_cf_node_as_case(case_node);
/* If this case jumps directly to the break block, we don't have
* to handle the case as the body is empty and doesn't fall
* through.
*/
if (cse->block == vtn_switch->break_block)
continue;
/* Figure out the condition */
nir_ssa_def *cond =
vtn_switch_case_condition(b, vtn_switch, sel, cse);


@@ -185,6 +185,8 @@ struct vtn_if {
struct vtn_case {
struct vtn_cf_node node;
struct vtn_block *block;
enum vtn_branch_type type;
struct list_head body;


@@ -769,7 +769,8 @@ vl_mpeg12_end_frame(struct pipe_video_codec *decoder,
vl_vb_unmap(&buf->vertex_stream, dec->context);
dec->context->transfer_unmap(dec->context, buf->tex_transfer);
if (buf->tex_transfer)
dec->context->transfer_unmap(dec->context, buf->tex_transfer);
vb[0] = dec->quads;
vb[1] = dec->pos;
@@ -982,28 +983,28 @@ init_idct(struct vl_mpeg12_decoder *dec, const struct format_config* format_conf
nr_of_idct_render_targets = 1;
formats[0] = formats[1] = formats[2] = format_config->idct_source_format;
assert(pipe_format_to_chroma_format(formats[0]) == dec->base.chroma_format);
memset(&templat, 0, sizeof(templat));
templat.width = dec->base.width / 4;
templat.height = dec->base.height;
dec->idct_source = vl_video_buffer_create_ex
(
dec->context, &templat,
formats, 1, 1, PIPE_USAGE_DEFAULT
formats, 1, 1, PIPE_USAGE_DEFAULT,
PIPE_VIDEO_CHROMA_FORMAT_420
);
if (!dec->idct_source)
goto error_idct_source;
formats[0] = formats[1] = formats[2] = format_config->mc_source_format;
assert(pipe_format_to_chroma_format(formats[0]) == dec->base.chroma_format);
memset(&templat, 0, sizeof(templat));
templat.width = dec->base.width / nr_of_idct_render_targets;
templat.height = dec->base.height / 4;
dec->mc_source = vl_video_buffer_create_ex
(
dec->context, &templat,
formats, nr_of_idct_render_targets, 1, PIPE_USAGE_DEFAULT
formats, nr_of_idct_render_targets, 1, PIPE_USAGE_DEFAULT,
PIPE_VIDEO_CHROMA_FORMAT_420
);
if (!dec->mc_source)
@@ -1054,9 +1055,10 @@ init_mc_source_widthout_idct(struct vl_mpeg12_decoder *dec, const struct format_
dec->mc_source = vl_video_buffer_create_ex
(
dec->context, &templat,
formats, 1, 1, PIPE_USAGE_DEFAULT
formats, 1, 1, PIPE_USAGE_DEFAULT,
PIPE_VIDEO_CHROMA_FORMAT_420
);
return dec->mc_source != NULL;
}


@@ -85,7 +85,8 @@ vl_video_buffer_template(struct pipe_resource *templ,
const struct pipe_video_buffer *tmpl,
enum pipe_format resource_format,
unsigned depth, unsigned array_size,
unsigned usage, unsigned plane)
unsigned usage, unsigned plane,
enum pipe_video_chroma_format chroma_format)
{
assert(0);
}


@@ -352,11 +352,13 @@ vl_vb_unmap(struct vl_vertex_buffer *buffer, struct pipe_context *pipe)
assert(buffer && pipe);
for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
pipe_buffer_unmap(pipe, buffer->ycbcr[i].transfer);
if (buffer->ycbcr[i].transfer)
pipe_buffer_unmap(pipe, buffer->ycbcr[i].transfer);
}
for (i = 0; i < VL_MAX_REF_FRAMES; ++i) {
pipe_buffer_unmap(pipe, buffer->mv[i].transfer);
if (buffer->mv[i].transfer)
pipe_buffer_unmap(pipe, buffer->mv[i].transfer);
}
}


@@ -169,7 +169,8 @@ vl_video_buffer_template(struct pipe_resource *templ,
const struct pipe_video_buffer *tmpl,
enum pipe_format resource_format,
unsigned depth, unsigned array_size,
unsigned usage, unsigned plane)
unsigned usage, unsigned plane,
enum pipe_video_chroma_format chroma_format)
{
unsigned height = tmpl->height;
@@ -188,7 +189,7 @@ vl_video_buffer_template(struct pipe_resource *templ,
templ->usage = usage;
vl_video_buffer_adjust_size(&templ->width0, &height, plane,
pipe_format_to_chroma_format(tmpl->buffer_format), false);
chroma_format, false);
templ->height0 = height;
}
@@ -372,7 +373,8 @@ vl_video_buffer_create(struct pipe_context *pipe,
result = vl_video_buffer_create_ex
(
pipe, &templat, resource_formats,
1, tmpl->interlaced ? 2 : 1, PIPE_USAGE_DEFAULT
1, tmpl->interlaced ? 2 : 1, PIPE_USAGE_DEFAULT,
pipe_format_to_chroma_format(templat.buffer_format)
);
@@ -386,7 +388,8 @@ struct pipe_video_buffer *
vl_video_buffer_create_ex(struct pipe_context *pipe,
const struct pipe_video_buffer *tmpl,
const enum pipe_format resource_formats[VL_NUM_COMPONENTS],
unsigned depth, unsigned array_size, unsigned usage)
unsigned depth, unsigned array_size, unsigned usage,
enum pipe_video_chroma_format chroma_format)
{
struct pipe_resource res_tmpl;
struct pipe_resource *resources[VL_NUM_COMPONENTS];
@@ -396,7 +399,8 @@ vl_video_buffer_create_ex(struct pipe_context *pipe,
memset(resources, 0, sizeof resources);
vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[0], depth, array_size, usage, 0);
vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[0], depth, array_size,
usage, 0, chroma_format);
resources[0] = pipe->screen->resource_create(pipe->screen, &res_tmpl);
if (!resources[0])
goto error;
@@ -406,7 +410,8 @@ vl_video_buffer_create_ex(struct pipe_context *pipe,
return vl_video_buffer_create_ex2(pipe, tmpl, resources);
}
vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[1], depth, array_size, usage, 1);
vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[1], depth, array_size,
usage, 1, chroma_format);
resources[1] = pipe->screen->resource_create(pipe->screen, &res_tmpl);
if (!resources[1])
goto error;
@@ -414,7 +419,8 @@ vl_video_buffer_create_ex(struct pipe_context *pipe,
if (resource_formats[2] == PIPE_FORMAT_NONE)
return vl_video_buffer_create_ex2(pipe, tmpl, resources);
vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[2], depth, array_size, usage, 2);
vl_video_buffer_template(&res_tmpl, tmpl, resource_formats[2], depth, array_size,
usage, 2, chroma_format);
resources[2] = pipe->screen->resource_create(pipe->screen, &res_tmpl);
if (!resources[2])
goto error;


@@ -119,7 +119,8 @@ vl_video_buffer_template(struct pipe_resource *templ,
const struct pipe_video_buffer *templat,
enum pipe_format resource_format,
unsigned depth, unsigned array_size,
unsigned usage, unsigned plane);
unsigned usage, unsigned plane,
enum pipe_video_chroma_format chroma_format);
/**
* creates a video buffer, can be used as a standard implementation for pipe->create_video_buffer
@@ -135,7 +136,8 @@ struct pipe_video_buffer *
vl_video_buffer_create_ex(struct pipe_context *pipe,
const struct pipe_video_buffer *templat,
const enum pipe_format resource_formats[VL_NUM_COMPONENTS],
unsigned depth, unsigned array_size, unsigned usage);
unsigned depth, unsigned array_size, unsigned usage,
enum pipe_video_chroma_format chroma_format);
/**
* even more extended create function, provide the pipe_resource for each plane


@@ -68,7 +68,7 @@ struct etna_sampler_view {
uint32_t TE_SAMPLER_SIZE;
uint32_t TE_SAMPLER_LOG_SIZE;
uint32_t TE_SAMPLER_ASTC0;
uint32_t TE_SAMPLER_LINEAR_STRIDE[VIVS_TE_SAMPLER_LINEAR_STRIDE__LEN];
uint32_t TE_SAMPLER_LINEAR_STRIDE; /* only LOD0 */
struct etna_reloc TE_SAMPLER_LOD_ADDR[VIVS_TE_SAMPLER_LOD_ADDR__LEN];
unsigned min_lod, max_lod; /* 5.5 fixp */
@@ -211,12 +211,11 @@ etna_create_sampler_view_state(struct pipe_context *pctx, struct pipe_resource *
if (res->layout == ETNA_LAYOUT_LINEAR && !util_format_is_compressed(so->format)) {
sv->TE_SAMPLER_CONFIG0 |= VIVS_TE_SAMPLER_CONFIG0_ADDRESSING_MODE(TEXTURE_ADDRESSING_MODE_LINEAR);
for (int lod = 0; lod <= res->base.last_level; ++lod)
sv->TE_SAMPLER_LINEAR_STRIDE[lod] = res->levels[lod].stride;
assert(res->base.last_level == 0);
sv->TE_SAMPLER_LINEAR_STRIDE = res->levels[0].stride;
} else {
sv->TE_SAMPLER_CONFIG0 |= VIVS_TE_SAMPLER_CONFIG0_ADDRESSING_MODE(TEXTURE_ADDRESSING_MODE_TILED);
memset(&sv->TE_SAMPLER_LINEAR_STRIDE, 0, sizeof(sv->TE_SAMPLER_LINEAR_STRIDE));
sv->TE_SAMPLER_LINEAR_STRIDE = 0;
}
sv->TE_SAMPLER_CONFIG1 |= COND(ext, VIVS_TE_SAMPLER_CONFIG1_FORMAT_EXT(format)) |
@@ -406,12 +405,11 @@ etna_emit_texture_state(struct etna_context *ctx)
}
}
if (unlikely(dirty & (ETNA_DIRTY_SAMPLER_VIEWS))) {
for (int y = 0; y < VIVS_TE_SAMPLER_LINEAR_STRIDE__LEN; ++y) {
for (int x = 0; x < VIVS_TE_SAMPLER__LEN; ++x) {
if ((1 << x) & active_samplers) {
struct etna_sampler_view *sv = etna_sampler_view(ctx->sampler_view[x]);
/*02C00*/ EMIT_STATE(TE_SAMPLER_LINEAR_STRIDE(x, y), sv->TE_SAMPLER_LINEAR_STRIDE[y]);
}
/* only LOD0 is valid for this register */
for (int x = 0; x < VIVS_TE_SAMPLER__LEN; ++x) {
if ((1 << x) & active_samplers) {
struct etna_sampler_view *sv = etna_sampler_view(ctx->sampler_view[x]);
/*02C00*/ EMIT_STATE(TE_SAMPLER_LINEAR_STRIDE(0, x), sv->TE_SAMPLER_LINEAR_STRIDE);
}
}
}


@@ -66,6 +66,8 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
struct pipe_video_buffer template;
struct pipe_resource templ;
unsigned i, array_size;
enum pipe_video_chroma_format chroma_format =
pipe_format_to_chroma_format(tmpl->buffer_format);
assert(pipe);
@@ -77,7 +79,8 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH);
template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT);
vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_DEFAULT, 0);
vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size,
PIPE_USAGE_DEFAULT, 0, chroma_format);
if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced || !R600_UVD_ENABLE_TILING)
templ.bind = PIPE_BIND_LINEAR;
resources[0] = (struct r600_texture *)
@@ -86,7 +89,8 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
goto error;
if (resource_formats[1] != PIPE_FORMAT_NONE) {
vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_DEFAULT, 1);
vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size,
PIPE_USAGE_DEFAULT, 1, chroma_format);
if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced || !R600_UVD_ENABLE_TILING)
templ.bind = PIPE_BIND_LINEAR;
resources[1] = (struct r600_texture *)
@@ -96,7 +100,8 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
}
if (resource_formats[2] != PIPE_FORMAT_NONE) {
vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_DEFAULT, 2);
vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size,
PIPE_USAGE_DEFAULT, 2, chroma_format);
if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced || !R600_UVD_ENABLE_TILING)
templ.bind = PIPE_BIND_LINEAR;
resources[2] = (struct r600_texture *)


@@ -677,27 +677,26 @@ static void si_setup_nir_user_data(struct si_context *sctx, const struct pipe_gr
12 * sel->info.uses_grid_size;
unsigned cs_user_data_reg = block_size_reg + 12 * program->reads_variable_block_size;
if (info->indirect) {
if (sel->info.uses_grid_size) {
if (sel->info.uses_grid_size) {
if (info->indirect) {
for (unsigned i = 0; i < 3; ++i) {
si_cp_copy_data(sctx, sctx->gfx_cs, COPY_DATA_REG, NULL, (grid_size_reg >> 2) + i,
COPY_DATA_SRC_MEM, si_resource(info->indirect),
info->indirect_offset + 4 * i);
}
}
} else {
if (sel->info.uses_grid_size) {
} else {
radeon_set_sh_reg_seq(cs, grid_size_reg, 3);
radeon_emit(cs, info->grid[0]);
radeon_emit(cs, info->grid[1]);
radeon_emit(cs, info->grid[2]);
}
if (program->reads_variable_block_size) {
radeon_set_sh_reg_seq(cs, block_size_reg, 3);
radeon_emit(cs, info->block[0]);
radeon_emit(cs, info->block[1]);
radeon_emit(cs, info->block[2]);
}
}
if (program->reads_variable_block_size) {
radeon_set_sh_reg_seq(cs, block_size_reg, 3);
radeon_emit(cs, info->block[0]);
radeon_emit(cs, info->block[1]);
radeon_emit(cs, info->block[2]);
}
if (program->num_cs_user_data_dwords) {


@@ -834,11 +834,12 @@ blorp_can_hiz_clear_depth(const struct gen_device_info *devinfo,
const bool unaligned = (slice_x0 + x0) % 16 || (slice_y0 + y0) % 8 ||
(max_x1_y1 ? haligned_x1 % 16 || valigned_y1 % 8 :
x1 % 16 || y1 % 8);
const bool alignment_used = surf->levels > 1 ||
surf->logical_level0_px.depth > 1 ||
surf->logical_level0_px.array_len > 1;
const bool partial_clear = x0 > 0 || y0 > 0 || !max_x1_y1;
const bool multislice_surf = surf->levels > 1 ||
surf->logical_level0_px.depth > 1 ||
surf->logical_level0_px.array_len > 1;
if (unaligned && alignment_used)
if (unaligned && (partial_clear || multislice_surf))
return false;
}


@@ -901,6 +901,11 @@ enum surface_logical_srcs {
SURFACE_LOGICAL_SRC_IMM_DIMS,
/** Per-opcode immediate argument. For atomics, this is the atomic opcode */
SURFACE_LOGICAL_SRC_IMM_ARG,
/**
* Some instructions with side-effects should not be predicated on
* sample mask, e.g. lowered stores to scratch.
*/
SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK,
SURFACE_LOGICAL_NUM_SRCS
};
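The hunks below thread this new source through every surface message. A compressed sketch of the convention they establish (hypothetical helper, not part of the patch): reads pass 0, writes and atomics pass 1, and scratch accesses always pass 0 so helper invocations keep reading defined values.

#include <stdbool.h>

/* Hypothetical summary of the ALLOW_SAMPLE_MASK immediates assigned below:
 * reads -> 0, writes/atomics -> 1, scratch access -> always 0. */
static unsigned allow_sample_mask_for(bool is_write_or_atomic, bool is_scratch)
{
   return (is_write_or_atomic && !is_scratch) ? 1u : 0u;
}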


@@ -5462,7 +5462,10 @@ lower_surface_logical_send(const fs_builder &bld, fs_inst *inst)
const fs_reg &surface_handle = inst->src[SURFACE_LOGICAL_SRC_SURFACE_HANDLE];
const UNUSED fs_reg &dims = inst->src[SURFACE_LOGICAL_SRC_IMM_DIMS];
const fs_reg &arg = inst->src[SURFACE_LOGICAL_SRC_IMM_ARG];
const fs_reg &allow_sample_mask =
inst->src[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK];
assert(arg.file == IMM);
assert(allow_sample_mask.file == IMM);
/* We must have exactly one of surface and surface_handle */
assert((surface.file == BAD_FILE) != (surface_handle.file == BAD_FILE));
@@ -5486,8 +5489,9 @@ lower_surface_logical_send(const fs_builder &bld, fs_inst *inst)
surface.ud == GEN8_BTI_STATELESS_NON_COHERENT);
const bool has_side_effects = inst->has_side_effects();
fs_reg sample_mask = has_side_effects ? sample_mask_reg(bld) :
fs_reg(brw_imm_d(0xffff));
fs_reg sample_mask = allow_sample_mask.ud ? sample_mask_reg(bld) :
fs_reg(brw_imm_d(0xffff));
/* From the BDW PRM Volume 7, page 147:
*


@@ -3767,6 +3767,7 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(surface);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(1); /* num components */
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
/* Read the 3 GLuint components of gl_NumWorkGroups */
for (unsigned i = 0; i < 3; i++) {
@@ -3804,6 +3805,7 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GEN7_BTI_SLM);
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[0]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
/* Make dest unsigned because that's what the temporary will be */
dest.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);
@@ -3840,6 +3842,7 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GEN7_BTI_SLM);
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
fs_reg data = get_nir_src(instr->src[0]);
data.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);
@@ -4123,6 +4126,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
if (instr->intrinsic == nir_intrinsic_image_load ||
instr->intrinsic == nir_intrinsic_bindless_image_load) {
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(instr->num_components);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
fs_inst *inst =
bld.emit(SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL,
dest, srcs, SURFACE_LOGICAL_NUM_SRCS);
@@ -4131,6 +4135,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
instr->intrinsic == nir_intrinsic_bindless_image_store) {
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(instr->num_components);
srcs[SURFACE_LOGICAL_SRC_DATA] = get_nir_src(instr->src[3]);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
bld.emit(SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL,
fs_reg(), srcs, SURFACE_LOGICAL_NUM_SRCS);
} else {
@@ -4153,6 +4158,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
data = tmp;
}
srcs[SURFACE_LOGICAL_SRC_DATA] = data;
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
bld.emit(SHADER_OPCODE_TYPED_ATOMIC_LOGICAL,
dest, srcs, SURFACE_LOGICAL_NUM_SRCS);
@@ -4210,6 +4216,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(instr->num_components);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
fs_inst *inst =
bld.emit(SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL,
@@ -4229,6 +4236,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
srcs[SURFACE_LOGICAL_SRC_DATA] = get_nir_src(instr->src[2]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(instr->num_components);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
bld.emit(SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL,
fs_reg(), srcs, SURFACE_LOGICAL_NUM_SRCS);
@@ -4643,6 +4651,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
get_nir_ssbo_intrinsic_index(bld, instr);
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
/* Make dest unsigned because that's what the temporary will be */
dest.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);
@@ -4682,6 +4691,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
get_nir_ssbo_intrinsic_index(bld, instr);
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[2]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
fs_reg data = get_nir_src(instr->src[0]);
data.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);
@@ -4820,6 +4830,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(bit_size);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
const fs_reg nir_addr = get_nir_src(instr->src[0]);
/* Make dest unsigned because that's what the temporary will be */
@@ -4865,6 +4876,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(bit_size);
/**
* While this instruction has side-effects, it should not be predicated
* on sample mask, because otherwise fs helper invocations would
* load undefined values from scratch memory. And scratch memory
* load-stores are produced from operations without side-effects, thus
* they should not have different behaviour in the helper invocations.
*/
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
const fs_reg nir_addr = get_nir_src(instr->src[1]);
fs_reg data = get_nir_src(instr->src[0]);
@@ -5316,6 +5335,7 @@ fs_visitor::nir_emit_ssbo_atomic(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(op);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
fs_reg data;
if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
@@ -5351,6 +5371,7 @@ fs_visitor::nir_emit_ssbo_atomic_float(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(op);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
fs_reg data = get_nir_src(instr->src[2]);
if (op == BRW_AOP_FCMPWR) {
@@ -5379,6 +5400,7 @@ fs_visitor::nir_emit_shared_atomic(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GEN7_BTI_SLM);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(op);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
fs_reg data;
if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
@@ -5420,6 +5442,7 @@ fs_visitor::nir_emit_shared_atomic_float(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GEN7_BTI_SLM);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_IMM_ARG] = brw_imm_ud(op);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
fs_reg data = get_nir_src(instr->src[1]);
if (op == BRW_AOP_FCMPWR) {


@@ -77,6 +77,7 @@ namespace {
case BRW_OPCODE_DO:
case SHADER_OPCODE_UNDEF:
case FS_OPCODE_PLACEHOLDER_HALT:
case FS_OPCODE_SCHEDULING_FENCE:
return 0;
default:
/* Note that the following is inaccurate for virtual instructions


@@ -107,12 +107,29 @@ brw_nir_clamp_image_1d_2d_array_sizes(nir_shader *shader)
b.cursor = nir_after_instr(instr);
nir_ssa_def *components[4];
/* OR all the sizes for all components but the last. */
nir_ssa_def *or_components = nir_imm_int(&b, 0);
for (int i = 0; i < image_size->num_components; i++) {
if (i == (image_size->num_components - 1)) {
components[i] = nir_imax(&b, nir_channel(&b, image_size, i),
nir_imm_int(&b, 1));
nir_ssa_def *null_or_size[2] = {
nir_imm_int(&b, 0),
nir_imax(&b, nir_channel(&b, image_size, i),
nir_imm_int(&b, 1)),
};
nir_ssa_def *vec2_null_or_size = nir_vec(&b, null_or_size, 2);
/* Using the ORed sizes select either the element 0 or 1
* from this vec2. For NULL textures which have a size of
* 0x0x0, we'll select the first element which is 0 and for
* the rest MAX(depth, 1).
*/
components[i] =
nir_vector_extract(&b, vec2_null_or_size,
nir_imin(&b, or_components,
nir_imm_int(&b, 1)));
} else {
components[i] = nir_channel(&b, image_size, i);
or_components = nir_ior(&b, components[i], or_components);
}
}
nir_ssa_def *image_size_replacement =

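A standalone scalar model of what the NIR sequence above computes (hypothetical helper, plain C instead of builder calls): the last component is forced to 0 only when the OR of the other components is 0, which is the case for NULL surfaces reporting a 0x0x0 size.

#include <stdio.h>

/* Hypothetical scalar model: what the last (depth/array) component of the
 * reported image size becomes after the lowering above. */
static int clamp_last_component(int width, int height, int depth)
{
   int or_components = width | height;              /* OR of all but the last */
   int options[2] = { 0, depth > 1 ? depth : 1 };   /* the vec2 in the NIR code */
   int idx = or_components > 1 ? 1 : or_components; /* imin(or_components, 1) */
   return options[idx];
}

int main(void)
{
   printf("%d\n", clamp_last_component(0, 0, 0));   /* NULL surface    -> 0 */
   printf("%d\n", clamp_last_component(64, 64, 1)); /* 2D array, len 1 -> 1 */
   return 0;
}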

@@ -203,7 +203,7 @@ libvulkan_intel = shared_library(
idep_nir, idep_genxml, idep_vulkan_util, idep_mesautil, idep_xmlconfig,
],
c_args : anv_flags,
link_args : ['-Wl,--build-id=sha1', ld_args_bsymbolic, ld_args_gc_sections],
link_args : [ld_args_build_id, ld_args_bsymbolic, ld_args_gc_sections],
install : true,
)


@@ -343,6 +343,7 @@ get_fb0_attachment(struct gl_context *ctx, struct gl_framebuffer *fb,
}
switch (attachment) {
case GL_FRONT:
case GL_FRONT_LEFT:
/* Front buffers can be allocated on the first use, but
* glGetFramebufferAttachmentParameteriv must work even if that


@@ -1043,10 +1043,12 @@ copy_uniforms_to_storage(gl_constant_value *storage,
const unsigned offset, const unsigned components,
enum glsl_base_type basicType)
{
if (!uni->type->is_boolean() && !uni->is_bindless) {
bool copy_as_uint64 = uni->is_bindless &&
(uni->type->is_sampler() || uni->type->is_image());
if (!uni->type->is_boolean() && !copy_as_uint64) {
memcpy(storage, values,
sizeof(storage[0]) * components * count * size_mul);
} else if (uni->is_bindless) {
} else if (copy_as_uint64) {
const union gl_constant_value *src =
(const union gl_constant_value *) values;
GLuint64 *dst = (GLuint64 *)&storage->i;


@@ -132,7 +132,7 @@ st_convert_sampler(const struct st_context *st,
* levels.
*/
sampler->lod_bias = CLAMP(sampler->lod_bias, -16, 16);
sampler->lod_bias = floorf(sampler->lod_bias * 256) / 256;
sampler->lod_bias = roundf(sampler->lod_bias * 256) / 256;
sampler->min_lod = MAX2(msamp->MinLod, 0.0f);
sampler->max_lod = msamp->MaxLod;
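A small standalone illustration of why the symmetric rounding matters once the bias is quantized to 1/256 steps (example values, not taken from the patch):

#include <math.h>
#include <stdio.h>

int main(void)
{
   float bias = -0.0005f; /* an application-supplied LOD bias just below zero */

   /* Old behaviour: floorf rounds toward -inf, so a tiny negative bias
    * snaps to -1/256 while a tiny positive bias snaps to 0. */
   float old_q = floorf(bias * 256) / 256;

   /* New behaviour: roundf is symmetric around zero, so tiny biases of
    * either sign snap to 0. */
   float new_q = roundf(bias * 256) / 256;

   printf("floorf: %.8f  roundf: %.8f\n", old_q, new_q);
   return 0;
}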


@@ -109,7 +109,8 @@ st_server_wait_semaphore(struct gl_context *ctx,
continue;
bufObj = st_buffer_object(bufObjs[i]);
pipe->flush_resource(pipe, bufObj->buffer);
if (bufObj->buffer)
pipe->flush_resource(pipe, bufObj->buffer);
}
for (unsigned i = 0; i < numTextureBarriers; i++) {
@@ -117,7 +118,8 @@ st_server_wait_semaphore(struct gl_context *ctx,
continue;
texObj = st_texture_object(texObjs[i]);
pipe->flush_resource(pipe, texObj->pt);
if (texObj->pt)
pipe->flush_resource(pipe, texObj->pt);
}
}
@@ -141,7 +143,8 @@ st_server_signal_semaphore(struct gl_context *ctx,
continue;
bufObj = st_buffer_object(bufObjs[i]);
pipe->flush_resource(pipe, bufObj->buffer);
if (bufObj->buffer)
pipe->flush_resource(pipe, bufObj->buffer);
}
for (unsigned i = 0; i < numTextureBarriers; i++) {
@@ -149,7 +152,8 @@ st_server_signal_semaphore(struct gl_context *ctx,
continue;
texObj = st_texture_object(texObjs[i]);
pipe->flush_resource(pipe, texObj->pt);
if (texObj->pt)
pipe->flush_resource(pipe, texObj->pt);
}
/* The driver is allowed to flush during fence_server_signal, be prepared */