Compare commits


28 Commits

Author SHA1 Message Date
Dylan Baker
716fc5280a VERSION: bump version for 22.0.0-rc2 2022-02-09 09:43:34 -08:00
Lionel Landwerlin
2e1387c752 anv: fix conditional render for vkCmdDrawIndirectByteCountEXT
We just forgot about conditional render for this entry point.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2be89cbd82 ("anv: Implement vkCmdDrawIndirectByteCountEXT")
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14891>
(cherry picked from commit 93a90fc85d)
2022-02-08 09:23:53 -08:00
Lionel Landwerlin
a910e58ad8 intel/nir: fix shader call lowering
We're replacing a generic instruction with an Intel-specific one, so we
need to remove the previous instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c5a42e4010 ("intel/fs: fix shader call lowering pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
(cherry picked from commit 39f6cd5d79)
2022-02-08 09:23:53 -08:00
Lionel Landwerlin
54f49993d1 intel/fs: don't set allow_sample_mask for CS intrinsics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 77486db867 ("intel/fs: Disable sample mask predication for scratch stores")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
(cherry picked from commit c89024e446)
2022-02-08 09:23:52 -08:00
Dylan Baker
2b282fb3b5 .pick_status.json: Update to 5e9df85b1a 2022-02-08 09:23:49 -08:00
Dave Airlie
4e67d2aad4 crocus: find correct relocation target for the bo.
If we have batches a and b, and writing to batch b causes batch a
to flush, all the bo->index values get reset, and we end up submitting
a -1 to the kernel.

Look up the bo index when creating relocations.

Fixes a crash seen in KHR-GL46.compute_shader.pipeline-post-fs
and a trace from Wasteland 3.

Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14905>
(cherry picked from commit 37c3be6947)

Conflicts:
	src/gallium/drivers/crocus/ci/crocus-hsw-flakes.txt

I've deleted this file (the original commit removed an entry from it),
since it doesn't exist on the 22.0 branch and CI isn't run there.
2022-02-07 21:51:26 -08:00
Mike Blumenkrantz
8f5fb1eb10 zink: min/max blit region in coverage functions
these regions might not have the coords in the correct order, which will
cause them to fail intersection tests, resulting in clears that are never
applied
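The fix described above amounts to normalizing a region's coordinates before running interval-intersection tests. A minimal standalone sketch in C (struct and function names are illustrative, not zink's actual code), assuming a flipped blit encodes its flip as reversed coordinates:

```c
#include <assert.h>

/* A blit region; a "negative dimension" blit has x0 > x1 or y0 > y1. */
struct region {
   int x0, y0, x1, y1;
};

/* Reorder coords so x0 <= x1 and y0 <= y1; without this, a flipped
 * region fails the intersection test below and coverage is missed. */
static void
normalize_region(struct region *r)
{
   if (r->x0 > r->x1) { int t = r->x0; r->x0 = r->x1; r->x1 = t; }
   if (r->y0 > r->y1) { int t = r->y0; r->y0 = r->y1; r->y1 = t; }
}

/* Half-open interval overlap test, valid only on normalized regions. */
static int
regions_intersect(const struct region *a, const struct region *b)
{
   return a->x0 < b->x1 && b->x0 < a->x1 &&
          a->y0 < b->y1 && b->y0 < a->y1;
}
```

With the flipped region left unnormalized, regions_intersect() returns false for every partner, which is exactly the "clears that are never applied" symptom.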

cc: mesa-stable

fixes:
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_all_buffer_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_color_and_depth_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_color_and_stencil_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_linear_filter_color_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_magnifying_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_minifying_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_missing_buffers_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_nearest_filter_color_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_dimensions_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_height_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_negative_width_blit
GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_scissor_blit

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14867>
(cherry picked from commit 388f23eabe)
2022-02-07 21:49:43 -08:00
Mike Blumenkrantz
4587268d2b zink: reject invalid draws
cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14859>
(cherry picked from commit b656ab75a6)
2022-02-07 21:49:43 -08:00
Mike Blumenkrantz
a04818a500 zink: fix PIPE_CAP_TGSI_BALLOT export conditional
this requires VK_EXT_shader_subgroup_ballot

cc: mesa-stable

fixes (lavapipe):
KHR-GL46.shader_ballot_tests.ShaderBallotAvailability
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionRead

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14858>
(cherry picked from commit e38c13830f)
2022-02-07 21:49:42 -08:00
Rhys Perry
59b2c1ddde radv: fix R_02881C_PA_CL_VS_OUT_CNTL with mixed cull/clip distances
Matches radeonsi.

Seems Vulkan CTS doesn't really test cull distances. Removing
VARYING_SLOT_CULL_DIST0/VARYING_SLOT_CULL_DIST1 variables doesn't break
any of dEQP-VK.clipping.*, except for tests which read the variables in
the fragment shader.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5984
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14882>
(cherry picked from commit 7ddad1b93a)
2022-02-07 21:49:42 -08:00
Daniel Stone
2ce020120a egl/wayland: Reset buffer age when destroying buffers
A buffer age of 0 means that the buffer is uninitialised or has unknown
content. We rely on the buffer age initially being 0 through zalloc when
the surface is first created; when they are first used for a swap, we
set their age to 1, and then we increment the age of every buffer in the
chain with a non-zero age when we swap.

Now that we can release buffers, both through dmabuf-feedback as well as
detecting when we're using a deeper swapchain than the compositor needs,
make sure to reset their age as they are released. Without doing this,
the age will stay as it was before it was released and be incremented,
returning the wrong age to the user the first time a previously-released
buffer slot has been reused.
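The bookkeeping described above can be sketched standalone (types and names here are illustrative, not the actual egl/wayland code): swapping sets the presented buffer's age to 1 and bumps every other buffer with a non-zero age, so releasing a slot must drop its age back to 0.

```c
#include <assert.h>

#define NUM_BUFFERS 4

struct color_buffer {
   int locked;      /* held by the compositor */
   unsigned age;    /* 0 = uninitialised / unknown content */
};

/* Present buffer `idx`: it becomes the newest (age 1), and every other
 * buffer that already has known content grows one frame older. */
static void
swap_buffers(struct color_buffer *bufs, int idx)
{
   for (int i = 0; i < NUM_BUFFERS; i++)
      if (i != idx && bufs[i].age > 0)
         bufs[i].age++;
   bufs[idx].age = 1;
}

/* The fix: a released slot no longer tracks valid content, so its age
 * must return to 0 instead of continuing to be incremented. */
static void
release_buffer(struct color_buffer *buf)
{
   buf->locked = 0;
   buf->age = 0;
}
```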

Signed-off-by: Daniel Stone <daniels@collabora.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5977
Fixes: 22d796feb8 ("egl/wayland: break double/tripple buffering feedback loops")
Fixes: b5848b2dac ("egl/wayland: use surface dma-buf feedback to allocate surface buffers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14873>
(cherry picked from commit 3da8300562)
2022-02-07 21:49:41 -08:00
Samuel Pitoiset
ba2d22e95f Revert "radv: re-apply "Do not access set layout during vkCmdBindDescriptorSets.""
The most famous RADV revert of the past months. This was an issue in
RADV and not a use-after-free (descriptor set layouts can be destroyed
at almost any time).

This reverts commit b775aaff1e.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14621>
(cherry picked from commit 9ea4029f9f)
2022-02-07 21:49:15 -08:00
Charmaine Lee
5ff5f3cbf7 mesa: fix misaligned pointer returned by dlist_alloc
In cases where the to-be-allocated node size with padding exceeds BLOCK_SIZE
but without padding doesn't, a new block is not created and no padding is done
to the previous instruction, causing a misaligned pointer to be returned.

v2: Per Ilia Mirkin's suggestion, remove the extra condition in the first
    if statement, let it unconditionally pad the last instruction if needed.
    The updated currentPos will then be taken into account in the
    block size checking.
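The sizing rule can be illustrated in isolation (constants and helper names here are made up for the example, not Mesa's actual display-list code): the requested node size has to be padded to the instruction alignment before comparing against the remaining block space, otherwise a request whose padded size crosses the block boundary gets carved out of the current block at a misaligned offset.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 256   /* illustrative block size */
#define ALIGNMENT    8   /* illustrative instruction alignment */

/* Round `v` up to the next multiple of power-of-two `a`. */
static uintptr_t
align_up(uintptr_t v, uintptr_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Returns 1 iff a request of `size` bytes, once padded, still fits in
 * the current block starting at `current_pos`.  Checking the unpadded
 * size here is the bug pattern described above. */
static int
fits_in_block(uintptr_t current_pos, uintptr_t size)
{
   return current_pos + align_up(size, ALIGNMENT) <= BLOCK_SIZE;
}
```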

This fixes a crash seen with the lightsmark and Optuma apitraces.

Fixes: 05605d7f53 ("mesa: remove display list OPCODE_NOP")

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Tested-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14871>
(cherry picked from commit 945a1e0b8c)
2022-02-07 21:36:05 -08:00
Neha Bhende
5a7a564d7c svga: store shared_mem_size in svga_compute_shader instead of svga_context
When a new context was created, shared_mem_size was getting overwritten.
This fixes glretrace failures seen with the manhattan, aztec and
BASS2_intro apitraces.

Fixes: 247c61f2d0 ('svga: Add support for compute shader, shader buffers and image views')

Tested with glretrace, piglit

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit dd6793ec9218782b1b716a87582d7219bae4e75f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14870>
(cherry picked from commit 9230b28533)
2022-02-07 21:36:04 -08:00
Mike Blumenkrantz
2c7d0e1b49 zink: use scanout obj when returning resource param info
embarrassing typo since the base obj has no modifier data available

cc: mesa-stable

fixes #5980

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14875>
(cherry picked from commit 960e72417f)
2022-02-07 21:36:04 -08:00
Danylo Piliaiev
83eef372a2 turnip: Unconditionally remove descriptor set from pool's list on free
We didn't remove the desc set from the pool's list if the pool was
host_memory_base. On the other hand, there is no point in removing the
desc set from the list in DestroyDescriptorPool/ResetDescriptorPool.

Fixes: da7a4751 ("turnip: Drop references to layout of all sets on pool reset/destruction")

Fixes cts tests:
 dEQP-VK.api.buffer_marker.graphics.default_mem.bottom_of_pipe.memory_dep.draw
 dEQP-VK.api.buffer_marker.graphics.default_mem.bottom_of_pipe.memory_dep.dispatch

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14855>
(cherry picked from commit 183bc15bdb)
2022-02-07 21:36:03 -08:00
Kenneth Graunke
0a1f166f4d iris: Make an iris_foreach_batch macro that skips unsupported batches
IRIS_BATCH_BLITTER isn't supported prior to Tigerlake; in general,
batches may not be supported on all hardware.  In most cases, querying
them is harmless (if useless): they reference nothing, have no commands
to flush, and so on.  However, the fence code does need to know that
certain batches don't exist, so it can avoid adding inter-batch fences
involving them.

This patch introduces a new iris_foreach_batch() iterator macro that
walks over all batches that are actually supported on the platform,
while skipping the others.  It provides a central place to update should
we add or reorder more batches in the future.

Fixes various tests in the piglit.spec.ext_external_objects.* category.

Thanks to Tapani Pälli for catching this.

Fixes: a90a1f15 ("iris: Create an IRIS_BATCH_BLITTER for using the BLT command streamer")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14834>
(cherry picked from commit fd0e4aedeb)
2022-02-07 21:36:03 -08:00
Jesse Natalie
68242654f8 microsoft/compiler: Only treat tess level location as special if it's a patch constant
Fixes: a550c059 ("microsoft/compiler: For load_input from DS, use loadPatchConstant")
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14837>
(cherry picked from commit ce6dbbabf9)
2022-02-07 21:36:02 -08:00
Jesse Natalie
c7bd1f0720 microsoft/compiler: Only prep phis for the current function
Fixes: 41af9620 ("microsoft/compiler: Emit all NIR functions into the DXIL module")
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14837>
(cherry picked from commit 0c711dc823)
2022-02-07 21:36:02 -08:00
Mike Blumenkrantz
88762cf59b zink: add VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT for query binds
required by spec

cc: mesa-stable

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14853>
(cherry picked from commit 1e96542390)
2022-02-07 21:36:01 -08:00
Dylan Baker
6420dc86cf .pick_status.json: Update to 8335fdfeaf 2022-02-07 21:35:59 -08:00
Rhys Perry
a58a01050c aco: don't encode src2 for v_writelane_b32_e64
Encoding src2 doesn't cause issues for print_asm() because we have a
workaround there, but it does for RGP and it seems the developers are not
interested in fixing it.

https://github.com/GPUOpen-Tools/radeon_gpu_profiler/issues/61

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14832>
(cherry picked from commit 0447a2303f)
2022-02-03 10:32:02 -08:00
Pierre-Eric Pelloux-Prayer
b6e296f823 radeonsi: limit loop unrolling for LLVM < 13
Without this change LLVM 12 hits this error:

"""
LLVM ERROR: Error while trying to spill SGPR0_SGPR1 from class SReg_64:
Cannot scavenge register without an emergency spill slot!
"""

when running glcts KHR-GL46.arrays_of_arrays_gl.AtomicUsage test.

Fixes: 9ff086052a ("radeonsi: unroll loops of up to 128 iterations")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14848>
(cherry picked from commit eaa87b1a46)
2022-02-03 10:32:02 -08:00
Iago Toral Quiroga
fabb6b5c5e broadcom/compiler: fix offset alignment for ldunifa when skipping
The intention was to align the address to 4 bytes (32-bit), not
16 bytes.

Fixes: bdb6201ea1 ("broadcom/compiler: use ldunifa with unaligned constant offset")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14830>
(cherry picked from commit 0a8449b07c)
2022-02-03 10:32:01 -08:00
Mike Blumenkrantz
0ec3de0563 llvmpipe: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
this cap is broken

cc: mesa-stable

fixes:
GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUnifor

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14835>
(cherry picked from commit 9a75392cd8)
2022-02-03 10:32:01 -08:00
Mike Blumenkrantz
b2be43a192 zink: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
this cap is broken

cc: mesa-stable

fixes:
GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14835>
(cherry picked from commit 9a38dab2d1)
2022-02-03 10:32:00 -08:00
Dylan Baker
9e17fcbed2 .pick_status.json: Update to 0447a2303f 2022-02-03 10:31:57 -08:00
Dylan Baker
c69a870f86 VERSION: bump for 22.0.0-rc1 release 2022-02-02 15:15:29 -08:00
35 changed files with 1907 additions and 128 deletions

.pick_status.json: diff suppressed because it is too large (1730 lines)


@@ -1 +1 @@
-22.0.0-devel
+22.0.0-rc2


@@ -625,6 +625,10 @@ emit_instruction(asm_context& ctx, std::vector<uint32_t>& out, Instruction* inst
    encoding = 0;
    if (instr->opcode == aco_opcode::v_interp_mov_f32) {
       encoding = 0x3 & instr->operands[0].constantValue();
+   } else if (instr->opcode == aco_opcode::v_writelane_b32_e64) {
+      encoding |= instr->operands[0].physReg() << 0;
+      encoding |= instr->operands[1].physReg() << 9;
+      /* Encoding src2 works fine with hardware but breaks some disassemblers. */
    } else {
       for (unsigned i = 0; i < instr->operands.size(); i++)
          encoding |= instr->operands[i].physReg() << (i * 9);


@@ -271,12 +271,6 @@ std::pair<bool, size_t>
 disasm_instr(chip_class chip, LLVMDisasmContextRef disasm, uint32_t* binary, unsigned exec_size,
              size_t pos, char* outline, unsigned outline_size)
 {
-   /* mask out src2 on v_writelane_b32 */
-   if (((chip == GFX8 || chip == GFX9) && (binary[pos] & 0xffff8000) == 0xd28a0000) ||
-       (chip >= GFX10 && (binary[pos] & 0xffff8000) == 0xd7610000)) {
-      binary[pos + 1] = binary[pos + 1] & 0xF803FFFF;
-   }
-
    size_t l =
       LLVMDisasmInstruction(disasm, (uint8_t*)&binary[pos], (exec_size - pos) * sizeof(uint32_t),
                             pos * 4, outline, outline_size);


@@ -4727,6 +4727,7 @@ radv_bind_descriptor_set(struct radv_cmd_buffer *cmd_buffer, VkPipelineBindPoint
    radv_set_descriptor_set(cmd_buffer, bind_point, set, idx);
    assert(set);
+   assert(!(set->header.layout->flags & VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR));
    if (!cmd_buffer->device->use_global_bo_list) {
       for (unsigned j = 0; j < set->header.buffer_count; ++j)
@@ -4764,7 +4765,7 @@ radv_CmdBindDescriptorSets(VkCommandBuffer commandBuffer, VkPipelineBindPoint pi
       radv_bind_descriptor_set(cmd_buffer, pipelineBindPoint, set, set_idx);
    }
-   for (unsigned j = 0; j < layout->set[set_idx].dynamic_offset_count; ++j, ++dyn_idx) {
+   for (unsigned j = 0; j < set->header.layout->dynamic_offset_count; ++j, ++dyn_idx) {
       unsigned idx = j + layout->set[i + firstSet].dynamic_offset_start;
       uint32_t *dst = descriptors_state->dynamic_buffers + idx * 4;
       assert(dyn_idx < dynamicOffsetCount);
@@ -4790,7 +4791,7 @@ radv_CmdBindDescriptorSets(VkCommandBuffer commandBuffer, VkPipelineBindPoint pi
          }
       }
-      cmd_buffer->push_constant_stages |= layout->set[set_idx].dynamic_offset_stages;
+      cmd_buffer->push_constant_stages |= set->header.layout->dynamic_shader_stages;
    }
 }
 }


@@ -496,16 +496,11 @@ radv_CreatePipelineLayout(VkDevice _device, const VkPipelineLayoutCreateInfo *pC
       layout->set[set].layout = set_layout;
       layout->set[set].dynamic_offset_start = dynamic_offset_count;
-      layout->set[set].dynamic_offset_count = 0;
-      layout->set[set].dynamic_offset_stages = 0;
       for (uint32_t b = 0; b < set_layout->binding_count; b++) {
-         layout->set[set].dynamic_offset_count +=
-            set_layout->binding[b].array_size * set_layout->binding[b].dynamic_offset_count;
-         layout->set[set].dynamic_offset_stages |= set_layout->dynamic_shader_stages;
+         dynamic_offset_count += set_layout->binding[b].array_size * set_layout->binding[b].dynamic_offset_count;
+         dynamic_shader_stages |= set_layout->dynamic_shader_stages;
       }
-      dynamic_offset_count += layout->set[set].dynamic_offset_count;
-      dynamic_shader_stages |= layout->set[set].dynamic_offset_stages;
       /* Hash the entire set layout except for the vk_object_base. The
        * rest of the set layout is carefully constructed to not have


@@ -89,9 +89,7 @@ struct radv_pipeline_layout {
    struct {
       struct radv_descriptor_set_layout *layout;
       uint32_t size;
-      uint16_t dynamic_offset_start;
-      uint16_t dynamic_offset_count;
-      VkShaderStageFlags dynamic_offset_stages;
+      uint32_t dynamic_offset_start;
    } set[MAX_SETS];
    uint32_t num_sets;


@@ -4773,7 +4773,7 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf *ctx_cs, struct radeon_cmdbuf
                           S_02881C_VS_OUT_MISC_SIDE_BUS_ENA(misc_vec_ena) |
                           S_02881C_VS_OUT_CCDIST0_VEC_ENA((total_mask & 0x0f) != 0) |
                           S_02881C_VS_OUT_CCDIST1_VEC_ENA((total_mask & 0xf0) != 0) |
-                          cull_dist_mask << 8 | clip_dist_mask);
+                          total_mask << 8 | clip_dist_mask);
    if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
       radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF, outinfo->writes_viewport_index);
@@ -4911,7 +4911,7 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf *ctx_cs, struct radeon_cmdbuf
                           S_02881C_VS_OUT_MISC_SIDE_BUS_ENA(misc_vec_ena) |
                           S_02881C_VS_OUT_CCDIST0_VEC_ENA((total_mask & 0x0f) != 0) |
                           S_02881C_VS_OUT_CCDIST1_VEC_ENA((total_mask & 0xf0) != 0) |
-                          cull_dist_mask << 8 | clip_dist_mask);
+                          total_mask << 8 | clip_dist_mask);
    radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN,
                           S_028A84_PRIMITIVEID_EN(es_enable_prim_id) |


@@ -3034,7 +3034,7 @@ ntq_emit_load_ubo_unifa(struct v3d_compile *c, nir_intrinsic_instr *instr)
          * alignment and skip over unused elements in result.
          */
         value_skips = (const_offset % 4) / (bit_size / 8);
-        const_offset &= ~0xf;
+        const_offset &= ~0x3;
      }
   }


@@ -256,6 +256,7 @@ wl_buffer_release(void *data, struct wl_buffer *buffer)
          wl_buffer_destroy(buffer);
          dri2_surf->color_buffers[i].wl_release = false;
          dri2_surf->color_buffers[i].wl_buffer = NULL;
+         dri2_surf->color_buffers[i].age = 0;
       }
       dri2_surf->color_buffers[i].locked = false;
@@ -863,6 +864,7 @@ dri2_wl_release_buffers(struct dri2_egl_surface *dri2_surf)
       dri2_surf->color_buffers[i].dri_image = NULL;
       dri2_surf->color_buffers[i].linear_copy = NULL;
       dri2_surf->color_buffers[i].data = NULL;
+      dri2_surf->color_buffers[i].age = 0;
    }
    if (dri2_dpy->dri2)
@@ -1145,6 +1147,7 @@ update_buffers(struct dri2_egl_surface *dri2_surf)
          dri2_surf->color_buffers[i].wl_buffer = NULL;
          dri2_surf->color_buffers[i].dri_image = NULL;
          dri2_surf->color_buffers[i].linear_copy = NULL;
+         dri2_surf->color_buffers[i].age = 0;
       }
    }
@@ -2342,6 +2345,7 @@ swrast_update_buffers(struct dri2_egl_surface *dri2_surf)
                 dri2_surf->color_buffers[i].data_size);
          dri2_surf->color_buffers[i].wl_buffer = NULL;
          dri2_surf->color_buffers[i].data = NULL;
+         dri2_surf->color_buffers[i].age = 0;
       }
    }


@@ -589,8 +589,6 @@ tu_descriptor_set_destroy(struct tu_device *device,
       }
    }
-   list_del(&set->pool_link);
-
    vk_object_free(&device->vk, NULL, set);
 }
@@ -814,8 +812,10 @@ tu_FreeDescriptorSets(VkDevice _device,
    for (uint32_t i = 0; i < count; i++) {
       TU_FROM_HANDLE(tu_descriptor_set, set, pDescriptorSets[i]);
-      if (set)
+      if (set) {
          tu_descriptor_set_layout_unref(device, set->layout);
+         list_del(&set->pool_link);
+      }
       if (set && !pool->host_memory_base)
          tu_descriptor_set_destroy(device, pool, set, true);


@@ -132,8 +132,10 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
       return 1;
    case PIPE_SHADER_CAP_FP16:
    case PIPE_SHADER_CAP_FP16_DERIVATIVES:
-   case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
       return lp_has_fp16();
+   //enabling this breaks GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform
+   case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
+      return 0;
    case PIPE_SHADER_CAP_INT64_ATOMICS:
       return 0;
    case PIPE_SHADER_CAP_INT16:


@@ -263,21 +263,30 @@ crocus_init_batch(struct crocus_context *ice,
    crocus_batch_reset(batch);
 }
-static struct drm_i915_gem_exec_object2 *
-find_validation_entry(struct crocus_batch *batch, struct crocus_bo *bo)
+static int
+find_exec_index(struct crocus_batch *batch, struct crocus_bo *bo)
 {
    unsigned index = READ_ONCE(bo->index);
    if (index < batch->exec_count && batch->exec_bos[index] == bo)
-      return &batch->validation_list[index];
+      return index;
    /* May have been shared between multiple active batches */
    for (index = 0; index < batch->exec_count; index++) {
       if (batch->exec_bos[index] == bo)
-         return &batch->validation_list[index];
+         return index;
    }
-   return NULL;
+   return -1;
+}
+
+static struct drm_i915_gem_exec_object2 *
+find_validation_entry(struct crocus_batch *batch, struct crocus_bo *bo)
+{
+   int index = find_exec_index(batch, bo);
+   if (index == -1)
+      return NULL;
+   return &batch->validation_list[index];
 }
 static void
@@ -409,7 +418,7 @@ emit_reloc(struct crocus_batch *batch,
       (struct drm_i915_gem_relocation_entry) {
          .offset = offset,
          .delta = target_offset,
-         .target_handle = target->index,
+         .target_handle = find_exec_index(batch, target),
          .presumed_offset = entry->offset,
       };


@@ -181,13 +181,13 @@ iris_init_batch(struct iris_context *ice,
    struct iris_batch *batch = &ice->batches[name];
    struct iris_screen *screen = (void *) ice->ctx.screen;
-   /* Note: ctx_id, exec_flags and has_engines_context fields are initialized
-    * at an earlier phase when contexts are created.
+   /* Note: screen, ctx_id, exec_flags and has_engines_context fields are
+    * initialized at an earlier phase when contexts are created.
     *
-    * Ref: iris_init_engines_context(), iris_init_non_engine_contexts()
+    * See iris_init_batches(), which calls either iris_init_engines_context()
+    * or iris_init_non_engine_contexts().
     */
-   batch->screen = screen;
    batch->dbg = &ice->dbg;
    batch->reset = &ice->reset;
    batch->state_sizes = ice->state.sizes;
@@ -214,11 +214,12 @@ iris_init_batch(struct iris_context *ice,
    batch->cache.render = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
                                                  _mesa_key_pointer_equal);
+   batch->num_other_batches = 0;
    memset(batch->other_batches, 0, sizeof(batch->other_batches));
-   for (int i = 0, j = 0; i < IRIS_BATCH_COUNT; i++) {
-      if (i != name)
-         batch->other_batches[j++] = &ice->batches[i];
+   iris_foreach_batch(ice, other_batch) {
+      if (batch != other_batch)
+         batch->other_batches[batch->num_other_batches++] = other_batch;
    }
    if (INTEL_DEBUG(DEBUG_ANY)) {
@@ -250,8 +251,7 @@ iris_init_non_engine_contexts(struct iris_context *ice, int priority)
 {
    struct iris_screen *screen = (void *) ice->ctx.screen;
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-      struct iris_batch *batch = &ice->batches[i];
+   iris_foreach_batch(ice, batch) {
       batch->ctx_id = iris_create_hw_context(screen->bufmgr);
       batch->exec_flags = I915_EXEC_RENDER;
       batch->has_engines_context = false;
@@ -315,8 +315,8 @@ iris_init_engines_context(struct iris_context *ice, int priority)
    struct iris_screen *screen = (void *) ice->ctx.screen;
    iris_hw_context_set_priority(screen->bufmgr, engines_ctx, priority);
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-      struct iris_batch *batch = &ice->batches[i];
+   iris_foreach_batch(ice, batch) {
+      unsigned i = batch - &ice->batches[0];
       batch->ctx_id = engines_ctx;
       batch->exec_flags = i;
       batch->has_engines_context = true;
@@ -328,10 +328,14 @@ iris_init_engines_context(struct iris_context *ice, int priority)
 void
 iris_init_batches(struct iris_context *ice, int priority)
 {
+   /* We have to do this early for iris_foreach_batch() to work */
+   for (int i = 0; i < IRIS_BATCH_COUNT; i++)
+      ice->batches[i].screen = (void *) ice->ctx.screen;
+
    if (!iris_init_engines_context(ice, priority))
       iris_init_non_engine_contexts(ice, priority);
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++)
-      iris_init_batch(ice, (enum iris_batch_name) i);
+   iris_foreach_batch(ice, batch)
+      iris_init_batch(ice, batch - &ice->batches[0]);
 }
 static int
@@ -400,7 +404,7 @@ flush_for_cross_batch_dependencies(struct iris_batch *batch,
    * it had already referenced, we may need to flush other batches in order
    * to correctly synchronize them.
    */
-   for (int b = 0; b < ARRAY_SIZE(batch->other_batches); b++) {
+   for (int b = 0; b < batch->num_other_batches; b++) {
       struct iris_batch *other_batch = batch->other_batches[b];
       int other_index = find_exec_index(other_batch, bo);
@@ -598,8 +602,8 @@ iris_destroy_batches(struct iris_context *ice)
                                  ice->batches[0].ctx_id);
    }
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++)
-      iris_batch_free(&ice->batches[i]);
+   iris_foreach_batch(ice, batch)
+      iris_batch_free(batch);
 }
 /**
@@ -726,10 +730,10 @@ replace_kernel_ctx(struct iris_batch *batch)
       int new_ctx = iris_create_engines_context(ice, priority);
       if (new_ctx < 0)
          return false;
-      for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-         ice->batches[i].ctx_id = new_ctx;
+      iris_foreach_batch(ice, bat) {
+         bat->ctx_id = new_ctx;
          /* Notify the context that state must be re-initialized. */
-         iris_lost_context_state(&ice->batches[i]);
+         iris_lost_context_state(bat);
       }
       iris_destroy_kernel_context(bufmgr, old_ctx);
    } else {
@@ -810,6 +814,7 @@ update_bo_syncobjs(struct iris_batch *batch, struct iris_bo *bo, bool write)
 {
    struct iris_screen *screen = batch->screen;
    struct iris_bufmgr *bufmgr = screen->bufmgr;
+   struct iris_context *ice = batch->ice;
    /* Make sure bo->deps is big enough */
    if (screen->id >= bo->deps_size) {
@@ -838,7 +843,9 @@ update_bo_syncobjs(struct iris_batch *batch, struct iris_bo *bo, bool write)
    * have come from a different context, and apps don't like it when we don't
    * do inter-context tracking.
    */
-   for (unsigned i = 0; i < IRIS_BATCH_COUNT; i++) {
+   iris_foreach_batch(ice, batch_i) {
+      unsigned i = batch_i->name;
       /* If the bo is being written to by others, wait for them. */
       if (bo_deps->write_syncobjs[i])
         move_syncobj_to_batch(batch, &bo_deps->write_syncobjs[i],


@@ -136,6 +136,7 @@ struct iris_batch {
    /** List of other batches which we might need to flush to use a BO */
    struct iris_batch *other_batches[IRIS_BATCH_COUNT - 1];
+   unsigned num_other_batches;
    struct {
       /**
@@ -382,4 +383,9 @@ iris_batch_mark_reset_sync(struct iris_batch *batch)
 const char *
 iris_batch_name_to_string(enum iris_batch_name name);
+#define iris_foreach_batch(ice, batch) \
+   for (struct iris_batch *batch = &ice->batches[0]; \
+        batch <= &ice->batches[((struct iris_screen *)ice->ctx.screen)->devinfo.ver >= 12 ? IRIS_BATCH_BLITTER : IRIS_BATCH_COMPUTE]; \
+        ++batch)
+
 #endif


@@ -114,9 +114,9 @@ iris_border_color_pool_reserve(struct iris_context *ice, unsigned count)
    if (remaining_entries < count) {
       /* It's safe to flush because we're called outside of state upload. */
-      for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-         if (iris_batch_references(&ice->batches[i], pool->bo))
-            iris_batch_flush(&ice->batches[i]);
+      iris_foreach_batch(ice, batch) {
+         if (iris_batch_references(batch, pool->bo))
+            iris_batch_flush(batch);
       }
       iris_reset_border_color_pool(pool, pool->bo->bufmgr);


@@ -98,12 +98,12 @@ iris_get_device_reset_status(struct pipe_context *ctx)
    /* Check the reset status of each batch's hardware context, and take the
     * worst status (if one was guilty, proclaim guilt).
     */
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
+   iris_foreach_batch(ice, batch) {
       /* This will also recreate the hardware contexts as necessary, so any
        * future queries will show no resets. We only want to report once.
        */
       enum pipe_reset_status batch_reset =
-         iris_batch_check_for_reset(&ice->batches[i]);
+         iris_batch_check_for_reset(batch);
       if (batch_reset == PIPE_NO_RESET)
          continue;


@@ -263,8 +263,8 @@ iris_fence_flush(struct pipe_context *ctx,
    iris_flush_dirty_dmabufs(ice);
    if (!deferred) {
-      for (unsigned i = 0; i < IRIS_BATCH_COUNT; i++)
-         iris_batch_flush(&ice->batches[i]);
+      iris_foreach_batch(ice, batch)
+         iris_batch_flush(batch);
    }
    if (flags & PIPE_FLUSH_END_OF_FRAME) {
@@ -286,8 +286,8 @@ iris_fence_flush(struct pipe_context *ctx,
    if (deferred)
       fence->unflushed_ctx = ctx;
-   for (unsigned b = 0; b < IRIS_BATCH_COUNT; b++) {
-      struct iris_batch *batch = &ice->batches[b];
+   iris_foreach_batch(ice, batch) {
+      unsigned b = batch->name;
       if (deferred && iris_batch_bytes_used(batch) > 0) {
          struct iris_fine_fence *fine =
@@ -339,9 +339,7 @@ iris_fence_await(struct pipe_context *ctx,
       if (iris_fine_fence_signaled(fine))
          continue;
-      for (unsigned b = 0; b < IRIS_BATCH_COUNT; b++) {
-         struct iris_batch *batch = &ice->batches[b];
-
+      iris_foreach_batch(ice, batch) {
         /* We're going to make any future work in this batch wait for our
          * fence to have gone by. But any currently queued work doesn't
          * need to wait. Flush the batch now, so it can happen sooner.
@@ -402,14 +400,14 @@ iris_fence_finish(struct pipe_screen *p_screen,
    * that it matches first.
    */
   if (ctx && ctx == fence->unflushed_ctx) {
-      for (unsigned i = 0; i < IRIS_BATCH_COUNT; i++) {
-         struct iris_fine_fence *fine = fence->fine[i];
+      iris_foreach_batch(ice, batch) {
+         struct iris_fine_fence *fine = fence->fine[batch->name];
          if (iris_fine_fence_signaled(fine))
            continue;
-         if (fine->syncobj == iris_batch_get_signal_syncobj(&ice->batches[i]))
-            iris_batch_flush(&ice->batches[i]);
+         if (fine->syncobj == iris_batch_get_signal_syncobj(batch))
+            iris_batch_flush(batch);
      }
      /* The fence is no longer deferred. */
@@ -595,7 +593,7 @@ iris_fence_signal(struct pipe_context *ctx,
    if (ctx == fence->unflushed_ctx)
       return;
-   for (unsigned b = 0; b < IRIS_BATCH_COUNT; b++) {
+   iris_foreach_batch(ice, batch) {
       for (unsigned i = 0; i < ARRAY_SIZE(fence->fine); i++) {
          struct iris_fine_fence *fine = fence->fine[i];
@@ -603,9 +601,8 @@ iris_fence_signal(struct pipe_context *ctx,
          if (iris_fine_fence_signaled(fine))
            continue;
-         ice->batches[b].contains_fence_signal = true;
-         iris_batch_add_syncobj(&ice->batches[b], fine->syncobj,
-                                I915_EXEC_FENCE_SIGNAL);
+         batch->contains_fence_signal = true;
+         iris_batch_add_syncobj(batch, fine->syncobj, I915_EXEC_FENCE_SIGNAL);
      }
   }
}


@@ -357,11 +357,10 @@ iris_memory_barrier(struct pipe_context *ctx, unsigned flags)
                 PIPE_CONTROL_TILE_CACHE_FLUSH;
    }
-   for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
-      if (ice->batches[i].contains_draw) {
-         iris_batch_maybe_flush(&ice->batches[i], 24);
-         iris_emit_pipe_control_flush(&ice->batches[i], "API: memory barrier",
-                                      bits);
+   iris_foreach_batch(ice, batch) {
+      if (batch->contains_draw) {
+         iris_batch_maybe_flush(batch, 24);
+         iris_emit_pipe_control_flush(batch, "API: memory barrier", bits);
       }
    }
 }


@@ -1404,9 +1404,9 @@ iris_flush_resource(struct pipe_context *ctx, struct pipe_resource *resource)
* sure to get rid of any compression that a consumer wouldn't know how
* to handle.
*/
for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
if (iris_batch_references(&ice->batches[i], res->bo))
iris_batch_flush(&ice->batches[i]);
iris_foreach_batch(ice, batch) {
if (iris_batch_references(batch, res->bo))
iris_batch_flush(batch);
}
iris_resource_disable_aux(res);
@@ -1741,8 +1741,8 @@ resource_is_busy(struct iris_context *ice,
{
bool busy = iris_bo_busy(res->bo);
for (int i = 0; i < IRIS_BATCH_COUNT; i++)
busy |= iris_batch_references(&ice->batches[i], res->bo);
iris_foreach_batch(ice, batch)
busy |= iris_batch_references(batch, res->bo);
return busy;
}
@@ -2339,9 +2339,9 @@ iris_transfer_map(struct pipe_context *ctx,
}
if (!(usage & PIPE_MAP_UNSYNCHRONIZED)) {
for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
if (iris_batch_references(&ice->batches[i], res->bo))
iris_batch_flush(&ice->batches[i]);
iris_foreach_batch(ice, batch) {
if (iris_batch_references(batch, res->bo))
iris_batch_flush(batch);
}
}
@@ -2384,8 +2384,7 @@ iris_transfer_flush_region(struct pipe_context *ctx,
}
if (history_flush & ~PIPE_CONTROL_CS_STALL) {
for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
struct iris_batch *batch = &ice->batches[i];
iris_foreach_batch(ice, batch) {
if (batch->contains_draw || batch->cache.render->entries) {
iris_batch_maybe_flush(batch, 24);
iris_emit_pipe_control_flush(batch,
@@ -2474,9 +2473,9 @@ iris_texture_subdata(struct pipe_context *ctx,
iris_resource_access_raw(ice, res, level, box->z, box->depth, true);
for (int i = 0; i < IRIS_BATCH_COUNT; i++) {
if (iris_batch_references(&ice->batches[i], res->bo))
iris_batch_flush(&ice->batches[i]);
iris_foreach_batch(ice, batch) {
if (iris_batch_references(batch, res->bo))
iris_batch_flush(batch);
}
uint8_t *dst = iris_bo_map(&ice->dbg, res->bo, MAP_WRITE | MAP_RAW);


@@ -1072,8 +1072,8 @@ void si_init_screen_get_functions(struct si_screen *sscreen)
.has_udot_4x8 = sscreen->info.has_accelerated_dot_product,
.has_dot_2x16 = sscreen->info.has_accelerated_dot_product,
.optimize_sample_mask_in = true,
.max_unroll_iterations = 128,
.max_unroll_iterations_aggressive = 128,
.max_unroll_iterations = LLVM_VERSION_MAJOR >= 13 ? 128 : 32,
.max_unroll_iterations_aggressive = LLVM_VERSION_MAJOR >= 13 ? 128 : 32,
.use_interpolated_input_intrinsics = true,
.lower_uniforms_to_ubo = true,
.support_16bit_alu = sscreen->options.fp16,


@@ -374,7 +374,6 @@ struct svga_state
struct pipe_resource *indirect;
} grid_info;
unsigned shared_mem_size;
};
struct svga_prescale {


@@ -61,7 +61,7 @@ svga_create_compute_state(struct pipe_context *pipe,
cs->base.id = svga->debug.shader_id++;
svga->curr.shared_mem_size = templ->req_local_mem;
cs->shared_mem_size = templ->req_local_mem;
SVGA_STATS_TIME_POP(svga_sws(svga));
return cs;


@@ -380,6 +380,7 @@ struct svga_tes_shader
struct svga_compute_shader
{
struct svga_shader base;
unsigned shared_mem_size;
};


@@ -80,7 +80,7 @@ make_cs_key(struct svga_context *svga,
key->cs.grid_size[0] = svga->curr.grid_info.size[0];
key->cs.grid_size[1] = svga->curr.grid_info.size[1];
key->cs.grid_size[2] = svga->curr.grid_info.size[2];
key->cs.mem_size = svga->curr.shared_mem_size;
key->cs.mem_size = cs->shared_mem_size;
if (svga->curr.grid_info.indirect && cs->base.info.uses_grid_size) {
struct pipe_transfer *transfer = NULL;


@@ -363,12 +363,18 @@ bool
zink_blit_region_fills(struct u_rect region, unsigned width, unsigned height)
{
struct u_rect intersect = {0, width, 0, height};
struct u_rect r = {
MIN2(region.x0, region.x1),
MAX2(region.x0, region.x1),
MIN2(region.y0, region.y1),
MAX2(region.y0, region.y1),
};
if (!u_rect_test_intersection(&region, &intersect))
if (!u_rect_test_intersection(&r, &intersect))
/* is this even a thing? */
return false;
u_rect_find_intersection(&region, &intersect);
u_rect_find_intersection(&r, &intersect);
if (intersect.x0 != 0 || intersect.y0 != 0 ||
intersect.x1 != width || intersect.y1 != height)
return false;
@@ -379,11 +385,23 @@ zink_blit_region_fills(struct u_rect region, unsigned width, unsigned height)
bool
zink_blit_region_covers(struct u_rect region, struct u_rect covers)
{
struct u_rect r = {
MIN2(region.x0, region.x1),
MAX2(region.x0, region.x1),
MIN2(region.y0, region.y1),
MAX2(region.y0, region.y1),
};
struct u_rect c = {
MIN2(covers.x0, covers.x1),
MAX2(covers.x0, covers.x1),
MIN2(covers.y0, covers.y1),
MAX2(covers.y0, covers.y1),
};
struct u_rect intersect;
if (!u_rect_test_intersection(&region, &covers))
if (!u_rect_test_intersection(&r, &c))
return false;
u_rect_union(&intersect, &region, &covers);
return intersect.x0 == covers.x0 && intersect.y0 == covers.y0 &&
intersect.x1 == covers.x1 && intersect.y1 == covers.y1;
u_rect_union(&intersect, &r, &c);
return intersect.x0 == c.x0 && intersect.y0 == c.y0 &&
intersect.x1 == c.x1 && intersect.y1 == c.y1;
}


@@ -487,6 +487,9 @@ zink_draw(struct pipe_context *pctx,
struct pipe_vertex_state *vstate,
uint32_t partial_velem_mask)
{
if (!dindirect && (!draws[0].count || !dinfo->instance_count))
return;
struct zink_context *ctx = zink_context(pctx);
struct zink_screen *screen = zink_screen(pctx->screen);
struct zink_rasterizer_state *rast_state = ctx->rast_state;


@@ -172,6 +172,9 @@ create_bci(struct zink_screen *screen, const struct pipe_resource *templ, unsign
if (bind & PIPE_BIND_SHADER_IMAGE)
bci.usage |= VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT;
if (bind & PIPE_BIND_QUERY_BUFFER)
bci.usage |= VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT;
if (templ->flags & PIPE_RESOURCE_FLAG_SPARSE)
bci.flags |= VK_BUFFER_CREATE_SPARSE_BINDING_BIT;
return bci;
@@ -1036,7 +1039,7 @@ zink_resource_get_param(struct pipe_screen *pscreen, struct pipe_context *pctx,
switch (param) {
case PIPE_RESOURCE_PARAM_NPLANES:
if (screen->info.have_EXT_image_drm_format_modifier)
*value = pscreen->get_dmabuf_modifier_planes(pscreen, res->obj->modifier, pres->format);
*value = pscreen->get_dmabuf_modifier_planes(pscreen, obj->modifier, pres->format);
else
*value = 1;
break;
@@ -1066,7 +1069,7 @@ zink_resource_get_param(struct pipe_screen *pscreen, struct pipe_context *pctx,
}
case PIPE_RESOURCE_PARAM_MODIFIER: {
*value = res->obj->modifier;
*value = obj->modifier;
break;
}


@@ -468,7 +468,7 @@ zink_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
return 1;
case PIPE_CAP_TGSI_BALLOT:
return screen->vk_version >= VK_MAKE_VERSION(1,2,0) && screen->info.props11.subgroupSize <= 64;
return screen->info.have_vulkan12 && screen->info.have_EXT_shader_subgroup_ballot && screen->info.props11.subgroupSize <= 64;
case PIPE_CAP_SAMPLE_SHADING:
return screen->info.feats.features.sampleRateShading;
@@ -935,8 +935,10 @@ zink_get_shader_param(struct pipe_screen *pscreen,
return 0; /* not implemented */
case PIPE_SHADER_CAP_FP16_CONST_BUFFERS:
return screen->info.feats11.uniformAndStorageBuffer16BitAccess ||
(screen->info.have_KHR_16bit_storage && screen->info.storage_16bit_feats.uniformAndStorageBuffer16BitAccess);
//enabling this breaks GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform
//return screen->info.feats11.uniformAndStorageBuffer16BitAccess ||
//(screen->info.have_KHR_16bit_storage && screen->info.storage_16bit_feats.uniformAndStorageBuffer16BitAccess);
return 0;
case PIPE_SHADER_CAP_FP16_DERIVATIVES:
return 0; //spirv requires 32bit derivative srcs and dests
case PIPE_SHADER_CAP_FP16:


@@ -3945,7 +3945,10 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
srcs[SURFACE_LOGICAL_SRC_SURFACE] = brw_imm_ud(GFX7_BTI_SLM);
srcs[SURFACE_LOGICAL_SRC_ADDRESS] = get_nir_src(instr->src[1]);
srcs[SURFACE_LOGICAL_SRC_IMM_DIMS] = brw_imm_ud(1);
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(1);
/* No point in masking with sample mask, here we're handling compute
* intrinsics.
*/
srcs[SURFACE_LOGICAL_SRC_ALLOW_SAMPLE_MASK] = brw_imm_ud(0);
fs_reg data = get_nir_src(instr->src[0]);
data.type = brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_UD);


@@ -137,6 +137,8 @@ lower_shader_calls_instr(struct nir_builder *b, nir_instr *instr, void *data)
switch (call->intrinsic) {
case nir_intrinsic_rt_trace_ray: {
b->cursor = nir_instr_remove(instr);
store_resume_addr(b, call);
nir_ssa_def *as_addr = call->src[0].ssa;
@@ -217,6 +219,8 @@ lower_shader_calls_instr(struct nir_builder *b, nir_instr *instr, void *data)
}
case nir_intrinsic_rt_execute_callable: {
b->cursor = nir_instr_remove(instr);
store_resume_addr(b, call);
nir_ssa_def *sbt_offset32 =


@@ -4371,6 +4371,9 @@ void genX(CmdDrawIndirectByteCountEXT)(
genX(cmd_buffer_flush_state)(cmd_buffer);
if (cmd_buffer->state.conditional_render_enabled)
genX(cmd_emit_conditional_render_predicate)(cmd_buffer);
if (vs_prog_data->uses_firstvertex ||
vs_prog_data->uses_baseinstance)
emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
@@ -4405,6 +4408,7 @@ void genX(CmdDrawIndirectByteCountEXT)(
anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
prim.IndirectParameterEnable = true;
prim.PredicateEnable = cmd_buffer->state.conditional_render_enabled;
prim.VertexAccessType = SEQUENTIAL;
prim.PrimitiveTopologyType = cmd_buffer->state.gfx.primitive_topology;
}


@@ -1506,8 +1506,7 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes, bool align8)
/* If this node needs to start on an 8-byte boundary, pad the last node. */
if (sizeof(void *) == 8 && align8 &&
ctx->ListState.CurrentPos % 2 == 1 &&
ctx->ListState.CurrentPos + 1 + numNodes + contNodes <= BLOCK_SIZE) {
ctx->ListState.CurrentPos % 2 == 1) {
Node *last = ctx->ListState.CurrentBlock + ctx->ListState.CurrentPos -
ctx->ListState.LastInstSize;
last->InstSize++;


@@ -3094,8 +3094,9 @@ emit_store_output_via_intrinsic(struct ntd_context *ctx, nir_intrinsic_instr *in
* generation, so muck with them here too.
*/
nir_io_semantics semantics = nir_intrinsic_io_semantics(intr);
bool is_tess_level = semantics.location == VARYING_SLOT_TESS_LEVEL_INNER ||
semantics.location == VARYING_SLOT_TESS_LEVEL_OUTER;
bool is_tess_level = is_patch_constant &&
(semantics.location == VARYING_SLOT_TESS_LEVEL_INNER ||
semantics.location == VARYING_SLOT_TESS_LEVEL_OUTER);
const struct dxil_value *row = NULL;
const struct dxil_value *col = NULL;
@@ -3198,8 +3199,9 @@ emit_load_input_via_intrinsic(struct ntd_context *ctx, nir_intrinsic_instr *intr
* generation, so muck with them here too.
*/
nir_io_semantics semantics = nir_intrinsic_io_semantics(intr);
bool is_tess_level = semantics.location == VARYING_SLOT_TESS_LEVEL_INNER ||
semantics.location == VARYING_SLOT_TESS_LEVEL_OUTER;
bool is_tess_level = is_patch_constant &&
(semantics.location == VARYING_SLOT_TESS_LEVEL_INNER ||
semantics.location == VARYING_SLOT_TESS_LEVEL_OUTER);
const struct dxil_value *row = NULL;
const struct dxil_value *comp = NULL;
@@ -5011,7 +5013,7 @@ sort_uniforms_by_binding_and_remove_structs(nir_shader *s)
}
static void
prepare_phi_values(struct ntd_context *ctx)
prepare_phi_values(struct ntd_context *ctx, nir_function_impl *impl)
{
/* PHI nodes are difficult to get right when tracking the types:
* Since the incoming sources are linked to blocks, we can't bitcast
@@ -5020,19 +5022,15 @@ prepare_phi_values(struct ntd_context *ctx)
* value has a different type then the one expected by the phi node.
* We choose int as default, because it supports more bit sizes.
*/
nir_foreach_function(function, ctx->shader) {
if (function->impl) {
nir_foreach_block(block, function->impl) {
nir_foreach_instr(instr, block) {
if (instr->type == nir_instr_type_phi) {
nir_phi_instr *ir = nir_instr_as_phi(instr);
unsigned bitsize = nir_dest_bit_size(ir->dest);
const struct dxil_value *dummy = dxil_module_get_int_const(&ctx->mod, 0, bitsize);
nir_foreach_phi_src(src, ir) {
for(unsigned int i = 0; i < ir->dest.ssa.num_components; ++i)
store_ssa_def(ctx, src->src.ssa, i, dummy);
}
}
nir_foreach_block(block, impl) {
nir_foreach_instr(instr, block) {
if (instr->type == nir_instr_type_phi) {
nir_phi_instr *ir = nir_instr_as_phi(instr);
unsigned bitsize = nir_dest_bit_size(ir->dest);
const struct dxil_value *dummy = dxil_module_get_int_const(&ctx->mod, 0, bitsize);
nir_foreach_phi_src(src, ir) {
for(unsigned int i = 0; i < ir->dest.ssa.num_components; ++i)
store_ssa_def(ctx, src->src.ssa, i, dummy);
}
}
}
@@ -5163,7 +5161,7 @@ emit_function(struct ntd_context *ctx, nir_function *func)
if (!ctx->phis)
return false;
prepare_phi_values(ctx);
prepare_phi_values(ctx, impl);
if (!emit_scratch(ctx))
return false;