docs/relnotes/19.2.8: Add SHA256 sum

VERSION: bump for 19.2.8
docs: add relnotes for 19.2.8
2019-12-18 11:23:15 -08:00 · 2019-12-18 11:02:09 -08:00 · 2019-12-18 11:01:53 -08:00 · 2019-12-17 09:10:49 -08:00 · 2019-12-17 09:08:49 -08:00 · 2019-12-16 16:09:30 -08:00
18 changed files with 295 additions and 106 deletions
--- a/2
+++ b/2
@@ -1 +1 @@
-19.2.7
+19.2.8
--- a/bin/.cherry-ignore
+++ b/bin/.cherry-ignore
@@ -27,6 +27,8 @@ bcd9224728dcb8d8fe4bcddc4bd9b2c36fcfe9dd
 869e32593a9096b845dd6106f8f86e1c41fac968
 a2c3c65a31de90fdb55f76f2894860dfbafe2043
 bb0c5c487e63e88acbb792f092dd8f392bad8540
+937b9055698be0dfdb7d2e0673a989e2ecc05912
+21376cffb37018160ad3eef38b5a640ba1675a4f

 # This is reverted shortly after it was landed
 4432a2d14d80081d062f7939a950d65ea3a16eed
@@ -35,6 +37,15 @@ bb0c5c487e63e88acbb792f092dd8f392bad8540
 1a05811936dd8d0c3a367c6f00629624ef39d537
 911a8261419f48dcd756f78832fa5a5f4c5b8d93

-# This was manuall backported
+# This was manually backported
 2afeed301010917c4eae55dcd2544f9d329df934
 4b392ced2d744fccffe95490ff57e6b41033c266
+
+# This is not being backported to 19.2 due to causing build regressions for
+# downstream projects
+eaf43966027cf9654e91ca57aecc8f5a65b58f49
+
+# Invalid sha warnings
+023282a4f667695ea1dbbe9fbe1cd3a9d550a426
+2fca325ea65f068043d4c18c9cd0fe7f25bde8f7
+7564c5fc6d79a2ddec49a19f67183fb3be799fe5
--- a/docs/relnotes/19.2.7.html
+++ b/docs/relnotes/19.2.7.html
@@ -36,7 +36,7 @@ depends on the particular driver being used.

 <h2>SHA256 checksum</h2>
 <pre>
-TBD.
+    e3799fb7896fd9ed2f90f651fb907b95cdebfbd494968ff116e6bf1be143579e  mesa-19.2.7.tar.xz
 </pre>


--- a/docs/relnotes/19.2.8.html
+++ b/docs/relnotes/19.2.8.html
@@ -0,0 +1,108 @@
+
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html lang="en">
+<head>
+<meta http-equiv="content-type" content="text/html; charset=utf-8">
+<title>Mesa Release Notes</title>
+<link rel="stylesheet" type="text/css" href="../mesa.css">
+</head>
+<body>
+
+<div class="header">
+<h1>The Mesa 3D Graphics Library</h1>
+</div>
+
+<iframe src="../contents.html"></iframe>
+<div class="content">
+
+<h1>Mesa 19.2.8 Release Notes / 2019-12-18</h1>
+
+<p>
+    Mesa 19.2.8 is a bug fix release which fixes bugs found since the 19.2.7 release.
+</p>
+<p>
+Mesa 19.2.8 implements the OpenGL 4.5 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 4.5. OpenGL
+4.5 is <strong>only</strong> available if requested at context creation.
+Compatibility contexts may report a lower version depending on each driver.
+</p>
+<p>
+Mesa 19.2.8 implements the Vulkan 1.1 API, but the version reported by
+the apiVersion property of the VkPhysicalDeviceProperties struct
+depends on the particular driver being used.
+</p>
+
+<h2>SHA256 checksum</h2>
+<pre>
+    cffa8fa755c7422ce014c39ca0b770a092d9e0bbae537ceb2609c106916e5a57  mesa-19.2.8.tar.xz
+</pre>
+
+
+<h2>New features</h2>
+
+<ul>
+    <li>None</li>
+</ul>
+
+<h2>Bug fixes</h2>
+
+<ul>
+    <li>i965/iris: assert when destroy GL context with active query</li>
+</ul>
+
+<h2>Changes</h2>
+
+<ul>
+    <p>Alyssa Rosenzweig (1):</p>
+    <li>      gallium/util: Support POLYGON in u_stream_outputs_for_vertices</li>
+    <p></p>
+    <p>Bas Nieuwenhuizen (2):</p>
+    <li>      amd/common: Always use addrlib for HTILE tc-compat.</li>
+    <li>      amd/common: Fix tcCompatible degradation on Stoney.</li>
+    <p></p>
+    <p>Dylan Baker (4):</p>
+    <li>      docs: Add SHA256 sums for 19.2.7</li>
+    <li>      meson/broadcom: libbroadcom_cle needs expat headers</li>
+    <li>      meson/broadcom: libbroadcom_cle also needs zlib</li>
+    <li>      cherry-ignore: Update for 19.2.8</li>
+    <p></p>
+    <p>Gert Wollny (1):</p>
+    <li>      virgl: Increase the shader transfer buffer by doubling the size</li>
+    <p></p>
+    <p>Iván Briano (1):</p>
+    <li>      anv: Export filter_minmax support only when it&#x27;s really supported</li>
+    <p></p>
+    <p>Jason Ekstrand (2):</p>
+    <li>      anv: Re-emit all compute state on pipeline switch</li>
+    <li>      anv: Don&#x27;t leak when set_tiling fails</li>
+    <p></p>
+    <p>Kenneth Graunke (1):</p>
+    <li>      iris: Default to X-tiling for scanout buffers without modifiers</li>
+    <p></p>
+    <p>Lionel Landwerlin (7):</p>
+    <li>      intel/perf: fix invalid hw_id in query results</li>
+    <li>      intel/perf: set read buffer len to 0 to identify empty buffer</li>
+    <li>      intel/perf: take into account that reports read can be fairly old</li>
+    <li>      intel/perf: simplify the processing of OA reports</li>
+    <li>      intel/perf: fix improper pointer access</li>
+    <li>      anv: fix fence underlying primitive checks</li>
+    <li>      mesa: avoid triggering assert in implementation</li>
+    <p></p>
+    <p>Nanley Chery (2):</p>
+    <li>      gallium/dri2: Fix creation of multi-planar modifier images</li>
+    <li>      gallium: Store the image format in winsys_handle</li>
+    <p></p>
+    <p>Rob Clark (1):</p>
+    <li>      nir/lower_clip: Fix incorrect driver loc for clipdist outputs</li>
+    <p></p>
+    <p>Timothy Arceri (1):</p>
+    <li>      glsl/nir: iterate the system values list when adding varyings</li>
+    <p></p>
+    <p></p>
+</ul>
+
+</div>
+</body>
+</html>
--- a/src/amd/common/ac_surface.c
+++ b/src/amd/common/ac_surface.c
@@ -343,7 +343,7 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib,
 	    AddrSurfInfoIn->flags.depth &&
 	    surf_level->mode == RADEON_SURF_MODE_2D &&
 	    level == 0) {
-		AddrHtileIn->flags.tcCompatible = AddrSurfInfoIn->flags.tcCompatible;
+		AddrHtileIn->flags.tcCompatible = AddrSurfInfoOut->tcCompatible;
 		AddrHtileIn->pitch = AddrSurfInfoOut->pitch;
 		AddrHtileIn->height = AddrSurfInfoOut->height;
 		AddrHtileIn->numSlices = AddrSurfInfoOut->depth;
@@ -777,19 +777,12 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
 			if (level > 0)
 				continue;

-			/* Check that we actually got a TC-compatible HTILE if
-			 * we requested it (only for level 0, since we're not
-			 * supporting HTILE on higher mip levels anyway). */
-			assert(AddrSurfInfoOut.tcCompatible ||
-			       !AddrSurfInfoIn.flags.tcCompatible ||
-			       AddrSurfInfoIn.flags.matchStencilTileCfg);
+			if (!AddrSurfInfoOut.tcCompatible) {
+				AddrSurfInfoIn.flags.tcCompatible = 0;
+				surf->flags &= ~RADEON_SURF_TC_COMPATIBLE_HTILE;
+			}

 			if (AddrSurfInfoIn.flags.matchStencilTileCfg) {
-				if (!AddrSurfInfoOut.tcCompatible) {
-					AddrSurfInfoIn.flags.tcCompatible = 0;
-					surf->flags &= ~RADEON_SURF_TC_COMPATIBLE_HTILE;
-				}
-
 				AddrSurfInfoIn.flags.matchStencilTileCfg = 0;
 				AddrSurfInfoIn.tileIndex = AddrSurfInfoOut.tileIndex;
 				stencil_tile_idx = AddrSurfInfoOut.stencilTileIdx;
--- a/src/broadcom/cle/meson.build
+++ b/src/broadcom/cle/meson.build
@@ -58,6 +58,6 @@ libbroadcom_cle = static_library(
  'v3d_decoder.c',
  include_directories : [inc_common, inc_broadcom],
  c_args : [c_vis_args, no_override_init_args],
-  dependencies : [dep_libdrm, dep_valgrind],
+  dependencies : [dep_libdrm, dep_valgrind, dep_expat, dep_zlib],
  build_by_default : false,
 )
--- a/src/compiler/glsl/gl_nir_linker.c
+++ b/src/compiler/glsl/gl_nir_linker.c
@@ -34,32 +34,11 @@
 */

 static bool
-add_interface_variables(const struct gl_context *cts,
-                        struct gl_shader_program *prog,
-                        struct set *resource_set,
-                        unsigned stage, GLenum programInterface)
+add_vars_from_list(const struct gl_context *ctx,
+                   struct gl_shader_program *prog, struct set *resource_set,
+                   const struct exec_list *var_list, unsigned stage,
+                   GLenum programInterface)
 {
-   const struct exec_list *var_list = NULL;
-
-   struct gl_linked_shader *sh = prog->_LinkedShaders[stage];
-   if (!sh)
-      return true;
-
-   nir_shader *nir = sh->Program->nir;
-   assert(nir);
-
-   switch (programInterface) {
-   case GL_PROGRAM_INPUT:
-      var_list = &nir->inputs;
-      break;
-   case GL_PROGRAM_OUTPUT:
-      var_list = &nir->outputs;
-      break;
-   default:
-      assert("!Should not get here");
-      break;
-   }
-
   nir_foreach_variable(var, var_list) {
      if (var->data.how_declared == nir_var_hidden)
         continue;
@@ -108,6 +87,38 @@ add_interface_variables(const struct gl_context *cts,
   return true;
 }

+static bool
+add_interface_variables(const struct gl_context *ctx,
+                        struct gl_shader_program *prog,
+                        struct set *resource_set,
+                        unsigned stage, GLenum programInterface)
+{
+   struct gl_linked_shader *sh = prog->_LinkedShaders[stage];
+   if (!sh)
+      return true;
+
+   nir_shader *nir = sh->Program->nir;
+   assert(nir);
+
+   switch (programInterface) {
+   case GL_PROGRAM_INPUT: {
+      bool result = add_vars_from_list(ctx, prog, resource_set,
+                                       &nir->inputs, stage, programInterface);
+      result &= add_vars_from_list(ctx, prog, resource_set, &nir->system_values,
+                                   stage, programInterface);
+      return result;
+   }
+   case GL_PROGRAM_OUTPUT:
+      return add_vars_from_list(ctx, prog, resource_set, &nir->outputs, stage,
+                                programInterface);
+   default:
+      assert("!Should not get here");
+      break;
+   }
+
+   return false;
+}
+
 /* TODO: as we keep adding features, this method is becoming more and more
 * similar to its GLSL counterpart at linker.cpp. Eventually it would be good
 * to check if they could be refactored, and reduce code duplication somehow
--- a/src/compiler/nir/nir_lower_clip.c
+++ b/src/compiler/nir/nir_lower_clip.c
@@ -262,6 +262,17 @@ nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables, bool use_vars)
   if (!ucp_enables)
      return false;

+   /* find clipvertex/position outputs: */
+   nir_foreach_variable(var, &shader->outputs) {
+      int loc = var->data.driver_location;
+
+      /* keep track of last used driver-location.. we'll be
+       * appending CLIP_DIST0/CLIP_DIST1 after last existing
+       * output:
+       */
+      maxloc = MAX2(maxloc, loc);
+   }
+
   nir_builder_init(&b, impl);

   /* NIR should ensure that, even in case of loops/if-else, there
--- a/src/gallium/auxiliary/util/u_prim.h
+++ b/src/gallium/auxiliary/util/u_prim.h
@@ -338,7 +338,14 @@ u_stream_outputs_for_vertices(enum pipe_prim_type primitive, unsigned nr)
   /* Extraneous vertices don't contribute to stream outputs */
   u_trim_pipe_prim(primitive, &nr);

-   /* Consider how many primitives are actually generated */
+   /* Polygons are special, since they are a single primitive with many
+    * vertices. In this case, we just have an output for each vertex (after
+    * trimming) */
+
+   if (primitive == PIPE_PRIM_POLYGON)
+      return nr;
+
+   /* Normally, consider how many primitives are actually generated */
   unsigned prims = u_decomposed_prims_for_vertices(primitive, nr);

   /* One output per vertex after decomposition */
--- a/src/gallium/drivers/iris/iris_resource.c
+++ b/src/gallium/drivers/iris/iris_resource.c
@@ -739,6 +739,8 @@ iris_resource_create_with_modifiers(struct pipe_screen *pscreen,
      if (templ->usage == PIPE_USAGE_STAGING ||
          templ->bind & (PIPE_BIND_LINEAR | PIPE_BIND_CURSOR) )
         tiling_flags = ISL_TILING_LINEAR_BIT;
+      else if (templ->bind & PIPE_BIND_SCANOUT)
+         tiling_flags = ISL_TILING_X_BIT;
   }

   isl_surf_usage_flags_t usage = pipe_bind_to_isl_usage(templ->bind);
--- a/src/gallium/drivers/virgl/virgl_encode.c
+++ b/src/gallium/drivers/virgl/virgl_encode.c
@@ -492,12 +492,13 @@ int virgl_encode_shader_state(struct virgl_context *ctx,
         if (virgl_debug & VIRGL_DEBUG_VERBOSE)
            debug_printf("Failed to translate shader in available space - trying again\n");
         old_size = str_total_size;
-         str_total_size = 65536 * ++retry_size;
+         str_total_size = 65536 * retry_size;
+         retry_size *= 2;
         str = REALLOC(str, old_size, str_total_size);
         if (!str)
            return -1;
      }
-   } while (bret == false && retry_size < 10);
+   } while (bret == false && retry_size < 1024);

   if (bret == false)
      return -1;
--- a/src/gallium/include/state_tracker/winsys_handle.h
+++ b/src/gallium/include/state_tracker/winsys_handle.h
@@ -49,6 +49,12 @@ struct winsys_handle
    */
   unsigned offset;

+   /**
+    * Input to resource_from_handle.
+    * Output from resource_get_handle.
+    */
+   uint64_t format;
+
   /**
    * Input to resource_from_handle.
    * Output from resource_get_handle.
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -529,6 +529,7 @@ dri2_allocate_textures(struct dri_context *ctx,
         whandle.handle = buf->name;
         whandle.stride = buf->pitch;
         whandle.offset = 0;
+         whandle.format = format;
         whandle.modifier = DRM_FORMAT_MOD_INVALID;
         if (screen->can_share_buffer)
            whandle.type = WINSYS_HANDLE_TYPE_SHARED;
@@ -759,18 +760,12 @@ dri2_create_image_from_winsys(__DRIscreen *_screen,
   for (i = num_handles - 1; i >= 0; i--) {
      struct pipe_resource *tex;

-      if (whandle[i].modifier == DRM_FORMAT_MOD_INVALID) {
-         templ.width0 = width >> map->planes[i].width_shift;
-         templ.height0 = height >> map->planes[i].height_shift;
-         if (is_yuv)
-            templ.format = dri2_get_pipe_format_for_dri_format(map->planes[i].dri_format);
-         else
-            templ.format = map->pipe_format;
-      } else {
-         templ.width0 = width;
-         templ.height0 = height;
+      templ.width0 = width >> map->planes[i].width_shift;
+      templ.height0 = height >> map->planes[i].height_shift;
+      if (is_yuv)
+         templ.format = dri2_get_pipe_format_for_dri_format(map->planes[i].dri_format);
+      else
         templ.format = map->pipe_format;
-      }
      assert(templ.format != PIPE_FORMAT_NONE);

      tex = pscreen->resource_from_handle(pscreen,
@@ -808,6 +803,7 @@ dri2_create_image_from_name(__DRIscreen *_screen,
   memset(&whandle, 0, sizeof(whandle));
   whandle.type = WINSYS_HANDLE_TYPE_SHARED;
   whandle.handle = name;
+   whandle.format = map->pipe_format;
   whandle.modifier = DRM_FORMAT_MOD_INVALID;

   whandle.stride = pitch * util_format_get_blocksize(map->pipe_format);
@@ -826,8 +822,13 @@ dri2_create_image_from_name(__DRIscreen *_screen,
 }

 static unsigned
-dri2_get_modifier_num_planes(uint64_t modifier)
+dri2_get_modifier_num_planes(uint64_t modifier, int fourcc)
 {
+   const struct dri2_format_mapping *map = dri2_get_mapping_by_fourcc(fourcc);
+
+   if (!map)
+      return 0;
+
   switch (modifier) {
   case I915_FORMAT_MOD_Y_TILED_CCS:
      return 2;
@@ -849,8 +850,8 @@ dri2_get_modifier_num_planes(uint64_t modifier)
   /* FD_FORMAT_MOD_QCOM_TILED is not in drm_fourcc.h */
   case I915_FORMAT_MOD_X_TILED:
   case I915_FORMAT_MOD_Y_TILED:
-      return 1;
   case DRM_FORMAT_MOD_INVALID:
+      return map->nplanes;
   default:
      return 0;
   }
@@ -868,15 +869,13 @@ dri2_create_image_from_fd(__DRIscreen *_screen,
   __DRIimage *img = NULL;
   unsigned err = __DRI_IMAGE_ERROR_SUCCESS;
   int i, expected_num_fds;
-   uint64_t mod_planes = dri2_get_modifier_num_planes(modifier);
+   int num_handles = dri2_get_modifier_num_planes(modifier, fourcc);

-   if (!map || (modifier != DRM_FORMAT_MOD_INVALID && mod_planes == 0)) {
+   if (!map || num_handles == 0) {
      err = __DRI_IMAGE_ERROR_BAD_MATCH;
      goto exit;
   }

-   int num_handles = mod_planes > 0 ? mod_planes : map->nplanes;
-
   switch (fourcc) {
   case __DRI_IMAGE_FOURCC_YUYV:
   case __DRI_IMAGE_FOURCC_UYVY:
@@ -896,7 +895,7 @@ dri2_create_image_from_fd(__DRIscreen *_screen,

   for (i = 0; i < num_handles; i++) {
      int fdnum = i >= num_fds ? 0 : i;
-      int index = mod_planes > 0 ? i : map->planes[i].buffer_index;
+      int index = i >= map->nplanes ? i : map->planes[i].buffer_index;
      if (fds[fdnum] < 0) {
         err = __DRI_IMAGE_ERROR_BAD_ALLOC;
         goto exit;
@@ -906,6 +905,7 @@ dri2_create_image_from_fd(__DRIscreen *_screen,
      whandles[i].handle = (unsigned)fds[fdnum];
      whandles[i].stride = (unsigned)strides[index];
      whandles[i].offset = (unsigned)offsets[index];
+      whandles[i].format = map->pipe_format;
      whandles[i].modifier = modifier;
      whandles[i].plane = index;
   }
@@ -1296,6 +1296,7 @@ dri2_from_names(__DRIscreen *screen, int width, int height, int format,
   whandle.handle = names[0];
   whandle.stride = strides[0];
   whandle.offset = offsets[0];
+   whandle.format = map->pipe_format;
   whandle.modifier = DRM_FORMAT_MOD_INVALID;

   img = dri2_create_image_from_winsys(screen, width, height, map,
@@ -1393,7 +1394,7 @@ dri2_query_dma_buf_format_modifier_attribs(__DRIscreen *_screen,
 {
   switch (attrib) {
   case __DRI_IMAGE_FORMAT_MODIFIER_ATTRIB_PLANE_COUNT: {
-      uint64_t mod_planes = dri2_get_modifier_num_planes(modifier);
+      uint64_t mod_planes = dri2_get_modifier_num_planes(modifier, fourcc);
      if (mod_planes > 0)
         *value = mod_planes;
      return mod_planes > 0;
--- a/src/intel/perf/gen_perf.c
+++ b/src/intel/perf/gen_perf.c
@@ -69,6 +69,8 @@
 #define MAP_READ  (1 << 0)
 #define MAP_WRITE (1 << 1)

+#define OA_REPORT_INVALID_CTX_ID (0xffffffff)
+
 /**
 * Periodic OA samples are read() into these buffer structures via the
 * i915 perf kernel interface and appended to the
@@ -997,7 +999,9 @@ query_result_accumulate(struct gen_perf_query_result *result,
 {
   int i, idx = 0;

-   result->hw_id = start[2];
+   if (result->hw_id == OA_REPORT_INVALID_CTX_ID &&
+       start[2] != OA_REPORT_INVALID_CTX_ID)
+      result->hw_id = start[2];
   result->reports_accumulated++;

   switch (query->oa_format) {
@@ -1035,7 +1039,7 @@ static void
 query_result_clear(struct gen_perf_query_result *result)
 {
   memset(result, 0, sizeof(*result));
-   result->hw_id = 0xffffffff; /* invalid */
+   result->hw_id = OA_REPORT_INVALID_CTX_ID; /* invalid */
 }

 static void
@@ -1316,8 +1320,8 @@ get_free_sample_buf(struct gen_perf_context *perf_ctx)

      exec_node_init(&buf->link);
      buf->refcount = 0;
-      buf->len = 0;
   }
+   buf->len = 0;

   return buf;
 }
@@ -1834,7 +1838,8 @@ read_oa_samples_until(struct gen_perf_context *perf_ctx,
      exec_list_get_tail(&perf_ctx->sample_buffers);
   struct oa_sample_buf *tail_buf =
      exec_node_data(struct oa_sample_buf, tail_node, link);
-   uint32_t last_timestamp = tail_buf->last_timestamp;
+   uint32_t last_timestamp =
+      tail_buf->len == 0 ? start_timestamp : tail_buf->last_timestamp;

   while (1) {
      struct oa_sample_buf *buf = get_free_sample_buf(perf_ctx);
@@ -1849,12 +1854,13 @@ read_oa_samples_until(struct gen_perf_context *perf_ctx,
         exec_list_push_tail(&perf_ctx->free_sample_buffers, &buf->link);

         if (len < 0) {
-            if (errno == EAGAIN)
-               return ((last_timestamp - start_timestamp) >=
+            if (errno == EAGAIN) {
+               return ((last_timestamp - start_timestamp) < INT32_MAX &&
+                       (last_timestamp - start_timestamp) >=
                       (end_timestamp - start_timestamp)) ?
                      OA_READ_STATUS_FINISHED :
                      OA_READ_STATUS_UNFINISHED;
-            else {
+            } else {
               DBG("Error reading i915 perf samples: %m\n");
            }
         } else
@@ -2070,6 +2076,17 @@ discard_all_queries(struct gen_perf_context *perf_ctx)
   }
 }

+/* Looks for the validity bit of context ID (dword 2) of an OA report. */
+static bool
+oa_report_ctx_id_valid(const struct gen_device_info *devinfo,
+                       const uint32_t *report)
+{
+   assert(devinfo->gen >= 8);
+   if (devinfo->gen == 8)
+      return (report[0] & (1 << 25)) != 0;
+   return (report[0] & (1 << 16)) != 0;
+}
+
 /**
 * Accumulate raw OA counter values based on deltas between pairs of
 * OA reports.
@@ -2097,7 +2114,7 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
   uint32_t *last;
   uint32_t *end;
   struct exec_node *first_samples_node;
-   bool in_ctx = true;
+   bool last_report_ctx_match = true;
   int out_duration = 0;

   assert(query->oa.map != NULL);
@@ -2126,7 +2143,7 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
   first_samples_node = query->oa.samples_head->next;

   foreach_list_typed_from(struct oa_sample_buf, buf, link,
-                           &perf_ctx.sample_buffers,
+                           &perf_ctx->sample_buffers,
                           first_samples_node)
   {
      int offset = 0;
@@ -2143,6 +2160,7 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
         switch (header->type) {
         case DRM_I915_PERF_RECORD_SAMPLE: {
            uint32_t *report = (uint32_t *)(header + 1);
+            bool report_ctx_match = true;
            bool add = true;

            /* Ignore reports that come before the start marker.
@@ -2171,35 +2189,30 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
             * of OA counters while any other context is acctive.
             */
            if (devinfo->gen >= 8) {
-               if (in_ctx && report[2] != query->oa.result.hw_id) {
-                  DBG("i915 perf: Switch AWAY (observed by ID change)\n");
-                  in_ctx = false;
+               /* Consider that the current report matches our context only if
+                * the report says the report ID is valid.
+                */
+               report_ctx_match = oa_report_ctx_id_valid(devinfo, report) &&
+                  report[2] == start[2];
+               if (report_ctx_match)
                  out_duration = 0;
-               } else if (in_ctx == false && report[2] == query->oa.result.hw_id) {
-                  DBG("i915 perf: Switch TO\n");
-                  in_ctx = true;
-
-                  /* From experimentation in IGT, we found that the OA unit
-                   * might label some report as "idle" (using an invalid
-                   * context ID), right after a report for a given context.
-                   * Deltas generated by those reports actually belong to the
-                   * previous context, even though they're not labelled as
-                   * such.
-                   *
-                   * We didn't *really* Switch AWAY in the case that we e.g.
-                   * saw a single periodic report while idle...
-                   */
-                  if (out_duration >= 1)
-                     add = false;
-               } else if (in_ctx) {
-                  assert(report[2] == query->oa.result.hw_id);
-                  DBG("i915 perf: Continuation IN\n");
-               } else {
-                  assert(report[2] != query->oa.result.hw_id);
-                  DBG("i915 perf: Continuation OUT\n");
-                  add = false;
+               else
                  out_duration++;
-               }
+
+               /* Only add the delta between <last, report> if the last report
+                * was clearly identified as our context, or if we have at most
+                * 1 report without a matching ID.
+                *
+                * The OA unit will sometimes label reports with an invalid
+                * context ID when i915 rewrites the execlist submit register
+                * with the same context as the one currently running. This
+                * happens when i915 wants to notify the HW of ringbuffer tail
+                * register update. We have to consider this report as part of
+                * our context as the 3d pipeline behind the OACS unit is still
+                * processing the operations started at the previous execlist
+                * submission.
+                */
+               add = last_report_ctx_match && out_duration < 2;
            }

            if (add) {
@@ -2208,6 +2221,7 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
            }

            last = report;
+            last_report_ctx_match = report_ctx_match;

            break;
         }
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1640,7 +1640,7 @@ void anv_GetPhysicalDeviceProperties2(
         VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT *properties =
            (VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT *)ext;
         properties->filterMinmaxImageComponentMapping = pdevice->info.gen >= 9;
-         properties->filterMinmaxSingleComponentFormats = true;
+         properties->filterMinmaxSingleComponentFormats = pdevice->info.gen >= 9;
         break;
      }

@@ -3098,9 +3098,10 @@ VkResult anv_AllocateMemory(
                                      i915_tiling);
         if (ret) {
            anv_bo_cache_release(device, &device->bo_cache, mem->bo);
-            return vk_errorf(device->instance, NULL,
-                             VK_ERROR_OUT_OF_DEVICE_MEMORY,
-                             "failed to set BO tiling: %m");
+            result = vk_errorf(device->instance, NULL,
+                               VK_ERROR_OUT_OF_DEVICE_MEMORY,
+                               "failed to set BO tiling: %m");
+            goto fail;
         }
      }
   }
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -681,7 +681,11 @@ anv_wait_for_fences(struct anv_device *device,
   if (fenceCount <= 1 || waitAll) {
      for (uint32_t i = 0; i < fenceCount; i++) {
         ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
-         switch (fence->permanent.type) {
+         struct anv_fence_impl *impl =
+            fence->temporary.type != ANV_FENCE_TYPE_NONE ?
+            &fence->temporary : &fence->permanent;
+
+         switch (impl->type) {
         case ANV_FENCE_TYPE_BO:
            result = anv_wait_for_bo_fences(device, 1, &pFences[i],
                                            true, abs_timeout);
@@ -716,7 +720,10 @@ static bool anv_all_fences_syncobj(uint32_t fenceCount, const VkFence *pFences)
 {
   for (uint32_t i = 0; i < fenceCount; ++i) {
      ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
-      if (fence->permanent.type != ANV_FENCE_TYPE_SYNCOBJ)
+      struct anv_fence_impl *impl =
+         fence->temporary.type != ANV_FENCE_TYPE_NONE ?
+         &fence->temporary : &fence->permanent;
+      if (impl->type != ANV_FENCE_TYPE_SYNCOBJ)
         return false;
   }
   return true;
@@ -726,7 +733,10 @@ static bool anv_all_fences_bo(uint32_t fenceCount, const VkFence *pFences)
 {
   for (uint32_t i = 0; i < fenceCount; ++i) {
      ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
-      if (fence->permanent.type != ANV_FENCE_TYPE_BO)
+      struct anv_fence_impl *impl =
+         fence->temporary.type != ANV_FENCE_TYPE_NONE ?
+         &fence->temporary : &fence->permanent;
+      if (impl->type != ANV_FENCE_TYPE_BO)
         return false;
   }
   return true;
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3803,6 +3803,13 @@ genX(flush_pipeline_select)(struct anv_cmd_buffer *cmd_buffer,
         vfe.NumberofURBEntries     = 2;
         vfe.URBEntryAllocationSize = 2;
      }
+
+      /* We just emitted a dummy MEDIA_VFE_STATE so now that packet is
+       * invalid. Set the compute pipeline to dirty to force a re-emit of the
+       * pipeline in case we get back-to-back dispatch calls with the same
+       * pipeline and a PIPELINE_SELECT in between.
+       */
+      cmd_buffer->state.compute.pipeline_dirty = true;
   }
 #endif

--- a/src/mesa/main/performance_query.c
+++ b/src/mesa/main/performance_query.c
@@ -48,6 +48,12 @@ free_performance_query(GLuint key, void *data, void *user)
   struct gl_perf_query_object *m = data;
   struct gl_context *ctx = user;

+   /* Don't confuse the implementation by deleting an active query. We can
+    * toggle Active/Used to false because we're tearing down the GL context
+    * and it's already idle (see _mesa_free_context_data).
+    */
+   m->Active = false;
+   m->Used = false;
   ctx->Driver.DeletePerfQuery(ctx, m);
 }
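
The `read_oa_samples_until` hunk in `src/intel/perf/gen_perf.c` above adds an `INT32_MAX` guard to the timestamp comparison. The following is a minimal sketch of why that guard matters. It assumes the timestamps are free-running 32-bit counters that wrap modulo 2^32. The function names are hypothetical and do not exist in Mesa.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the pre-patch comparison: any report whose
 * wrapped distance from start is large enough counts as "past the end". */
static bool
reached_end_unguarded(uint32_t start, uint32_t end, uint32_t last)
{
   return (last - start) >= (end - start);
}

/* Hypothetical helper mirroring the patched comparison: a wrapped distance
 * of INT32_MAX or more is treated as "last is older than start". */
static bool
reached_end_guarded(uint32_t start, uint32_t end, uint32_t last)
{
   uint32_t elapsed = last - start; /* unsigned subtraction wraps modulo 2^32 */

   return elapsed < INT32_MAX && elapsed >= (end - start);
}
```

With `start = 1000`, `end = 2000` and a stale report at `last = 500`, the unguarded form reports completion because `500 - 1000` wraps to a very large unsigned value, while the guarded form does not. A genuine counter wrap (for example `start = 0xFFFFFF00`, `end = 0x100`, `last = 0x200`) is still handled by both forms.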
Author	SHA1	Message	Date
Dylan Baker	02fcd9d803	docs/relnotes/19.2.8: Add SHA256 sum	2019-12-18 11:23:15 -08:00
Dylan Baker	34896d2299	VERSION: bump for 19.2.8	2019-12-18 11:02:09 -08:00
Dylan Baker	1743dec475	docs: add relnotes for 19.2.8	2019-12-18 11:01:53 -08:00
Lionel Landwerlin	6979d19fcb	mesa: avoid triggering assert in implementation When tearing down a GL context with an active performance query, the implementation can be confused by a query marked active when it's being deleted. This shouldn't happen in the implementation because the context will already be idle. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2235 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115> (cherry picked from commit `2c8742ed85`)	2019-12-17 09:10:49 -08:00
Gert Wollny	f42f9bbcd6	virgl: Increase the shader transfer buffer by doubling the size With only linearly increasing the size of the shader transfer buffer the transfer of very large shaders may fail, so with each attempt double the size of the buffer. CTS: dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.48 for VTK-GL-CTS b5dcfb9c5 and newer virglrenderer bug: https://gitlab.freedesktop.org/virgl/virglrenderer/issues/150 Fixes: `a8987b88ff` virgl: add driver for virtio-gpu 3D (v2) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121> (cherry picked from commit `cffa7bb990`)	2019-12-17 09:08:49 -08:00
Dylan Baker	4244f4af88	cherry-ignore: Update for 19.2.8	2019-12-16 16:09:30 -08:00
Bas Nieuwenhuizen	ed9c1f7f42	amd/common: Fix tcCompatible degradation on Stoney. addrlib sometimes returns smaller sizes for tcCompat as it does not seem to take into account the depth+stencil matching config gymnastics with tcCompat. This fixes dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054> (cherry picked from commit `e197fb1c2f`) Conflicts resolved by Dylan Baker Conflicts: src/amd/common/ac_surface.c	2019-12-16 16:09:30 -08:00
Iván Briano	5d2d6442ff	anv: Export filter_minmax support only when it's really supported Fixes: `bea4d4c78c` ("anv: add VK_EXT_sampler_filter_minmax support") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071> (cherry picked from commit `0fd93b9589`)	2019-12-16 15:16:50 -08:00
Bas Nieuwenhuizen	e8913e62a6	amd/common: Always use addrlib for HTILE tc-compat. Even without depth+stencil addrlib can (correctly!) decide to disable tc compatible HTILE. One example is 8x sampling with 32-bit depth on Stoney. The row size on Stoney is 1024, while the tile size is 2048, which results in tile splits which are not supported with tc-compat. On Stoney, this fixes dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054> (cherry picked from commit `b53856aca3`)	2019-12-16 15:16:50 -08:00
Lionel Landwerlin	448b90a4a0	anv: fix fence underlying primitive checks We appear to have got lucky that the only type of temporary fence payload we could have was a syncobj and that would only happen when the type of the permanent payload was also a syncobj. This code was broken if that assumption changed and it did in commit `f9a3d9738b`. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ivan Briano <ivan.briano@intel.com> (cherry picked from commit `52bc235f2a`)	2019-12-16 15:16:49 -08:00
Kenneth Graunke	06b97d1e34	iris: Default to X-tiling for scanout buffers without modifiers Neither Mutter nor KWin's wayland compositors appear to use modifiers. In the non-modifier case, iris was still trying to use Y-tiling for scan-out surfaces, leading to this error: (gnome-shell:7247): mutter-WARNING **: 09:23:47.787: meta_drm_buffer_gbm_new failed: drmModeAddFB failed: Invalid argument We now fall back to the historical X-tiling for scanout buffers, which ought to work everywhere, at lower performance. To regain that, we need to ensure modifiers are actually supported in environments people use. Fixes: `fbf3124771` ("iris: Rework tiling/modifiers handling") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `dcb4230e5e`)	2019-12-16 15:16:49 -08:00
Jason Ekstrand	9925871a1a	anv: Don't leak when set_tiling fails Fixes: `a44744e01d` "anv: Require a dedicated allocation for..." Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `0a36fafa95`) Conflicts resolved by Dylan Baker Conflicts: src/intel/vulkan/anv_device.c	2019-12-16 15:16:49 -08:00
Dylan Baker	325ef15f26	meson/broadcom: libbroadcom_cle also needs zlib Fixes: `1ae8018a6a` ("meson: Add support for the vc4 driver.") Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `d0eebda990`)	2019-12-11 13:19:24 -08:00
Dylan Baker	5e100b6ba7	meson/broadcom: libbroadcom_cle needs expat headers Fixes: `1ae8018a6a` ("meson: Add support for the vc4 driver.") Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `85a9698ac3`)	2019-12-11 13:19:17 -08:00
Alyssa Rosenzweig	47c8b41f8f	gallium/util: Support POLYGON in u_stream_outputs_for_vertices u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is trivial to support as a special case directly (since we have the number of vertices directly). Fixes aborts in Panfrost in apps using GL_POLYGON. Fixes: `e881aa8c12` ("gallium/util: Add u_stream_outputs_for_vertices helper") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `a37822f5f7`)	2019-12-10 09:09:05 -08:00
Jason Ekstrand	af0d38bfde	anv: Re-emit all compute state on pipeline switch It's a very odd case to hit in the real world. However, there are some CTS tests which switch back and forth between dispatch and clear without changing the pipeline. Fixes: `bc612536eb` "anv: Emit a dummy MEDIA_VFE_STATE before switching..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit `0f60aa4037`)	2019-12-10 09:08:59 -08:00
Nanley Chery	f3507690f8	gallium: Store the image format in winsys_handle This format will be used to properly handle planar images with modifiers in iris. Fixes: `246eebba4a` ("iris: Export and import surfaces with modifiers that have aux data") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `51ee8fff9b`)	2019-12-10 09:08:41 -08:00
Nanley Chery	96b8f42611	gallium/dri2: Fix creation of multi-planar modifier images The commit noted below assumed and enforced that DRM_MOD_INVALID was the only valid modifier for multi-planar imported images. Due to that, it required that modifier on multi-planar images to: 1. Allow multiple planes. 2. Perform YUV format lowering and extent adjustments. 3. Use buffer_index to correctly map the given planes. Fix these issues by removing or updating the code built on that assumption. Fixes: `2066966c10` ("gallium/dri2: Support creating multi-planar modifier images") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `d5c857837a`)	2019-12-10 09:08:35 -08:00
Timothy Arceri	0a70ed2aa3	glsl/nir: iterate the system values list when adding varyings Iterate the system values list when adding varyings to the program resource list in the NIR linker. This is needed to avoid CTS regressions when using the NIR to build the GLSL resource list in an upcoming series. Presumably it also fixes a bug with the current ARB_gl_spirv support. Fixes: `ffdb44d3a0` ("nir/linker: Add inputs/outputs to the program resource list") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit `1abca2b3c8`)	2019-12-10 09:08:29 -08:00
Rob Clark	3742910ba2	nir/lower_clip: Fix incorrect driver loc for clipdist outputs Somehow adjusting maxloc based on existing outputs got lost, resulting in the clipdist varying clobbering the position varying. Causing a shader that had no position output in freedreno/ir3, which triggers GPU hangs in neverball. Fixes: `d0f746b645` ("nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (cherry picked from commit `372ed42d22`)	2019-12-04 14:44:25 -08:00
Lionel Landwerlin	21a4be582c	intel/perf: fix improper pointer access This expression was unused by the macro, probably why it didn't register in the compilation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `ddacd3d43b`)	2019-12-04 14:44:04 -08:00
Lionel Landwerlin	272c4f2711	intel/perf: simplify the processing of OA reports This is a more accurate description of what happens in processing the OA reports. Previously we only had a somewhat difficult to parse state machine tracking the context ID. What we really only need to do to decide if the delta between 2 reports (r0 & r1) should be accumulated in the query result is: * whether the r0 is tagged with the context ID relevant to us * if r0 is not tagged with our context ID and r1 is: does r0 have an invalid context id? If not then we're in a case where i915 has resubmitted the same context for execution through the execlist submission port v2: Update comment (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `8c0b058263`)	2019-12-04 14:43:58 -08:00
Lionel Landwerlin	d4bb049e98	intel/perf: take into account that reports read can be fairly old If we read the OA reports late enough after the query happens, we can get a timestamp in the report that is significantly in the past compared to the start timestamp of the query. The current code must deal with the wraparound of the timestamp value (every ~6 minutes). So consider that if the difference is greater than half that wraparound period, we're probably dealing with an old report and make the caller aware it should read more reports when they're available. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `b364e920bf`)	2019-12-04 14:43:51 -08:00
Lionel Landwerlin	21df9767ad	intel/perf: set read buffer len to 0 to identify empty buffer We always add an empty buffer in the list when creating the query. Let's set the len appropriately so that we can recognize it when we read OA reports up to the end of a query. We were using a 0 timestamp value associated with the empty buffer and incorrectly assuming this was a valid value. In turn that led to not reading enough reports and resulted in deltas added to our counter values which should have been discarded because those would be flagged for a different context. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `9d0a5c817c`)	2019-12-04 14:43:38 -08:00
Lionel Landwerlin	82177cdc1d	intel/perf: fix invalid hw_id in query results Accumulation happens between 2 reports, it can be between a start/end report from another context. So only consider updating the hw_id of the results when it's not already valid and that we have a valid value to put in there. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `41b54b5faf` ("i965: move OA accumulation code to intel/perf") Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `acea59dbf8`)	2019-12-04 14:43:29 -08:00
Dylan Baker	e5b4f23475	docs: Add SHA256 sums for 19.2.7	2019-12-04 14:36:13 -08:00
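
Note on `d4bb049e98` (intel/perf: take into account that reports read can be fairly old): the half-period rule described in that commit message can be sketched roughly as follows. This is an illustration only, not the Mesa code. The function name `is_stale_report`, the constant `PERIOD`, and the 32-bit counter width are assumptions made for the example; the commit message only states that the timestamp wraps around and that a difference larger than half the wraparound period indicates an old report.

```python
# Hypothetical sketch of a wraparound-aware "is this report old?" check.
# ASSUMPTION: the timestamp is a 32-bit counter; the real width is not
# stated in the commit message.
TIMESTAMP_BITS = 32
PERIOD = 1 << TIMESTAMP_BITS


def is_stale_report(report_ts, query_start_ts, period=PERIOD):
    """Return True when the report most likely predates the query start.

    The forward distance from the query start to the report is taken
    modulo the wraparound period. A distance larger than half the period
    is read as "the report is behind the start", not "far ahead of it".
    """
    forward_distance = (report_ts - query_start_ts) % period
    return forward_distance > period // 2
```

With this rule a report stamped shortly after a counter wrap is still treated as newer than a query that started just before the wrap, while a report stamped before the query start is flagged so the caller can read more reports when they become available.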
@@ -1 +1 @@
-19.2.7
+19.2.8