Compare commits

26 Commits

Author SHA1 Message Date
Dylan Baker
02fcd9d803 docs/relnotes/19.2.8: Add SHA256 sum 2019-12-18 11:23:15 -08:00
Dylan Baker
34896d2299 VERSION: bump for 19.2.8 2019-12-18 11:02:09 -08:00
Dylan Baker
1743dec475 docs: add relnotes for 19.2.8 2019-12-18 11:01:53 -08:00
Lionel Landwerlin
6979d19fcb mesa: avoid triggering assert in implementation
When tearing down a GL context with an active performance query, the
implementation can be confused by a query marked active when it's
being deleted.

This shouldn't happen in the implementation because the context will
already be idle.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2235
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>
(cherry picked from commit 2c8742ed85)
2019-12-17 09:10:49 -08:00
Gert Wollny
f42f9bbcd6 virgl: Increase the shader transfer buffer by doubling the size
With only linearly increasing the size of the shader transfer buffer
the transfer of very large shaders may fail, so with each attempt double
the size of the buffer.

CTS:
  dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.48
  for VTK-GL-CTS b5dcfb9c5 and newer

virglrenderer bug:
  https://gitlab.freedesktop.org/virgl/virglrenderer/issues/150

Fixes: a8987b88ff
    virgl: add driver for virtio-gpu 3D (v2)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>
(cherry picked from commit cffa7bb990)
2019-12-17 09:08:49 -08:00
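The doubling strategy described in the virgl commit above can be sketched in isolation. This is a minimal illustration, not the driver's code: `fill_with_doubling`, `fill_fn`, and `needs_bytes` are hypothetical names, and the size cap is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback: writes into buf, returns false if it does not fit. */
typedef bool (*fill_fn)(char *buf, size_t size, void *user);

/* Retry `fill` with a buffer that doubles on every failed attempt.
 * Returns a malloc'd buffer (caller frees) or NULL once the cap is hit. */
static char *
fill_with_doubling(fill_fn fill, void *user, size_t initial, size_t max_size,
                   size_t *out_size)
{
   assert(fill && out_size && initial > 0 && initial <= max_size);

   size_t size = initial;
   char *buf = malloc(size);

   while (buf) {
      if (fill(buf, size, user)) {
         *out_size = size;
         return buf;
      }
      if (size > max_size / 2)
         break;                 /* doubling again would exceed the cap */
      size *= 2;
      char *bigger = realloc(buf, size);
      if (!bigger)
         break;                 /* buf is still valid and is freed below */
      buf = bigger;
   }
   free(buf);
   return NULL;
}

/* Hypothetical producer that needs *user bytes of space. */
static bool
needs_bytes(char *buf, size_t size, void *user)
{
   size_t need = *(const size_t *)user;
   if (size < need)
      return false;
   memset(buf, 0, need);
   return true;
}
```

With linear growth the number of attempts scales with the shader size. With doubling it scales with the logarithm of the size, so a fixed retry budget covers much larger shaders.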
Dylan Baker
4244f4af88 cherry-ignore: Update for 19.2.8 2019-12-16 16:09:30 -08:00
Bas Nieuwenhuizen
ed9c1f7f42 amd/common: Fix tcCompatible degradation on Stoney.
addrlib sometimes returns smaller sizes for tcCompat as it does
not seem to take into account the depth+stencil matching config
gymnastics with tcCompat.

This fixes
dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
(cherry picked from commit e197fb1c2f)
Conflicts resolved by Dylan Baker

Conflicts:
	src/amd/common/ac_surface.c
2019-12-16 16:09:30 -08:00
Iván Briano
5d2d6442ff anv: Export filter_minmax support only when it's really supported
Fixes: bea4d4c78c ("anv: add VK_EXT_sampler_filter_minmax support")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
(cherry picked from commit 0fd93b9589)
2019-12-16 15:16:50 -08:00
Bas Nieuwenhuizen
e8913e62a6 amd/common: Always use addrlib for HTILE tc-compat.
Even without depth+stencil addrlib can (correctly!) decide to
disable tc compatible HTILE.

One example is 8x sampling with 32-bit depth on Stoney. The row size
on Stoney is 1024, while the tile size is 2048, which results in
tile splits which are not supported with tc-compat.

On Stoney, this fixes
dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
(cherry picked from commit b53856aca3)
2019-12-16 15:16:50 -08:00
Lionel Landwerlin
448b90a4a0 anv: fix fence underlying primitive checks
We appear to have got lucky that the only type of temporary fence
payload we could have was a syncobj and that would only happen when
the type of the permanent payload was also a syncobj.

This code was broken if that assumption changed and it did in commit
f9a3d9738b.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit 52bc235f2a)
2019-12-16 15:16:49 -08:00
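The fence fix above amounts to checking the payload that is actually in effect: the temporary one if present, otherwise the permanent one. A minimal sketch, assuming simplified stand-in types (`fence`, `fence_impl`, and the `FENCE_TYPE_*` values are hypothetical and only mirror the shape of the anv structures):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fence_type { FENCE_TYPE_NONE = 0, FENCE_TYPE_BO, FENCE_TYPE_SYNCOBJ };

struct fence_impl { enum fence_type type; };

struct fence {
   struct fence_impl permanent;
   struct fence_impl temporary;   /* FENCE_TYPE_NONE when not set */
};

/* A temporary payload, when set, takes precedence over the permanent one. */
static const struct fence_impl *
active_impl(const struct fence *f)
{
   assert(f);
   return f->temporary.type != FENCE_TYPE_NONE ? &f->temporary : &f->permanent;
}

/* True when every fence's active payload has the given type. */
static bool
all_fences_of_type(const struct fence *fences, size_t count, enum fence_type type)
{
   assert(fences || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (active_impl(&fences[i])->type != type)
         return false;
   }
   return true;
}
```

Looking only at `permanent.type` would misclassify a fence whose permanent payload is a BO but whose temporary payload is a syncobj.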
Kenneth Graunke
06b97d1e34 iris: Default to X-tiling for scanout buffers without modifiers
Neither Mutter nor KWin's wayland compositors appear to use modifiers.
In the non-modifier case, iris was still trying to use Y-tiling for
scan-out surfaces, leading to this error:

(gnome-shell:7247): mutter-WARNING **: 09:23:47.787: meta_drm_buffer_gbm_new failed: drmModeAddFB failed: Invalid argument

We now fall back to the historical X-tiling for scanout buffers, which
ought to work for everyone, at lower performance.  To regain that, we need
to ensure modifiers are actually supported in environments people use.

Fixes: fbf3124771 ("iris: Rework tiling/modifiers handling")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit dcb4230e5e)
2019-12-16 15:16:49 -08:00
Jason Ekstrand
9925871a1a anv: Don't leak when set_tiling fails
Fixes: a44744e01d "anv: Require a dedicated allocation for..."
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 0a36fafa95)
Conflicts resolved by Dylan Baker

Conflicts:
	src/intel/vulkan/anv_device.c
2019-12-16 15:16:49 -08:00
Dylan Baker
325ef15f26 meson/broadcom: libbroadcom_cle also needs zlib
Fixes: 1ae8018a6a
       ("meson: Add support for the vc4 driver.")
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d0eebda990)
2019-12-11 13:19:24 -08:00
Dylan Baker
5e100b6ba7 meson/broadcom: libbroadcom_cle needs expat headers
Fixes: 1ae8018a6a
       ("meson: Add support for the vc4 driver.")
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 85a9698ac3)
2019-12-11 13:19:17 -08:00
Alyssa Rosenzweig
47c8b41f8f gallium/util: Support POLYGON in u_stream_outputs_for_vertices
u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is
trivial to support as a special case directly (since we have the number
of vertices directly).

Fixes aborts in Panfrost in apps using GL_POLYGON.

Fixes: e881aa8c12 ("gallium/util: Add u_stream_outputs_for_vertices helper")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit a37822f5f7)
2019-12-10 09:09:05 -08:00
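The POLYGON special case from the commit above can be illustrated with a reduced model. This is a sketch under assumptions: the `prim` enum and `stream_outputs` are hypothetical stand-ins, not the gallium helpers, and only three primitive types are modelled.

```c
#include <assert.h>

enum prim { PRIM_TRIANGLES, PRIM_TRIANGLE_STRIP, PRIM_POLYGON };

/* Number of stream outputs produced for `nr` input vertices. */
static unsigned
stream_outputs(enum prim p, unsigned nr)
{
   switch (p) {
   case PRIM_POLYGON:
      /* One primitive with many vertices: one output per vertex,
       * provided there are enough vertices to form a polygon. */
      return nr >= 3 ? nr : 0;
   case PRIM_TRIANGLES:
      /* Extraneous vertices are trimmed; each triangle emits 3 outputs. */
      return (nr / 3) * 3;
   case PRIM_TRIANGLE_STRIP:
      /* A strip of nr vertices decomposes into nr - 2 triangles. */
      return nr >= 3 ? (nr - 2) * 3 : 0;
   }
   assert(!"unhandled primitive type");
   return 0;
}
```

The polygon branch returns early because it is not expressed as a count of decomposed primitives, which is the reason the commit gives for handling it directly.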
Jason Ekstrand
af0d38bfde anv: Re-emit all compute state on pipeline switch
It's a very odd case to hit in the real world.  However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.

Fixes: bc612536eb "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 0f60aa4037)
2019-12-10 09:08:59 -08:00
Nanley Chery
f3507690f8 gallium: Store the image format in winsys_handle
This format will be used to properly handle planar images with modifiers
in iris.

Fixes: 246eebba4a ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 51ee8fff9b)
2019-12-10 09:08:41 -08:00
Nanley Chery
96b8f42611 gallium/dri2: Fix creation of multi-planar modifier images
The commit noted below assumed and enforced that DRM_MOD_INVALID was the
only valid modifier for multi-planar imported images. Because of that, it
required that modifier on multi-planar images in order to:

   1. Allow multiple planes.
   2. Perform YUV format lowering and extent adjustments.
   3. Use buffer_index to correctly map the given planes.

Fix these issues by removing or updating the code built on that
assumption.

Fixes: 2066966c10 ("gallium/dri2: Support creating multi-planar modifier images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d5c857837a)
2019-12-10 09:08:35 -08:00
Timothy Arceri
0a70ed2aa3 glsl/nir: iterate the system values list when adding varyings
Iterate the system values list when adding varyings to the program
resource list in the NIR linker. This is needed to avoid CTS
regressions when using the NIR to build the GLSL resource list in
an upcoming series. Presumably it also fixes a bug with the current
ARB_gl_spirv support.

Fixes: ffdb44d3a0 ("nir/linker: Add inputs/outputs to the program resource list")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 1abca2b3c8)
2019-12-10 09:08:29 -08:00
Rob Clark
3742910ba2 nir/lower_clip: Fix incorrect driver loc for clipdist outputs
Somehow adjusting maxloc based on existing outputs got lost, resulting
in the clipdist varying clobbering the position varying.  This produced
a shader with no position output in freedreno/ir3, which triggers GPU
hangs in neverball.

Fixes: d0f746b645 ("nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
(cherry picked from commit 372ed42d22)
2019-12-04 14:44:25 -08:00
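The lower_clip fix above restores a scan over the existing outputs so that new varyings are appended after the last used driver location. A minimal sketch of that idea, assuming a flat array instead of NIR's variable list (`out_var` and `next_free_location` are hypothetical names):

```c
#include <assert.h>
#include <stddef.h>

struct out_var {
   const char *name;
   int driver_location;
};

/* Return the first driver location after all existing outputs. */
static int
next_free_location(const struct out_var *vars, size_t count)
{
   assert(vars || count == 0);

   int maxloc = -1;   /* no outputs seen yet */
   for (size_t i = 0; i < count; i++) {
      if (vars[i].driver_location > maxloc)
         maxloc = vars[i].driver_location;
   }
   return maxloc + 1;
}
```

Without the scan, the first appended output would land on location 0 and overwrite whatever already lives there, which matches the position clobbering the commit describes.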
Lionel Landwerlin
21a4be582c intel/perf: fix improper pointer access
This expression was unused by the macro, probably why it didn't
register in the compilation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ddacd3d43b)
2019-12-04 14:44:04 -08:00
Lionel Landwerlin
272c4f2711 intel/perf: simplify the processing of OA reports
This is a more accurate description of what happens in processing the
OA reports.

Previously we only had a somewhat difficult to parse state machine
tracking the context ID.

To decide whether the delta between 2 reports (r0 & r1) should be
accumulated in the query result, we only need to check:

   * whether r0 is tagged with the context ID relevant to us

   * if r0 is not tagged with our context ID and r1 is: does r0 have an
     invalid context id? If not then we're in a case where i915 has
     resubmitted the same context for execution through the execlist
     submission port

v2: Update comment (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8c0b058263)
2019-12-04 14:43:58 -08:00
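The decision described above can be sketched as a small per-report state update. This mirrors the condition used in the diff for this commit (`add = last_report_ctx_match && out_duration < 2`), but the types and names here (`ctx_tracker`, `should_accumulate`) are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct ctx_tracker {
   bool last_match;    /* did the previous report belong to our context? */
   int out_duration;   /* consecutive reports not matching our context */
};

/* Decide whether the delta between the previous report and this one
 * should be accumulated, then record this report's state. */
static bool
should_accumulate(struct ctx_tracker *t, bool id_valid,
                  uint32_t report_ctx_id, uint32_t our_ctx_id)
{
   assert(t);

   bool match = id_valid && report_ctx_id == our_ctx_id;
   if (match)
      t->out_duration = 0;
   else
      t->out_duration++;

   /* Accumulate only if the previous report was ours and we have seen
    * at most one report without a matching ID. */
   bool add = t->last_match && t->out_duration < 2;

   t->last_match = match;
   return add;
}
```

Two pieces of state replace the earlier in/out state machine: whether the previous report matched, and how many non-matching reports have been seen in a row.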
Lionel Landwerlin
d4bb049e98 intel/perf: take into account that reports read can be fairly old
If we read the OA reports late enough after the query happens, we can
get a timestamp in the report that is significantly in the past
compared to the start timestamp of the query. The current code must
deal with the wraparound of the timestamp value (every ~6 minutes). So
consider that if the difference is greater than half that wraparound
period, we're probably dealing with an old report and make the caller
aware it should read more reports when they're available.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b364e920bf)
2019-12-04 14:43:51 -08:00
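The half-period check described above relies on unsigned 32-bit subtraction being well defined modulo 2^32. A minimal sketch of the comparison, assuming 32-bit timestamps as in the diff (`reached_end` is a hypothetical helper name):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when `last` is at or past `end`, measured relative to `start`.
 * A delta of half the wraparound period or more is treated as a report
 * that predates `start`, so more reports still need to be read. */
static bool
reached_end(uint32_t start, uint32_t end, uint32_t last)
{
   uint32_t last_delta = last - start;   /* wraps modulo 2^32 */
   uint32_t end_delta = end - start;

   if (last_delta >= (uint32_t)INT32_MAX)
      return false;                      /* old report, keep reading */

   return last_delta >= end_delta;
}
```

A timestamp slightly before `start` wraps to a delta close to 2^32, which is why it has to be rejected before the comparison against `end_delta`.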
Lionel Landwerlin
21df9767ad intel/perf: set read buffer len to 0 to identify empty buffer
We always add an empty buffer in the list when creating the query.
Let's set the len appropriately so that we can recognize it when we
read OA reports up to the end of a query.

We were using a 0 timestamp value associated with the empty buffer
and incorrectly assuming this was a valid value. In turn that led to
not reading enough reports and resulted in deltas added to our counter
values which should have been discarded because those would be flagged
for a different context.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9d0a5c817c)
2019-12-04 14:43:38 -08:00
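The empty-buffer rule above can be shown in isolation: a zero length marks the placeholder buffer, whose timestamp must not be trusted. A sketch with a hypothetical reduced struct (`sample_buf` here carries only the two fields needed for the example):

```c
#include <assert.h>
#include <stdint.h>

struct sample_buf {
   int len;                   /* 0 identifies the empty placeholder buffer */
   uint32_t last_timestamp;   /* meaningful only when len > 0 */
};

/* Timestamp to resume reading from: fall back to the query's start
 * timestamp when the tail buffer holds no reports. */
static uint32_t
tail_timestamp(const struct sample_buf *tail, uint32_t start_timestamp)
{
   assert(tail);
   return tail->len == 0 ? start_timestamp : tail->last_timestamp;
}
```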
Lionel Landwerlin
82177cdc1d intel/perf: fix invalid hw_id in query results
Accumulation happens between 2 reports, which can be a start/end
report from another context. So only update the hw_id of the results
when it's not already valid and we have a valid value to put in
there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit acea59dbf8)
2019-12-04 14:43:29 -08:00
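The hw_id rule above is a write-once update guarded on both sides. A minimal sketch, assuming the same all-ones sentinel the diff introduces for an invalid context ID (`INVALID_CTX_ID` and `update_hw_id` are hypothetical names):

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_CTX_ID 0xffffffffu

/* Record the context ID only if none is stored yet and the report
 * carries a valid one. */
static void
update_hw_id(uint32_t *hw_id, uint32_t report_ctx_id)
{
   assert(hw_id);
   if (*hw_id == INVALID_CTX_ID && report_ctx_id != INVALID_CTX_ID)
      *hw_id = report_ctx_id;
}
```

Once a valid ID is stored, later reports from other contexts cannot overwrite it.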
Dylan Baker
e5b4f23475 docs: Add SHA256 sums for 19.2.7 2019-12-04 14:36:13 -08:00
18 changed files with 295 additions and 106 deletions

@@ -1 +1 @@
19.2.7
19.2.8

@@ -27,6 +27,8 @@ bcd9224728dcb8d8fe4bcddc4bd9b2c36fcfe9dd
869e32593a9096b845dd6106f8f86e1c41fac968
a2c3c65a31de90fdb55f76f2894860dfbafe2043
bb0c5c487e63e88acbb792f092dd8f392bad8540
937b9055698be0dfdb7d2e0673a989e2ecc05912
21376cffb37018160ad3eef38b5a640ba1675a4f
# This is reverted shortly after it was landed
4432a2d14d80081d062f7939a950d65ea3a16eed
@@ -35,6 +37,15 @@ bb0c5c487e63e88acbb792f092dd8f392bad8540
1a05811936dd8d0c3a367c6f00629624ef39d537
911a8261419f48dcd756f78832fa5a5f4c5b8d93
# This was manuall backported
# This was manually backported
2afeed301010917c4eae55dcd2544f9d329df934
4b392ced2d744fccffe95490ff57e6b41033c266
# This is not being backported to 19.2 due to causing build regressions for
# downstream projects
eaf43966027cf9654e91ca57aecc8f5a65b58f49
# Invalid sha warnings
023282a4f667695ea1dbbe9fbe1cd3a9d550a426
2fca325ea65f068043d4c18c9cd0fe7f25bde8f7
7564c5fc6d79a2ddec49a19f67183fb3be799fe5

@@ -36,7 +36,7 @@ depends on the particular driver being used.
<h2>SHA256 checksum</h2>
<pre>
TBD.
e3799fb7896fd9ed2f90f651fb907b95cdebfbd494968ff116e6bf1be143579e mesa-19.2.7.tar.xz
</pre>

docs/relnotes/19.2.8.html (new file, 108 lines added)

@@ -0,0 +1,108 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.8 Release Notes / 2019-12-18</h1>
<p>
Mesa 19.2.8 is a bug fix release which fixes bugs found since the 19.2.7 release.
</p>
<p>
Mesa 19.2.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.8 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
cffa8fa755c7422ce014c39ca0b770a092d9e0bbae537ceb2609c106916e5a57 mesa-19.2.8.tar.xz
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>i965/iris: assert when destroy GL context with active query</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Alyssa Rosenzweig (1):</p>
<li> gallium/util: Support POLYGON in u_stream_outputs_for_vertices</li>
<p></p>
<p>Bas Nieuwenhuizen (2):</p>
<li> amd/common: Always use addrlib for HTILE tc-compat.</li>
<li> amd/common: Fix tcCompatible degradation on Stoney.</li>
<p></p>
<p>Dylan Baker (4):</p>
<li> docs: Add SHA256 sums for 19.2.7</li>
<li> meson/broadcom: libbroadcom_cle needs expat headers</li>
<li> meson/broadcom: libbroadcom_cle also needs zlib</li>
<li> cherry-ignore: Update for 19.2.8</li>
<p></p>
<p>Gert Wollny (1):</p>
<li> virgl: Increase the shader transfer buffer by doubling the size</li>
<p></p>
<p>Iván Briano (1):</p>
<li> anv: Export filter_minmax support only when it&#x27;s really supported</li>
<p></p>
<p>Jason Ekstrand (2):</p>
<li> anv: Re-emit all compute state on pipeline switch</li>
<li> anv: Don&#x27;t leak when set_tiling fails</li>
<p></p>
<p>Kenneth Graunke (1):</p>
<li> iris: Default to X-tiling for scanout buffers without modifiers</li>
<p></p>
<p>Lionel Landwerlin (7):</p>
<li> intel/perf: fix invalid hw_id in query results</li>
<li> intel/perf: set read buffer len to 0 to identify empty buffer</li>
<li> intel/perf: take into account that reports read can be fairly old</li>
<li> intel/perf: simplify the processing of OA reports</li>
<li> intel/perf: fix improper pointer access</li>
<li> anv: fix fence underlying primitive checks</li>
<li> mesa: avoid triggering assert in implementation</li>
<p></p>
<p>Nanley Chery (2):</p>
<li> gallium/dri2: Fix creation of multi-planar modifier images</li>
<li> gallium: Store the image format in winsys_handle</li>
<p></p>
<p>Rob Clark (1):</p>
<li> nir/lower_clip: Fix incorrect driver loc for clipdist outputs</li>
<p></p>
<p>Timothy Arceri (1):</p>
<li> glsl/nir: iterate the system values list when adding varyings</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

@@ -343,7 +343,7 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib,
AddrSurfInfoIn->flags.depth &&
surf_level->mode == RADEON_SURF_MODE_2D &&
level == 0) {
AddrHtileIn->flags.tcCompatible = AddrSurfInfoIn->flags.tcCompatible;
AddrHtileIn->flags.tcCompatible = AddrSurfInfoOut->tcCompatible;
AddrHtileIn->pitch = AddrSurfInfoOut->pitch;
AddrHtileIn->height = AddrSurfInfoOut->height;
AddrHtileIn->numSlices = AddrSurfInfoOut->depth;
@@ -777,19 +777,12 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
if (level > 0)
continue;
/* Check that we actually got a TC-compatible HTILE if
* we requested it (only for level 0, since we're not
* supporting HTILE on higher mip levels anyway). */
assert(AddrSurfInfoOut.tcCompatible ||
!AddrSurfInfoIn.flags.tcCompatible ||
AddrSurfInfoIn.flags.matchStencilTileCfg);
if (!AddrSurfInfoOut.tcCompatible) {
AddrSurfInfoIn.flags.tcCompatible = 0;
surf->flags &= ~RADEON_SURF_TC_COMPATIBLE_HTILE;
}
if (AddrSurfInfoIn.flags.matchStencilTileCfg) {
if (!AddrSurfInfoOut.tcCompatible) {
AddrSurfInfoIn.flags.tcCompatible = 0;
surf->flags &= ~RADEON_SURF_TC_COMPATIBLE_HTILE;
}
AddrSurfInfoIn.flags.matchStencilTileCfg = 0;
AddrSurfInfoIn.tileIndex = AddrSurfInfoOut.tileIndex;
stencil_tile_idx = AddrSurfInfoOut.stencilTileIdx;

@@ -58,6 +58,6 @@ libbroadcom_cle = static_library(
'v3d_decoder.c',
include_directories : [inc_common, inc_broadcom],
c_args : [c_vis_args, no_override_init_args],
dependencies : [dep_libdrm, dep_valgrind],
dependencies : [dep_libdrm, dep_valgrind, dep_expat, dep_zlib],
build_by_default : false,
)

@@ -34,32 +34,11 @@
*/
static bool
add_interface_variables(const struct gl_context *cts,
struct gl_shader_program *prog,
struct set *resource_set,
unsigned stage, GLenum programInterface)
add_vars_from_list(const struct gl_context *ctx,
struct gl_shader_program *prog, struct set *resource_set,
const struct exec_list *var_list, unsigned stage,
GLenum programInterface)
{
const struct exec_list *var_list = NULL;
struct gl_linked_shader *sh = prog->_LinkedShaders[stage];
if (!sh)
return true;
nir_shader *nir = sh->Program->nir;
assert(nir);
switch (programInterface) {
case GL_PROGRAM_INPUT:
var_list = &nir->inputs;
break;
case GL_PROGRAM_OUTPUT:
var_list = &nir->outputs;
break;
default:
assert("!Should not get here");
break;
}
nir_foreach_variable(var, var_list) {
if (var->data.how_declared == nir_var_hidden)
continue;
@@ -108,6 +87,38 @@ add_interface_variables(const struct gl_context *cts,
return true;
}
static bool
add_interface_variables(const struct gl_context *ctx,
struct gl_shader_program *prog,
struct set *resource_set,
unsigned stage, GLenum programInterface)
{
struct gl_linked_shader *sh = prog->_LinkedShaders[stage];
if (!sh)
return true;
nir_shader *nir = sh->Program->nir;
assert(nir);
switch (programInterface) {
case GL_PROGRAM_INPUT: {
bool result = add_vars_from_list(ctx, prog, resource_set,
&nir->inputs, stage, programInterface);
result &= add_vars_from_list(ctx, prog, resource_set, &nir->system_values,
stage, programInterface);
return result;
}
case GL_PROGRAM_OUTPUT:
return add_vars_from_list(ctx, prog, resource_set, &nir->outputs, stage,
programInterface);
default:
assert("!Should not get here");
break;
}
return false;
}
/* TODO: as we keep adding features, this method is becoming more and more
* similar to its GLSL counterpart at linker.cpp. Eventually it would be good
* to check if they could be refactored, and reduce code duplication somehow

@@ -262,6 +262,17 @@ nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables, bool use_vars)
if (!ucp_enables)
return false;
/* find clipvertex/position outputs: */
nir_foreach_variable(var, &shader->outputs) {
int loc = var->data.driver_location;
/* keep track of last used driver-location.. we'll be
* appending CLIP_DIST0/CLIP_DIST1 after last existing
* output:
*/
maxloc = MAX2(maxloc, loc);
}
nir_builder_init(&b, impl);
/* NIR should ensure that, even in case of loops/if-else, there

@@ -338,7 +338,14 @@ u_stream_outputs_for_vertices(enum pipe_prim_type primitive, unsigned nr)
/* Extraneous vertices don't contribute to stream outputs */
u_trim_pipe_prim(primitive, &nr);
/* Consider how many primitives are actually generated */
/* Polygons are special, since they are a single primitive with many
* vertices. In this case, we just have an output for each vertex (after
* trimming) */
if (primitive == PIPE_PRIM_POLYGON)
return nr;
/* Normally, consider how many primitives are actually generated */
unsigned prims = u_decomposed_prims_for_vertices(primitive, nr);
/* One output per vertex after decomposition */

@@ -739,6 +739,8 @@ iris_resource_create_with_modifiers(struct pipe_screen *pscreen,
if (templ->usage == PIPE_USAGE_STAGING ||
templ->bind & (PIPE_BIND_LINEAR | PIPE_BIND_CURSOR) )
tiling_flags = ISL_TILING_LINEAR_BIT;
else if (templ->bind & PIPE_BIND_SCANOUT)
tiling_flags = ISL_TILING_X_BIT;
}
isl_surf_usage_flags_t usage = pipe_bind_to_isl_usage(templ->bind);

@@ -492,12 +492,13 @@ int virgl_encode_shader_state(struct virgl_context *ctx,
if (virgl_debug & VIRGL_DEBUG_VERBOSE)
debug_printf("Failed to translate shader in available space - trying again\n");
old_size = str_total_size;
str_total_size = 65536 * ++retry_size;
str_total_size = 65536 * retry_size;
retry_size *= 2;
str = REALLOC(str, old_size, str_total_size);
if (!str)
return -1;
}
} while (bret == false && retry_size < 10);
} while (bret == false && retry_size < 1024);
if (bret == false)
return -1;

@@ -49,6 +49,12 @@ struct winsys_handle
*/
unsigned offset;
/**
* Input to resource_from_handle.
* Output from resource_get_handle.
*/
uint64_t format;
/**
* Input to resource_from_handle.
* Output from resource_get_handle.

@@ -529,6 +529,7 @@ dri2_allocate_textures(struct dri_context *ctx,
whandle.handle = buf->name;
whandle.stride = buf->pitch;
whandle.offset = 0;
whandle.format = format;
whandle.modifier = DRM_FORMAT_MOD_INVALID;
if (screen->can_share_buffer)
whandle.type = WINSYS_HANDLE_TYPE_SHARED;
@@ -759,18 +760,12 @@ dri2_create_image_from_winsys(__DRIscreen *_screen,
for (i = num_handles - 1; i >= 0; i--) {
struct pipe_resource *tex;
if (whandle[i].modifier == DRM_FORMAT_MOD_INVALID) {
templ.width0 = width >> map->planes[i].width_shift;
templ.height0 = height >> map->planes[i].height_shift;
if (is_yuv)
templ.format = dri2_get_pipe_format_for_dri_format(map->planes[i].dri_format);
else
templ.format = map->pipe_format;
} else {
templ.width0 = width;
templ.height0 = height;
templ.width0 = width >> map->planes[i].width_shift;
templ.height0 = height >> map->planes[i].height_shift;
if (is_yuv)
templ.format = dri2_get_pipe_format_for_dri_format(map->planes[i].dri_format);
else
templ.format = map->pipe_format;
}
assert(templ.format != PIPE_FORMAT_NONE);
tex = pscreen->resource_from_handle(pscreen,
@@ -808,6 +803,7 @@ dri2_create_image_from_name(__DRIscreen *_screen,
memset(&whandle, 0, sizeof(whandle));
whandle.type = WINSYS_HANDLE_TYPE_SHARED;
whandle.handle = name;
whandle.format = map->pipe_format;
whandle.modifier = DRM_FORMAT_MOD_INVALID;
whandle.stride = pitch * util_format_get_blocksize(map->pipe_format);
@@ -826,8 +822,13 @@ dri2_create_image_from_name(__DRIscreen *_screen,
}
static unsigned
dri2_get_modifier_num_planes(uint64_t modifier)
dri2_get_modifier_num_planes(uint64_t modifier, int fourcc)
{
const struct dri2_format_mapping *map = dri2_get_mapping_by_fourcc(fourcc);
if (!map)
return 0;
switch (modifier) {
case I915_FORMAT_MOD_Y_TILED_CCS:
return 2;
@@ -849,8 +850,8 @@ dri2_get_modifier_num_planes(uint64_t modifier)
/* FD_FORMAT_MOD_QCOM_TILED is not in drm_fourcc.h */
case I915_FORMAT_MOD_X_TILED:
case I915_FORMAT_MOD_Y_TILED:
return 1;
case DRM_FORMAT_MOD_INVALID:
return map->nplanes;
default:
return 0;
}
@@ -868,15 +869,13 @@ dri2_create_image_from_fd(__DRIscreen *_screen,
__DRIimage *img = NULL;
unsigned err = __DRI_IMAGE_ERROR_SUCCESS;
int i, expected_num_fds;
uint64_t mod_planes = dri2_get_modifier_num_planes(modifier);
int num_handles = dri2_get_modifier_num_planes(modifier, fourcc);
if (!map || (modifier != DRM_FORMAT_MOD_INVALID && mod_planes == 0)) {
if (!map || num_handles == 0) {
err = __DRI_IMAGE_ERROR_BAD_MATCH;
goto exit;
}
int num_handles = mod_planes > 0 ? mod_planes : map->nplanes;
switch (fourcc) {
case __DRI_IMAGE_FOURCC_YUYV:
case __DRI_IMAGE_FOURCC_UYVY:
@@ -896,7 +895,7 @@ dri2_create_image_from_fd(__DRIscreen *_screen,
for (i = 0; i < num_handles; i++) {
int fdnum = i >= num_fds ? 0 : i;
int index = mod_planes > 0 ? i : map->planes[i].buffer_index;
int index = i >= map->nplanes ? i : map->planes[i].buffer_index;
if (fds[fdnum] < 0) {
err = __DRI_IMAGE_ERROR_BAD_ALLOC;
goto exit;
@@ -906,6 +905,7 @@ dri2_create_image_from_fd(__DRIscreen *_screen,
whandles[i].handle = (unsigned)fds[fdnum];
whandles[i].stride = (unsigned)strides[index];
whandles[i].offset = (unsigned)offsets[index];
whandles[i].format = map->pipe_format;
whandles[i].modifier = modifier;
whandles[i].plane = index;
}
@@ -1296,6 +1296,7 @@ dri2_from_names(__DRIscreen *screen, int width, int height, int format,
whandle.handle = names[0];
whandle.stride = strides[0];
whandle.offset = offsets[0];
whandle.format = map->pipe_format;
whandle.modifier = DRM_FORMAT_MOD_INVALID;
img = dri2_create_image_from_winsys(screen, width, height, map,
@@ -1393,7 +1394,7 @@ dri2_query_dma_buf_format_modifier_attribs(__DRIscreen *_screen,
{
switch (attrib) {
case __DRI_IMAGE_FORMAT_MODIFIER_ATTRIB_PLANE_COUNT: {
uint64_t mod_planes = dri2_get_modifier_num_planes(modifier);
uint64_t mod_planes = dri2_get_modifier_num_planes(modifier, fourcc);
if (mod_planes > 0)
*value = mod_planes;
return mod_planes > 0;

@@ -69,6 +69,8 @@
#define MAP_READ (1 << 0)
#define MAP_WRITE (1 << 1)
#define OA_REPORT_INVALID_CTX_ID (0xffffffff)
/**
* Periodic OA samples are read() into these buffer structures via the
* i915 perf kernel interface and appended to the
@@ -997,7 +999,9 @@ query_result_accumulate(struct gen_perf_query_result *result,
{
int i, idx = 0;
result->hw_id = start[2];
if (result->hw_id == OA_REPORT_INVALID_CTX_ID &&
start[2] != OA_REPORT_INVALID_CTX_ID)
result->hw_id = start[2];
result->reports_accumulated++;
switch (query->oa_format) {
@@ -1035,7 +1039,7 @@ static void
query_result_clear(struct gen_perf_query_result *result)
{
memset(result, 0, sizeof(*result));
result->hw_id = 0xffffffff; /* invalid */
result->hw_id = OA_REPORT_INVALID_CTX_ID; /* invalid */
}
static void
@@ -1316,8 +1320,8 @@ get_free_sample_buf(struct gen_perf_context *perf_ctx)
exec_node_init(&buf->link);
buf->refcount = 0;
buf->len = 0;
}
buf->len = 0;
return buf;
}
@@ -1834,7 +1838,8 @@ read_oa_samples_until(struct gen_perf_context *perf_ctx,
exec_list_get_tail(&perf_ctx->sample_buffers);
struct oa_sample_buf *tail_buf =
exec_node_data(struct oa_sample_buf, tail_node, link);
uint32_t last_timestamp = tail_buf->last_timestamp;
uint32_t last_timestamp =
tail_buf->len == 0 ? start_timestamp : tail_buf->last_timestamp;
while (1) {
struct oa_sample_buf *buf = get_free_sample_buf(perf_ctx);
@@ -1849,12 +1854,13 @@ read_oa_samples_until(struct gen_perf_context *perf_ctx,
exec_list_push_tail(&perf_ctx->free_sample_buffers, &buf->link);
if (len < 0) {
if (errno == EAGAIN)
return ((last_timestamp - start_timestamp) >=
if (errno == EAGAIN) {
return ((last_timestamp - start_timestamp) < INT32_MAX &&
(last_timestamp - start_timestamp) >=
(end_timestamp - start_timestamp)) ?
OA_READ_STATUS_FINISHED :
OA_READ_STATUS_UNFINISHED;
else {
} else {
DBG("Error reading i915 perf samples: %m\n");
}
} else
@@ -2070,6 +2076,17 @@ discard_all_queries(struct gen_perf_context *perf_ctx)
}
}
/* Looks for the validity bit of context ID (dword 2) of an OA report. */
static bool
oa_report_ctx_id_valid(const struct gen_device_info *devinfo,
const uint32_t *report)
{
assert(devinfo->gen >= 8);
if (devinfo->gen == 8)
return (report[0] & (1 << 25)) != 0;
return (report[0] & (1 << 16)) != 0;
}
/**
* Accumulate raw OA counter values based on deltas between pairs of
* OA reports.
@@ -2097,7 +2114,7 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
uint32_t *last;
uint32_t *end;
struct exec_node *first_samples_node;
bool in_ctx = true;
bool last_report_ctx_match = true;
int out_duration = 0;
assert(query->oa.map != NULL);
@@ -2126,7 +2143,7 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
first_samples_node = query->oa.samples_head->next;
foreach_list_typed_from(struct oa_sample_buf, buf, link,
&perf_ctx.sample_buffers,
&perf_ctx->sample_buffers,
first_samples_node)
{
int offset = 0;
@@ -2143,6 +2160,7 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
switch (header->type) {
case DRM_I915_PERF_RECORD_SAMPLE: {
uint32_t *report = (uint32_t *)(header + 1);
bool report_ctx_match = true;
bool add = true;
/* Ignore reports that come before the start marker.
@@ -2171,35 +2189,30 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
* of OA counters while any other context is acctive.
*/
if (devinfo->gen >= 8) {
if (in_ctx && report[2] != query->oa.result.hw_id) {
DBG("i915 perf: Switch AWAY (observed by ID change)\n");
in_ctx = false;
/* Consider that the current report matches our context only if
* the report says the report ID is valid.
*/
report_ctx_match = oa_report_ctx_id_valid(devinfo, report) &&
report[2] == start[2];
if (report_ctx_match)
out_duration = 0;
} else if (in_ctx == false && report[2] == query->oa.result.hw_id) {
DBG("i915 perf: Switch TO\n");
in_ctx = true;
/* From experimentation in IGT, we found that the OA unit
* might label some report as "idle" (using an invalid
* context ID), right after a report for a given context.
* Deltas generated by those reports actually belong to the
* previous context, even though they're not labelled as
* such.
*
* We didn't *really* Switch AWAY in the case that we e.g.
* saw a single periodic report while idle...
*/
if (out_duration >= 1)
add = false;
} else if (in_ctx) {
assert(report[2] == query->oa.result.hw_id);
DBG("i915 perf: Continuation IN\n");
} else {
assert(report[2] != query->oa.result.hw_id);
DBG("i915 perf: Continuation OUT\n");
add = false;
else
out_duration++;
}
/* Only add the delta between <last, report> if the last report
* was clearly identified as our context, or if we have at most
* 1 report without a matching ID.
*
* The OA unit will sometimes label reports with an invalid
* context ID when i915 rewrites the execlist submit register
* with the same context as the one currently running. This
* happens when i915 wants to notify the HW of ringbuffer tail
* register update. We have to consider this report as part of
* our context as the 3d pipeline behind the OACS unit is still
* processing the operations started at the previous execlist
* submission.
*/
add = last_report_ctx_match && out_duration < 2;
}
if (add) {
@@ -2208,6 +2221,7 @@ accumulate_oa_reports(struct gen_perf_context *perf_ctx,
}
last = report;
last_report_ctx_match = report_ctx_match;
break;
}

@@ -1640,7 +1640,7 @@ void anv_GetPhysicalDeviceProperties2(
VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT *properties =
(VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT *)ext;
properties->filterMinmaxImageComponentMapping = pdevice->info.gen >= 9;
properties->filterMinmaxSingleComponentFormats = true;
properties->filterMinmaxSingleComponentFormats = pdevice->info.gen >= 9;
break;
}
@@ -3098,9 +3098,10 @@ VkResult anv_AllocateMemory(
i915_tiling);
if (ret) {
anv_bo_cache_release(device, &device->bo_cache, mem->bo);
return vk_errorf(device->instance, NULL,
VK_ERROR_OUT_OF_DEVICE_MEMORY,
"failed to set BO tiling: %m");
result = vk_errorf(device->instance, NULL,
VK_ERROR_OUT_OF_DEVICE_MEMORY,
"failed to set BO tiling: %m");
goto fail;
}
}
}

@@ -681,7 +681,11 @@ anv_wait_for_fences(struct anv_device *device,
if (fenceCount <= 1 || waitAll) {
for (uint32_t i = 0; i < fenceCount; i++) {
ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
switch (fence->permanent.type) {
struct anv_fence_impl *impl =
fence->temporary.type != ANV_FENCE_TYPE_NONE ?
&fence->temporary : &fence->permanent;
switch (impl->type) {
case ANV_FENCE_TYPE_BO:
result = anv_wait_for_bo_fences(device, 1, &pFences[i],
true, abs_timeout);
@@ -716,7 +720,10 @@ static bool anv_all_fences_syncobj(uint32_t fenceCount, const VkFence *pFences)
{
for (uint32_t i = 0; i < fenceCount; ++i) {
ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
if (fence->permanent.type != ANV_FENCE_TYPE_SYNCOBJ)
struct anv_fence_impl *impl =
fence->temporary.type != ANV_FENCE_TYPE_NONE ?
&fence->temporary : &fence->permanent;
if (impl->type != ANV_FENCE_TYPE_SYNCOBJ)
return false;
}
return true;
@@ -726,7 +733,10 @@ static bool anv_all_fences_bo(uint32_t fenceCount, const VkFence *pFences)
{
for (uint32_t i = 0; i < fenceCount; ++i) {
ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
if (fence->permanent.type != ANV_FENCE_TYPE_BO)
struct anv_fence_impl *impl =
fence->temporary.type != ANV_FENCE_TYPE_NONE ?
&fence->temporary : &fence->permanent;
if (impl->type != ANV_FENCE_TYPE_BO)
return false;
}
return true;

@@ -3803,6 +3803,13 @@ genX(flush_pipeline_select)(struct anv_cmd_buffer *cmd_buffer,
vfe.NumberofURBEntries = 2;
vfe.URBEntryAllocationSize = 2;
}
/* We just emitted a dummy MEDIA_VFE_STATE so now that packet is
* invalid. Set the compute pipeline to dirty to force a re-emit of the
* pipeline in case we get back-to-back dispatch calls with the same
* pipeline and a PIPELINE_SELECT in between.
*/
cmd_buffer->state.compute.pipeline_dirty = true;
}
#endif

@@ -48,6 +48,12 @@ free_performance_query(GLuint key, void *data, void *user)
struct gl_perf_query_object *m = data;
struct gl_context *ctx = user;
/* Don't confuse the implementation by deleting an active query. We can
* toggle Active/Used to false because we're tearing down the GL context
* and it's already idle (see _mesa_free_context_data).
*/
m->Active = false;
m->Used = false;
ctx->Driver.DeletePerfQuery(ctx, m);
}