docs: add release notes for 11.0.6

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Update version to 11.0.6
2015-11-21 11:43:55 +00:00 · 2015-11-21 11:42:52 +00:00 · 2015-11-21 11:42:52 +00:00 · 2015-11-18 19:13:17 +00:00 · 2015-11-18 18:59:34 +00:00 · 2015-11-18 18:59:20 +00:00
29 changed files with 357 additions and 56 deletions
--- a/Makefile.am
+++ b/Makefile.am
@@ -32,6 +32,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
 	--enable-vdpau \
 	--enable-xa \
 	--enable-xvmc \
+	--disable-llvm-shared-libs \
 	--with-egl-platforms=x11,wayland,drm \
 	--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
 	--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast
--- a/2
+++ b/2
@@ -1 +1 @@
-11.0.5
+11.0.6
--- a/bin/.cherry-ignore
+++ b/bin/.cherry-ignore
@@ -1,2 +1,5 @@
 # The commit base differs greatly between 11.0 and master
-2832ca95ecce064c7d841a3a374c2179f56161be glsl: fix stream qualifier for blocks with an instance name
+2832ca95ecce064c7d841a3a374c2179f56161be glsl: fix stream qualifier for blocks with an instance name
+
+# Somewhat of a mixed feature/bugfix patch, causing some 200 piglit regressions
+2b676570960277d47477822ffeccc672613f9142 gallium/swrast: fix front buffer blitting. (v2)
--- a/docs/relnotes/11.0.5.html
+++ b/docs/relnotes/11.0.5.html
@@ -31,7 +31,8 @@ because compatibility contexts are not supported.

 <h2>SHA256 checksums</h2>
 <pre>
-TBD
+8495ef5c06f7f726452462b7d408a5b40048373ff908f2283a3b4d1f49b45ee6  mesa-11.0.5.tar.gz
+9c255a2a6695fcc6ef4a279e1df0aeaf417dc142f39ee59dfb533d80494bb67a  mesa-11.0.5.tar.xz
 </pre>


--- a/docs/relnotes/11.0.6.html
+++ b/docs/relnotes/11.0.6.html
@@ -0,0 +1,144 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html lang="en">
+<head>
+  <meta http-equiv="content-type" content="text/html; charset=utf-8">
+  <title>Mesa Release Notes</title>
+  <link rel="stylesheet" type="text/css" href="../mesa.css">
+</head>
+<body>
+
+<div class="header">
+  <h1>The Mesa 3D Graphics Library</h1>
+</div>
+
+<iframe src="../contents.html"></iframe>
+<div class="content">
+
+<h1>Mesa 11.0.6 Release Notes / November 21, 2015</h1>
+
+<p>
+Mesa 11.0.6 is a bug fix release which fixes bugs found since the 11.0.5 release.
+</p>
+<p>
+Mesa 11.0.6 implements the OpenGL 4.1 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 4.1.  OpenGL
+4.1 is <strong>only</strong> available if requested at context creation
+because compatibility contexts are not supported.
+</p>
+
+
+<h2>SHA256 checksums</h2>
+<pre>
+TBD
+</pre>
+
+
+<h2>New features</h2>
+<p>None</p>
+
+<h2>Bug fixes</h2>
+
+<p>This list is likely incomplete.</p>
+
+<ul>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91780">Bug 91780</a> - Rendering issues with geometry shader</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92588">Bug 92588</a> - [HSW,BDW,BSW,SKL-Y][GLES 3.1 CTS] ES31-CTS.arrays_of_arrays.InteractionFunctionCalls2 - assert</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92738">Bug 92738</a> - Radeon R7 240 doesn't work on 16KiB page size platform</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92860">Bug 92860</a> - [radeonsi][bisected] st/mesa: implement ARB_copy_image - Corruption in ARK Survival Evolved</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92900">Bug 92900</a> - [regression bisected] About 700 piglit regressions is what could go wrong</li>
+
+</ul>
+
+
+<h2>Changes</h2>
+
+<p>Alex Deucher (1):</p>
+<ul>
+  <li>radeonsi: enable optimal raster config setting for fiji (v2)</li>
+</ul>
+
+<p>Ben Widawsky (1):</p>
+<ul>
+  <li>i965/skl/gt4: Fix URB programming restriction.</li>
+</ul>
+
+<p>Boyuan Zhang (2):</p>
+<ul>
+  <li>st/vaapi: fix vaapi VC-1 simple/main corruption v2</li>
+  <li>radeon/uvd: fix VC-1 simple/main profile decode v2</li>
+</ul>
+
+<p>Dave Airlie (1):</p>
+<ul>
+  <li>r600: initialised PGM_RESOURCES_2 for ES/GS</li>
+</ul>
+
+<p>Emil Velikov (4):</p>
+<ul>
+  <li>docs: add sha256 checksums for 11.0.5</li>
+  <li>cherry-ignore: add the swrast front buffer support</li>
+  <li>automake: use static llvm for make distcheck</li>
+  <li>Update version to 11.0.6</li>
+</ul>
+
+<p>Eric Anholt (3):</p>
+<ul>
+  <li>vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.</li>
+  <li>vc4: Return NULL when we can't make our shadow for a sampler view.</li>
+  <li>vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB.</li>
+</ul>
+
+<p>Ian Romanick (2):</p>
+<ul>
+  <li>meta/generate_mipmap: Don't leak the sampler object</li>
+  <li>meta/generate_mipmap: Only modify the draw framebuffer binding in fallback_required</li>
+</ul>
+
+<p>Ilia Mirkin (2):</p>
+<ul>
+  <li>mesa/copyimage: allow width/height to not be multiples of block</li>
+  <li>nouveau: don't expose HEVC decoding support</li>
+</ul>
+
+<p>Jason Ekstrand (1):</p>
+<ul>
+  <li>nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store</li>
+</ul>
+
+<p>Kenneth Graunke (1):</p>
+<ul>
+  <li>glsl: Allow implicit int -&gt; uint conversions for the % operator.</li>
+</ul>
+
+<p>Marek Olšák (1):</p>
+<ul>
+  <li>radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney</li>
+</ul>
+
+<p>Michel Dänzer (1):</p>
+<ul>
+  <li>winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3</li>
+</ul>
+
+<p>Oded Gabbay (1):</p>
+<ul>
+  <li>llvmpipe: use simple coeffs calc for 128bit vectors</li>
+</ul>
+
+<p>Roland Scheidegger (2):</p>
+<ul>
+  <li>radeon: fix bgrx8/xrgb8 blits</li>
+  <li>r200: fix bgrx8/xrgb8 blits</li>
+</ul>
+
+
+</div>
+</body>
+</html>
--- a/src/gallium/drivers/llvmpipe/lp_bld_interp.c
+++ b/src/gallium/drivers/llvmpipe/lp_bld_interp.c
@@ -746,7 +746,12 @@ lp_build_interp_soa_init(struct lp_build_interp_soa_context *bld,

   pos_init(bld, x0, y0);

-   if (coeff_type.length > 4) {
+   /*
+    * Simple method (single step interpolation) may be slower if vector length
+    * is just 4, but the results are different (generally less accurate) with
+    * the other method, so always use more accurate version.
+    */
+   if (1) {
      bld->simple_interp = TRUE;
      {
         /* XXX this should use a global static table */
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.c
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
@@ -437,6 +437,7 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
      /* VP3 does not support MPEG4, VP4+ do. */
      return entrypoint == PIPE_VIDEO_ENTRYPOINT_BITSTREAM &&
         profile >= PIPE_VIDEO_PROFILE_MPEG1 &&
+         profile < PIPE_VIDEO_PROFILE_HEVC_MAIN &&
         (!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4) &&
         firmware_present(pscreen, profile);
   case PIPE_VIDEO_CAP_NPOT_TEXTURES:
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2342,6 +2342,8 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx)

 	r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
 	r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
+	r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
+	r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
 	r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0);

 	/* to avoid GPU doing any preloading of constant from random address */
@@ -2781,6 +2783,8 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx)

 	r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
 	r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
+	r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
+	r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
 	r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0);

 	/* to avoid GPU doing any preloading of constant from random address */
--- a/src/gallium/drivers/r600/evergreend.h
+++ b/src/gallium/drivers/r600/evergreend.h
@@ -1497,6 +1497,7 @@
 #define   S_028878_UNCACHED_FIRST_INST(x)              (((x) & 0x1) << 28)
 #define   G_028878_UNCACHED_FIRST_INST(x)              (((x) >> 28) & 0x1)
 #define   C_028878_UNCACHED_FIRST_INST                 0xEFFFFFFF
+#define R_02887C_SQ_PGM_RESOURCES_2_GS                 0x02887C

 #define R_028890_SQ_PGM_RESOURCES_ES                 0x028890
 #define   S_028890_NUM_GPRS(x)                         (((x) & 0xFF) << 0)
@@ -1511,6 +1512,7 @@
 #define   S_028890_UNCACHED_FIRST_INST(x)              (((x) & 0x1) << 28)
 #define   G_028890_UNCACHED_FIRST_INST(x)              (((x) >> 28) & 0x1)
 #define   C_028890_UNCACHED_FIRST_INST                 0xEFFFFFFF
+#define R_028894_SQ_PGM_RESOURCES_2_ES                 0x028894

 #define R_028864_SQ_PGM_RESOURCES_2_VS               0x028864
 #define   S_028864_SINGLE_ROUND(x)                     (((x) & 0x3) << 0)
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -940,6 +940,12 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
 	dec->msg->body.decode.width_in_samples = dec->base.width;
 	dec->msg->body.decode.height_in_samples = dec->base.height;

+	if ((picture->profile == PIPE_VIDEO_PROFILE_VC1_SIMPLE) ||
+	    (picture->profile == PIPE_VIDEO_PROFILE_VC1_MAIN)) {
+		dec->msg->body.decode.width_in_samples = align(dec->msg->body.decode.width_in_samples, 16) / 16;
+		dec->msg->body.decode.height_in_samples = align(dec->msg->body.decode.height_in_samples, 16) / 16;
+	}
+
 	dec->msg->body.decode.dpb_size = dec->dpb.res->buf->size;
 	dec->msg->body.decode.bsd_size = bs_size;
 	dec->msg->body.decode.db_pitch = dec->base.width;
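
The radeon_uvd.c hunk above replaces the pixel dimensions with `align(x, 16) / 16` for the VC-1 simple and main profiles, which reads as a conversion from pixels to 16-pixel macroblocks. A minimal sketch of that arithmetic follows; `align_up` and `pixels_to_macroblocks` are hypothetical names used for illustration and are not Mesa functions.

```c
#include <assert.h>

/* Round value up to the next multiple of alignment.
 * The alignment is assumed to be a non-zero power of two. */
static unsigned align_up(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: number of 16-pixel macroblocks needed to
 * cover a dimension given in pixels. */
static unsigned pixels_to_macroblocks(unsigned pixels)
{
    return align_up(pixels, 16) / 16;
}
```

For a 1920x1080 stream this gives 120 by 68 macroblocks, since 1080 is first rounded up to 1088.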
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -244,8 +244,7 @@ int rvid_get_video_param(struct pipe_screen *screen,
 				return codec != PIPE_VIDEO_FORMAT_MPEG4;
 			return true;
 		case PIPE_VIDEO_FORMAT_VC1:
-			/* FIXME: VC-1 simple/main profile is broken */
-			return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED;
+			return true;
 		case PIPE_VIDEO_FORMAT_HEVC:
 			/* Carrizo only supports HEVC Main */
 			return rscreen->family >= CHIP_CARRIZO &&
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3176,6 +3176,7 @@ si_write_harvested_raster_configs(struct si_context *sctx,

 static void si_init_config(struct si_context *sctx)
 {
+	struct si_screen *sscreen = sctx->screen;
 	unsigned num_rb = MIN2(sctx->screen->b.info.r600_num_backends, 16);
 	unsigned rb_mask = sctx->screen->b.info.si_backend_enabled_mask;
 	unsigned raster_config, raster_config_1;
@@ -3243,9 +3244,14 @@ static void si_init_config(struct si_context *sctx)
 		raster_config_1 = 0x0000002e;
 		break;
 	case CHIP_FIJI:
-		/* Fiji should be same as Hawaii, but that causes corruption in some cases */
-		raster_config = 0x16000012; /* 0x3a00161a */
-		raster_config_1 = 0x0000002a; /* 0x0000002e */
+		if (sscreen->b.info.cik_macrotile_mode_array[0] == 0x000000e8) {
+			/* old kernels with old tiling config */
+			raster_config = 0x16000012;
+			raster_config_1 = 0x0000002a;
+		} else {
+			raster_config = 0x3a00161a;
+			raster_config_1 = 0x0000002e;
+		}
 		break;
 	case CHIP_TONGA:
 		raster_config = 0x16000012;
@@ -3342,5 +3348,8 @@ static void si_init_config(struct si_context *sctx)
 		si_pm4_set_reg(pm4, R_028C5C_VGT_OUT_DEALLOC_CNTL, 32);
 	}

+	if (sctx->b.family == CHIP_STONEY)
+		si_pm4_set_reg(pm4, R_028754_SX_PS_DOWNCONVERT, 0);
+
 	sctx->init_config = pm4;
 }
--- a/src/gallium/drivers/vc4/vc4_bufmgr.c
+++ b/src/gallium/drivers/vc4/vc4_bufmgr.c
@@ -168,8 +168,9 @@ retry:
                        vc4_bo_cache_free_all(&screen->bo_cache);
                        goto retry;
                }
-                fprintf(stderr, "create ioctl failure\n");
-                abort();
+
+                free(bo);
+                return NULL;
        }

        screen->bo_count++;
--- a/src/gallium/drivers/vc4/vc4_opt_algebraic.c
+++ b/src/gallium/drivers/vc4/vc4_opt_algebraic.c
@@ -143,6 +143,8 @@ qir_opt_algebraic(struct vc4_compile *c)
                case QOP_SEL_X_Y_ZC:
                case QOP_SEL_X_Y_NS:
                case QOP_SEL_X_Y_NC:
+                case QOP_SEL_X_Y_CS:
+                case QOP_SEL_X_Y_CC:
                        if (is_zero(c, inst->src[1])) {
                                /* Replace references to a 0 uniform value
                                 * with the SEL_X_0 equivalent.
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1055,6 +1055,10 @@ ntq_emit_alu(struct vc4_compile *c, nir_alu_instr *instr)
                qir_SF(c, qir_SUB(c, src[0], src[1]));
                *dest = qir_SEL_X_0_NC(c, qir_uniform_ui(c, ~0));
                break;
+        case nir_op_uge:
+                qir_SF(c, qir_SUB(c, src[0], src[1]));
+                *dest = qir_SEL_X_0_CC(c, qir_uniform_ui(c, ~0));
+                break;
        case nir_op_ilt:
                qir_SF(c, qir_SUB(c, src[0], src[1]));
                *dest = qir_SEL_X_0_NS(c, qir_uniform_ui(c, ~0));
--- a/src/gallium/drivers/vc4/vc4_qir.c
+++ b/src/gallium/drivers/vc4/vc4_qir.c
@@ -62,10 +62,14 @@ static const struct qir_op_info qir_op_info[] = {
        [QOP_SEL_X_0_NC] = { "fsel_x_0_nc", 1, 1, false, true },
        [QOP_SEL_X_0_ZS] = { "fsel_x_0_zs", 1, 1, false, true },
        [QOP_SEL_X_0_ZC] = { "fsel_x_0_zc", 1, 1, false, true },
+        [QOP_SEL_X_0_CS] = { "fsel_x_0_cs", 1, 1, false, true },
+        [QOP_SEL_X_0_CC] = { "fsel_x_0_cc", 1, 1, false, true },
        [QOP_SEL_X_Y_NS] = { "fsel_x_y_ns", 1, 2, false, true },
        [QOP_SEL_X_Y_NC] = { "fsel_x_y_nc", 1, 2, false, true },
        [QOP_SEL_X_Y_ZS] = { "fsel_x_y_zs", 1, 2, false, true },
        [QOP_SEL_X_Y_ZC] = { "fsel_x_y_zc", 1, 2, false, true },
+        [QOP_SEL_X_Y_CS] = { "fsel_x_y_cs", 1, 2, false, true },
+        [QOP_SEL_X_Y_CC] = { "fsel_x_y_cc", 1, 2, false, true },

        [QOP_RCP] = { "rcp", 1, 1, false, true },
        [QOP_RSQ] = { "rsq", 1, 1, false, true },
@@ -193,10 +197,14 @@ qir_depends_on_flags(struct qinst *inst)
        case QOP_SEL_X_0_NC:
        case QOP_SEL_X_0_ZS:
        case QOP_SEL_X_0_ZC:
+        case QOP_SEL_X_0_CS:
+        case QOP_SEL_X_0_CC:
        case QOP_SEL_X_Y_NS:
        case QOP_SEL_X_Y_NC:
        case QOP_SEL_X_Y_ZS:
        case QOP_SEL_X_Y_ZC:
+        case QOP_SEL_X_Y_CS:
+        case QOP_SEL_X_Y_CC:
                return true;
        default:
                return false;
--- a/src/gallium/drivers/vc4/vc4_qir.h
+++ b/src/gallium/drivers/vc4/vc4_qir.h
@@ -91,11 +91,15 @@ enum qop {
        QOP_SEL_X_0_ZC,
        QOP_SEL_X_0_NS,
        QOP_SEL_X_0_NC,
+        QOP_SEL_X_0_CS,
+        QOP_SEL_X_0_CC,
        /* Selects the src[0] if the ns flag bit is set, otherwise src[1]. */
        QOP_SEL_X_Y_ZS,
        QOP_SEL_X_Y_ZC,
        QOP_SEL_X_Y_NS,
        QOP_SEL_X_Y_NC,
+        QOP_SEL_X_Y_CS,
+        QOP_SEL_X_Y_CC,

        QOP_FTOI,
        QOP_ITOF,
@@ -570,10 +574,14 @@ QIR_ALU1(SEL_X_0_ZS)
 QIR_ALU1(SEL_X_0_ZC)
 QIR_ALU1(SEL_X_0_NS)
 QIR_ALU1(SEL_X_0_NC)
+QIR_ALU1(SEL_X_0_CS)
+QIR_ALU1(SEL_X_0_CC)
 QIR_ALU2(SEL_X_Y_ZS)
 QIR_ALU2(SEL_X_Y_ZC)
 QIR_ALU2(SEL_X_Y_NS)
 QIR_ALU2(SEL_X_Y_NC)
+QIR_ALU2(SEL_X_Y_CS)
+QIR_ALU2(SEL_X_Y_CC)
 QIR_ALU2(FMIN)
 QIR_ALU2(FMAX)
 QIR_ALU2(FMINABS)
--- a/src/gallium/drivers/vc4/vc4_qpu_emit.c
+++ b/src/gallium/drivers/vc4/vc4_qpu_emit.c
@@ -271,6 +271,8 @@ vc4_generate_code(struct vc4_context *vc4, struct vc4_compile *c)
                case QOP_SEL_X_0_ZC:
                case QOP_SEL_X_0_NS:
                case QOP_SEL_X_0_NC:
+                case QOP_SEL_X_0_CS:
+                case QOP_SEL_X_0_CC:
                        queue(c, qpu_a_MOV(dst, src[0]));
                        set_last_cond_add(c, qinst->op - QOP_SEL_X_0_ZS +
                                          QPU_COND_ZS);
@@ -284,6 +286,8 @@ vc4_generate_code(struct vc4_context *vc4, struct vc4_compile *c)
                case QOP_SEL_X_Y_ZC:
                case QOP_SEL_X_Y_NS:
                case QOP_SEL_X_Y_NC:
+                case QOP_SEL_X_Y_CS:
+                case QOP_SEL_X_Y_CC:
                        queue(c, qpu_a_MOV(dst, src[0]));
                        set_last_cond_add(c, qinst->op - QOP_SEL_X_Y_ZS +
                                          QPU_COND_ZS);
--- a/src/gallium/drivers/vc4/vc4_resource.c
+++ b/src/gallium/drivers/vc4/vc4_resource.c
@@ -35,11 +35,12 @@

 static bool miptree_debug = false;

-static void
+static bool
 vc4_resource_bo_alloc(struct vc4_resource *rsc)
 {
        struct pipe_resource *prsc = &rsc->base.b;
        struct pipe_screen *pscreen = prsc->screen;
+        struct vc4_bo *bo;

        if (miptree_debug) {
                fprintf(stderr, "alloc %p: size %d + offset %d -> %d\n",
@@ -51,12 +52,18 @@ vc4_resource_bo_alloc(struct vc4_resource *rsc)
                        rsc->cube_map_stride * (prsc->array_size - 1));
        }

-        vc4_bo_unreference(&rsc->bo);
-        rsc->bo = vc4_bo_alloc(vc4_screen(pscreen),
-                               rsc->slices[0].offset +
-                               rsc->slices[0].size +
-                               rsc->cube_map_stride * (prsc->array_size - 1),
-                               "resource");
+        bo = vc4_bo_alloc(vc4_screen(pscreen),
+                          rsc->slices[0].offset +
+                          rsc->slices[0].size +
+                          rsc->cube_map_stride * (prsc->array_size - 1),
+                          "resource");
+        if (bo) {
+                vc4_bo_unreference(&rsc->bo);
+                rsc->bo = bo;
+                return true;
+        } else {
+                return false;
+        }
 }

 static void
@@ -101,21 +108,27 @@ vc4_resource_transfer_map(struct pipe_context *pctx,
        char *buf;

        if (usage & PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE) {
-                vc4_resource_bo_alloc(rsc);
+                if (vc4_resource_bo_alloc(rsc)) {

-                /* If it might be bound as one of our vertex buffers, make
-                 * sure we re-emit vertex buffer state.
-                 */
-                if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
-                        vc4->dirty |= VC4_DIRTY_VTXBUF;
+                        /* If it might be bound as one of our vertex buffers,
+                         * make sure we re-emit vertex buffer state.
+                         */
+                        if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
+                                vc4->dirty |= VC4_DIRTY_VTXBUF;
+                } else {
+                        /* If we failed to reallocate, flush everything so
+                         * that we don't violate any syncing requirements.
+                         */
+                        vc4_flush(pctx);
+                }
        } else if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
                if (vc4_cl_references_bo(pctx, rsc->bo)) {
                        if ((usage & PIPE_TRANSFER_DISCARD_RANGE) &&
                            prsc->last_level == 0 &&
                            prsc->width0 == box->width &&
                            prsc->height0 == box->height &&
-                            prsc->depth0 == box->depth) {
-                                vc4_resource_bo_alloc(rsc);
+                            prsc->depth0 == box->depth &&
+                            vc4_resource_bo_alloc(rsc)) {
                                if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
                                        vc4->dirty |= VC4_DIRTY_VTXBUF;
                        } else {
@@ -389,8 +402,7 @@ vc4_resource_create(struct pipe_screen *pscreen,
                rsc->vc4_format = get_resource_texture_format(prsc);

        vc4_setup_slices(rsc);
-        vc4_resource_bo_alloc(rsc);
-        if (!rsc->bo)
+        if (!vc4_resource_bo_alloc(rsc))
                goto fail;

        return prsc;
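
The vc4_resource.c change above allocates the replacement buffer first and only drops the old one once the allocation has succeeded, so a failed allocation leaves the resource usable. The same pattern is sketched below with plain heap memory. `struct resource`, `resource_realloc_storage`, and the injected allocator are illustrative assumptions, not part of the vc4 driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a resource that owns one backing buffer. */
struct resource {
    void *storage;
    size_t size;
};

/* Allocator signature, injected so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t size);

/* Test allocator that always fails. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Replace the backing storage with a fresh allocation of new_size bytes.
 * On failure the old storage is left untouched and false is returned. */
static bool resource_realloc_storage(struct resource *res, size_t new_size,
                                     alloc_fn alloc)
{
    void *fresh;

    assert(res != NULL && alloc != NULL);

    fresh = alloc(new_size);
    if (!fresh)
        return false;

    free(res->storage);
    res->storage = fresh;
    res->size = new_size;
    return true;
}
```

Callers can then branch on the return value, much as the transfer-map path in the diff falls back to a flush when reallocation fails.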
--- a/src/gallium/drivers/vc4/vc4_state.c
+++ b/src/gallium/drivers/vc4/vc4_state.c
@@ -581,6 +581,10 @@ vc4_create_sampler_view(struct pipe_context *pctx, struct pipe_resource *prsc,
                tmpl.last_level = cso->u.tex.last_level - cso->u.tex.first_level;

                prsc = vc4_resource_create(pctx->screen, &tmpl);
+                if (!prsc) {
+                        free(so);
+                        return NULL;
+                }
                rsc = vc4_resource(prsc);
                clone = vc4_resource(prsc);
                clone->shadow_parent = &shadow_parent->base.b;
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -500,8 +500,10 @@ handleVASliceDataBufferType(vlVaContext *context, vlVaBuffer *buf)
          bufHasStartcode(buf, 0x0000010b, 32))
         break;

+      if (context->decoder->profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED) {
         buffers[num_buffers] = (void *const)&start_code_vc1;
         sizes[num_buffers++] = sizeof(start_code_vc1);
+      }
      break;
   case PIPE_VIDEO_FORMAT_MPEG4:
      if (bufHasStartcode(buf, 0x000001, 24))
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -76,6 +76,9 @@ struct radeon_bomgr {
    bool va;
    uint64_t va_offset;
    struct list_head va_holes;
+
+    /* BO size alignment */
+    unsigned size_align;
 };

 static inline struct radeon_bomgr *radeon_bomgr(struct pb_manager *mgr)
@@ -164,8 +167,10 @@ static uint64_t radeon_bomgr_find_va(struct radeon_bomgr *mgr, uint64_t size, ui
    struct radeon_bo_va_hole *hole, *n;
    uint64_t offset = 0, waste = 0;

-    alignment = MAX2(alignment, 4096);
-    size = align(size, 4096);
+    /* All VM address space holes will implicitly start aligned to the
+     * size alignment, so we don't need to sanitize the alignment here
+     */
+    size = align(size, mgr->size_align);

    pipe_mutex_lock(mgr->bo_va_mutex);
    /* first look for a hole */
@@ -222,7 +227,7 @@ static void radeon_bomgr_free_va(struct radeon_bomgr *mgr, uint64_t va, uint64_t
 {
    struct radeon_bo_va_hole *hole;

-    size = align(size, 4096);
+    size = align(size, mgr->size_align);

    pipe_mutex_lock(mgr->bo_va_mutex);
    if ((va + size) == mgr->va_offset) {
@@ -333,9 +338,9 @@ static void radeon_bo_destroy(struct pb_buffer *_buf)
    pipe_mutex_destroy(bo->map_mutex);

    if (bo->initial_domain & RADEON_DOMAIN_VRAM)
-        bo->rws->allocated_vram -= align(bo->base.size, 4096);
+        bo->rws->allocated_vram -= align(bo->base.size, mgr->size_align);
    else if (bo->initial_domain & RADEON_DOMAIN_GTT)
-        bo->rws->allocated_gtt -= align(bo->base.size, 4096);
+        bo->rws->allocated_gtt -= align(bo->base.size, mgr->size_align);
    FREE(bo);
 }

@@ -620,9 +625,9 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr,
    }

    if (rdesc->initial_domains & RADEON_DOMAIN_VRAM)
-        rws->allocated_vram += align(size, 4096);
+        rws->allocated_vram += align(size, mgr->size_align);
    else if (rdesc->initial_domains & RADEON_DOMAIN_GTT)
-        rws->allocated_gtt += align(size, 4096);
+        rws->allocated_gtt += align(size, mgr->size_align);

    return &bo->base;
 }
@@ -696,6 +701,9 @@ struct pb_manager *radeon_bomgr_create(struct radeon_drm_winsys *rws)
    mgr->va_offset = rws->va_start;
    list_inithead(&mgr->va_holes);

+    /* TTM aligns the BO size to the CPU page size */
+    mgr->size_align = sysconf(_SC_PAGESIZE);
+
    return &mgr->base;
 }

@@ -858,7 +866,7 @@ radeon_winsys_bo_create(struct radeon_winsys *rws,
     * BOs. Aligning this here helps the cached bufmgr. Especially small BOs,
     * like constant/uniform buffers, can benefit from better and more reuse.
     */
-    size = align(size, 4096);
+    size = align(size, mgr->size_align);

    /* Only set one usage bit each for domains and flags, or the cache manager
     * might consider different sets of domains / flags compatible
@@ -969,7 +977,7 @@ static struct pb_buffer *radeon_winsys_bo_from_ptr(struct radeon_winsys *rws,
        pipe_mutex_unlock(mgr->bo_handles_mutex);
    }

-    ws->allocated_gtt += align(bo->base.size, 4096);
+    ws->allocated_gtt += align(bo->base.size, mgr->size_align);

    return (struct pb_buffer*)bo;
 }
@@ -1106,9 +1114,9 @@ done:
    bo->initial_domain = radeon_bo_get_initial_domain((void*)bo);

    if (bo->initial_domain & RADEON_DOMAIN_VRAM)
-        ws->allocated_vram += align(bo->base.size, 4096);
+        ws->allocated_vram += align(bo->base.size, mgr->size_align);
    else if (bo->initial_domain & RADEON_DOMAIN_GTT)
-        ws->allocated_gtt += align(bo->base.size, 4096);
+        ws->allocated_gtt += align(bo->base.size, mgr->size_align);

    return (struct pb_buffer*)bo;

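
The radeon_drm_bo.c changes above replace the hardcoded 4096-byte alignment with the CPU page size queried through `sysconf(_SC_PAGESIZE)`. A minimal sketch of that idea is shown below. `align_up` and `cpu_page_size` are hypothetical helpers, and the 4096-byte fallback on a failed query is an assumption added here; the diff stores the `sysconf` result directly.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Round value up to the next multiple of alignment.
 * The alignment is assumed to be a non-zero power of two. */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: CPU page size in bytes.
 * Falls back to 4096 if the query fails (an assumption, not in the diff). */
static uint64_t cpu_page_size(void)
{
    long page = sysconf(_SC_PAGESIZE);

    return page > 0 ? (uint64_t)page : 4096;
}
```

On a platform with 16 KiB pages, `align_up(4097, cpu_page_size())` would yield 16384 rather than 8192, which is the kind of size mismatch that a fixed 4096-byte alignment would introduce.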
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -482,18 +482,20 @@ bit_logic_result_type(const struct glsl_type *type_a,
 }

 static const struct glsl_type *
-modulus_result_type(const struct glsl_type *type_a,
-                    const struct glsl_type *type_b,
+modulus_result_type(ir_rvalue * &value_a, ir_rvalue * &value_b,
                    struct _mesa_glsl_parse_state *state, YYLTYPE *loc)
 {
+   const glsl_type *type_a = value_a->type;
+   const glsl_type *type_b = value_b->type;
+
   if (!state->check_version(130, 300, loc, "operator '%%' is reserved")) {
      return glsl_type::error_type;
   }

-   /* From GLSL 1.50 spec, page 56:
+   /* Section 5.9 (Expressions) of the GLSL 4.00 specification says:
+    *
    *    "The operator modulus (%) operates on signed or unsigned integers or
-    *    integer vectors. The operand types must both be signed or both be
-    *    unsigned."
+    *    integer vectors."
    */
   if (!type_a->is_integer()) {
      _mesa_glsl_error(loc, state, "LHS of operator %% must be an integer");
@@ -503,11 +505,28 @@ modulus_result_type(const struct glsl_type *type_a,
      _mesa_glsl_error(loc, state, "RHS of operator %% must be an integer");
      return glsl_type::error_type;
   }
-   if (type_a->base_type != type_b->base_type) {
+
+   /*    "If the fundamental types in the operands do not match, then the
+    *    conversions from section 4.1.10 "Implicit Conversions" are applied
+    *    to create matching types."
+    *
+    * Note that GLSL 4.00 (and GL_ARB_gpu_shader5) introduced implicit
+    * int -> uint conversion rules.  Prior to that, there were no implicit
+    * conversions.  So it's harmless to apply them universally - no implicit
+    * conversions will exist.  If the types don't match, we'll receive false,
+    * and raise an error, satisfying the GLSL 1.50 spec, page 56:
+    *
+    *    "The operand types must both be signed or unsigned."
+    */
+   if (!apply_implicit_conversion(type_a, value_b, state) &&
+       !apply_implicit_conversion(type_b, value_a, state)) {
      _mesa_glsl_error(loc, state,
-                       "operands of %% must have the same base type");
+                       "could not implicitly convert operands to "
+                       "modulus (%%) operator");
      return glsl_type::error_type;
   }
+   type_a = value_a->type;
+   type_b = value_b->type;

   /*    "The operands cannot be vectors of differing size. If one operand is
    *    a scalar and the other vector, then the scalar is applied component-
@@ -1267,7 +1286,7 @@ ast_expression::do_hir(exec_list *instructions,
      op[0] = this->subexpressions[0]->hir(instructions, state);
      op[1] = this->subexpressions[1]->hir(instructions, state);

-      type = modulus_result_type(op[0]->type, op[1]->type, state, & loc);
+      type = modulus_result_type(op[0], op[1], state, &loc);

      assert(operations[this->oper] == ir_binop_mod);

@@ -1514,7 +1533,7 @@ ast_expression::do_hir(exec_list *instructions,
      op[0] = this->subexpressions[0]->hir(instructions, state);
      op[1] = this->subexpressions[1]->hir(instructions, state);

-      type = modulus_result_type(op[0]->type, op[1]->type, state, & loc);
+      type = modulus_result_type(op[0], op[1], state, &loc);

      assert(operations[this->oper] == ir_binop_mod);

--- a/src/glsl/nir/nir_lower_vars_to_ssa.c
+++ b/src/glsl/nir/nir_lower_vars_to_ssa.c
@@ -455,7 +455,8 @@ lower_copies_to_load_store(struct deref_node *node,
         struct deref_node *arg_node =
            get_deref_node(copy->variables[i], state);

-         if (arg_node == NULL)
+         /* Only bother removing copy entries for other nodes */
+         if (arg_node == NULL || arg_node == node)
            continue;

         struct set_entry *arg_entry = _mesa_set_search(arg_node->copies, copy);
@@ -466,6 +467,8 @@ lower_copies_to_load_store(struct deref_node *node,
      nir_instr_remove(&copy->instr);
   }

+   node->copies = NULL;
+
   return true;
 }

--- a/src/mesa/drivers/common/meta_generate_mipmap.c
+++ b/src/mesa/drivers/common/meta_generate_mipmap.c
@@ -102,13 +102,13 @@ fallback_required(struct gl_context *ctx, GLenum target,
    */
   if (!mipmap->FBO)
      _mesa_GenFramebuffers(1, &mipmap->FBO);
-   _mesa_BindFramebuffer(GL_FRAMEBUFFER_EXT, mipmap->FBO);
+   _mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, mipmap->FBO);

-   _mesa_meta_bind_fbo_image(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, baseImage, 0);
+   _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, baseImage, 0);

-   status = _mesa_CheckFramebufferStatus(GL_FRAMEBUFFER_EXT);
+   status = _mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER);

-   _mesa_BindFramebuffer(GL_FRAMEBUFFER_EXT, fboSave);
+   _mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, fboSave);

   if (status != GL_FRAMEBUFFER_COMPLETE_EXT) {
      _mesa_perf_debug(ctx, MESA_DEBUG_SEVERITY_HIGH,
@@ -128,6 +128,8 @@ _mesa_meta_glsl_generate_mipmap_cleanup(struct gen_mipmap_state *mipmap)
   mipmap->VAO = 0;
   _mesa_DeleteBuffers(1, &mipmap->VBO);
   mipmap->VBO = 0;
+   _mesa_DeleteSamplers(1, &mipmap->Sampler);
+   mipmap->Sampler = 0;

   _mesa_meta_blit_shader_table_cleanup(&mipmap->shaders);
 }
--- a/src/mesa/drivers/dri/i965/brw_device_info.c
+++ b/src/mesa/drivers/dri/i965/brw_device_info.c
@@ -336,6 +336,15 @@ static const struct brw_device_info brw_device_info_skl_gt3 = {

 static const struct brw_device_info brw_device_info_skl_gt4 = {
   GEN9_FEATURES, .gt = 4,
+   /* From the "L3 Allocation and Programming" documentation:
+    *
+    * "URB is limited to 1008KB due to programming restrictions.  This is not a
+    * restriction of the L3 implementation, but of the FF and other clients.
+    * Therefore, in a GT4 implementation it is possible for the programmed
+    * allocation of the L3 data array to provide 3*384KB=1152KB for URB, but
+    * only 1008KB of this will be used."
+    */
+   .urb.size = 1008 / 3,
 };

 static const struct brw_device_info brw_device_info_bxt = {
--- a/src/mesa/drivers/dri/r200/r200_tex.h
+++ b/src/mesa/drivers/dri/r200/r200_tex.h
@@ -63,7 +63,9 @@ static const struct tx_table tx_table_be[] =
   [ MESA_FORMAT_A8B8G8R8_UNORM ] = { R200_TXFORMAT_ABGR8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
   [ MESA_FORMAT_R8G8B8A8_UNORM ] = { R200_TXFORMAT_RGBA8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
   [ MESA_FORMAT_B8G8R8A8_UNORM ] = { R200_TXFORMAT_ARGB8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
+   [ MESA_FORMAT_B8G8R8X8_UNORM ] = { R200_TXFORMAT_ARGB8888, 0 },
   [ MESA_FORMAT_A8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
+   [ MESA_FORMAT_X8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB8888, 0 },
   [ MESA_FORMAT_BGR_UNORM8 ] = { 0xffffffff, 0 },
   [ MESA_FORMAT_B5G6R5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
   [ MESA_FORMAT_R5G6B5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
@@ -91,7 +93,9 @@ static const struct tx_table tx_table_le[] =
   [ MESA_FORMAT_A8B8G8R8_UNORM ] = { R200_TXFORMAT_RGBA8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
   [ MESA_FORMAT_R8G8B8A8_UNORM ] = { R200_TXFORMAT_ABGR8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
   [ MESA_FORMAT_B8G8R8A8_UNORM ] = { R200_TXFORMAT_ARGB8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
+   [ MESA_FORMAT_B8G8R8X8_UNORM ] = { R200_TXFORMAT_ARGB8888, 0 },
   [ MESA_FORMAT_A8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
+   [ MESA_FORMAT_X8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB8888, 0 },
   [ MESA_FORMAT_BGR_UNORM8 ] = { R200_TXFORMAT_ARGB8888, 0 },
   [ MESA_FORMAT_B5G6R5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
   [ MESA_FORMAT_R5G6B5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
--- a/src/mesa/drivers/dri/radeon/radeon_tex.h
+++ b/src/mesa/drivers/dri/radeon/radeon_tex.h
@@ -63,6 +63,8 @@ static const struct tx_table tx_table[] =
   [ MESA_FORMAT_R8G8B8A8_UNORM ] = { RADEON_TXFORMAT_RGBA8888 | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
   [ MESA_FORMAT_B8G8R8A8_UNORM ] = { RADEON_TXFORMAT_ARGB8888 | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
   [ MESA_FORMAT_A8R8G8B8_UNORM ] = { RADEON_TXFORMAT_ARGB8888 | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
+   [ MESA_FORMAT_B8G8R8X8_UNORM ] = { RADEON_TXFORMAT_ARGB8888, 0 },
+   [ MESA_FORMAT_X8R8G8B8_UNORM ] = { RADEON_TXFORMAT_ARGB8888, 0 },
   [ MESA_FORMAT_BGR_UNORM8 ] = { RADEON_TXFORMAT_ARGB8888, 0 },
   [ MESA_FORMAT_B5G6R5_UNORM ] = { RADEON_TXFORMAT_RGB565, 0 },
   [ MESA_FORMAT_R5G6B5_UNORM ] = { RADEON_TXFORMAT_RGB565, 0 },
--- a/src/mesa/main/copyimage.c
+++ b/src/mesa/main/copyimage.c
@@ -57,6 +57,8 @@ static bool
 prepare_target(struct gl_context *ctx, GLuint name, GLenum *target, int level,
               struct gl_texture_object **tex_obj,
               struct gl_texture_image **tex_image, GLuint *tmp_tex,
+               GLuint *width,
+               GLuint *height,
               const char *dbg_prefix)
 {
   if (name == 0) {
@@ -130,6 +132,8 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum *target, int level,
      _mesa_BindTexture(*target, *tmp_tex);
      *tex_obj = _mesa_lookup_texture(ctx, *tmp_tex);
      *tex_image = _mesa_get_tex_image(ctx, *tex_obj, *target, 0);
+      *width = rb->Width;
+      *height = rb->Height;

      if (!ctx->Driver.BindRenderbufferTexImage(ctx, rb, *tex_image)) {
         _mesa_problem(ctx, "Failed to create texture from renderbuffer");
@@ -175,6 +179,9 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum *target, int level,
                     "glCopyImageSubData(%sLevel = %u)", dbg_prefix, level);
         return false;
      }
+
+      *width = (*tex_image)->Width;
+      *height = (*tex_image)->Height;
   }

   return true;
@@ -409,6 +416,7 @@ _mesa_CopyImageSubData(GLuint srcName, GLenum srcTarget, GLint srcLevel,
   GLuint tmpTexNames[2] = { 0, 0 };
   struct gl_texture_object *srcTexObj, *dstTexObj;
   struct gl_texture_image *srcTexImage, *dstTexImage;
+   GLuint src_w, src_h, dst_w, dst_h;
   GLuint src_bw, src_bh, dst_bw, dst_bh;
   int i;

@@ -429,16 +437,42 @@ _mesa_CopyImageSubData(GLuint srcName, GLenum srcTarget, GLint srcLevel,
   }

   if (!prepare_target(ctx, srcName, &srcTarget, srcLevel,
-                       &srcTexObj, &srcTexImage, &tmpTexNames[0], "src"))
+                       &srcTexObj, &srcTexImage, &tmpTexNames[0],
+                       &src_w, &src_h, "src"))
      goto cleanup;

   if (!prepare_target(ctx, dstName, &dstTarget, dstLevel,
-                       &dstTexObj, &dstTexImage, &tmpTexNames[1], "dst"))
+                       &dstTexObj, &dstTexImage, &tmpTexNames[1],
+                       &dst_w, &dst_h, "dst"))
      goto cleanup;

   _mesa_get_format_block_size(srcTexImage->TexFormat, &src_bw, &src_bh);
+
+   /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core Profile
+    * spec says:
+    *
+    *    An INVALID_VALUE error is generated if the dimensions of either
+    *    subregion exceeds the boundaries of the corresponding image object,
+    *    or if the image format is compressed and the dimensions of the
+    *    subregion fail to meet the alignment constraints of the format.
+    *
+    * and Section 8.7 (Compressed Texture Images) says:
+    *
+    *    An INVALID_OPERATION error is generated if any of the following
+    *    conditions occurs:
+    *
+    *      * width is not a multiple of four, and width + xoffset is not
+    *        equal to the value of TEXTURE_WIDTH.
+    *      * height is not a multiple of four, and height + yoffset is not
+    *        equal to the value of TEXTURE_HEIGHT.
+    *
+    * so we take that to mean that you can copy the "last" block of a
+    * compressed texture image even if it's smaller than the minimum block
+    * dimensions.
+    */
   if ((srcX % src_bw != 0) || (srcY % src_bh != 0) ||
-       (srcWidth % src_bw != 0) || (srcHeight % src_bh != 0)) {
+       (srcWidth % src_bw != 0 && (srcX + srcWidth) != src_w) ||
+       (srcHeight % src_bh != 0 && (srcY + srcHeight) != src_h)) {
      _mesa_error(ctx, GL_INVALID_VALUE,
                  "glCopyImageSubData(unaligned src rectangle)");
      goto cleanup;
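
The relaxed source-rectangle check in the hunk above can be read as a per-axis predicate. The following is a minimal standalone sketch, not Mesa code: `region_axis_ok` and `region_ok` are hypothetical names introduced only for illustration, and the parameters stand in for `srcX`/`srcWidth`, the block size from `_mesa_get_format_block_size`, and the image dimensions returned by `prepare_target`.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of Mesa): checks one axis of a copy region
 * against the relaxed alignment rule described in the comment above. */
static bool
region_axis_ok(unsigned offset, unsigned size,
               unsigned block, unsigned image_size)
{
   /* The starting offset must always sit on a block boundary. */
   if (offset % block != 0)
      return false;

   /* The size must be a multiple of the block size, unless the region
    * ends exactly at the image edge (the "last" partial block). */
   return (size % block == 0) || (offset + size == image_size);
}

/* Hypothetical helper: both axes must pass for the region to be accepted.
 * A false result corresponds to the GL_INVALID_VALUE path in the diff. */
static bool
region_ok(unsigned x, unsigned y, unsigned w, unsigned h,
          unsigned bw, unsigned bh, unsigned img_w, unsigned img_h)
{
   return region_axis_ok(x, w, bw, img_w) &&
          region_axis_ok(y, h, bh, img_h);
}
```

With 4x4 blocks and a 10x10 image, a 2x2 region at (8, 8) is accepted because it ends at the image edge, while the same 2x2 region at (0, 0) is rejected.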
Author	SHA1	Message	Date
Emil Velikov	04fd3a6f62	docs: add release notes for 11.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 11:43:55 +00:00
Emil Velikov	5018418573	Update version to 11.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 11:42:52 +00:00
Emil Velikov	040785c08b	automake: use static llvm for make distcheck With llvm 3.7 semi-dropping the autoconf build, we rely on their cmake build. With the latter of which annoyingly using another (busted?) SONAME. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `c45b4257c2`)	2015-11-21 11:42:52 +00:00
Oded Gabbay	0c56517d16	llvmpipe: use simple coeffs calc for 128bit vectors There are currently two methods in llvmpipe code to calculate coeffs to be used as inputs for the fragment shader. The two methods use slightly different ways to do the floating point calculations and thus produce slightly different results. The decision which method to use is determined by the size of the vector that is used by the platform. For vectors with size of more than 128bit, a single-step method is used, in which coeffs_init_simple() + attribs_update_simple() are called. For vectors with size of 128bit or less, a two-step method is used, in which coeffs_init() + attribs_update() are called. This causes some piglit tests (clip-distance-bulk-copy, interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with 128bit vectors (such as ppc64le or x86-64 without AVX). This patch makes platforms with 128bit vectors use the single-step method (aka "simple" method) instead of the two-step method. This would make the resulting coeffs identical between more platforms, make sure the piglit tests pass, and make debugging and maintainability a bit easier as the generated LLVM IR will be the same for more platforms. 
The performance impact is negligible for x86-64 without AVX, and basically non-existent for ppc64le, as can be seen from the following benchmarking results: - glxspheres, on ppc64le: - original code: 4.892745317 frames/sec 5.460303857 Mpixels/sec - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec - Additional 0.8% performance boost - glxspheres, on x86-64 without AVX: - original code: 20.16418809 frames/sec 22.50323395 Mpixels/sec - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec - Additional 0.74% performance boost - glmark2, on ppc64le: - original code: score of 58 - with my change: score of 57 - glmark2, on x86-64 without AVX: - original code: score of 175 - with the patch: score of 167 - Impact of -4.5% on performance - OpenArena, on ppc64le: - original code: 3398 frames 1719.0 seconds 2.0 fps 255.0/505.9/2773.0/0.0 ms - with the patch: 3398 frames 1690.4 seconds 2.0 fps 241.0/497.5/2563.0/0.2 ms - 29 seconds faster with the patch, which is about 2% - OpenArena, on x86-64 without AVX: - original code: 3398 frames 239.6 seconds 14.2 fps 38.0/70.5/719.0/14.6 ms - with the patch: 3398 frames 244.4 seconds 13.9 fps 38.0/71.9/697.0/14.3 ms - 0.3 fps slower with the patch (about 2%) Additional details can be found at: http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (cherry picked from commit `39b4dfe6ab`)	2015-11-18 19:13:17 +00:00
Eric Anholt	d425a2f26c	vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB. It looks like nir_lower_idiv is going to use it soon, so add support. With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with GL 3.0 forced on). Cc: "11.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a4bf28178f`) [Emil Velikov: Resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Conflicts: src/gallium/drivers/vc4/vc4_qpu_emit.c	2015-11-18 18:59:34 +00:00
Roland Scheidegger	c667a0d1d3	r200: fix bgrx8/xrgb8 blits Since `779cabfc7d` the same txformat table entries are used for "normal" texturing as well as for blits. However, I forgot to put in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing path can't hit them because the radeon tex format chooser will never choose them, but we get that format from the dri buffers (at least I assume we got it from there). This is untested but essentially addressing the same bug as for radeon. (I don't think that the second entry per le/be table is actually necessary, but shouldn't hurt...) Tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a2611ffe4b`)	2015-11-18 18:59:20 +00:00
Roland Scheidegger	f112696f15	radeon: fix bgrx8/xrgb8 blits Since `d21320f625` the same txformat table entries are used for "normal" texturing as well as for blits. However, I forgot to put in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing path can't hit them because the radeon tex format chooser will never choose them, but we get that format from the dri buffers (at least I assume we got it from there). This caused lots of piglit regressions (and probably lots of trouble outside piglit too). This fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=92900. Tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `983614dbed`)	2015-11-18 18:59:20 +00:00
Ian Romanick	acbaa3d0fc	meta/generate_mipmap: Only modify the draw framebuffer binding in fallback_required Previously GL_FRAMEBUFFER was used. However, if GL_EXT_framebuffer_blit is supported (note: it is supported by every Mesa driver), this is sometimes an alias for GL_DRAW_FRAMEBUFFER (getters) and sometimes an alias for both GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER (setters). As a result, the code saved one binding but modified both. If the bindings were different, the GL_READ_FRAMEBUFFER would be incorrect on exit. Fixes the piglit fbo-generatemipmap-versus-READ_FRAMEBUFFER test. Ideally this function would use DSA functions and not modify the binding at all. However, that would be a much more intrusive change because _mesa_meta_bind_fbo_image would also need to be modified. _mesa_meta_bind_fbo_image has a lot of callers. Much of this code is about to get a major rework due to bug #92363, so I don't think it matters too much. In fact, I discovered this bug while working on the other bug. Le bon temps! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c40a88b6c5`)	2015-11-18 18:59:19 +00:00
Alex Deucher	55325d0632	radeonsi: enable optimal raster config setting for fiji (v2) Requires proper kernel tiling configuration so check the tiling config registers. v2: send the right version of the patch Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `00f554abba`)	2015-11-18 18:59:19 +00:00
Ilia Mirkin	09a7ee2782	nouveau: don't expose HEVC decoding support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `f94e1d9738`)	2015-11-18 18:59:19 +00:00
Kenneth Graunke	120559bd30	glsl: Allow implicit int -> uint conversions for the % operator. GLSL 4.00 and GL_ARB_gpu_shader5 introduced a new int -> uint implicit conversion rule and updated the rules for modulus to use them. (In earlier languages, none of the implicit conversion rules did anything relevant, so there was no point in applying them.) This allows expressions such as: int foo; uint bar; uint mod = foo % bar; Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `511de1a80c`)	2015-11-18 18:59:19 +00:00
Ian Romanick	0b7bdb0668	meta/generate_mipmap: Don't leak the sampler object Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit `758f12fd98`)	2015-11-18 18:59:19 +00:00
Marek Olšák	f9325a97b3	radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney otherwise the SX or CB blocks can go bananas Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `40912dd91e`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/gallium/drivers/radeonsi/si_state.c	2015-11-18 18:59:13 +00:00
Jason Ekstrand	0dd0d6696f	nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store Previously, we walked through a given deref_node's copies and, after lowering the copy away, removed it from both the source and destination copy sets. This commit changes this to only remove it from the other node's copy set (not the one we're lowering). At the end of the loop, we just throw away the copy set for the node we're lowering since that node no longer has any copies. This has two advantages: 1) It's more efficient because we're doing potentially half as many set search operations. 2) It now properly handles copies from a node to itself. Previously, it would delete the copy from the set when processing the destination and then assert-fail when we couldn't find it for the source. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92588 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (cherry picked from commit `226ba889a0`)	2015-11-18 18:58:53 +00:00
Ben Widawsky	4b3d4ceaba	i965/skl/gt4: Fix URB programming restriction. The comment in the code details the restriction. Thanks to Ken for having a very helpful conversation with me, and spotting the blurb in the link I sent him :P. There are still stability problems for me on GT4, but this definitely helps with some of the failures. v2: Comment fixes Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `55314c5be4`)	2015-11-18 18:58:53 +00:00
Dave Airlie	20f0d88495	r600: initialised PGM_RESOURCES_2 for ES/GS This fixes the corruption on rendering that we are seeing in certain geometry shaders. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91780 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested / Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `df8af7d751`)	2015-11-18 18:58:53 +00:00
Ilia Mirkin	fa527fce5c	mesa/copyimage: allow width/height to not be multiples of block For compressed textures, the image size is not necessarily a multiple of the block size (e.g. the last mip levels). Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core Profile spec says: An INVALID_VALUE error is generated if the dimensions of either subregion exceeds the boundaries of the corresponding image object, or if the image format is compressed and the dimensions of the subregion fail to meet the alignment constraints of the format. and Section 8.7 (Compressed Texture Images) says: An INVALID_OPERATION error is generated if any of the following conditions occurs: * width is not a multiple of four, and width + xoffset is not equal to the value of TEXTURE_WIDTH. * height is not a multiple of four, and height + yoffset is not equal to the value of TEXTURE_HEIGHT. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `912babba7b`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/mesa/main/copyimage.c	2015-11-18 18:58:46 +00:00
Eric Anholt	9bbdd99d8c	vc4: Return NULL when we can't make our shadow for a sampler view. I'm not sure what the caller does is appropriate (just have a NULL sampler at this slot), but it fixes the immediate crash. Cc: "11.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `5980389bbf`)	2015-11-18 18:49:41 +00:00
Eric Anholt	e54ac25120	vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails. I was afraid our callers weren't prepared for this, but it looks like at least for resource creation, mesa/st throws an error appropriately. Cc: "11.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `eb8fb0064d`)	2015-11-18 18:49:41 +00:00
Michel Dänzer	312ec1946d	winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3 Fixes GPUVM conflicts with non-4K page size. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92738 v2: Replace sanitization of VM base address alignment with comment why that's not necessary. v3: Use unsigned instead of long as the type for the size_align member. (Marek) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian König <christian.koenig@amd.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `24abbaff9a`)	2015-11-18 18:49:41 +00:00
Boyuan Zhang	6a958b0b51	radeon/uvd: fix VC-1 simple/main profile decode v2 We just needed to set the extra width/height fields to get this working. v2 (chk): rebased, CC stable added, commit message added, fixed coding style Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `6bad554d98`)	2015-11-18 18:49:41 +00:00
Boyuan Zhang	71a785fc5f	st/vaapi: fix vaapi VC-1 simple/main corruption v2 Apply the start code fix only to advanced profile. v2 (chk): add commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `ed55def44f`)	2015-11-18 18:49:41 +00:00
Emil Velikov	f6e19f673e	cherry-ignore: add the swrast front buffer support Although a sort of a bugfix, it causes many piglit regressions and even lockup with llvmpipe. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 18:49:40 +00:00
Emil Velikov	66c949d0a1	docs: add sha256 checksums for 11.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-11 11:10:30 +00:00
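
The winsys/radeon entry in the log above replaces a hardcoded 4096-byte page size with the CPU page size. A minimal sketch of that idea on a POSIX system follows. It is not the actual winsys code: `cpu_page_size` and `align_to_page` are hypothetical helper names, and the 4096-byte fallback is an assumption made only for this example.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: query the CPU page size at run time instead of
 * assuming 4096 bytes. Falls back to 4096 only if sysconf() fails
 * (the fallback is an assumption for this sketch). */
static uint64_t
cpu_page_size(void)
{
   long sz = sysconf(_SC_PAGESIZE);
   return sz > 0 ? (uint64_t)sz : 4096;
}

/* Hypothetical helper: round a size up to the next multiple of page_size.
 * Assumes page_size is a power of two, which holds for real page sizes. */
static uint64_t
align_to_page(uint64_t size, uint64_t page_size)
{
   return (size + page_size - 1) & ~(page_size - 1);
}
```

On a machine with 64 KiB pages, `align_to_page(4097, cpu_page_size())` rounds up to 65536 rather than 8192, which is the kind of difference a hardcoded 4096 would miss.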