Compare commits

...

24 Commits

Author SHA1 Message Date
Emil Velikov
04fd3a6f62 docs: add release notes for 11.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-21 11:43:55 +00:00
Emil Velikov
5018418573 Update version to 11.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-21 11:42:52 +00:00
Emil Velikov
040785c08b automake: use static llvm for make distcheck
With llvm 3.7 semi-dropping the autoconf build, we rely on their cmake
build. With the latter of which annoyingly using another (busted?)
SONAME.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit c45b4257c2)
2015-11-21 11:42:52 +00:00
Oded Gabbay
0c56517d16 llvmpipe: use simple coeffs calc for 128bit vectors
There are currently two methods in llvmpipe code to calculate coeffs to
be used as inputs for the fragment shader. The two methods use slightly
different ways to do the floating point calculations and thus produce
slightly different results.

The decision which method to use is determined by the size of the vector
that is used by the platform.

For vectors with size of more than 128bit, a single-step method is used,
in which coeffs_init_simple() + attribs_update_simple() are called.

For vectors with size of 128bit or less, a two-step method is used, in
which coeffs_init() + attribs_update() are called.

This causes some piglit tests (clip-distance-bulk-copy,
interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with
128bit vectors (such as ppc64le or x86-64 without AVX).

This patch makes platforms with 128bit vectors use the single-step
method (aka "simple" method) instead of the two-step method.
This would make the resulting coeffs identical between more platforms,
make sure the piglit tests passes, and make debugging and maintainability
a bit easier as the generated LLVM IR will be the same for more platforms.

The performance impact is negligible for x86-64 without AVX, and
basically non-existent for ppc64le, as it can be seen from the following
benchmarking results:

- glxspheres, on ppc64le:

   - original code:  4.892745317 frames/sec 5.460303857 Mpixels/sec
   - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec
   - Additional 0.8% performance boost

- glxspheres, on x86-64 without AVX:

   - original code:  20.16418809 frames/sec 22.50323395 Mpixels/sec
   - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec
   - Additional 0.74% performance boost

- glmark2, on ppc64le:

  - original code:  score of 58
  - with my change: score of 57

- glmark2, on x86-64 without AVX:

  - original code:  score of 175
  - with the patch: score of 167
  - Impact of of -4.5% on performance

- OpenArena, on ppc64le:

  - original code:  3398 frames 1719.0 seconds 2.0 fps
                    255.0/505.9/2773.0/0.0 ms

  - with the patch: 3398 frames 1690.4 seconds 2.0 fps
                    241.0/497.5/2563.0/0.2 ms

  - 29 seconds faster with the patch, which is about 2%

- OpenArena, on x86-64 without AVX:

  - original code:  3398 frames 239.6 seconds 14.2 fps
                    38.0/70.5/719.0/14.6 ms

  - with the patch: 3398 frames 244.4 seconds 13.9 fps
                    38.0/71.9/697.0/14.3 ms

  - 0.3 fps slower with the patch (about 2%)

Additional details can be found at:
http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 39b4dfe6ab)
2015-11-18 19:13:17 +00:00
Eric Anholt
d425a2f26c vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB.
It looks like nir_lower_idiv is going to use it soon, so add support.
With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with
GL 3.0 forced on).

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a4bf28178f)
[Emil Velikov: Resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/vc4/vc4_qpu_emit.c
2015-11-18 18:59:34 +00:00
Roland Scheidegger
c667a0d1d3 r200: fix bgrx8/xrgb8 blits
Since 779cabfc7d the same txformat table entries
are used for "normal" texturing as well as for blits. However, I forgot to put
in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
path can't hit them because the radeon tex format chooser will never chose
them, but we get that format from the dri buffers (at least I assume we got
it from there).
This is untested but essentially addressing the same bug as for radeon.
(I don't think that the second entry per le/be table is actually necessary,
but shouldn't hurt...)

Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2611ffe4b)
2015-11-18 18:59:20 +00:00
Roland Scheidegger
f112696f15 radeon: fix bgrx8/xrgb8 blits
Since d21320f625 the same txformat table entries
are used for "normal" texturing as well as for blits. However, I forgot to put
in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
path can't hit them because the radeon tex format chooser will never chose
them, but we get that format from the dri buffers (at least I assume we got
it from there). This caused lots of piglit regressions (and probably lots of
trouble outside piglit too).
This fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=92900.

Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 983614dbed)
2015-11-18 18:59:20 +00:00
Ian Romanick
acbaa3d0fc meta/generate_mipmap: Only modify the draw framebuffer binding in fallback_required
Previously GL_FRAMEBUFFER was used.  However, if GL_EXT_framebuffer_blit
is supported (note: it is supported by every Mesa driver), this is
*sometimes* an alias for GL_DRAW_FRAMEBUFFER (getters) and *sometimes*
an alias for *both* GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER
(setters).  As a result, the code saved one binding but modified both.
If the bindings were different, the GL_READ_FRAMEBUFFER would be
incorrect on exit.

Fixes the piglit fbo-generatemipmap-versus-READ_FRAMEBUFFER test.

Ideally this function would use DSA functions and not modify the binding
at all.  However, that would be a much more intrusive change because
_mesa_meta_bind_fbo_image would also need to be modified.
_mesa_meta_bind_fbo_image has a lot of callers.  Much of this code is
about to get a major rework due to bug #92363, so I don't think it
matters too much.  In fact, I discovered this bug while working on the
other bug.  Le bon temps!

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c40a88b6c5)
2015-11-18 18:59:19 +00:00
Alex Deucher
55325d0632 radeonsi: enable optimal raster config setting for fiji (v2)
Requires proper kernel tiling configuration so check the tiling
config registers.

v2: send the right version of the patch

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 00f554abba)
2015-11-18 18:59:19 +00:00
Ilia Mirkin
09a7ee2782 nouveau: don't expose HEVC decoding support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit f94e1d9738)
2015-11-18 18:59:19 +00:00
Kenneth Graunke
120559bd30 glsl: Allow implicit int -> uint conversions for the % operator.
GLSL 4.00 and GL_ARB_gpu_shader5 introduced a new int -> uint implicit
conversion rule and updated the rules for modulus to use them.  (In
earlier languages, none of the implicit conversion rules did anything
relevant, so there was no point in applying them.)

This allows expressions such as:

   int foo;
   uint bar;
   uint mod = foo % bar;

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 511de1a80c)
2015-11-18 18:59:19 +00:00
Ian Romanick
0b7bdb0668 meta/generate_mipmap: Don't leak the sampler object
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 758f12fd98)
2015-11-18 18:59:19 +00:00
Marek Olšák
f9325a97b3 radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney
otherwise the SX or CB blocks can go bananas

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 40912dd91e)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state.c
2015-11-18 18:59:13 +00:00
Jason Ekstrand
0dd0d6696f nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store
Previously, we walked through a given deref_node's copies and, after
lowering the copy away, removed it from both the source and destination
copy sets.  This commit changes this to only remove it from the other
node's copy set (not the one we're lowering).  At the end of the loop, we
just throw away the copy set for the node we're lowering since that node no
longer has any copies.  This has two advantages:

 1) It's more efficient because we're doing potentially half as many set
    search operations.

 2) It now properly handles copies from a node to itself.  Perviously, it
    would delete the copy from the set when processing the destinatioon and
    then assert-fail when we couldn't find it for the source.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92588
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit 226ba889a0)
2015-11-18 18:58:53 +00:00
Ben Widawsky
4b3d4ceaba i965/skl/gt4: Fix URB programming restriction.
The comment in the code details the restriction. Thanks to Ken for having a very
helpful conversation with me, and spotting the blurb in the link I sent him :P.

There are still stability problems for me on GT4, but this definitely helps with
some of the failures.

v2: Comment fixes

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 55314c5be4)
2015-11-18 18:58:53 +00:00
Dave Airlie
20f0d88495 r600: initialised PGM_RESOURCES_2 for ES/GS
This fixes the corruption on rendering that we are seeing in
certain geometry shaders.

Fixes:  https://bugs.freedesktop.org/show_bug.cgi?id=91780
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested / Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

(cherry picked from commit df8af7d751)
2015-11-18 18:58:53 +00:00
Ilia Mirkin
fa527fce5c mesa/copyimage: allow width/height to not be multiples of block
For compressed textures, the image size is not necessarily a multiple of
the block size (e.g. the last mip levels). Section 18.3.2 (Copying
Between Images) of the OpenGL 4.5 Core Profile spec says:

    An INVALID_VALUE error is generated if the dimensions of either
    subregion exceeds the boundaries of the corresponding image
    object, or if the image format is compressed and the dimensions of
    the subregion fail to meet the alignment constraints of the
    format.

and Section 8.7 (Compressed Texture Images) says:

    An INVALID_OPERATION error is generated if any of the following
    conditions occurs:

      * width is not a multiple of four, and width + xoffset is not
        equal to the value of TEXTURE_WIDTH.
      * height is not a multiple of four, and height + yoffset is not
        equal to the value of TEXTURE_HEIGHT.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 912babba7b)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/main/copyimage.c
2015-11-18 18:58:46 +00:00
Eric Anholt
9bbdd99d8c vc4: Return NULL when we can't make our shadow for a sampler view.
I'm not sure what the caller does is appropriate (just have a NULL sampler
at this slot), but it fixes the immediate crash.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5980389bbf)
2015-11-18 18:49:41 +00:00
Eric Anholt
e54ac25120 vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.
I was afraid our callers weren't prepared for this, but it looks like
at least for resource creation, mesa/st throws an error appropriately.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb8fb0064d)
2015-11-18 18:49:41 +00:00
Michel Dänzer
312ec1946d winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3
Fixes GPUVM conflicts with non-4K page size.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92738

v2: Replace sanitization of VM base address alignment with comment why
    that's not necessary.
v3: Use unsigned instead of long as the type for the size_align member.
    (Marek)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 24abbaff9a)
2015-11-18 18:49:41 +00:00
Boyuan Zhang
6a958b0b51 radeon/uvd: fix VC-1 simple/main profile decode v2
We just needed to set the extra width/height fields to get this working.

v2 (chk): rebased, CC stable added, commit message added, fixed coding style

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6bad554d98)
2015-11-18 18:49:41 +00:00
Boyuan Zhang
71a785fc5f st/vaapi: fix vaapi VC-1 simple/main corruption v2
Apply the start code fix only to advanced profile.

v2 (chk): add commit message

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ed55def44f)
2015-11-18 18:49:41 +00:00
Emil Velikov
f6e19f673e cherry-ignore: add the swrast front buffer support
Although a sort of a bugfix, it causes many piglit regressions and even
lockup with llvmpipe.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 18:49:40 +00:00
Emil Velikov
66c949d0a1 docs: add sha256 checksums for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-11 11:10:30 +00:00
29 changed files with 357 additions and 56 deletions

View File

@@ -32,6 +32,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-vdpau \
--enable-xa \
--enable-xvmc \
--disable-llvm-shared-libs \
--with-egl-platforms=x11,wayland,drm \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast

View File

@@ -1 +1 @@
11.0.5
11.0.6

View File

@@ -1,2 +1,5 @@
# The commit base differs greatly between 11.0 and master
2832ca95ecce064c7d841a3a374c2179f56161be glsl: fix stream qualifier for blocks with an instance name
2832ca95ecce064c7d841a3a374c2179f56161be glsl: fix stream qualifier for blocks with an instance name
# Somewhat of a mixed feature/bugfix patch, causing some 200 piglit regressions
2b676570960277d47477822ffeccc672613f9142 gallium/swrast: fix front buffer blitting. (v2)

View File

@@ -31,7 +31,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD
8495ef5c06f7f726452462b7d408a5b40048373ff908f2283a3b4d1f49b45ee6 mesa-11.0.5.tar.gz
9c255a2a6695fcc6ef4a279e1df0aeaf417dc142f39ee59dfb533d80494bb67a mesa-11.0.5.tar.xz
</pre>

144
docs/relnotes/11.0.6.html Normal file
View File

@@ -0,0 +1,144 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.6 Release Notes / November 21, 2015</h1>
<p>
Mesa 11.0.6 is a bug fix release which fixes bugs found since the 11.0.5 release.
</p>
<p>
Mesa 11.0.6 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91780">Bug 91780</a> - Rendering issues with geometry shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92588">Bug 92588</a> - [HSW,BDW,BSW,SKL-Y][GLES 3.1 CTS] ES31-CTS.arrays_of_arrays.InteractionFunctionCalls2 - assert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92738">Bug 92738</a> - Randon R7 240 doesn't work on 16KiB page size platform</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92860">Bug 92860</a> - [radeonsi][bisected] st/mesa: implement ARB_copy_image - Corruption in ARK Survival Evolved</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92900">Bug 92900</a> - [regression bisected] About 700 piglit regressions is what could go wrong</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>radeonsi: enable optimal raster config setting for fiji (v2)</li>
</ul>
<p>Ben Widawsky (1):</p>
<ul>
<li>i965/skl/gt4: Fix URB programming restriction.</li>
</ul>
<p>Boyuan Zhang (2):</p>
<ul>
<li>st/vaapi: fix vaapi VC-1 simple/main corruption v2</li>
<li>radeon/uvd: fix VC-1 simple/main profile decode v2</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600: initialised PGM_RESOURCES_2 for ES/GS</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.5</li>
<li>cherry-ignore: add the swrast front buffer support</li>
<li>automake: use static llvm for make distcheck</li>
<li>Update version to 11.0.6</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.</li>
<li>vc4: Return NULL when we can't make our shadow for a sampler view.</li>
<li>vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB.</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>meta/generate_mipmap: Don't leak the sampler object</li>
<li>meta/generate_mipmap: Only modify the draw framebuffer binding in fallback_required</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>mesa/copyimage: allow width/height to not be multiples of block</li>
<li>nouveau: don't expose HEVC decoding support</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>glsl: Allow implicit int -&gt; uint conversions for the % operator.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3</li>
</ul>
<p>Oded Gabbay (1):</p>
<ul>
<li>llvmpipe: use simple coeffs calc for 128bit vectors</li>
</ul>
<p>Roland Scheidegger (2):</p>
<ul>
<li>radeon: fix bgrx8/xrgb8 blits</li>
<li>r200: fix bgrx8/xrgb8 blits</li>
</ul>
</div>
</body>
</html>

View File

@@ -746,7 +746,12 @@ lp_build_interp_soa_init(struct lp_build_interp_soa_context *bld,
pos_init(bld, x0, y0);
if (coeff_type.length > 4) {
/*
* Simple method (single step interpolation) may be slower if vector length
* is just 4, but the results are different (generally less accurate) with
* the other method, so always use more accurate version.
*/
if (1) {
bld->simple_interp = TRUE;
{
/* XXX this should use a global static table */

View File

@@ -437,6 +437,7 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
/* VP3 does not support MPEG4, VP4+ do. */
return entrypoint == PIPE_VIDEO_ENTRYPOINT_BITSTREAM &&
profile >= PIPE_VIDEO_PROFILE_MPEG1 &&
profile < PIPE_VIDEO_PROFILE_HEVC_MAIN &&
(!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4) &&
firmware_present(pscreen, profile);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:

View File

@@ -2342,6 +2342,8 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx)
r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0);
/* to avoid GPU doing any preloading of constant from random address */
@@ -2781,6 +2783,8 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx)
r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0);
/* to avoid GPU doing any preloading of constant from random address */

View File

@@ -1497,6 +1497,7 @@
#define S_028878_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28)
#define G_028878_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028878_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_02887C_SQ_PGM_RESOURCES_2_GS 0x02887C
#define R_028890_SQ_PGM_RESOURCES_ES 0x028890
#define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0)
@@ -1511,6 +1512,7 @@
#define S_028890_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28)
#define G_028890_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028890_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_028894_SQ_PGM_RESOURCES_2_ES 0x028894
#define R_028864_SQ_PGM_RESOURCES_2_VS 0x028864
#define S_028864_SINGLE_ROUND(x) (((x) & 0x3) << 0)

View File

@@ -940,6 +940,12 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
dec->msg->body.decode.width_in_samples = dec->base.width;
dec->msg->body.decode.height_in_samples = dec->base.height;
if ((picture->profile == PIPE_VIDEO_PROFILE_VC1_SIMPLE) ||
(picture->profile == PIPE_VIDEO_PROFILE_VC1_MAIN)) {
dec->msg->body.decode.width_in_samples = align(dec->msg->body.decode.width_in_samples, 16) / 16;
dec->msg->body.decode.height_in_samples = align(dec->msg->body.decode.height_in_samples, 16) / 16;
}
dec->msg->body.decode.dpb_size = dec->dpb.res->buf->size;
dec->msg->body.decode.bsd_size = bs_size;
dec->msg->body.decode.db_pitch = dec->base.width;

View File

@@ -244,8 +244,7 @@ int rvid_get_video_param(struct pipe_screen *screen,
return codec != PIPE_VIDEO_FORMAT_MPEG4;
return true;
case PIPE_VIDEO_FORMAT_VC1:
/* FIXME: VC-1 simple/main profile is broken */
return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED;
return true;
case PIPE_VIDEO_FORMAT_HEVC:
/* Carrizo only supports HEVC Main */
return rscreen->family >= CHIP_CARRIZO &&

View File

@@ -3176,6 +3176,7 @@ si_write_harvested_raster_configs(struct si_context *sctx,
static void si_init_config(struct si_context *sctx)
{
struct si_screen *sscreen = sctx->screen;
unsigned num_rb = MIN2(sctx->screen->b.info.r600_num_backends, 16);
unsigned rb_mask = sctx->screen->b.info.si_backend_enabled_mask;
unsigned raster_config, raster_config_1;
@@ -3243,9 +3244,14 @@ static void si_init_config(struct si_context *sctx)
raster_config_1 = 0x0000002e;
break;
case CHIP_FIJI:
/* Fiji should be same as Hawaii, but that causes corruption in some cases */
raster_config = 0x16000012; /* 0x3a00161a */
raster_config_1 = 0x0000002a; /* 0x0000002e */
if (sscreen->b.info.cik_macrotile_mode_array[0] == 0x000000e8) {
/* old kernels with old tiling config */
raster_config = 0x16000012;
raster_config_1 = 0x0000002a;
} else {
raster_config = 0x3a00161a;
raster_config_1 = 0x0000002e;
}
break;
case CHIP_TONGA:
raster_config = 0x16000012;
@@ -3342,5 +3348,8 @@ static void si_init_config(struct si_context *sctx)
si_pm4_set_reg(pm4, R_028C5C_VGT_OUT_DEALLOC_CNTL, 32);
}
if (sctx->b.family == CHIP_STONEY)
si_pm4_set_reg(pm4, R_028754_SX_PS_DOWNCONVERT, 0);
sctx->init_config = pm4;
}

View File

@@ -168,8 +168,9 @@ retry:
vc4_bo_cache_free_all(&screen->bo_cache);
goto retry;
}
fprintf(stderr, "create ioctl failure\n");
abort();
free(bo);
return NULL;
}
screen->bo_count++;

View File

@@ -143,6 +143,8 @@ qir_opt_algebraic(struct vc4_compile *c)
case QOP_SEL_X_Y_ZC:
case QOP_SEL_X_Y_NS:
case QOP_SEL_X_Y_NC:
case QOP_SEL_X_Y_CS:
case QOP_SEL_X_Y_CC:
if (is_zero(c, inst->src[1])) {
/* Replace references to a 0 uniform value
* with the SEL_X_0 equivalent.

View File

@@ -1055,6 +1055,10 @@ ntq_emit_alu(struct vc4_compile *c, nir_alu_instr *instr)
qir_SF(c, qir_SUB(c, src[0], src[1]));
*dest = qir_SEL_X_0_NC(c, qir_uniform_ui(c, ~0));
break;
case nir_op_uge:
qir_SF(c, qir_SUB(c, src[0], src[1]));
*dest = qir_SEL_X_0_CC(c, qir_uniform_ui(c, ~0));
break;
case nir_op_ilt:
qir_SF(c, qir_SUB(c, src[0], src[1]));
*dest = qir_SEL_X_0_NS(c, qir_uniform_ui(c, ~0));

View File

@@ -62,10 +62,14 @@ static const struct qir_op_info qir_op_info[] = {
[QOP_SEL_X_0_NC] = { "fsel_x_0_nc", 1, 1, false, true },
[QOP_SEL_X_0_ZS] = { "fsel_x_0_zs", 1, 1, false, true },
[QOP_SEL_X_0_ZC] = { "fsel_x_0_zc", 1, 1, false, true },
[QOP_SEL_X_0_CS] = { "fsel_x_0_cs", 1, 1, false, true },
[QOP_SEL_X_0_CC] = { "fsel_x_0_cc", 1, 1, false, true },
[QOP_SEL_X_Y_NS] = { "fsel_x_y_ns", 1, 2, false, true },
[QOP_SEL_X_Y_NC] = { "fsel_x_y_nc", 1, 2, false, true },
[QOP_SEL_X_Y_ZS] = { "fsel_x_y_zs", 1, 2, false, true },
[QOP_SEL_X_Y_ZC] = { "fsel_x_y_zc", 1, 2, false, true },
[QOP_SEL_X_Y_CS] = { "fsel_x_y_cs", 1, 2, false, true },
[QOP_SEL_X_Y_CC] = { "fsel_x_y_cc", 1, 2, false, true },
[QOP_RCP] = { "rcp", 1, 1, false, true },
[QOP_RSQ] = { "rsq", 1, 1, false, true },
@@ -193,10 +197,14 @@ qir_depends_on_flags(struct qinst *inst)
case QOP_SEL_X_0_NC:
case QOP_SEL_X_0_ZS:
case QOP_SEL_X_0_ZC:
case QOP_SEL_X_0_CS:
case QOP_SEL_X_0_CC:
case QOP_SEL_X_Y_NS:
case QOP_SEL_X_Y_NC:
case QOP_SEL_X_Y_ZS:
case QOP_SEL_X_Y_ZC:
case QOP_SEL_X_Y_CS:
case QOP_SEL_X_Y_CC:
return true;
default:
return false;

View File

@@ -91,11 +91,15 @@ enum qop {
QOP_SEL_X_0_ZC,
QOP_SEL_X_0_NS,
QOP_SEL_X_0_NC,
QOP_SEL_X_0_CS,
QOP_SEL_X_0_CC,
/* Selects the src[0] if the ns flag bit is set, otherwise src[1]. */
QOP_SEL_X_Y_ZS,
QOP_SEL_X_Y_ZC,
QOP_SEL_X_Y_NS,
QOP_SEL_X_Y_NC,
QOP_SEL_X_Y_CS,
QOP_SEL_X_Y_CC,
QOP_FTOI,
QOP_ITOF,
@@ -570,10 +574,14 @@ QIR_ALU1(SEL_X_0_ZS)
QIR_ALU1(SEL_X_0_ZC)
QIR_ALU1(SEL_X_0_NS)
QIR_ALU1(SEL_X_0_NC)
QIR_ALU1(SEL_X_0_CS)
QIR_ALU1(SEL_X_0_CC)
QIR_ALU2(SEL_X_Y_ZS)
QIR_ALU2(SEL_X_Y_ZC)
QIR_ALU2(SEL_X_Y_NS)
QIR_ALU2(SEL_X_Y_NC)
QIR_ALU2(SEL_X_Y_CS)
QIR_ALU2(SEL_X_Y_CC)
QIR_ALU2(FMIN)
QIR_ALU2(FMAX)
QIR_ALU2(FMINABS)

View File

@@ -271,6 +271,8 @@ vc4_generate_code(struct vc4_context *vc4, struct vc4_compile *c)
case QOP_SEL_X_0_ZC:
case QOP_SEL_X_0_NS:
case QOP_SEL_X_0_NC:
case QOP_SEL_X_0_CS:
case QOP_SEL_X_0_CC:
queue(c, qpu_a_MOV(dst, src[0]));
set_last_cond_add(c, qinst->op - QOP_SEL_X_0_ZS +
QPU_COND_ZS);
@@ -284,6 +286,8 @@ vc4_generate_code(struct vc4_context *vc4, struct vc4_compile *c)
case QOP_SEL_X_Y_ZC:
case QOP_SEL_X_Y_NS:
case QOP_SEL_X_Y_NC:
case QOP_SEL_X_Y_CS:
case QOP_SEL_X_Y_CC:
queue(c, qpu_a_MOV(dst, src[0]));
set_last_cond_add(c, qinst->op - QOP_SEL_X_Y_ZS +
QPU_COND_ZS);

View File

@@ -35,11 +35,12 @@
static bool miptree_debug = false;
static void
static bool
vc4_resource_bo_alloc(struct vc4_resource *rsc)
{
struct pipe_resource *prsc = &rsc->base.b;
struct pipe_screen *pscreen = prsc->screen;
struct vc4_bo *bo;
if (miptree_debug) {
fprintf(stderr, "alloc %p: size %d + offset %d -> %d\n",
@@ -51,12 +52,18 @@ vc4_resource_bo_alloc(struct vc4_resource *rsc)
rsc->cube_map_stride * (prsc->array_size - 1));
}
vc4_bo_unreference(&rsc->bo);
rsc->bo = vc4_bo_alloc(vc4_screen(pscreen),
rsc->slices[0].offset +
rsc->slices[0].size +
rsc->cube_map_stride * (prsc->array_size - 1),
"resource");
bo = vc4_bo_alloc(vc4_screen(pscreen),
rsc->slices[0].offset +
rsc->slices[0].size +
rsc->cube_map_stride * (prsc->array_size - 1),
"resource");
if (bo) {
vc4_bo_unreference(&rsc->bo);
rsc->bo = bo;
return true;
} else {
return false;
}
}
static void
@@ -101,21 +108,27 @@ vc4_resource_transfer_map(struct pipe_context *pctx,
char *buf;
if (usage & PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE) {
vc4_resource_bo_alloc(rsc);
if (vc4_resource_bo_alloc(rsc)) {
/* If it might be bound as one of our vertex buffers, make
* sure we re-emit vertex buffer state.
*/
if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
vc4->dirty |= VC4_DIRTY_VTXBUF;
/* If it might be bound as one of our vertex buffers,
* make sure we re-emit vertex buffer state.
*/
if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
vc4->dirty |= VC4_DIRTY_VTXBUF;
} else {
/* If we failed to reallocate, flush everything so
* that we don't violate any syncing requirements.
*/
vc4_flush(pctx);
}
} else if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
if (vc4_cl_references_bo(pctx, rsc->bo)) {
if ((usage & PIPE_TRANSFER_DISCARD_RANGE) &&
prsc->last_level == 0 &&
prsc->width0 == box->width &&
prsc->height0 == box->height &&
prsc->depth0 == box->depth) {
vc4_resource_bo_alloc(rsc);
prsc->depth0 == box->depth &&
vc4_resource_bo_alloc(rsc)) {
if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
vc4->dirty |= VC4_DIRTY_VTXBUF;
} else {
@@ -389,8 +402,7 @@ vc4_resource_create(struct pipe_screen *pscreen,
rsc->vc4_format = get_resource_texture_format(prsc);
vc4_setup_slices(rsc);
vc4_resource_bo_alloc(rsc);
if (!rsc->bo)
if (!vc4_resource_bo_alloc(rsc))
goto fail;
return prsc;

View File

@@ -581,6 +581,10 @@ vc4_create_sampler_view(struct pipe_context *pctx, struct pipe_resource *prsc,
tmpl.last_level = cso->u.tex.last_level - cso->u.tex.first_level;
prsc = vc4_resource_create(pctx->screen, &tmpl);
if (!prsc) {
free(so);
return NULL;
}
rsc = vc4_resource(prsc);
clone = vc4_resource(prsc);
clone->shadow_parent = &shadow_parent->base.b;

View File

@@ -500,8 +500,10 @@ handleVASliceDataBufferType(vlVaContext *context, vlVaBuffer *buf)
bufHasStartcode(buf, 0x0000010b, 32))
break;
if (context->decoder->profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED) {
buffers[num_buffers] = (void *const)&start_code_vc1;
sizes[num_buffers++] = sizeof(start_code_vc1);
}
break;
case PIPE_VIDEO_FORMAT_MPEG4:
if (bufHasStartcode(buf, 0x000001, 24))

View File

@@ -76,6 +76,9 @@ struct radeon_bomgr {
bool va;
uint64_t va_offset;
struct list_head va_holes;
/* BO size alignment */
unsigned size_align;
};
static inline struct radeon_bomgr *radeon_bomgr(struct pb_manager *mgr)
@@ -164,8 +167,10 @@ static uint64_t radeon_bomgr_find_va(struct radeon_bomgr *mgr, uint64_t size, ui
struct radeon_bo_va_hole *hole, *n;
uint64_t offset = 0, waste = 0;
alignment = MAX2(alignment, 4096);
size = align(size, 4096);
/* All VM address space holes will implicitly start aligned to the
* size alignment, so we don't need to sanitize the alignment here
*/
size = align(size, mgr->size_align);
pipe_mutex_lock(mgr->bo_va_mutex);
/* first look for a hole */
@@ -222,7 +227,7 @@ static void radeon_bomgr_free_va(struct radeon_bomgr *mgr, uint64_t va, uint64_t
{
struct radeon_bo_va_hole *hole;
size = align(size, 4096);
size = align(size, mgr->size_align);
pipe_mutex_lock(mgr->bo_va_mutex);
if ((va + size) == mgr->va_offset) {
@@ -333,9 +338,9 @@ static void radeon_bo_destroy(struct pb_buffer *_buf)
pipe_mutex_destroy(bo->map_mutex);
if (bo->initial_domain & RADEON_DOMAIN_VRAM)
bo->rws->allocated_vram -= align(bo->base.size, 4096);
bo->rws->allocated_vram -= align(bo->base.size, mgr->size_align);
else if (bo->initial_domain & RADEON_DOMAIN_GTT)
bo->rws->allocated_gtt -= align(bo->base.size, 4096);
bo->rws->allocated_gtt -= align(bo->base.size, mgr->size_align);
FREE(bo);
}
@@ -620,9 +625,9 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr,
}
if (rdesc->initial_domains & RADEON_DOMAIN_VRAM)
rws->allocated_vram += align(size, 4096);
rws->allocated_vram += align(size, mgr->size_align);
else if (rdesc->initial_domains & RADEON_DOMAIN_GTT)
rws->allocated_gtt += align(size, 4096);
rws->allocated_gtt += align(size, mgr->size_align);
return &bo->base;
}
@@ -696,6 +701,9 @@ struct pb_manager *radeon_bomgr_create(struct radeon_drm_winsys *rws)
mgr->va_offset = rws->va_start;
list_inithead(&mgr->va_holes);
/* TTM aligns the BO size to the CPU page size */
mgr->size_align = sysconf(_SC_PAGESIZE);
return &mgr->base;
}
@@ -858,7 +866,7 @@ radeon_winsys_bo_create(struct radeon_winsys *rws,
* BOs. Aligning this here helps the cached bufmgr. Especially small BOs,
* like constant/uniform buffers, can benefit from better and more reuse.
*/
size = align(size, 4096);
size = align(size, mgr->size_align);
/* Only set one usage bit each for domains and flags, or the cache manager
* might consider different sets of domains / flags compatible
@@ -969,7 +977,7 @@ static struct pb_buffer *radeon_winsys_bo_from_ptr(struct radeon_winsys *rws,
pipe_mutex_unlock(mgr->bo_handles_mutex);
}
ws->allocated_gtt += align(bo->base.size, 4096);
ws->allocated_gtt += align(bo->base.size, mgr->size_align);
return (struct pb_buffer*)bo;
}
@@ -1106,9 +1114,9 @@ done:
bo->initial_domain = radeon_bo_get_initial_domain((void*)bo);
if (bo->initial_domain & RADEON_DOMAIN_VRAM)
ws->allocated_vram += align(bo->base.size, 4096);
ws->allocated_vram += align(bo->base.size, mgr->size_align);
else if (bo->initial_domain & RADEON_DOMAIN_GTT)
ws->allocated_gtt += align(bo->base.size, 4096);
ws->allocated_gtt += align(bo->base.size, mgr->size_align);
return (struct pb_buffer*)bo;

View File

@@ -482,18 +482,20 @@ bit_logic_result_type(const struct glsl_type *type_a,
}
static const struct glsl_type *
modulus_result_type(const struct glsl_type *type_a,
const struct glsl_type *type_b,
modulus_result_type(ir_rvalue * &value_a, ir_rvalue * &value_b,
struct _mesa_glsl_parse_state *state, YYLTYPE *loc)
{
const glsl_type *type_a = value_a->type;
const glsl_type *type_b = value_b->type;
if (!state->check_version(130, 300, loc, "operator '%%' is reserved")) {
return glsl_type::error_type;
}
/* From GLSL 1.50 spec, page 56:
/* Section 5.9 (Expressions) of the GLSL 4.00 specification says:
*
* "The operator modulus (%) operates on signed or unsigned integers or
* integer vectors. The operand types must both be signed or both be
* unsigned."
* integer vectors."
*/
if (!type_a->is_integer()) {
_mesa_glsl_error(loc, state, "LHS of operator %% must be an integer");
@@ -503,11 +505,28 @@ modulus_result_type(const struct glsl_type *type_a,
_mesa_glsl_error(loc, state, "RHS of operator %% must be an integer");
return glsl_type::error_type;
}
if (type_a->base_type != type_b->base_type) {
/* "If the fundamental types in the operands do not match, then the
* conversions from section 4.1.10 "Implicit Conversions" are applied
* to create matching types."
*
* Note that GLSL 4.00 (and GL_ARB_gpu_shader5) introduced implicit
* int -> uint conversion rules. Prior to that, there were no implicit
* conversions. So it's harmless to apply them universally - no implicit
* conversions will exist. If the types don't match, we'll receive false,
* and raise an error, satisfying the GLSL 1.50 spec, page 56:
*
* "The operand types must both be signed or unsigned."
*/
if (!apply_implicit_conversion(type_a, value_b, state) &&
!apply_implicit_conversion(type_b, value_a, state)) {
_mesa_glsl_error(loc, state,
"operands of %% must have the same base type");
"could not implicitly convert operands to "
"modulus (%%) operator");
return glsl_type::error_type;
}
type_a = value_a->type;
type_b = value_b->type;
/* "The operands cannot be vectors of differing size. If one operand is
* a scalar and the other vector, then the scalar is applied component-
@@ -1267,7 +1286,7 @@ ast_expression::do_hir(exec_list *instructions,
op[0] = this->subexpressions[0]->hir(instructions, state);
op[1] = this->subexpressions[1]->hir(instructions, state);
type = modulus_result_type(op[0]->type, op[1]->type, state, & loc);
type = modulus_result_type(op[0], op[1], state, &loc);
assert(operations[this->oper] == ir_binop_mod);
@@ -1514,7 +1533,7 @@ ast_expression::do_hir(exec_list *instructions,
op[0] = this->subexpressions[0]->hir(instructions, state);
op[1] = this->subexpressions[1]->hir(instructions, state);
type = modulus_result_type(op[0]->type, op[1]->type, state, & loc);
type = modulus_result_type(op[0], op[1], state, &loc);
assert(operations[this->oper] == ir_binop_mod);

View File

@@ -455,7 +455,8 @@ lower_copies_to_load_store(struct deref_node *node,
struct deref_node *arg_node =
get_deref_node(copy->variables[i], state);
if (arg_node == NULL)
/* Only bother removing copy entries for other nodes */
if (arg_node == NULL || arg_node == node)
continue;
struct set_entry *arg_entry = _mesa_set_search(arg_node->copies, copy);
@@ -466,6 +467,8 @@ lower_copies_to_load_store(struct deref_node *node,
nir_instr_remove(&copy->instr);
}
node->copies = NULL;
return true;
}

View File

@@ -102,13 +102,13 @@ fallback_required(struct gl_context *ctx, GLenum target,
*/
if (!mipmap->FBO)
_mesa_GenFramebuffers(1, &mipmap->FBO);
_mesa_BindFramebuffer(GL_FRAMEBUFFER_EXT, mipmap->FBO);
_mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, mipmap->FBO);
_mesa_meta_bind_fbo_image(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, baseImage, 0);
_mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, baseImage, 0);
status = _mesa_CheckFramebufferStatus(GL_FRAMEBUFFER_EXT);
status = _mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER);
_mesa_BindFramebuffer(GL_FRAMEBUFFER_EXT, fboSave);
_mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, fboSave);
if (status != GL_FRAMEBUFFER_COMPLETE_EXT) {
_mesa_perf_debug(ctx, MESA_DEBUG_SEVERITY_HIGH,
@@ -128,6 +128,8 @@ _mesa_meta_glsl_generate_mipmap_cleanup(struct gen_mipmap_state *mipmap)
mipmap->VAO = 0;
_mesa_DeleteBuffers(1, &mipmap->VBO);
mipmap->VBO = 0;
_mesa_DeleteSamplers(1, &mipmap->Sampler);
mipmap->Sampler = 0;
_mesa_meta_blit_shader_table_cleanup(&mipmap->shaders);
}

View File

@@ -336,6 +336,15 @@ static const struct brw_device_info brw_device_info_skl_gt3 = {
static const struct brw_device_info brw_device_info_skl_gt4 = {
GEN9_FEATURES, .gt = 4,
/* From the "L3 Allocation and Programming" documentation:
*
* "URB is limited to 1008KB due to programming restrictions. This is not a
* restriction of the L3 implementation, but of the FF and other clients.
* Therefore, in a GT4 implementation it is possible for the programmed
* allocation of the L3 data array to provide 3*384KB=1152KB for URB, but
* only 1008KB of this will be used."
*/
.urb.size = 1008 / 3,
};
static const struct brw_device_info brw_device_info_bxt = {

View File

@@ -63,7 +63,9 @@ static const struct tx_table tx_table_be[] =
[ MESA_FORMAT_A8B8G8R8_UNORM ] = { R200_TXFORMAT_ABGR8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_R8G8B8A8_UNORM ] = { R200_TXFORMAT_RGBA8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_B8G8R8A8_UNORM ] = { R200_TXFORMAT_ARGB8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_B8G8R8X8_UNORM ] = { R200_TXFORMAT_ARGB8888, 0 },
[ MESA_FORMAT_A8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_X8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB8888, 0 },
[ MESA_FORMAT_BGR_UNORM8 ] = { 0xffffffff, 0 },
[ MESA_FORMAT_B5G6R5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
[ MESA_FORMAT_R5G6B5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
@@ -91,7 +93,9 @@ static const struct tx_table tx_table_le[] =
[ MESA_FORMAT_A8B8G8R8_UNORM ] = { R200_TXFORMAT_RGBA8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_R8G8B8A8_UNORM ] = { R200_TXFORMAT_ABGR8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_B8G8R8A8_UNORM ] = { R200_TXFORMAT_ARGB8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_B8G8R8X8_UNORM ] = { R200_TXFORMAT_ARGB8888, 0 },
[ MESA_FORMAT_A8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB8888 | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_X8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB8888, 0 },
[ MESA_FORMAT_BGR_UNORM8 ] = { R200_TXFORMAT_ARGB8888, 0 },
[ MESA_FORMAT_B5G6R5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
[ MESA_FORMAT_R5G6B5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },

View File

@@ -63,6 +63,8 @@ static const struct tx_table tx_table[] =
[ MESA_FORMAT_R8G8B8A8_UNORM ] = { RADEON_TXFORMAT_RGBA8888 | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_B8G8R8A8_UNORM ] = { RADEON_TXFORMAT_ARGB8888 | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_A8R8G8B8_UNORM ] = { RADEON_TXFORMAT_ARGB8888 | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
[ MESA_FORMAT_B8G8R8X8_UNORM ] = { RADEON_TXFORMAT_ARGB8888, 0 },
[ MESA_FORMAT_X8R8G8B8_UNORM ] = { RADEON_TXFORMAT_ARGB8888, 0 },
[ MESA_FORMAT_BGR_UNORM8 ] = { RADEON_TXFORMAT_ARGB8888, 0 },
[ MESA_FORMAT_B5G6R5_UNORM ] = { RADEON_TXFORMAT_RGB565, 0 },
[ MESA_FORMAT_R5G6B5_UNORM ] = { RADEON_TXFORMAT_RGB565, 0 },

View File

@@ -57,6 +57,8 @@ static bool
prepare_target(struct gl_context *ctx, GLuint name, GLenum *target, int level,
struct gl_texture_object **tex_obj,
struct gl_texture_image **tex_image, GLuint *tmp_tex,
GLuint *width,
GLuint *height,
const char *dbg_prefix)
{
if (name == 0) {
@@ -130,6 +132,8 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum *target, int level,
_mesa_BindTexture(*target, *tmp_tex);
*tex_obj = _mesa_lookup_texture(ctx, *tmp_tex);
*tex_image = _mesa_get_tex_image(ctx, *tex_obj, *target, 0);
*width = rb->Width;
*height = rb->Height;
if (!ctx->Driver.BindRenderbufferTexImage(ctx, rb, *tex_image)) {
_mesa_problem(ctx, "Failed to create texture from renderbuffer");
@@ -175,6 +179,9 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum *target, int level,
"glCopyImageSubData(%sLevel = %u)", dbg_prefix, level);
return false;
}
*width = (*tex_image)->Width;
*height = (*tex_image)->Height;
}
return true;
@@ -409,6 +416,7 @@ _mesa_CopyImageSubData(GLuint srcName, GLenum srcTarget, GLint srcLevel,
GLuint tmpTexNames[2] = { 0, 0 };
struct gl_texture_object *srcTexObj, *dstTexObj;
struct gl_texture_image *srcTexImage, *dstTexImage;
GLuint src_w, src_h, dst_w, dst_h;
GLuint src_bw, src_bh, dst_bw, dst_bh;
int i;
@@ -429,16 +437,42 @@ _mesa_CopyImageSubData(GLuint srcName, GLenum srcTarget, GLint srcLevel,
}
if (!prepare_target(ctx, srcName, &srcTarget, srcLevel,
&srcTexObj, &srcTexImage, &tmpTexNames[0], "src"))
&srcTexObj, &srcTexImage, &tmpTexNames[0],
&src_w, &src_h, "src"))
goto cleanup;
if (!prepare_target(ctx, dstName, &dstTarget, dstLevel,
&dstTexObj, &dstTexImage, &tmpTexNames[1], "dst"))
&dstTexObj, &dstTexImage, &tmpTexNames[1],
&dst_w, &dst_h, "dst"))
goto cleanup;
_mesa_get_format_block_size(srcTexImage->TexFormat, &src_bw, &src_bh);
/* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core Profile
* spec says:
*
* An INVALID_VALUE error is generated if the dimensions of either
* subregion exceeds the boundaries of the corresponding image object,
* or if the image format is compressed and the dimensions of the
* subregion fail to meet the alignment constraints of the format.
*
* and Section 8.7 (Compressed Texture Images) says:
*
* An INVALID_OPERATION error is generated if any of the following
* conditions occurs:
*
* * width is not a multiple of four, and width + xoffset is not
* equal to the value of TEXTURE_WIDTH.
* * height is not a multiple of four, and height + yoffset is not
* equal to the value of TEXTURE_HEIGHT.
*
* so we take that to mean that you can copy the "last" block of a
* compressed texture image even if it's smaller than the minimum block
* dimensions.
*/
if ((srcX % src_bw != 0) || (srcY % src_bh != 0) ||
(srcWidth % src_bw != 0) || (srcHeight % src_bh != 0)) {
(srcWidth % src_bw != 0 && (srcX + srcWidth) != src_w) ||
(srcHeight % src_bh != 0 && (srcY + srcHeight) != src_h)) {
_mesa_error(ctx, GL_INVALID_VALUE,
"glCopyImageSubData(unaligned src rectangle)");
goto cleanup;