Compare commits

...

147 Commits

Author SHA1 Message Date
Emil Velikov
04fd3a6f62 docs: add release notes for 11.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-21 11:43:55 +00:00
Emil Velikov
5018418573 Update version to 11.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-21 11:42:52 +00:00
Emil Velikov
040785c08b automake: use static llvm for make distcheck
With llvm 3.7 semi-dropping the autoconf build, we rely on their cmake
build. With the latter of which annoyingly using another (busted?)
SONAME.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit c45b4257c2)
2015-11-21 11:42:52 +00:00
Oded Gabbay
0c56517d16 llvmpipe: use simple coeffs calc for 128bit vectors
There are currently two methods in llvmpipe code to calculate coeffs to
be used as inputs for the fragment shader. The two methods use slightly
different ways to do the floating point calculations and thus produce
slightly different results.

The decision which method to use is determined by the size of the vector
that is used by the platform.

For vectors with size of more than 128bit, a single-step method is used,
in which coeffs_init_simple() + attribs_update_simple() are called.

For vectors with size of 128bit or less, a two-step method is used, in
which coeffs_init() + attribs_update() are called.

This causes some piglit tests (clip-distance-bulk-copy,
interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with
128bit vectors (such as ppc64le or x86-64 without AVX).

This patch makes platforms with 128bit vectors use the single-step
method (aka "simple" method) instead of the two-step method.
This would make the resulting coeffs identical between more platforms,
make sure the piglit tests passes, and make debugging and maintainability
a bit easier as the generated LLVM IR will be the same for more platforms.

The performance impact is negligible for x86-64 without AVX, and
basically non-existent for ppc64le, as it can be seen from the following
benchmarking results:

- glxspheres, on ppc64le:

   - original code:  4.892745317 frames/sec 5.460303857 Mpixels/sec
   - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec
   - Additional 0.8% performance boost

- glxspheres, on x86-64 without AVX:

   - original code:  20.16418809 frames/sec 22.50323395 Mpixels/sec
   - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec
   - Additional 0.74% performance boost

- glmark2, on ppc64le:

  - original code:  score of 58
  - with my change: score of 57

- glmark2, on x86-64 without AVX:

  - original code:  score of 175
  - with the patch: score of 167
  - Impact of of -4.5% on performance

- OpenArena, on ppc64le:

  - original code:  3398 frames 1719.0 seconds 2.0 fps
                    255.0/505.9/2773.0/0.0 ms

  - with the patch: 3398 frames 1690.4 seconds 2.0 fps
                    241.0/497.5/2563.0/0.2 ms

  - 29 seconds faster with the patch, which is about 2%

- OpenArena, on x86-64 without AVX:

  - original code:  3398 frames 239.6 seconds 14.2 fps
                    38.0/70.5/719.0/14.6 ms

  - with the patch: 3398 frames 244.4 seconds 13.9 fps
                    38.0/71.9/697.0/14.3 ms

  - 0.3 fps slower with the patch (about 2%)

Additional details can be found at:
http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 39b4dfe6ab)
2015-11-18 19:13:17 +00:00
Eric Anholt
d425a2f26c vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB.
It looks like nir_lower_idiv is going to use it soon, so add support.
With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with
GL 3.0 forced on).

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a4bf28178f)
[Emil Velikov: Resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/vc4/vc4_qpu_emit.c
2015-11-18 18:59:34 +00:00
Roland Scheidegger
c667a0d1d3 r200: fix bgrx8/xrgb8 blits
Since 779cabfc7d the same txformat table entries
are used for "normal" texturing as well as for blits. However, I forgot to put
in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
path can't hit them because the radeon tex format chooser will never chose
them, but we get that format from the dri buffers (at least I assume we got
it from there).
This is untested but essentially addressing the same bug as for radeon.
(I don't think that the second entry per le/be table is actually necessary,
but shouldn't hurt...)

Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2611ffe4b)
2015-11-18 18:59:20 +00:00
Roland Scheidegger
f112696f15 radeon: fix bgrx8/xrgb8 blits
Since d21320f625 the same txformat table entries
are used for "normal" texturing as well as for blits. However, I forgot to put
in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
path can't hit them because the radeon tex format chooser will never chose
them, but we get that format from the dri buffers (at least I assume we got
it from there). This caused lots of piglit regressions (and probably lots of
trouble outside piglit too).
This fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=92900.

Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 983614dbed)
2015-11-18 18:59:20 +00:00
Ian Romanick
acbaa3d0fc meta/generate_mipmap: Only modify the draw framebuffer binding in fallback_required
Previously GL_FRAMEBUFFER was used.  However, if GL_EXT_framebuffer_blit
is supported (note: it is supported by every Mesa driver), this is
*sometimes* an alias for GL_DRAW_FRAMEBUFFER (getters) and *sometimes*
an alias for *both* GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER
(setters).  As a result, the code saved one binding but modified both.
If the bindings were different, the GL_READ_FRAMEBUFFER would be
incorrect on exit.

Fixes the piglit fbo-generatemipmap-versus-READ_FRAMEBUFFER test.

Ideally this function would use DSA functions and not modify the binding
at all.  However, that would be a much more intrusive change because
_mesa_meta_bind_fbo_image would also need to be modified.
_mesa_meta_bind_fbo_image has a lot of callers.  Much of this code is
about to get a major rework due to bug #92363, so I don't think it
matters too much.  In fact, I discovered this bug while working on the
other bug.  Le bon temps!

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c40a88b6c5)
2015-11-18 18:59:19 +00:00
Alex Deucher
55325d0632 radeonsi: enable optimal raster config setting for fiji (v2)
Requires proper kernel tiling configuration so check the tiling
config registers.

v2: send the right version of the patch

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 00f554abba)
2015-11-18 18:59:19 +00:00
Ilia Mirkin
09a7ee2782 nouveau: don't expose HEVC decoding support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit f94e1d9738)
2015-11-18 18:59:19 +00:00
Kenneth Graunke
120559bd30 glsl: Allow implicit int -> uint conversions for the % operator.
GLSL 4.00 and GL_ARB_gpu_shader5 introduced a new int -> uint implicit
conversion rule and updated the rules for modulus to use them.  (In
earlier languages, none of the implicit conversion rules did anything
relevant, so there was no point in applying them.)

This allows expressions such as:

   int foo;
   uint bar;
   uint mod = foo % bar;

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 511de1a80c)
2015-11-18 18:59:19 +00:00
Ian Romanick
0b7bdb0668 meta/generate_mipmap: Don't leak the sampler object
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 758f12fd98)
2015-11-18 18:59:19 +00:00
Marek Olšák
f9325a97b3 radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney
otherwise the SX or CB blocks can go bananas

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 40912dd91e)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state.c
2015-11-18 18:59:13 +00:00
Jason Ekstrand
0dd0d6696f nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store
Previously, we walked through a given deref_node's copies and, after
lowering the copy away, removed it from both the source and destination
copy sets.  This commit changes this to only remove it from the other
node's copy set (not the one we're lowering).  At the end of the loop, we
just throw away the copy set for the node we're lowering since that node no
longer has any copies.  This has two advantages:

 1) It's more efficient because we're doing potentially half as many set
    search operations.

 2) It now properly handles copies from a node to itself.  Perviously, it
    would delete the copy from the set when processing the destinatioon and
    then assert-fail when we couldn't find it for the source.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92588
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit 226ba889a0)
2015-11-18 18:58:53 +00:00
Ben Widawsky
4b3d4ceaba i965/skl/gt4: Fix URB programming restriction.
The comment in the code details the restriction. Thanks to Ken for having a very
helpful conversation with me, and spotting the blurb in the link I sent him :P.

There are still stability problems for me on GT4, but this definitely helps with
some of the failures.

v2: Comment fixes

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 55314c5be4)
2015-11-18 18:58:53 +00:00
Dave Airlie
20f0d88495 r600: initialised PGM_RESOURCES_2 for ES/GS
This fixes the corruption on rendering that we are seeing in
certain geometry shaders.

Fixes:  https://bugs.freedesktop.org/show_bug.cgi?id=91780
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested / Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

(cherry picked from commit df8af7d751)
2015-11-18 18:58:53 +00:00
Ilia Mirkin
fa527fce5c mesa/copyimage: allow width/height to not be multiples of block
For compressed textures, the image size is not necessarily a multiple of
the block size (e.g. the last mip levels). Section 18.3.2 (Copying
Between Images) of the OpenGL 4.5 Core Profile spec says:

    An INVALID_VALUE error is generated if the dimensions of either
    subregion exceeds the boundaries of the corresponding image
    object, or if the image format is compressed and the dimensions of
    the subregion fail to meet the alignment constraints of the
    format.

and Section 8.7 (Compressed Texture Images) says:

    An INVALID_OPERATION error is generated if any of the following
    conditions occurs:

      * width is not a multiple of four, and width + xoffset is not
        equal to the value of TEXTURE_WIDTH.
      * height is not a multiple of four, and height + yoffset is not
        equal to the value of TEXTURE_HEIGHT.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 912babba7b)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/main/copyimage.c
2015-11-18 18:58:46 +00:00
Eric Anholt
9bbdd99d8c vc4: Return NULL when we can't make our shadow for a sampler view.
I'm not sure what the caller does is appropriate (just have a NULL sampler
at this slot), but it fixes the immediate crash.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5980389bbf)
2015-11-18 18:49:41 +00:00
Eric Anholt
e54ac25120 vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.
I was afraid our callers weren't prepared for this, but it looks like
at least for resource creation, mesa/st throws an error appropriately.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb8fb0064d)
2015-11-18 18:49:41 +00:00
Michel Dänzer
312ec1946d winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3
Fixes GPUVM conflicts with non-4K page size.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92738

v2: Replace sanitization of VM base address alignment with comment why
    that's not necessary.
v3: Use unsigned instead of long as the type for the size_align member.
    (Marek)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 24abbaff9a)
2015-11-18 18:49:41 +00:00
Boyuan Zhang
6a958b0b51 radeon/uvd: fix VC-1 simple/main profile decode v2
We just needed to set the extra width/height fields to get this working.

v2 (chk): rebased, CC stable added, commit message added, fixed coding style

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6bad554d98)
2015-11-18 18:49:41 +00:00
Boyuan Zhang
71a785fc5f st/vaapi: fix vaapi VC-1 simple/main corruption v2
Apply the start code fix only to advanced profile.

v2 (chk): add commit message

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ed55def44f)
2015-11-18 18:49:41 +00:00
Emil Velikov
f6e19f673e cherry-ignore: add the swrast front buffer support
Although a sort of a bugfix, it causes many piglit regressions and even
lockup with llvmpipe.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 18:49:40 +00:00
Emil Velikov
66c949d0a1 docs: add sha256 checksums for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-11 11:10:30 +00:00
Emil Velikov
ee57c22141 docs: add release notes for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-11 10:05:57 +00:00
Emil Velikov
a12fdff695 Update version to 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-11 09:56:00 +00:00
Marek Olšák
6a2a631bf9 radeonsi: add register definitions for Stoney
There are a few non-stoney changes too.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d57ede92b7)
Nominated-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-11 09:54:17 +00:00
Emil Velikov
18fed2011f Revert "mesa/glformats: Undo code changes from _mesa_base_tex_format() move"
This reverts commit 2294f6f311.

It introduces a regression in the following test
   piglit.spec.oes_compressed_paletted_texture.basic api

In general this commit is needed to prevent regressions in
GL_KHR_texture_compression_astc_ldr, which... isn't in 11.0

Reported-by: Mark Janes <mark.a.janes@intel.com>
2015-11-10 20:17:41 +00:00
Julien Isorce
774dd015bd st/va: add more errors checks in vlVaBufferSetNumElements and vlVaMapBuffer
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 5e763aaa21)
Nominated-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-07 15:17:49 +00:00
Julien Isorce
507b589685 st/va: do not destroy old buffer when new one failed
If formats are not the same vlVaPutImage re-creates the video
buffer with the right format. But if the creation of this new
video buffer fails then the surface looses its current buffer.
Let's just destroy the previous buffer on success.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit d42029d2d9)
Nominated-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-07 15:17:49 +00:00
Julien Isorce
bc47b385b4 nvc0: fix crash when nv50_miptree_from_handle fails
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3bbb8715ac)
Nominated-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-07 15:17:49 +00:00
Julien Isorce
dff2b9ed8a st/va: pass picture desc to begin and decode
At least vl_mpeg12_decoder uses the picture
desc in begin_frame and decode_bitstream.

https://bugs.freedesktop.org/show_bug.cgi?id=92634

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit a61be1a798)
Nominated-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-07 15:17:49 +00:00
Ilia Mirkin
a4fbfc8189 nouveau: relax fence emit space assert
We also have the "reserved for kick" space available. Some of my earlier
changes can probably be removed, but this is a quick fix for some of the
rarer fallout.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb73fc4cb8)
2015-11-07 15:17:49 +00:00
Eric Anholt
c323f97963 vc4: When the create ioctl fails, free our cache and try again.
This greatly increases the pressure you can put on the driver before
create fails.  Ultimately we need to let the kernel take control of
our cached BOs and just take them from us (and other clients)
directly, but this is a very easy patch for the moment.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6d3a24bce8)
2015-11-07 15:17:49 +00:00
Kenneth Graunke
7cfd87ce84 nir: Properly invalidate metadata in nir_opt_remove_phis().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 59bbe2681b)
2015-11-07 15:17:49 +00:00
Kenneth Graunke
5f565d7645 nir: Properly invalidate metadata in nir_lower_vec_to_movs().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bc3942e297)
2015-11-07 15:17:49 +00:00
Jason Ekstrand
ef4e862396 nir: Report progress from lower_vec_to_movs().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 9f5e7ae9d8)
[Emil Velikov] Correctly derive nir_shader from vec_to_movs_state
Signed-off-by: Emil Velikov <emil.velikov@collabora.co.uk>

Conflicts:
        src/glsl/nir/nir.h
        src/glsl/nir/nir_lower_vec_to_movs.c
2015-11-07 15:17:49 +00:00
Jason Ekstrand
2cc4e97396 nir/lower_vec_to_movs: Pass the shader around directly
Previously, we were passing the shader around, we were just calling it
"mem_ctx".  However, the nir_shader is (and must be for the purposes of
mark-and-sweep) the mem_ctx so we might as well pass it around explicitly.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
(cherry picked from commit b7eeced3c7)
2015-11-07 15:17:49 +00:00
Kenneth Graunke
ba0c78f4e2 nir: Properly invalidate metadata in nir_opt_copy_prop().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0f037bd71f)
2015-11-07 15:17:49 +00:00
Kenneth Graunke
a4b73eeff0 nir: Properly invalidate metadata in nir_split_var_copies().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8bb44510fc)
2015-11-07 15:17:49 +00:00
Kenneth Graunke
800217a165 nir: Report progress from nir_split_var_copies().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit dc18b9357b)
2015-11-07 15:17:48 +00:00
Ben Widawsky
aa739dff86 i965/skl: Add GT4 PCI IDs
Like other gen8+ hardware, the hardware automatically scales up thread counts.
We must be careful about the URB sizes since GT4 adds another slice.

One of the existing PCI IDs is actually mislabeled as GT3. Arguably this is a
real bug since the URB size will be wrong. Because this patch is simply meant to
add the missing IDs, that will be fixed in a later patch.

v2: No longer relevant.

v3: Update the wm thread count to support GT4. The WM thread count is used to
determine the maximum scratch space required. Currently the code always
allocates the maximum amount even though lower GT SKUs require less. The formula
is threads_per_psd * subslices_per_slice * slices

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
(cherry picked from commit 7cbd6608f5)
2015-11-07 15:17:48 +00:00
Ilia Mirkin
16bc98fb5e nouveau: set MaxDrawBuffers to the same value as MaxColorAttachments
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 985b51551a)
2015-11-07 15:17:48 +00:00
Emmanuel Gil Peyrot
addd501acd gbm.h: Add a missing stddef.h include for size_t.
This was causing compilation issues when one of its providers wasn’t
already included before gbm.h.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f3d4d10a1d)
2015-11-05 14:05:20 +00:00
Ivan Kalvachev
d9474cb70e r600g: Fix special negative immediate constants when using ABS modifier.
Some constants (like 1.0 and 0.5) could be inlined as immediate inputs
without using their literal value. The r600_bytecode_special_constants()
function emulates the negative of these constants by using NEG modifier.

However some shaders define -1.0 constant and want to use it as 1.0.
They do so by using ABS modifier. But r600_bytecode_special_constants()
set NEG in addition to ABS. Since NEG modifier have priority over ABS one,
we get -|1.0| as result, instead of |1.0|.

The patch simply prevents the additional switching of NEG when ABS is set.

[According to Ivan Kalvachev, this bug was fond via
https://github.com/iXit/Mesa-3D/issues/126 and
https://github.com/iXit/Mesa-3D/issues/127]

Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f75f21a24a)
2015-11-05 14:05:20 +00:00
Nicolai Hähnle
7aba6fa3eb st/mesa: fix mipmap generation for immutable textures with incomplete pyramids
Without the clamping by NumLevels, the state tracker would reallocate the
texture storage (incorrect) and even fail to copy the base level image
after reallocation, leading to the graphical glitch of
https://bugs.freedesktop.org/show_bug.cgi?id=91993 .

A piglit test has been submitted for review as well (subtest of
arb_texture_storage-texture-storage).

v2: also bypass all calls to st_finalize_texture (suggested by Marek Olšák)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 24c90888ae)
2015-11-05 14:05:19 +00:00
Kenneth Graunke
05fdf4b1c9 i965: Fix missing BRW_NEW_*_PROG_DATA flagging caused by cache reuse.
Consider the case of two nearly identical GLSL fragment shaders:

   out vec4 color;
   void main() { color = vec4(1); }

and

   layout(early_fragment_tests) in;
   out vec4 color;
   void main() { color = vec4(1); }

These shaders compile to the exact same assembly, but have distinct
values for brw_wm_prog_data::early_fragment_tests.

Since these are two independent GLSL shaders, they have different
program keys - notably, brw_wm_prog_key::program_string_id differs.

When uploading the second, brw_upload_cache will find an existing copy
of the assembly in the cache BO, which means matching_data will be
non-NULL.  Although we create a second cache item (with the new key
and prog_data), we set item->offset to the existing copy and avoid
re-uploading duplicate assembly.

However, brw_search_cache() would only flag BRW_NEW_*_PROG_DATA if
item->offset differed from the supplied offset.  With reuse, both
programs have the same offset, but prog_data changed.  We have to
flag it, but failed to.

To fix this, we simply need to check if the aux (prog_data) pointer
changed.  If either the assembly or the prog_data differs, flag it.

This fixes a regression since 1bba29ed40,
where Topi fixed brw_upload_cache() to actually reuse identical
assembly.  Prior to that, reuse basically never happened due to bugs.
Unfortunately, this code apparently wasn't prepared to handle reuse!

Fixes GPU hangs in Dolphin on Broadwell.

Huge thanks to Pierre Bourdon and Ilia Mirkin for debugging this
and helping track down the real issue.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92623
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Pierre Bourdon <delroth@gmail.com>
(cherry picked from commit bf05af3f0e)
2015-11-05 14:05:19 +00:00
Ian Romanick
d8c58ff25a i965: Fix is-renderable check in intel_image_target_renderbuffer_storage
Previously we could create a renderbuffer with format
MESA_FORMAT_R8G8B8A8_UNORM, convert that renderbuffer to an EGLImage,
then FAIL to convert the EGLImage back to a renderbuffer because
reasons.  Just use the same check in
intel_image_target_renderbuffer_storage that brw_render_target_supported
uses.

There are more checks in brw_render_target_supported, but I don't think
they are necessary here.  A different approach would be to refactor
brw_render_target_supported to take rb->Format and rb->NumSamples as
parameters (instead of a gl_renderbuffer) and use the new function here.

Fixes:

    ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92476
Cc: "10.3 10.4 10.5 10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7070c8879a)
2015-11-05 14:05:19 +00:00
Samuel Li
f86028cf07 radeonsi: add Stoney pci ids
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Samuel Li <samuel.li@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 98546bfd03)
2015-11-05 14:05:19 +00:00
Samuel Li
64e903f82e radeonsi: add support for Stoney asics (v3)
v2 (agd): rebase on mesa master, split pci ids to
separate commit
v3 (agd): use carrizo for llvm processor name for
llvm 3.7 and older

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Samuel Li <samuel.li@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bf0d0ce0d5)
2015-11-05 14:05:19 +00:00
Ilia Mirkin
55cd3ab8e7 nvc0: respect edgeflag attribute width
The edgeflag comes in as ubyte with glEdgeFlagPointer but as float with
plain immediate glEdgeFlag. Avoid reading bytes that weren't meant for
the edgeflag in the pointer case.

Fixes intermittent failures with gl-2.0-edgeflag piglit (and valgrind
complaints about reading uninitialized memory).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e05021ff72)
2015-11-05 14:05:19 +00:00
Roland Scheidegger
4a4e148ac7 gallivm: disable f16c when not using AVX
f16c intrinsic can only be emitted when AVX is used. So when we disable AVX
due to forcing 128bit vectors we must not use this intrinsic (depending on
llvm version, this worked previously because llvm used AVX even when we didn't
tell it to, however I've seen this fail with llvm 3.3 since
718249843b which seems to have the side effect
of disabling avx in llvm albeit it only touches sse flags really, but
with ea421e919a it's now really disabled).
Albeit being able to use AVX with 128bit vectors also would have its uses, the
code as is really was meant to emulate jit code creation for less capable cpus.
v2: add some (ifdefed out) missing de-featuring options for simulating
less capable cpus.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 711489648b)
Nominated-by: Roland Scheidegger <sroland@vmware.com>
2015-11-05 14:05:19 +00:00
Jose Fonseca
7f6f273a55 gallivm: Explicitly disable unsupported CPU features.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92214
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit ea421e919a)
2015-11-05 14:05:19 +00:00
Alex Deucher
5ce639c001 radeon/uvd: don't expose HEVC on old UVD hw (v3)
The section for UVD 2 and older was not updated
when HEVC support was added. Reported by Kano
on irc.

v2: integrate the UVD2 and older checks into the
main switch statement.
v3: handle encode checking as well.  Encode is
already checked in the top case statement, so
drop encode checks in the lower case statement.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7b63658125)
2015-11-05 14:05:19 +00:00
Jose Fonseca
38a8b467cb gallivm: Translate all util_cpu_caps bits to LLVM attributes.
This should prevent disparity between features Mesa and LLVM
believe are supported by the CPU.

http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990

Tested on a i7-3720QM w/ LLVM 3.3 and 3.6.

v2: Increase SmallVector initial size as suggested by Gustaw Smolarczyk.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 718249843b)
2015-11-05 14:05:19 +00:00
Nanley Chery
2294f6f311 mesa/glformats: Undo code changes from _mesa_base_tex_format() move
The refactoring commit, c6bf1cd, accidentally reverted cd49b97
and 99b1f47. These changes caused more code to be added to the
function and removed the existing support for ASTC. This patch
reverts those modifications.

v2. Actually include ASTC support again.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92221
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f1147a238a)
[Emil Velikov]
 - Drop the KHR_texture_compression_astc_ldr check
 - Add texcompress.h include.
Signed-off-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-05 14:04:38 +00:00
Nigel Stewart
a333791259 osmesa: Expose GL entry points for Windows build via DEF file.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92437
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 04703762e5)
2015-11-05 11:44:42 +00:00
Emil Velikov
bfd14ebb05 cherry-ignore: ignore a possible wrong nomination
The commit base varies greatly between master and 11.0. It seems that
the commit (in it's current form) is not applicable for the branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-05 11:29:44 +00:00
Emil Velikov
ec14e6f8fd docs: add sha256 checksums for 11.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-25 10:05:01 +00:00
Emil Velikov
31bf247031 docs: add release notes for 11.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-24 19:34:01 +01:00
Emil Velikov
b530dccbff Update version to 11.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-24 19:29:27 +01:00
Jonathan Gray
6d6a4d7c76 configure.ac: ensure RM is set
GNU make predefines RM to rm -f but this is not required by POSIX
so ensure that RM is set.  This fixes "make clean" on OpenBSD.

v2: use AC_CHECK_PROG

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 99c4079c37)
2015-10-21 14:23:22 +01:00
Tapani Pälli
13276962c7 mesa: fix ARRAY_SIZE query for GetProgramResourceiv
Patch also refactors name length queries which were using array size
in computation, this has to be done in same time to avoid regression in
arb_program_interface_query-resource-query Piglit test.

Fixes rest of the failures with
   ES31-CTS.program_interface_query.no-locations

v2: make additional check only for GS inputs
v3: create helper function for resource name length
    so that it gets calculated only in one place

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
(cherry picked from commit c0722be9f5)
2015-10-21 14:23:22 +01:00
Chris Wilson
03ab39fa70 i965: Remove early release of DRI2 miptree
intel_update_winsys_renderbuffer_miptree() will release the existing
miptree when wrapping a new DRI2 buffer, so we can remove the early
release and so prevent a NULL mt dereference should importing the new
DRI2 name fail for any reason. (Reusing the old DRI2 name will result
in the rendering going astray, to a stale buffer, and not shown on the
screen, but it allows us to issue a warning and not crash much later in
innocent code.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86281
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 70e91d61fd)
2015-10-21 14:23:22 +01:00
Alejandro Piñeiro
4d215a25d5 i965/vec4: fill src_reg type using the constructor type parameter
The src_reg constructor that received the glsl_type was using it
only to build the swizzle, but not to fill this->type as dst_reg
is doing.

This caused some type mismatch between movs and alu operations
on the NIR path, so copy propagation optimization was not applied
to remove unneeded movs if negate modifier was involved. This was
first detected on minus (negate+add) operations.

Shader DB results (taking into account only vec4):

total instructions in shared programs: 20019 -> 19934 (-0.42%)
instructions in affected programs:     2918 -> 2833 (-2.91%)
helped:                                79
HURT:                                  0
GAINED:                                0
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 4de86e1371)
Nominated-by: Christoph Brill <egore911@egore911.de>
2015-10-21 14:23:22 +01:00
Alejandro Piñeiro
6766a36e19 i965/vec4: check writemask when bailing out at register coalesce
opt_register_coalesce stopped to check previous instructions to
coalesce with if somebody else was writing on the same
destination. This can be optimized to check if somebody else was
writing to the same channels of the same destination using the
writemask.

Shader DB results (taking into account only vec4):

total instructions in shared programs: 1781593 -> 1734957 (-2.62%)
instructions in affected programs:     1238390 -> 1191754 (-3.77%)
helped:                                12782
HURT:                                  0
GAINED:                                0
LOST:                                  0

v2: removed some parenthesis, fixed indentation, as suggested by
    Matt Turner
v3: added brackets, for consistency, as suggested by Eduardo Lima

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit d4e29af234)
Nominated-by: Jason Ekstrand <jason@jlekstrand.net>
2015-10-21 14:23:22 +01:00
Brian Paul
42364b33d1 mesa: fix incorrect opcode in save_BlendFunci()
Fixes assertion failure with new piglit
arb_draw_buffers_blend-state_set_get test.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit e24d04e436)
2015-10-21 14:23:22 +01:00
Marek Olšák
54a30ed94f gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT
This avoids a serious r600g bug leading to a GPU hang.
The chances this bug will get fixed are pretty low now.

I deeply regret listening to others and not pushing this patch, leaving
other users with a GPU-crashing driver. Yes, it should be fixed
in the compiler and it's ugly, but users couldn't care less about that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720

Cc: 11.0 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 814f31457e)
2015-10-21 14:23:22 +01:00
Leo Liu
6f48b8957e st/omx/dec/h264: fix field picture type 0 poc disorder
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 867284a8f0)
2015-10-21 14:23:22 +01:00
Indrajit Das
b91ed628c1 st/va: Used correct parameter to derive the value of the "h" variable in vlVaCreateImage
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 381c17d695)
2015-10-21 14:23:22 +01:00
Marek Olšák
141109cc52 radeonsi: fix a GS copy shader leak
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit aa060e276c)
[Emil Velikov: si_shader_destroy() wants the ctx as first argument]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_shader.c
2015-10-21 14:23:21 +01:00
Marek Olšák
5d41a78769 st/mesa: fix clip state dependencies
This allows removing FLUSH_VERTICES in MatrixMode.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 3c6156a4a7)
2015-10-21 14:23:21 +01:00
Tapani Pälli
da1d57faf3 mesa: Set api prefix to version string when overriding version
Otherwise there are problems when user overrides version and application
such as Piglit wants to detect used api with glGetString(GL_VERSION).

This makes it currently impossible to run glslparsertest tests for
OpenGL ES when using version override.

Below is example when using MESA_GLES_VERSION_OVERRIDE=3.1.

Before:
	"3.1 Mesa 11.1.0-devel (git-24a1a15)"

After:
	"OpenGL ES 3.1 Mesa 11.1.0-devel (git-78042ff)"

v2: only include api prefix for OpenGL ES (Boyan Ding)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc8c221e28)
2015-10-21 14:23:21 +01:00
Rob Clark
0d87f75763 freedreno/a3xx: cache-flush is needed after MEM_WRITE
Otherwise the mem2gmem blit would see potentially bogus texture
coordinates.  Fixes an issue that shows up with glamor.

CC: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 6206da736c)
2015-10-21 14:23:21 +01:00
Chih-Wei Huang
009890a0de nv30: include the header of ffs prototype
It fixes a building error of the android 6.0 64-bit target.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7599f8b167)
2015-10-21 14:23:21 +01:00
Chih-Wei Huang
938df905ea nv50/ir: use C++11 standard std::unordered_map if possible
Note Android version before Lollipop is not supported.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d31005e3e5)
2015-10-21 14:23:21 +01:00
Chih-Wei Huang
9b561ed2d1 mesa: android: Fix the incorrect path of sse_minmax.c
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Fixes: 669cfc267a (android: mesa: fix the path of the SSE4_1
optimisations)
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit 67d8518a0e)
2015-10-21 14:23:21 +01:00
Krzysztof Sobiecki
b0b31397e2 st/fbo: use pipe_surface_release instead of pipe_surface_reference
pipe_surface_reference have problems with deleted contexts,
so use of pipe_surface_release might be more appropriate.

Fixes Wasteland 2 Director's Cut crash on start.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 14f7ce4248)
2015-10-21 14:23:21 +01:00
Brian Paul
c0b85c5a4c vbo: fix incorrect switch statement in init_mat_currval()
The variable 'i' is a value in [0, MAT_ATTRIB_MAX-1] so subtracting
VERT_ATTRIB_GENERIC0 gave a bogus value and we executed the default
switch clause for all loop iterations.

This doesn't fix any known issues but was clearly incorrect.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit dd293d8aae)
2015-10-21 14:23:21 +01:00
Ian Romanick
a9da1ead7b glsl: In later GLSL versions, sequence operator is cannot be a constant expression
Fixes:
    ES3-CTS.shaders.negative.constant_sequence

    spec/glsl-es-3.00/compiler/global-initializer/from-sequence.vert
    spec/glsl-es-3.00/compiler/global-initializer/from-sequence.frag

v2: Fix a couple copy-and-paste mistake in the spec quotations.
Suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92635a84a7)
2015-10-21 14:23:21 +01:00
Ian Romanick
dab0c565d3 glsl: Add method to determine whether an expression contains the sequence operator
This will be used in the next patch to enforce some language sematics.

v2: Fix inverted logic in
ast_function_expression::has_sequence_subexpression.  The method
originally had a different name and a different meaning.  I fixed the
logic in ast_to_hir.cpp, but I only changed the names in
ast_function.cpp.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05e4601c6b)
2015-10-21 14:23:21 +01:00
Ian Romanick
96931dbf14 glsl: Restrict initializers for global variables to constant expression in ES
v2: Combine this check with the existing const and uniform checks.  This
change depends on the previous patch (glsl: Only set
ir_variable::constant_value for const-decorated variables).

Fixes:

    ES2-CTS.shaders.negative.initialize
    ES3-CTS.shaders.negative.initialize

    spec/glsl-es-1.00/compiler/global-initializer/from-attribute.vert
    spec/glsl-es-1.00/compiler/global-initializer/from-uniform.vert
    spec/glsl-es-1.00/compiler/global-initializer/from-uniform.frag
    spec/glsl-es-1.00/compiler/global-initializer/from-global.vert
    spec/glsl-es-1.00/compiler/global-initializer/from-global.frag
    spec/glsl-es-1.00/compiler/global-initializer/from-varying.frag
    spec/glsl-es-3.00/compiler/global-initializer/from-uniform.vert
    spec/glsl-es-3.00/compiler/global-initializer/from-uniform.frag
    spec/glsl-es-3.00/compiler/global-initializer/from-in.vert
    spec/glsl-es-3.00/compiler/global-initializer/from-in.frag
    spec/glsl-es-3.00/compiler/global-initializer/from-global.vert
    spec/glsl-es-3.00/compiler/global-initializer/from-global.frag

Note: spec/glsl-es-3.00/compiler/global-initializer/from-sequence.*
still fail because the result of a sequence operator is still considered
to be a constant expression.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92304
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v1]
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb329f2ff6)
2015-10-21 14:23:21 +01:00
Ian Romanick
2d5b8efd7d glsl: Only set ir_variable::constant_value for const-decorated variables
Right now we're also setting for uniforms, and that doesn't seem to hurt
things.  The next patch will make general global variables in GLSL ES,
and those definitely should not have constant_value set!

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3524d6df33)
2015-10-21 14:23:21 +01:00
Ian Romanick
0c6b210749 glsl: Use constant_initializer instead of constant_value to determine whether to keep an unused uniform
This even matches the comment "uniform initializers are precious, and
could get used by another stage."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5bc68f0f2b)
2015-10-21 14:23:20 +01:00
Ian Romanick
8e9b698c24 glsl/linker: Use constant_initializer instead of constant_value to initialize uniforms
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 313372cae8)
2015-10-21 14:23:20 +01:00
Ian Romanick
3b238aa08f ff_fragment_shader: Use binding to set the sampler unit
This is the way layout(binding=xxx) works from GLSL.  The old method
just happened to work (and significantly predated support for
layout(binding=xxx)), but future changes will break this.

v2: Remove some stale comments.  Suggested by Matt and Chris Forbes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8acce5d53a)
2015-10-21 14:23:20 +01:00
Ian Romanick
cd6ff70856 glsl: Allow built-in functions as constant expressions in OpenGL ES 1.00
In d4a24745 (August 2012), Paul made functions calls not be constant
expressions in GLSL ES 1.00.  Since this feature was added in desktop
GLSL 1.20, we believed that it was added in GLSL ES 3.00.  That turns
out to be completely wrong.  Built-in functions have always been allowed
as constant expressions in GLSL ES, and the patch adds the (many) spec
quotations to prove it.

While we never previously encountered this, a later patch enforces a GLSL
ES 1.00 rule that global variable initializers must be constant
expressions.  Without this fix, several dEQP tests fail.

Fixes:

    tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.frag
    tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.vert
    tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.frag
    tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.vert

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.0 10.1 10.2 10.3 10.4 10.5 10.6 11.0" <mesa-stable@lists.freedesktop.org>

Yes, I know we don't maintain stable branches that far back, but that
*is* how far back this bug goes!

(cherry picked from commit 43b07eb60f)
2015-10-21 14:23:20 +01:00
Nicolai Hähnle
2ee32ffe7c u_vbuf: fix vb slot assignment for translated buffers
Vertex attributes of different categories (constant/per-instance/
per-vertex) go into different buffers for translation, and this is now
properly reflected in the vertex buffers passed to the driver.

Fixes e.g. piglit's point-vertex-id divisor test.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 45ed627d89)
2015-10-21 14:23:20 +01:00
Dave Airlie
25e1e90937 mesa/uniforms: fix get_uniform for doubles (v2)
The initial glGetUniformdv support didn't cover all the
casting cases that are apparantly legal, and cts seems to
test for them.

I've updated the piglit test to cover these cases now.

v2: fix indentation - it's all broken in this file (Ilia)
fix src/dst index tracking in light of fp64 support (Ilia)

cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit bcfaab3885)
2015-10-21 14:23:20 +01:00
Francisco Jerez
22aae69aa5 mesa: Get rid of texture-dependent image unit derived state.
The point is to avoid having to re-validate all image units when
_NEW_TEXTURE is flagged, which can be expensive if the driver exposes
a large number of image units.  This has been reported to fix a 36%
performance regression in the Synmark2 Multithread benchmark on the
i965 driver which exposes 192 image units.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91788
Reported-by: Wendy Wang <wendy.wang@intel.com>
Tested-by: Ye Tian <yex.tian@intel.com>
CC: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7e441bf025)
2015-10-21 14:23:20 +01:00
Francisco Jerez
4779eb04a4 i965: Use _mesa_is_image_unit_valid() instead of gl_image_unit::_Valid.
gl_image_unit::_Valid will be removed in a future commit.

Tested-by: Ye Tian <yex.tian@intel.com>
CC: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2d97a78b37)
2015-10-21 14:23:20 +01:00
Francisco Jerez
df361e2311 mesa: Skip redundant texture completeness checking during image validation.
The call to _mesa_test_texobj_completeness() is unnecessary if the
texture is already known to be complete.  If the texture object is
dirtied in the meantime _BaseComplete and _MipmapComplete will be
reset to false.  _mesa_is_image_unit_valid() will start to be called
more frequently in a future commit, so it seems desirable to avoid the
unnecessary work.

Tested-by: Ye Tian <yex.tian@intel.com>
CC: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 25d3338be3)
2015-10-21 14:23:20 +01:00
Francisco Jerez
7259f17eca mesa: Expose function to calculate whether a shader image unit is valid.
A future commit will remove all texture object-dependent derived state
from the image unit struct to make validation unnecessary on texture
state changes.  Instead of checking gl_image_unit::_Valid drivers will
be required to call this function when needed to find out whether an
image unit is in a valid state and whether access from the shader is
allowed.

Tested-by: Ye Tian <yex.tian@intel.com>
CC: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5152db415f)
2015-10-21 14:23:20 +01:00
Francisco Jerez
37b647b979 i965: Don't tell the hardware about our UAV access.
The hardware documentation relating to the UAV HW-assisted coherency
mechanism and UAV access enable bits is scarce and sometimes
contradictory, and there's quite some guesswork behind this commit, so
let me summarize the background first: HSW and later hardware have
infrastructure to support a stricter form of data coherency between
shader invocations from separate primitives.  The mechanism is
controlled by the "Accesses UAV" bits on 3DSTATE_VS, _HS, _DS, _GS and
_PS (or _PS_EXTRA on BDW+), and the "UAV Coherency Required" bit on
the 3DPRIMITIVE command.

Regardless of whether "UAV Coherency Required" is set, the hardware
fixed-function units will increment a per-stage semaphore for each
request received if "Accesses UAV" is set for the same or any lower
stage.  An implicit DC flush is emitted by the lowermost stage with
"Accesses UAV" set once it's done processing the request, this also
happens regardless of the value of "UAV Coherency Required".  The
completion of the DC flush will cause the same stage and all previous
ones to decrement the semaphore, marking the UAV accesses for the
primitive as coherent with L3.

The "UAV Coherency Required" 3DPRIMITIVE bit will cause a pipeline
stall before any threads are dispatched for the first FF stage with
"Accesses UAV" set until the semaphore is cleared for the same stage.
Effectively this guarantees that UAV memory accesses performed by
previous primitives from any stage will be strictly ordered (and
thanks to the implicit DC flush visible in memory) with UAV accesses
from the following primitives.

None of this is required by the usual image, atomic counter and SSBO
GL APIs which have very relaxed cross-primitive coherency and ordering
requirements, so we don't actually ever set the "UAV Coherency
Required" bit -- Ordering with respect to shader invocations from
previous stages on the same primitive where there is a data dependency
is of course already guaranteed as the spec requires, regardless of
this mechanism being enabled.  We do set the "Accesses UAV" bits
though since my commit ac7664e493 (which
this patch partially reverts), mainly because of comments like the
following from the BDW PRM:

> 3DSTATE_GS
>[...]
> 12 Accesses UAV
>    Format: Enable
>    This field must be set when GS has a UAV access.

There are similar comments in the documentation for the other
3DSTATE_*S commands.  The "must" part is misleading and unjustified
AFAIK.  Most of the "Accesses UAV" bits don't seem to have any side
effects other than the implicit DC flushes and the related
book-keeping in anticipation for a subsequent primitive with "UAV
Coherency Required" set, so in most cases they are unnecessary and may
incur a performance penalty.  There is an exception though.  On Gen8+
the PS_EXTRA UAV access bit influences the calculation of the PS
UAV-only and ThreadDispatchEnable signals which on previous
generations were set explicitly by the driver, so we cannot always
avoid enabling it on the PS stage.

The primary motivation for this change is that in fact the hardware
coherency mechanism is buggy and will cause a rather non-deterministic
hang on Gen8 when VS is the only stage with "Accesses UAV" set and the
processing of a request terminates immediately after the implicit DC
flush is sent for a previous primitive with no additional vertices
being emitted for the second primitive, what will cause the hardware
to skip sending a second DC flush and cause the VS to stall
indefinitely waiting for a response from the DC (BDWGFX HSD 1912017).
This hardware bug can be reproduced on current master with the
spec@arb_shader_image_load_store@host-mem-barrier@Indirect/RaW piglit
subtest (if you have the patience to run it a few dozen times).

The proposed workaround is to insert CS STALLs speculatively between
3DPRIMITIVE commands when "Accesses UAV" is enabled for the VS stage
only.  Because this would affect one of the hottest paths in the
driver and likely decrease performance even further due to the
unnecessary serialization, and because we don't actually need the
implicit DC flushes, it seems better to just disable them.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5346c11670)
2015-10-21 14:23:20 +01:00
Tapani Pälli
41cc0965bb mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span
Patch adds missing type (used with NV_read_depth) so that it gets
handled correctly. This fixes errors seen with following CTS test:

   ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d8d0e4a81e)
2015-10-21 14:23:20 +01:00
Ilia Mirkin
3f802ebaf8 nouveau: make sure there's always room to emit a fence
I started seeing a lot of situations on nv30 where fence emission
wouldn't fit into the previous buffer (causing assertions). This ensures
that whenever checking for space, we always leave a bit of extra room
for the fence emission commands. Adjusts the nv30 and nvc0 fence
emission logic to bypass the space checking as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 47d11990b2)

Squashed with commit

nouveau: avoid emitting new fences unnecessarily

Right now we emit on every kick, but this is only necessary if something
will ever be able to observe that the fence completed. If there are no
refs, leave the fence alone and emit it another day.

This also happens to work around an issue for the kick handler -- a kick
can be a result of e.g. nouveau_bo_wait or explicit kick, or it can be
due to lack of space in the pushbuf. We want the emit to happen in the
current batch, so we want there to always be enough space. However an
explicit kick could take the reserved space for the implicitly-triggered
kick's fence emission if it happened right after. With the new mechanism,
hopefully there's no way to cause two fences to be emitted into the same
reserved space.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8053c9208f)

Squashed with commit

nv50,nvc0: don't base decisions on available pushbuf space

We still have to push everything out, might as well kick earlier and
flip pushbufs when we know we'll need it. This resolves some issues with
the new policy of making sure that we always leave a bit of room at the
end for fences.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9fe458335f)

Squashed with commit

nouveau: avoid double-emitting fence

The act of ensuring that there is space can cause a flush to happen,
which will emit the current screen fence. If that is the fence we're
trying to wait on, then it will have been emitted as a result of doing
the PUSH_SPACE. Don't attempt to emit it a second time.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: 8053c9208f (nouveau: avoid emitting new fences unnecessarily)
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bf97f8d467)
2015-10-21 14:21:07 +01:00
Emil Velikov
b4bfea0094 docs: add sha256 checksums for 11.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-10 17:02:46 +01:00
Emil Velikov
914966befc docs: add release notes for 11.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-10 16:21:59 +01:00
Emil Velikov
3c86315ca3 Update version to 11.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-10 16:17:51 +01:00
Emil Velikov
d0c22560a1 Revert "nouveau: make sure there's always room to emit a fence"
This reverts commit 30570b2629.

As mentioned by Ilia Mirkin:

 Please remove this one from your list of cherry-picked patches. While
  it fixes real issues on nv30 (and probably the other generations too),
  it appears to introduce some new ones on nvc0. I've figured out what's
  causing it, but haven't figured out a proper fix. Not sure I'll be
  able to before you do a release.
2015-10-10 16:15:08 +01:00
Jason Ekstrand
1a866b3e49 mesa: Correctly handle GL_BGRA_EXT in ES3 format_and_type checks
The EXT_texture_format_BGRA8888 extension (which mesa supports
unconditionally) adds a new format and internal format called GL_BGRA_EXT.
Previously, this was not really handled at all in
_mesa_ex3_error_check_format_and_type.  When the checks were tightened in
commit f15a7f3c, we accidentally tightened things too far and GL_BGRA_EXT
would always cause an error to be thrown.

There were two primary issues here.  First, is that
_mesa_es3_effective_internal_format_for_format_and_type didn't handle the
GL_BGRA_EXT format.  Second is that it blindly uses _mesa_base_tex_format
which returns GL_RGBA for GL_BGRA_EXT.  This commit fixes both of these
issues as well as adds explicit checks that GL_BGRA_EXT is only ever used
with GL_BGRA_EXT and GL_UNSIGNED_BYTE.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92265
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6ad9ebb073)
2015-10-10 16:14:12 +01:00
Michel Dänzer
b1230e3e01 st/dri: Use packed RGB formats
Fixes Gallium based DRI drivers failing to load on big endian hosts
because they can't find any matching fbconfigs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 87c3c9acd2)
Nominated-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-07 16:42:01 +01:00
Varad Gautam
d09b37e7d5 egl: restore surface type before linking config to its display
commit c2c2e9a (egl: implement EGL_KHR_gl_colorspace (v2)) leaves
_EGLConfig->SurfaceType set incorrectly before calling _eglLinkConfig(),
and the bad value is passed around to platform_android. set it to zero
as earlier.

v2: Set SurfaceType to 0, rather than surface_type (Suggested by Emil)

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596
Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit f988eff379)
2015-10-07 15:21:10 +01:00
Ilia Mirkin
30570b2629 nouveau: make sure there's always room to emit a fence
I started seeing a lot of situations on nv30 where fence emission
wouldn't fit into the previous buffer (causing assertions). This ensures
that whenever checking for space, we always leave a bit of extra room
for the fence emission commands. Adjusts the nv30 and nvc0 fence
emission logic to bypass the space checking as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 47d11990b2)
2015-10-07 14:52:55 +01:00
Ilia Mirkin
f114967ca9 nv30: always go through translate module on big-endian
It seems like things are either coming in slighly wrong, or perhaps
uploaded incorrectly, but either way passing them through the translate
module seems to fix everything. Eventually we should figure out what's
going wrong and fix it "for real", but this should do for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 78ec9e28ec)
2015-10-07 14:52:29 +01:00
Ilia Mirkin
39a3871b1e nv30: pretend to have packed texture/surface formats
This puts us in line with what the DDX/DRI2 st are expecting. It also
happens to work... no idea why, but seems better to have it work than to
ask lots of questions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1fec05d114)
2015-10-07 14:52:04 +01:00
Marek Olšák
28373c75ba egl/dri2: don't require a context for ClientWaitSync (v2)
The spec doesn't require it. This fixes a crash on Android.

v2: don't set any flags if ctx == NULL
v3: add the spec note

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 18123a732b)
2015-10-07 14:51:39 +01:00
Marek Olšák
eabc656324 st/dri: don't use _ctx in client_wait_sync
Not needed and it can be NULL.

v2: fix dri2_get_fence_from_cl_event - thanks Albert

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
(cherry picked from commit b78336085b)
2015-10-07 14:51:14 +01:00
Matthew Waters
1f2d007e49 egl: rework handling EGL_CONTEXT_FLAGS
As of version 15 of the EGL_KHR_create_context spec, debug contexts
are allowed for ES contexts.  We should allow creation instead of
erroring.

While we're here provide a more comprehensive checking for the other two
flags - ROBUST_ACCESS_BIT_KHR and FORWARD_COMPATIBLE_BIT_KHR

v2 [Emil Velikov] Rebase. Minor tweak in commit message.

Cc: Boyan Ding <boyan.j.ding@gmail.com>
Cc: Chad Versace <chad.versace@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91044
Signed-off-by: Matthew Waters <ystreet00@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 11cabc45b7)
2015-10-07 14:50:48 +01:00
Tom Stellard
00425de657 radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2
This fixes a race condition in the glx-multithreaded-shader-compile
test.

v2:
  - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2e1e3d325)
2015-10-07 14:50:23 +01:00
Tom Stellard
16d9e62107 gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2
Drivers and state trackers that use LLVM for generating code, must
register the targets they use with LLVM's global TargetRegistry.
The TargetRegistry is not thread-safe, so all targets must be added
to the registry before it can be queried for target information.

When drivers and state trackers initialize their own targets, they need
a way to force gallivm to initialize its targets at the same time.
Otherwise, there can be a race condition in some multi-threaded
applications (e.g. glx-multihreaded-shader-compile in piglit),
when one thread creates a context for a driver that uses LLVM (e.g.
radeonsi) and another thread creates a gallivm context (glxContextCreate
does this).

The race happens when the driver thread initializes its LLVM targets and
then starts using the registry before the gallivm thread has a chance to
register its targets.

This patch allows users to force gallivm to register its targets by
calling the gallivm_init_llvm_targets() function.

v2:
  - Use call_once and remove mutexes and static initializations.
  - Replace gallivm_init_llvm_{begin,end}() with
    gallivm_init_llvm_targets().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 76cfd6f1da)
2015-10-07 14:49:58 +01:00
Tom Stellard
776bcb2042 gallium/radeon: Use call_once() when initailizing LLVM targets
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3219b48ae5)
2015-10-07 14:49:32 +01:00
Kyle Brenneman
ac75afff88 glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3)
Add a macro GL_LIB_NAME to hold the filename that configure comes up with
based on the --with-gl-lib-name and --enable-mangling options.

In driOpenDriver, use the GL_LIB_NAME macro instead of hard-coding
"libGL.so.1".

v2: Add an #ifndef/#define for GL_LIB_NAME so that non-autoconf builds will
    work.
v3: Fix the library filename in the Makefile.

Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d35391cfda)
2015-10-07 14:49:07 +01:00
Kyle Brenneman
de936892db mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix.
When USE_MGL_NAMESPACE is defined, _glapi_get_stub will check for the "m"
prefix before trying to skip it, so that "glFoo" and "mglFoo" are
equivalent.

This should let it work with all the places where something calls
_glapi_get_proc_offset with a hard-coded name that starts with the normal
"gl" prefix.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 798f260a2f)
2015-10-07 14:48:41 +01:00
Kyle Brenneman
b2a04cfcc2 glx: Fix build errors with --enable-mangling (v2)
Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up
the renames from glx_mangle.h.

Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is
defined.

v2: Add a comment clarifying why GLX_ALIAS needs two macros.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a27f2d991b)
2015-10-07 14:48:15 +01:00
Daniel Scharrer
dca86265a2 mesa: Add abs input modifier to base for POW in ffvertex_prog
The result of POW for a negative base is undefined. Even when the result
is multiplied by zero (which is the case here whenever the base is
negative), the Inf and NaNs can propagate past that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Signed-off-by: Daniel Scharrer <daniel@constexpr.org>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit b3f9c5cc0f)
2015-10-07 14:47:50 +01:00
Ian Romanick
d0684f3d58 meta: Handle array textures in scaled MSAA blits
The old code had some significant problems with respect to
sampler2DArray textures.  The biggest problem was that some of the code
would use vec3 for the texture coordinate type, and other parts of the
code would use vec2.  The resulting shader would not even compile.
Since there were not tests for this path, nobody noticed.

The input to the fragment shader is always treated as a vec3.  If the
source data is only vec2, the vertex puller will supply 0 for the .z
component.  The texture coordinate passed to the fragment shader is
always a vec2 that comes from the .xy part of the vertex shader input.
The layer, taken from the .z of the vertex shader input is passed
separately as a flat integer.  If the generated fragment shader does not
use the layer integer, the GLSL linker will eliminate all the dead code
in the vertex shader.

Fixes the new piglit tests "blit-scaled samples=2 with
gl_texture_2d_multisample_array", etc. on i965.

Note for stable maintainer: This patch may depend on 46037237, and that
patch should be safe for stable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9bd9cf1fa4)
2015-10-07 14:47:23 +01:00
Ville Syrjälä
7d78578b06 i915: Remember to call intel_prepare_render() before blitting
Bring over the following fix from i965:
 commit fb3d62fe3d
 Author: Kenneth Graunke <kenneth@whitecape.org>
 Date:   Tue Aug 6 14:36:09 2013 -0700

    i965: Remember to call intel_prepare_render() before blitting.

Fixes a crash in the following piglit tests:
 bin/fbo-sys-blit -auto
 bin/fbo-sys-sub-blit -auto

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1a3f0961b)
2015-10-07 14:46:56 +01:00
Ville Syrjälä
88ed45b033 i915: Fix texcoord vs. varying collision in fragment programs
i915 fragment programs utilize the texture coordinate registers
for both texture coordinates and varyings. Unfortunately the
code doesn't check if the same index might be in use for both.
It just naively uses the index to pick a texture unit, which
could lead to collisions.

Add an extra mapping step to allocate non conflicting texture
units for both uses.

The issue can be reproduced with a pair of simple shaders like
these:
 attribute vec4 in_mod;
 varying vec4 mod;
 void main() {
   mod = in_mod;
   gl_TexCoord[0] = gl_MultiTexCoord0;
   gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
 }

 varying vec4 mod;
 uniform sampler2D tex;
 void main() {
   gl_FragColor = texture2D(tex, vec2(gl_TexCoord[0])) * mod;
 }

Fixes many piglit tests on i915:

    glsl-link-varyings-2
    glsl-orangebook-ch06-bump
    interpolation-none-gl_frontcolor-smooth-fixed
    interpolation-none-gl_frontcolor-smooth-none
    interpolation-none-gl_frontcolor-smooth-vertex
    interpolation-none-gl_frontsecondarycolor-smooth-fixed
    interpolation-none-gl_frontsecondarycolor-smooth-vertex
    interpolation-none-gl_frontsecondarycolor-smooth-none
    interpolation-none-other-flat-fixed
    interpolation-none-other-flat-none
    interpolation-none-other-flat-vertex
    interpolation-none-other-smooth-fixed
    interpolation-none-other-smooth-none
    interpolation-none-other-smooth-vertex

v2 [idr]: Minor formatting tweaks.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c349031c27)
2015-10-07 14:46:29 +01:00
Ville Syrjälä
fbcd36ddb6 i830: Fix collision between I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0)
I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0) are trying to occupy
the same bit. Move the texture bits upwards a bit to make room for
I830_UPLOAD_RASTER_RULES.

Now the driver will actually upload the raster rules which is rather
important to get the provoking vertex right. Fixes the appearance
of glxgears teeth on gen2.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9504740f3e)
2015-10-07 14:46:00 +01:00
Brian Paul
531309a5f0 st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats
For 8-bit RGB(A) texture formats we set the PIPE_BIND_RENDER_TARGET flag
to try to get a hardware format which also supports rendering (for FBO
textures).  Do the same thing for floating point formats.

This allows the Redway3D Flat demo to run.

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit cb758b892a)
2015-10-07 14:45:34 +01:00
Ilia Mirkin
0ae914f65d nouveau: wait to unref the transfer's bo until it's no longer used
The bo will often come from a slab in which case it doesn't matter. But
for larger allocations this will be in its own bo, and we have to make
sure to wait until it's no longer used in order for it to be freed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
(cherry picked from commit 1d8cba9b51)
2015-10-07 14:45:08 +01:00
Ilia Mirkin
b2c8b0e546 nouveau: delay deleting buffer with unflushed fence
If there is an unflushed fence on the bo, then the resource may still be
used in commands built up in the local pushbuf. Flushing can cause all
sorts of unwanted effects, so just free the bo when the relevant fence
is hit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
(cherry picked from commit 3a6b9a7830)
2015-10-07 14:44:40 +01:00
Ilia Mirkin
d6ee06e9fe nouveau: be more careful about freeing temporary transfer buffers
Deleting a buffer does not flush the command stream. Make sure that we
wait for the copies to finish before deleting the temporary bo.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
(cherry picked from commit d4e650b07b)
2015-10-07 14:44:15 +01:00
Francisco Jerez
7b8b044ee4 i965/fs: Fix hang on IVB and VLV with image format mismatch.
IVB and VLV hang sporadically when an untyped surface read or write
message is used to access a surface of format other than RAW, as may
happen when there is a mismatch between the format qualifier of the
image uniform and the format of the actual image bound to the
pipeline.  According to the spec this condition gives undefined
results but may not lead to program termination (which is one of the
possible outcomes of the hang).  Fix it by checking at runtime whether
the surface is of the right type.

Fixes the "arb_shader_image_load_store.invalid/format mismatch" piglit
subtest.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91718
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit b61292296b)
2015-10-07 14:43:49 +01:00
Marek Olšák
ec7cda29b6 radeonsi: add scratch buffer to the buffer list when it's re-allocated
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9932142192)
2015-10-07 14:43:19 +01:00
Leo Liu
ab68081ffb radeon/vce: fix vui time_scale zero error
if app pass 0 as frame_rate_num, it should not be encoded to the VUI.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1e97b41893)
2015-10-07 14:42:54 +01:00
Roland Scheidegger
46dc4946a2 mesa: fix mipmap generation for immutable, compressed textures
If the immutable compressed texture didn't have the full mip pyramid,
this didn't work, because it tried to generate mip levels for non-existing
levels. _mesa_prepare_mipmap_level() would correctly handle this by returning
FALSE if the mip level didn't exist, however we actually created the
non-existing mip level right before that because we used _mesa_get_tex_image()
before calling _mesa_prepare_mipmap_level(). It would then proceed to crash
(we allocated the mip level, which is a bad idea on an immutable texture,
but didn't initialize the values, leading to assertion failures or segfaults).
Fix this by using _mesa_select_tex_image() instead and call it after
_mesa_prepare_mipmap_level(), as that function will allocate missing mip levels
for non-immutable textures already.
This fixes a (2 year old) crash with astromenace which was hack-fixed in ubuntu
packages instead: http://bugs.debian.org/718680 (I guess most apps do full mip
chains - I believe this app not doing it is actually unintentional, always one
level less than full mip chain...).

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 19604d30e1)
2015-10-07 14:42:27 +01:00
Marek Olšák
01e197c21a gallium/u_blitter: handle allocation failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 7bbce21e45)
2015-10-07 14:41:58 +01:00
Marek Olšák
0c5aacf446 radeonsi: handle dummy constant buffer allocation failure
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit ae418a7b56)
2015-10-07 14:41:30 +01:00
Marek Olšák
fb5dd33166 radeonsi: don't forget to update scratch relocations for LS, HS, ES shaders
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit b737d9c1dc)
2015-10-07 14:41:03 +01:00
Marek Olšák
b2d3012e35 radeonsi: skip drawing if updating the scratch buffer fails
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit d556346b35)
2015-10-07 14:40:36 +01:00
Marek Olšák
154573e427 radeonsi: skip drawing if PS fails to compile or upload
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 1f99b0be7e)
2015-10-07 14:40:08 +01:00
Marek Olšák
7e64e887f0 radeonsi: skip drawing if VS, TCS, TES, GS fail to compile or upload
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 237d7cccce)
2015-10-07 14:39:41 +01:00
Marek Olšák
10382380f0 radeonsi: handle fixed-func TCS shader create failure
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 9b6d9dd7d8)
2015-10-07 14:39:14 +01:00
Marek Olšák
815b595b5f radeonsi: handle shader precompile failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 5dbadb0257)
2015-10-07 14:38:46 +01:00
Marek Olšák
4e0ae01588 radeonsi: skip drawing if GS ring allocations fail
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 263f5a2cf9)
[Emil Velikov: Track gs_rings over gsvs_ring. NULL check/FREE gs_rings.]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_shaders.c
2015-10-07 14:36:53 +01:00
Marek Olšák
33ed153214 radeonsi: skip drawing if the tess factor ring allocation fails
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 22d3ccf5a8)
[Emil Velikov: Track tf_state over tf_ring. NULL check/FREE tf_state.]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_shaders.c
2015-10-07 14:36:30 +01:00
Marek Olšák
3cd7493f11 radeonsi: add malloc fail paths to si_create_shader_state
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 5c219ab552)
2015-10-07 14:10:31 +01:00
Marek Olšák
dacccf8e22 radeonsi: report alloc failure from si_shader_binary_read
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 394d67a58f)
2015-10-07 14:10:03 +01:00
Marek Olšák
288d9a06cc gallium/radeon: add a fail path for depth MSAA texture readback
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit dea834e639)
2015-10-07 14:09:37 +01:00
Marek Olšák
62ac723a34 gallium/radeon: handle buffer alloc failures in r600_draw_rectangle
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit f95e695059)
2015-10-07 14:09:11 +01:00
Marek Olšák
766a0b4661 gallium/radeon: handle buffer_map staging buffer failures better
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 282b378012)
2015-10-07 14:08:44 +01:00
Marek Olšák
f2e8b94f84 radeonsi: handle constant buffer alloc failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit cd27ff6a0f)
2015-10-07 14:08:18 +01:00
Marek Olšák
02a631bfbc radeonsi: handle index buffer alloc failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 29dff6f676)
2015-10-07 14:07:51 +01:00
Marek Olšák
94e9c52b62 st/mesa: fix front buffer regression after dropping st_validate_state in Blit
Broken by: d082c53249
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92072

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit f3a0819533)
2015-10-07 14:07:14 +01:00
Emil Velikov
4c0b484612 docs: add sha256 checksums for 11.0.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-29 00:19:36 +01:00
159 changed files with 3279 additions and 585 deletions

View File

@@ -32,6 +32,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-vdpau \
--enable-xa \
--enable-xvmc \
--disable-llvm-shared-libs \
--with-egl-platforms=x11,wayland,drm \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast

View File

@@ -1 +1 @@
11.0.2
11.0.6

5
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,5 @@
# The commit base differs greatly between 11.0 and master
2832ca95ecce064c7d841a3a374c2179f56161be glsl: fix stream qualifier for blocks with an instance name
# Somewhat of a mixed feature/bugfix patch, causing some 200 piglit regressions
2b676570960277d47477822ffeccc672613f9142 gallium/swrast: fix front buffer blitting. (v2)

View File

@@ -106,6 +106,8 @@ AC_SYS_LARGEFILE
LT_PREREQ([2.2])
LT_INIT([disable-static])
AC_CHECK_PROG(RM, rm, [rm -f])
AX_PROG_BISON([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"],
[AC_MSG_ERROR([bison not found - unable to compile glcpp-parse.y])]))

View File

@@ -31,7 +31,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD
45170773500d6ae2f9eb93fc85efee69f7c97084411ada4eddf92f78bca56d20 mesa-11.0.2.tar.gz
fce11fb27eb87adf1e620a76455d635c6136dfa49ae58c53b34ef8d0c7b7eae4 mesa-11.0.2.tar.xz
</pre>

185
docs/relnotes/11.0.3.html Normal file
View File

@@ -0,0 +1,185 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.3 Release Notes / October 10, 2015</h1>
<p>
Mesa 11.0.3 is a bug fix release which fixes bugs found since the 11.0.2 release.
</p>
<p>
Mesa 11.0.3 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c2210e3daecc10ed9fdcea500327652ed6effc2f47c4b9cee63fb08f560d7117 mesa-11.0.3.tar.gz
ab2992eece21adc23c398720ef8c6933cb69ea42e1b2611dc09d031e17e033d6 mesa-11.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=55552">Bug 55552</a> - Compile errors with --enable-mangling</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71789">Bug 71789</a> - [r300g] Visuals not found in (default) depth = 24</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91044">Bug 91044</a> - piglit spec/egl_khr_create_context/valid debug flag gles* fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91718">Bug 91718</a> - piglit.spec.arb_shader_image_load_store.invalid causes intermittent GPU HANG</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92072">Bug 92072</a> - Wine breakage since d082c5324 (st/mesa: don't call st_validate_state in BlitFramebuffer)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92265">Bug 92265</a> - Black windows in weston after update mesa to 11.0.2-1</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats</li>
</ul>
<p>Daniel Scharrer (1):</p>
<ul>
<li>mesa: Add abs input modifier to base for POW in ffvertex_prog</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.2</li>
<li>Revert "nouveau: make sure there's always room to emit a fence"</li>
<li>Update version to 11.0.3</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>i965/fs: Fix hang on IVB and VLV with image format mismatch.</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>meta: Handle array textures in scaled MSAA blits</li>
</ul>
<p>Ilia Mirkin (6):</p>
<ul>
<li>nouveau: be more careful about freeing temporary transfer buffers</li>
<li>nouveau: delay deleting buffer with unflushed fence</li>
<li>nouveau: wait to unref the transfer's bo until it's no longer used</li>
<li>nv30: pretend to have packed texture/surface formats</li>
<li>nv30: always go through translate module on big-endian</li>
<li>nouveau: make sure there's always room to emit a fence</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>mesa: Correctly handle GL_BGRA_EXT in ES3 format_and_type checks</li>
</ul>
<p>Kyle Brenneman (3):</p>
<ul>
<li>glx: Fix build errors with --enable-mangling (v2)</li>
<li>mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix.</li>
<li>glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3)</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: fix vui time_scale zero error</li>
</ul>
<p>Marek Olšák (21):</p>
<ul>
<li>st/mesa: fix front buffer regression after dropping st_validate_state in Blit</li>
<li>radeonsi: handle index buffer alloc failures</li>
<li>radeonsi: handle constant buffer alloc failures</li>
<li>gallium/radeon: handle buffer_map staging buffer failures better</li>
<li>gallium/radeon: handle buffer alloc failures in r600_draw_rectangle</li>
<li>gallium/radeon: add a fail path for depth MSAA texture readback</li>
<li>radeonsi: report alloc failure from si_shader_binary_read</li>
<li>radeonsi: add malloc fail paths to si_create_shader_state</li>
<li>radeonsi: skip drawing if the tess factor ring allocation fails</li>
<li>radeonsi: skip drawing if GS ring allocations fail</li>
<li>radeonsi: handle shader precompile failures</li>
<li>radeonsi: handle fixed-func TCS shader create failure</li>
<li>radeonsi: skip drawing if VS, TCS, TES, GS fail to compile or upload</li>
<li>radeonsi: skip drawing if PS fails to compile or upload</li>
<li>radeonsi: skip drawing if updating the scratch buffer fails</li>
<li>radeonsi: don't forget to update scratch relocations for LS, HS, ES shaders</li>
<li>radeonsi: handle dummy constant buffer allocation failure</li>
<li>gallium/u_blitter: handle allocation failures</li>
<li>radeonsi: add scratch buffer to the buffer list when it's re-allocated</li>
<li>st/dri: don't use _ctx in client_wait_sync</li>
<li>egl/dri2: don't require a context for ClientWaitSync (v2)</li>
</ul>
<p>Matthew Waters (1):</p>
<ul>
<li>egl: rework handling EGL_CONTEXT_FLAGS</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>st/dri: Use packed RGB formats</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>mesa: fix mipmap generation for immutable, compressed textures</li>
</ul>
<p>Tom Stellard (3):</p>
<ul>
<li>gallium/radeon: Use call_once() when initailizing LLVM targets</li>
<li>gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2</li>
<li>radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2</li>
</ul>
<p>Varad Gautam (1):</p>
<ul>
<li>egl: restore surface type before linking config to its display</li>
</ul>
<p>Ville Syrjälä (3):</p>
<ul>
<li>i830: Fix collision between I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0)</li>
<li>i915: Fix texcoord vs. varying collision in fragment programs</li>
<li>i915: Remember to call intel_prepare_render() before blitting</li>
</ul>
</div>
</body>
</html>

168
docs/relnotes/11.0.4.html Normal file
View File

@@ -0,0 +1,168 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.4 Release Notes / October 24, 2015</h1>
<p>
Mesa 11.0.4 is a bug fix release which fixes bugs found since the 11.0.3 release.
</p>
<p>
Mesa 11.0.4 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
ed412ca6a46d1bd055120e5c12806c15419ae8c4dd6d3f6ea20a83091d5c78bf mesa-11.0.4.tar.gz
40201bf7fc6fa12a6d9edfe870b41eb4dd6669154e3c42c48a96f70805f5483d mesa-11.0.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86281">Bug 86281</a> - brw_meta_fast_clear (brw=brw&#64;entry=0x7fffd4097a08, fb=fb&#64;entry=0x7fffd40fa900, buffers=buffers&#64;entry=2, partial_clear=partial_clear&#64;entry=false)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86720">Bug 86720</a> - [radeon] Europa Universalis 4 freezing during game start (10.3.3+, still broken on 11.0.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91788">Bug 91788</a> - [HSW Regression] Synmark2_v6 Multithread performance case FPS reduced by 36%</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92304">Bug 92304</a> - [cts] cts.shaders.negative conformance tests fail</li>
</ul>
<h2>Changes</h2>
<p>Alejandro Piñeiro (2):</p>
<ul>
<li>i965/vec4: check writemask when bailing out at register coalesce</li>
<li>i965/vec4: fill src_reg type using the constructor type parameter</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>vbo: fix incorrect switch statement in init_mat_currval()</li>
<li>mesa: fix incorrect opcode in save_BlendFunci()</li>
</ul>
<p>Chih-Wei Huang (3):</p>
<ul>
<li>mesa: android: Fix the incorrect path of sse_minmax.c</li>
<li>nv50/ir: use C++11 standard std::unordered_map if possible</li>
<li>nv30: include the header of ffs prototype</li>
</ul>
<p>Chris Wilson (1):</p>
<ul>
<li>i965: Remove early release of DRI2 miptree</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>mesa/uniforms: fix get_uniform for doubles (v2)</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.3</li>
</ul>
<p>Francisco Jerez (5):</p>
<ul>
<li>i965: Don't tell the hardware about our UAV access.</li>
<li>mesa: Expose function to calculate whether a shader image unit is valid.</li>
<li>mesa: Skip redundant texture completeness checking during image validation.</li>
<li>i965: Use _mesa_is_image_unit_valid() instead of gl_image_unit::_Valid.</li>
<li>mesa: Get rid of texture-dependent image unit derived state.</li>
</ul>
<p>Ian Romanick (8):</p>
<ul>
<li>glsl: Allow built-in functions as constant expressions in OpenGL ES 1.00</li>
<li>ff_fragment_shader: Use binding to set the sampler unit</li>
<li>glsl/linker: Use constant_initializer instead of constant_value to initialize uniforms</li>
<li>glsl: Use constant_initializer instead of constant_value to determine whether to keep an unused uniform</li>
<li>glsl: Only set ir_variable::constant_value for const-decorated variables</li>
<li>glsl: Restrict initializers for global variables to constant expression in ES</li>
<li>glsl: Add method to determine whether an expression contains the sequence operator</li>
<li>glsl: In later GLSL versions, sequence operator is cannot be a constant expression</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nouveau: make sure there's always room to emit a fence</li>
</ul>
<p>Indrajit Das (1):</p>
<ul>
<li>st/va: Used correct parameter to derive the value of the "h" variable in vlVaCreateImage</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>configure.ac: ensure RM is set</li>
</ul>
<p>Krzysztof Sobiecki (1):</p>
<ul>
<li>st/fbo: use pipe_surface_release instead of pipe_surface_reference</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/dec/h264: fix field picture type 0 poc disorder</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: fix clip state dependencies</li>
<li>radeonsi: fix a GS copy shader leak</li>
<li>gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>u_vbuf: fix vb slot assignment for translated buffers</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno/a3xx: cache-flush is needed after MEM_WRITE</li>
</ul>
<p>Tapani Pälli (3):</p>
<ul>
<li>mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span</li>
<li>mesa: Set api prefix to version string when overriding version</li>
<li>mesa: fix ARRAY_SIZE query for GetProgramResourceiv</li>
</ul>
</div>
</body>
</html>

174
docs/relnotes/11.0.5.html Normal file
View File

@@ -0,0 +1,174 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.5 Release Notes / November 11, 2015</h1>
<p>
Mesa 11.0.5 is a bug fix release which fixes bugs found since the 11.0.4 release.
</p>
<p>
Mesa 11.0.5 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
8495ef5c06f7f726452462b7d408a5b40048373ff908f2283a3b4d1f49b45ee6 mesa-11.0.5.tar.gz
9c255a2a6695fcc6ef4a279e1df0aeaf417dc142f39ee59dfb533d80494bb67a mesa-11.0.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91993">Bug 91993</a> - Graphical glitch in Astromenace (open-source game).</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92214">Bug 92214</a> - Flightgear crashes during splashboot with R600 driver, LLVM 3.7.0 and mesa 11.0.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92437">Bug 92437</a> - osmesa: Expose GL entry points for Windows build, via .def file</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92476">Bug 92476</a> - [cts] ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92623">Bug 92623</a> - Differences in prog_data ignored when caching fragment programs (causes hangs)</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>radeon/uvd: don't expose HEVC on old UVD hw (v3)</li>
</ul>
<p>Ben Widawsky (1):</p>
<ul>
<li>i965/skl: Add GT4 PCI IDs</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.4</li>
<li>cherry-ignore: ignore a possible wrong nomination</li>
<li>Revert "mesa/glformats: Undo code changes from _mesa_base_tex_format() move"</li>
<li>Update version to 11.0.5</li>
</ul>
<p>Emmanuel Gil Peyrot (1):</p>
<ul>
<li>gbm.h: Add a missing stddef.h include for size_t.</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: When the create ioctl fails, free our cache and try again.</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>i965: Fix is-renderable check in intel_image_target_renderbuffer_storage</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nvc0: respect edgeflag attribute width</li>
<li>nouveau: set MaxDrawBuffers to the same value as MaxColorAttachments</li>
<li>nouveau: relax fence emit space assert</li>
</ul>
<p>Ivan Kalvachev (1):</p>
<ul>
<li>r600g: Fix special negative immediate constants when using ABS modifier.</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir/lower_vec_to_movs: Pass the shader around directly</li>
<li>nir: Report progress from lower_vec_to_movs().</li>
</ul>
<p>Jose Fonseca (2):</p>
<ul>
<li>gallivm: Translate all util_cpu_caps bits to LLVM attributes.</li>
<li>gallivm: Explicitly disable unsupported CPU features.</li>
</ul>
<p>Julien Isorce (4):</p>
<ul>
<li>st/va: pass picture desc to begin and decode</li>
<li>nvc0: fix crash when nv50_miptree_from_handle fails</li>
<li>st/va: do not destroy old buffer when new one failed</li>
<li>st/va: add more errors checks in vlVaBufferSetNumElements and vlVaMapBuffer</li>
</ul>
<p>Kenneth Graunke (6):</p>
<ul>
<li>i965: Fix missing BRW_NEW_*_PROG_DATA flagging caused by cache reuse.</li>
<li>nir: Report progress from nir_split_var_copies().</li>
<li>nir: Properly invalidate metadata in nir_split_var_copies().</li>
<li>nir: Properly invalidate metadata in nir_opt_copy_prop().</li>
<li>nir: Properly invalidate metadata in nir_lower_vec_to_movs().</li>
<li>nir: Properly invalidate metadata in nir_opt_remove_phis().</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: add register definitions for Stoney</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>mesa/glformats: Undo code changes from _mesa_base_tex_format() move</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>st/mesa: fix mipmap generation for immutable textures with incomplete pyramids</li>
</ul>
<p>Nigel Stewart (1):</p>
<ul>
<li>osmesa: Expose GL entry points for Windows build via DEF file.</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: disable f16c when not using AVX</li>
</ul>
<p>Samuel Li (2):</p>
<ul>
<li>radeonsi: add support for Stoney asics (v3)</li>
<li>radeonsi: add Stoney pci ids</li>
</ul>
</div>
</body>
</html>

144
docs/relnotes/11.0.6.html Normal file
View File

@@ -0,0 +1,144 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.6 Release Notes / November 21, 2015</h1>
<p>
Mesa 11.0.6 is a bug fix release which fixes bugs found since the 11.0.5 release.
</p>
<p>
Mesa 11.0.6 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91780">Bug 91780</a> - Rendering issues with geometry shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92588">Bug 92588</a> - [HSW,BDW,BSW,SKL-Y][GLES 3.1 CTS] ES31-CTS.arrays_of_arrays.InteractionFunctionCalls2 - assert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92738">Bug 92738</a> - Randon R7 240 doesn't work on 16KiB page size platform</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92860">Bug 92860</a> - [radeonsi][bisected] st/mesa: implement ARB_copy_image - Corruption in ARK Survival Evolved</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92900">Bug 92900</a> - [regression bisected] About 700 piglit regressions is what could go wrong</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>radeonsi: enable optimal raster config setting for fiji (v2)</li>
</ul>
<p>Ben Widawsky (1):</p>
<ul>
<li>i965/skl/gt4: Fix URB programming restriction.</li>
</ul>
<p>Boyuan Zhang (2):</p>
<ul>
<li>st/vaapi: fix vaapi VC-1 simple/main corruption v2</li>
<li>radeon/uvd: fix VC-1 simple/main profile decode v2</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600: initialised PGM_RESOURCES_2 for ES/GS</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.5</li>
<li>cherry-ignore: add the swrast front buffer support</li>
<li>automake: use static llvm for make distcheck</li>
<li>Update version to 11.0.6</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.</li>
<li>vc4: Return NULL when we can't make our shadow for a sampler view.</li>
<li>vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB.</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>meta/generate_mipmap: Don't leak the sampler object</li>
<li>meta/generate_mipmap: Only modify the draw framebuffer binding in fallback_required</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>mesa/copyimage: allow width/height to not be multiples of block</li>
<li>nouveau: don't expose HEVC decoding support</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>glsl: Allow implicit int -&gt; uint conversions for the % operator.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3</li>
</ul>
<p>Oded Gabbay (1):</p>
<ul>
<li>llvmpipe: use simple coeffs calc for 128bit vectors</li>
</ul>
<p>Roland Scheidegger (2):</p>
<ul>
<li>radeon: fix bgrx8/xrgb8 blits</li>
<li>r200: fix bgrx8/xrgb8 blits</li>
</ul>
</div>
</body>
</html>

View File

@@ -124,6 +124,10 @@ CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake ULT GT2F")
CHIPSET(0x1926, skl_gt3, "Intel(R) Skylake ULT GT3")
CHIPSET(0x192A, skl_gt3, "Intel(R) Skylake SRV GT3")
CHIPSET(0x192B, skl_gt3, "Intel(R) Skylake Halo GT3")
CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")
CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")
CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")
CHIPSET(0x193D, skl_gt4, "Intel(R) Skylake GT4")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")

View File

@@ -181,3 +181,5 @@ CHIPSET(0x9876, CARRIZO_, CARRIZO)
CHIPSET(0x9877, CARRIZO_, CARRIZO)
CHIPSET(0x7300, FIJI_, FIJI)
CHIPSET(0x98E4, STONEY_, STONEY)

View File

@@ -312,6 +312,8 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
else
conf->dri_single_config = dri_config;
}
conf->base.SurfaceType = 0;
conf->base.ConfigID = config_id;
_eglLinkConfig(&conf->base);
@@ -2384,13 +2386,18 @@ dri2_client_wait_sync(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync,
unsigned wait_flags = 0;
EGLint ret = EGL_CONDITION_SATISFIED_KHR;
if (flags & EGL_SYNC_FLUSH_COMMANDS_BIT_KHR)
/* The EGL_KHR_fence_sync spec states:
*
* "If no context is current for the bound API,
* the EGL_SYNC_FLUSH_COMMANDS_BIT_KHR bit is ignored.
*/
if (dri2_ctx && flags & EGL_SYNC_FLUSH_COMMANDS_BIT_KHR)
wait_flags |= __DRI2_FENCE_FLAG_FLUSH_COMMANDS;
/* the sync object should take a reference while waiting */
dri2_egl_ref_sync(dri2_sync);
if (dri2_dpy->fence->client_wait_sync(dri2_ctx->dri_context,
if (dri2_dpy->fence->client_wait_sync(dri2_ctx ? dri2_ctx->dri_context : NULL,
dri2_sync->fence, wait_flags,
timeout))
dri2_sync->base.SyncStatus = EGL_SIGNALED_KHR;

View File

@@ -152,12 +152,51 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,
/* The EGL_KHR_create_context spec says:
*
* "Flags are only defined for OpenGL context creation, and
* specifying a flags value other than zero for other types of
* contexts, including OpenGL ES contexts, will generate an
* error."
* "If the EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR flag bit is set in
* EGL_CONTEXT_FLAGS_KHR, then a <debug context> will be created.
* [...]
* In some cases a debug context may be identical to a non-debug
* context. This bit is supported for OpenGL and OpenGL ES
* contexts."
*/
if (api != EGL_OPENGL_API && val != 0) {
if ((val & EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR) &&
(api != EGL_OPENGL_API && api != EGL_OPENGL_ES_API)) {
err = EGL_BAD_ATTRIBUTE;
break;
}
/* The EGL_KHR_create_context spec says:
*
* "If the EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR flag bit
* is set in EGL_CONTEXT_FLAGS_KHR, then a <forward-compatible>
* context will be created. Forward-compatible contexts are
* defined only for OpenGL versions 3.0 and later. They must not
* support functionality marked as <deprecated> by that version of
* the API, while a non-forward-compatible context must support
* all functionality in that version, deprecated or not. This bit
* is supported for OpenGL contexts, and requesting a
* forward-compatible context for OpenGL versions less than 3.0
* will generate an error."
*/
if ((val & EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR) &&
(api != EGL_OPENGL_API || ctx->ClientMajorVersion < 3)) {
err = EGL_BAD_ATTRIBUTE;
break;
}
/* The EGL_KHR_create_context_spec says:
*
* "If the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR bit is set in
* EGL_CONTEXT_FLAGS_KHR, then a context supporting <robust buffer
* access> will be created. Robust buffer access is defined in the
* GL_ARB_robustness extension specification, and the resulting
* context must also support either the GL_ARB_robustness
* extension, or a version of OpenGL incorporating equivalent
* functionality. This bit is supported for OpenGL contexts.
*/
if ((val & EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR) &&
(api != EGL_OPENGL_API ||
!dpy->Extensions.EXT_create_context_robustness)) {
err = EGL_BAD_ATTRIBUTE;
break;
}

View File

@@ -427,6 +427,7 @@ lp_build_init(void)
*/
util_cpu_caps.has_avx = 0;
util_cpu_caps.has_avx2 = 0;
util_cpu_caps.has_f16c = 0;
}
#ifdef PIPE_ARCH_PPC_64
@@ -458,7 +459,9 @@ lp_build_init(void)
util_cpu_caps.has_sse3 = 0;
util_cpu_caps.has_ssse3 = 0;
util_cpu_caps.has_sse4_1 = 0;
util_cpu_caps.has_sse4_2 = 0;
util_cpu_caps.has_avx = 0;
util_cpu_caps.has_avx2 = 0;
util_cpu_caps.has_f16c = 0;
#endif

View File

@@ -137,6 +137,8 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
/* if we get here, we missed a shader cap above (and should have seen
* a compiler warning.)

View File

@@ -81,6 +81,8 @@
# pragma pop_macro("DEBUG")
#endif
#include "c11/threads.h"
#include "os/os_thread.h"
#include "pipe/p_config.h"
#include "util/u_debug.h"
#include "util/u_cpu_detect.h"
@@ -103,6 +105,33 @@ static LLVMEnsureMultithreaded lLVMEnsureMultithreaded;
}
static once_flag init_native_targets_once_flag;
static void init_native_targets()
{
// If we have a native target, initialize it to ensure it is linked in and
// usable by the JIT.
llvm::InitializeNativeTarget();
llvm::InitializeNativeTargetAsmPrinter();
llvm::InitializeNativeTargetDisassembler();
}
/**
* The llvm target registry is not thread-safe, so drivers and state-trackers
* that want to initialize targets should use the gallivm_init_llvm_targets()
* function to safely initialize targets.
*
* LLVM targets should be initialized before the driver or state-tracker tries
* to access the registry.
*/
extern "C" void
gallivm_init_llvm_targets(void)
{
call_once(&init_native_targets_once_flag, init_native_targets);
}
extern "C" void
lp_set_target_options(void)
{
@@ -115,13 +144,7 @@ lp_set_target_options(void)
llvm::DisablePrettyStackTrace = true;
#endif
// If we have a native target, initialize it to ensure it is linked in and
// usable by the JIT.
llvm::InitializeNativeTarget();
llvm::InitializeNativeTargetAsmPrinter();
llvm::InitializeNativeTargetDisassembler();
gallivm_init_llvm_targets();
}
@@ -474,20 +497,48 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
#endif
}
llvm::SmallVector<std::string, 1> MAttrs;
if (util_cpu_caps.has_avx) {
/*
* AVX feature is not automatically detected from CPUID by the X86 target
* yet, because the old (yet default) JIT engine is not capable of
* emitting the opcodes. On newer llvm versions it is and at least some
* versions (tested with 3.3) will emit avx opcodes without this anyway.
*/
MAttrs.push_back("+avx");
if (util_cpu_caps.has_f16c) {
MAttrs.push_back("+f16c");
}
builder.setMAttrs(MAttrs);
}
llvm::SmallVector<std::string, 16> MAttrs;
#if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64)
/*
* We need to unset attributes because sometimes LLVM mistakenly assumes
* certain features are present given the processor name.
*
* https://bugs.freedesktop.org/show_bug.cgi?id=92214
* http://llvm.org/PR25021
* http://llvm.org/PR19429
* http://llvm.org/PR16721
*/
MAttrs.push_back(util_cpu_caps.has_sse ? "+sse" : "-sse" );
MAttrs.push_back(util_cpu_caps.has_sse2 ? "+sse2" : "-sse2" );
MAttrs.push_back(util_cpu_caps.has_sse3 ? "+sse3" : "-sse3" );
MAttrs.push_back(util_cpu_caps.has_ssse3 ? "+ssse3" : "-ssse3" );
#if HAVE_LLVM >= 0x0304
MAttrs.push_back(util_cpu_caps.has_sse4_1 ? "+sse4.1" : "-sse4.1");
#else
MAttrs.push_back(util_cpu_caps.has_sse4_1 ? "+sse41" : "-sse41" );
#endif
#if HAVE_LLVM >= 0x0304
MAttrs.push_back(util_cpu_caps.has_sse4_2 ? "+sse4.2" : "-sse4.2");
#else
MAttrs.push_back(util_cpu_caps.has_sse4_2 ? "+sse42" : "-sse42" );
#endif
/*
* AVX feature is not automatically detected from CPUID by the X86 target
* yet, because the old (yet default) JIT engine is not capable of
* emitting the opcodes. On newer llvm versions it is and at least some
* versions (tested with 3.3) will emit avx opcodes without this anyway.
*/
MAttrs.push_back(util_cpu_caps.has_avx ? "+avx" : "-avx");
MAttrs.push_back(util_cpu_caps.has_f16c ? "+f16c" : "-f16c");
MAttrs.push_back(util_cpu_caps.has_avx2 ? "+avx2" : "-avx2");
#endif
#if defined(PIPE_ARCH_PPC)
MAttrs.push_back(util_cpu_caps.has_altivec ? "+altivec" : "-altivec");
#endif
builder.setMAttrs(MAttrs);
#if HAVE_LLVM >= 0x0305
StringRef MCPU = llvm::sys::getHostCPUName();

View File

@@ -41,6 +41,8 @@ extern "C" {
struct lp_generated_code;
extern void
gallivm_init_llvm_targets(void);
extern void
lp_set_target_options(void);

View File

@@ -463,6 +463,8 @@ tgsi_exec_get_shader_param(enum pipe_shader_cap param)
case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
/* if we get here, we missed a shader cap above (and should have seen
* a compiler warning.)

View File

@@ -1190,6 +1190,8 @@ static void blitter_draw(struct blitter_context_priv *ctx,
u_upload_data(ctx->upload, 0, sizeof(ctx->vertices), ctx->vertices,
&vb.buffer_offset, &vb.buffer);
if (!vb.buffer)
return;
u_upload_unmap(ctx->upload);
pipe->set_vertex_buffers(pipe, ctx->base.vb_slot, 1, &vb);
@@ -2089,6 +2091,9 @@ void util_blitter_clear_buffer(struct blitter_context *blitter,
u_upload_data(ctx->upload, 0, num_channels*4, clear_value,
&vb.buffer_offset, &vb.buffer);
if (!vb.buffer)
goto out;
vb.stride = 0;
blitter_set_running_flag(ctx);
@@ -2112,6 +2117,7 @@ void util_blitter_clear_buffer(struct blitter_context *blitter,
util_draw_arrays(pipe, PIPE_PRIM_POINTS, 0, size / 4);
out:
blitter_restore_vertex_states(ctx);
blitter_restore_render_cond(ctx);
blitter_unset_running_flag(ctx);

View File

@@ -545,6 +545,7 @@ u_vbuf_translate_find_free_vb_slots(struct u_vbuf *mgr,
index = ffs(unused_vb_mask) - 1;
fallback_vbs[type] = index;
unused_vb_mask &= ~(1 << index);
/*printf("found slot=%i for type=%i\n", index, type);*/
}
}

View File

@@ -355,6 +355,10 @@ to be 0.
are supported.
* ``PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE``: Whether the driver doesn't
ignore tgsi_declaration_range::Last for shader inputs and outputs.
* ``PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT``: This is the maximum number
of iterations that loops are allowed to have to be unrolled. It is only
a hint to state trackers. Whether any loops will be unrolled is not
guaranteed.
.. _pipe_compute_cap:

View File

@@ -828,11 +828,7 @@ fd3_emit_restore(struct fd_context *ctx)
OUT_RING(ring, A3XX_HLSQ_CONST_FSPRESV_RANGE_REG_STARTENTRY(0) |
A3XX_HLSQ_CONST_FSPRESV_RANGE_REG_ENDENTRY(0));
OUT_PKT0(ring, REG_A3XX_UCHE_CACHE_INVALIDATE0_REG, 2);
OUT_RING(ring, A3XX_UCHE_CACHE_INVALIDATE0_REG_ADDR(0));
OUT_RING(ring, A3XX_UCHE_CACHE_INVALIDATE1_REG_ADDR(0) |
A3XX_UCHE_CACHE_INVALIDATE1_REG_OPCODE(INVALIDATE) |
A3XX_UCHE_CACHE_INVALIDATE1_REG_ENTIRE_CACHE);
fd3_emit_cache_flush(ctx, ring);
OUT_PKT0(ring, REG_A3XX_GRAS_CL_CLIP_CNTL, 1);
OUT_RING(ring, 0x00000000); /* GRAS_CL_CLIP_CNTL */

View File

@@ -90,4 +90,15 @@ void fd3_emit_restore(struct fd_context *ctx);
void fd3_emit_init(struct pipe_context *pctx);
static inline void
fd3_emit_cache_flush(struct fd_context *ctx, struct fd_ringbuffer *ring)
{
fd_wfi(ctx, ring);
OUT_PKT0(ring, REG_A3XX_UCHE_CACHE_INVALIDATE0_REG, 2);
OUT_RING(ring, A3XX_UCHE_CACHE_INVALIDATE0_REG_ADDR(0));
OUT_RING(ring, A3XX_UCHE_CACHE_INVALIDATE1_REG_ADDR(0) |
A3XX_UCHE_CACHE_INVALIDATE1_REG_OPCODE(INVALIDATE) |
A3XX_UCHE_CACHE_INVALIDATE1_REG_ENTIRE_CACHE);
}
#endif /* FD3_EMIT_H */

View File

@@ -558,6 +558,8 @@ fd3_emit_tile_mem2gmem(struct fd_context *ctx, struct fd_tile *tile)
OUT_RING(ring, fui(x1));
OUT_RING(ring, fui(y1));
fd3_emit_cache_flush(ctx, ring);
for (i = 0; i < 4; i++) {
OUT_PKT0(ring, REG_A3XX_RB_MRT_CONTROL(i), 1);
OUT_RING(ring, A3XX_RB_MRT_CONTROL_ROP_CODE(ROP_COPY) |

View File

@@ -407,6 +407,8 @@ fd_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
return 16;
case PIPE_SHADER_CAP_PREFERRED_IR:
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
debug_printf("unknown shader param %d\n", param);
return 0;

View File

@@ -167,6 +167,8 @@ i915_get_shader_param(struct pipe_screen *screen, unsigned shader, enum pipe_sha
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
debug_printf("%s: Unknown cap %u.\n", __FUNCTION__, cap);
return 0;

View File

@@ -138,6 +138,8 @@ ilo_get_shader_param(struct pipe_screen *screen, unsigned shader,
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED:
return 1;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
return 0;

View File

@@ -746,7 +746,12 @@ lp_build_interp_soa_init(struct lp_build_interp_soa_context *bld,
pos_init(bld, x0, y0);
if (coeff_type.length > 4) {
/*
* Simple method (single step interpolation) may be slower if vector length
* is just 4, but the results are different (generally less accurate) with
* the other method, so always use more accurate version.
*/
if (1) {
bld->simple_interp = TRUE;
{
/* XXX this should use a global static table */

View File

@@ -25,10 +25,24 @@
#include <stack>
#include <limits>
#if __cplusplus >= 201103L
#include <unordered_map>
#else
#include <tr1/unordered_map>
#endif
namespace nv50_ir {
#if __cplusplus >= 201103L
using std::hash;
using std::unordered_map;
#elif !defined(ANDROID)
using std::tr1::hash;
using std::tr1::unordered_map;
#else
#error Android release before Lollipop is not supported!
#endif
#define MAX_REGISTER_FILE_SIZE 256
class RegisterSet
@@ -349,12 +363,12 @@ RegAlloc::PhiMovesPass::needNewElseBlock(BasicBlock *b, BasicBlock *p)
struct PhiMapHash {
size_t operator()(const std::pair<Instruction *, BasicBlock *>& val) const {
return std::tr1::hash<Instruction*>()(val.first) * 31 +
std::tr1::hash<BasicBlock*>()(val.second);
return hash<Instruction*>()(val.first) * 31 +
hash<BasicBlock*>()(val.second);
}
};
typedef std::tr1::unordered_map<
typedef unordered_map<
std::pair<Instruction *, BasicBlock *>, Value *, PhiMapHash> PhiMap;
// Critical edges need to be split up so that work can be inserted along

View File

@@ -80,7 +80,12 @@ release_allocation(struct nouveau_mm_allocation **mm,
inline void
nouveau_buffer_release_gpu_storage(struct nv04_resource *buf)
{
nouveau_bo_ref(NULL, &buf->bo);
if (buf->fence && buf->fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
nouveau_fence_work(buf->fence, nouveau_fence_unref_bo, buf->bo);
buf->bo = NULL;
} else {
nouveau_bo_ref(NULL, &buf->bo);
}
if (buf->mm)
release_allocation(&buf->mm, buf->fence);
@@ -281,7 +286,8 @@ nouveau_buffer_transfer_del(struct nouveau_context *nv,
{
if (tx->map) {
if (likely(tx->bo)) {
nouveau_bo_ref(NULL, &tx->bo);
nouveau_fence_work(nv->screen->fence.current,
nouveau_fence_unref_bo, tx->bo);
if (tx->mm)
release_allocation(&tx->mm, nv->screen->fence.current);
} else {
@@ -782,7 +788,7 @@ nouveau_buffer_migrate(struct nouveau_context *nv,
nv->copy_data(nv, buf->bo, buf->offset, new_domain,
bo, offset, old_domain, buf->base.width0);
nouveau_bo_ref(NULL, &bo);
nouveau_fence_work(screen->fence.current, nouveau_fence_unref_bo, bo);
if (mm)
release_allocation(&mm, screen->fence.current);
} else

View File

@@ -190,8 +190,14 @@ nouveau_fence_wait(struct nouveau_fence *fence)
/* wtf, someone is waiting on a fence in flush_notify handler? */
assert(fence->state != NOUVEAU_FENCE_STATE_EMITTING);
if (fence->state < NOUVEAU_FENCE_STATE_EMITTED)
nouveau_fence_emit(fence);
if (fence->state < NOUVEAU_FENCE_STATE_EMITTED) {
PUSH_SPACE(screen->pushbuf, 8);
/* The space allocation might trigger a flush, which could emit the
* current fence. So check again.
*/
if (fence->state < NOUVEAU_FENCE_STATE_EMITTED)
nouveau_fence_emit(fence);
}
if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED)
if (nouveau_pushbuf_kick(screen->pushbuf, screen->pushbuf->channel))
@@ -224,10 +230,22 @@ nouveau_fence_wait(struct nouveau_fence *fence)
void
nouveau_fence_next(struct nouveau_screen *screen)
{
if (screen->fence.current->state < NOUVEAU_FENCE_STATE_EMITTING)
nouveau_fence_emit(screen->fence.current);
if (screen->fence.current->state < NOUVEAU_FENCE_STATE_EMITTING) {
if (screen->fence.current->ref > 1)
nouveau_fence_emit(screen->fence.current);
else
return;
}
nouveau_fence_ref(NULL, &screen->fence.current);
nouveau_fence_new(screen, &screen->fence.current, false);
}
void
nouveau_fence_unref_bo(void *data)
{
struct nouveau_bo *bo = data;
nouveau_bo_ref(NULL, &bo);
}

View File

@@ -37,6 +37,9 @@ void nouveau_fence_next(struct nouveau_screen *);
bool nouveau_fence_wait(struct nouveau_fence *);
bool nouveau_fence_signalled(struct nouveau_fence *);
void nouveau_fence_unref_bo(void *data); /* generic unref bo callback */
static inline void
nouveau_fence_ref(struct nouveau_fence *fence, struct nouveau_fence **ref)
{

View File

@@ -437,6 +437,7 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
/* VP3 does not support MPEG4, VP4+ do. */
return entrypoint == PIPE_VIDEO_ENTRYPOINT_BITSTREAM &&
profile >= PIPE_VIDEO_PROFILE_MPEG1 &&
profile < PIPE_VIDEO_PROFILE_HEVC_MAIN &&
(!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4) &&
firmware_present(pscreen, profile);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:

View File

@@ -24,6 +24,8 @@ PUSH_AVAIL(struct nouveau_pushbuf *push)
static inline bool
PUSH_SPACE(struct nouveau_pushbuf *push, uint32_t size)
{
/* Provide a buffer so that fences always have room to be emitted */
size += 8;
if (PUSH_AVAIL(push) < size)
return nouveau_pushbuf_space(push, size, 0, 0) == 0;
return true;

View File

@@ -78,12 +78,12 @@ nv30_format_info_table[PIPE_FORMAT_COUNT] = {
_(B4G4R4X4_UNORM , S___),
_(B4G4R4A4_UNORM , S___),
_(B5G6R5_UNORM , SB__),
_(B8G8R8X8_UNORM , SB__),
_(B8G8R8X8_SRGB , S___),
_(B8G8R8A8_UNORM , SB__),
_(B8G8R8A8_SRGB , S___),
_(BGRX8888_UNORM , SB__),
_(BGRX8888_SRGB , S___),
_(BGRA8888_UNORM , SB__),
_(BGRA8888_SRGB , S___),
_(R8G8B8A8_UNORM , __V_),
_(R8G8B8A8_SNORM , S___),
_(RGBA8888_SNORM , S___),
_(DXT1_RGB , S___),
_(DXT1_SRGB , S___),
_(DXT1_RGBA , S___),
@@ -138,8 +138,8 @@ const struct nv30_format
nv30_format_table[PIPE_FORMAT_COUNT] = {
R_(B5G5R5X1_UNORM , X1R5G5B5 ),
R_(B5G6R5_UNORM , R5G6B5 ),
R_(B8G8R8X8_UNORM , X8R8G8B8 ),
R_(B8G8R8A8_UNORM , A8R8G8B8 ),
R_(BGRX8888_UNORM , X8R8G8B8 ),
R_(BGRA8888_UNORM , A8R8G8B8 ),
Z_(Z16_UNORM , Z16 ),
Z_(X8Z24_UNORM , Z24S8 ),
Z_(S8_UINT_Z24_UNORM , Z24S8 ),
@@ -223,11 +223,11 @@ nv30_texfmt_table[PIPE_FORMAT_COUNT] = {
_(B4G4R4X4_UNORM , A4R4G4B4, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(B4G4R4A4_UNORM , A4R4G4B4, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
_(B5G6R5_UNORM , R5G6B5 , 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(B8G8R8X8_UNORM , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(B8G8R8X8_SRGB , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
_(B8G8R8A8_UNORM , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
_(B8G8R8A8_SRGB , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, SRGB, ____),
_(R8G8B8A8_SNORM , A8R8G8B8, 0, C, C, C, C, 0, 1, 2, 3, NONE, SSSS),
_(BGRX8888_UNORM , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(BGRX8888_SRGB , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
_(BGRA8888_UNORM , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
_(BGRA8888_SRGB , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, SRGB, ____),
_(RGBA8888_SNORM , A8R8G8B8, 0, C, C, C, C, 0, 1, 2, 3, NONE, SSSS),
_(DXT1_RGB , DXT1 , 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(DXT1_SRGB , DXT1 , 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
_(DXT1_RGBA , DXT1 , 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),

View File

@@ -339,10 +339,15 @@ nv30_miptree_transfer_unmap(struct pipe_context *pipe,
struct nv30_context *nv30 = nv30_context(pipe);
struct nv30_transfer *tx = nv30_transfer(ptx);
if (ptx->usage & PIPE_TRANSFER_WRITE)
if (ptx->usage & PIPE_TRANSFER_WRITE) {
nv30_transfer_rect(nv30, NEAREST, &tx->tmp, &tx->img);
nouveau_bo_ref(NULL, &tx->tmp.bo);
/* Allow the copies above to finish executing before freeing the source */
nouveau_fence_work(nv30->screen->base.fence.current,
nouveau_fence_unref_bo, tx->tmp.bo);
} else {
nouveau_bo_ref(NULL, &tx->tmp.bo);
}
pipe_resource_reference(&ptx->resource, NULL);
FREE(tx);
}

View File

@@ -261,6 +261,8 @@ nv30_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
debug_printf("unknown vertex shader param %d\n", param);
return 0;
@@ -302,6 +304,8 @@ nv30_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
debug_printf("unknown fragment shader param %d\n", param);
return 0;
@@ -345,7 +349,9 @@ nv30_screen_fence_emit(struct pipe_screen *pscreen, uint32_t *sequence)
*sequence = ++screen->base.fence.sequence;
BEGIN_NV04(push, NV30_3D(FENCE_OFFSET), 2);
assert(PUSH_AVAIL(push) + push->rsvd_kick >= 3);
PUSH_DATA (push, NV30_3D_FENCE_OFFSET |
(2 /* size */ << 18) | (7 /* subchan */ << 13));
PUSH_DATA (push, 0);
PUSH_DATA (push, *sequence);
}

View File

@@ -191,7 +191,11 @@ nv30_vbo_validate(struct nv30_context *nv30)
if (!nv30->vertex || nv30->draw_flags)
return;
#ifdef PIPE_ARCH_BIG_ENDIAN
if (1) { /* Figure out where the buffers are getting messed up */
#else
if (unlikely(vertex->need_conversion)) {
#endif
nv30->vbo_fifo = ~0;
nv30->vbo_user = 0;
} else {

View File

@@ -1,3 +1,4 @@
#include <strings.h>
#include "pipe/p_context.h"
#include "pipe/p_defines.h"
#include "pipe/p_state.h"

View File

@@ -163,7 +163,10 @@ nv50_miptree_destroy(struct pipe_screen *pscreen, struct pipe_resource *pt)
{
struct nv50_miptree *mt = nv50_miptree(pt);
nouveau_bo_ref(NULL, &mt->base.bo);
if (mt->base.fence && mt->base.fence->state < NOUVEAU_FENCE_STATE_FLUSHED)
nouveau_fence_work(mt->base.fence, nouveau_fence_unref_bo, mt->base.bo);
else
nouveau_bo_ref(NULL, &mt->base.bo);
nouveau_fence_ref(NULL, &mt->base.fence);
nouveau_fence_ref(NULL, &mt->base.fence_wr);

View File

@@ -297,6 +297,8 @@ nv50_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
NOUVEAU_ERR("unknown PIPE_SHADER_CAP %d\n", param);
return 0;
@@ -386,6 +388,7 @@ nv50_screen_fence_emit(struct pipe_screen *pscreen, u32 *sequence)
/* we need to do it after possible flush in MARK_RING */
*sequence = ++screen->base.fence.sequence;
assert(PUSH_AVAIL(push) + push->rsvd_kick >= 5);
PUSH_DATA (push, NV50_FIFO_PKHDR(NV50_3D(QUERY_ADDRESS_HIGH), 4));
PUSH_DATAh(push, screen->fence.bo->offset);
PUSH_DATA (push, screen->fence.bo->offset);

View File

@@ -65,14 +65,9 @@ nv50_constbufs_validate(struct nv50_context *nv50)
PUSH_DATA (push, (b << 12) | (i << 8) | p | 1);
}
while (words) {
unsigned nr;
if (!PUSH_SPACE(push, 16))
break;
nr = PUSH_AVAIL(push);
assert(nr >= 16);
nr = MIN2(MIN2(nr - 3, words), NV04_PFIFO_MAX_PACKET_LEN);
unsigned nr = MIN2(words, NV04_PFIFO_MAX_PACKET_LEN);
PUSH_SPACE(push, nr + 3);
BEGIN_NV04(push, NV50_3D(CB_ADDR), 1);
PUSH_DATA (push, (start << 8) | b);
BEGIN_NI04(push, NV50_3D(CB_DATA(0)), nr);

View File

@@ -187,14 +187,7 @@ nv50_sifc_linear_u8(struct nouveau_context *nv,
PUSH_DATA (push, 0);
while (count) {
unsigned nr;
if (!PUSH_SPACE(push, 16))
break;
nr = PUSH_AVAIL(push);
assert(nr >= 16);
nr = MIN2(count, nr - 1);
nr = MIN2(nr, NV04_PFIFO_MAX_PACKET_LEN);
unsigned nr = MIN2(count, NV04_PFIFO_MAX_PACKET_LEN);
BEGIN_NI04(push, NV50_2D(SIFC_DATA), nr);
PUSH_DATAp(push, src, nr);
@@ -365,9 +358,14 @@ nv50_miptree_transfer_unmap(struct pipe_context *pctx,
tx->rect[0].base += mt->layer_stride;
tx->rect[1].base += tx->nblocksy * tx->base.stride;
}
/* Allow the copies above to finish executing before freeing the source */
nouveau_fence_work(nv50->screen->base.fence.current,
nouveau_fence_unref_bo, tx->rect[1].bo);
} else {
nouveau_bo_ref(NULL, &tx->rect[1].bo);
}
nouveau_bo_ref(NULL, &tx->rect[1].bo);
pipe_resource_reference(&transfer->resource, NULL);
FREE(tx);
@@ -390,12 +388,9 @@ nv50_cb_push(struct nouveau_context *nv,
nouveau_pushbuf_validate(push);
while (words) {
unsigned nr;
nr = PUSH_AVAIL(push);
nr = MIN2(nr - 7, words);
nr = MIN2(nr, NV04_PFIFO_MAX_PACKET_LEN - 1);
unsigned nr = MIN2(words, NV04_PFIFO_MAX_PACKET_LEN);
PUSH_SPACE(push, nr + 7);
BEGIN_NV04(push, NV50_3D(CB_DEF_ADDRESS_HIGH), 3);
PUSH_DATAh(push, bo->offset + base);
PUSH_DATA (push, bo->offset + base);

View File

@@ -26,7 +26,8 @@ nvc0_resource_from_handle(struct pipe_screen * screen,
} else {
struct pipe_resource *res = nv50_miptree_from_handle(screen,
templ, whandle);
nv04_resource(res)->vtbl = &nvc0_miptree_vtbl;
if (res)
nv04_resource(res)->vtbl = &nvc0_miptree_vtbl;
return res;
}
}

View File

@@ -310,6 +310,8 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
return 16; /* would be 32 in linked (OpenGL-style) mode */
case PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS:
return 16; /* XXX not sure if more are really safe */
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
NOUVEAU_ERR("unknown PIPE_SHADER_CAP %d\n", param);
return 0;
@@ -535,7 +537,8 @@ nvc0_screen_fence_emit(struct pipe_screen *pscreen, u32 *sequence)
/* we need to do it after possible flush in MARK_RING */
*sequence = ++screen->base.fence.sequence;
BEGIN_NVC0(push, NVC0_3D(QUERY_ADDRESS_HIGH), 4);
assert(PUSH_AVAIL(push) + push->rsvd_kick >= 5);
PUSH_DATA (push, NVC0_FIFO_PKHDR_SQ(NVC0_3D(QUERY_ADDRESS_HIGH), 4));
PUSH_DATAh(push, screen->fence.bo->offset);
PUSH_DATA (push, screen->fence.bo->offset);
PUSH_DATA (push, *sequence);

View File

@@ -188,14 +188,10 @@ nvc0_m2mf_push_linear(struct nouveau_context *nv,
nouveau_pushbuf_validate(push);
while (count) {
unsigned nr;
unsigned nr = MIN2(count, NV04_PFIFO_MAX_PACKET_LEN);
if (!PUSH_SPACE(push, 16))
if (!PUSH_SPACE(push, nr + 9))
break;
nr = PUSH_AVAIL(push);
assert(nr >= 16);
nr = MIN2(count, nr - 9);
nr = MIN2(nr, NV04_PFIFO_MAX_PACKET_LEN);
BEGIN_NVC0(push, NVC0_M2MF(OFFSET_OUT_HIGH), 2);
PUSH_DATAh(push, dst->offset + offset);
@@ -234,14 +230,10 @@ nve4_p2mf_push_linear(struct nouveau_context *nv,
nouveau_pushbuf_validate(push);
while (count) {
unsigned nr;
unsigned nr = MIN2(count, (NV04_PFIFO_MAX_PACKET_LEN - 1));
if (!PUSH_SPACE(push, 16))
if (!PUSH_SPACE(push, nr + 10))
break;
nr = PUSH_AVAIL(push);
assert(nr >= 16);
nr = MIN2(count, nr - 8);
nr = MIN2(nr, (NV04_PFIFO_MAX_PACKET_LEN - 1));
BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
PUSH_DATAh(push, dst->offset + offset);
@@ -495,11 +487,16 @@ nvc0_miptree_transfer_unmap(struct pipe_context *pctx,
tx->rect[1].base += tx->nblocksy * tx->base.stride;
}
NOUVEAU_DRV_STAT(&nvc0->screen->base, tex_transfers_wr, 1);
/* Allow the copies above to finish executing before freeing the source */
nouveau_fence_work(nvc0->screen->base.fence.current,
nouveau_fence_unref_bo, tx->rect[1].bo);
} else {
nouveau_bo_ref(NULL, &tx->rect[1].bo);
}
if (tx->base.usage & PIPE_TRANSFER_READ)
NOUVEAU_DRV_STAT(&nvc0->screen->base, tex_transfers_rd, 1);
nouveau_bo_ref(NULL, &tx->rect[1].bo);
pipe_resource_reference(&transfer->resource, NULL);
FREE(tx);
@@ -566,9 +563,7 @@ nvc0_cb_bo_push(struct nouveau_context *nv,
PUSH_DATA (push, bo->offset + base);
while (words) {
unsigned nr = PUSH_AVAIL(push);
nr = MIN2(nr, words);
nr = MIN2(nr, NV04_PFIFO_MAX_PACKET_LEN - 1);
unsigned nr = MIN2(words, NV04_PFIFO_MAX_PACKET_LEN - 1);
PUSH_SPACE(push, nr + 2);
PUSH_REFN (push, bo, NOUVEAU_BO_WR | domain);

View File

@@ -27,6 +27,7 @@ struct push_context {
struct {
bool enabled;
bool value;
uint8_t width;
unsigned stride;
const uint8_t *data;
} edgeflag;
@@ -53,6 +54,7 @@ nvc0_push_context_init(struct nvc0_context *nvc0, struct push_context *ctx)
/* silence warnings */
ctx->edgeflag.data = NULL;
ctx->edgeflag.stride = 0;
ctx->edgeflag.width = 0;
}
static inline void
@@ -100,6 +102,7 @@ nvc0_push_map_edgeflag(struct push_context *ctx, struct nvc0_context *nvc0,
struct nv04_resource *buf = nv04_resource(vb->buffer);
ctx->edgeflag.stride = vb->stride;
ctx->edgeflag.width = util_format_get_blocksize(ve->src_format);
if (buf) {
unsigned offset = vb->buffer_offset + ve->src_offset;
ctx->edgeflag.data = nouveau_resource_map_offset(&nvc0->base,
@@ -137,10 +140,17 @@ prim_restart_search_i32(const uint32_t *elts, unsigned push, uint32_t index)
}
static inline bool
ef_value(const struct push_context *ctx, uint32_t index)
ef_value_8(const struct push_context *ctx, uint32_t index)
{
float *pf = (float *)&ctx->edgeflag.data[index * ctx->edgeflag.stride];
return *pf ? true : false;
uint8_t *pf = (uint8_t *)&ctx->edgeflag.data[index * ctx->edgeflag.stride];
return !!*pf;
}
static inline bool
ef_value_32(const struct push_context *ctx, uint32_t index)
{
uint32_t *pf = (uint32_t *)&ctx->edgeflag.data[index * ctx->edgeflag.stride];
return !!*pf;
}
static inline bool
@@ -154,7 +164,11 @@ static inline unsigned
ef_toggle_search_i08(struct push_context *ctx, const uint8_t *elts, unsigned n)
{
unsigned i;
for (i = 0; i < n && ef_value(ctx, elts[i]) == ctx->edgeflag.value; ++i);
bool ef = ctx->edgeflag.value;
if (ctx->edgeflag.width == 1)
for (i = 0; i < n && ef_value_8(ctx, elts[i]) == ef; ++i);
else
for (i = 0; i < n && ef_value_32(ctx, elts[i]) == ef; ++i);
return i;
}
@@ -162,7 +176,11 @@ static inline unsigned
ef_toggle_search_i16(struct push_context *ctx, const uint16_t *elts, unsigned n)
{
unsigned i;
for (i = 0; i < n && ef_value(ctx, elts[i]) == ctx->edgeflag.value; ++i);
bool ef = ctx->edgeflag.value;
if (ctx->edgeflag.width == 1)
for (i = 0; i < n && ef_value_8(ctx, elts[i]) == ef; ++i);
else
for (i = 0; i < n && ef_value_32(ctx, elts[i]) == ef; ++i);
return i;
}
@@ -170,7 +188,11 @@ static inline unsigned
ef_toggle_search_i32(struct push_context *ctx, const uint32_t *elts, unsigned n)
{
unsigned i;
for (i = 0; i < n && ef_value(ctx, elts[i]) == ctx->edgeflag.value; ++i);
bool ef = ctx->edgeflag.value;
if (ctx->edgeflag.width == 1)
for (i = 0; i < n && ef_value_8(ctx, elts[i]) == ef; ++i);
else
for (i = 0; i < n && ef_value_32(ctx, elts[i]) == ef; ++i);
return i;
}
@@ -178,7 +200,11 @@ static inline unsigned
ef_toggle_search_seq(struct push_context *ctx, unsigned start, unsigned n)
{
unsigned i;
for (i = 0; i < n && ef_value(ctx, start++) == ctx->edgeflag.value; ++i);
bool ef = ctx->edgeflag.value;
if (ctx->edgeflag.width == 1)
for (i = 0; i < n && ef_value_8(ctx, start++) == ef; ++i);
else
for (i = 0; i < n && ef_value_32(ctx, start++) == ef; ++i);
return i;
}

View File

@@ -300,6 +300,8 @@ static int r300_get_shader_param(struct pipe_screen *pscreen, unsigned shader, e
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
case PIPE_SHADER_CAP_PREFERRED_IR:
return PIPE_SHADER_IR_TGSI;
}
@@ -356,6 +358,8 @@ static int r300_get_shader_param(struct pipe_screen *pscreen, unsigned shader, e
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
case PIPE_SHADER_CAP_PREFERRED_IR:
return PIPE_SHADER_IR_TGSI;
}

View File

@@ -2342,6 +2342,8 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx)
r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0);
/* to avoid GPU doing any preloading of constant from random address */
@@ -2781,6 +2783,8 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx)
r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN));
r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0);
/* to avoid GPU doing any preloading of constant from random address */

View File

@@ -1497,6 +1497,7 @@
#define S_028878_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28)
#define G_028878_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028878_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_02887C_SQ_PGM_RESOURCES_2_GS 0x02887C
#define R_028890_SQ_PGM_RESOURCES_ES 0x028890
#define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0)
@@ -1511,6 +1512,7 @@
#define S_028890_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28)
#define G_028890_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028890_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_028894_SQ_PGM_RESOURCES_2_ES 0x028894
#define R_028864_SQ_PGM_RESOURCES_2_VS 0x028864
#define S_028864_SINGLE_ROUND(x) (((x) & 0x3) << 0)

View File

@@ -621,7 +621,7 @@ static int replace_gpr_with_pv_ps(struct r600_bytecode *bc,
return 0;
}
void r600_bytecode_special_constants(uint32_t value, unsigned *sel, unsigned *neg)
void r600_bytecode_special_constants(uint32_t value, unsigned *sel, unsigned *neg, unsigned abs)
{
switch(value) {
case 0:
@@ -641,11 +641,11 @@ void r600_bytecode_special_constants(uint32_t value, unsigned *sel, unsigned *ne
break;
case 0xBF800000: /* -1.0f */
*sel = V_SQ_ALU_SRC_1;
*neg ^= 1;
*neg ^= !abs;
break;
case 0xBF000000: /* -0.5f */
*sel = V_SQ_ALU_SRC_0_5;
*neg ^= 1;
*neg ^= !abs;
break;
default:
*sel = V_SQ_ALU_SRC_LITERAL;
@@ -1194,7 +1194,7 @@ int r600_bytecode_add_alu_type(struct r600_bytecode *bc,
}
if (nalu->src[i].sel == V_SQ_ALU_SRC_LITERAL)
r600_bytecode_special_constants(nalu->src[i].value,
&nalu->src[i].sel, &nalu->src[i].neg);
&nalu->src[i].sel, &nalu->src[i].neg, nalu->src[i].abs);
}
if (nalu->dst.sel >= bc->ngpr) {
bc->ngpr = nalu->dst.sel + 1;

View File

@@ -254,7 +254,7 @@ int r600_bytecode_add_cfinst(struct r600_bytecode *bc,
int r600_bytecode_add_alu_type(struct r600_bytecode *bc,
const struct r600_bytecode_alu *alu, unsigned type);
void r600_bytecode_special_constants(uint32_t value,
unsigned *sel, unsigned *neg);
unsigned *sel, unsigned *neg, unsigned abs);
void r600_bytecode_disasm(struct r600_bytecode *bc);
void r600_bytecode_alu_read(struct r600_bytecode *bc,
struct r600_bytecode_alu *alu, uint32_t word0, uint32_t word1);

View File

@@ -504,6 +504,12 @@ static int r600_get_shader_param(struct pipe_screen* pscreen, unsigned shader, e
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
/* due to a bug in the shader compiler, some loops hang
* if they are not unrolled, see:
* https://bugs.freedesktop.org/show_bug.cgi?id=86720
*/
return 255;
}
return 0;
}

View File

@@ -1004,7 +1004,7 @@ static void tgsi_src(struct r600_shader_ctx *ctx,
(tgsi_src->Register.SwizzleX == tgsi_src->Register.SwizzleW)) {
index = tgsi_src->Register.Index * 4 + tgsi_src->Register.SwizzleX;
r600_bytecode_special_constants(ctx->literals[index], &r600_src->sel, &r600_src->neg);
r600_bytecode_special_constants(ctx->literals[index], &r600_src->sel, &r600_src->neg, r600_src->abs);
if (r600_src->sel != V_SQ_ALU_SRC_LITERAL)
return;
}

View File

@@ -305,12 +305,11 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx,
data += box->x % R600_MAP_BUFFER_ALIGNMENT;
return r600_buffer_get_transfer(ctx, resource, level, usage, box,
ptransfer, data, staging, offset);
} else {
return NULL; /* error, shouldn't occur though */
}
} else {
/* At this point, the buffer is always idle (we checked it above). */
usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
}
/* At this point, the buffer is always idle (we checked it above). */
usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
}
/* Using a staging buffer in GTT for larger reads is much faster. */
else if ((usage & PIPE_TRANSFER_READ) &&

View File

@@ -78,6 +78,9 @@ void r600_draw_rectangle(struct blitter_context *blitter,
* I guess the 4th one is derived from the first 3.
* The vertex specification should match u_blitter's vertex element state. */
u_upload_alloc(rctx->uploader, 0, sizeof(float) * 24, &offset, &buf, (void**)&vb);
if (!buf)
return;
vb[0] = x1;
vb[1] = y1;
vb[2] = depth;
@@ -412,6 +415,7 @@ static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
case CHIP_ICELAND: return "AMD ICELAND";
case CHIP_CARRIZO: return "AMD CARRIZO";
case CHIP_FIJI: return "AMD FIJI";
case CHIP_STONEY: return "AMD STONEY";
default: return "AMD unknown";
}
}
@@ -540,6 +544,11 @@ const char *r600_get_llvm_processor_name(enum radeon_family family)
case CHIP_ICELAND: return "iceland";
case CHIP_CARRIZO: return "carrizo";
case CHIP_FIJI: return "fiji";
#if HAVE_LLVM <= 0x0307
case CHIP_STONEY: return "carrizo";
#else
case CHIP_STONEY: return "stoney";
#endif
default: return "";
}
}

View File

@@ -989,6 +989,11 @@ static void *r600_texture_transfer_map(struct pipe_context *ctx,
if (usage & PIPE_TRANSFER_READ) {
struct pipe_resource *temp = ctx->screen->resource_create(ctx->screen, &resource);
if (!temp) {
R600_ERR("failed to create a temporary depth texture\n");
FREE(trans);
return NULL;
}
r600_copy_region_with_blit(ctx, temp, 0, 0, 0, 0, texture, level, box);
rctx->blit_decompress_depth(ctx, (struct r600_texture*)temp, staging_depth,

View File

@@ -25,6 +25,8 @@
*/
#include "radeon_llvm_emit.h"
#include "radeon_elf_util.h"
#include "c11/threads.h"
#include "gallivm/lp_bld_misc.h"
#include "util/u_memory.h"
#include "pipe/p_shader_tokens.h"
@@ -86,30 +88,29 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
static void init_r600_target()
{
static unsigned initialized = 0;
if (!initialized) {
gallivm_init_llvm_targets();
#if HAVE_LLVM < 0x0307
LLVMInitializeR600TargetInfo();
LLVMInitializeR600Target();
LLVMInitializeR600TargetMC();
LLVMInitializeR600AsmPrinter();
LLVMInitializeR600TargetInfo();
LLVMInitializeR600Target();
LLVMInitializeR600TargetMC();
LLVMInitializeR600AsmPrinter();
#else
LLVMInitializeAMDGPUTargetInfo();
LLVMInitializeAMDGPUTarget();
LLVMInitializeAMDGPUTargetMC();
LLVMInitializeAMDGPUAsmPrinter();
LLVMInitializeAMDGPUTargetInfo();
LLVMInitializeAMDGPUTarget();
LLVMInitializeAMDGPUTargetMC();
LLVMInitializeAMDGPUAsmPrinter();
#endif
initialized = 1;
}
}
static once_flag init_r600_target_once_flag = ONCE_FLAG_INIT;
LLVMTargetRef radeon_llvm_get_r600_target(const char *triple)
{
LLVMTargetRef target = NULL;
char *err_message = NULL;
init_r600_target();
call_once(&init_r600_target_once_flag, init_r600_target);
if (LLVMGetTargetFromTriple(triple, &target, &err_message)) {
fprintf(stderr, "Cannot find target for triple %s ", triple);

View File

@@ -940,6 +940,12 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
dec->msg->body.decode.width_in_samples = dec->base.width;
dec->msg->body.decode.height_in_samples = dec->base.height;
if ((picture->profile == PIPE_VIDEO_PROFILE_VC1_SIMPLE) ||
(picture->profile == PIPE_VIDEO_PROFILE_VC1_MAIN)) {
dec->msg->body.decode.width_in_samples = align(dec->msg->body.decode.width_in_samples, 16) / 16;
dec->msg->body.decode.height_in_samples = align(dec->msg->body.decode.height_in_samples, 16) / 16;
}
dec->msg->body.decode.dpb_size = dec->dpb.res->buf->size;
dec->msg->body.decode.bsd_size = bs_size;
dec->msg->body.decode.db_pitch = dec->base.width;

View File

@@ -233,6 +233,9 @@ static void vui(struct rvce_encoder *enc)
{
int i;
if (!enc->pic.rate_ctrl.frame_rate_num)
return;
RVCE_BEGIN(0x04000009); // vui
RVCE_CS(0x00000000); //aspectRatioInfoPresentFlag
RVCE_CS(0x00000000); //aspectRatioInfo.aspectRatioIdc

View File

@@ -205,11 +205,12 @@ int rvid_get_video_param(struct pipe_screen *screen,
enum pipe_video_cap param)
{
struct r600_common_screen *rscreen = (struct r600_common_screen *)screen;
enum pipe_video_format codec = u_reduce_video_profile(profile);
if (entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
return u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_MPEG4_AVC &&
return codec == PIPE_VIDEO_FORMAT_MPEG4_AVC &&
rvce_is_fw_version_supported(rscreen);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
@@ -232,38 +233,18 @@ int rvid_get_video_param(struct pipe_screen *screen,
}
}
/* UVD 2.x limits */
if (rscreen->family < CHIP_PALM) {
enum pipe_video_format codec = u_reduce_video_profile(profile);
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
/* no support for MPEG4 */
return codec != PIPE_VIDEO_FORMAT_MPEG4 &&
/* FIXME: VC-1 simple/main profile is broken */
profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE &&
profile != PIPE_VIDEO_PROFILE_VC1_MAIN;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
/* MPEG2 only with shaders and no support for
interlacing on R6xx style UVD */
return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
rscreen->family > CHIP_RV770;
default:
break;
}
}
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
switch (u_reduce_video_profile(profile)) {
switch (codec) {
case PIPE_VIDEO_FORMAT_MPEG12:
case PIPE_VIDEO_FORMAT_MPEG4:
case PIPE_VIDEO_FORMAT_MPEG4_AVC:
return entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
if (rscreen->family < CHIP_PALM)
/* no support for MPEG4 */
return codec != PIPE_VIDEO_FORMAT_MPEG4;
return true;
case PIPE_VIDEO_FORMAT_VC1:
/* FIXME: VC-1 simple/main profile is broken */
return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED &&
entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
return true;
case PIPE_VIDEO_FORMAT_HEVC:
/* Carrizo only supports HEVC Main */
return rscreen->family >= CHIP_CARRIZO &&
@@ -280,13 +261,17 @@ int rvid_get_video_param(struct pipe_screen *screen,
case PIPE_VIDEO_CAP_PREFERED_FORMAT:
return PIPE_FORMAT_NV12;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
return false; //The hardware doesn't support interlaced HEVC.
return true;
case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
return false; //The hardware doesn't support interlaced HEVC.
return true;
if (rscreen->family < CHIP_PALM) {
/* MPEG2 only with shaders and no support for
interlacing on R6xx style UVD */
return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
rscreen->family > CHIP_RV770;
} else {
if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
return false; //The firmware doesn't support interlaced HEVC.
return true;
}
case PIPE_VIDEO_CAP_SUPPORTS_PROGRESSIVE:
return true;
case PIPE_VIDEO_CAP_MAX_LEVEL:

View File

@@ -137,6 +137,7 @@ enum radeon_family {
CHIP_ICELAND,
CHIP_CARRIZO,
CHIP_FIJI,
CHIP_STONEY,
CHIP_LAST,
};

View File

@@ -468,7 +468,8 @@ void si_upload_const_buffer(struct si_context *sctx, struct r600_resource **rbuf
u_upload_alloc(sctx->b.uploader, 0, size, const_offset,
(struct pipe_resource**)rbuffer, &tmp);
util_memcpy_cpu_to_le32(tmp, ptr, size);
if (rbuffer)
util_memcpy_cpu_to_le32(tmp, ptr, size);
}
static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint slot,
@@ -500,6 +501,11 @@ static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint s
si_upload_const_buffer(sctx,
(struct r600_resource**)&buffer, input->user_buffer,
input->buffer_size, &buffer_offset);
if (!buffer) {
/* Just unbind on failure. */
si_set_constant_buffer(ctx, shader, slot, NULL);
return;
}
va = r600_resource(buffer)->gpu_address + buffer_offset;
} else {
pipe_resource_reference(&buffer, input->buffer);

View File

@@ -170,6 +170,8 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen, void *
if (sctx->b.chip_class == CIK) {
sctx->null_const_buf.buffer = pipe_buffer_create(screen, PIPE_BIND_CONSTANT_BUFFER,
PIPE_USAGE_DEFAULT, 16);
if (!sctx->null_const_buf.buffer)
goto fail;
sctx->null_const_buf.buffer_size = sctx->null_const_buf.buffer->width0;
for (shader = 0; shader < SI_NUM_SHADERS; shader++) {
@@ -487,6 +489,8 @@ static int si_get_shader_param(struct pipe_screen* pscreen, unsigned shader, enu
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 1;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
return 0;
}

View File

@@ -3829,11 +3829,14 @@ int si_shader_binary_read(struct si_screen *sscreen, struct si_shader *shader)
{
const struct radeon_shader_binary *binary = &shader->binary;
unsigned i;
int r;
bool dump = r600_can_dump_shader(&sscreen->b,
shader->selector ? shader->selector->tokens : NULL);
si_shader_binary_read_config(sscreen, shader, 0);
si_shader_binary_upload(sscreen, shader);
r = si_shader_binary_upload(sscreen, shader);
if (r)
return r;
if (dump) {
if (!(sscreen->b.debug_flags & DBG_NO_ASM)) {
@@ -4198,8 +4201,10 @@ out:
void si_shader_destroy(struct pipe_context *ctx, struct si_shader *shader)
{
if (shader->gs_copy_shader)
if (shader->gs_copy_shader) {
si_shader_destroy(ctx, shader->gs_copy_shader);
FREE(shader->gs_copy_shader);
}
if (shader->scratch_bo)
r600_resource_reference(&shader->scratch_bo, NULL);

View File

@@ -3176,6 +3176,7 @@ si_write_harvested_raster_configs(struct si_context *sctx,
static void si_init_config(struct si_context *sctx)
{
struct si_screen *sscreen = sctx->screen;
unsigned num_rb = MIN2(sctx->screen->b.info.r600_num_backends, 16);
unsigned rb_mask = sctx->screen->b.info.si_backend_enabled_mask;
unsigned raster_config, raster_config_1;
@@ -3243,9 +3244,14 @@ static void si_init_config(struct si_context *sctx)
raster_config_1 = 0x0000002e;
break;
case CHIP_FIJI:
/* Fiji should be same as Hawaii, but that causes corruption in some cases */
raster_config = 0x16000012; /* 0x3a00161a */
raster_config_1 = 0x0000002a; /* 0x0000002e */
if (sscreen->b.info.cik_macrotile_mode_array[0] == 0x000000e8) {
/* old kernels with old tiling config */
raster_config = 0x16000012;
raster_config_1 = 0x0000002a;
} else {
raster_config = 0x3a00161a;
raster_config_1 = 0x0000002e;
}
break;
case CHIP_TONGA:
raster_config = 0x16000012;
@@ -3266,6 +3272,7 @@ static void si_init_config(struct si_context *sctx)
break;
case CHIP_KABINI:
case CHIP_MULLINS:
case CHIP_STONEY:
raster_config = 0x00000000;
raster_config_1 = 0x00000000;
break;
@@ -3341,5 +3348,8 @@ static void si_init_config(struct si_context *sctx)
si_pm4_set_reg(pm4, R_028C5C_VGT_OUT_DEALLOC_CNTL, 32);
}
if (sctx->b.family == CHIP_STONEY)
si_pm4_set_reg(pm4, R_028754_SX_PS_DOWNCONVERT, 0);
sctx->init_config = pm4;
}

View File

@@ -274,7 +274,7 @@ si_create_sampler_view_custom(struct pipe_context *ctx,
unsigned force_level);
/* si_state_shader.c */
void si_update_shaders(struct si_context *sctx);
bool si_update_shaders(struct si_context *sctx);
void si_init_shader_functions(struct si_context *sctx);
/* si_state_draw.c */

View File

@@ -760,8 +760,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
else
sctx->current_rast_prim = info->mode;
si_update_shaders(sctx);
if (!si_upload_shader_descriptors(sctx))
if (!si_update_shaders(sctx) ||
!si_upload_shader_descriptors(sctx))
return;
if (info->indexed) {
@@ -783,6 +783,10 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
u_upload_alloc(sctx->b.uploader, start_offset, count * 2,
&out_offset, &out_buffer, &ptr);
if (!out_buffer) {
pipe_resource_reference(&ib.buffer, NULL);
return;
}
util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0,
ib.offset + start_offset,
@@ -803,6 +807,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
u_upload_data(sctx->b.uploader, start_offset, count * ib.index_size,
(char*)ib.user_buffer + start_offset,
&ib.offset, &ib.buffer);
if (!ib.buffer)
return;
/* info->start will be added by the drawing code */
ib.offset -= start_offset;
}

View File

@@ -665,8 +665,16 @@ static void *si_create_shader_state(struct pipe_context *ctx,
struct si_shader_selector *sel = CALLOC_STRUCT(si_shader_selector);
int i;
if (!sel)
return NULL;
sel->type = pipe_shader_type;
sel->tokens = tgsi_dup_tokens(state->tokens);
if (!sel->tokens) {
FREE(sel);
return NULL;
}
sel->so = state->stream_output;
tgsi_scan_shader(state->tokens, &sel->info);
p_atomic_inc(&sscreen->b.num_shaders_created);
@@ -725,7 +733,12 @@ static void *si_create_shader_state(struct pipe_context *ctx,
}
if (sscreen->b.debug_flags & DBG_PRECOMPILE)
si_shader_select(ctx, sel);
if (si_shader_select(ctx, sel)) {
fprintf(stderr, "radeonsi: can't create a shader\n");
tgsi_free_tokens(sel->tokens);
FREE(sel);
return NULL;
}
return sel;
}
@@ -1031,11 +1044,23 @@ static void si_init_gs_rings(struct si_context *sctx)
assert(!sctx->gs_rings);
sctx->gs_rings = CALLOC_STRUCT(si_pm4_state);
if (!sctx->gs_rings)
return;
sctx->esgs_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_DEFAULT, esgs_ring_size);
if (!sctx->esgs_ring) {
FREE(sctx->gs_rings);
return;
}
sctx->gsvs_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_DEFAULT, gsvs_ring_size);
if (!sctx->gsvs_ring) {
pipe_resource_reference(&sctx->esgs_ring, NULL);
FREE(sctx->gs_rings);
return;
}
if (sctx->b.chip_class >= CIK) {
if (sctx->b.chip_class >= VI) {
@@ -1094,14 +1119,16 @@ static void si_update_gs_rings(struct si_context *sctx)
}
/**
* @returns 1 if \p sel has been updated to use a new scratch buffer and 0
* otherwise.
* @returns 1 if \p sel has been updated to use a new scratch buffer
* 0 if not
* < 0 if there was a failure
*/
static unsigned si_update_scratch_buffer(struct si_context *sctx,
static int si_update_scratch_buffer(struct si_context *sctx,
struct si_shader_selector *sel)
{
struct si_shader *shader;
uint64_t scratch_va = sctx->scratch_buffer->gpu_address;
int r;
if (!sel)
return 0;
@@ -1122,7 +1149,9 @@ static unsigned si_update_scratch_buffer(struct si_context *sctx,
si_shader_apply_scratch_relocs(sctx, shader, scratch_va);
/* Replace the shader bo with a new bo that has the relocs applied. */
si_shader_binary_upload(sctx->screen, shader);
r = si_shader_binary_upload(sctx->screen, shader);
if (r)
return r;
/* Update the shader state to use the new shader bo. */
si_shader_init_pm4_state(shader);
@@ -1161,7 +1190,7 @@ static unsigned si_get_max_scratch_bytes_per_wave(struct si_context *sctx)
return bytes;
}
static void si_update_spi_tmpring_size(struct si_context *sctx)
static bool si_update_spi_tmpring_size(struct si_context *sctx)
{
unsigned current_scratch_buffer_size =
si_get_current_scratch_buffer_size(sctx);
@@ -1169,6 +1198,7 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
si_get_max_scratch_bytes_per_wave(sctx);
unsigned scratch_needed_size = scratch_bytes_per_wave *
sctx->scratch_waves;
int r;
if (scratch_needed_size > 0) {
@@ -1181,6 +1211,9 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
sctx->scratch_buffer =
si_resource_create_custom(&sctx->screen->b.b,
PIPE_USAGE_DEFAULT, scratch_needed_size);
if (!sctx->scratch_buffer)
return false;
sctx->emit_scratch_reloc = true;
}
/* Update the shaders, so they are using the latest scratch. The
@@ -1188,31 +1221,57 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
* last used, so we still need to try to update them, even if
* they require scratch buffers smaller than the current size.
*/
if (si_update_scratch_buffer(sctx, sctx->ps_shader))
r = si_update_scratch_buffer(sctx, sctx->ps_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, ps, sctx->ps_shader->current->pm4);
if (si_update_scratch_buffer(sctx, sctx->gs_shader))
r = si_update_scratch_buffer(sctx, sctx->gs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, gs, sctx->gs_shader->current->pm4);
if (si_update_scratch_buffer(sctx, sctx->tcs_shader))
r = si_update_scratch_buffer(sctx, sctx->tcs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, hs, sctx->tcs_shader->current->pm4);
/* VS can be bound as LS, ES, or VS. */
if (sctx->tes_shader) {
if (si_update_scratch_buffer(sctx, sctx->vs_shader))
r = si_update_scratch_buffer(sctx, sctx->vs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, ls, sctx->vs_shader->current->pm4);
} else if (sctx->gs_shader) {
if (si_update_scratch_buffer(sctx, sctx->vs_shader))
r = si_update_scratch_buffer(sctx, sctx->vs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, es, sctx->vs_shader->current->pm4);
} else {
if (si_update_scratch_buffer(sctx, sctx->vs_shader))
r = si_update_scratch_buffer(sctx, sctx->vs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, vs, sctx->vs_shader->current->pm4);
}
/* TES can be bound as ES or VS. */
if (sctx->gs_shader) {
if (si_update_scratch_buffer(sctx, sctx->tes_shader))
r = si_update_scratch_buffer(sctx, sctx->tes_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, es, sctx->tes_shader->current->pm4);
} else {
if (si_update_scratch_buffer(sctx, sctx->tes_shader))
r = si_update_scratch_buffer(sctx, sctx->tes_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, vs, sctx->tes_shader->current->pm4);
}
}
@@ -1223,6 +1282,7 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
sctx->spi_tmpring_size = S_0286E8_WAVES(sctx->scratch_waves) |
S_0286E8_WAVESIZE(scratch_bytes_per_wave >> 10);
return true;
}
static void si_init_tess_factor_ring(struct si_context *sctx)
@@ -1230,11 +1290,20 @@ static void si_init_tess_factor_ring(struct si_context *sctx)
assert(!sctx->tf_state);
sctx->tf_state = CALLOC_STRUCT(si_pm4_state);
if (!sctx->tf_state)
return;
sctx->tf_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_DEFAULT,
32768 * sctx->screen->b.info.max_se);
if (!sctx->tf_ring) {
FREE(sctx->tf_state);
return;
}
sctx->b.clear_buffer(&sctx->b.b, sctx->tf_ring, 0,
sctx->tf_ring->width0, fui(0), false);
assert(((sctx->tf_ring->width0 / 4) & C_030938_SIZE) == 0);
if (sctx->b.chip_class >= CIK) {
@@ -1290,7 +1359,6 @@ static void si_generate_fixed_func_tcs(struct si_context *sctx)
sctx->fixed_func_tcs_shader =
ureg_create_shader_and_destroy(ureg, &sctx->b.b);
assert(sctx->fixed_func_tcs_shader);
}
static void si_update_vgt_shader_config(struct si_context *sctx)
@@ -1338,32 +1406,49 @@ static void si_update_so(struct si_context *sctx, struct si_shader_selector *sha
sctx->b.streamout.stride_in_dw = shader->so.stride;
}
void si_update_shaders(struct si_context *sctx)
bool si_update_shaders(struct si_context *sctx)
{
struct pipe_context *ctx = (struct pipe_context*)sctx;
struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
int r;
/* Update stages before GS. */
if (sctx->tes_shader) {
if (!sctx->tf_state)
if (!sctx->tf_state) {
si_init_tess_factor_ring(sctx);
if (!sctx->tf_state)
return false;
}
/* VS as LS */
si_shader_select(ctx, sctx->vs_shader);
r = si_shader_select(ctx, sctx->vs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, ls, sctx->vs_shader->current->pm4);
if (sctx->tcs_shader) {
si_shader_select(ctx, sctx->tcs_shader);
r = si_shader_select(ctx, sctx->tcs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, hs, sctx->tcs_shader->current->pm4);
} else {
if (!sctx->fixed_func_tcs_shader)
if (!sctx->fixed_func_tcs_shader) {
si_generate_fixed_func_tcs(sctx);
si_shader_select(ctx, sctx->fixed_func_tcs_shader);
if (!sctx->fixed_func_tcs_shader)
return false;
}
r = si_shader_select(ctx, sctx->fixed_func_tcs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, hs,
sctx->fixed_func_tcs_shader->current->pm4);
}
si_shader_select(ctx, sctx->tes_shader);
r = si_shader_select(ctx, sctx->tes_shader);
if (r)
return false;
if (sctx->gs_shader) {
/* TES as ES */
si_pm4_bind_state(sctx, es, sctx->tes_shader->current->pm4);
@@ -1374,24 +1459,33 @@ void si_update_shaders(struct si_context *sctx)
}
} else if (sctx->gs_shader) {
/* VS as ES */
si_shader_select(ctx, sctx->vs_shader);
r = si_shader_select(ctx, sctx->vs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, es, sctx->vs_shader->current->pm4);
} else {
/* VS as VS */
si_shader_select(ctx, sctx->vs_shader);
r = si_shader_select(ctx, sctx->vs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, vs, sctx->vs_shader->current->pm4);
si_update_so(sctx, sctx->vs_shader);
}
/* Update GS. */
if (sctx->gs_shader) {
si_shader_select(ctx, sctx->gs_shader);
r = si_shader_select(ctx, sctx->gs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, gs, sctx->gs_shader->current->pm4);
si_pm4_bind_state(sctx, vs, sctx->gs_shader->current->gs_copy_shader->pm4);
si_update_so(sctx, sctx->gs_shader);
if (!sctx->gs_rings)
if (!sctx->gs_rings) {
si_init_gs_rings(sctx);
if (!sctx->gs_rings)
return false;
}
if (sctx->emitted.named.gs_rings != sctx->gs_rings)
sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
@@ -1406,18 +1500,9 @@ void si_update_shaders(struct si_context *sctx)
si_update_vgt_shader_config(sctx);
si_shader_select(ctx, sctx->ps_shader);
if (!sctx->ps_shader->current) {
struct si_shader_selector *sel;
/* use a dummy shader if compiling the shader (variant) failed */
si_make_dummy_ps(sctx);
sel = sctx->dummy_pixel_shader;
si_shader_select(ctx, sel);
sctx->ps_shader->current = sel->current;
}
r = si_shader_select(ctx, sctx->ps_shader);
if (r)
return false;
si_pm4_bind_state(sctx, ps, sctx->ps_shader->current->pm4);
if (si_pm4_state_changed(sctx, ps) || si_pm4_state_changed(sctx, vs) ||
@@ -1428,9 +1513,14 @@ void si_update_shaders(struct si_context *sctx)
si_update_spi_map(sctx);
}
if (si_pm4_state_changed(sctx, ps) || si_pm4_state_changed(sctx, vs) ||
si_pm4_state_changed(sctx, gs)) {
si_update_spi_tmpring_size(sctx);
if (si_pm4_state_changed(sctx, ls) ||
si_pm4_state_changed(sctx, hs) ||
si_pm4_state_changed(sctx, es) ||
si_pm4_state_changed(sctx, gs) ||
si_pm4_state_changed(sctx, vs) ||
si_pm4_state_changed(sctx, ps)) {
if (!si_update_spi_tmpring_size(sctx))
return false;
}
if (sctx->ps_db_shader_control != sctx->ps_shader->current->db_shader_control) {
@@ -1445,6 +1535,7 @@ void si_update_shaders(struct si_context *sctx)
if (sctx->b.chip_class == SI)
si_mark_atom_dirty(sctx, &sctx->db_render_state);
}
return true;
}
void si_init_shader_functions(struct si_context *sctx)

View File

@@ -6732,6 +6732,9 @@
#define S_00B854_WAVES_PER_SH(x) (((x) & 0x3F) << 0) /* mask is 0x3FF on CIK */
#define G_00B854_WAVES_PER_SH(x) (((x) >> 0) & 0x3F) /* mask is 0x3FF on CIK */
#define C_00B854_WAVES_PER_SH 0xFFFFFFC0 /* mask is 0x3FF on CIK */
#define S_00B854_WAVES_PER_SH_CIK(x) (((x) & 0x3FF) << 0)
#define G_00B854_WAVES_PER_SH_CIK(x) (((x) >> 0) & 0x3FF)
#define C_00B854_WAVES_PER_SH_CIK 0xFFFFFC00
#define S_00B854_TG_PER_CU(x) (((x) & 0x0F) << 12)
#define G_00B854_TG_PER_CU(x) (((x) >> 12) & 0x0F)
#define C_00B854_TG_PER_CU 0xFFFF0FFF
@@ -8335,6 +8338,296 @@
#define V_028714_SPI_SHADER_UINT16_ABGR 0x07
#define V_028714_SPI_SHADER_SINT16_ABGR 0x08
#define V_028714_SPI_SHADER_32_ABGR 0x09
/* Stoney */
#define R_028754_SX_PS_DOWNCONVERT 0x028754
#define S_028754_MRT0(x) (((x) & 0x0F) << 0)
#define G_028754_MRT0(x) (((x) >> 0) & 0x0F)
#define C_028754_MRT0 0xFFFFFFF0
#define V_028754_SX_RT_EXPORT_NO_CONVERSION 0
#define V_028754_SX_RT_EXPORT_32_R 1
#define V_028754_SX_RT_EXPORT_32_A 2
#define V_028754_SX_RT_EXPORT_10_11_11 3
#define V_028754_SX_RT_EXPORT_2_10_10_10 4
#define V_028754_SX_RT_EXPORT_8_8_8_8 5
#define V_028754_SX_RT_EXPORT_5_6_5 6
#define V_028754_SX_RT_EXPORT_1_5_5_5 7
#define V_028754_SX_RT_EXPORT_4_4_4_4 8
#define V_028754_SX_RT_EXPORT_16_16_GR 9
#define V_028754_SX_RT_EXPORT_16_16_AR 10
#define S_028754_MRT1(x) (((x) & 0x0F) << 4)
#define G_028754_MRT1(x) (((x) >> 4) & 0x0F)
#define C_028754_MRT1 0xFFFFFF0F
#define S_028754_MRT2(x) (((x) & 0x0F) << 8)
#define G_028754_MRT2(x) (((x) >> 8) & 0x0F)
#define C_028754_MRT2 0xFFFFF0FF
#define S_028754_MRT3(x) (((x) & 0x0F) << 12)
#define G_028754_MRT3(x) (((x) >> 12) & 0x0F)
#define C_028754_MRT3 0xFFFF0FFF
#define S_028754_MRT4(x) (((x) & 0x0F) << 16)
#define G_028754_MRT4(x) (((x) >> 16) & 0x0F)
#define C_028754_MRT4 0xFFF0FFFF
#define S_028754_MRT5(x) (((x) & 0x0F) << 20)
#define G_028754_MRT5(x) (((x) >> 20) & 0x0F)
#define C_028754_MRT5 0xFF0FFFFF
#define S_028754_MRT6(x) (((x) & 0x0F) << 24)
#define G_028754_MRT6(x) (((x) >> 24) & 0x0F)
#define C_028754_MRT6 0xF0FFFFFF
#define S_028754_MRT7(x) (((x) & 0x0F) << 28)
#define G_028754_MRT7(x) (((x) >> 28) & 0x0F)
#define C_028754_MRT7 0x0FFFFFFF
#define R_028758_SX_BLEND_OPT_EPSILON 0x028758
#define S_028758_MRT0_EPSILON(x) (((x) & 0x0F) << 0)
#define G_028758_MRT0_EPSILON(x) (((x) >> 0) & 0x0F)
#define C_028758_MRT0_EPSILON 0xFFFFFFF0
#define V_028758_EXACT 0
#define V_028758_11BIT_FORMAT 1
#define V_028758_10BIT_FORMAT 3
#define V_028758_8BIT_FORMAT 7
#define V_028758_6BIT_FORMAT 11
#define V_028758_5BIT_FORMAT 13
#define V_028758_4BIT_FORMAT 15
#define S_028758_MRT1_EPSILON(x) (((x) & 0x0F) << 4)
#define G_028758_MRT1_EPSILON(x) (((x) >> 4) & 0x0F)
#define C_028758_MRT1_EPSILON 0xFFFFFF0F
#define S_028758_MRT2_EPSILON(x) (((x) & 0x0F) << 8)
#define G_028758_MRT2_EPSILON(x) (((x) >> 8) & 0x0F)
#define C_028758_MRT2_EPSILON 0xFFFFF0FF
#define S_028758_MRT3_EPSILON(x) (((x) & 0x0F) << 12)
#define G_028758_MRT3_EPSILON(x) (((x) >> 12) & 0x0F)
#define C_028758_MRT3_EPSILON 0xFFFF0FFF
#define S_028758_MRT4_EPSILON(x) (((x) & 0x0F) << 16)
#define G_028758_MRT4_EPSILON(x) (((x) >> 16) & 0x0F)
#define C_028758_MRT4_EPSILON 0xFFF0FFFF
#define S_028758_MRT5_EPSILON(x) (((x) & 0x0F) << 20)
#define G_028758_MRT5_EPSILON(x) (((x) >> 20) & 0x0F)
#define C_028758_MRT5_EPSILON 0xFF0FFFFF
#define S_028758_MRT6_EPSILON(x) (((x) & 0x0F) << 24)
#define G_028758_MRT6_EPSILON(x) (((x) >> 24) & 0x0F)
#define C_028758_MRT6_EPSILON 0xF0FFFFFF
#define S_028758_MRT7_EPSILON(x) (((x) & 0x0F) << 28)
#define G_028758_MRT7_EPSILON(x) (((x) >> 28) & 0x0F)
#define C_028758_MRT7_EPSILON 0x0FFFFFFF
#define R_02875C_SX_BLEND_OPT_CONTROL 0x02875C
#define S_02875C_MRT0_COLOR_OPT_DISABLE(x) (((x) & 0x1) << 0)
#define G_02875C_MRT0_COLOR_OPT_DISABLE(x) (((x) >> 0) & 0x1)
#define C_02875C_MRT0_COLOR_OPT_DISABLE 0xFFFFFFFE
#define S_02875C_MRT0_ALPHA_OPT_DISABLE(x) (((x) & 0x1) << 1)
#define G_02875C_MRT0_ALPHA_OPT_DISABLE(x) (((x) >> 1) & 0x1)
#define C_02875C_MRT0_ALPHA_OPT_DISABLE 0xFFFFFFFD
#define S_02875C_MRT1_COLOR_OPT_DISABLE(x) (((x) & 0x1) << 4)
#define G_02875C_MRT1_COLOR_OPT_DISABLE(x) (((x) >> 4) & 0x1)
#define C_02875C_MRT1_COLOR_OPT_DISABLE 0xFFFFFFEF
#define S_02875C_MRT1_ALPHA_OPT_DISABLE(x) (((x) & 0x1) << 5)
#define G_02875C_MRT1_ALPHA_OPT_DISABLE(x) (((x) >> 5) & 0x1)
#define C_02875C_MRT1_ALPHA_OPT_DISABLE 0xFFFFFFDF
#define S_02875C_MRT2_COLOR_OPT_DISABLE(x) (((x) & 0x1) << 8)
#define G_02875C_MRT2_COLOR_OPT_DISABLE(x) (((x) >> 8) & 0x1)
#define C_02875C_MRT2_COLOR_OPT_DISABLE 0xFFFFFEFF
#define S_02875C_MRT2_ALPHA_OPT_DISABLE(x) (((x) & 0x1) << 9)
#define G_02875C_MRT2_ALPHA_OPT_DISABLE(x) (((x) >> 9) & 0x1)
#define C_02875C_MRT2_ALPHA_OPT_DISABLE 0xFFFFFDFF
#define S_02875C_MRT3_COLOR_OPT_DISABLE(x) (((x) & 0x1) << 12)
#define G_02875C_MRT3_COLOR_OPT_DISABLE(x) (((x) >> 12) & 0x1)
#define C_02875C_MRT3_COLOR_OPT_DISABLE 0xFFFFEFFF
#define S_02875C_MRT3_ALPHA_OPT_DISABLE(x) (((x) & 0x1) << 13)
#define G_02875C_MRT3_ALPHA_OPT_DISABLE(x) (((x) >> 13) & 0x1)
#define C_02875C_MRT3_ALPHA_OPT_DISABLE 0xFFFFDFFF
#define S_02875C_MRT4_COLOR_OPT_DISABLE(x) (((x) & 0x1) << 16)
#define G_02875C_MRT4_COLOR_OPT_DISABLE(x) (((x) >> 16) & 0x1)
#define C_02875C_MRT4_COLOR_OPT_DISABLE 0xFFFEFFFF
#define S_02875C_MRT4_ALPHA_OPT_DISABLE(x) (((x) & 0x1) << 17)
#define G_02875C_MRT4_ALPHA_OPT_DISABLE(x) (((x) >> 17) & 0x1)
#define C_02875C_MRT4_ALPHA_OPT_DISABLE 0xFFFDFFFF
#define S_02875C_MRT5_COLOR_OPT_DISABLE(x) (((x) & 0x1) << 20)
#define G_02875C_MRT5_COLOR_OPT_DISABLE(x) (((x) >> 20) & 0x1)
#define C_02875C_MRT5_COLOR_OPT_DISABLE 0xFFEFFFFF
#define S_02875C_MRT5_ALPHA_OPT_DISABLE(x) (((x) & 0x1) << 21)
#define G_02875C_MRT5_ALPHA_OPT_DISABLE(x) (((x) >> 21) & 0x1)
#define C_02875C_MRT5_ALPHA_OPT_DISABLE 0xFFDFFFFF
#define S_02875C_MRT6_COLOR_OPT_DISABLE(x) (((x) & 0x1) << 24)
#define G_02875C_MRT6_COLOR_OPT_DISABLE(x) (((x) >> 24) & 0x1)
#define C_02875C_MRT6_COLOR_OPT_DISABLE 0xFEFFFFFF
#define S_02875C_MRT6_ALPHA_OPT_DISABLE(x) (((x) & 0x1) << 25)
#define G_02875C_MRT6_ALPHA_OPT_DISABLE(x) (((x) >> 25) & 0x1)
#define C_02875C_MRT6_ALPHA_OPT_DISABLE 0xFDFFFFFF
#define S_02875C_MRT7_COLOR_OPT_DISABLE(x) (((x) & 0x1) << 28)
#define G_02875C_MRT7_COLOR_OPT_DISABLE(x) (((x) >> 28) & 0x1)
#define C_02875C_MRT7_COLOR_OPT_DISABLE 0xEFFFFFFF
#define S_02875C_MRT7_ALPHA_OPT_DISABLE(x) (((x) & 0x1) << 29)
#define G_02875C_MRT7_ALPHA_OPT_DISABLE(x) (((x) >> 29) & 0x1)
#define C_02875C_MRT7_ALPHA_OPT_DISABLE 0xDFFFFFFF
#define S_02875C_PIXEN_ZERO_OPT_DISABLE(x) (((x) & 0x1) << 31)
#define G_02875C_PIXEN_ZERO_OPT_DISABLE(x) (((x) >> 31) & 0x1)
#define C_02875C_PIXEN_ZERO_OPT_DISABLE 0x7FFFFFFF
#define R_028760_SX_MRT0_BLEND_OPT 0x028760
#define S_028760_COLOR_SRC_OPT(x) (((x) & 0x07) << 0)
#define G_028760_COLOR_SRC_OPT(x) (((x) >> 0) & 0x07)
#define C_028760_COLOR_SRC_OPT 0xFFFFFFF8
#define V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_ALL 0
#define V_028760_BLEND_OPT_PRESERVE_ALL_IGNORE_NONE 1
#define V_028760_BLEND_OPT_PRESERVE_C1_IGNORE_C0 2
#define V_028760_BLEND_OPT_PRESERVE_C0_IGNORE_C1 3
#define V_028760_BLEND_OPT_PRESERVE_A1_IGNORE_A0 4
#define V_028760_BLEND_OPT_PRESERVE_A0_IGNORE_A1 5
#define V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_A0 6
#define V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_NONE 7
#define S_028760_COLOR_DST_OPT(x) (((x) & 0x07) << 4)
#define G_028760_COLOR_DST_OPT(x) (((x) >> 4) & 0x07)
#define C_028760_COLOR_DST_OPT 0xFFFFFF8F
#define S_028760_COLOR_COMB_FCN(x) (((x) & 0x07) << 8)
#define G_028760_COLOR_COMB_FCN(x) (((x) >> 8) & 0x07)
#define C_028760_COLOR_COMB_FCN 0xFFFFF8FF
#define V_028760_OPT_COMB_NONE 0
#define V_028760_OPT_COMB_ADD 1
#define V_028760_OPT_COMB_SUBTRACT 2
#define V_028760_OPT_COMB_MIN 3
#define V_028760_OPT_COMB_MAX 4
#define V_028760_OPT_COMB_REVSUBTRACT 5
#define V_028760_OPT_COMB_BLEND_DISABLED 6
#define V_028760_OPT_COMB_SAFE_ADD 7
#define S_028760_ALPHA_SRC_OPT(x) (((x) & 0x07) << 16)
#define G_028760_ALPHA_SRC_OPT(x) (((x) >> 16) & 0x07)
#define C_028760_ALPHA_SRC_OPT 0xFFF8FFFF
#define S_028760_ALPHA_DST_OPT(x) (((x) & 0x07) << 20)
#define G_028760_ALPHA_DST_OPT(x) (((x) >> 20) & 0x07)
#define C_028760_ALPHA_DST_OPT 0xFF8FFFFF
#define S_028760_ALPHA_COMB_FCN(x) (((x) & 0x07) << 24)
#define G_028760_ALPHA_COMB_FCN(x) (((x) >> 24) & 0x07)
#define C_028760_ALPHA_COMB_FCN 0xF8FFFFFF
#define R_028764_SX_MRT1_BLEND_OPT 0x028764
#define S_028764_COLOR_SRC_OPT(x) (((x) & 0x07) << 0)
#define G_028764_COLOR_SRC_OPT(x) (((x) >> 0) & 0x07)
#define C_028764_COLOR_SRC_OPT 0xFFFFFFF8
#define S_028764_COLOR_DST_OPT(x) (((x) & 0x07) << 4)
#define G_028764_COLOR_DST_OPT(x) (((x) >> 4) & 0x07)
#define C_028764_COLOR_DST_OPT 0xFFFFFF8F
#define S_028764_COLOR_COMB_FCN(x) (((x) & 0x07) << 8)
#define G_028764_COLOR_COMB_FCN(x) (((x) >> 8) & 0x07)
#define C_028764_COLOR_COMB_FCN 0xFFFFF8FF
#define S_028764_ALPHA_SRC_OPT(x) (((x) & 0x07) << 16)
#define G_028764_ALPHA_SRC_OPT(x) (((x) >> 16) & 0x07)
#define C_028764_ALPHA_SRC_OPT 0xFFF8FFFF
#define S_028764_ALPHA_DST_OPT(x) (((x) & 0x07) << 20)
#define G_028764_ALPHA_DST_OPT(x) (((x) >> 20) & 0x07)
#define C_028764_ALPHA_DST_OPT 0xFF8FFFFF
#define S_028764_ALPHA_COMB_FCN(x) (((x) & 0x07) << 24)
#define G_028764_ALPHA_COMB_FCN(x) (((x) >> 24) & 0x07)
#define C_028764_ALPHA_COMB_FCN 0xF8FFFFFF
#define R_028768_SX_MRT2_BLEND_OPT 0x028768
#define S_028768_COLOR_SRC_OPT(x) (((x) & 0x07) << 0)
#define G_028768_COLOR_SRC_OPT(x) (((x) >> 0) & 0x07)
#define C_028768_COLOR_SRC_OPT 0xFFFFFFF8
#define S_028768_COLOR_DST_OPT(x) (((x) & 0x07) << 4)
#define G_028768_COLOR_DST_OPT(x) (((x) >> 4) & 0x07)
#define C_028768_COLOR_DST_OPT 0xFFFFFF8F
#define S_028768_COLOR_COMB_FCN(x) (((x) & 0x07) << 8)
#define G_028768_COLOR_COMB_FCN(x) (((x) >> 8) & 0x07)
#define C_028768_COLOR_COMB_FCN 0xFFFFF8FF
#define S_028768_ALPHA_SRC_OPT(x) (((x) & 0x07) << 16)
#define G_028768_ALPHA_SRC_OPT(x) (((x) >> 16) & 0x07)
#define C_028768_ALPHA_SRC_OPT 0xFFF8FFFF
#define S_028768_ALPHA_DST_OPT(x) (((x) & 0x07) << 20)
#define G_028768_ALPHA_DST_OPT(x) (((x) >> 20) & 0x07)
#define C_028768_ALPHA_DST_OPT 0xFF8FFFFF
#define S_028768_ALPHA_COMB_FCN(x) (((x) & 0x07) << 24)
#define G_028768_ALPHA_COMB_FCN(x) (((x) >> 24) & 0x07)
#define C_028768_ALPHA_COMB_FCN 0xF8FFFFFF
#define R_02876C_SX_MRT3_BLEND_OPT 0x02876C
#define S_02876C_COLOR_SRC_OPT(x) (((x) & 0x07) << 0)
#define G_02876C_COLOR_SRC_OPT(x) (((x) >> 0) & 0x07)
#define C_02876C_COLOR_SRC_OPT 0xFFFFFFF8
#define S_02876C_COLOR_DST_OPT(x) (((x) & 0x07) << 4)
#define G_02876C_COLOR_DST_OPT(x) (((x) >> 4) & 0x07)
#define C_02876C_COLOR_DST_OPT 0xFFFFFF8F
#define S_02876C_COLOR_COMB_FCN(x) (((x) & 0x07) << 8)
#define G_02876C_COLOR_COMB_FCN(x) (((x) >> 8) & 0x07)
#define C_02876C_COLOR_COMB_FCN 0xFFFFF8FF
#define S_02876C_ALPHA_SRC_OPT(x) (((x) & 0x07) << 16)
#define G_02876C_ALPHA_SRC_OPT(x) (((x) >> 16) & 0x07)
#define C_02876C_ALPHA_SRC_OPT 0xFFF8FFFF
#define S_02876C_ALPHA_DST_OPT(x) (((x) & 0x07) << 20)
#define G_02876C_ALPHA_DST_OPT(x) (((x) >> 20) & 0x07)
#define C_02876C_ALPHA_DST_OPT 0xFF8FFFFF
#define S_02876C_ALPHA_COMB_FCN(x) (((x) & 0x07) << 24)
#define G_02876C_ALPHA_COMB_FCN(x) (((x) >> 24) & 0x07)
#define C_02876C_ALPHA_COMB_FCN 0xF8FFFFFF
#define R_028770_SX_MRT4_BLEND_OPT 0x028770
#define S_028770_COLOR_SRC_OPT(x) (((x) & 0x07) << 0)
#define G_028770_COLOR_SRC_OPT(x) (((x) >> 0) & 0x07)
#define C_028770_COLOR_SRC_OPT 0xFFFFFFF8
#define S_028770_COLOR_DST_OPT(x) (((x) & 0x07) << 4)
#define G_028770_COLOR_DST_OPT(x) (((x) >> 4) & 0x07)
#define C_028770_COLOR_DST_OPT 0xFFFFFF8F
#define S_028770_COLOR_COMB_FCN(x) (((x) & 0x07) << 8)
#define G_028770_COLOR_COMB_FCN(x) (((x) >> 8) & 0x07)
#define C_028770_COLOR_COMB_FCN 0xFFFFF8FF
#define S_028770_ALPHA_SRC_OPT(x) (((x) & 0x07) << 16)
#define G_028770_ALPHA_SRC_OPT(x) (((x) >> 16) & 0x07)
#define C_028770_ALPHA_SRC_OPT 0xFFF8FFFF
#define S_028770_ALPHA_DST_OPT(x) (((x) & 0x07) << 20)
#define G_028770_ALPHA_DST_OPT(x) (((x) >> 20) & 0x07)
#define C_028770_ALPHA_DST_OPT 0xFF8FFFFF
#define S_028770_ALPHA_COMB_FCN(x) (((x) & 0x07) << 24)
#define G_028770_ALPHA_COMB_FCN(x) (((x) >> 24) & 0x07)
#define C_028770_ALPHA_COMB_FCN 0xF8FFFFFF
#define R_028774_SX_MRT5_BLEND_OPT 0x028774
#define S_028774_COLOR_SRC_OPT(x) (((x) & 0x07) << 0)
#define G_028774_COLOR_SRC_OPT(x) (((x) >> 0) & 0x07)
#define C_028774_COLOR_SRC_OPT 0xFFFFFFF8
#define S_028774_COLOR_DST_OPT(x) (((x) & 0x07) << 4)
#define G_028774_COLOR_DST_OPT(x) (((x) >> 4) & 0x07)
#define C_028774_COLOR_DST_OPT 0xFFFFFF8F
#define S_028774_COLOR_COMB_FCN(x) (((x) & 0x07) << 8)
#define G_028774_COLOR_COMB_FCN(x) (((x) >> 8) & 0x07)
#define C_028774_COLOR_COMB_FCN 0xFFFFF8FF
#define S_028774_ALPHA_SRC_OPT(x) (((x) & 0x07) << 16)
#define G_028774_ALPHA_SRC_OPT(x) (((x) >> 16) & 0x07)
#define C_028774_ALPHA_SRC_OPT 0xFFF8FFFF
#define S_028774_ALPHA_DST_OPT(x) (((x) & 0x07) << 20)
#define G_028774_ALPHA_DST_OPT(x) (((x) >> 20) & 0x07)
#define C_028774_ALPHA_DST_OPT 0xFF8FFFFF
#define S_028774_ALPHA_COMB_FCN(x) (((x) & 0x07) << 24)
#define G_028774_ALPHA_COMB_FCN(x) (((x) >> 24) & 0x07)
#define C_028774_ALPHA_COMB_FCN 0xF8FFFFFF
#define R_028778_SX_MRT6_BLEND_OPT 0x028778
#define S_028778_COLOR_SRC_OPT(x) (((x) & 0x07) << 0)
#define G_028778_COLOR_SRC_OPT(x) (((x) >> 0) & 0x07)
#define C_028778_COLOR_SRC_OPT 0xFFFFFFF8
#define S_028778_COLOR_DST_OPT(x) (((x) & 0x07) << 4)
#define G_028778_COLOR_DST_OPT(x) (((x) >> 4) & 0x07)
#define C_028778_COLOR_DST_OPT 0xFFFFFF8F
#define S_028778_COLOR_COMB_FCN(x) (((x) & 0x07) << 8)
#define G_028778_COLOR_COMB_FCN(x) (((x) >> 8) & 0x07)
#define C_028778_COLOR_COMB_FCN 0xFFFFF8FF
#define S_028778_ALPHA_SRC_OPT(x) (((x) & 0x07) << 16)
#define G_028778_ALPHA_SRC_OPT(x) (((x) >> 16) & 0x07)
#define C_028778_ALPHA_SRC_OPT 0xFFF8FFFF
#define S_028778_ALPHA_DST_OPT(x) (((x) & 0x07) << 20)
#define G_028778_ALPHA_DST_OPT(x) (((x) >> 20) & 0x07)
#define C_028778_ALPHA_DST_OPT 0xFF8FFFFF
#define S_028778_ALPHA_COMB_FCN(x) (((x) & 0x07) << 24)
#define G_028778_ALPHA_COMB_FCN(x) (((x) >> 24) & 0x07)
#define C_028778_ALPHA_COMB_FCN 0xF8FFFFFF
#define R_02877C_SX_MRT7_BLEND_OPT 0x02877C
#define S_02877C_COLOR_SRC_OPT(x) (((x) & 0x07) << 0)
#define G_02877C_COLOR_SRC_OPT(x) (((x) >> 0) & 0x07)
#define C_02877C_COLOR_SRC_OPT 0xFFFFFFF8
#define S_02877C_COLOR_DST_OPT(x) (((x) & 0x07) << 4)
#define G_02877C_COLOR_DST_OPT(x) (((x) >> 4) & 0x07)
#define C_02877C_COLOR_DST_OPT 0xFFFFFF8F
#define S_02877C_COLOR_COMB_FCN(x) (((x) & 0x07) << 8)
#define G_02877C_COLOR_COMB_FCN(x) (((x) >> 8) & 0x07)
#define C_02877C_COLOR_COMB_FCN 0xFFFFF8FF
#define S_02877C_ALPHA_SRC_OPT(x) (((x) & 0x07) << 16)
#define G_02877C_ALPHA_SRC_OPT(x) (((x) >> 16) & 0x07)
#define C_02877C_ALPHA_SRC_OPT 0xFFF8FFFF
#define S_02877C_ALPHA_DST_OPT(x) (((x) & 0x07) << 20)
#define G_02877C_ALPHA_DST_OPT(x) (((x) >> 20) & 0x07)
#define C_02877C_ALPHA_DST_OPT 0xFF8FFFFF
#define S_02877C_ALPHA_COMB_FCN(x) (((x) & 0x07) << 24)
#define G_02877C_ALPHA_COMB_FCN(x) (((x) >> 24) & 0x07)
#define C_02877C_ALPHA_COMB_FCN 0xF8FFFFFF
/* */
#define R_028780_CB_BLEND0_CONTROL 0x028780
#define S_028780_COLOR_SRCBLEND(x) (((x) & 0x1F) << 0)
#define G_028780_COLOR_SRCBLEND(x) (((x) >> 0) & 0x1F)
@@ -8597,6 +8890,7 @@
#define V_028808_CB_ELIMINATE_FAST_CLEAR 0x02
#define V_028808_CB_RESOLVE 0x03
#define V_028808_CB_FMASK_DECOMPRESS 0x05
#define V_028808_CB_DCC_DECOMPRESS 0x06
#define S_028808_ROP3(x) (((x) & 0xFF) << 16)
#define G_028808_ROP3(x) (((x) >> 16) & 0xFF)
#define C_028808_ROP3 0xFF00FFFF
@@ -8675,6 +8969,11 @@
#define V_02880C_EXPORT_GREATER_THAN_Z 2
#define V_02880C_EXPORT_RESERVED 3
/* */
/* Stoney */
#define S_02880C_DUAL_QUAD_DISABLE(x) (((x) & 0x1) << 15)
#define G_02880C_DUAL_QUAD_DISABLE(x) (((x) >> 15) & 0x1)
#define C_02880C_DUAL_QUAD_DISABLE 0xFFFF7FFF
/* */
#define R_028810_PA_CL_CLIP_CNTL 0x028810
#define S_028810_UCP_ENA_0(x) (((x) & 0x1) << 0)
#define G_028810_UCP_ENA_0(x) (((x) >> 0) & 0x1)
@@ -9256,6 +9555,9 @@
#define V_028A40_GS_SCENARIO_G 0x03
#define V_028A40_GS_SCENARIO_C 0x04
#define V_028A40_SPRITE_EN 0x05
#define S_028A40_RESERVED_0(x) (((x) & 0x1) << 3)
#define G_028A40_RESERVED_0(x) (((x) >> 3) & 0x1)
#define C_028A40_RESERVED_0 0xFFFFFFF7
#define S_028A40_CUT_MODE(x) (((x) & 0x03) << 4)
#define G_028A40_CUT_MODE(x) (((x) >> 4) & 0x03)
#define C_028A40_CUT_MODE 0xFFFFFFCF
@@ -9263,12 +9565,19 @@
#define V_028A40_GS_CUT_512 0x01
#define V_028A40_GS_CUT_256 0x02
#define V_028A40_GS_CUT_128 0x03
#define S_028A40_RESERVED_1(x) (((x) & 0x1F) << 6)
#define G_028A40_RESERVED_1(x) (((x) >> 6) & 0x1F)
#define C_028A40_RESERVED_1 0xFFFFF83F
#define S_028A40_GS_C_PACK_EN(x) (((x) & 0x1) << 11)
#define G_028A40_GS_C_PACK_EN(x) (((x) >> 11) & 0x1)
#define C_028A40_GS_C_PACK_EN 0xFFFFF7FF
#define S_028A40_RESERVED_2(x) (((x) & 0x1) << 12)
#define G_028A40_RESERVED_2(x) (((x) >> 12) & 0x1)
#define C_028A40_RESERVED_2 0xFFFFEFFF
#define S_028A40_ES_PASSTHRU(x) (((x) & 0x1) << 13)
#define G_028A40_ES_PASSTHRU(x) (((x) >> 13) & 0x1)
#define C_028A40_ES_PASSTHRU 0xFFFFDFFF
/* SI-CIK */
#define S_028A40_COMPUTE_MODE(x) (((x) & 0x1) << 14)
#define G_028A40_COMPUTE_MODE(x) (((x) >> 14) & 0x1)
#define C_028A40_COMPUTE_MODE 0xFFFFBFFF
@@ -9278,6 +9587,7 @@
#define S_028A40_ELEMENT_INFO_EN(x) (((x) & 0x1) << 16)
#define G_028A40_ELEMENT_INFO_EN(x) (((x) >> 16) & 0x1)
#define C_028A40_ELEMENT_INFO_EN 0xFFFEFFFF
/* */
#define S_028A40_PARTIAL_THD_AT_EOI(x) (((x) & 0x1) << 17)
#define G_028A40_PARTIAL_THD_AT_EOI(x) (((x) >> 17) & 0x1)
#define C_028A40_PARTIAL_THD_AT_EOI 0xFFFDFFFF
@@ -9463,6 +9773,9 @@
#define C_028A7C_RDREQ_POLICY 0xFFFFFF3F
#define V_028A7C_VGT_POLICY_LRU 0x00
#define V_028A7C_VGT_POLICY_STREAM 0x01
#define S_028A7C_RDREQ_POLICY_VI(x) (((x) & 0x1) << 6)
#define G_028A7C_RDREQ_POLICY_VI(x) (((x) >> 6) & 0x1)
#define C_028A7C_RDREQ_POLICY_VI 0xFFFFFFBF
#define S_028A7C_ATC(x) (((x) & 0x1) << 8)
#define G_028A7C_ATC(x) (((x) >> 8) & 0x1)
#define C_028A7C_ATC 0xFFFFFEFF
@@ -9839,6 +10152,9 @@
#define V_028B6C_VGT_POLICY_BYPASS 0x02
/* */
/* VI */
#define S_028B6C_RDREQ_POLICY_VI(x) (((x) & 0x1) << 15)
#define G_028B6C_RDREQ_POLICY_VI(x) (((x) >> 15) & 0x1)
#define C_028B6C_RDREQ_POLICY_VI 0xFFFF7FFF
#define S_028B6C_DISTRIBUTION_MODE(x) (((x) & 0x03) << 17)
#define G_028B6C_DISTRIBUTION_MODE(x) (((x) >> 17) & 0x03)
#define C_028B6C_DISTRIBUTION_MODE 0xFFF9FFFF
@@ -10441,6 +10757,12 @@
#define S_028C3C_AA_MASK_X1Y1(x) (((x) & 0xFFFF) << 16)
#define G_028C3C_AA_MASK_X1Y1(x) (((x) >> 16) & 0xFFFF)
#define C_028C3C_AA_MASK_X1Y1 0x0000FFFF
/* Stoney */
#define R_028C40_PA_SC_SHADER_CONTROL 0x028C40
#define S_028C40_REALIGN_DQUADS_AFTER_N_WAVES(x) (((x) & 0x03) << 0)
#define G_028C40_REALIGN_DQUADS_AFTER_N_WAVES(x) (((x) >> 0) & 0x03)
#define C_028C40_REALIGN_DQUADS_AFTER_N_WAVES 0xFFFFFFFC
/* */
#define R_028C58_VGT_VERTEX_REUSE_BLOCK_CNTL 0x028C58
#define S_028C58_VTX_REUSE_DEPTH(x) (((x) & 0xFF) << 0)
#define G_028C58_VTX_REUSE_DEPTH(x) (((x) >> 0) & 0xFF)

View File

@@ -383,6 +383,8 @@ static int svga_get_shader_param(struct pipe_screen *screen, unsigned shader, en
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
/* If we get here, we failed to handle a cap above */
debug_printf("Unexpected fragment shader query %u\n", param);
@@ -441,6 +443,8 @@ static int svga_get_shader_param(struct pipe_screen *screen, unsigned shader, en
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
/* If we get here, we failed to handle a cap above */
debug_printf("Unexpected vertex shader query %u\n", param);

View File

@@ -36,6 +36,9 @@
static bool dump_stats = false;
static void
vc4_bo_cache_free_all(struct vc4_bo_cache *cache);
static void
vc4_bo_dump_stats(struct vc4_screen *screen)
{
@@ -136,6 +139,8 @@ vc4_bo_alloc(struct vc4_screen *screen, uint32_t size, const char *name)
bo->name = name;
bo->private = true;
bool cleared_and_retried = false;
retry:
if (!using_vc4_simulator) {
struct drm_vc4_create_bo create;
memset(&create, 0, sizeof(create));
@@ -157,8 +162,15 @@ vc4_bo_alloc(struct vc4_screen *screen, uint32_t size, const char *name)
assert(create.size >= size);
}
if (ret != 0) {
fprintf(stderr, "create ioctl failure\n");
abort();
if (!list_empty(&screen->bo_cache.time_list) &&
!cleared_and_retried) {
cleared_and_retried = true;
vc4_bo_cache_free_all(&screen->bo_cache);
goto retry;
}
free(bo);
return NULL;
}
screen->bo_count++;
@@ -248,6 +260,18 @@ free_stale_bos(struct vc4_screen *screen, time_t time)
}
}
static void
vc4_bo_cache_free_all(struct vc4_bo_cache *cache)
{
pipe_mutex_lock(cache->lock);
list_for_each_entry_safe(struct vc4_bo, bo, &cache->time_list,
time_list) {
vc4_bo_remove_from_cache(cache, bo);
vc4_bo_free(bo);
}
pipe_mutex_unlock(cache->lock);
}
void
vc4_bo_last_unreference_locked_timed(struct vc4_bo *bo, time_t time)
{
@@ -600,11 +624,7 @@ vc4_bufmgr_destroy(struct pipe_screen *pscreen)
struct vc4_screen *screen = vc4_screen(pscreen);
struct vc4_bo_cache *cache = &screen->bo_cache;
list_for_each_entry_safe(struct vc4_bo, bo, &cache->time_list,
time_list) {
vc4_bo_remove_from_cache(cache, bo);
vc4_bo_free(bo);
}
vc4_bo_cache_free_all(cache);
if (dump_stats) {
fprintf(stderr, "BO stats after screen destroy:\n");

View File

@@ -143,6 +143,8 @@ qir_opt_algebraic(struct vc4_compile *c)
case QOP_SEL_X_Y_ZC:
case QOP_SEL_X_Y_NS:
case QOP_SEL_X_Y_NC:
case QOP_SEL_X_Y_CS:
case QOP_SEL_X_Y_CC:
if (is_zero(c, inst->src[1])) {
/* Replace references to a 0 uniform value
* with the SEL_X_0 equivalent.

View File

@@ -1055,6 +1055,10 @@ ntq_emit_alu(struct vc4_compile *c, nir_alu_instr *instr)
qir_SF(c, qir_SUB(c, src[0], src[1]));
*dest = qir_SEL_X_0_NC(c, qir_uniform_ui(c, ~0));
break;
case nir_op_uge:
qir_SF(c, qir_SUB(c, src[0], src[1]));
*dest = qir_SEL_X_0_CC(c, qir_uniform_ui(c, ~0));
break;
case nir_op_ilt:
qir_SF(c, qir_SUB(c, src[0], src[1]));
*dest = qir_SEL_X_0_NS(c, qir_uniform_ui(c, ~0));

View File

@@ -62,10 +62,14 @@ static const struct qir_op_info qir_op_info[] = {
[QOP_SEL_X_0_NC] = { "fsel_x_0_nc", 1, 1, false, true },
[QOP_SEL_X_0_ZS] = { "fsel_x_0_zs", 1, 1, false, true },
[QOP_SEL_X_0_ZC] = { "fsel_x_0_zc", 1, 1, false, true },
[QOP_SEL_X_0_CS] = { "fsel_x_0_cs", 1, 1, false, true },
[QOP_SEL_X_0_CC] = { "fsel_x_0_cc", 1, 1, false, true },
[QOP_SEL_X_Y_NS] = { "fsel_x_y_ns", 1, 2, false, true },
[QOP_SEL_X_Y_NC] = { "fsel_x_y_nc", 1, 2, false, true },
[QOP_SEL_X_Y_ZS] = { "fsel_x_y_zs", 1, 2, false, true },
[QOP_SEL_X_Y_ZC] = { "fsel_x_y_zc", 1, 2, false, true },
[QOP_SEL_X_Y_CS] = { "fsel_x_y_cs", 1, 2, false, true },
[QOP_SEL_X_Y_CC] = { "fsel_x_y_cc", 1, 2, false, true },
[QOP_RCP] = { "rcp", 1, 1, false, true },
[QOP_RSQ] = { "rsq", 1, 1, false, true },
@@ -193,10 +197,14 @@ qir_depends_on_flags(struct qinst *inst)
case QOP_SEL_X_0_NC:
case QOP_SEL_X_0_ZS:
case QOP_SEL_X_0_ZC:
case QOP_SEL_X_0_CS:
case QOP_SEL_X_0_CC:
case QOP_SEL_X_Y_NS:
case QOP_SEL_X_Y_NC:
case QOP_SEL_X_Y_ZS:
case QOP_SEL_X_Y_ZC:
case QOP_SEL_X_Y_CS:
case QOP_SEL_X_Y_CC:
return true;
default:
return false;

View File

@@ -91,11 +91,15 @@ enum qop {
QOP_SEL_X_0_ZC,
QOP_SEL_X_0_NS,
QOP_SEL_X_0_NC,
QOP_SEL_X_0_CS,
QOP_SEL_X_0_CC,
/* Selects the src[0] if the ns flag bit is set, otherwise src[1]. */
QOP_SEL_X_Y_ZS,
QOP_SEL_X_Y_ZC,
QOP_SEL_X_Y_NS,
QOP_SEL_X_Y_NC,
QOP_SEL_X_Y_CS,
QOP_SEL_X_Y_CC,
QOP_FTOI,
QOP_ITOF,
@@ -570,10 +574,14 @@ QIR_ALU1(SEL_X_0_ZS)
QIR_ALU1(SEL_X_0_ZC)
QIR_ALU1(SEL_X_0_NS)
QIR_ALU1(SEL_X_0_NC)
QIR_ALU1(SEL_X_0_CS)
QIR_ALU1(SEL_X_0_CC)
QIR_ALU2(SEL_X_Y_ZS)
QIR_ALU2(SEL_X_Y_ZC)
QIR_ALU2(SEL_X_Y_NS)
QIR_ALU2(SEL_X_Y_NC)
QIR_ALU2(SEL_X_Y_CS)
QIR_ALU2(SEL_X_Y_CC)
QIR_ALU2(FMIN)
QIR_ALU2(FMAX)
QIR_ALU2(FMINABS)

View File

@@ -271,6 +271,8 @@ vc4_generate_code(struct vc4_context *vc4, struct vc4_compile *c)
case QOP_SEL_X_0_ZC:
case QOP_SEL_X_0_NS:
case QOP_SEL_X_0_NC:
case QOP_SEL_X_0_CS:
case QOP_SEL_X_0_CC:
queue(c, qpu_a_MOV(dst, src[0]));
set_last_cond_add(c, qinst->op - QOP_SEL_X_0_ZS +
QPU_COND_ZS);
@@ -284,6 +286,8 @@ vc4_generate_code(struct vc4_context *vc4, struct vc4_compile *c)
case QOP_SEL_X_Y_ZC:
case QOP_SEL_X_Y_NS:
case QOP_SEL_X_Y_NC:
case QOP_SEL_X_Y_CS:
case QOP_SEL_X_Y_CC:
queue(c, qpu_a_MOV(dst, src[0]));
set_last_cond_add(c, qinst->op - QOP_SEL_X_Y_ZS +
QPU_COND_ZS);

View File

@@ -35,11 +35,12 @@
static bool miptree_debug = false;
static void
static bool
vc4_resource_bo_alloc(struct vc4_resource *rsc)
{
struct pipe_resource *prsc = &rsc->base.b;
struct pipe_screen *pscreen = prsc->screen;
struct vc4_bo *bo;
if (miptree_debug) {
fprintf(stderr, "alloc %p: size %d + offset %d -> %d\n",
@@ -51,12 +52,18 @@ vc4_resource_bo_alloc(struct vc4_resource *rsc)
rsc->cube_map_stride * (prsc->array_size - 1));
}
vc4_bo_unreference(&rsc->bo);
rsc->bo = vc4_bo_alloc(vc4_screen(pscreen),
rsc->slices[0].offset +
rsc->slices[0].size +
rsc->cube_map_stride * (prsc->array_size - 1),
"resource");
bo = vc4_bo_alloc(vc4_screen(pscreen),
rsc->slices[0].offset +
rsc->slices[0].size +
rsc->cube_map_stride * (prsc->array_size - 1),
"resource");
if (bo) {
vc4_bo_unreference(&rsc->bo);
rsc->bo = bo;
return true;
} else {
return false;
}
}
static void
@@ -101,21 +108,27 @@ vc4_resource_transfer_map(struct pipe_context *pctx,
char *buf;
if (usage & PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE) {
vc4_resource_bo_alloc(rsc);
if (vc4_resource_bo_alloc(rsc)) {
/* If it might be bound as one of our vertex buffers, make
* sure we re-emit vertex buffer state.
*/
if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
vc4->dirty |= VC4_DIRTY_VTXBUF;
/* If it might be bound as one of our vertex buffers,
* make sure we re-emit vertex buffer state.
*/
if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
vc4->dirty |= VC4_DIRTY_VTXBUF;
} else {
/* If we failed to reallocate, flush everything so
* that we don't violate any syncing requirements.
*/
vc4_flush(pctx);
}
} else if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
if (vc4_cl_references_bo(pctx, rsc->bo)) {
if ((usage & PIPE_TRANSFER_DISCARD_RANGE) &&
prsc->last_level == 0 &&
prsc->width0 == box->width &&
prsc->height0 == box->height &&
prsc->depth0 == box->depth) {
vc4_resource_bo_alloc(rsc);
prsc->depth0 == box->depth &&
vc4_resource_bo_alloc(rsc)) {
if (prsc->bind & PIPE_BIND_VERTEX_BUFFER)
vc4->dirty |= VC4_DIRTY_VTXBUF;
} else {
@@ -389,8 +402,7 @@ vc4_resource_create(struct pipe_screen *pscreen,
rsc->vc4_format = get_resource_texture_format(prsc);
vc4_setup_slices(rsc);
vc4_resource_bo_alloc(rsc);
if (!rsc->bo)
if (!vc4_resource_bo_alloc(rsc))
goto fail;
return prsc;

View File

@@ -334,6 +334,8 @@ vc4_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
return VC4_MAX_TEXTURE_SAMPLERS;
case PIPE_SHADER_CAP_PREFERRED_IR:
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
fprintf(stderr, "unknown shader param %d\n", param);
return 0;

View File

@@ -581,6 +581,10 @@ vc4_create_sampler_view(struct pipe_context *pctx, struct pipe_resource *prsc,
tmpl.last_level = cso->u.tex.last_level - cso->u.tex.first_level;
prsc = vc4_resource_create(pctx->screen, &tmpl);
if (!prsc) {
free(so);
return NULL;
}
rsc = vc4_resource(prsc);
clone = vc4_resource(prsc);
clone->shadow_parent = &shadow_parent->base.b;

View File

@@ -674,7 +674,8 @@ enum pipe_shader_cap
PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED, /* all rounding modes */
PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED,
PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED,
PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE
PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE,
PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT,
};
/**

View File

@@ -188,10 +188,10 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable,
* may occur as the stvis->color_format.
*/
switch(format) {
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_BGRA8888_UNORM:
depth = 32;
break;
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_BGRX8888_UNORM:
depth = 24;
break;
case PIPE_FORMAT_B5G6R5_UNORM:
@@ -261,13 +261,13 @@ dri_image_drawable_get_buffers(struct dri_drawable *drawable,
case PIPE_FORMAT_B5G6R5_UNORM:
image_format = __DRI_IMAGE_FORMAT_RGB565;
break;
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_BGRX8888_UNORM:
image_format = __DRI_IMAGE_FORMAT_XRGB8888;
break;
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_BGRA8888_UNORM:
image_format = __DRI_IMAGE_FORMAT_ARGB8888;
break;
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_RGBA8888_UNORM:
image_format = __DRI_IMAGE_FORMAT_ABGR8888;
break;
default:
@@ -314,10 +314,10 @@ dri2_allocate_buffer(__DRIscreen *sPriv,
switch (format) {
case 32:
pf = PIPE_FORMAT_B8G8R8A8_UNORM;
pf = PIPE_FORMAT_BGRA8888_UNORM;
break;
case 24:
pf = PIPE_FORMAT_B8G8R8X8_UNORM;
pf = PIPE_FORMAT_BGRX8888_UNORM;
break;
case 16:
pf = PIPE_FORMAT_Z16_UNORM;
@@ -724,13 +724,13 @@ dri2_create_image_from_winsys(__DRIscreen *_screen,
pf = PIPE_FORMAT_B5G6R5_UNORM;
break;
case __DRI_IMAGE_FORMAT_XRGB8888:
pf = PIPE_FORMAT_B8G8R8X8_UNORM;
pf = PIPE_FORMAT_BGRX8888_UNORM;
break;
case __DRI_IMAGE_FORMAT_ARGB8888:
pf = PIPE_FORMAT_B8G8R8A8_UNORM;
pf = PIPE_FORMAT_BGRA8888_UNORM;
break;
case __DRI_IMAGE_FORMAT_ABGR8888:
pf = PIPE_FORMAT_R8G8B8A8_UNORM;
pf = PIPE_FORMAT_RGBA8888_UNORM;
break;
default:
pf = PIPE_FORMAT_NONE;
@@ -845,13 +845,13 @@ dri2_create_image(__DRIscreen *_screen,
pf = PIPE_FORMAT_B5G6R5_UNORM;
break;
case __DRI_IMAGE_FORMAT_XRGB8888:
pf = PIPE_FORMAT_B8G8R8X8_UNORM;
pf = PIPE_FORMAT_BGRX8888_UNORM;
break;
case __DRI_IMAGE_FORMAT_ARGB8888:
pf = PIPE_FORMAT_B8G8R8A8_UNORM;
pf = PIPE_FORMAT_BGRA8888_UNORM;
break;
case __DRI_IMAGE_FORMAT_ABGR8888:
pf = PIPE_FORMAT_R8G8B8A8_UNORM;
pf = PIPE_FORMAT_RGBA8888_UNORM;
break;
default:
pf = PIPE_FORMAT_NONE;
@@ -1293,6 +1293,7 @@ dri2_load_opencl_interop(struct dri_screen *screen)
}
struct dri2_fence {
struct dri_screen *driscreen;
struct pipe_fence_handle *pipe_fence;
void *cl_event;
};
@@ -1313,6 +1314,7 @@ dri2_create_fence(__DRIcontext *_ctx)
return NULL;
}
fence->driscreen = dri_screen(_ctx->driScreenPriv);
return fence;
}
@@ -1336,6 +1338,7 @@ dri2_get_fence_from_cl_event(__DRIscreen *_screen, intptr_t cl_event)
return NULL;
}
fence->driscreen = driscreen;
return fence;
}
@@ -1360,9 +1363,9 @@ static GLboolean
dri2_client_wait_sync(__DRIcontext *_ctx, void *_fence, unsigned flags,
uint64_t timeout)
{
struct dri_screen *driscreen = dri_screen(_ctx->driScreenPriv);
struct pipe_screen *screen = driscreen->base.screen;
struct dri2_fence *fence = (struct dri2_fence*)_fence;
struct dri_screen *driscreen = fence->driscreen;
struct pipe_screen *screen = driscreen->base.screen;
/* No need to flush. The context was flushed when the fence was created. */

View File

@@ -231,11 +231,11 @@ dri_set_tex_buffer2(__DRIcontext *pDRICtx, GLint target,
if (format == __DRI_TEXTURE_FORMAT_RGB) {
/* only need to cover the formats recognized by dri_fill_st_visual */
switch (internal_format) {
case PIPE_FORMAT_B8G8R8A8_UNORM:
internal_format = PIPE_FORMAT_B8G8R8X8_UNORM;
case PIPE_FORMAT_BGRA8888_UNORM:
internal_format = PIPE_FORMAT_BGRX8888_UNORM;
break;
case PIPE_FORMAT_A8R8G8B8_UNORM:
internal_format = PIPE_FORMAT_X8R8G8B8_UNORM;
case PIPE_FORMAT_ARGB8888_UNORM:
internal_format = PIPE_FORMAT_XRGB8888_UNORM;
break;
default:
break;

View File

@@ -753,10 +753,14 @@ static void slice_header(vid_dec_PrivateType *priv, struct vl_rbsp *rbsp,
priv->codec_data.h264.delta_pic_order_cnt_bottom = delta_pic_order_cnt_bottom;
}
priv->picture.h264.field_order_cnt[0] = pic_order_cnt_msb + pic_order_cnt_lsb;
priv->picture.h264.field_order_cnt[1] = pic_order_cnt_msb + pic_order_cnt_lsb;
if (!priv->picture.h264.field_pic_flag)
priv->picture.h264.field_order_cnt[1] += priv->codec_data.h264.delta_pic_order_cnt_bottom;
if (!priv->picture.h264.field_pic_flag) {
priv->picture.h264.field_order_cnt[0] = pic_order_cnt_msb + pic_order_cnt_lsb;
priv->picture.h264.field_order_cnt[1] = priv->picture.h264.field_order_cnt [0] +
priv->codec_data.h264.delta_pic_order_cnt_bottom;
} else if (!priv->picture.h264.bottom_field_flag)
priv->picture.h264.field_order_cnt[0] = pic_order_cnt_msb + pic_order_cnt_lsb;
else
priv->picture.h264.field_order_cnt[1] = pic_order_cnt_msb + pic_order_cnt_lsb;
} else if (sps->pic_order_cnt_type == 1) {
unsigned MaxFrameNum = 1 << (sps->log2_max_frame_num_minus4 + 4);

View File

@@ -73,6 +73,9 @@ vlVaBufferSetNumElements(VADriverContextP ctx, VABufferID buf_id,
return VA_STATUS_ERROR_INVALID_CONTEXT;
buf = handle_table_get(VL_VA_DRIVER(ctx)->htab, buf_id);
if (!buf)
return VA_STATUS_ERROR_INVALID_BUFFER;
buf->data = REALLOC(buf->data, buf->size * buf->num_elements,
buf->size * num_elements);
buf->num_elements = num_elements;
@@ -91,6 +94,9 @@ vlVaMapBuffer(VADriverContextP ctx, VABufferID buf_id, void **pbuff)
if (!ctx)
return VA_STATUS_ERROR_INVALID_CONTEXT;
if (!pbuff)
return VA_STATUS_ERROR_INVALID_PARAMETER;
buf = handle_table_get(VL_VA_DRIVER(ctx)->htab, buf_id);
if (!buf)
return VA_STATUS_ERROR_INVALID_BUFFER;

View File

@@ -116,7 +116,7 @@ vlVaCreateImage(VADriverContextP ctx, VAImageFormat *format, int width, int heig
img->width = width;
img->height = height;
w = align(width, 2);
h = align(width, 2);
h = align(height, 2);
switch (format->fourcc) {
case VA_FOURCC('N','V','1','2'):
@@ -332,13 +332,20 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, VAImageID image,
if (format == PIPE_FORMAT_NONE)
return VA_STATUS_ERROR_OPERATION_FAILED;
if (surf->buffer == NULL || format != surf->buffer->buffer_format) {
if (surf->buffer)
surf->buffer->destroy(surf->buffer);
if (format != surf->buffer->buffer_format) {
struct pipe_video_buffer *tmp_buf;
enum pipe_format old_surf_format = surf->templat.buffer_format;
surf->templat.buffer_format = format;
surf->buffer = drv->pipe->create_video_buffer(drv->pipe, &surf->templat);
if (!surf->buffer)
return VA_STATUS_ERROR_ALLOCATION_FAILED;
tmp_buf = drv->pipe->create_video_buffer(drv->pipe, &surf->templat);
if (!tmp_buf) {
surf->templat.buffer_format = old_surf_format;
return VA_STATUS_ERROR_ALLOCATION_FAILED;
}
surf->buffer->destroy(surf->buffer);
surf->buffer = tmp_buf;
}
views = surf->buffer->get_sampler_view_planes(surf->buffer);

View File

@@ -58,7 +58,7 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID context_id, VASurfaceID rende
return VA_STATUS_ERROR_INVALID_SURFACE;
context->target = surf->buffer;
context->decoder->begin_frame(context->decoder, context->target, NULL);
context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);
return VA_STATUS_SUCCESS;
}
@@ -500,8 +500,10 @@ handleVASliceDataBufferType(vlVaContext *context, vlVaBuffer *buf)
bufHasStartcode(buf, 0x0000010b, 32))
break;
if (context->decoder->profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED) {
buffers[num_buffers] = (void *const)&start_code_vc1;
sizes[num_buffers++] = sizeof(start_code_vc1);
}
break;
case PIPE_VIDEO_FORMAT_MPEG4:
if (bufHasStartcode(buf, 0x000001, 24))
@@ -517,7 +519,7 @@ handleVASliceDataBufferType(vlVaContext *context, vlVaBuffer *buf)
buffers[num_buffers] = buf->data;
sizes[num_buffers] = buf->size;
++num_buffers;
context->decoder->decode_bitstream(context->decoder, context->target, NULL,
context->decoder->decode_bitstream(context->decoder, context->target, &context->desc.base,
num_buffers, (const void * const*)buffers, sizes);
}

View File

@@ -35,7 +35,8 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
-lclangEdit \
-lclangLex \
-lclangBasic \
$(LLVM_LIBS)
$(LLVM_LIBS) \
$(PTHREAD_LIBS)
nodist_EXTRA_lib@OPENCL_LIBNAME@_la_SOURCES = dummy.cpp
lib@OPENCL_LIBNAME@_la_SOURCES =

View File

@@ -14,3 +14,340 @@ EXPORTS
OSMesaGetProcAddress
OSMesaColorClamp
OSMesaPostprocess
glAccum
glAlphaFunc
glAreTexturesResident
glArrayElement
glBegin
glBindTexture
glBitmap
glBlendFunc
glCallList
glCallLists
glClear
glClearAccum
glClearColor
glClearDepth
glClearIndex
glClearStencil
glClipPlane
glColor3b
glColor3bv
glColor3d
glColor3dv
glColor3f
glColor3fv
glColor3i
glColor3iv
glColor3s
glColor3sv
glColor3ub
glColor3ubv
glColor3ui
glColor3uiv
glColor3us
glColor3usv
glColor4b
glColor4bv
glColor4d
glColor4dv
glColor4f
glColor4fv
glColor4i
glColor4iv
glColor4s
glColor4sv
glColor4ub
glColor4ubv
glColor4ui
glColor4uiv
glColor4us
glColor4usv
glColorMask
glColorMaterial
glColorPointer
glCopyPixels
glCopyTexImage1D
glCopyTexImage2D
glCopyTexSubImage1D
glCopyTexSubImage2D
glCullFace
; glDebugEntry
glDeleteLists
glDeleteTextures
glDepthFunc
glDepthMask
glDepthRange
glDisable
glDisableClientState
glDrawArrays
glDrawBuffer
glDrawElements
glDrawPixels
glEdgeFlag
glEdgeFlagPointer
glEdgeFlagv
glEnable
glEnableClientState
glEnd
glEndList
glEvalCoord1d
glEvalCoord1dv
glEvalCoord1f
glEvalCoord1fv
glEvalCoord2d
glEvalCoord2dv
glEvalCoord2f
glEvalCoord2fv
glEvalMesh1
glEvalMesh2
glEvalPoint1
glEvalPoint2
glFeedbackBuffer
glFinish
glFlush
glFogf
glFogfv
glFogi
glFogiv
glFrontFace
glFrustum
glGenLists
glGenTextures
glGetBooleanv
glGetClipPlane
glGetDoublev
glGetError
glGetFloatv
glGetIntegerv
glGetLightfv
glGetLightiv
glGetMapdv
glGetMapfv
glGetMapiv
glGetMaterialfv
glGetMaterialiv
glGetPixelMapfv
glGetPixelMapuiv
glGetPixelMapusv
glGetPointerv
glGetPolygonStipple
glGetString
glGetTexEnvfv
glGetTexEnviv
glGetTexGendv
glGetTexGenfv
glGetTexGeniv
glGetTexImage
glGetTexLevelParameterfv
glGetTexLevelParameteriv
glGetTexParameterfv
glGetTexParameteriv
glHint
glIndexMask
glIndexPointer
glIndexd
glIndexdv
glIndexf
glIndexfv
glIndexi
glIndexiv
glIndexs
glIndexsv
glIndexub
glIndexubv
glInitNames
glInterleavedArrays
glIsEnabled
glIsList
glIsTexture
glLightModelf
glLightModelfv
glLightModeli
glLightModeliv
glLightf
glLightfv
glLighti
glLightiv
glLineStipple
glLineWidth
glListBase
glLoadIdentity
glLoadMatrixd
glLoadMatrixf
glLoadName
glLogicOp
glMap1d
glMap1f
glMap2d
glMap2f
glMapGrid1d
glMapGrid1f
glMapGrid2d
glMapGrid2f
glMaterialf
glMaterialfv
glMateriali
glMaterialiv
glMatrixMode
glMultMatrixd
glMultMatrixf
glNewList
glNormal3b
glNormal3bv
glNormal3d
glNormal3dv
glNormal3f
glNormal3fv
glNormal3i
glNormal3iv
glNormal3s
glNormal3sv
glNormalPointer
glOrtho
glPassThrough
glPixelMapfv
glPixelMapuiv
glPixelMapusv
glPixelStoref
glPixelStorei
glPixelTransferf
glPixelTransferi
glPixelZoom
glPointSize
glPolygonMode
glPolygonOffset
glPolygonStipple
glPopAttrib
glPopClientAttrib
glPopMatrix
glPopName
glPrioritizeTextures
glPushAttrib
glPushClientAttrib
glPushMatrix
glPushName
glRasterPos2d
glRasterPos2dv
glRasterPos2f
glRasterPos2fv
glRasterPos2i
glRasterPos2iv
glRasterPos2s
glRasterPos2sv
glRasterPos3d
glRasterPos3dv
glRasterPos3f
glRasterPos3fv
glRasterPos3i
glRasterPos3iv
glRasterPos3s
glRasterPos3sv
glRasterPos4d
glRasterPos4dv
glRasterPos4f
glRasterPos4fv
glRasterPos4i
glRasterPos4iv
glRasterPos4s
glRasterPos4sv
glReadBuffer
glReadPixels
glRectd
glRectdv
glRectf
glRectfv
glRecti
glRectiv
glRects
glRectsv
glRenderMode
glRotated
glRotatef
glScaled
glScalef
glScissor
glSelectBuffer
glShadeModel
glStencilFunc
glStencilMask
glStencilOp
glTexCoord1d
glTexCoord1dv
glTexCoord1f
glTexCoord1fv
glTexCoord1i
glTexCoord1iv
glTexCoord1s
glTexCoord1sv
glTexCoord2d
glTexCoord2dv
glTexCoord2f
glTexCoord2fv
glTexCoord2i
glTexCoord2iv
glTexCoord2s
glTexCoord2sv
glTexCoord3d
glTexCoord3dv
glTexCoord3f
glTexCoord3fv
glTexCoord3i
glTexCoord3iv
glTexCoord3s
glTexCoord3sv
glTexCoord4d
glTexCoord4dv
glTexCoord4f
glTexCoord4fv
glTexCoord4i
glTexCoord4iv
glTexCoord4s
glTexCoord4sv
glTexCoordPointer
glTexEnvf
glTexEnvfv
glTexEnvi
glTexEnviv
glTexGend
glTexGendv
glTexGenf
glTexGenfv
glTexGeni
glTexGeniv
glTexImage1D
glTexImage2D
glTexParameterf
glTexParameterfv
glTexParameteri
glTexParameteriv
glTexSubImage1D
glTexSubImage2D
glTranslated
glTranslatef
glVertex2d
glVertex2dv
glVertex2f
glVertex2fv
glVertex2i
glVertex2iv
glVertex2s
glVertex2sv
glVertex3d
glVertex3dv
glVertex3f
glVertex3fv
glVertex3i
glVertex3iv
glVertex3s
glVertex3sv
glVertex4d
glVertex4dv
glVertex4f
glVertex4fv
glVertex4i
glVertex4iv
glVertex4s
glVertex4sv
glVertexPointer
glViewport

View File

@@ -11,3 +11,340 @@ EXPORTS
OSMesaGetProcAddress = OSMesaGetProcAddress@4
OSMesaColorClamp = OSMesaColorClamp@4
OSMesaPostprocess = OSMesaPostprocess@12
glAccum = glAccum@8
glAlphaFunc = glAlphaFunc@8
glAreTexturesResident = glAreTexturesResident@12
glArrayElement = glArrayElement@4
glBegin = glBegin@4
glBindTexture = glBindTexture@8
glBitmap = glBitmap@28
glBlendFunc = glBlendFunc@8
glCallList = glCallList@4
glCallLists = glCallLists@12
glClear = glClear@4
glClearAccum = glClearAccum@16
glClearColor = glClearColor@16
glClearDepth = glClearDepth@8
glClearIndex = glClearIndex@4
glClearStencil = glClearStencil@4
glClipPlane = glClipPlane@8
glColor3b = glColor3b@12
glColor3bv = glColor3bv@4
glColor3d = glColor3d@24
glColor3dv = glColor3dv@4
glColor3f = glColor3f@12
glColor3fv = glColor3fv@4
glColor3i = glColor3i@12
glColor3iv = glColor3iv@4
glColor3s = glColor3s@12
glColor3sv = glColor3sv@4
glColor3ub = glColor3ub@12
glColor3ubv = glColor3ubv@4
glColor3ui = glColor3ui@12
glColor3uiv = glColor3uiv@4
glColor3us = glColor3us@12
glColor3usv = glColor3usv@4
glColor4b = glColor4b@16
glColor4bv = glColor4bv@4
glColor4d = glColor4d@32
glColor4dv = glColor4dv@4
glColor4f = glColor4f@16
glColor4fv = glColor4fv@4
glColor4i = glColor4i@16
glColor4iv = glColor4iv@4
glColor4s = glColor4s@16
glColor4sv = glColor4sv@4
glColor4ub = glColor4ub@16
glColor4ubv = glColor4ubv@4
glColor4ui = glColor4ui@16
glColor4uiv = glColor4uiv@4
glColor4us = glColor4us@16
glColor4usv = glColor4usv@4
glColorMask = glColorMask@16
glColorMaterial = glColorMaterial@8
glColorPointer = glColorPointer@16
glCopyPixels = glCopyPixels@20
glCopyTexImage1D = glCopyTexImage1D@28
glCopyTexImage2D = glCopyTexImage2D@32
glCopyTexSubImage1D = glCopyTexSubImage1D@24
glCopyTexSubImage2D = glCopyTexSubImage2D@32
glCullFace = glCullFace@4
; glDebugEntry = glDebugEntry@8
glDeleteLists = glDeleteLists@8
glDeleteTextures = glDeleteTextures@8
glDepthFunc = glDepthFunc@4
glDepthMask = glDepthMask@4
glDepthRange = glDepthRange@16
glDisable = glDisable@4
glDisableClientState = glDisableClientState@4
glDrawArrays = glDrawArrays@12
glDrawBuffer = glDrawBuffer@4
glDrawElements = glDrawElements@16
glDrawPixels = glDrawPixels@20
glEdgeFlag = glEdgeFlag@4
glEdgeFlagPointer = glEdgeFlagPointer@8
glEdgeFlagv = glEdgeFlagv@4
glEnable = glEnable@4
glEnableClientState = glEnableClientState@4
glEnd = glEnd@0
glEndList = glEndList@0
glEvalCoord1d = glEvalCoord1d@8
glEvalCoord1dv = glEvalCoord1dv@4
glEvalCoord1f = glEvalCoord1f@4
glEvalCoord1fv = glEvalCoord1fv@4
glEvalCoord2d = glEvalCoord2d@16
glEvalCoord2dv = glEvalCoord2dv@4
glEvalCoord2f = glEvalCoord2f@8
glEvalCoord2fv = glEvalCoord2fv@4
glEvalMesh1 = glEvalMesh1@12
glEvalMesh2 = glEvalMesh2@20
glEvalPoint1 = glEvalPoint1@4
glEvalPoint2 = glEvalPoint2@8
glFeedbackBuffer = glFeedbackBuffer@12
glFinish = glFinish@0
glFlush = glFlush@0
glFogf = glFogf@8
glFogfv = glFogfv@8
glFogi = glFogi@8
glFogiv = glFogiv@8
glFrontFace = glFrontFace@4
glFrustum = glFrustum@48
glGenLists = glGenLists@4
glGenTextures = glGenTextures@8
glGetBooleanv = glGetBooleanv@8
glGetClipPlane = glGetClipPlane@8
glGetDoublev = glGetDoublev@8
glGetError = glGetError@0
glGetFloatv = glGetFloatv@8
glGetIntegerv = glGetIntegerv@8
glGetLightfv = glGetLightfv@12
glGetLightiv = glGetLightiv@12
glGetMapdv = glGetMapdv@12
glGetMapfv = glGetMapfv@12
glGetMapiv = glGetMapiv@12
glGetMaterialfv = glGetMaterialfv@12
glGetMaterialiv = glGetMaterialiv@12
glGetPixelMapfv = glGetPixelMapfv@8
glGetPixelMapuiv = glGetPixelMapuiv@8
glGetPixelMapusv = glGetPixelMapusv@8
glGetPointerv = glGetPointerv@8
glGetPolygonStipple = glGetPolygonStipple@4
glGetString = glGetString@4
glGetTexEnvfv = glGetTexEnvfv@12
glGetTexEnviv = glGetTexEnviv@12
glGetTexGendv = glGetTexGendv@12
glGetTexGenfv = glGetTexGenfv@12
glGetTexGeniv = glGetTexGeniv@12
glGetTexImage = glGetTexImage@20
glGetTexLevelParameterfv = glGetTexLevelParameterfv@16
glGetTexLevelParameteriv = glGetTexLevelParameteriv@16
glGetTexParameterfv = glGetTexParameterfv@12
glGetTexParameteriv = glGetTexParameteriv@12
glHint = glHint@8
glIndexMask = glIndexMask@4
glIndexPointer = glIndexPointer@12
glIndexd = glIndexd@8
glIndexdv = glIndexdv@4
glIndexf = glIndexf@4
glIndexfv = glIndexfv@4
glIndexi = glIndexi@4
glIndexiv = glIndexiv@4
glIndexs = glIndexs@4
glIndexsv = glIndexsv@4
glIndexub = glIndexub@4
glIndexubv = glIndexubv@4
glInitNames = glInitNames@0
glInterleavedArrays = glInterleavedArrays@12
glIsEnabled = glIsEnabled@4
glIsList = glIsList@4
glIsTexture = glIsTexture@4
glLightModelf = glLightModelf@8
glLightModelfv = glLightModelfv@8
glLightModeli = glLightModeli@8
glLightModeliv = glLightModeliv@8
glLightf = glLightf@12
glLightfv = glLightfv@12
glLighti = glLighti@12
glLightiv = glLightiv@12
glLineStipple = glLineStipple@8
glLineWidth = glLineWidth@4
glListBase = glListBase@4
glLoadIdentity = glLoadIdentity@0
glLoadMatrixd = glLoadMatrixd@4
glLoadMatrixf = glLoadMatrixf@4
glLoadName = glLoadName@4
glLogicOp = glLogicOp@4
glMap1d = glMap1d@32
glMap1f = glMap1f@24
glMap2d = glMap2d@56
glMap2f = glMap2f@40
glMapGrid1d = glMapGrid1d@20
glMapGrid1f = glMapGrid1f@12
glMapGrid2d = glMapGrid2d@40
glMapGrid2f = glMapGrid2f@24
glMaterialf = glMaterialf@12
glMaterialfv = glMaterialfv@12
glMateriali = glMateriali@12
glMaterialiv = glMaterialiv@12
glMatrixMode = glMatrixMode@4
glMultMatrixd = glMultMatrixd@4
glMultMatrixf = glMultMatrixf@4
glNewList = glNewList@8
glNormal3b = glNormal3b@12
glNormal3bv = glNormal3bv@4
glNormal3d = glNormal3d@24
glNormal3dv = glNormal3dv@4
glNormal3f = glNormal3f@12
glNormal3fv = glNormal3fv@4
glNormal3i = glNormal3i@12
glNormal3iv = glNormal3iv@4
glNormal3s = glNormal3s@12
glNormal3sv = glNormal3sv@4
glNormalPointer = glNormalPointer@12
glOrtho = glOrtho@48
glPassThrough = glPassThrough@4
glPixelMapfv = glPixelMapfv@12
glPixelMapuiv = glPixelMapuiv@12
glPixelMapusv = glPixelMapusv@12
glPixelStoref = glPixelStoref@8
glPixelStorei = glPixelStorei@8
glPixelTransferf = glPixelTransferf@8
glPixelTransferi = glPixelTransferi@8
glPixelZoom = glPixelZoom@8
glPointSize = glPointSize@4
glPolygonMode = glPolygonMode@8
glPolygonOffset = glPolygonOffset@8
glPolygonStipple = glPolygonStipple@4
glPopAttrib = glPopAttrib@0
glPopClientAttrib = glPopClientAttrib@0
glPopMatrix = glPopMatrix@0
glPopName = glPopName@0
glPrioritizeTextures = glPrioritizeTextures@12
glPushAttrib = glPushAttrib@4
glPushClientAttrib = glPushClientAttrib@4
glPushMatrix = glPushMatrix@0
glPushName = glPushName@4
glRasterPos2d = glRasterPos2d@16
glRasterPos2dv = glRasterPos2dv@4
glRasterPos2f = glRasterPos2f@8
glRasterPos2fv = glRasterPos2fv@4
glRasterPos2i = glRasterPos2i@8
glRasterPos2iv = glRasterPos2iv@4
glRasterPos2s = glRasterPos2s@8
glRasterPos2sv = glRasterPos2sv@4
glRasterPos3d = glRasterPos3d@24
glRasterPos3dv = glRasterPos3dv@4
glRasterPos3f = glRasterPos3f@12
glRasterPos3fv = glRasterPos3fv@4
glRasterPos3i = glRasterPos3i@12
glRasterPos3iv = glRasterPos3iv@4
glRasterPos3s = glRasterPos3s@12
glRasterPos3sv = glRasterPos3sv@4
glRasterPos4d = glRasterPos4d@32
glRasterPos4dv = glRasterPos4dv@4
glRasterPos4f = glRasterPos4f@16
glRasterPos4fv = glRasterPos4fv@4
glRasterPos4i = glRasterPos4i@16
glRasterPos4iv = glRasterPos4iv@4
glRasterPos4s = glRasterPos4s@16
glRasterPos4sv = glRasterPos4sv@4
glReadBuffer = glReadBuffer@4
glReadPixels = glReadPixels@28
glRectd = glRectd@32
glRectdv = glRectdv@8
glRectf = glRectf@16
glRectfv = glRectfv@8
glRecti = glRecti@16
glRectiv = glRectiv@8
glRects = glRects@16
glRectsv = glRectsv@8
glRenderMode = glRenderMode@4
glRotated = glRotated@32
glRotatef = glRotatef@16
glScaled = glScaled@24
glScalef = glScalef@12
glScissor = glScissor@16
glSelectBuffer = glSelectBuffer@8
glShadeModel = glShadeModel@4
glStencilFunc = glStencilFunc@12
glStencilMask = glStencilMask@4
glStencilOp = glStencilOp@12
glTexCoord1d = glTexCoord1d@8
glTexCoord1dv = glTexCoord1dv@4
glTexCoord1f = glTexCoord1f@4
glTexCoord1fv = glTexCoord1fv@4
glTexCoord1i = glTexCoord1i@4
glTexCoord1iv = glTexCoord1iv@4
glTexCoord1s = glTexCoord1s@4
glTexCoord1sv = glTexCoord1sv@4
glTexCoord2d = glTexCoord2d@16
glTexCoord2dv = glTexCoord2dv@4
glTexCoord2f = glTexCoord2f@8
glTexCoord2fv = glTexCoord2fv@4
glTexCoord2i = glTexCoord2i@8
glTexCoord2iv = glTexCoord2iv@4
glTexCoord2s = glTexCoord2s@8
glTexCoord2sv = glTexCoord2sv@4
glTexCoord3d = glTexCoord3d@24
glTexCoord3dv = glTexCoord3dv@4
glTexCoord3f = glTexCoord3f@12
glTexCoord3fv = glTexCoord3fv@4
glTexCoord3i = glTexCoord3i@12
glTexCoord3iv = glTexCoord3iv@4
glTexCoord3s = glTexCoord3s@12
glTexCoord3sv = glTexCoord3sv@4
glTexCoord4d = glTexCoord4d@32
glTexCoord4dv = glTexCoord4dv@4
glTexCoord4f = glTexCoord4f@16
glTexCoord4fv = glTexCoord4fv@4
glTexCoord4i = glTexCoord4i@16
glTexCoord4iv = glTexCoord4iv@4
glTexCoord4s = glTexCoord4s@16
glTexCoord4sv = glTexCoord4sv@4
glTexCoordPointer = glTexCoordPointer@16
glTexEnvf = glTexEnvf@12
glTexEnvfv = glTexEnvfv@12
glTexEnvi = glTexEnvi@12
glTexEnviv = glTexEnviv@12
glTexGend = glTexGend@16
glTexGendv = glTexGendv@12
glTexGenf = glTexGenf@12
glTexGenfv = glTexGenfv@12
glTexGeni = glTexGeni@12
glTexGeniv = glTexGeniv@12
glTexImage1D = glTexImage1D@32
glTexImage2D = glTexImage2D@36
glTexParameterf = glTexParameterf@12
glTexParameterfv = glTexParameterfv@12
glTexParameteri = glTexParameteri@12
glTexParameteriv = glTexParameteriv@12
glTexSubImage1D = glTexSubImage1D@28
glTexSubImage2D = glTexSubImage2D@36
glTranslated = glTranslated@24
glTranslatef = glTranslatef@12
glVertex2d = glVertex2d@16
glVertex2dv = glVertex2dv@4
glVertex2f = glVertex2f@8
glVertex2fv = glVertex2fv@4
glVertex2i = glVertex2i@8
glVertex2iv = glVertex2iv@4
glVertex2s = glVertex2s@8
glVertex2sv = glVertex2sv@4
glVertex3d = glVertex3d@24
glVertex3dv = glVertex3dv@4
glVertex3f = glVertex3f@12
glVertex3fv = glVertex3fv@4
glVertex3i = glVertex3i@12
glVertex3iv = glVertex3iv@4
glVertex3s = glVertex3s@12
glVertex3sv = glVertex3sv@4
glVertex4d = glVertex4d@32
glVertex4dv = glVertex4dv@4
glVertex4f = glVertex4f@16
glVertex4fv = glVertex4fv@4
glVertex4i = glVertex4i@16
glVertex4iv = glVertex4iv@4
glVertex4s = glVertex4s@16
glVertex4sv = glVertex4sv@4
glVertexPointer = glVertexPointer@16
glViewport = glViewport@16

View File

@@ -151,11 +151,15 @@ enum {
/* CZ specific rev IDs */
enum {
CZ_CARRIZO_A0 = 0x01,
CARRIZO_A0 = 0x01,
STONEY_A0 = 0x61,
CZ_UNKNOWN = 0xFF
};
#define ASICREV_IS_CARRIZO(eChipRev) \
(eChipRev >= CARRIZO_A0)
((eChipRev >= CARRIZO_A0) && (eChipRev < STONEY_A0))
#define ASICREV_IS_STONEY(eChipRev) \
((eChipRev >= STONEY_A0) && (eChipRev < CZ_UNKNOWN))
#endif /* AMDGPU_ID_H */

View File

@@ -226,7 +226,11 @@ static boolean do_winsys_init(struct amdgpu_winsys *ws)
break;
case CHIP_CARRIZO:
ws->family = FAMILY_CZ;
ws->rev_id = CZ_CARRIZO_A0;
ws->rev_id = CARRIZO_A0;
break;
case CHIP_STONEY:
ws->family = FAMILY_CZ;
ws->rev_id = STONEY_A0;
break;
case CHIP_FIJI:
ws->family = FAMILY_VI;

View File

@@ -76,6 +76,9 @@ struct radeon_bomgr {
bool va;
uint64_t va_offset;
struct list_head va_holes;
/* BO size alignment */
unsigned size_align;
};
static inline struct radeon_bomgr *radeon_bomgr(struct pb_manager *mgr)
@@ -164,8 +167,10 @@ static uint64_t radeon_bomgr_find_va(struct radeon_bomgr *mgr, uint64_t size, ui
struct radeon_bo_va_hole *hole, *n;
uint64_t offset = 0, waste = 0;
alignment = MAX2(alignment, 4096);
size = align(size, 4096);
/* All VM address space holes will implicitly start aligned to the
* size alignment, so we don't need to sanitize the alignment here
*/
size = align(size, mgr->size_align);
pipe_mutex_lock(mgr->bo_va_mutex);
/* first look for a hole */
@@ -222,7 +227,7 @@ static void radeon_bomgr_free_va(struct radeon_bomgr *mgr, uint64_t va, uint64_t
{
struct radeon_bo_va_hole *hole;
size = align(size, 4096);
size = align(size, mgr->size_align);
pipe_mutex_lock(mgr->bo_va_mutex);
if ((va + size) == mgr->va_offset) {
@@ -333,9 +338,9 @@ static void radeon_bo_destroy(struct pb_buffer *_buf)
pipe_mutex_destroy(bo->map_mutex);
if (bo->initial_domain & RADEON_DOMAIN_VRAM)
bo->rws->allocated_vram -= align(bo->base.size, 4096);
bo->rws->allocated_vram -= align(bo->base.size, mgr->size_align);
else if (bo->initial_domain & RADEON_DOMAIN_GTT)
bo->rws->allocated_gtt -= align(bo->base.size, 4096);
bo->rws->allocated_gtt -= align(bo->base.size, mgr->size_align);
FREE(bo);
}
@@ -620,9 +625,9 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr,
}
if (rdesc->initial_domains & RADEON_DOMAIN_VRAM)
rws->allocated_vram += align(size, 4096);
rws->allocated_vram += align(size, mgr->size_align);
else if (rdesc->initial_domains & RADEON_DOMAIN_GTT)
rws->allocated_gtt += align(size, 4096);
rws->allocated_gtt += align(size, mgr->size_align);
return &bo->base;
}
@@ -696,6 +701,9 @@ struct pb_manager *radeon_bomgr_create(struct radeon_drm_winsys *rws)
mgr->va_offset = rws->va_start;
list_inithead(&mgr->va_holes);
/* TTM aligns the BO size to the CPU page size */
mgr->size_align = sysconf(_SC_PAGESIZE);
return &mgr->base;
}
@@ -858,7 +866,7 @@ radeon_winsys_bo_create(struct radeon_winsys *rws,
* BOs. Aligning this here helps the cached bufmgr. Especially small BOs,
* like constant/uniform buffers, can benefit from better and more reuse.
*/
size = align(size, 4096);
size = align(size, mgr->size_align);
/* Only set one usage bit each for domains and flags, or the cache manager
* might consider different sets of domains / flags compatible
@@ -969,7 +977,7 @@ static struct pb_buffer *radeon_winsys_bo_from_ptr(struct radeon_winsys *rws,
pipe_mutex_unlock(mgr->bo_handles_mutex);
}
ws->allocated_gtt += align(bo->base.size, 4096);
ws->allocated_gtt += align(bo->base.size, mgr->size_align);
return (struct pb_buffer*)bo;
}
@@ -1106,9 +1114,9 @@ done:
bo->initial_domain = radeon_bo_get_initial_domain((void*)bo);
if (bo->initial_domain & RADEON_DOMAIN_VRAM)
ws->allocated_vram += align(bo->base.size, 4096);
ws->allocated_vram += align(bo->base.size, mgr->size_align);
else if (bo->initial_domain & RADEON_DOMAIN_GTT)
ws->allocated_gtt += align(bo->base.size, 4096);
ws->allocated_gtt += align(bo->base.size, mgr->size_align);
return (struct pb_buffer*)bo;

View File

@@ -35,6 +35,7 @@ extern "C" {
#define __GBM__ 1
#include <stddef.h>
#include <stdint.h>
/**

View File

@@ -62,6 +62,8 @@ public:
virtual ir_rvalue *hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state);
virtual bool has_sequence_subexpression() const;
/**
* Retrieve the source location of an AST node
*
@@ -221,6 +223,8 @@ public:
virtual void hir_no_rvalue(exec_list *instructions,
struct _mesa_glsl_parse_state *state);
virtual bool has_sequence_subexpression() const;
ir_rvalue *do_hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state,
bool needs_rvalue);
@@ -299,6 +303,8 @@ public:
virtual void hir_no_rvalue(exec_list *instructions,
struct _mesa_glsl_parse_state *state);
virtual bool has_sequence_subexpression() const;
private:
/**
* Is this function call actually a constructor?

View File

@@ -395,13 +395,54 @@ generate_call(exec_list *instructions, ir_function_signature *sig,
}
}
/* If the function call is a constant expression, don't generate any
* instructions; just generate an ir_constant.
/* Section 4.3.2 (Const) of the GLSL 1.10.59 spec says:
*
* Function calls were first allowed to be constant expressions in GLSL
* 1.20 and GLSL ES 3.00.
* "Initializers for const declarations must be formed from literal
* values, other const variables (not including function call
* paramaters), or expressions of these.
*
* Constructors may be used in such expressions, but function calls may
* not."
*
* Section 4.3.3 (Constant Expressions) of the GLSL 1.20.8 spec says:
*
* "A constant expression is one of
*
* ...
*
* - a built-in function call whose arguments are all constant
* expressions, with the exception of the texture lookup
* functions, the noise functions, and ftransform. The built-in
* functions dFdx, dFdy, and fwidth must return 0 when evaluated
* inside an initializer with an argument that is a constant
* expression."
*
* Section 5.10 (Constant Expressions) of the GLSL ES 1.00.17 spec says:
*
* "A constant expression is one of
*
* ...
*
* - a built-in function call whose arguments are all constant
* expressions, with the exception of the texture lookup
* functions."
*
* Section 4.3.3 (Constant Expressions) of the GLSL ES 3.00.4 spec says:
*
* "A constant expression is one of
*
* ...
*
* - a built-in function call whose arguments are all constant
* expressions, with the exception of the texture lookup
* functions. The built-in functions dFdx, dFdy, and fwidth must
* return 0 when evaluated inside an initializer with an argument
* that is a constant expression."
*
* If the function call is a constant expression, don't generate any
* instructions; just generate an ir_constant.
*/
if (state->is_version(120, 300)) {
if (state->is_version(120, 100)) {
ir_constant *value = sig->constant_expression_value(actual_parameters, NULL);
if (value != NULL) {
return value;
@@ -1911,6 +1952,17 @@ ast_function_expression::hir(exec_list *instructions,
unreachable("not reached");
}
bool
ast_function_expression::has_sequence_subexpression() const
{
foreach_list_typed(const ast_node, ast, link, &this->expressions) {
if (ast->has_sequence_subexpression())
return true;
}
return false;
}
ir_rvalue *
ast_aggregate_initializer::hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state)

View File

@@ -482,18 +482,20 @@ bit_logic_result_type(const struct glsl_type *type_a,
}
static const struct glsl_type *
modulus_result_type(const struct glsl_type *type_a,
const struct glsl_type *type_b,
modulus_result_type(ir_rvalue * &value_a, ir_rvalue * &value_b,
struct _mesa_glsl_parse_state *state, YYLTYPE *loc)
{
const glsl_type *type_a = value_a->type;
const glsl_type *type_b = value_b->type;
if (!state->check_version(130, 300, loc, "operator '%%' is reserved")) {
return glsl_type::error_type;
}
/* From GLSL 1.50 spec, page 56:
/* Section 5.9 (Expressions) of the GLSL 4.00 specification says:
*
* "The operator modulus (%) operates on signed or unsigned integers or
* integer vectors. The operand types must both be signed or both be
* unsigned."
* integer vectors."
*/
if (!type_a->is_integer()) {
_mesa_glsl_error(loc, state, "LHS of operator %% must be an integer");
@@ -503,11 +505,28 @@ modulus_result_type(const struct glsl_type *type_a,
_mesa_glsl_error(loc, state, "RHS of operator %% must be an integer");
return glsl_type::error_type;
}
if (type_a->base_type != type_b->base_type) {
/* "If the fundamental types in the operands do not match, then the
* conversions from section 4.1.10 "Implicit Conversions" are applied
* to create matching types."
*
* Note that GLSL 4.00 (and GL_ARB_gpu_shader5) introduced implicit
* int -> uint conversion rules. Prior to that, there were no implicit
* conversions. So it's harmless to apply them universally - no implicit
* conversions will exist. If the types don't match, we'll receive false,
* and raise an error, satisfying the GLSL 1.50 spec, page 56:
*
* "The operand types must both be signed or unsigned."
*/
if (!apply_implicit_conversion(type_a, value_b, state) &&
!apply_implicit_conversion(type_b, value_a, state)) {
_mesa_glsl_error(loc, state,
"operands of %% must have the same base type");
"could not implicitly convert operands to "
"modulus (%%) operator");
return glsl_type::error_type;
}
type_a = value_a->type;
type_b = value_b->type;
/* "The operands cannot be vectors of differing size. If one operand is
* a scalar and the other vector, then the scalar is applied component-
@@ -939,6 +958,12 @@ ast_node::hir(exec_list *instructions, struct _mesa_glsl_parse_state *state)
return NULL;
}
bool
ast_node::has_sequence_subexpression() const
{
return false;
}
void
ast_function_expression::hir_no_rvalue(exec_list *instructions,
struct _mesa_glsl_parse_state *state)
@@ -1261,7 +1286,7 @@ ast_expression::do_hir(exec_list *instructions,
op[0] = this->subexpressions[0]->hir(instructions, state);
op[1] = this->subexpressions[1]->hir(instructions, state);
type = modulus_result_type(op[0]->type, op[1]->type, state, & loc);
type = modulus_result_type(op[0], op[1], state, &loc);
assert(operations[this->oper] == ir_binop_mod);
@@ -1508,7 +1533,7 @@ ast_expression::do_hir(exec_list *instructions,
op[0] = this->subexpressions[0]->hir(instructions, state);
op[1] = this->subexpressions[1]->hir(instructions, state);
type = modulus_result_type(op[0]->type, op[1]->type, state, & loc);
type = modulus_result_type(op[0], op[1], state, &loc);
assert(operations[this->oper] == ir_binop_mod);
@@ -1850,6 +1875,80 @@ ast_expression::do_hir(exec_list *instructions,
return result;
}
bool
ast_expression::has_sequence_subexpression() const
{
switch (this->oper) {
case ast_plus:
case ast_neg:
case ast_bit_not:
case ast_logic_not:
case ast_pre_inc:
case ast_pre_dec:
case ast_post_inc:
case ast_post_dec:
return this->subexpressions[0]->has_sequence_subexpression();
case ast_assign:
case ast_add:
case ast_sub:
case ast_mul:
case ast_div:
case ast_mod:
case ast_lshift:
case ast_rshift:
case ast_less:
case ast_greater:
case ast_lequal:
case ast_gequal:
case ast_nequal:
case ast_equal:
case ast_bit_and:
case ast_bit_xor:
case ast_bit_or:
case ast_logic_and:
case ast_logic_or:
case ast_logic_xor:
case ast_array_index:
case ast_mul_assign:
case ast_div_assign:
case ast_add_assign:
case ast_sub_assign:
case ast_mod_assign:
case ast_ls_assign:
case ast_rs_assign:
case ast_and_assign:
case ast_xor_assign:
case ast_or_assign:
return this->subexpressions[0]->has_sequence_subexpression() ||
this->subexpressions[1]->has_sequence_subexpression();
case ast_conditional:
return this->subexpressions[0]->has_sequence_subexpression() ||
this->subexpressions[1]->has_sequence_subexpression() ||
this->subexpressions[2]->has_sequence_subexpression();
case ast_sequence:
return true;
case ast_field_selection:
case ast_identifier:
case ast_int_constant:
case ast_uint_constant:
case ast_float_constant:
case ast_bool_constant:
case ast_double_constant:
return false;
case ast_aggregate:
unreachable("ast_aggregate: Should never get here.");
case ast_function_call:
unreachable("should be handled by ast_function_expression::hir");
}
return false;
}
ir_rvalue *
ast_expression_statement::hir(exec_list *instructions,
@@ -3146,16 +3245,72 @@ process_initializer(ir_variable *var, ast_declaration *decl,
/* Calculate the constant value if this is a const or uniform
* declaration.
*
* Section 4.3 (Storage Qualifiers) of the GLSL ES 1.00.17 spec says:
*
* "Declarations of globals without a storage qualifier, or with
* just the const qualifier, may include initializers, in which case
* they will be initialized before the first line of main() is
* executed. Such initializers must be a constant expression."
*
* The same section of the GLSL ES 3.00.4 spec has similar language.
*/
if (type->qualifier.flags.q.constant
|| type->qualifier.flags.q.uniform) {
|| type->qualifier.flags.q.uniform
|| (state->es_shader && state->current_function == NULL)) {
ir_rvalue *new_rhs = validate_assignment(state, initializer_loc,
lhs, rhs, true);
if (new_rhs != NULL) {
rhs = new_rhs;
/* Section 4.3.3 (Constant Expressions) of the GLSL ES 3.00.4 spec
* says:
*
* "A constant expression is one of
*
* ...
*
* - an expression formed by an operator on operands that are
* all constant expressions, including getting an element of
* a constant array, or a field of a constant structure, or
* components of a constant vector. However, the sequence
* operator ( , ) and the assignment operators ( =, +=, ...)
* are not included in the operators that can create a
* constant expression."
*
* Section 12.43 (Sequence operator and constant expressions) says:
*
* "Should the following construct be allowed?
*
* float a[2,3];
*
* The expression within the brackets uses the sequence operator
* (',') and returns the integer 3 so the construct is declaring
* a single-dimensional array of size 3. In some languages, the
* construct declares a two-dimensional array. It would be
* preferable to make this construct illegal to avoid confusion.
*
* One possibility is to change the definition of the sequence
* operator so that it does not return a constant-expression and
* hence cannot be used to declare an array size.
*
* RESOLUTION: The result of a sequence operator is not a
* constant-expression."
*
* Section 4.3.3 (Constant Expressions) of the GLSL 4.30.9 spec
* contains language almost identical to the section 4.3.3 in the
* GLSL ES 3.00.4 spec. This is a new limitation for these GLSL
* versions.
*/
ir_constant *constant_value = rhs->constant_expression_value();
if (!constant_value) {
if (!constant_value ||
(state->is_version(430, 300) &&
decl->initializer->has_sequence_subexpression())) {
const char *const variable_mode =
(type->qualifier.flags.q.constant)
? "const"
: ((type->qualifier.flags.q.uniform) ? "uniform" : "global");
/* If ARB_shading_language_420pack is enabled, initializers of
* const-qualified local variables do not have to be constant
* expressions. Const-qualified global variables must still be
@@ -3166,22 +3321,24 @@ process_initializer(ir_variable *var, ast_declaration *decl,
_mesa_glsl_error(& initializer_loc, state,
"initializer of %s variable `%s' must be a "
"constant expression",
(type->qualifier.flags.q.constant)
? "const" : "uniform",
variable_mode,
decl->identifier);
if (var->type->is_numeric()) {
/* Reduce cascading errors. */
var->constant_value = ir_constant::zero(state, var->type);
var->constant_value = type->qualifier.flags.q.constant
? ir_constant::zero(state, var->type) : NULL;
}
}
} else {
rhs = constant_value;
var->constant_value = constant_value;
var->constant_value = type->qualifier.flags.q.constant
? constant_value : NULL;
}
} else {
if (var->type->is_numeric()) {
/* Reduce cascading errors. */
var->constant_value = ir_constant::zero(state, var->type);
var->constant_value = type->qualifier.flags.q.constant
? ir_constant::zero(state, var->type) : NULL;
}
}
}

View File

@@ -326,9 +326,9 @@ link_set_uniform_initializers(struct gl_shader_program *prog,
} else {
assert(!"Explicit binding not on a sampler, UBO or atomic.");
}
} else if (var->constant_value) {
} else if (var->constant_initializer) {
linker::set_uniform_initializer(mem_ctx, prog, var->name,
var->type, var->constant_value,
var->type, var->constant_initializer,
boolean_true);
}
}

View File

@@ -1629,7 +1629,7 @@ void nir_dump_dom_frontier(nir_shader *shader, FILE *fp);
void nir_dump_cfg_impl(nir_function_impl *impl, FILE *fp);
void nir_dump_cfg(nir_shader *shader, FILE *fp);
void nir_split_var_copies(nir_shader *shader);
bool nir_split_var_copies(nir_shader *shader);
void nir_lower_var_copy_instr(nir_intrinsic_instr *copy, void *mem_ctx);
void nir_lower_var_copies(nir_shader *shader);
@@ -1653,7 +1653,7 @@ void nir_lower_vars_to_ssa(nir_shader *shader);
void nir_remove_dead_variables(nir_shader *shader);
void nir_lower_vec_to_movs(nir_shader *shader);
bool nir_lower_vec_to_movs(nir_shader *shader);
void nir_lower_alu_to_scalar(nir_shader *shader);
void nir_lower_load_const_to_scalar(nir_shader *shader);

View File

@@ -455,7 +455,8 @@ lower_copies_to_load_store(struct deref_node *node,
struct deref_node *arg_node =
get_deref_node(copy->variables[i], state);
if (arg_node == NULL)
/* Only bother removing copy entries for other nodes */
if (arg_node == NULL || arg_node == node)
continue;
struct set_entry *arg_entry = _mesa_set_search(arg_node->copies, copy);
@@ -466,6 +467,8 @@ lower_copies_to_load_store(struct deref_node *node,
nir_instr_remove(&copy->instr);
}
node->copies = NULL;
return true;
}

Some files were not shown because too many files have changed in this diff Show More