Compare commits

..

96 Commits

Author SHA1 Message Date
Emil Velikov
31bf247031 docs: add release notes for 11.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-24 19:34:01 +01:00
Emil Velikov
b530dccbff Update version to 11.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-24 19:29:27 +01:00
Jonathan Gray
6d6a4d7c76 configure.ac: ensure RM is set
GNU make predefines RM to rm -f but this is not required by POSIX
so ensure that RM is set.  This fixes "make clean" on OpenBSD.

v2: use AC_CHECK_PROG

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 99c4079c37)
2015-10-21 14:23:22 +01:00
Tapani Pälli
13276962c7 mesa: fix ARRAY_SIZE query for GetProgramResourceiv
Patch also refactors name length queries which were using array size
in computation, this has to be done in same time to avoid regression in
arb_program_interface_query-resource-query Piglit test.

Fixes rest of the failures with
   ES31-CTS.program_interface_query.no-locations

v2: make additional check only for GS inputs
v3: create helper function for resource name length
    so that it gets calculated only in one place

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
(cherry picked from commit c0722be9f5)
2015-10-21 14:23:22 +01:00
Chris Wilson
03ab39fa70 i965: Remove early release of DRI2 miptree
intel_update_winsys_renderbuffer_miptree() will release the existing
miptree when wrapping a new DRI2 buffer, so we can remove the early
release and so prevent a NULL mt dereference should importing the new
DRI2 name fail for any reason. (Reusing the old DRI2 name will result
in the rendering going astray, to a stale buffer, and not shown on the
screen, but it allows us to issue a warning and not crash much later in
innocent code.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86281
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 70e91d61fd)
2015-10-21 14:23:22 +01:00
Alejandro Piñeiro
4d215a25d5 i965/vec4: fill src_reg type using the constructor type parameter
The src_reg constructor that received the glsl_type was using it
only to build the swizzle, but not to fill this->type as dst_reg
is doing.

This caused some type mismatch between movs and alu operations
on the NIR path, so copy propagation optimization was not applied
to remove unneeded movs if negate modifier was involved. This was
first detected on minus (negate+add) operations.

Shader DB results (taking into account only vec4):

total instructions in shared programs: 20019 -> 19934 (-0.42%)
instructions in affected programs:     2918 -> 2833 (-2.91%)
helped:                                79
HURT:                                  0
GAINED:                                0
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 4de86e1371)
Nominated-by: Christoph Brill <egore911@egore911.de>
2015-10-21 14:23:22 +01:00
Alejandro Piñeiro
6766a36e19 i965/vec4: check writemask when bailing out at register coalesce
opt_register_coalesce stopped to check previous instructions to
coalesce with if somebody else was writing on the same
destination. This can be optimized to check if somebody else was
writing to the same channels of the same destination using the
writemask.

Shader DB results (taking into account only vec4):

total instructions in shared programs: 1781593 -> 1734957 (-2.62%)
instructions in affected programs:     1238390 -> 1191754 (-3.77%)
helped:                                12782
HURT:                                  0
GAINED:                                0
LOST:                                  0

v2: removed some parenthesis, fixed indentation, as suggested by
    Matt Turner
v3: added brackets, for consistency, as suggested by Eduardo Lima

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit d4e29af234)
Nominated-by: Jason Ekstrand <jason@jlekstrand.net>
2015-10-21 14:23:22 +01:00
Brian Paul
42364b33d1 mesa: fix incorrect opcode in save_BlendFunci()
Fixes assertion failure with new piglit
arb_draw_buffers_blend-state_set_get test.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit e24d04e436)
2015-10-21 14:23:22 +01:00
Marek Olšák
54a30ed94f gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT
This avoids a serious r600g bug leading to a GPU hang.
The chances this bug will get fixed are pretty low now.

I deeply regret listening to others and not pushing this patch, leaving
other users with a GPU-crashing driver. Yes, it should be fixed
in the compiler and it's ugly, but users couldn't care less about that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720

Cc: 11.0 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 814f31457e)
2015-10-21 14:23:22 +01:00
Leo Liu
6f48b8957e st/omx/dec/h264: fix field picture type 0 poc disorder
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 867284a8f0)
2015-10-21 14:23:22 +01:00
Indrajit Das
b91ed628c1 st/va: Used correct parameter to derive the value of the "h" variable in vlVaCreateImage
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 381c17d695)
2015-10-21 14:23:22 +01:00
Marek Olšák
141109cc52 radeonsi: fix a GS copy shader leak
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit aa060e276c)
[Emil Velikov: si_shader_destroy() wants the ctx as first argument]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_shader.c
2015-10-21 14:23:21 +01:00
Marek Olšák
5d41a78769 st/mesa: fix clip state dependencies
This allows removing FLUSH_VERTICES in MatrixMode.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 3c6156a4a7)
2015-10-21 14:23:21 +01:00
Tapani Pälli
da1d57faf3 mesa: Set api prefix to version string when overriding version
Otherwise there are problems when user overrides version and application
such as Piglit wants to detect used api with glGetString(GL_VERSION).

This makes it currently impossible to run glslparsertest tests for
OpenGL ES when using version override.

Below is example when using MESA_GLES_VERSION_OVERRIDE=3.1.

Before:
	"3.1 Mesa 11.1.0-devel (git-24a1a15)"

After:
	"OpenGL ES 3.1 Mesa 11.1.0-devel (git-78042ff)"

v2: only include api prefix for OpenGL ES (Boyan Ding)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc8c221e28)
2015-10-21 14:23:21 +01:00
Rob Clark
0d87f75763 freedreno/a3xx: cache-flush is needed after MEM_WRITE
Otherwise the mem2gmem blit would see potentially bogus texture
coordinates.  Fixes an issue that shows up with glamor.

CC: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 6206da736c)
2015-10-21 14:23:21 +01:00
Chih-Wei Huang
009890a0de nv30: include the header of ffs prototype
It fixes a building error of the android 6.0 64-bit target.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7599f8b167)
2015-10-21 14:23:21 +01:00
Chih-Wei Huang
938df905ea nv50/ir: use C++11 standard std::unordered_map if possible
Note Android version before Lollipop is not supported.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d31005e3e5)
2015-10-21 14:23:21 +01:00
Chih-Wei Huang
9b561ed2d1 mesa: android: Fix the incorrect path of sse_minmax.c
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Fixes: 669cfc267a (android: mesa: fix the path of the SSE4_1
optimisations)
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit 67d8518a0e)
2015-10-21 14:23:21 +01:00
Krzysztof Sobiecki
b0b31397e2 st/fbo: use pipe_surface_release instead of pipe_surface_reference
pipe_surface_reference have problems with deleted contexts,
so use of pipe_surface_release might be more appropriate.

Fixes Wasteland 2 Director's Cut crash on start.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 14f7ce4248)
2015-10-21 14:23:21 +01:00
Brian Paul
c0b85c5a4c vbo: fix incorrect switch statement in init_mat_currval()
The variable 'i' is a value in [0, MAT_ATTRIB_MAX-1] so subtracting
VERT_ATTRIB_GENERIC0 gave a bogus value and we executed the default
switch clause for all loop iterations.

This doesn't fix any known issues but was clearly incorrect.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit dd293d8aae)
2015-10-21 14:23:21 +01:00
Ian Romanick
a9da1ead7b glsl: In later GLSL versions, sequence operator is cannot be a constant expression
Fixes:
    ES3-CTS.shaders.negative.constant_sequence

    spec/glsl-es-3.00/compiler/global-initializer/from-sequence.vert
    spec/glsl-es-3.00/compiler/global-initializer/from-sequence.frag

v2: Fix a couple copy-and-paste mistake in the spec quotations.
Suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92635a84a7)
2015-10-21 14:23:21 +01:00
Ian Romanick
dab0c565d3 glsl: Add method to determine whether an expression contains the sequence operator
This will be used in the next patch to enforce some language sematics.

v2: Fix inverted logic in
ast_function_expression::has_sequence_subexpression.  The method
originally had a different name and a different meaning.  I fixed the
logic in ast_to_hir.cpp, but I only changed the names in
ast_function.cpp.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05e4601c6b)
2015-10-21 14:23:21 +01:00
Ian Romanick
96931dbf14 glsl: Restrict initializers for global variables to constant expression in ES
v2: Combine this check with the existing const and uniform checks.  This
change depends on the previous patch (glsl: Only set
ir_variable::constant_value for const-decorated variables).

Fixes:

    ES2-CTS.shaders.negative.initialize
    ES3-CTS.shaders.negative.initialize

    spec/glsl-es-1.00/compiler/global-initializer/from-attribute.vert
    spec/glsl-es-1.00/compiler/global-initializer/from-uniform.vert
    spec/glsl-es-1.00/compiler/global-initializer/from-uniform.frag
    spec/glsl-es-1.00/compiler/global-initializer/from-global.vert
    spec/glsl-es-1.00/compiler/global-initializer/from-global.frag
    spec/glsl-es-1.00/compiler/global-initializer/from-varying.frag
    spec/glsl-es-3.00/compiler/global-initializer/from-uniform.vert
    spec/glsl-es-3.00/compiler/global-initializer/from-uniform.frag
    spec/glsl-es-3.00/compiler/global-initializer/from-in.vert
    spec/glsl-es-3.00/compiler/global-initializer/from-in.frag
    spec/glsl-es-3.00/compiler/global-initializer/from-global.vert
    spec/glsl-es-3.00/compiler/global-initializer/from-global.frag

Note: spec/glsl-es-3.00/compiler/global-initializer/from-sequence.*
still fail because the result of a sequence operator is still considered
to be a constant expression.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92304
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v1]
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb329f2ff6)
2015-10-21 14:23:21 +01:00
Ian Romanick
2d5b8efd7d glsl: Only set ir_variable::constant_value for const-decorated variables
Right now we're also setting for uniforms, and that doesn't seem to hurt
things.  The next patch will make general global variables in GLSL ES,
and those definitely should not have constant_value set!

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3524d6df33)
2015-10-21 14:23:21 +01:00
Ian Romanick
0c6b210749 glsl: Use constant_initializer instead of constant_value to determine whether to keep an unused uniform
This even matches the comment "uniform initializers are precious, and
could get used by another stage."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5bc68f0f2b)
2015-10-21 14:23:20 +01:00
Ian Romanick
8e9b698c24 glsl/linker: Use constant_initializer instead of constant_value to initialize uniforms
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 313372cae8)
2015-10-21 14:23:20 +01:00
Ian Romanick
3b238aa08f ff_fragment_shader: Use binding to set the sampler unit
This is the way layout(binding=xxx) works from GLSL.  The old method
just happened to work (and significantly predated support for
layout(binding=xxx)), but future changes will break this.

v2: Remove some stale comments.  Suggested by Matt and Chris Forbes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8acce5d53a)
2015-10-21 14:23:20 +01:00
Ian Romanick
cd6ff70856 glsl: Allow built-in functions as constant expressions in OpenGL ES 1.00
In d4a24745 (August 2012), Paul made functions calls not be constant
expressions in GLSL ES 1.00.  Since this feature was added in desktop
GLSL 1.20, we believed that it was added in GLSL ES 3.00.  That turns
out to be completely wrong.  Built-in functions have always been allowed
as constant expressions in GLSL ES, and the patch adds the (many) spec
quotations to prove it.

While we never previously encountered this, a later patch enforces a GLSL
ES 1.00 rule that global variable initializers must be constant
expressions.  Without this fix, several dEQP tests fail.

Fixes:

    tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.frag
    tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.vert
    tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.frag
    tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.vert

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.0 10.1 10.2 10.3 10.4 10.5 10.6 11.0" <mesa-stable@lists.freedesktop.org>

Yes, I know we don't maintain stable branches that far back, but that
*is* how far back this bug goes!

(cherry picked from commit 43b07eb60f)
2015-10-21 14:23:20 +01:00
Nicolai Hähnle
2ee32ffe7c u_vbuf: fix vb slot assignment for translated buffers
Vertex attributes of different categories (constant/per-instance/
per-vertex) go into different buffers for translation, and this is now
properly reflected in the vertex buffers passed to the driver.

Fixes e.g. piglit's point-vertex-id divisor test.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 45ed627d89)
2015-10-21 14:23:20 +01:00
Dave Airlie
25e1e90937 mesa/uniforms: fix get_uniform for doubles (v2)
The initial glGetUniformdv support didn't cover all the
casting cases that are apparantly legal, and cts seems to
test for them.

I've updated the piglit test to cover these cases now.

v2: fix indentation - it's all broken in this file (Ilia)
fix src/dst index tracking in light of fp64 support (Ilia)

cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit bcfaab3885)
2015-10-21 14:23:20 +01:00
Francisco Jerez
22aae69aa5 mesa: Get rid of texture-dependent image unit derived state.
The point is to avoid having to re-validate all image units when
_NEW_TEXTURE is flagged, which can be expensive if the driver exposes
a large number of image units.  This has been reported to fix a 36%
performance regression in the Synmark2 Multithread benchmark on the
i965 driver which exposes 192 image units.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91788
Reported-by: Wendy Wang <wendy.wang@intel.com>
Tested-by: Ye Tian <yex.tian@intel.com>
CC: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7e441bf025)
2015-10-21 14:23:20 +01:00
Francisco Jerez
4779eb04a4 i965: Use _mesa_is_image_unit_valid() instead of gl_image_unit::_Valid.
gl_image_unit::_Valid will be removed in a future commit.

Tested-by: Ye Tian <yex.tian@intel.com>
CC: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2d97a78b37)
2015-10-21 14:23:20 +01:00
Francisco Jerez
df361e2311 mesa: Skip redundant texture completeness checking during image validation.
The call to _mesa_test_texobj_completeness() is unnecessary if the
texture is already known to be complete.  If the texture object is
dirtied in the meantime _BaseComplete and _MipmapComplete will be
reset to false.  _mesa_is_image_unit_valid() will start to be called
more frequently in a future commit, so it seems desirable to avoid the
unnecessary work.

Tested-by: Ye Tian <yex.tian@intel.com>
CC: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 25d3338be3)
2015-10-21 14:23:20 +01:00
Francisco Jerez
7259f17eca mesa: Expose function to calculate whether a shader image unit is valid.
A future commit will remove all texture object-dependent derived state
from the image unit struct to make validation unnecessary on texture
state changes.  Instead of checking gl_image_unit::_Valid drivers will
be required to call this function when needed to find out whether an
image unit is in a valid state and whether access from the shader is
allowed.

Tested-by: Ye Tian <yex.tian@intel.com>
CC: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5152db415f)
2015-10-21 14:23:20 +01:00
Francisco Jerez
37b647b979 i965: Don't tell the hardware about our UAV access.
The hardware documentation relating to the UAV HW-assisted coherency
mechanism and UAV access enable bits is scarce and sometimes
contradictory, and there's quite some guesswork behind this commit, so
let me summarize the background first: HSW and later hardware have
infrastructure to support a stricter form of data coherency between
shader invocations from separate primitives.  The mechanism is
controlled by the "Accesses UAV" bits on 3DSTATE_VS, _HS, _DS, _GS and
_PS (or _PS_EXTRA on BDW+), and the "UAV Coherency Required" bit on
the 3DPRIMITIVE command.

Regardless of whether "UAV Coherency Required" is set, the hardware
fixed-function units will increment a per-stage semaphore for each
request received if "Accesses UAV" is set for the same or any lower
stage.  An implicit DC flush is emitted by the lowermost stage with
"Accesses UAV" set once it's done processing the request, this also
happens regardless of the value of "UAV Coherency Required".  The
completion of the DC flush will cause the same stage and all previous
ones to decrement the semaphore, marking the UAV accesses for the
primitive as coherent with L3.

The "UAV Coherency Required" 3DPRIMITIVE bit will cause a pipeline
stall before any threads are dispatched for the first FF stage with
"Accesses UAV" set until the semaphore is cleared for the same stage.
Effectively this guarantees that UAV memory accesses performed by
previous primitives from any stage will be strictly ordered (and
thanks to the implicit DC flush visible in memory) with UAV accesses
from the following primitives.

None of this is required by the usual image, atomic counter and SSBO
GL APIs which have very relaxed cross-primitive coherency and ordering
requirements, so we don't actually ever set the "UAV Coherency
Required" bit -- Ordering with respect to shader invocations from
previous stages on the same primitive where there is a data dependency
is of course already guaranteed as the spec requires, regardless of
this mechanism being enabled.  We do set the "Accesses UAV" bits
though since my commit ac7664e493 (which
this patch partially reverts), mainly because of comments like the
following from the BDW PRM:

> 3DSTATE_GS
>[...]
> 12 Accesses UAV
>    Format: Enable
>    This field must be set when GS has a UAV access.

There are similar comments in the documentation for the other
3DSTATE_*S commands.  The "must" part is misleading and unjustified
AFAIK.  Most of the "Accesses UAV" bits don't seem to have any side
effects other than the implicit DC flushes and the related
book-keeping in anticipation for a subsequent primitive with "UAV
Coherency Required" set, so in most cases they are unnecessary and may
incur a performance penalty.  There is an exception though.  On Gen8+
the PS_EXTRA UAV access bit influences the calculation of the PS
UAV-only and ThreadDispatchEnable signals which on previous
generations were set explicitly by the driver, so we cannot always
avoid enabling it on the PS stage.

The primary motivation for this change is that in fact the hardware
coherency mechanism is buggy and will cause a rather non-deterministic
hang on Gen8 when VS is the only stage with "Accesses UAV" set and the
processing of a request terminates immediately after the implicit DC
flush is sent for a previous primitive with no additional vertices
being emitted for the second primitive, what will cause the hardware
to skip sending a second DC flush and cause the VS to stall
indefinitely waiting for a response from the DC (BDWGFX HSD 1912017).
This hardware bug can be reproduced on current master with the
spec@arb_shader_image_load_store@host-mem-barrier@Indirect/RaW piglit
subtest (if you have the patience to run it a few dozen times).

The proposed workaround is to insert CS STALLs speculatively between
3DPRIMITIVE commands when "Accesses UAV" is enabled for the VS stage
only.  Because this would affect one of the hottest paths in the
driver and likely decrease performance even further due to the
unnecessary serialization, and because we don't actually need the
implicit DC flushes, it seems better to just disable them.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5346c11670)
2015-10-21 14:23:20 +01:00
Tapani Pälli
41cc0965bb mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span
Patch adds missing type (used with NV_read_depth) so that it gets
handled correctly. This fixes errors seen with following CTS test:

   ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d8d0e4a81e)
2015-10-21 14:23:20 +01:00
Ilia Mirkin
3f802ebaf8 nouveau: make sure there's always room to emit a fence
I started seeing a lot of situations on nv30 where fence emission
wouldn't fit into the previous buffer (causing assertions). This ensures
that whenever checking for space, we always leave a bit of extra room
for the fence emission commands. Adjusts the nv30 and nvc0 fence
emission logic to bypass the space checking as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 47d11990b2)

Squashed with commit

nouveau: avoid emitting new fences unnecessarily

Right now we emit on every kick, but this is only necessary if something
will ever be able to observe that the fence completed. If there are no
refs, leave the fence alone and emit it another day.

This also happens to work around an issue for the kick handler -- a kick
can be a result of e.g. nouveau_bo_wait or explicit kick, or it can be
due to lack of space in the pushbuf. We want the emit to happen in the
current batch, so we want there to always be enough space. However an
explicit kick could take the reserved space for the implicitly-triggered
kick's fence emission if it happened right after. With the new mechanism,
hopefully there's no way to cause two fences to be emitted into the same
reserved space.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8053c9208f)

Squashed with commit

nv50,nvc0: don't base decisions on available pushbuf space

We still have to push everything out, might as well kick earlier and
flip pushbufs when we know we'll need it. This resolves some issues with
the new policy of making sure that we always leave a bit of room at the
end for fences.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9fe458335f)

Squashed with commit

nouveau: avoid double-emitting fence

The act of ensuring that there is space can cause a flush to happen,
which will emit the current screen fence. If that is the fence we're
trying to wait on, then it will have been emitted as a result of doing
the PUSH_SPACE. Don't attempt to emit it a second time.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: 8053c9208f (nouveau: avoid emitting new fences unnecessarily)
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bf97f8d467)
2015-10-21 14:21:07 +01:00
Emil Velikov
b4bfea0094 docs: add sha256 checksums for 11.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-10 17:02:46 +01:00
Emil Velikov
914966befc docs: add release notes for 11.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-10 16:21:59 +01:00
Emil Velikov
3c86315ca3 Update version to 11.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-10 16:17:51 +01:00
Emil Velikov
d0c22560a1 Revert "nouveau: make sure there's always room to emit a fence"
This reverts commit 30570b2629.

As mentioned by Ilia Mirkin:

 Please remove this one from your list of cherry-picked patches. While
  it fixes real issues on nv30 (and probably the other generations too),
  it appears to introduce some new ones on nvc0. I've figured out what's
  causing it, but haven't figured out a proper fix. Not sure I'll be
  able to before you do a release.
2015-10-10 16:15:08 +01:00
Jason Ekstrand
1a866b3e49 mesa: Correctly handle GL_BGRA_EXT in ES3 format_and_type checks
The EXT_texture_format_BGRA8888 extension (which mesa supports
unconditionally) adds a new format and internal format called GL_BGRA_EXT.
Previously, this was not really handled at all in
_mesa_ex3_error_check_format_and_type.  When the checks were tightened in
commit f15a7f3c, we accidentally tightened things too far and GL_BGRA_EXT
would always cause an error to be thrown.

There were two primary issues here.  First, is that
_mesa_es3_effective_internal_format_for_format_and_type didn't handle the
GL_BGRA_EXT format.  Second is that it blindly uses _mesa_base_tex_format
which returns GL_RGBA for GL_BGRA_EXT.  This commit fixes both of these
issues as well as adds explicit checks that GL_BGRA_EXT is only ever used
with GL_BGRA_EXT and GL_UNSIGNED_BYTE.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92265
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6ad9ebb073)
2015-10-10 16:14:12 +01:00
Michel Dänzer
b1230e3e01 st/dri: Use packed RGB formats
Fixes Gallium based DRI drivers failing to load on big endian hosts
because they can't find any matching fbconfigs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 87c3c9acd2)
Nominated-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-07 16:42:01 +01:00
Varad Gautam
d09b37e7d5 egl: restore surface type before linking config to its display
commit c2c2e9a (egl: implement EGL_KHR_gl_colorspace (v2)) leaves
_EGLConfig->SurfaceType set incorrectly before calling _eglLinkConfig(),
and the bad value is passed around to platform_android. set it to zero
as earlier.

v2: Set SurfaceType to 0, rather than surface_type (Suggested by Emil)

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596
Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit f988eff379)
2015-10-07 15:21:10 +01:00
Ilia Mirkin
30570b2629 nouveau: make sure there's always room to emit a fence
I started seeing a lot of situations on nv30 where fence emission
wouldn't fit into the previous buffer (causing assertions). This ensures
that whenever checking for space, we always leave a bit of extra room
for the fence emission commands. Adjusts the nv30 and nvc0 fence
emission logic to bypass the space checking as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 47d11990b2)
2015-10-07 14:52:55 +01:00
Ilia Mirkin
f114967ca9 nv30: always go through translate module on big-endian
It seems like things are either coming in slighly wrong, or perhaps
uploaded incorrectly, but either way passing them through the translate
module seems to fix everything. Eventually we should figure out what's
going wrong and fix it "for real", but this should do for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 78ec9e28ec)
2015-10-07 14:52:29 +01:00
Ilia Mirkin
39a3871b1e nv30: pretend to have packed texture/surface formats
This puts us in line with what the DDX/DRI2 st are expecting. It also
happens to work... no idea why, but seems better to have it work than to
ask lots of questions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1fec05d114)
2015-10-07 14:52:04 +01:00
Marek Olšák
28373c75ba egl/dri2: don't require a context for ClientWaitSync (v2)
The spec doesn't require it. This fixes a crash on Android.

v2: don't set any flags if ctx == NULL
v3: add the spec note

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 18123a732b)
2015-10-07 14:51:39 +01:00
Marek Olšák
eabc656324 st/dri: don't use _ctx in client_wait_sync
Not needed and it can be NULL.

v2: fix dri2_get_fence_from_cl_event - thanks Albert

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
(cherry picked from commit b78336085b)
2015-10-07 14:51:14 +01:00
Matthew Waters
1f2d007e49 egl: rework handling EGL_CONTEXT_FLAGS
As of version 15 of the EGL_KHR_create_context spec, debug contexts
are allowed for ES contexts.  We should allow creation instead of
erroring.

While we're here provide a more comprehensive checking for the other two
flags - ROBUST_ACCESS_BIT_KHR and FORWARD_COMPATIBLE_BIT_KHR

v2 [Emil Velikov] Rebase. Minor tweak in commit message.

Cc: Boyan Ding <boyan.j.ding@gmail.com>
Cc: Chad Versace <chad.versace@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91044
Signed-off-by: Matthew Waters <ystreet00@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 11cabc45b7)
2015-10-07 14:50:48 +01:00
Tom Stellard
00425de657 radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2
This fixes a race condition in the glx-multithreaded-shader-compile
test.

v2:
  - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2e1e3d325)
2015-10-07 14:50:23 +01:00
Tom Stellard
16d9e62107 gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2
Drivers and state trackers that use LLVM for generating code, must
register the targets they use with LLVM's global TargetRegistry.
The TargetRegistry is not thread-safe, so all targets must be added
to the registry before it can be queried for target information.

When drivers and state trackers initialize their own targets, they need
a way to force gallivm to initialize its targets at the same time.
Otherwise, there can be a race condition in some multi-threaded
applications (e.g. glx-multihreaded-shader-compile in piglit),
when one thread creates a context for a driver that uses LLVM (e.g.
radeonsi) and another thread creates a gallivm context (glxContextCreate
does this).

The race happens when the driver thread initializes its LLVM targets and
then starts using the registry before the gallivm thread has a chance to
register its targets.

This patch allows users to force gallivm to register its targets by
calling the gallivm_init_llvm_targets() function.

v2:
  - Use call_once and remove mutexes and static initializations.
  - Replace gallivm_init_llvm_{begin,end}() with
    gallivm_init_llvm_targets().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 76cfd6f1da)
2015-10-07 14:49:58 +01:00
Tom Stellard
776bcb2042 gallium/radeon: Use call_once() when initailizing LLVM targets
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3219b48ae5)
2015-10-07 14:49:32 +01:00
Kyle Brenneman
ac75afff88 glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3)
Add a macro GL_LIB_NAME to hold the filename that configure comes up with
based on the --with-gl-lib-name and --enable-mangling options.

In driOpenDriver, use the GL_LIB_NAME macro instead of hard-coding
"libGL.so.1".

v2: Add an #ifndef/#define for GL_LIB_NAME so that non-autoconf builds will
    work.
v3: Fix the library filename in the Makefile.

Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d35391cfda)
2015-10-07 14:49:07 +01:00
Kyle Brenneman
de936892db mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix.
When USE_MGL_NAMESPACE is defined, _glapi_get_stub will check for the "m"
prefix before trying to skip it, so that "glFoo" and "mglFoo" are
equivalent.

This should let it work with all the places where something calls
_glapi_get_proc_offset with a hard-coded name that starts with the normal
"gl" prefix.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 798f260a2f)
2015-10-07 14:48:41 +01:00
Kyle Brenneman
b2a04cfcc2 glx: Fix build errors with --enable-mangling (v2)
Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up
the renames from glx_mangle.h.

Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is
defined.

v2: Add a comment clarifying why GLX_ALIAS needs two macros.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a27f2d991b)
2015-10-07 14:48:15 +01:00
Daniel Scharrer
dca86265a2 mesa: Add abs input modifier to base for POW in ffvertex_prog
The result of POW for a negative base is undefined. Even when the result
is multiplied by zero (which is the case here whenever the base is
negative), the Inf and NaNs can propagate past that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Signed-off-by: Daniel Scharrer <daniel@constexpr.org>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit b3f9c5cc0f)
2015-10-07 14:47:50 +01:00
Ian Romanick
d0684f3d58 meta: Handle array textures in scaled MSAA blits
The old code had some significant problems with respect to
sampler2DArray textures.  The biggest problem was that some of the code
would use vec3 for the texture coordinate type, and other parts of the
code would use vec2.  The resulting shader would not even compile.
Since there were not tests for this path, nobody noticed.

The input to the fragment shader is always treated as a vec3.  If the
source data is only vec2, the vertex puller will supply 0 for the .z
component.  The texture coordinate passed to the fragment shader is
always a vec2 that comes from the .xy part of the vertex shader input.
The layer, taken from the .z of the vertex shader input is passed
separately as a flat integer.  If the generated fragment shader does not
use the layer integer, the GLSL linker will eliminate all the dead code
in the vertex shader.

Fixes the new piglit tests "blit-scaled samples=2 with
gl_texture_2d_multisample_array", etc. on i965.

Note for stable maintainer: This patch may depend on 46037237, and that
patch should be safe for stable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9bd9cf1fa4)
2015-10-07 14:47:23 +01:00
Ville Syrjälä
7d78578b06 i915: Remember to call intel_prepare_render() before blitting
Bring over the following fix from i965:
 commit fb3d62fe3d
 Author: Kenneth Graunke <kenneth@whitecape.org>
 Date:   Tue Aug 6 14:36:09 2013 -0700

    i965: Remember to call intel_prepare_render() before blitting.

Fixes a crash in the following piglit tests:
 bin/fbo-sys-blit -auto
 bin/fbo-sys-sub-blit -auto

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1a3f0961b)
2015-10-07 14:46:56 +01:00
Ville Syrjälä
88ed45b033 i915: Fix texcoord vs. varying collision in fragment programs
i915 fragment programs utilize the texture coordinate registers
for both texture coordinates and varyings. Unfortunately the
code doesn't check if the same index might be in use for both.
It just naively uses the index to pick a texture unit, which
could lead to collisions.

Add an extra mapping step to allocate non conflicting texture
units for both uses.

The issue can be reproduced with a pair of simple shaders like
these:
 attribute vec4 in_mod;
 varying vec4 mod;
 void main() {
   mod = in_mod;
   gl_TexCoord[0] = gl_MultiTexCoord0;
   gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
 }

 varying vec4 mod;
 uniform sampler2D tex;
 void main() {
   gl_FragColor = texture2D(tex, vec2(gl_TexCoord[0])) * mod;
 }

Fixes many piglit tests on i915:

    glsl-link-varyings-2
    glsl-orangebook-ch06-bump
    interpolation-none-gl_frontcolor-smooth-fixed
    interpolation-none-gl_frontcolor-smooth-none
    interpolation-none-gl_frontcolor-smooth-vertex
    interpolation-none-gl_frontsecondarycolor-smooth-fixed
    interpolation-none-gl_frontsecondarycolor-smooth-vertex
    interpolation-none-gl_frontsecondarycolor-smooth-none
    interpolation-none-other-flat-fixed
    interpolation-none-other-flat-none
    interpolation-none-other-flat-vertex
    interpolation-none-other-smooth-fixed
    interpolation-none-other-smooth-none
    interpolation-none-other-smooth-vertex

v2 [idr]: Minor formatting tweaks.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c349031c27)
2015-10-07 14:46:29 +01:00
Ville Syrjälä
fbcd36ddb6 i830: Fix collision between I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0)
I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0) are trying to occupy
the same bit. Move the texture bits upwards a bit to make room for
I830_UPLOAD_RASTER_RULES.

Now the driver will actually upload the raster rules which is rather
important to get the provoking vertex right. Fixes the appearance
of glxgears teeth on gen2.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9504740f3e)
2015-10-07 14:46:00 +01:00
Brian Paul
531309a5f0 st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats
For 8-bit RGB(A) texture formats we set the PIPE_BIND_RENDER_TARGET flag
to try to get a hardware format which also supports rendering (for FBO
textures).  Do the same thing for floating point formats.

This allows the Redway3D Flat demo to run.

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit cb758b892a)
2015-10-07 14:45:34 +01:00
Ilia Mirkin
0ae914f65d nouveau: wait to unref the transfer's bo until it's no longer used
The bo will often come from a slab in which case it doesn't matter. But
for larger allocations this will be in its own bo, and we have to make
sure to wait until it's no longer used in order for it to be freed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
(cherry picked from commit 1d8cba9b51)
2015-10-07 14:45:08 +01:00
Ilia Mirkin
b2c8b0e546 nouveau: delay deleting buffer with unflushed fence
If there is an unflushed fence on the bo, then the resource may still be
used in commands built up in the local pushbuf. Flushing can cause all
sorts of unwanted effects, so just free the bo when the relevant fence
is hit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
(cherry picked from commit 3a6b9a7830)
2015-10-07 14:44:40 +01:00
Ilia Mirkin
d6ee06e9fe nouveau: be more careful about freeing temporary transfer buffers
Deleting a buffer does not flush the command stream. Make sure that we
wait for the copies to finish before deleting the temporary bo.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
(cherry picked from commit d4e650b07b)
2015-10-07 14:44:15 +01:00
Francisco Jerez
7b8b044ee4 i965/fs: Fix hang on IVB and VLV with image format mismatch.
IVB and VLV hang sporadically when an untyped surface read or write
message is used to access a surface of format other than RAW, as may
happen when there is a mismatch between the format qualifier of the
image uniform and the format of the actual image bound to the
pipeline.  According to the spec this condition gives undefined
results but may not lead to program termination (which is one of the
possible outcomes of the hang).  Fix it by checking at runtime whether
the surface is of the right type.

Fixes the "arb_shader_image_load_store.invalid/format mismatch" piglit
subtest.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91718
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit b61292296b)
2015-10-07 14:43:49 +01:00
Marek Olšák
ec7cda29b6 radeonsi: add scratch buffer to the buffer list when it's re-allocated
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9932142192)
2015-10-07 14:43:19 +01:00
Leo Liu
ab68081ffb radeon/vce: fix vui time_scale zero error
if app pass 0 as frame_rate_num, it should not be encoded to the VUI.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1e97b41893)
2015-10-07 14:42:54 +01:00
Roland Scheidegger
46dc4946a2 mesa: fix mipmap generation for immutable, compressed textures
If the immutable compressed texture didn't have the full mip pyramid,
this didn't work, because it tried to generate mip levels for non-existing
levels. _mesa_prepare_mipmap_level() would correctly handle this by returning
FALSE if the mip level didn't exist, however we actually created the
non-existing mip level right before that because we used _mesa_get_tex_image()
before calling _mesa_prepare_mipmap_level(). It would then proceed to crash
(we allocated the mip level, which is a bad idea on an immutable texture,
but didn't initialize the values, leading to assertion failures or segfaults).
Fix this by using _mesa_select_tex_image() instead and call it after
_mesa_prepare_mipmap_level(), as that function will allocate missing mip levels
for non-immutable textures already.
This fixes a (2 year old) crash with astromenace which was hack-fixed in ubuntu
packages instead: http://bugs.debian.org/718680 (I guess most apps do full mip
chains - I believe this app not doing it is actually unintentional, always one
level less than full mip chain...).

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 19604d30e1)
2015-10-07 14:42:27 +01:00
Marek Olšák
01e197c21a gallium/u_blitter: handle allocation failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 7bbce21e45)
2015-10-07 14:41:58 +01:00
Marek Olšák
0c5aacf446 radeonsi: handle dummy constant buffer allocation failure
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit ae418a7b56)
2015-10-07 14:41:30 +01:00
Marek Olšák
fb5dd33166 radeonsi: don't forget to update scratch relocations for LS, HS, ES shaders
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit b737d9c1dc)
2015-10-07 14:41:03 +01:00
Marek Olšák
b2d3012e35 radeonsi: skip drawing if updating the scratch buffer fails
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit d556346b35)
2015-10-07 14:40:36 +01:00
Marek Olšák
154573e427 radeonsi: skip drawing if PS fails to compile or upload
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 1f99b0be7e)
2015-10-07 14:40:08 +01:00
Marek Olšák
7e64e887f0 radeonsi: skip drawing if VS, TCS, TES, GS fail to compile or upload
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 237d7cccce)
2015-10-07 14:39:41 +01:00
Marek Olšák
10382380f0 radeonsi: handle fixed-func TCS shader create failure
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 9b6d9dd7d8)
2015-10-07 14:39:14 +01:00
Marek Olšák
815b595b5f radeonsi: handle shader precompile failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 5dbadb0257)
2015-10-07 14:38:46 +01:00
Marek Olšák
4e0ae01588 radeonsi: skip drawing if GS ring allocations fail
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 263f5a2cf9)
[Emil Velikov: Track gs_rings over gsvs_ring. NULL check/FREE gs_rings.]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_shaders.c
2015-10-07 14:36:53 +01:00
Marek Olšák
33ed153214 radeonsi: skip drawing if the tess factor ring allocation fails
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 22d3ccf5a8)
[Emil Velikov: Track tf_state over tf_ring. NULL check/FREE tf_state.]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_shaders.c
2015-10-07 14:36:30 +01:00
Marek Olšák
3cd7493f11 radeonsi: add malloc fail paths to si_create_shader_state
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 5c219ab552)
2015-10-07 14:10:31 +01:00
Marek Olšák
dacccf8e22 radeonsi: report alloc failure from si_shader_binary_read
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 394d67a58f)
2015-10-07 14:10:03 +01:00
Marek Olšák
288d9a06cc gallium/radeon: add a fail path for depth MSAA texture readback
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit dea834e639)
2015-10-07 14:09:37 +01:00
Marek Olšák
62ac723a34 gallium/radeon: handle buffer alloc failures in r600_draw_rectangle
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit f95e695059)
2015-10-07 14:09:11 +01:00
Marek Olšák
766a0b4661 gallium/radeon: handle buffer_map staging buffer failures better
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 282b378012)
2015-10-07 14:08:44 +01:00
Marek Olšák
f2e8b94f84 radeonsi: handle constant buffer alloc failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit cd27ff6a0f)
2015-10-07 14:08:18 +01:00
Marek Olšák
02a631bfbc radeonsi: handle index buffer alloc failures
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 29dff6f676)
2015-10-07 14:07:51 +01:00
Marek Olšák
94e9c52b62 st/mesa: fix front buffer regression after dropping st_validate_state in Blit
Broken by: d082c53249
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92072

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit f3a0819533)
2015-10-07 14:07:14 +01:00
Emil Velikov
4c0b484612 docs: add sha256 checksums for 11.0.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-29 00:19:36 +01:00
Emil Velikov
51e0b06d99 docs: add release notes for 11.0.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-28 20:45:37 +01:00
Emil Velikov
f2bfaa8271 Update version to 11.0.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-28 20:41:32 +01:00
Eduardo Lima Mitev
f15a7f3c6e mesa: Use the effective internal format instead for validation
When validating format+type+internalFormat for texture pixel operations
on GLES3, the effective internal format should be used if the one
specified is an unsized internal format. Page 127, section "3.8 Texturing"
of the GLES 3.0.4 spec says:

    "if internalformat is a base internal format, the effective internal
     format is a sized internal format that is derived from the format and
     type for internal use by the GL. Table 3.12 specifies the mapping of
     format and type to effective internal formats. The effective internal
     format is used by the GL for purposes such as texture completeness or
     type checks for CopyTex* commands. In these cases, the GL is required
     to operate as if the effective internal format was used as the
     internalformat when specifying the texture data."

v2: Per the spec, Luminance8Alpha8, Luminance8 and Alpha8 should not be
considered sized internal formats. Return the corresponding unsize format
instead.

v4: * Improved comments in
      _mesa_es3_effective_internal_format_for_format_and_type().
    * Splitted patch to separate chunk about reordering of
      error_check_subtexture_dimensions() error check, which is not directly
      related with this patch.
v5: Dropped the splitted patch because it was actually a work around 3
    dEQP tests that are buggy:

    dEQP-GLES2.functional.negative_api.texture.texsubimage2d_neg_offset
    dEQP-GLES2.functional.negative_api.texture.texsubimage2d_offset_allowed
    dEQP-GLES2.functional.negative_api.texture.texsubimage2d_neg_wdt_hgt

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 5edd9961c1)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91582
2015-09-28 20:38:41 +01:00
Eduardo Lima Mitev
cfddc456ae mesa: Move _mesa_base_tex_format() from teximage to glformats files
This function will be needed as part of validating the combination of format,
type and internal format of texture pixel operations, which happens in
glformats files. Specifically, we want to be able to obtain the base format
of a resolved effective internal format, to compare it with the original
internal format passed.

Also, since this function deals solely with GL formats, it fits better in
glformats where the rest of similar format functionality rests.

The function is moved as-is, without any modification.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit c6bf1cd146)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/main/teximage.c
	src/mesa/main/teximage.h
2015-09-28 20:35:26 +01:00
Eduardo Lima Mitev
25e2a4136b mesa: Fix order of format+type and internal format checks for glTexImageXD ops
The more specific GLES constrains should be checked after the general
validation performed by _mesa_error_check_format_and_type(). This is also
for consistency with the error checks order of glTexSubImage ops.

v3: The change of order uncovered a bug that regresses a couple of piglit
tests written against OpenGL-ES 1.1 spec, which expects an INVALID_VALUE
instead of the INVALID_ENUM returned by _mesa_error_check_format_and_type()
when an invalid format is passed to glTexImage2D. This version of the patch
accounts for those cases.

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.texture.teximage2d

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 15ab968f62)
2015-09-28 20:29:41 +01:00
Matt Turner
ead4ce53f7 glsl: Expose gl_MaxTess{Control,Evaluation}AtomicCounters.
... with only ARB_shader_atomic_counters.

I expected to see interactions with ARB_tessellation_shader in the
ARB_shader_atomic_counters spec, but they do not exist. It seems that we
should unconditionally expose these variables in the presence of
ARB_shader_atomic_counters:

   gl_MaxTessControlAtomicCounters
   gl_MaxTessEvaluationAtomicCounters

This partially reverts commit da7adb99e8. The commit also affected
gl_MaxTessControlImageUniforms and gl_MaxTessEvaluationImageUniforms
similarly but the ARB_shader_image_load_store spec does list an
interaction with ARB_tessellation_shader.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92095
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d6bb46bbe8)
2015-09-28 20:29:13 +01:00
Kristian Høgsberg Kristensen
dace17bfd4 i965: Respect stride and subreg_offset for ATTR registers
When we assign hw regs to attributes, we don't incorporate the stride
and subreg_offset from the fs_reg. It's rarely used, but the integer
multiplication lowering uses unusual stride and subreg_offset
combination breaks when one source is an attribute.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2ea16966ae)
2015-09-28 20:24:36 +01:00
Emil Velikov
7f1a77ae66 docs: add sha256 checksums for 11.0.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-26 14:08:52 +01:00
112 changed files with 2246 additions and 937 deletions

View File

@@ -1 +1 @@
11.0.1
11.0.4

View File

@@ -106,6 +106,8 @@ AC_SYS_LARGEFILE
LT_PREREQ([2.2])
LT_INIT([disable-static])
AC_CHECK_PROG(RM, rm, [rm -f])
AX_PROG_BISON([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"],
[AC_MSG_ERROR([bison not found - unable to compile glcpp-parse.y])]))

View File

@@ -31,7 +31,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD
6dab262877e12c0546a0e2970c6835a0f217e6d4026ccecb3cd5dd733d1ce867 mesa-11.0.1.tar.gz
43d0dfcd1f1e36f07f8228cd76d90175d3fc74c1ed25d7071794a100a98ef2a6 mesa-11.0.1.tar.xz
</pre>

85
docs/relnotes/11.0.2.html Normal file
View File

@@ -0,0 +1,85 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.2 Release Notes / September 28, 2015</h1>
<p>
Mesa 11.0.2 is a bug fix release which fixes bugs found since the 11.0.1 release.
</p>
<p>
Mesa 11.0.2 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
45170773500d6ae2f9eb93fc85efee69f7c97084411ada4eddf92f78bca56d20 mesa-11.0.2.tar.gz
fce11fb27eb87adf1e620a76455d635c6136dfa49ae58c53b34ef8d0c7b7eae4 mesa-11.0.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91582">Bug 91582</a> - [bisected] Regression in DEQP gles2.functional.negative_api.texture.texsubimage2d_neg_offset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91970">Bug 91970</a> - [BSW regression] dEQP-GLES3.functional.shaders.precision.int.highp_mul_vertex</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92095">Bug 92095</a> - [Regression, bisected] arb_shader_atomic_counters.compiler.builtins.frag</li>
</ul>
<h2>Changes</h2>
<p>Eduardo Lima Mitev (3):</p>
<ul>
<li>mesa: Fix order of format+type and internal format checks for glTexImageXD ops</li>
<li>mesa: Move _mesa_base_tex_format() from teximage to glformats files</li>
<li>mesa: Use the effective internal format instead for validation</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.1</li>
<li>Update version to 11.0.2</li>
</ul>
<p>Kristian Høgsberg Kristensen (1):</p>
<ul>
<li>i965: Respect stride and subreg_offset for ATTR registers</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>glsl: Expose gl_MaxTess{Control,Evaluation}AtomicCounters.</li>
</ul>
</div>
</body>
</html>

185
docs/relnotes/11.0.3.html Normal file
View File

@@ -0,0 +1,185 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.3 Release Notes / October 10, 2015</h1>
<p>
Mesa 11.0.3 is a bug fix release which fixes bugs found since the 11.0.2 release.
</p>
<p>
Mesa 11.0.3 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c2210e3daecc10ed9fdcea500327652ed6effc2f47c4b9cee63fb08f560d7117 mesa-11.0.3.tar.gz
ab2992eece21adc23c398720ef8c6933cb69ea42e1b2611dc09d031e17e033d6 mesa-11.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=55552">Bug 55552</a> - Compile errors with --enable-mangling</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71789">Bug 71789</a> - [r300g] Visuals not found in (default) depth = 24</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91044">Bug 91044</a> - piglit spec/egl_khr_create_context/valid debug flag gles* fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91718">Bug 91718</a> - piglit.spec.arb_shader_image_load_store.invalid causes intermittent GPU HANG</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92072">Bug 92072</a> - Wine breakage since d082c5324 (st/mesa: don't call st_validate_state in BlitFramebuffer)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92265">Bug 92265</a> - Black windows in weston after update mesa to 11.0.2-1</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats</li>
</ul>
<p>Daniel Scharrer (1):</p>
<ul>
<li>mesa: Add abs input modifier to base for POW in ffvertex_prog</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.2</li>
<li>Revert "nouveau: make sure there's always room to emit a fence"</li>
<li>Update version to 11.0.3</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>i965/fs: Fix hang on IVB and VLV with image format mismatch.</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>meta: Handle array textures in scaled MSAA blits</li>
</ul>
<p>Ilia Mirkin (6):</p>
<ul>
<li>nouveau: be more careful about freeing temporary transfer buffers</li>
<li>nouveau: delay deleting buffer with unflushed fence</li>
<li>nouveau: wait to unref the transfer's bo until it's no longer used</li>
<li>nv30: pretend to have packed texture/surface formats</li>
<li>nv30: always go through translate module on big-endian</li>
<li>nouveau: make sure there's always room to emit a fence</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>mesa: Correctly handle GL_BGRA_EXT in ES3 format_and_type checks</li>
</ul>
<p>Kyle Brenneman (3):</p>
<ul>
<li>glx: Fix build errors with --enable-mangling (v2)</li>
<li>mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix.</li>
<li>glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3)</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/vce: fix vui time_scale zero error</li>
</ul>
<p>Marek Olšák (21):</p>
<ul>
<li>st/mesa: fix front buffer regression after dropping st_validate_state in Blit</li>
<li>radeonsi: handle index buffer alloc failures</li>
<li>radeonsi: handle constant buffer alloc failures</li>
<li>gallium/radeon: handle buffer_map staging buffer failures better</li>
<li>gallium/radeon: handle buffer alloc failures in r600_draw_rectangle</li>
<li>gallium/radeon: add a fail path for depth MSAA texture readback</li>
<li>radeonsi: report alloc failure from si_shader_binary_read</li>
<li>radeonsi: add malloc fail paths to si_create_shader_state</li>
<li>radeonsi: skip drawing if the tess factor ring allocation fails</li>
<li>radeonsi: skip drawing if GS ring allocations fail</li>
<li>radeonsi: handle shader precompile failures</li>
<li>radeonsi: handle fixed-func TCS shader create failure</li>
<li>radeonsi: skip drawing if VS, TCS, TES, GS fail to compile or upload</li>
<li>radeonsi: skip drawing if PS fails to compile or upload</li>
<li>radeonsi: skip drawing if updating the scratch buffer fails</li>
<li>radeonsi: don't forget to update scratch relocations for LS, HS, ES shaders</li>
<li>radeonsi: handle dummy constant buffer allocation failure</li>
<li>gallium/u_blitter: handle allocation failures</li>
<li>radeonsi: add scratch buffer to the buffer list when it's re-allocated</li>
<li>st/dri: don't use _ctx in client_wait_sync</li>
<li>egl/dri2: don't require a context for ClientWaitSync (v2)</li>
</ul>
<p>Matthew Waters (1):</p>
<ul>
<li>egl: rework handling EGL_CONTEXT_FLAGS</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>st/dri: Use packed RGB formats</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>mesa: fix mipmap generation for immutable, compressed textures</li>
</ul>
<p>Tom Stellard (3):</p>
<ul>
<li>gallium/radeon: Use call_once() when initailizing LLVM targets</li>
<li>gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2</li>
<li>radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2</li>
</ul>
<p>Varad Gautam (1):</p>
<ul>
<li>egl: restore surface type before linking config to its display</li>
</ul>
<p>Ville Syrjälä (3):</p>
<ul>
<li>i830: Fix collision between I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0)</li>
<li>i915: Fix texcoord vs. varying collision in fragment programs</li>
<li>i915: Remember to call intel_prepare_render() before blitting</li>
</ul>
</div>
</body>
</html>

167
docs/relnotes/11.0.4.html Normal file
View File

@@ -0,0 +1,167 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.4 Release Notes / October 24, 2015</h1>
<p>
Mesa 11.0.4 is a bug fix release which fixes bugs found since the 11.0.3 release.
</p>
<p>
Mesa 11.0.4 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86281">Bug 86281</a> - brw_meta_fast_clear (brw=brw&#64;entry=0x7fffd4097a08, fb=fb&#64;entry=0x7fffd40fa900, buffers=buffers&#64;entry=2, partial_clear=partial_clear&#64;entry=false)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86720">Bug 86720</a> - [radeon] Europa Universalis 4 freezing during game start (10.3.3+, still broken on 11.0.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91788">Bug 91788</a> - [HSW Regression] Synmark2_v6 Multithread performance case FPS reduced by 36%</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92304">Bug 92304</a> - [cts] cts.shaders.negative conformance tests fail</li>
</ul>
<h2>Changes</h2>
<p>Alejandro Piñeiro (2):</p>
<ul>
<li>i965/vec4: check writemask when bailing out at register coalesce</li>
<li>i965/vec4: fill src_reg type using the constructor type parameter</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>vbo: fix incorrect switch statement in init_mat_currval()</li>
<li>mesa: fix incorrect opcode in save_BlendFunci()</li>
</ul>
<p>Chih-Wei Huang (3):</p>
<ul>
<li>mesa: android: Fix the incorrect path of sse_minmax.c</li>
<li>nv50/ir: use C++11 standard std::unordered_map if possible</li>
<li>nv30: include the header of ffs prototype</li>
</ul>
<p>Chris Wilson (1):</p>
<ul>
<li>i965: Remove early release of DRI2 miptree</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>mesa/uniforms: fix get_uniform for doubles (v2)</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.3</li>
</ul>
<p>Francisco Jerez (5):</p>
<ul>
<li>i965: Don't tell the hardware about our UAV access.</li>
<li>mesa: Expose function to calculate whether a shader image unit is valid.</li>
<li>mesa: Skip redundant texture completeness checking during image validation.</li>
<li>i965: Use _mesa_is_image_unit_valid() instead of gl_image_unit::_Valid.</li>
<li>mesa: Get rid of texture-dependent image unit derived state.</li>
</ul>
<p>Ian Romanick (8):</p>
<ul>
<li>glsl: Allow built-in functions as constant expressions in OpenGL ES 1.00</li>
<li>ff_fragment_shader: Use binding to set the sampler unit</li>
<li>glsl/linker: Use constant_initializer instead of constant_value to initialize uniforms</li>
<li>glsl: Use constant_initializer instead of constant_value to determine whether to keep an unused uniform</li>
<li>glsl: Only set ir_variable::constant_value for const-decorated variables</li>
<li>glsl: Restrict initializers for global variables to constant expression in ES</li>
<li>glsl: Add method to determine whether an expression contains the sequence operator</li>
<li>glsl: In later GLSL versions, sequence operator is cannot be a constant expression</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nouveau: make sure there's always room to emit a fence</li>
</ul>
<p>Indrajit Das (1):</p>
<ul>
<li>st/va: Used correct parameter to derive the value of the "h" variable in vlVaCreateImage</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>configure.ac: ensure RM is set</li>
</ul>
<p>Krzysztof Sobiecki (1):</p>
<ul>
<li>st/fbo: use pipe_surface_release instead of pipe_surface_reference</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/dec/h264: fix field picture type 0 poc disorder</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: fix clip state dependencies</li>
<li>radeonsi: fix a GS copy shader leak</li>
<li>gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>u_vbuf: fix vb slot assignment for translated buffers</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno/a3xx: cache-flush is needed after MEM_WRITE</li>
</ul>
<p>Tapani Pälli (3):</p>
<ul>
<li>mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span</li>
<li>mesa: Set api prefix to version string when overriding version</li>
<li>mesa: fix ARRAY_SIZE query for GetProgramResourceiv</li>
</ul>
</div>
</body>
</html>

View File

@@ -312,6 +312,8 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
else
conf->dri_single_config = dri_config;
}
conf->base.SurfaceType = 0;
conf->base.ConfigID = config_id;
_eglLinkConfig(&conf->base);
@@ -2384,13 +2386,18 @@ dri2_client_wait_sync(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync,
unsigned wait_flags = 0;
EGLint ret = EGL_CONDITION_SATISFIED_KHR;
if (flags & EGL_SYNC_FLUSH_COMMANDS_BIT_KHR)
/* The EGL_KHR_fence_sync spec states:
*
* "If no context is current for the bound API,
* the EGL_SYNC_FLUSH_COMMANDS_BIT_KHR bit is ignored.
*/
if (dri2_ctx && flags & EGL_SYNC_FLUSH_COMMANDS_BIT_KHR)
wait_flags |= __DRI2_FENCE_FLAG_FLUSH_COMMANDS;
/* the sync object should take a reference while waiting */
dri2_egl_ref_sync(dri2_sync);
if (dri2_dpy->fence->client_wait_sync(dri2_ctx->dri_context,
if (dri2_dpy->fence->client_wait_sync(dri2_ctx ? dri2_ctx->dri_context : NULL,
dri2_sync->fence, wait_flags,
timeout))
dri2_sync->base.SyncStatus = EGL_SIGNALED_KHR;

View File

@@ -152,12 +152,51 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,
/* The EGL_KHR_create_context spec says:
*
* "Flags are only defined for OpenGL context creation, and
* specifying a flags value other than zero for other types of
* contexts, including OpenGL ES contexts, will generate an
* error."
* "If the EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR flag bit is set in
* EGL_CONTEXT_FLAGS_KHR, then a <debug context> will be created.
* [...]
* In some cases a debug context may be identical to a non-debug
* context. This bit is supported for OpenGL and OpenGL ES
* contexts."
*/
if (api != EGL_OPENGL_API && val != 0) {
if ((val & EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR) &&
(api != EGL_OPENGL_API && api != EGL_OPENGL_ES_API)) {
err = EGL_BAD_ATTRIBUTE;
break;
}
/* The EGL_KHR_create_context spec says:
*
* "If the EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR flag bit
* is set in EGL_CONTEXT_FLAGS_KHR, then a <forward-compatible>
* context will be created. Forward-compatible contexts are
* defined only for OpenGL versions 3.0 and later. They must not
* support functionality marked as <deprecated> by that version of
* the API, while a non-forward-compatible context must support
* all functionality in that version, deprecated or not. This bit
* is supported for OpenGL contexts, and requesting a
* forward-compatible context for OpenGL versions less than 3.0
* will generate an error."
*/
if ((val & EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR) &&
(api != EGL_OPENGL_API || ctx->ClientMajorVersion < 3)) {
err = EGL_BAD_ATTRIBUTE;
break;
}
/* The EGL_KHR_create_context_spec says:
*
* "If the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR bit is set in
* EGL_CONTEXT_FLAGS_KHR, then a context supporting <robust buffer
* access> will be created. Robust buffer access is defined in the
* GL_ARB_robustness extension specification, and the resulting
* context must also support either the GL_ARB_robustness
* extension, or a version of OpenGL incorporating equivalent
* functionality. This bit is supported for OpenGL contexts.
*/
if ((val & EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR) &&
(api != EGL_OPENGL_API ||
!dpy->Extensions.EXT_create_context_robustness)) {
err = EGL_BAD_ATTRIBUTE;
break;
}

View File

@@ -137,6 +137,8 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
/* if we get here, we missed a shader cap above (and should have seen
* a compiler warning.)

View File

@@ -81,6 +81,8 @@
# pragma pop_macro("DEBUG")
#endif
#include "c11/threads.h"
#include "os/os_thread.h"
#include "pipe/p_config.h"
#include "util/u_debug.h"
#include "util/u_cpu_detect.h"
@@ -103,6 +105,33 @@ static LLVMEnsureMultithreaded lLVMEnsureMultithreaded;
}
static once_flag init_native_targets_once_flag;
static void init_native_targets()
{
// If we have a native target, initialize it to ensure it is linked in and
// usable by the JIT.
llvm::InitializeNativeTarget();
llvm::InitializeNativeTargetAsmPrinter();
llvm::InitializeNativeTargetDisassembler();
}
/**
* The llvm target registry is not thread-safe, so drivers and state-trackers
* that want to initialize targets should use the gallivm_init_llvm_targets()
* function to safely initialize targets.
*
* LLVM targets should be initialized before the driver or state-tracker tries
* to access the registry.
*/
extern "C" void
gallivm_init_llvm_targets(void)
{
call_once(&init_native_targets_once_flag, init_native_targets);
}
extern "C" void
lp_set_target_options(void)
{
@@ -115,13 +144,7 @@ lp_set_target_options(void)
llvm::DisablePrettyStackTrace = true;
#endif
// If we have a native target, initialize it to ensure it is linked in and
// usable by the JIT.
llvm::InitializeNativeTarget();
llvm::InitializeNativeTargetAsmPrinter();
llvm::InitializeNativeTargetDisassembler();
gallivm_init_llvm_targets();
}

View File

@@ -41,6 +41,8 @@ extern "C" {
struct lp_generated_code;
extern void
gallivm_init_llvm_targets(void);
extern void
lp_set_target_options(void);

View File

@@ -463,6 +463,8 @@ tgsi_exec_get_shader_param(enum pipe_shader_cap param)
case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
/* if we get here, we missed a shader cap above (and should have seen
* a compiler warning.)

View File

@@ -1190,6 +1190,8 @@ static void blitter_draw(struct blitter_context_priv *ctx,
u_upload_data(ctx->upload, 0, sizeof(ctx->vertices), ctx->vertices,
&vb.buffer_offset, &vb.buffer);
if (!vb.buffer)
return;
u_upload_unmap(ctx->upload);
pipe->set_vertex_buffers(pipe, ctx->base.vb_slot, 1, &vb);
@@ -2089,6 +2091,9 @@ void util_blitter_clear_buffer(struct blitter_context *blitter,
u_upload_data(ctx->upload, 0, num_channels*4, clear_value,
&vb.buffer_offset, &vb.buffer);
if (!vb.buffer)
goto out;
vb.stride = 0;
blitter_set_running_flag(ctx);
@@ -2112,6 +2117,7 @@ void util_blitter_clear_buffer(struct blitter_context *blitter,
util_draw_arrays(pipe, PIPE_PRIM_POINTS, 0, size / 4);
out:
blitter_restore_vertex_states(ctx);
blitter_restore_render_cond(ctx);
blitter_unset_running_flag(ctx);

View File

@@ -545,6 +545,7 @@ u_vbuf_translate_find_free_vb_slots(struct u_vbuf *mgr,
index = ffs(unused_vb_mask) - 1;
fallback_vbs[type] = index;
unused_vb_mask &= ~(1 << index);
/*printf("found slot=%i for type=%i\n", index, type);*/
}
}

View File

@@ -355,6 +355,10 @@ to be 0.
are supported.
* ``PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE``: Whether the driver doesn't
ignore tgsi_declaration_range::Last for shader inputs and outputs.
* ``PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT``: This is the maximum number
of iterations that loops are allowed to have to be unrolled. It is only
a hint to state trackers. Whether any loops will be unrolled is not
guaranteed.
.. _pipe_compute_cap:

View File

@@ -828,11 +828,7 @@ fd3_emit_restore(struct fd_context *ctx)
OUT_RING(ring, A3XX_HLSQ_CONST_FSPRESV_RANGE_REG_STARTENTRY(0) |
A3XX_HLSQ_CONST_FSPRESV_RANGE_REG_ENDENTRY(0));
OUT_PKT0(ring, REG_A3XX_UCHE_CACHE_INVALIDATE0_REG, 2);
OUT_RING(ring, A3XX_UCHE_CACHE_INVALIDATE0_REG_ADDR(0));
OUT_RING(ring, A3XX_UCHE_CACHE_INVALIDATE1_REG_ADDR(0) |
A3XX_UCHE_CACHE_INVALIDATE1_REG_OPCODE(INVALIDATE) |
A3XX_UCHE_CACHE_INVALIDATE1_REG_ENTIRE_CACHE);
fd3_emit_cache_flush(ctx, ring);
OUT_PKT0(ring, REG_A3XX_GRAS_CL_CLIP_CNTL, 1);
OUT_RING(ring, 0x00000000); /* GRAS_CL_CLIP_CNTL */

View File

@@ -90,4 +90,15 @@ void fd3_emit_restore(struct fd_context *ctx);
void fd3_emit_init(struct pipe_context *pctx);
static inline void
fd3_emit_cache_flush(struct fd_context *ctx, struct fd_ringbuffer *ring)
{
fd_wfi(ctx, ring);
OUT_PKT0(ring, REG_A3XX_UCHE_CACHE_INVALIDATE0_REG, 2);
OUT_RING(ring, A3XX_UCHE_CACHE_INVALIDATE0_REG_ADDR(0));
OUT_RING(ring, A3XX_UCHE_CACHE_INVALIDATE1_REG_ADDR(0) |
A3XX_UCHE_CACHE_INVALIDATE1_REG_OPCODE(INVALIDATE) |
A3XX_UCHE_CACHE_INVALIDATE1_REG_ENTIRE_CACHE);
}
#endif /* FD3_EMIT_H */

View File

@@ -558,6 +558,8 @@ fd3_emit_tile_mem2gmem(struct fd_context *ctx, struct fd_tile *tile)
OUT_RING(ring, fui(x1));
OUT_RING(ring, fui(y1));
fd3_emit_cache_flush(ctx, ring);
for (i = 0; i < 4; i++) {
OUT_PKT0(ring, REG_A3XX_RB_MRT_CONTROL(i), 1);
OUT_RING(ring, A3XX_RB_MRT_CONTROL_ROP_CODE(ROP_COPY) |

View File

@@ -407,6 +407,8 @@ fd_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
return 16;
case PIPE_SHADER_CAP_PREFERRED_IR:
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
debug_printf("unknown shader param %d\n", param);
return 0;

View File

@@ -167,6 +167,8 @@ i915_get_shader_param(struct pipe_screen *screen, unsigned shader, enum pipe_sha
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
debug_printf("%s: Unknown cap %u.\n", __FUNCTION__, cap);
return 0;

View File

@@ -138,6 +138,8 @@ ilo_get_shader_param(struct pipe_screen *screen, unsigned shader,
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED:
return 1;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
return 0;

View File

@@ -25,10 +25,24 @@
#include <stack>
#include <limits>
#if __cplusplus >= 201103L
#include <unordered_map>
#else
#include <tr1/unordered_map>
#endif
namespace nv50_ir {
#if __cplusplus >= 201103L
using std::hash;
using std::unordered_map;
#elif !defined(ANDROID)
using std::tr1::hash;
using std::tr1::unordered_map;
#else
#error Android release before Lollipop is not supported!
#endif
#define MAX_REGISTER_FILE_SIZE 256
class RegisterSet
@@ -349,12 +363,12 @@ RegAlloc::PhiMovesPass::needNewElseBlock(BasicBlock *b, BasicBlock *p)
struct PhiMapHash {
size_t operator()(const std::pair<Instruction *, BasicBlock *>& val) const {
return std::tr1::hash<Instruction*>()(val.first) * 31 +
std::tr1::hash<BasicBlock*>()(val.second);
return hash<Instruction*>()(val.first) * 31 +
hash<BasicBlock*>()(val.second);
}
};
typedef std::tr1::unordered_map<
typedef unordered_map<
std::pair<Instruction *, BasicBlock *>, Value *, PhiMapHash> PhiMap;
// Critical edges need to be split up so that work can be inserted along

View File

@@ -80,7 +80,12 @@ release_allocation(struct nouveau_mm_allocation **mm,
inline void
nouveau_buffer_release_gpu_storage(struct nv04_resource *buf)
{
nouveau_bo_ref(NULL, &buf->bo);
if (buf->fence && buf->fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
nouveau_fence_work(buf->fence, nouveau_fence_unref_bo, buf->bo);
buf->bo = NULL;
} else {
nouveau_bo_ref(NULL, &buf->bo);
}
if (buf->mm)
release_allocation(&buf->mm, buf->fence);
@@ -281,7 +286,8 @@ nouveau_buffer_transfer_del(struct nouveau_context *nv,
{
if (tx->map) {
if (likely(tx->bo)) {
nouveau_bo_ref(NULL, &tx->bo);
nouveau_fence_work(nv->screen->fence.current,
nouveau_fence_unref_bo, tx->bo);
if (tx->mm)
release_allocation(&tx->mm, nv->screen->fence.current);
} else {
@@ -782,7 +788,7 @@ nouveau_buffer_migrate(struct nouveau_context *nv,
nv->copy_data(nv, buf->bo, buf->offset, new_domain,
bo, offset, old_domain, buf->base.width0);
nouveau_bo_ref(NULL, &bo);
nouveau_fence_work(screen->fence.current, nouveau_fence_unref_bo, bo);
if (mm)
release_allocation(&mm, screen->fence.current);
} else

View File

@@ -190,8 +190,14 @@ nouveau_fence_wait(struct nouveau_fence *fence)
/* wtf, someone is waiting on a fence in flush_notify handler? */
assert(fence->state != NOUVEAU_FENCE_STATE_EMITTING);
if (fence->state < NOUVEAU_FENCE_STATE_EMITTED)
nouveau_fence_emit(fence);
if (fence->state < NOUVEAU_FENCE_STATE_EMITTED) {
PUSH_SPACE(screen->pushbuf, 8);
/* The space allocation might trigger a flush, which could emit the
* current fence. So check again.
*/
if (fence->state < NOUVEAU_FENCE_STATE_EMITTED)
nouveau_fence_emit(fence);
}
if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED)
if (nouveau_pushbuf_kick(screen->pushbuf, screen->pushbuf->channel))
@@ -224,10 +230,22 @@ nouveau_fence_wait(struct nouveau_fence *fence)
void
nouveau_fence_next(struct nouveau_screen *screen)
{
if (screen->fence.current->state < NOUVEAU_FENCE_STATE_EMITTING)
nouveau_fence_emit(screen->fence.current);
if (screen->fence.current->state < NOUVEAU_FENCE_STATE_EMITTING) {
if (screen->fence.current->ref > 1)
nouveau_fence_emit(screen->fence.current);
else
return;
}
nouveau_fence_ref(NULL, &screen->fence.current);
nouveau_fence_new(screen, &screen->fence.current, false);
}
void
nouveau_fence_unref_bo(void *data)
{
struct nouveau_bo *bo = data;
nouveau_bo_ref(NULL, &bo);
}

View File

@@ -37,6 +37,9 @@ void nouveau_fence_next(struct nouveau_screen *);
bool nouveau_fence_wait(struct nouveau_fence *);
bool nouveau_fence_signalled(struct nouveau_fence *);
void nouveau_fence_unref_bo(void *data); /* generic unref bo callback */
static inline void
nouveau_fence_ref(struct nouveau_fence *fence, struct nouveau_fence **ref)
{

View File

@@ -24,6 +24,8 @@ PUSH_AVAIL(struct nouveau_pushbuf *push)
static inline bool
PUSH_SPACE(struct nouveau_pushbuf *push, uint32_t size)
{
/* Provide a buffer so that fences always have room to be emitted */
size += 8;
if (PUSH_AVAIL(push) < size)
return nouveau_pushbuf_space(push, size, 0, 0) == 0;
return true;

View File

@@ -78,12 +78,12 @@ nv30_format_info_table[PIPE_FORMAT_COUNT] = {
_(B4G4R4X4_UNORM , S___),
_(B4G4R4A4_UNORM , S___),
_(B5G6R5_UNORM , SB__),
_(B8G8R8X8_UNORM , SB__),
_(B8G8R8X8_SRGB , S___),
_(B8G8R8A8_UNORM , SB__),
_(B8G8R8A8_SRGB , S___),
_(BGRX8888_UNORM , SB__),
_(BGRX8888_SRGB , S___),
_(BGRA8888_UNORM , SB__),
_(BGRA8888_SRGB , S___),
_(R8G8B8A8_UNORM , __V_),
_(R8G8B8A8_SNORM , S___),
_(RGBA8888_SNORM , S___),
_(DXT1_RGB , S___),
_(DXT1_SRGB , S___),
_(DXT1_RGBA , S___),
@@ -138,8 +138,8 @@ const struct nv30_format
nv30_format_table[PIPE_FORMAT_COUNT] = {
R_(B5G5R5X1_UNORM , X1R5G5B5 ),
R_(B5G6R5_UNORM , R5G6B5 ),
R_(B8G8R8X8_UNORM , X8R8G8B8 ),
R_(B8G8R8A8_UNORM , A8R8G8B8 ),
R_(BGRX8888_UNORM , X8R8G8B8 ),
R_(BGRA8888_UNORM , A8R8G8B8 ),
Z_(Z16_UNORM , Z16 ),
Z_(X8Z24_UNORM , Z24S8 ),
Z_(S8_UINT_Z24_UNORM , Z24S8 ),
@@ -223,11 +223,11 @@ nv30_texfmt_table[PIPE_FORMAT_COUNT] = {
_(B4G4R4X4_UNORM , A4R4G4B4, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(B4G4R4A4_UNORM , A4R4G4B4, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
_(B5G6R5_UNORM , R5G6B5 , 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(B8G8R8X8_UNORM , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(B8G8R8X8_SRGB , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
_(B8G8R8A8_UNORM , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
_(B8G8R8A8_SRGB , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, SRGB, ____),
_(R8G8B8A8_SNORM , A8R8G8B8, 0, C, C, C, C, 0, 1, 2, 3, NONE, SSSS),
_(BGRX8888_UNORM , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(BGRX8888_SRGB , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
_(BGRA8888_UNORM , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
_(BGRA8888_SRGB , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, SRGB, ____),
_(RGBA8888_SNORM , A8R8G8B8, 0, C, C, C, C, 0, 1, 2, 3, NONE, SSSS),
_(DXT1_RGB , DXT1 , 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
_(DXT1_SRGB , DXT1 , 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
_(DXT1_RGBA , DXT1 , 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),

View File

@@ -339,10 +339,15 @@ nv30_miptree_transfer_unmap(struct pipe_context *pipe,
struct nv30_context *nv30 = nv30_context(pipe);
struct nv30_transfer *tx = nv30_transfer(ptx);
if (ptx->usage & PIPE_TRANSFER_WRITE)
if (ptx->usage & PIPE_TRANSFER_WRITE) {
nv30_transfer_rect(nv30, NEAREST, &tx->tmp, &tx->img);
nouveau_bo_ref(NULL, &tx->tmp.bo);
/* Allow the copies above to finish executing before freeing the source */
nouveau_fence_work(nv30->screen->base.fence.current,
nouveau_fence_unref_bo, tx->tmp.bo);
} else {
nouveau_bo_ref(NULL, &tx->tmp.bo);
}
pipe_resource_reference(&ptx->resource, NULL);
FREE(tx);
}

View File

@@ -261,6 +261,8 @@ nv30_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
debug_printf("unknown vertex shader param %d\n", param);
return 0;
@@ -302,6 +304,8 @@ nv30_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
debug_printf("unknown fragment shader param %d\n", param);
return 0;
@@ -345,7 +349,9 @@ nv30_screen_fence_emit(struct pipe_screen *pscreen, uint32_t *sequence)
*sequence = ++screen->base.fence.sequence;
BEGIN_NV04(push, NV30_3D(FENCE_OFFSET), 2);
assert(PUSH_AVAIL(push) >= 3);
PUSH_DATA (push, NV30_3D_FENCE_OFFSET |
(2 /* size */ << 18) | (7 /* subchan */ << 13));
PUSH_DATA (push, 0);
PUSH_DATA (push, *sequence);
}

View File

@@ -191,7 +191,11 @@ nv30_vbo_validate(struct nv30_context *nv30)
if (!nv30->vertex || nv30->draw_flags)
return;
#ifdef PIPE_ARCH_BIG_ENDIAN
if (1) { /* Figure out where the buffers are getting messed up */
#else
if (unlikely(vertex->need_conversion)) {
#endif
nv30->vbo_fifo = ~0;
nv30->vbo_user = 0;
} else {

View File

@@ -1,3 +1,4 @@
#include <strings.h>
#include "pipe/p_context.h"
#include "pipe/p_defines.h"
#include "pipe/p_state.h"

View File

@@ -163,7 +163,10 @@ nv50_miptree_destroy(struct pipe_screen *pscreen, struct pipe_resource *pt)
{
struct nv50_miptree *mt = nv50_miptree(pt);
nouveau_bo_ref(NULL, &mt->base.bo);
if (mt->base.fence && mt->base.fence->state < NOUVEAU_FENCE_STATE_FLUSHED)
nouveau_fence_work(mt->base.fence, nouveau_fence_unref_bo, mt->base.bo);
else
nouveau_bo_ref(NULL, &mt->base.bo);
nouveau_fence_ref(NULL, &mt->base.fence);
nouveau_fence_ref(NULL, &mt->base.fence_wr);

View File

@@ -297,6 +297,8 @@ nv50_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
NOUVEAU_ERR("unknown PIPE_SHADER_CAP %d\n", param);
return 0;
@@ -386,6 +388,7 @@ nv50_screen_fence_emit(struct pipe_screen *pscreen, u32 *sequence)
/* we need to do it after possible flush in MARK_RING */
*sequence = ++screen->base.fence.sequence;
assert(PUSH_AVAIL(push) >= 5);
PUSH_DATA (push, NV50_FIFO_PKHDR(NV50_3D(QUERY_ADDRESS_HIGH), 4));
PUSH_DATAh(push, screen->fence.bo->offset);
PUSH_DATA (push, screen->fence.bo->offset);

View File

@@ -65,14 +65,9 @@ nv50_constbufs_validate(struct nv50_context *nv50)
PUSH_DATA (push, (b << 12) | (i << 8) | p | 1);
}
while (words) {
unsigned nr;
if (!PUSH_SPACE(push, 16))
break;
nr = PUSH_AVAIL(push);
assert(nr >= 16);
nr = MIN2(MIN2(nr - 3, words), NV04_PFIFO_MAX_PACKET_LEN);
unsigned nr = MIN2(words, NV04_PFIFO_MAX_PACKET_LEN);
PUSH_SPACE(push, nr + 3);
BEGIN_NV04(push, NV50_3D(CB_ADDR), 1);
PUSH_DATA (push, (start << 8) | b);
BEGIN_NI04(push, NV50_3D(CB_DATA(0)), nr);

View File

@@ -187,14 +187,7 @@ nv50_sifc_linear_u8(struct nouveau_context *nv,
PUSH_DATA (push, 0);
while (count) {
unsigned nr;
if (!PUSH_SPACE(push, 16))
break;
nr = PUSH_AVAIL(push);
assert(nr >= 16);
nr = MIN2(count, nr - 1);
nr = MIN2(nr, NV04_PFIFO_MAX_PACKET_LEN);
unsigned nr = MIN2(count, NV04_PFIFO_MAX_PACKET_LEN);
BEGIN_NI04(push, NV50_2D(SIFC_DATA), nr);
PUSH_DATAp(push, src, nr);
@@ -365,9 +358,14 @@ nv50_miptree_transfer_unmap(struct pipe_context *pctx,
tx->rect[0].base += mt->layer_stride;
tx->rect[1].base += tx->nblocksy * tx->base.stride;
}
/* Allow the copies above to finish executing before freeing the source */
nouveau_fence_work(nv50->screen->base.fence.current,
nouveau_fence_unref_bo, tx->rect[1].bo);
} else {
nouveau_bo_ref(NULL, &tx->rect[1].bo);
}
nouveau_bo_ref(NULL, &tx->rect[1].bo);
pipe_resource_reference(&transfer->resource, NULL);
FREE(tx);
@@ -390,12 +388,9 @@ nv50_cb_push(struct nouveau_context *nv,
nouveau_pushbuf_validate(push);
while (words) {
unsigned nr;
nr = PUSH_AVAIL(push);
nr = MIN2(nr - 7, words);
nr = MIN2(nr, NV04_PFIFO_MAX_PACKET_LEN - 1);
unsigned nr = MIN2(words, NV04_PFIFO_MAX_PACKET_LEN);
PUSH_SPACE(push, nr + 7);
BEGIN_NV04(push, NV50_3D(CB_DEF_ADDRESS_HIGH), 3);
PUSH_DATAh(push, bo->offset + base);
PUSH_DATA (push, bo->offset + base);

View File

@@ -310,6 +310,8 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
return 16; /* would be 32 in linked (OpenGL-style) mode */
case PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS:
return 16; /* XXX not sure if more are really safe */
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
NOUVEAU_ERR("unknown PIPE_SHADER_CAP %d\n", param);
return 0;
@@ -535,7 +537,8 @@ nvc0_screen_fence_emit(struct pipe_screen *pscreen, u32 *sequence)
/* we need to do it after possible flush in MARK_RING */
*sequence = ++screen->base.fence.sequence;
BEGIN_NVC0(push, NVC0_3D(QUERY_ADDRESS_HIGH), 4);
assert(PUSH_AVAIL(push) >= 5);
PUSH_DATA (push, NVC0_FIFO_PKHDR_SQ(NVC0_3D(QUERY_ADDRESS_HIGH), 4));
PUSH_DATAh(push, screen->fence.bo->offset);
PUSH_DATA (push, screen->fence.bo->offset);
PUSH_DATA (push, *sequence);

View File

@@ -188,14 +188,10 @@ nvc0_m2mf_push_linear(struct nouveau_context *nv,
nouveau_pushbuf_validate(push);
while (count) {
unsigned nr;
unsigned nr = MIN2(count, NV04_PFIFO_MAX_PACKET_LEN);
if (!PUSH_SPACE(push, 16))
if (!PUSH_SPACE(push, nr + 9))
break;
nr = PUSH_AVAIL(push);
assert(nr >= 16);
nr = MIN2(count, nr - 9);
nr = MIN2(nr, NV04_PFIFO_MAX_PACKET_LEN);
BEGIN_NVC0(push, NVC0_M2MF(OFFSET_OUT_HIGH), 2);
PUSH_DATAh(push, dst->offset + offset);
@@ -234,14 +230,10 @@ nve4_p2mf_push_linear(struct nouveau_context *nv,
nouveau_pushbuf_validate(push);
while (count) {
unsigned nr;
unsigned nr = MIN2(count, (NV04_PFIFO_MAX_PACKET_LEN - 1));
if (!PUSH_SPACE(push, 16))
if (!PUSH_SPACE(push, nr + 10))
break;
nr = PUSH_AVAIL(push);
assert(nr >= 16);
nr = MIN2(count, nr - 8);
nr = MIN2(nr, (NV04_PFIFO_MAX_PACKET_LEN - 1));
BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
PUSH_DATAh(push, dst->offset + offset);
@@ -495,11 +487,16 @@ nvc0_miptree_transfer_unmap(struct pipe_context *pctx,
tx->rect[1].base += tx->nblocksy * tx->base.stride;
}
NOUVEAU_DRV_STAT(&nvc0->screen->base, tex_transfers_wr, 1);
/* Allow the copies above to finish executing before freeing the source */
nouveau_fence_work(nvc0->screen->base.fence.current,
nouveau_fence_unref_bo, tx->rect[1].bo);
} else {
nouveau_bo_ref(NULL, &tx->rect[1].bo);
}
if (tx->base.usage & PIPE_TRANSFER_READ)
NOUVEAU_DRV_STAT(&nvc0->screen->base, tex_transfers_rd, 1);
nouveau_bo_ref(NULL, &tx->rect[1].bo);
pipe_resource_reference(&transfer->resource, NULL);
FREE(tx);
@@ -566,9 +563,7 @@ nvc0_cb_bo_push(struct nouveau_context *nv,
PUSH_DATA (push, bo->offset + base);
while (words) {
unsigned nr = PUSH_AVAIL(push);
nr = MIN2(nr, words);
nr = MIN2(nr, NV04_PFIFO_MAX_PACKET_LEN - 1);
unsigned nr = MIN2(words, NV04_PFIFO_MAX_PACKET_LEN - 1);
PUSH_SPACE(push, nr + 2);
PUSH_REFN (push, bo, NOUVEAU_BO_WR | domain);

View File

@@ -300,6 +300,8 @@ static int r300_get_shader_param(struct pipe_screen *pscreen, unsigned shader, e
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
case PIPE_SHADER_CAP_PREFERRED_IR:
return PIPE_SHADER_IR_TGSI;
}
@@ -356,6 +358,8 @@ static int r300_get_shader_param(struct pipe_screen *pscreen, unsigned shader, e
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
case PIPE_SHADER_CAP_PREFERRED_IR:
return PIPE_SHADER_IR_TGSI;
}

View File

@@ -504,6 +504,12 @@ static int r600_get_shader_param(struct pipe_screen* pscreen, unsigned shader, e
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
/* due to a bug in the shader compiler, some loops hang
* if they are not unrolled, see:
* https://bugs.freedesktop.org/show_bug.cgi?id=86720
*/
return 255;
}
return 0;
}

View File

@@ -305,12 +305,11 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx,
data += box->x % R600_MAP_BUFFER_ALIGNMENT;
return r600_buffer_get_transfer(ctx, resource, level, usage, box,
ptransfer, data, staging, offset);
} else {
return NULL; /* error, shouldn't occur though */
}
} else {
/* At this point, the buffer is always idle (we checked it above). */
usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
}
/* At this point, the buffer is always idle (we checked it above). */
usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
}
/* Using a staging buffer in GTT for larger reads is much faster. */
else if ((usage & PIPE_TRANSFER_READ) &&

View File

@@ -78,6 +78,9 @@ void r600_draw_rectangle(struct blitter_context *blitter,
* I guess the 4th one is derived from the first 3.
* The vertex specification should match u_blitter's vertex element state. */
u_upload_alloc(rctx->uploader, 0, sizeof(float) * 24, &offset, &buf, (void**)&vb);
if (!buf)
return;
vb[0] = x1;
vb[1] = y1;
vb[2] = depth;

View File

@@ -989,6 +989,11 @@ static void *r600_texture_transfer_map(struct pipe_context *ctx,
if (usage & PIPE_TRANSFER_READ) {
struct pipe_resource *temp = ctx->screen->resource_create(ctx->screen, &resource);
if (!temp) {
R600_ERR("failed to create a temporary depth texture\n");
FREE(trans);
return NULL;
}
r600_copy_region_with_blit(ctx, temp, 0, 0, 0, 0, texture, level, box);
rctx->blit_decompress_depth(ctx, (struct r600_texture*)temp, staging_depth,

View File

@@ -25,6 +25,8 @@
*/
#include "radeon_llvm_emit.h"
#include "radeon_elf_util.h"
#include "c11/threads.h"
#include "gallivm/lp_bld_misc.h"
#include "util/u_memory.h"
#include "pipe/p_shader_tokens.h"
@@ -86,30 +88,29 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
static void init_r600_target()
{
static unsigned initialized = 0;
if (!initialized) {
gallivm_init_llvm_targets();
#if HAVE_LLVM < 0x0307
LLVMInitializeR600TargetInfo();
LLVMInitializeR600Target();
LLVMInitializeR600TargetMC();
LLVMInitializeR600AsmPrinter();
LLVMInitializeR600TargetInfo();
LLVMInitializeR600Target();
LLVMInitializeR600TargetMC();
LLVMInitializeR600AsmPrinter();
#else
LLVMInitializeAMDGPUTargetInfo();
LLVMInitializeAMDGPUTarget();
LLVMInitializeAMDGPUTargetMC();
LLVMInitializeAMDGPUAsmPrinter();
LLVMInitializeAMDGPUTargetInfo();
LLVMInitializeAMDGPUTarget();
LLVMInitializeAMDGPUTargetMC();
LLVMInitializeAMDGPUAsmPrinter();
#endif
initialized = 1;
}
}
static once_flag init_r600_target_once_flag = ONCE_FLAG_INIT;
LLVMTargetRef radeon_llvm_get_r600_target(const char *triple)
{
LLVMTargetRef target = NULL;
char *err_message = NULL;
init_r600_target();
call_once(&init_r600_target_once_flag, init_r600_target);
if (LLVMGetTargetFromTriple(triple, &target, &err_message)) {
fprintf(stderr, "Cannot find target for triple %s ", triple);

View File

@@ -233,6 +233,9 @@ static void vui(struct rvce_encoder *enc)
{
int i;
if (!enc->pic.rate_ctrl.frame_rate_num)
return;
RVCE_BEGIN(0x04000009); // vui
RVCE_CS(0x00000000); //aspectRatioInfoPresentFlag
RVCE_CS(0x00000000); //aspectRatioInfo.aspectRatioIdc

View File

@@ -468,7 +468,8 @@ void si_upload_const_buffer(struct si_context *sctx, struct r600_resource **rbuf
u_upload_alloc(sctx->b.uploader, 0, size, const_offset,
(struct pipe_resource**)rbuffer, &tmp);
util_memcpy_cpu_to_le32(tmp, ptr, size);
if (rbuffer)
util_memcpy_cpu_to_le32(tmp, ptr, size);
}
static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint slot,
@@ -500,6 +501,11 @@ static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint s
si_upload_const_buffer(sctx,
(struct r600_resource**)&buffer, input->user_buffer,
input->buffer_size, &buffer_offset);
if (!buffer) {
/* Just unbind on failure. */
si_set_constant_buffer(ctx, shader, slot, NULL);
return;
}
va = r600_resource(buffer)->gpu_address + buffer_offset;
} else {
pipe_resource_reference(&buffer, input->buffer);

View File

@@ -170,6 +170,8 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen, void *
if (sctx->b.chip_class == CIK) {
sctx->null_const_buf.buffer = pipe_buffer_create(screen, PIPE_BIND_CONSTANT_BUFFER,
PIPE_USAGE_DEFAULT, 16);
if (!sctx->null_const_buf.buffer)
goto fail;
sctx->null_const_buf.buffer_size = sctx->null_const_buf.buffer->width0;
for (shader = 0; shader < SI_NUM_SHADERS; shader++) {
@@ -487,6 +489,8 @@ static int si_get_shader_param(struct pipe_screen* pscreen, unsigned shader, enu
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 1;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
return 0;
}

View File

@@ -3829,11 +3829,14 @@ int si_shader_binary_read(struct si_screen *sscreen, struct si_shader *shader)
{
const struct radeon_shader_binary *binary = &shader->binary;
unsigned i;
int r;
bool dump = r600_can_dump_shader(&sscreen->b,
shader->selector ? shader->selector->tokens : NULL);
si_shader_binary_read_config(sscreen, shader, 0);
si_shader_binary_upload(sscreen, shader);
r = si_shader_binary_upload(sscreen, shader);
if (r)
return r;
if (dump) {
if (!(sscreen->b.debug_flags & DBG_NO_ASM)) {
@@ -4198,8 +4201,10 @@ out:
void si_shader_destroy(struct pipe_context *ctx, struct si_shader *shader)
{
if (shader->gs_copy_shader)
if (shader->gs_copy_shader) {
si_shader_destroy(ctx, shader->gs_copy_shader);
FREE(shader->gs_copy_shader);
}
if (shader->scratch_bo)
r600_resource_reference(&shader->scratch_bo, NULL);

View File

@@ -274,7 +274,7 @@ si_create_sampler_view_custom(struct pipe_context *ctx,
unsigned force_level);
/* si_state_shader.c */
void si_update_shaders(struct si_context *sctx);
bool si_update_shaders(struct si_context *sctx);
void si_init_shader_functions(struct si_context *sctx);
/* si_state_draw.c */

View File

@@ -760,8 +760,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
else
sctx->current_rast_prim = info->mode;
si_update_shaders(sctx);
if (!si_upload_shader_descriptors(sctx))
if (!si_update_shaders(sctx) ||
!si_upload_shader_descriptors(sctx))
return;
if (info->indexed) {
@@ -783,6 +783,10 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
u_upload_alloc(sctx->b.uploader, start_offset, count * 2,
&out_offset, &out_buffer, &ptr);
if (!out_buffer) {
pipe_resource_reference(&ib.buffer, NULL);
return;
}
util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0,
ib.offset + start_offset,
@@ -803,6 +807,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
u_upload_data(sctx->b.uploader, start_offset, count * ib.index_size,
(char*)ib.user_buffer + start_offset,
&ib.offset, &ib.buffer);
if (!ib.buffer)
return;
/* info->start will be added by the drawing code */
ib.offset -= start_offset;
}

View File

@@ -665,8 +665,16 @@ static void *si_create_shader_state(struct pipe_context *ctx,
struct si_shader_selector *sel = CALLOC_STRUCT(si_shader_selector);
int i;
if (!sel)
return NULL;
sel->type = pipe_shader_type;
sel->tokens = tgsi_dup_tokens(state->tokens);
if (!sel->tokens) {
FREE(sel);
return NULL;
}
sel->so = state->stream_output;
tgsi_scan_shader(state->tokens, &sel->info);
p_atomic_inc(&sscreen->b.num_shaders_created);
@@ -725,7 +733,12 @@ static void *si_create_shader_state(struct pipe_context *ctx,
}
if (sscreen->b.debug_flags & DBG_PRECOMPILE)
si_shader_select(ctx, sel);
if (si_shader_select(ctx, sel)) {
fprintf(stderr, "radeonsi: can't create a shader\n");
tgsi_free_tokens(sel->tokens);
FREE(sel);
return NULL;
}
return sel;
}
@@ -1031,11 +1044,23 @@ static void si_init_gs_rings(struct si_context *sctx)
assert(!sctx->gs_rings);
sctx->gs_rings = CALLOC_STRUCT(si_pm4_state);
if (!sctx->gs_rings)
return;
sctx->esgs_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_DEFAULT, esgs_ring_size);
if (!sctx->esgs_ring) {
FREE(sctx->gs_rings);
return;
}
sctx->gsvs_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_DEFAULT, gsvs_ring_size);
if (!sctx->gsvs_ring) {
pipe_resource_reference(&sctx->esgs_ring, NULL);
FREE(sctx->gs_rings);
return;
}
if (sctx->b.chip_class >= CIK) {
if (sctx->b.chip_class >= VI) {
@@ -1094,14 +1119,16 @@ static void si_update_gs_rings(struct si_context *sctx)
}
/**
* @returns 1 if \p sel has been updated to use a new scratch buffer and 0
* otherwise.
* @returns 1 if \p sel has been updated to use a new scratch buffer
* 0 if not
* < 0 if there was a failure
*/
static unsigned si_update_scratch_buffer(struct si_context *sctx,
static int si_update_scratch_buffer(struct si_context *sctx,
struct si_shader_selector *sel)
{
struct si_shader *shader;
uint64_t scratch_va = sctx->scratch_buffer->gpu_address;
int r;
if (!sel)
return 0;
@@ -1122,7 +1149,9 @@ static unsigned si_update_scratch_buffer(struct si_context *sctx,
si_shader_apply_scratch_relocs(sctx, shader, scratch_va);
/* Replace the shader bo with a new bo that has the relocs applied. */
si_shader_binary_upload(sctx->screen, shader);
r = si_shader_binary_upload(sctx->screen, shader);
if (r)
return r;
/* Update the shader state to use the new shader bo. */
si_shader_init_pm4_state(shader);
@@ -1161,7 +1190,7 @@ static unsigned si_get_max_scratch_bytes_per_wave(struct si_context *sctx)
return bytes;
}
static void si_update_spi_tmpring_size(struct si_context *sctx)
static bool si_update_spi_tmpring_size(struct si_context *sctx)
{
unsigned current_scratch_buffer_size =
si_get_current_scratch_buffer_size(sctx);
@@ -1169,6 +1198,7 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
si_get_max_scratch_bytes_per_wave(sctx);
unsigned scratch_needed_size = scratch_bytes_per_wave *
sctx->scratch_waves;
int r;
if (scratch_needed_size > 0) {
@@ -1181,6 +1211,9 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
sctx->scratch_buffer =
si_resource_create_custom(&sctx->screen->b.b,
PIPE_USAGE_DEFAULT, scratch_needed_size);
if (!sctx->scratch_buffer)
return false;
sctx->emit_scratch_reloc = true;
}
/* Update the shaders, so they are using the latest scratch. The
@@ -1188,31 +1221,57 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
* last used, so we still need to try to update them, even if
* they require scratch buffers smaller than the current size.
*/
if (si_update_scratch_buffer(sctx, sctx->ps_shader))
r = si_update_scratch_buffer(sctx, sctx->ps_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, ps, sctx->ps_shader->current->pm4);
if (si_update_scratch_buffer(sctx, sctx->gs_shader))
r = si_update_scratch_buffer(sctx, sctx->gs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, gs, sctx->gs_shader->current->pm4);
if (si_update_scratch_buffer(sctx, sctx->tcs_shader))
r = si_update_scratch_buffer(sctx, sctx->tcs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, hs, sctx->tcs_shader->current->pm4);
/* VS can be bound as LS, ES, or VS. */
if (sctx->tes_shader) {
if (si_update_scratch_buffer(sctx, sctx->vs_shader))
r = si_update_scratch_buffer(sctx, sctx->vs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, ls, sctx->vs_shader->current->pm4);
} else if (sctx->gs_shader) {
if (si_update_scratch_buffer(sctx, sctx->vs_shader))
r = si_update_scratch_buffer(sctx, sctx->vs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, es, sctx->vs_shader->current->pm4);
} else {
if (si_update_scratch_buffer(sctx, sctx->vs_shader))
r = si_update_scratch_buffer(sctx, sctx->vs_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, vs, sctx->vs_shader->current->pm4);
}
/* TES can be bound as ES or VS. */
if (sctx->gs_shader) {
if (si_update_scratch_buffer(sctx, sctx->tes_shader))
r = si_update_scratch_buffer(sctx, sctx->tes_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, es, sctx->tes_shader->current->pm4);
} else {
if (si_update_scratch_buffer(sctx, sctx->tes_shader))
r = si_update_scratch_buffer(sctx, sctx->tes_shader);
if (r < 0)
return false;
if (r == 1)
si_pm4_bind_state(sctx, vs, sctx->tes_shader->current->pm4);
}
}
@@ -1223,6 +1282,7 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
sctx->spi_tmpring_size = S_0286E8_WAVES(sctx->scratch_waves) |
S_0286E8_WAVESIZE(scratch_bytes_per_wave >> 10);
return true;
}
static void si_init_tess_factor_ring(struct si_context *sctx)
@@ -1230,11 +1290,20 @@ static void si_init_tess_factor_ring(struct si_context *sctx)
assert(!sctx->tf_state);
sctx->tf_state = CALLOC_STRUCT(si_pm4_state);
if (!sctx->tf_state)
return;
sctx->tf_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_DEFAULT,
32768 * sctx->screen->b.info.max_se);
if (!sctx->tf_ring) {
FREE(sctx->tf_state);
return;
}
sctx->b.clear_buffer(&sctx->b.b, sctx->tf_ring, 0,
sctx->tf_ring->width0, fui(0), false);
assert(((sctx->tf_ring->width0 / 4) & C_030938_SIZE) == 0);
if (sctx->b.chip_class >= CIK) {
@@ -1290,7 +1359,6 @@ static void si_generate_fixed_func_tcs(struct si_context *sctx)
sctx->fixed_func_tcs_shader =
ureg_create_shader_and_destroy(ureg, &sctx->b.b);
assert(sctx->fixed_func_tcs_shader);
}
static void si_update_vgt_shader_config(struct si_context *sctx)
@@ -1338,32 +1406,49 @@ static void si_update_so(struct si_context *sctx, struct si_shader_selector *sha
sctx->b.streamout.stride_in_dw = shader->so.stride;
}
void si_update_shaders(struct si_context *sctx)
bool si_update_shaders(struct si_context *sctx)
{
struct pipe_context *ctx = (struct pipe_context*)sctx;
struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
int r;
/* Update stages before GS. */
if (sctx->tes_shader) {
if (!sctx->tf_state)
if (!sctx->tf_state) {
si_init_tess_factor_ring(sctx);
if (!sctx->tf_state)
return false;
}
/* VS as LS */
si_shader_select(ctx, sctx->vs_shader);
r = si_shader_select(ctx, sctx->vs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, ls, sctx->vs_shader->current->pm4);
if (sctx->tcs_shader) {
si_shader_select(ctx, sctx->tcs_shader);
r = si_shader_select(ctx, sctx->tcs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, hs, sctx->tcs_shader->current->pm4);
} else {
if (!sctx->fixed_func_tcs_shader)
if (!sctx->fixed_func_tcs_shader) {
si_generate_fixed_func_tcs(sctx);
si_shader_select(ctx, sctx->fixed_func_tcs_shader);
if (!sctx->fixed_func_tcs_shader)
return false;
}
r = si_shader_select(ctx, sctx->fixed_func_tcs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, hs,
sctx->fixed_func_tcs_shader->current->pm4);
}
si_shader_select(ctx, sctx->tes_shader);
r = si_shader_select(ctx, sctx->tes_shader);
if (r)
return false;
if (sctx->gs_shader) {
/* TES as ES */
si_pm4_bind_state(sctx, es, sctx->tes_shader->current->pm4);
@@ -1374,24 +1459,33 @@ void si_update_shaders(struct si_context *sctx)
}
} else if (sctx->gs_shader) {
/* VS as ES */
si_shader_select(ctx, sctx->vs_shader);
r = si_shader_select(ctx, sctx->vs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, es, sctx->vs_shader->current->pm4);
} else {
/* VS as VS */
si_shader_select(ctx, sctx->vs_shader);
r = si_shader_select(ctx, sctx->vs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, vs, sctx->vs_shader->current->pm4);
si_update_so(sctx, sctx->vs_shader);
}
/* Update GS. */
if (sctx->gs_shader) {
si_shader_select(ctx, sctx->gs_shader);
r = si_shader_select(ctx, sctx->gs_shader);
if (r)
return false;
si_pm4_bind_state(sctx, gs, sctx->gs_shader->current->pm4);
si_pm4_bind_state(sctx, vs, sctx->gs_shader->current->gs_copy_shader->pm4);
si_update_so(sctx, sctx->gs_shader);
if (!sctx->gs_rings)
if (!sctx->gs_rings) {
si_init_gs_rings(sctx);
if (!sctx->gs_rings)
return false;
}
if (sctx->emitted.named.gs_rings != sctx->gs_rings)
sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
@@ -1406,18 +1500,9 @@ void si_update_shaders(struct si_context *sctx)
si_update_vgt_shader_config(sctx);
si_shader_select(ctx, sctx->ps_shader);
if (!sctx->ps_shader->current) {
struct si_shader_selector *sel;
/* use a dummy shader if compiling the shader (variant) failed */
si_make_dummy_ps(sctx);
sel = sctx->dummy_pixel_shader;
si_shader_select(ctx, sel);
sctx->ps_shader->current = sel->current;
}
r = si_shader_select(ctx, sctx->ps_shader);
if (r)
return false;
si_pm4_bind_state(sctx, ps, sctx->ps_shader->current->pm4);
if (si_pm4_state_changed(sctx, ps) || si_pm4_state_changed(sctx, vs) ||
@@ -1428,9 +1513,14 @@ void si_update_shaders(struct si_context *sctx)
si_update_spi_map(sctx);
}
if (si_pm4_state_changed(sctx, ps) || si_pm4_state_changed(sctx, vs) ||
si_pm4_state_changed(sctx, gs)) {
si_update_spi_tmpring_size(sctx);
if (si_pm4_state_changed(sctx, ls) ||
si_pm4_state_changed(sctx, hs) ||
si_pm4_state_changed(sctx, es) ||
si_pm4_state_changed(sctx, gs) ||
si_pm4_state_changed(sctx, vs) ||
si_pm4_state_changed(sctx, ps)) {
if (!si_update_spi_tmpring_size(sctx))
return false;
}
if (sctx->ps_db_shader_control != sctx->ps_shader->current->db_shader_control) {
@@ -1445,6 +1535,7 @@ void si_update_shaders(struct si_context *sctx)
if (sctx->b.chip_class == SI)
si_mark_atom_dirty(sctx, &sctx->db_render_state);
}
return true;
}
void si_init_shader_functions(struct si_context *sctx)

View File

@@ -383,6 +383,8 @@ static int svga_get_shader_param(struct pipe_screen *screen, unsigned shader, en
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
/* If we get here, we failed to handle a cap above */
debug_printf("Unexpected fragment shader query %u\n", param);
@@ -441,6 +443,8 @@ static int svga_get_shader_param(struct pipe_screen *screen, unsigned shader, en
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
}
/* If we get here, we failed to handle a cap above */
debug_printf("Unexpected vertex shader query %u\n", param);

View File

@@ -334,6 +334,8 @@ vc4_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
return VC4_MAX_TEXTURE_SAMPLERS;
case PIPE_SHADER_CAP_PREFERRED_IR:
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
default:
fprintf(stderr, "unknown shader param %d\n", param);
return 0;

View File

@@ -674,7 +674,8 @@ enum pipe_shader_cap
PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED, /* all rounding modes */
PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED,
PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED,
PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE
PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE,
PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT,
};
/**

View File

@@ -188,10 +188,10 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable,
* may occur as the stvis->color_format.
*/
switch(format) {
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_BGRA8888_UNORM:
depth = 32;
break;
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_BGRX8888_UNORM:
depth = 24;
break;
case PIPE_FORMAT_B5G6R5_UNORM:
@@ -261,13 +261,13 @@ dri_image_drawable_get_buffers(struct dri_drawable *drawable,
case PIPE_FORMAT_B5G6R5_UNORM:
image_format = __DRI_IMAGE_FORMAT_RGB565;
break;
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_BGRX8888_UNORM:
image_format = __DRI_IMAGE_FORMAT_XRGB8888;
break;
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_BGRA8888_UNORM:
image_format = __DRI_IMAGE_FORMAT_ARGB8888;
break;
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_RGBA8888_UNORM:
image_format = __DRI_IMAGE_FORMAT_ABGR8888;
break;
default:
@@ -314,10 +314,10 @@ dri2_allocate_buffer(__DRIscreen *sPriv,
switch (format) {
case 32:
pf = PIPE_FORMAT_B8G8R8A8_UNORM;
pf = PIPE_FORMAT_BGRA8888_UNORM;
break;
case 24:
pf = PIPE_FORMAT_B8G8R8X8_UNORM;
pf = PIPE_FORMAT_BGRX8888_UNORM;
break;
case 16:
pf = PIPE_FORMAT_Z16_UNORM;
@@ -724,13 +724,13 @@ dri2_create_image_from_winsys(__DRIscreen *_screen,
pf = PIPE_FORMAT_B5G6R5_UNORM;
break;
case __DRI_IMAGE_FORMAT_XRGB8888:
pf = PIPE_FORMAT_B8G8R8X8_UNORM;
pf = PIPE_FORMAT_BGRX8888_UNORM;
break;
case __DRI_IMAGE_FORMAT_ARGB8888:
pf = PIPE_FORMAT_B8G8R8A8_UNORM;
pf = PIPE_FORMAT_BGRA8888_UNORM;
break;
case __DRI_IMAGE_FORMAT_ABGR8888:
pf = PIPE_FORMAT_R8G8B8A8_UNORM;
pf = PIPE_FORMAT_RGBA8888_UNORM;
break;
default:
pf = PIPE_FORMAT_NONE;
@@ -845,13 +845,13 @@ dri2_create_image(__DRIscreen *_screen,
pf = PIPE_FORMAT_B5G6R5_UNORM;
break;
case __DRI_IMAGE_FORMAT_XRGB8888:
pf = PIPE_FORMAT_B8G8R8X8_UNORM;
pf = PIPE_FORMAT_BGRX8888_UNORM;
break;
case __DRI_IMAGE_FORMAT_ARGB8888:
pf = PIPE_FORMAT_B8G8R8A8_UNORM;
pf = PIPE_FORMAT_BGRA8888_UNORM;
break;
case __DRI_IMAGE_FORMAT_ABGR8888:
pf = PIPE_FORMAT_R8G8B8A8_UNORM;
pf = PIPE_FORMAT_RGBA8888_UNORM;
break;
default:
pf = PIPE_FORMAT_NONE;
@@ -1293,6 +1293,7 @@ dri2_load_opencl_interop(struct dri_screen *screen)
}
struct dri2_fence {
struct dri_screen *driscreen;
struct pipe_fence_handle *pipe_fence;
void *cl_event;
};
@@ -1313,6 +1314,7 @@ dri2_create_fence(__DRIcontext *_ctx)
return NULL;
}
fence->driscreen = dri_screen(_ctx->driScreenPriv);
return fence;
}
@@ -1336,6 +1338,7 @@ dri2_get_fence_from_cl_event(__DRIscreen *_screen, intptr_t cl_event)
return NULL;
}
fence->driscreen = driscreen;
return fence;
}
@@ -1360,9 +1363,9 @@ static GLboolean
dri2_client_wait_sync(__DRIcontext *_ctx, void *_fence, unsigned flags,
uint64_t timeout)
{
struct dri_screen *driscreen = dri_screen(_ctx->driScreenPriv);
struct pipe_screen *screen = driscreen->base.screen;
struct dri2_fence *fence = (struct dri2_fence*)_fence;
struct dri_screen *driscreen = fence->driscreen;
struct pipe_screen *screen = driscreen->base.screen;
/* No need to flush. The context was flushed when the fence was created. */

View File

@@ -231,11 +231,11 @@ dri_set_tex_buffer2(__DRIcontext *pDRICtx, GLint target,
if (format == __DRI_TEXTURE_FORMAT_RGB) {
/* only need to cover the formats recognized by dri_fill_st_visual */
switch (internal_format) {
case PIPE_FORMAT_B8G8R8A8_UNORM:
internal_format = PIPE_FORMAT_B8G8R8X8_UNORM;
case PIPE_FORMAT_BGRA8888_UNORM:
internal_format = PIPE_FORMAT_BGRX8888_UNORM;
break;
case PIPE_FORMAT_A8R8G8B8_UNORM:
internal_format = PIPE_FORMAT_X8R8G8B8_UNORM;
case PIPE_FORMAT_ARGB8888_UNORM:
internal_format = PIPE_FORMAT_XRGB8888_UNORM;
break;
default:
break;

View File

@@ -753,10 +753,14 @@ static void slice_header(vid_dec_PrivateType *priv, struct vl_rbsp *rbsp,
priv->codec_data.h264.delta_pic_order_cnt_bottom = delta_pic_order_cnt_bottom;
}
priv->picture.h264.field_order_cnt[0] = pic_order_cnt_msb + pic_order_cnt_lsb;
priv->picture.h264.field_order_cnt[1] = pic_order_cnt_msb + pic_order_cnt_lsb;
if (!priv->picture.h264.field_pic_flag)
priv->picture.h264.field_order_cnt[1] += priv->codec_data.h264.delta_pic_order_cnt_bottom;
if (!priv->picture.h264.field_pic_flag) {
priv->picture.h264.field_order_cnt[0] = pic_order_cnt_msb + pic_order_cnt_lsb;
priv->picture.h264.field_order_cnt[1] = priv->picture.h264.field_order_cnt [0] +
priv->codec_data.h264.delta_pic_order_cnt_bottom;
} else if (!priv->picture.h264.bottom_field_flag)
priv->picture.h264.field_order_cnt[0] = pic_order_cnt_msb + pic_order_cnt_lsb;
else
priv->picture.h264.field_order_cnt[1] = pic_order_cnt_msb + pic_order_cnt_lsb;
} else if (sps->pic_order_cnt_type == 1) {
unsigned MaxFrameNum = 1 << (sps->log2_max_frame_num_minus4 + 4);

View File

@@ -116,7 +116,7 @@ vlVaCreateImage(VADriverContextP ctx, VAImageFormat *format, int width, int heig
img->width = width;
img->height = height;
w = align(width, 2);
h = align(width, 2);
h = align(height, 2);
switch (format->fourcc) {
case VA_FOURCC('N','V','1','2'):

View File

@@ -35,7 +35,8 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
-lclangEdit \
-lclangLex \
-lclangBasic \
$(LLVM_LIBS)
$(LLVM_LIBS) \
$(PTHREAD_LIBS)
nodist_EXTRA_lib@OPENCL_LIBNAME@_la_SOURCES = dummy.cpp
lib@OPENCL_LIBNAME@_la_SOURCES =

View File

@@ -62,6 +62,8 @@ public:
virtual ir_rvalue *hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state);
virtual bool has_sequence_subexpression() const;
/**
* Retrieve the source location of an AST node
*
@@ -221,6 +223,8 @@ public:
virtual void hir_no_rvalue(exec_list *instructions,
struct _mesa_glsl_parse_state *state);
virtual bool has_sequence_subexpression() const;
ir_rvalue *do_hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state,
bool needs_rvalue);
@@ -299,6 +303,8 @@ public:
virtual void hir_no_rvalue(exec_list *instructions,
struct _mesa_glsl_parse_state *state);
virtual bool has_sequence_subexpression() const;
private:
/**
* Is this function call actually a constructor?

View File

@@ -395,13 +395,54 @@ generate_call(exec_list *instructions, ir_function_signature *sig,
}
}
/* If the function call is a constant expression, don't generate any
* instructions; just generate an ir_constant.
/* Section 4.3.2 (Const) of the GLSL 1.10.59 spec says:
*
* Function calls were first allowed to be constant expressions in GLSL
* 1.20 and GLSL ES 3.00.
* "Initializers for const declarations must be formed from literal
* values, other const variables (not including function call
* paramaters), or expressions of these.
*
* Constructors may be used in such expressions, but function calls may
* not."
*
* Section 4.3.3 (Constant Expressions) of the GLSL 1.20.8 spec says:
*
* "A constant expression is one of
*
* ...
*
* - a built-in function call whose arguments are all constant
* expressions, with the exception of the texture lookup
* functions, the noise functions, and ftransform. The built-in
* functions dFdx, dFdy, and fwidth must return 0 when evaluated
* inside an initializer with an argument that is a constant
* expression."
*
* Section 5.10 (Constant Expressions) of the GLSL ES 1.00.17 spec says:
*
* "A constant expression is one of
*
* ...
*
* - a built-in function call whose arguments are all constant
* expressions, with the exception of the texture lookup
* functions."
*
* Section 4.3.3 (Constant Expressions) of the GLSL ES 3.00.4 spec says:
*
* "A constant expression is one of
*
* ...
*
* - a built-in function call whose arguments are all constant
* expressions, with the exception of the texture lookup
* functions. The built-in functions dFdx, dFdy, and fwidth must
* return 0 when evaluated inside an initializer with an argument
* that is a constant expression."
*
* If the function call is a constant expression, don't generate any
* instructions; just generate an ir_constant.
*/
if (state->is_version(120, 300)) {
if (state->is_version(120, 100)) {
ir_constant *value = sig->constant_expression_value(actual_parameters, NULL);
if (value != NULL) {
return value;
@@ -1911,6 +1952,17 @@ ast_function_expression::hir(exec_list *instructions,
unreachable("not reached");
}
bool
ast_function_expression::has_sequence_subexpression() const
{
foreach_list_typed(const ast_node, ast, link, &this->expressions) {
if (ast->has_sequence_subexpression())
return true;
}
return false;
}
ir_rvalue *
ast_aggregate_initializer::hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state)

View File

@@ -939,6 +939,12 @@ ast_node::hir(exec_list *instructions, struct _mesa_glsl_parse_state *state)
return NULL;
}
bool
ast_node::has_sequence_subexpression() const
{
return false;
}
void
ast_function_expression::hir_no_rvalue(exec_list *instructions,
struct _mesa_glsl_parse_state *state)
@@ -1850,6 +1856,80 @@ ast_expression::do_hir(exec_list *instructions,
return result;
}
bool
ast_expression::has_sequence_subexpression() const
{
switch (this->oper) {
case ast_plus:
case ast_neg:
case ast_bit_not:
case ast_logic_not:
case ast_pre_inc:
case ast_pre_dec:
case ast_post_inc:
case ast_post_dec:
return this->subexpressions[0]->has_sequence_subexpression();
case ast_assign:
case ast_add:
case ast_sub:
case ast_mul:
case ast_div:
case ast_mod:
case ast_lshift:
case ast_rshift:
case ast_less:
case ast_greater:
case ast_lequal:
case ast_gequal:
case ast_nequal:
case ast_equal:
case ast_bit_and:
case ast_bit_xor:
case ast_bit_or:
case ast_logic_and:
case ast_logic_or:
case ast_logic_xor:
case ast_array_index:
case ast_mul_assign:
case ast_div_assign:
case ast_add_assign:
case ast_sub_assign:
case ast_mod_assign:
case ast_ls_assign:
case ast_rs_assign:
case ast_and_assign:
case ast_xor_assign:
case ast_or_assign:
return this->subexpressions[0]->has_sequence_subexpression() ||
this->subexpressions[1]->has_sequence_subexpression();
case ast_conditional:
return this->subexpressions[0]->has_sequence_subexpression() ||
this->subexpressions[1]->has_sequence_subexpression() ||
this->subexpressions[2]->has_sequence_subexpression();
case ast_sequence:
return true;
case ast_field_selection:
case ast_identifier:
case ast_int_constant:
case ast_uint_constant:
case ast_float_constant:
case ast_bool_constant:
case ast_double_constant:
return false;
case ast_aggregate:
unreachable("ast_aggregate: Should never get here.");
case ast_function_call:
unreachable("should be handled by ast_function_expression::hir");
}
return false;
}
ir_rvalue *
ast_expression_statement::hir(exec_list *instructions,
@@ -3146,16 +3226,72 @@ process_initializer(ir_variable *var, ast_declaration *decl,
/* Calculate the constant value if this is a const or uniform
* declaration.
*
* Section 4.3 (Storage Qualifiers) of the GLSL ES 1.00.17 spec says:
*
* "Declarations of globals without a storage qualifier, or with
* just the const qualifier, may include initializers, in which case
* they will be initialized before the first line of main() is
* executed. Such initializers must be a constant expression."
*
* The same section of the GLSL ES 3.00.4 spec has similar language.
*/
if (type->qualifier.flags.q.constant
|| type->qualifier.flags.q.uniform) {
|| type->qualifier.flags.q.uniform
|| (state->es_shader && state->current_function == NULL)) {
ir_rvalue *new_rhs = validate_assignment(state, initializer_loc,
lhs, rhs, true);
if (new_rhs != NULL) {
rhs = new_rhs;
/* Section 4.3.3 (Constant Expressions) of the GLSL ES 3.00.4 spec
* says:
*
* "A constant expression is one of
*
* ...
*
* - an expression formed by an operator on operands that are
* all constant expressions, including getting an element of
* a constant array, or a field of a constant structure, or
* components of a constant vector. However, the sequence
* operator ( , ) and the assignment operators ( =, +=, ...)
* are not included in the operators that can create a
* constant expression."
*
* Section 12.43 (Sequence operator and constant expressions) says:
*
* "Should the following construct be allowed?
*
* float a[2,3];
*
* The expression within the brackets uses the sequence operator
* (',') and returns the integer 3 so the construct is declaring
* a single-dimensional array of size 3. In some languages, the
* construct declares a two-dimensional array. It would be
* preferable to make this construct illegal to avoid confusion.
*
* One possibility is to change the definition of the sequence
* operator so that it does not return a constant-expression and
* hence cannot be used to declare an array size.
*
* RESOLUTION: The result of a sequence operator is not a
* constant-expression."
*
* Section 4.3.3 (Constant Expressions) of the GLSL 4.30.9 spec
* contains language almost identical to the section 4.3.3 in the
* GLSL ES 3.00.4 spec. This is a new limitation for these GLSL
* versions.
*/
ir_constant *constant_value = rhs->constant_expression_value();
if (!constant_value) {
if (!constant_value ||
(state->is_version(430, 300) &&
decl->initializer->has_sequence_subexpression())) {
const char *const variable_mode =
(type->qualifier.flags.q.constant)
? "const"
: ((type->qualifier.flags.q.uniform) ? "uniform" : "global");
/* If ARB_shading_language_420pack is enabled, initializers of
* const-qualified local variables do not have to be constant
* expressions. Const-qualified global variables must still be
@@ -3166,22 +3302,24 @@ process_initializer(ir_variable *var, ast_declaration *decl,
_mesa_glsl_error(& initializer_loc, state,
"initializer of %s variable `%s' must be a "
"constant expression",
(type->qualifier.flags.q.constant)
? "const" : "uniform",
variable_mode,
decl->identifier);
if (var->type->is_numeric()) {
/* Reduce cascading errors. */
var->constant_value = ir_constant::zero(state, var->type);
var->constant_value = type->qualifier.flags.q.constant
? ir_constant::zero(state, var->type) : NULL;
}
}
} else {
rhs = constant_value;
var->constant_value = constant_value;
var->constant_value = type->qualifier.flags.q.constant
? constant_value : NULL;
}
} else {
if (var->type->is_numeric()) {
/* Reduce cascading errors. */
var->constant_value = ir_constant::zero(state, var->type);
var->constant_value = type->qualifier.flags.q.constant
? ir_constant::zero(state, var->type) : NULL;
}
}
}

View File

@@ -673,14 +673,10 @@ builtin_variable_generator::generate_constants()
if (!state->es_shader) {
add_const("gl_MaxGeometryAtomicCounters",
state->Const.MaxGeometryAtomicCounters);
if (state->is_version(400, 0) ||
state->ARB_tessellation_shader_enable) {
add_const("gl_MaxTessControlAtomicCounters",
state->Const.MaxTessControlAtomicCounters);
add_const("gl_MaxTessEvaluationAtomicCounters",
state->Const.MaxTessEvaluationAtomicCounters);
}
add_const("gl_MaxTessControlAtomicCounters",
state->Const.MaxTessControlAtomicCounters);
add_const("gl_MaxTessEvaluationAtomicCounters",
state->Const.MaxTessEvaluationAtomicCounters);
}
}

View File

@@ -326,9 +326,9 @@ link_set_uniform_initializers(struct gl_shader_program *prog,
} else {
assert(!"Explicit binding not on a sampler, UBO or atomic.");
}
} else if (var->constant_value) {
} else if (var->constant_initializer) {
linker::set_uniform_initializer(mem_ctx, prog, var->name,
var->type, var->constant_value,
var->type, var->constant_initializer,
boolean_true);
}
}

View File

@@ -103,7 +103,7 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned)
*/
if (entry->var->data.mode == ir_var_uniform ||
entry->var->data.mode == ir_var_shader_storage) {
if (uniform_locations_assigned || entry->var->constant_value)
if (uniform_locations_assigned || entry->var->constant_initializer)
continue;
/* Section 2.11.6 (Uniform Variables) of the OpenGL ES 3.0.3 spec

View File

@@ -46,6 +46,7 @@ AM_CFLAGS = \
$(EXTRA_DEFINES_XF86VIDMODE) \
-D_REENTRANT \
-DDEFAULT_DRIVER_DIR=\"$(DRI_DRIVER_SEARCH_DIR)\" \
-DGL_LIB_NAME=\"lib@GL_LIB@.so.1\" \
$(DEFINES) \
$(LIBDRM_CFLAGS) \
$(DRI2PROTO_CFLAGS) \

View File

@@ -73,6 +73,10 @@ dri_message(int level, const char *f, ...)
}
}
#ifndef GL_LIB_NAME
#define GL_LIB_NAME "libGL.so.1"
#endif
#ifndef DEFAULT_DRIVER_DIR
/* this is normally defined in Mesa/configs/default with DRI_DRIVER_SEARCH_PATH */
#define DEFAULT_DRIVER_DIR "/usr/local/lib/dri"
@@ -99,7 +103,7 @@ driOpenDriver(const char *driverName)
int len;
/* Attempt to make sure libGL symbols will be visible to the driver */
glhandle = dlopen("libGL.so.1", RTLD_NOW | RTLD_GLOBAL);
glhandle = dlopen(GL_LIB_NAME, RTLD_NOW | RTLD_GLOBAL);
libPaths = NULL;
if (geteuid() == getuid()) {

View File

@@ -2646,7 +2646,11 @@ _X_EXPORT void (*glXGetProcAddressARB(const GLubyte * procName)) (void)
*/
_X_EXPORT void (*glXGetProcAddress(const GLubyte * procName)) (void)
#if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED)
# if defined(USE_MGL_NAMESPACE)
__attribute__ ((alias("mglXGetProcAddressARB")));
# else
__attribute__ ((alias("glXGetProcAddressARB")));
# endif
#else
{
return glXGetProcAddressARB(procName);

View File

@@ -281,11 +281,17 @@ typedef void (*PFNGLXDISABLEEXTENSIONPROC) (const char *name);
# define GLX_ALIAS_VOID(real_func, proto_args, args, aliased_func)
#else
# if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED)
# define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) \
/* GLX_ALIAS and GLX_ALIAS_VOID both expand to the macro GLX_ALIAS2. Using the
* extra expansion means that the name mangling macros in glx_mangle.h will
* apply before stringification, so the alias attribute will have a string like
* "mglXFoo" instead of "glXFoo". */
# define GLX_ALIAS2(return_type, real_func, proto_args, args, aliased_func) \
return_type real_func proto_args \
__attribute__ ((alias( # aliased_func ) ));
# define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) \
GLX_ALIAS2(return_type, real_func, proto_args, args, aliased_func)
# define GLX_ALIAS_VOID(real_func, proto_args, args, aliased_func) \
GLX_ALIAS(void, real_func, proto_args, args, aliased_func)
GLX_ALIAS2(void, real_func, proto_args, args, aliased_func)
# else
# define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) \
return_type real_func proto_args \

View File

@@ -175,7 +175,7 @@ _glapi_get_stub(const char *name, int generate)
const struct mapi_stub *stub;
#ifdef USE_MGL_NAMESPACE
if (name)
if (name && name[0] == 'm')
name++;
#endif

View File

@@ -50,7 +50,7 @@ endif # MESA_ENABLE_ASM
ifeq ($(ARCH_X86_HAVE_SSE4_1),true)
LOCAL_SRC_FILES += \
main/streaming-load-memcpy.c \
mesa/main/sse_minmax.c
main/sse_minmax.c
LOCAL_CFLAGS := \
-msse4.1 \
-DUSE_SSE41

View File

@@ -71,9 +71,7 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
char *sample_map_str = rzalloc_size(mem_ctx, 1);
char *sample_map_expr = rzalloc_size(mem_ctx, 1);
char *texel_fetch_macro = rzalloc_size(mem_ctx, 1);
const char *vs_source;
const char *sampler_array_suffix = "";
const char *texcoord_type = "vec2";
float y_scale;
enum blit_msaa_shader shader_index;
@@ -99,7 +97,6 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
shader_index += BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE -
BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE;
sampler_array_suffix = "Array";
texcoord_type = "vec3";
}
if (blit->msaa_shaders[shader_index]) {
@@ -150,28 +147,37 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
" const int sample_map[%d] = int[%d](%s);\n",
samples, samples, sample_map_str);
ralloc_asprintf_append(&texel_fetch_macro,
"#define TEXEL_FETCH(coord) texelFetch(texSampler, i%s(coord), %s);\n",
texcoord_type, sample_number);
if (target == GL_TEXTURE_2D_MULTISAMPLE) {
ralloc_asprintf_append(&texel_fetch_macro,
"#define TEXEL_FETCH(coord) texelFetch(texSampler, ivec2(coord), %s);\n",
sample_number);
} else {
ralloc_asprintf_append(&texel_fetch_macro,
"#define TEXEL_FETCH(coord) texelFetch(texSampler, ivec3(coord, layer), %s);\n",
sample_number);
}
vs_source = ralloc_asprintf(mem_ctx,
static const char vs_source[] =
"#version 130\n"
"in vec2 position;\n"
"in %s textureCoords;\n"
"out %s texCoords;\n"
"in vec3 textureCoords;\n"
"out vec2 texCoords;\n"
"flat out int layer;\n"
"void main()\n"
"{\n"
" texCoords = textureCoords;\n"
" texCoords = textureCoords.xy;\n"
" layer = int(textureCoords.z);\n"
" gl_Position = vec4(position, 0.0, 1.0);\n"
"}\n",
texcoord_type,
texcoord_type);
"}\n"
;
fs_source = ralloc_asprintf(mem_ctx,
"#version 130\n"
"#extension GL_ARB_texture_multisample : enable\n"
"uniform sampler2DMS%s texSampler;\n"
"uniform float src_width, src_height;\n"
"in %s texCoords;\n"
"in vec2 texCoords;\n"
"flat in int layer;\n"
"out vec4 out_color;\n"
"\n"
"void main()\n"
@@ -212,7 +218,6 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
" out_color = mix(x_0_color, x_1_color, interp.y);\n"
"}\n",
sampler_array_suffix,
texcoord_type,
sample_map_expr,
y_scale,
1.0f / y_scale,

View File

@@ -42,10 +42,10 @@
#define I830_UPLOAD_STIPPLE 0x4
#define I830_UPLOAD_INVARIENT 0x8
#define I830_UPLOAD_RASTER_RULES 0x10
#define I830_UPLOAD_TEX(i) (0x10<<(i))
#define I830_UPLOAD_TEXBLEND(i) (0x100<<(i))
#define I830_UPLOAD_TEX_ALL (0x0f0)
#define I830_UPLOAD_TEXBLEND_ALL (0xf00)
#define I830_UPLOAD_TEX(i) (0x0100<<(i))
#define I830_UPLOAD_TEXBLEND(i) (0x1000<<(i))
#define I830_UPLOAD_TEX_ALL (0x0f00)
#define I830_UPLOAD_TEXBLEND_ALL (0xf000)
/* State structure offsets - these will probably disappear.
*/

View File

@@ -115,6 +115,8 @@ enum {
I915_RASTER_RULES_SETUP_SIZE,
};
#define I915_TEX_UNITS 8
#define I915_MAX_CONSTANT 32
#define I915_CONSTANT_SIZE (2+(4*I915_MAX_CONSTANT))
@@ -194,7 +196,8 @@ struct i915_fragment_program
/* Helpers for i915_fragprog.c:
*/
GLuint wpos_tex;
uint8_t texcoord_mapping[I915_TEX_UNITS];
uint8_t wpos_tex;
bool depth_written;
struct
@@ -205,15 +208,6 @@ struct i915_fragment_program
GLuint nr_params;
};
#define I915_TEX_UNITS 8
struct i915_hw_state
{
GLuint Ctx[I915_CTX_SETUP_SIZE];

View File

@@ -72,6 +72,22 @@ static const GLfloat cos_constants[4] = { 1.0,
-1.0 / (6 * 5 * 4 * 3 * 2 * 1)
};
/* texcoord_mapping[unit] = index | TEXCOORD_{TEX,VAR} */
#define TEXCOORD_TEX (0<<7)
#define TEXCOORD_VAR (1<<7)
static unsigned
get_texcoord_mapping(struct i915_fragment_program *p, uint8_t texcoord)
{
for (unsigned i = 0; i < p->ctx->Const.MaxTextureCoordUnits; i++) {
if (p->texcoord_mapping[i] == texcoord)
return i;
}
/* blah */
return p->ctx->Const.MaxTextureCoordUnits - 1;
}
/**
* Retrieve a ureg for the given source register. Will emit
* constants, apply swizzling and negation as needed.
@@ -82,6 +98,7 @@ src_vector(struct i915_fragment_program *p,
const struct gl_fragment_program *program)
{
GLuint src;
unsigned unit;
switch (source->File) {
@@ -119,8 +136,10 @@ src_vector(struct i915_fragment_program *p,
case VARYING_SLOT_TEX5:
case VARYING_SLOT_TEX6:
case VARYING_SLOT_TEX7:
unit = get_texcoord_mapping(p, (source->Index -
VARYING_SLOT_TEX0) | TEXCOORD_TEX);
src = i915_emit_decl(p, REG_TYPE_T,
T_TEX0 + (source->Index - VARYING_SLOT_TEX0),
T_TEX0 + unit,
D0_CHANNEL_ALL);
break;
@@ -132,8 +151,10 @@ src_vector(struct i915_fragment_program *p,
case VARYING_SLOT_VAR0 + 5:
case VARYING_SLOT_VAR0 + 6:
case VARYING_SLOT_VAR0 + 7:
unit = get_texcoord_mapping(p, (source->Index -
VARYING_SLOT_VAR0) | TEXCOORD_VAR);
src = i915_emit_decl(p, REG_TYPE_T,
T_TEX0 + (source->Index - VARYING_SLOT_VAR0),
T_TEX0 + unit,
D0_CHANNEL_ALL);
break;
@@ -1176,27 +1197,54 @@ fixup_depth_write(struct i915_fragment_program *p)
}
}
static void
check_texcoord_mapping(struct i915_fragment_program *p)
{
GLbitfield64 inputs = p->FragProg.Base.InputsRead;
unsigned unit = 0;
for (unsigned i = 0; i < p->ctx->Const.MaxTextureCoordUnits; i++) {
if (inputs & VARYING_BIT_TEX(i)) {
if (unit >= p->ctx->Const.MaxTextureCoordUnits) {
unit++;
break;
}
p->texcoord_mapping[unit++] = i | TEXCOORD_TEX;
}
if (inputs & VARYING_BIT_VAR(i)) {
if (unit >= p->ctx->Const.MaxTextureCoordUnits) {
unit++;
break;
}
p->texcoord_mapping[unit++] = i | TEXCOORD_VAR;
}
}
if (unit > p->ctx->Const.MaxTextureCoordUnits)
i915_program_error(p, "Too many texcoord units");
}
static void
check_wpos(struct i915_fragment_program *p)
{
GLbitfield64 inputs = p->FragProg.Base.InputsRead;
GLint i;
unsigned unit = 0;
p->wpos_tex = -1;
if ((inputs & VARYING_BIT_POS) == 0)
return;
for (i = 0; i < p->ctx->Const.MaxTextureCoordUnits; i++) {
if (inputs & (VARYING_BIT_TEX(i) | VARYING_BIT_VAR(i)))
continue;
else if (inputs & VARYING_BIT_POS) {
p->wpos_tex = i;
inputs &= ~VARYING_BIT_POS;
}
unit += !!(inputs & VARYING_BIT_TEX(i));
unit += !!(inputs & VARYING_BIT_VAR(i));
}
if (inputs & VARYING_BIT_POS) {
if (unit < p->ctx->Const.MaxTextureCoordUnits)
p->wpos_tex = unit;
else
i915_program_error(p, "No free texcoord for wpos value");
}
}
@@ -1212,6 +1260,7 @@ translate_program(struct i915_fragment_program *p)
}
i915_init_program(i915, p);
check_texcoord_mapping(p);
check_wpos(p);
upload_program(p);
fixup_depth_write(p);
@@ -1420,22 +1469,24 @@ i915ValidateFragmentProgram(struct i915_context *i915)
for (i = 0; i < p->ctx->Const.MaxTextureCoordUnits; i++) {
if (inputsRead & VARYING_BIT_TEX(i)) {
int unit = get_texcoord_mapping(p, i | TEXCOORD_TEX);
int sz = VB->AttribPtr[_TNL_ATTRIB_TEX0 + i]->size;
s2 &= ~S2_TEXCOORD_FMT(i, S2_TEXCOORD_FMT0_MASK);
s2 |= S2_TEXCOORD_FMT(i, SZ_TO_HW(sz));
s2 &= ~S2_TEXCOORD_FMT(unit, S2_TEXCOORD_FMT0_MASK);
s2 |= S2_TEXCOORD_FMT(unit, SZ_TO_HW(sz));
EMIT_ATTR(_TNL_ATTRIB_TEX0 + i, EMIT_SZ(sz), 0, sz * 4);
}
else if (inputsRead & VARYING_BIT_VAR(i)) {
if (inputsRead & VARYING_BIT_VAR(i)) {
int unit = get_texcoord_mapping(p, i | TEXCOORD_VAR);
int sz = VB->AttribPtr[_TNL_ATTRIB_GENERIC0 + i]->size;
s2 &= ~S2_TEXCOORD_FMT(i, S2_TEXCOORD_FMT0_MASK);
s2 |= S2_TEXCOORD_FMT(i, SZ_TO_HW(sz));
s2 &= ~S2_TEXCOORD_FMT(unit, S2_TEXCOORD_FMT0_MASK);
s2 |= S2_TEXCOORD_FMT(unit, SZ_TO_HW(sz));
EMIT_ATTR(_TNL_ATTRIB_GENERIC0 + i, EMIT_SZ(sz), 0, sz * 4);
}
else if (i == p->wpos_tex) {
if (i == p->wpos_tex) {
int wpos_size = 4 * sizeof(float);
/* If WPOS is required, duplicate the XYZ position data in an
* unused texture coordinate:

View File

@@ -658,6 +658,11 @@ intel_blit_framebuffer_with_blitter(struct gl_context *ctx,
{
struct intel_context *intel = intel_context(ctx);
/* Sync up the state of window system buffers. We need to do this before
* we go looking for the buffers.
*/
intel_prepare_render(intel);
if (mask & GL_COLOR_BUFFER_BIT) {
GLint i;
struct gl_renderbuffer *src_rb = readFb->_ColorReadBuffer;

View File

@@ -1412,7 +1412,6 @@ intel_process_dri2_buffer(struct brw_context *brw,
buffer->cpp, buffer->pitch);
}
intel_miptree_release(&rb->mt);
bo = drm_intel_bo_gem_create_from_name(brw->bufmgr, buffer_name,
buffer->name);
if (!bo) {

View File

@@ -1556,7 +1556,10 @@ fs_visitor::assign_vs_urb_setup()
inst->src[i].file = HW_REG;
inst->src[i].fixed_hw_reg =
retype(brw_vec8_grf(grf, 0), inst->src[i].type);
stride(byte_offset(retype(brw_vec8_grf(grf, 0), inst->src[i].type),
inst->src[i].subreg_offset),
inst->exec_size * inst->src[i].stride,
inst->exec_size, inst->src[i].stride);
}
}
}

View File

@@ -312,13 +312,43 @@ namespace {
}
namespace image_validity {
/**
* Check whether the bound image is suitable for untyped access.
*/
brw_predicate
emit_untyped_image_check(const fs_builder &bld, const fs_reg &image,
brw_predicate pred)
{
const brw_device_info *devinfo = bld.shader->devinfo;
const fs_reg stride = offset(image, bld, BRW_IMAGE_PARAM_STRIDE_OFFSET);
if (devinfo->gen == 7 && !devinfo->is_haswell) {
/* Check whether the first stride component (i.e. the Bpp value)
* is greater than four, what on Gen7 indicates that a surface of
* type RAW has been bound for untyped access. Reading or writing
* to a surface of type other than RAW using untyped surface
* messages causes a hang on IVB and VLV.
*/
set_predicate(pred,
bld.CMP(bld.null_reg_ud(), stride, fs_reg(4),
BRW_CONDITIONAL_G));
return BRW_PREDICATE_NORMAL;
} else {
/* More recent generations handle the format mismatch
* gracefully.
*/
return pred;
}
}
/**
* Check whether there is an image bound at the given index and write
* the comparison result to f0.0. Returns an appropriate predication
* mode to use on subsequent image operations.
*/
brw_predicate
emit_surface_check(const fs_builder &bld, const fs_reg &image)
emit_typed_atomic_check(const fs_builder &bld, const fs_reg &image)
{
const brw_device_info *devinfo = bld.shader->devinfo;
const fs_reg size = offset(image, bld, BRW_IMAGE_PARAM_SIZE_OFFSET);
@@ -895,7 +925,9 @@ namespace brw {
* surface read on the result,
*/
const brw_predicate pred =
emit_bounds_check(bld, image, saddr, dims);
emit_untyped_image_check(bld, image,
emit_bounds_check(bld, image,
saddr, dims));
/* and they don't know about surface coordinates, we need to
* convert them to a raw memory offset.
@@ -1041,7 +1073,9 @@ namespace brw {
* the surface write on the result,
*/
const brw_predicate pred =
emit_bounds_check(bld, image, saddr, dims);
emit_untyped_image_check(bld, image,
emit_bounds_check(bld, image,
saddr, dims));
/* and, phew, they don't know about surface coordinates, we
* need to convert them to a raw memory offset.
@@ -1072,7 +1106,7 @@ namespace brw {
using namespace image_coordinates;
using namespace surface_access;
/* Avoid performing an atomic operation on an unbound surface. */
const brw_predicate pred = emit_surface_check(bld, image);
const brw_predicate pred = emit_typed_atomic_check(bld, image);
/* Transform the image coordinates into actual surface coordinates. */
const fs_reg saddr =

View File

@@ -129,7 +129,7 @@ brw_upload_gs_image_surfaces(struct brw_context *brw)
ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY];
if (prog) {
/* BRW_NEW_GS_PROG_DATA, BRW_NEW_IMAGE_UNITS */
/* BRW_NEW_GS_PROG_DATA, BRW_NEW_IMAGE_UNITS, _NEW_TEXTURE */
brw_upload_image_surfaces(brw, prog->_LinkedShaders[MESA_SHADER_GEOMETRY],
&brw->gs.base, &brw->gs.prog_data->base.base);
}
@@ -137,6 +137,7 @@ brw_upload_gs_image_surfaces(struct brw_context *brw)
const struct brw_tracked_state brw_gs_image_surfaces = {
.dirty = {
.mesa = _NEW_TEXTURE,
.brw = BRW_NEW_BATCH |
BRW_NEW_GEOMETRY_PROGRAM |
BRW_NEW_GS_PROG_DATA |

View File

@@ -61,6 +61,8 @@ src_reg::src_reg(register_file file, int reg, const glsl_type *type)
this->swizzle = brw_swizzle_for_size(type->vector_elements);
else
this->swizzle = BRW_SWIZZLE_XYZW;
if (type)
this->type = brw_type_for_base_type(type);
}
/** Generic unset register constructor. */
@@ -1105,11 +1107,13 @@ vec4_visitor::opt_register_coalesce()
if (interfered)
break;
/* If somebody else writes our destination here, we can't coalesce
* before that.
/* If somebody else writes the same channels of our destination here,
* we can't coalesce before that.
*/
if (inst->dst.in_range(scan_inst->dst, scan_inst->regs_written))
break;
if (inst->dst.in_range(scan_inst->dst, scan_inst->regs_written) &&
(inst->dst.writemask & scan_inst->dst.writemask) != 0) {
break;
}
/* Check for reads of the register we're trying to coalesce into. We
* can't go rewriting instructions above that to put some other value

View File

@@ -201,7 +201,7 @@ brw_upload_vs_image_surfaces(struct brw_context *brw)
ctx->_Shader->CurrentProgram[MESA_SHADER_VERTEX];
if (prog) {
/* BRW_NEW_VS_PROG_DATA, BRW_NEW_IMAGE_UNITS */
/* BRW_NEW_VS_PROG_DATA, BRW_NEW_IMAGE_UNITS, _NEW_TEXTURE */
brw_upload_image_surfaces(brw, prog->_LinkedShaders[MESA_SHADER_VERTEX],
&brw->vs.base, &brw->vs.prog_data->base.base);
}
@@ -209,6 +209,7 @@ brw_upload_vs_image_surfaces(struct brw_context *brw)
const struct brw_tracked_state brw_vs_image_surfaces = {
.dirty = {
.mesa = _NEW_TEXTURE,
.brw = BRW_NEW_BATCH |
BRW_NEW_IMAGE_UNITS |
BRW_NEW_VERTEX_PROGRAM |

View File

@@ -34,6 +34,7 @@
#include "main/blend.h"
#include "main/mtypes.h"
#include "main/samplerobj.h"
#include "main/shaderimage.h"
#include "program/prog_parameter.h"
#include "main/framebuffer.h"
@@ -1033,7 +1034,7 @@ brw_upload_cs_image_surfaces(struct brw_context *brw)
ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
if (prog) {
/* BRW_NEW_CS_PROG_DATA, BRW_NEW_IMAGE_UNITS */
/* BRW_NEW_CS_PROG_DATA, BRW_NEW_IMAGE_UNITS, _NEW_TEXTURE */
brw_upload_image_surfaces(brw, prog->_LinkedShaders[MESA_SHADER_COMPUTE],
&brw->cs.base, &brw->cs.prog_data->base);
}
@@ -1041,7 +1042,7 @@ brw_upload_cs_image_surfaces(struct brw_context *brw)
const struct brw_tracked_state brw_cs_image_surfaces = {
.dirty = {
.mesa = _NEW_PROGRAM,
.mesa = _NEW_TEXTURE | _NEW_PROGRAM,
.brw = BRW_NEW_BATCH |
BRW_NEW_CS_PROG_DATA |
BRW_NEW_IMAGE_UNITS
@@ -1174,7 +1175,7 @@ update_image_surface(struct brw_context *brw,
uint32_t *surf_offset,
struct brw_image_param *param)
{
if (u->_Valid) {
if (_mesa_is_image_unit_valid(&brw->ctx, u)) {
struct gl_texture_object *obj = u->TexObj;
const unsigned format = get_image_format(brw, u->_ActualFormat, access);
@@ -1259,7 +1260,7 @@ brw_upload_wm_image_surfaces(struct brw_context *brw)
struct gl_shader_program *prog = ctx->Shader._CurrentFragmentProgram;
if (prog) {
/* BRW_NEW_FS_PROG_DATA, BRW_NEW_IMAGE_UNITS */
/* BRW_NEW_FS_PROG_DATA, BRW_NEW_IMAGE_UNITS, _NEW_TEXTURE */
brw_upload_image_surfaces(brw, prog->_LinkedShaders[MESA_SHADER_FRAGMENT],
&brw->wm.base, &brw->wm.prog_data->base);
}
@@ -1267,6 +1268,7 @@ brw_upload_wm_image_surfaces(struct brw_context *brw)
const struct brw_tracked_state brw_wm_image_surfaces = {
.dirty = {
.mesa = _NEW_TEXTURE,
.brw = BRW_NEW_BATCH |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_FS_PROG_DATA |

View File

@@ -59,9 +59,7 @@ upload_gs_state(struct brw_context *brw)
OUT_BATCH(((ALIGN(stage_state->sampler_count, 4)/4) <<
GEN6_GS_SAMPLER_COUNT_SHIFT) |
((brw->gs.prog_data->base.base.binding_table.size_bytes / 4) <<
GEN6_GS_BINDING_TABLE_ENTRY_COUNT_SHIFT) |
(brw->is_haswell && prog_data->base.nr_image_params ?
HSW_GS_UAV_ACCESS_ENABLE : 0));
GEN6_GS_BINDING_TABLE_ENTRY_COUNT_SHIFT));
if (brw->gs.prog_data->base.base.total_scratch) {
OUT_RELOC(stage_state->scratch_bo,

View File

@@ -126,9 +126,7 @@ upload_vs_state(struct brw_context *brw)
((ALIGN(stage_state->sampler_count, 4)/4) <<
GEN6_VS_SAMPLER_COUNT_SHIFT) |
((brw->vs.prog_data->base.base.binding_table.size_bytes / 4) <<
GEN6_VS_BINDING_TABLE_ENTRY_COUNT_SHIFT) |
(brw->is_haswell && prog_data->base.nr_image_params ?
HSW_VS_UAV_ACCESS_ENABLE : 0));
GEN6_VS_BINDING_TABLE_ENTRY_COUNT_SHIFT));
if (prog_data->base.total_scratch) {
OUT_RELOC(stage_state->scratch_bo,

View File

@@ -113,7 +113,14 @@ upload_wm_state(struct brw_context *brw)
else if (prog_data->base.nr_image_params)
dw1 |= GEN7_WM_EARLY_DS_CONTROL_PSEXEC;
/* _NEW_BUFFERS | _NEW_COLOR */
/* The "UAV access enable" bits are unnecessary on HSW because they only
* seem to have an effect on the HW-assisted coherency mechanism which we
* don't need, and the rasterization-related UAV_ONLY flag and the
* DISPATCH_ENABLE bit can be set independently from it.
* C.f. gen8_upload_ps_extra().
*
* BRW_NEW_FRAGMENT_PROGRAM | BRW_NEW_FS_PROG_DATA | _NEW_BUFFERS | _NEW_COLOR
*/
if (brw->is_haswell &&
!(brw_color_buffer_write_enabled(brw) || writes_depth) &&
prog_data->base.nr_image_params)
@@ -221,9 +228,6 @@ gen7_upload_ps_state(struct brw_context *brw,
_mesa_get_min_invocations_per_fragment(ctx, fp, false);
assert(min_inv_per_frag >= 1);
if (brw->is_haswell && prog_data->base.nr_image_params)
dw4 |= HSW_PS_UAV_ACCESS_ENABLE;
if (prog_data->prog_offset_16 || prog_data->no_8) {
dw4 |= GEN7_PS_16_DISPATCH_ENABLE;
if (!prog_data->no_8 && min_inv_per_frag == 1) {

View File

@@ -52,9 +52,7 @@ gen8_upload_gs_state(struct brw_context *brw)
((ALIGN(stage_state->sampler_count, 4)/4) <<
GEN6_GS_SAMPLER_COUNT_SHIFT) |
((prog_data->base.binding_table.size_bytes / 4) <<
GEN6_GS_BINDING_TABLE_ENTRY_COUNT_SHIFT) |
(prog_data->base.nr_image_params ?
HSW_GS_UAV_ACCESS_ENABLE : 0));
GEN6_GS_BINDING_TABLE_ENTRY_COUNT_SHIFT));
if (brw->gs.prog_data->base.base.total_scratch) {
OUT_RELOC64(stage_state->scratch_bo,

View File

@@ -25,6 +25,7 @@
#include "program/program.h"
#include "brw_state.h"
#include "brw_defines.h"
#include "brw_wm.h"
#include "intel_batchbuffer.h"
void
@@ -61,8 +62,33 @@ gen8_upload_ps_extra(struct brw_context *brw,
if (brw->gen >= 9 && prog_data->pulls_bary)
dw1 |= GEN9_PSX_SHADER_PULLS_BARY;
if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx) ||
prog_data->base.nr_image_params)
/* The stricter cross-primitive coherency guarantees that the hardware
* gives us with the "Accesses UAV" bit set for at least one shader stage
* and the "UAV coherency required" bit set on the 3DPRIMITIVE command are
* redundant within the current image, atomic counter and SSBO GL APIs,
* which all have very loose ordering and coherency requirements and
* generally rely on the application to insert explicit barriers when a
* shader invocation is expected to see the memory writes performed by the
* invocations of some previous primitive. Regardless of the value of "UAV
* coherency required", the "Accesses UAV" bits will implicitly cause an in
* most cases useless DC flush when the lowermost stage with the bit set
* finishes execution.
*
* It would be nice to disable it, but in some cases we can't because on
* Gen8+ it also has an influence on rasterization via the PS UAV-only
* signal (which could be set independently from the coherency mechanism in
* the 3DSTATE_WM command on Gen7), and because in some cases it will
* determine whether the hardware skips execution of the fragment shader or
* not via the ThreadDispatchEnable signal. However if we know that
* GEN8_PS_BLEND_HAS_WRITEABLE_RT is going to be set and
* GEN8_PSX_PIXEL_SHADER_NO_RT_WRITE is not set it shouldn't make any
* difference so we may just disable it here.
*
* BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS | _NEW_COLOR
*/
if ((_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx) ||
prog_data->base.nr_image_params) &&
!brw_color_buffer_write_enabled(brw))
dw1 |= GEN8_PSX_SHADER_HAS_UAV;
BEGIN_BATCH(2);
@@ -87,7 +113,7 @@ upload_ps_extra(struct brw_context *brw)
const struct brw_tracked_state gen8_ps_extra = {
.dirty = {
.mesa = 0,
.mesa = _NEW_BUFFERS | _NEW_COLOR,
.brw = BRW_NEW_CONTEXT |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_FS_PROG_DATA |

View File

@@ -53,9 +53,7 @@ upload_vs_state(struct brw_context *brw)
((ALIGN(stage_state->sampler_count, 4) / 4) <<
GEN6_VS_SAMPLER_COUNT_SHIFT) |
((prog_data->base.binding_table.size_bytes / 4) <<
GEN6_VS_BINDING_TABLE_ENTRY_COUNT_SHIFT) |
(prog_data->base.nr_image_params ?
HSW_VS_UAV_ACCESS_ENABLE : 0));
GEN6_VS_BINDING_TABLE_ENTRY_COUNT_SHIFT));
if (prog_data->base.total_scratch) {
OUT_RELOC64(stage_state->scratch_bo,

View File

@@ -1401,7 +1401,7 @@ save_BlendFunci(GLuint buf, GLenum sfactor, GLenum dfactor)
GET_CURRENT_CONTEXT(ctx);
Node *n;
ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
n = alloc_instruction(ctx, OPCODE_BLEND_FUNC_SEPARATE_I, 3);
n = alloc_instruction(ctx, OPCODE_BLEND_FUNC_I, 3);
if (n) {
n[1].ui = buf;
n[2].e = sfactor;

View File

@@ -976,13 +976,11 @@ static void load_texture( texenv_fragment_program *p, GLuint unit )
ir_var_uniform);
p->top_instructions->push_head(sampler);
/* Set the texture unit for this sampler. The linker will pick this value
* up and do-the-right-thing.
*
* NOTE: The cast to int is important. Without it, the constant will have
* type uint, and things later on may get confused.
/* Set the texture unit for this sampler in the same way that
* layout(binding=X) would.
*/
sampler->constant_value = new(p->mem_ctx) ir_constant(int(unit));
sampler->data.explicit_binding = true;
sampler->data.binding = unit;
deref = new(p->mem_ctx) ir_dereference_variable(sampler);
tex->set_sampler(deref, glsl_type::vec4_type);

View File

@@ -293,9 +293,10 @@ struct ureg {
GLuint file:4;
GLint idx:9; /* relative addressing may be negative */
/* sizeof(idx) should == sizeof(prog_src_reg::Index) */
GLuint abs:1;
GLuint negate:1;
GLuint swz:12;
GLuint pad:6;
GLuint pad:5;
};
@@ -324,6 +325,7 @@ static const struct ureg undef = {
0,
0,
0,
0,
0
};
@@ -342,6 +344,7 @@ static struct ureg make_ureg(GLuint file, GLint idx)
struct ureg reg;
reg.file = file;
reg.idx = idx;
reg.abs = 0;
reg.negate = 0;
reg.swz = SWIZZLE_NOOP;
reg.pad = 0;
@@ -350,6 +353,14 @@ static struct ureg make_ureg(GLuint file, GLint idx)
static struct ureg absolute( struct ureg reg )
{
reg.abs = 1;
reg.negate = 0;
return reg;
}
static struct ureg negate( struct ureg reg )
{
reg.negate ^= 1;
@@ -526,8 +537,8 @@ static void emit_arg( struct prog_src_register *src,
src->File = reg.file;
src->Index = reg.idx;
src->Swizzle = reg.swz;
src->Abs = reg.abs;
src->Negate = reg.negate ? NEGATE_XYZW : NEGATE_NONE;
src->Abs = 0;
src->RelAddr = 0;
/* Check that bitfield sizes aren't exceeded */
assert(src->Index == reg.idx);
@@ -953,7 +964,7 @@ static struct ureg calculate_light_attenuation( struct tnl_program *p,
emit_op2(p, OPCODE_DP3, spot, 0, negate(VPpli), spot_dir_norm);
emit_op2(p, OPCODE_SLT, slt, 0, swizzle1(spot_dir_norm,W), spot);
emit_op2(p, OPCODE_POW, spot, 0, spot, swizzle1(attenuation, W));
emit_op2(p, OPCODE_POW, spot, 0, absolute(spot), swizzle1(attenuation, W));
emit_op2(p, OPCODE_MUL, att, 0, slt, spot);
release_temp(p, spot);

View File

@@ -2080,6 +2080,633 @@ _mesa_es_error_check_format_and_type(GLenum format, GLenum type,
return type_valid ? GL_NO_ERROR : GL_INVALID_OPERATION;
}
/**
* Return the simple base format for a given internal texture format.
* For example, given GL_LUMINANCE12_ALPHA4, return GL_LUMINANCE_ALPHA.
*
* \param ctx GL context.
* \param internalFormat the internal texture format token or 1, 2, 3, or 4.
*
* \return the corresponding \u base internal format (GL_ALPHA, GL_LUMINANCE,
* GL_LUMANCE_ALPHA, GL_INTENSITY, GL_RGB, or GL_RGBA), or -1 if invalid enum.
*
* This is the format which is used during texture application (i.e. the
* texture format and env mode determine the arithmetic used.
*/
GLint
_mesa_base_tex_format(const struct gl_context *ctx, GLint internalFormat)
{
switch (internalFormat) {
case GL_ALPHA:
case GL_ALPHA4:
case GL_ALPHA8:
case GL_ALPHA12:
case GL_ALPHA16:
return (ctx->API != API_OPENGL_CORE) ? GL_ALPHA : -1;
case 1:
case GL_LUMINANCE:
case GL_LUMINANCE4:
case GL_LUMINANCE8:
case GL_LUMINANCE12:
case GL_LUMINANCE16:
return (ctx->API != API_OPENGL_CORE) ? GL_LUMINANCE : -1;
case 2:
case GL_LUMINANCE_ALPHA:
case GL_LUMINANCE4_ALPHA4:
case GL_LUMINANCE6_ALPHA2:
case GL_LUMINANCE8_ALPHA8:
case GL_LUMINANCE12_ALPHA4:
case GL_LUMINANCE12_ALPHA12:
case GL_LUMINANCE16_ALPHA16:
return (ctx->API != API_OPENGL_CORE) ? GL_LUMINANCE_ALPHA : -1;
case GL_INTENSITY:
case GL_INTENSITY4:
case GL_INTENSITY8:
case GL_INTENSITY12:
case GL_INTENSITY16:
return (ctx->API != API_OPENGL_CORE) ? GL_INTENSITY : -1;
case 3:
return (ctx->API != API_OPENGL_CORE) ? GL_RGB : -1;
case GL_RGB:
case GL_R3_G3_B2:
case GL_RGB4:
case GL_RGB5:
case GL_RGB8:
case GL_RGB10:
case GL_RGB12:
case GL_RGB16:
return GL_RGB;
case 4:
return (ctx->API != API_OPENGL_CORE) ? GL_RGBA : -1;
case GL_RGBA:
case GL_RGBA2:
case GL_RGBA4:
case GL_RGB5_A1:
case GL_RGBA8:
case GL_RGB10_A2:
case GL_RGBA12:
case GL_RGBA16:
return GL_RGBA;
default:
; /* fallthrough */
}
/* GL_BGRA can be an internal format *only* in OpenGL ES (1.x or 2.0).
*/
if (_mesa_is_gles(ctx)) {
switch (internalFormat) {
case GL_BGRA:
return GL_RGBA;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.ARB_ES2_compatibility) {
switch (internalFormat) {
case GL_RGB565:
return GL_RGB;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.ARB_depth_texture) {
switch (internalFormat) {
case GL_DEPTH_COMPONENT:
case GL_DEPTH_COMPONENT16:
case GL_DEPTH_COMPONENT24:
case GL_DEPTH_COMPONENT32:
return GL_DEPTH_COMPONENT;
case GL_DEPTH_STENCIL:
case GL_DEPTH24_STENCIL8:
return GL_DEPTH_STENCIL;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.ARB_texture_stencil8) {
switch (internalFormat) {
case GL_STENCIL_INDEX:
case GL_STENCIL_INDEX1:
case GL_STENCIL_INDEX4:
case GL_STENCIL_INDEX8:
case GL_STENCIL_INDEX16:
return GL_STENCIL_INDEX;
default:
; /* fallthrough */
}
}
switch (internalFormat) {
case GL_COMPRESSED_ALPHA:
return GL_ALPHA;
case GL_COMPRESSED_LUMINANCE:
return GL_LUMINANCE;
case GL_COMPRESSED_LUMINANCE_ALPHA:
return GL_LUMINANCE_ALPHA;
case GL_COMPRESSED_INTENSITY:
return GL_INTENSITY;
case GL_COMPRESSED_RGB:
return GL_RGB;
case GL_COMPRESSED_RGBA:
return GL_RGBA;
default:
; /* fallthrough */
}
if (ctx->Extensions.TDFX_texture_compression_FXT1) {
switch (internalFormat) {
case GL_COMPRESSED_RGB_FXT1_3DFX:
return GL_RGB;
case GL_COMPRESSED_RGBA_FXT1_3DFX:
return GL_RGBA;
default:
; /* fallthrough */
}
}
/* Assume that the ANGLE flag will always be set if the EXT flag is set.
*/
if (ctx->Extensions.ANGLE_texture_compression_dxt) {
switch (internalFormat) {
case GL_COMPRESSED_RGB_S3TC_DXT1_EXT:
return GL_RGB;
case GL_COMPRESSED_RGBA_S3TC_DXT1_EXT:
case GL_COMPRESSED_RGBA_S3TC_DXT3_EXT:
case GL_COMPRESSED_RGBA_S3TC_DXT5_EXT:
return GL_RGBA;
default:
; /* fallthrough */
}
}
if (_mesa_is_desktop_gl(ctx)
&& ctx->Extensions.ANGLE_texture_compression_dxt) {
switch (internalFormat) {
case GL_RGB_S3TC:
case GL_RGB4_S3TC:
return GL_RGB;
case GL_RGBA_S3TC:
case GL_RGBA4_S3TC:
return GL_RGBA;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.MESA_ycbcr_texture) {
if (internalFormat == GL_YCBCR_MESA)
return GL_YCBCR_MESA;
}
if (ctx->Extensions.ARB_texture_float) {
switch (internalFormat) {
case GL_ALPHA16F_ARB:
case GL_ALPHA32F_ARB:
return GL_ALPHA;
case GL_RGBA16F_ARB:
case GL_RGBA32F_ARB:
return GL_RGBA;
case GL_RGB16F_ARB:
case GL_RGB32F_ARB:
return GL_RGB;
case GL_INTENSITY16F_ARB:
case GL_INTENSITY32F_ARB:
return GL_INTENSITY;
case GL_LUMINANCE16F_ARB:
case GL_LUMINANCE32F_ARB:
return GL_LUMINANCE;
case GL_LUMINANCE_ALPHA16F_ARB:
case GL_LUMINANCE_ALPHA32F_ARB:
return GL_LUMINANCE_ALPHA;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.EXT_texture_snorm) {
switch (internalFormat) {
case GL_RED_SNORM:
case GL_R8_SNORM:
case GL_R16_SNORM:
return GL_RED;
case GL_RG_SNORM:
case GL_RG8_SNORM:
case GL_RG16_SNORM:
return GL_RG;
case GL_RGB_SNORM:
case GL_RGB8_SNORM:
case GL_RGB16_SNORM:
return GL_RGB;
case GL_RGBA_SNORM:
case GL_RGBA8_SNORM:
case GL_RGBA16_SNORM:
return GL_RGBA;
case GL_ALPHA_SNORM:
case GL_ALPHA8_SNORM:
case GL_ALPHA16_SNORM:
return GL_ALPHA;
case GL_LUMINANCE_SNORM:
case GL_LUMINANCE8_SNORM:
case GL_LUMINANCE16_SNORM:
return GL_LUMINANCE;
case GL_LUMINANCE_ALPHA_SNORM:
case GL_LUMINANCE8_ALPHA8_SNORM:
case GL_LUMINANCE16_ALPHA16_SNORM:
return GL_LUMINANCE_ALPHA;
case GL_INTENSITY_SNORM:
case GL_INTENSITY8_SNORM:
case GL_INTENSITY16_SNORM:
return GL_INTENSITY;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.EXT_texture_sRGB) {
switch (internalFormat) {
case GL_SRGB_EXT:
case GL_SRGB8_EXT:
case GL_COMPRESSED_SRGB_EXT:
return GL_RGB;
case GL_COMPRESSED_SRGB_S3TC_DXT1_EXT:
return ctx->Extensions.EXT_texture_compression_s3tc ? GL_RGB : -1;
case GL_SRGB_ALPHA_EXT:
case GL_SRGB8_ALPHA8_EXT:
case GL_COMPRESSED_SRGB_ALPHA_EXT:
return GL_RGBA;
case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT:
case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT3_EXT:
case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT5_EXT:
return ctx->Extensions.EXT_texture_compression_s3tc ? GL_RGBA : -1;
case GL_SLUMINANCE_ALPHA_EXT:
case GL_SLUMINANCE8_ALPHA8_EXT:
case GL_COMPRESSED_SLUMINANCE_ALPHA_EXT:
return GL_LUMINANCE_ALPHA;
case GL_SLUMINANCE_EXT:
case GL_SLUMINANCE8_EXT:
case GL_COMPRESSED_SLUMINANCE_EXT:
return GL_LUMINANCE;
default:
; /* fallthrough */
}
}
if (ctx->Version >= 30 ||
ctx->Extensions.EXT_texture_integer) {
switch (internalFormat) {
case GL_RGBA8UI_EXT:
case GL_RGBA16UI_EXT:
case GL_RGBA32UI_EXT:
case GL_RGBA8I_EXT:
case GL_RGBA16I_EXT:
case GL_RGBA32I_EXT:
case GL_RGB10_A2UI:
return GL_RGBA;
case GL_RGB8UI_EXT:
case GL_RGB16UI_EXT:
case GL_RGB32UI_EXT:
case GL_RGB8I_EXT:
case GL_RGB16I_EXT:
case GL_RGB32I_EXT:
return GL_RGB;
}
}
if (ctx->Extensions.EXT_texture_integer) {
switch (internalFormat) {
case GL_ALPHA8UI_EXT:
case GL_ALPHA16UI_EXT:
case GL_ALPHA32UI_EXT:
case GL_ALPHA8I_EXT:
case GL_ALPHA16I_EXT:
case GL_ALPHA32I_EXT:
return GL_ALPHA;
case GL_INTENSITY8UI_EXT:
case GL_INTENSITY16UI_EXT:
case GL_INTENSITY32UI_EXT:
case GL_INTENSITY8I_EXT:
case GL_INTENSITY16I_EXT:
case GL_INTENSITY32I_EXT:
return GL_INTENSITY;
case GL_LUMINANCE8UI_EXT:
case GL_LUMINANCE16UI_EXT:
case GL_LUMINANCE32UI_EXT:
case GL_LUMINANCE8I_EXT:
case GL_LUMINANCE16I_EXT:
case GL_LUMINANCE32I_EXT:
return GL_LUMINANCE;
case GL_LUMINANCE_ALPHA8UI_EXT:
case GL_LUMINANCE_ALPHA16UI_EXT:
case GL_LUMINANCE_ALPHA32UI_EXT:
case GL_LUMINANCE_ALPHA8I_EXT:
case GL_LUMINANCE_ALPHA16I_EXT:
case GL_LUMINANCE_ALPHA32I_EXT:
return GL_LUMINANCE_ALPHA;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.ARB_texture_rg) {
switch (internalFormat) {
case GL_R16F:
case GL_R32F:
if (!ctx->Extensions.ARB_texture_float)
break;
return GL_RED;
case GL_R8I:
case GL_R8UI:
case GL_R16I:
case GL_R16UI:
case GL_R32I:
case GL_R32UI:
if (ctx->Version < 30 && !ctx->Extensions.EXT_texture_integer)
break;
/* FALLTHROUGH */
case GL_R8:
case GL_R16:
case GL_RED:
case GL_COMPRESSED_RED:
return GL_RED;
case GL_RG16F:
case GL_RG32F:
if (!ctx->Extensions.ARB_texture_float)
break;
return GL_RG;
case GL_RG8I:
case GL_RG8UI:
case GL_RG16I:
case GL_RG16UI:
case GL_RG32I:
case GL_RG32UI:
if (ctx->Version < 30 && !ctx->Extensions.EXT_texture_integer)
break;
/* FALLTHROUGH */
case GL_RG:
case GL_RG8:
case GL_RG16:
case GL_COMPRESSED_RG:
return GL_RG;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.EXT_texture_shared_exponent) {
switch (internalFormat) {
case GL_RGB9_E5_EXT:
return GL_RGB;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.EXT_packed_float) {
switch (internalFormat) {
case GL_R11F_G11F_B10F_EXT:
return GL_RGB;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.ARB_depth_buffer_float) {
switch (internalFormat) {
case GL_DEPTH_COMPONENT32F:
return GL_DEPTH_COMPONENT;
case GL_DEPTH32F_STENCIL8:
return GL_DEPTH_STENCIL;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.ARB_texture_compression_rgtc) {
switch (internalFormat) {
case GL_COMPRESSED_RED_RGTC1:
case GL_COMPRESSED_SIGNED_RED_RGTC1:
return GL_RED;
case GL_COMPRESSED_RG_RGTC2:
case GL_COMPRESSED_SIGNED_RG_RGTC2:
return GL_RG;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.EXT_texture_compression_latc) {
switch (internalFormat) {
case GL_COMPRESSED_LUMINANCE_LATC1_EXT:
case GL_COMPRESSED_SIGNED_LUMINANCE_LATC1_EXT:
return GL_LUMINANCE;
case GL_COMPRESSED_LUMINANCE_ALPHA_LATC2_EXT:
case GL_COMPRESSED_SIGNED_LUMINANCE_ALPHA_LATC2_EXT:
return GL_LUMINANCE_ALPHA;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.ATI_texture_compression_3dc) {
switch (internalFormat) {
case GL_COMPRESSED_LUMINANCE_ALPHA_3DC_ATI:
return GL_LUMINANCE_ALPHA;
default:
; /* fallthrough */
}
}
if (ctx->Extensions.OES_compressed_ETC1_RGB8_texture) {
switch (internalFormat) {
case GL_ETC1_RGB8_OES:
return GL_RGB;
default:
; /* fallthrough */
}
}
if (_mesa_is_gles3(ctx) || ctx->Extensions.ARB_ES3_compatibility) {
switch (internalFormat) {
case GL_COMPRESSED_RGB8_ETC2:
case GL_COMPRESSED_SRGB8_ETC2:
return GL_RGB;
case GL_COMPRESSED_RGBA8_ETC2_EAC:
case GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC:
case GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2:
case GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2:
return GL_RGBA;
case GL_COMPRESSED_R11_EAC:
case GL_COMPRESSED_SIGNED_R11_EAC:
return GL_RED;
case GL_COMPRESSED_RG11_EAC:
case GL_COMPRESSED_SIGNED_RG11_EAC:
return GL_RG;
default:
; /* fallthrough */
}
}
if (_mesa_is_desktop_gl(ctx) &&
ctx->Extensions.ARB_texture_compression_bptc) {
switch (internalFormat) {
case GL_COMPRESSED_RGBA_BPTC_UNORM:
case GL_COMPRESSED_SRGB_ALPHA_BPTC_UNORM:
return GL_RGBA;
case GL_COMPRESSED_RGB_BPTC_SIGNED_FLOAT:
case GL_COMPRESSED_RGB_BPTC_UNSIGNED_FLOAT:
return GL_RGB;
default:
; /* fallthrough */
}
}
if (ctx->API == API_OPENGLES) {
switch (internalFormat) {
case GL_PALETTE4_RGB8_OES:
case GL_PALETTE4_R5_G6_B5_OES:
case GL_PALETTE8_RGB8_OES:
case GL_PALETTE8_R5_G6_B5_OES:
return GL_RGB;
case GL_PALETTE4_RGBA8_OES:
case GL_PALETTE8_RGB5_A1_OES:
case GL_PALETTE4_RGBA4_OES:
case GL_PALETTE4_RGB5_A1_OES:
case GL_PALETTE8_RGBA8_OES:
case GL_PALETTE8_RGBA4_OES:
return GL_RGBA;
default:
; /* fallthrough */
}
}
return -1; /* error */
}
/**
* Returns the effective internal format from a texture format and type.
* This is used by texture image operations internally for validation, when
* the specified internal format is a base (unsized) format.
*
* This method will only return a valid effective internal format if the
* combination of format, type and internal format in base form, is acceptable.
*
* If a single sized internal format is defined in the spec (OpenGL-ES 3.0.4) or
* in extensions, to unambiguously correspond to the given base format, then
* that internal format is returned as the effective. Otherwise, if the
* combination is accepted but a single effective format is not defined, the
* passed base format will be returned instead.
*
* \param format the texture format
* \param type the texture type
*/
static GLenum
_mesa_es3_effective_internal_format_for_format_and_type(GLenum format,
GLenum type)
{
switch (type) {
case GL_UNSIGNED_BYTE:
switch (format) {
case GL_RGBA:
return GL_RGBA8;
case GL_RGB:
return GL_RGB8;
/* Although LUMINANCE_ALPHA, LUMINANCE and ALPHA appear in table 3.12,
* (section 3.8 Texturing, page 128 of the OpenGL-ES 3.0.4) as effective
* internal formats, they do not correspond to GL constants, so the base
* format is returned instead.
*/
case GL_BGRA_EXT:
case GL_LUMINANCE_ALPHA:
case GL_LUMINANCE:
case GL_ALPHA:
return format;
}
break;
case GL_UNSIGNED_SHORT_4_4_4_4:
if (format == GL_RGBA)
return GL_RGBA4;
break;
case GL_UNSIGNED_SHORT_5_5_5_1:
if (format == GL_RGBA)
return GL_RGB5_A1;
break;
case GL_UNSIGNED_SHORT_5_6_5:
if (format == GL_RGB)
return GL_RGB565;
break;
/* OES_packed_depth_stencil */
case GL_UNSIGNED_INT_24_8:
if (format == GL_DEPTH_STENCIL)
return GL_DEPTH24_STENCIL8;
break;
case GL_FLOAT_32_UNSIGNED_INT_24_8_REV:
if (format == GL_DEPTH_STENCIL)
return GL_DEPTH32F_STENCIL8;
break;
case GL_UNSIGNED_SHORT:
if (format == GL_DEPTH_COMPONENT)
return GL_DEPTH_COMPONENT16;
break;
case GL_UNSIGNED_INT:
/* It can be DEPTH_COMPONENT16 or DEPTH_COMPONENT24, so just return
* the format.
*/
if (format == GL_DEPTH_COMPONENT)
return format;
break;
/* OES_texture_float and OES_texture_half_float */
case GL_FLOAT:
if (format == GL_DEPTH_COMPONENT)
return GL_DEPTH_COMPONENT32F;
/* fall through */
case GL_HALF_FLOAT_OES:
switch (format) {
case GL_RGBA:
case GL_RGB:
case GL_LUMINANCE_ALPHA:
case GL_LUMINANCE:
case GL_ALPHA:
case GL_RED:
case GL_RG:
return format;
}
break;
case GL_HALF_FLOAT:
switch (format) {
case GL_RG:
case GL_RED:
return format;
}
break;
/* GL_EXT_texture_type_2_10_10_10_REV */
case GL_UNSIGNED_INT_2_10_10_10_REV:
switch (format) {
case GL_RGBA:
case GL_RGB:
return format;
}
break;
default:
/* fall through and return NONE */
break;
}
return GL_NONE;
}
/**
* Do error checking of format/type combinations for OpenGL ES 3
@@ -2091,7 +2718,53 @@ _mesa_es3_error_check_format_and_type(const struct gl_context *ctx,
GLenum format, GLenum type,
GLenum internalFormat)
{
/* If internalFormat is an unsized format, then the effective internal
* format derived from format and type should be used instead. Page 127,
* section "3.8 Texturing" of the GLES 3.0.4 spec states:
*
* "if internalformat is a base internal format, the effective
* internal format is a sized internal format that is derived
* from the format and type for internal use by the GL.
* Table 3.12 specifies the mapping of format and type to effective
* internal formats. The effective internal format is used by the GL
* for purposes such as texture completeness or type checks for
* CopyTex* commands. In these cases, the GL is required to operate
* as if the effective internal format was used as the internalformat
* when specifying the texture data."
*/
if (_mesa_is_enum_format_unsized(internalFormat)) {
GLenum effectiveInternalFormat =
_mesa_es3_effective_internal_format_for_format_and_type(format, type);
if (effectiveInternalFormat == GL_NONE)
return GL_INVALID_OPERATION;
GLenum baseInternalFormat;
if (internalFormat == GL_BGRA_EXT) {
/* Unfortunately, _mesa_base_tex_format returns a base format of
* GL_RGBA for GL_BGRA_EXT. This makes perfect sense if you're
* asking the question, "what channels does this format have?"
* However, if we're trying to determine if two internal formats
* match in the ES3 sense, we actually want GL_BGRA.
*/
baseInternalFormat = GL_BGRA_EXT;
} else {
baseInternalFormat =
_mesa_base_tex_format(ctx, effectiveInternalFormat);
}
if (internalFormat != baseInternalFormat)
return GL_INVALID_OPERATION;
internalFormat = effectiveInternalFormat;
}
switch (format) {
case GL_BGRA_EXT:
if (type != GL_UNSIGNED_BYTE || internalFormat != GL_BGRA)
return GL_INVALID_OPERATION;
break;
case GL_RGBA:
switch (type) {
case GL_UNSIGNED_BYTE:

View File

@@ -131,6 +131,8 @@ extern GLenum
_mesa_es3_error_check_format_and_type(const struct gl_context *ctx,
GLenum format, GLenum type,
GLenum internalFormat);
extern GLint
_mesa_base_tex_format(const struct gl_context *ctx, GLint internalFormat );
extern uint32_t
_mesa_format_from_format_and_type(GLenum format, GLenum type);

View File

@@ -1922,11 +1922,8 @@ generate_mipmap_uncompressed(struct gl_context *ctx, GLenum target,
}
/* get dest gl_texture_image */
dstImage = _mesa_get_tex_image(ctx, texObj, target, level + 1);
if (!dstImage) {
_mesa_error(ctx, GL_OUT_OF_MEMORY, "generating mipmaps");
return;
}
dstImage = _mesa_select_tex_image(texObj, target, level + 1);
assert(dstImage);
if (target == GL_TEXTURE_1D_ARRAY) {
srcDepth = srcHeight;
@@ -2110,7 +2107,19 @@ generate_mipmap_compressed(struct gl_context *ctx, GLenum target,
srcWidth, srcHeight, srcDepth,
&dstWidth, &dstHeight, &dstDepth);
if (!nextLevel)
break;
goto end;
if (!_mesa_prepare_mipmap_level(ctx, texObj, level + 1,
dstWidth, dstHeight, dstDepth,
border, srcImage->InternalFormat,
srcImage->TexFormat)) {
/* all done */
goto end;
}
/* get dest gl_texture_image */
dstImage = _mesa_select_tex_image(texObj, target, level + 1);
assert(dstImage);
/* Compute dst image strides and alloc memory on first iteration */
temp_dst_row_stride = _mesa_format_row_stride(temp_format, dstWidth);
@@ -2124,13 +2133,6 @@ generate_mipmap_compressed(struct gl_context *ctx, GLenum target,
}
}
/* get dest gl_texture_image */
dstImage = _mesa_get_tex_image(ctx, texObj, target, level + 1);
if (!dstImage) {
_mesa_error(ctx, GL_OUT_OF_MEMORY, "generating mipmaps");
goto end;
}
/* for 2D arrays, setup array[depth] of slice pointers */
for (i = 0; i < srcDepth; i++) {
temp_src_slices[i] = temp_src + temp_src_img_stride * i;
@@ -2149,14 +2151,6 @@ generate_mipmap_compressed(struct gl_context *ctx, GLenum target,
dstWidth, dstHeight, dstDepth,
temp_dst_slices, temp_dst_row_stride);
if (!_mesa_prepare_mipmap_level(ctx, texObj, level + 1,
dstWidth, dstHeight, dstDepth,
border, srcImage->InternalFormat,
srcImage->TexFormat)) {
/* all done */
goto end;
}
/* The image space was allocated above so use glTexSubImage now */
ctx->Driver.TexSubImage(ctx, 2, dstImage,
0, 0, 0, dstWidth, dstHeight, dstDepth,

View File

@@ -4170,13 +4170,6 @@ struct gl_image_unit
*/
GLboolean Layered;
/**
* GL_TRUE if the state of this image unit is valid and access from
* the shader is allowed. Otherwise loads from this unit should
* return zero and stores should have no effect.
*/
GLboolean _Valid;
/**
* Layer of the texture object bound to this unit as specified by the
* application.

View File

@@ -1074,6 +1074,21 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint n, GLvoid *dest,
}
}
break;
case GL_UNSIGNED_INT_24_8:
{
const GLdouble scale = (GLdouble) 0xffffff;
GLuint *dst = (GLuint *) dest;
GLuint i;
for (i = 0; i < n; i++) {
GLuint z = (GLuint) (depthSpan[i] * scale);
assert(z <= 0xffffff);
dst[i] = (z << 8);
}
if (dstPacking->SwapBytes) {
_mesa_swap4( (GLuint *) dst, n );
}
break;
}
case GL_UNSIGNED_INT:
{
GLuint *dst = (GLuint *) dest;

View File

@@ -111,11 +111,9 @@ _mesa_GetProgramInterfaceiv(GLuint program, GLenum programInterface,
for (i = 0, *params = 0; i < shProg->NumProgramResourceList; i++) {
if (shProg->ProgramResourceList[i].Type != programInterface)
continue;
const char *name =
_mesa_program_resource_name(&shProg->ProgramResourceList[i]);
unsigned array_size =
_mesa_program_resource_array_size(&shProg->ProgramResourceList[i]);
*params = MAX2(*params, strlen(name) + (array_size ? 3 : 0) + 1);
unsigned len =
_mesa_program_resource_name_len(&shProg->ProgramResourceList[i]);
*params = MAX2(*params, len + 1);
}
break;
case GL_MAX_NUM_ACTIVE_VARIABLES:

View File

@@ -476,7 +476,7 @@ _mesa_program_resource_array_size(struct gl_program_resource *res)
RESOURCE_XFB(res)->Size : 0;
case GL_PROGRAM_INPUT:
case GL_PROGRAM_OUTPUT:
return RESOURCE_VAR(res)->data.max_array_access;
return RESOURCE_VAR(res)->type->length;
case GL_UNIFORM:
case GL_VERTEX_SUBROUTINE_UNIFORM:
case GL_GEOMETRY_SUBROUTINE_UNIFORM:
@@ -661,6 +661,57 @@ _mesa_program_resource_find_index(struct gl_shader_program *shProg,
return NULL;
}
/* Function returns if resource name is expected to have index
* appended into it.
*
*
* Page 61 (page 73 of the PDF) in section 2.11 of the OpenGL ES 3.0
* spec says:
*
* "If the active uniform is an array, the uniform name returned in
* name will always be the name of the uniform array appended with
* "[0]"."
*
* The same text also appears in the OpenGL 4.2 spec. It does not,
* however, appear in any previous spec. Previous specifications are
* ambiguous in this regard. However, either name can later be passed
* to glGetUniformLocation (and related APIs), so there shouldn't be any
* harm in always appending "[0]" to uniform array names.
*
* Geometry shader stage has different naming convention where the 'normal'
* condition is an array, therefore for variables referenced in geometry
* stage we do not add '[0]'.
*
* Note, that TCS outputs and TES inputs should not have index appended
* either.
*/
static bool
add_index_to_name(struct gl_program_resource *res)
{
bool add_index = !(((res->Type == GL_PROGRAM_INPUT) &&
res->StageReferences & (1 << MESA_SHADER_GEOMETRY)));
/* Transform feedback varyings have array index already appended
* in their names.
*/
if (res->Type == GL_TRANSFORM_FEEDBACK_VARYING)
add_index = false;
return add_index;
}
/* Get name length of a program resource. This consists of
* base name + 3 for '[0]' if resource is an array.
*/
extern unsigned
_mesa_program_resource_name_len(struct gl_program_resource *res)
{
unsigned length = strlen(_mesa_program_resource_name(res));
if (_mesa_program_resource_array_size(res) && add_index_to_name(res))
length += 3;
return length;
}
/* Get full name of a program resource.
*/
bool
@@ -696,36 +747,7 @@ _mesa_get_program_resource_name(struct gl_shader_program *shProg,
_mesa_copy_string(name, bufSize, length, _mesa_program_resource_name(res));
/* Page 61 (page 73 of the PDF) in section 2.11 of the OpenGL ES 3.0
* spec says:
*
* "If the active uniform is an array, the uniform name returned in
* name will always be the name of the uniform array appended with
* "[0]"."
*
* The same text also appears in the OpenGL 4.2 spec. It does not,
* however, appear in any previous spec. Previous specifications are
* ambiguous in this regard. However, either name can later be passed
* to glGetUniformLocation (and related APIs), so there shouldn't be any
* harm in always appending "[0]" to uniform array names.
*
* Geometry shader stage has different naming convention where the 'normal'
* condition is an array, therefore for variables referenced in geometry
* stage we do not add '[0]'.
*
* Note, that TCS outputs and TES inputs should not have index appended
* either.
*/
bool add_index = !(((programInterface == GL_PROGRAM_INPUT) &&
res->StageReferences & (1 << MESA_SHADER_GEOMETRY)));
/* Transform feedback varyings have array index already appended
* in their names.
*/
if (programInterface == GL_TRANSFORM_FEEDBACK_VARYING)
add_index = false;
if (add_index && _mesa_program_resource_array_size(res)) {
if (_mesa_program_resource_array_size(res) && add_index_to_name(res)) {
int i;
/* The comparison is strange because *length does *NOT* include the
@@ -972,13 +994,9 @@ _mesa_program_resource_prop(struct gl_shader_program *shProg,
switch (res->Type) {
case GL_ATOMIC_COUNTER_BUFFER:
goto invalid_operation;
case GL_TRANSFORM_FEEDBACK_VARYING:
*val = strlen(_mesa_program_resource_name(res)) + 1;
break;
default:
/* Base name +3 if array '[0]' + terminator. */
*val = strlen(_mesa_program_resource_name(res)) +
(_mesa_program_resource_array_size(res) > 0 ? 3 : 0) + 1;
/* Resource name length + terminator. */
*val = _mesa_program_resource_name_len(res) + 1;
}
return 1;
case GL_TYPE:
@@ -1003,7 +1021,7 @@ _mesa_program_resource_prop(struct gl_shader_program *shProg,
return 1;
case GL_PROGRAM_INPUT:
case GL_PROGRAM_OUTPUT:
*val = MAX2(RESOURCE_VAR(res)->type->length, 1);
*val = MAX2(_mesa_program_resource_array_size(res), 1);
return 1;
case GL_TRANSFORM_FEEDBACK_VARYING:
*val = MAX2(RESOURCE_XFB(res)->Size, 1);

View File

@@ -245,6 +245,9 @@ _mesa_get_program_resource_name(struct gl_shader_program *shProg,
GLsizei bufSize, GLsizei *length,
GLchar *name, const char *caller);
extern unsigned
_mesa_program_resource_name_len(struct gl_program_resource *res);
extern GLint
_mesa_program_resource_location(struct gl_shader_program *shProg,
GLenum programInterface, const char *name);

View File

@@ -415,8 +415,8 @@ _mesa_init_image_units(struct gl_context *ctx)
ctx->ImageUnits[i] = _mesa_default_image_unit(ctx);
}
static GLboolean
validate_image_unit(struct gl_context *ctx, struct gl_image_unit *u)
GLboolean
_mesa_is_image_unit_valid(struct gl_context *ctx, struct gl_image_unit *u)
{
struct gl_texture_object *t = u->TexObj;
mesa_format tex_format;
@@ -424,7 +424,8 @@ validate_image_unit(struct gl_context *ctx, struct gl_image_unit *u)
if (!t)
return GL_FALSE;
_mesa_test_texobj_completeness(ctx, t);
if (!t->_BaseComplete && !t->_MipmapComplete)
_mesa_test_texobj_completeness(ctx, t);
if (u->Level < t->BaseLevel ||
u->Level > t->_MaxLevel ||
@@ -473,17 +474,6 @@ validate_image_unit(struct gl_context *ctx, struct gl_image_unit *u)
return GL_TRUE;
}
void
_mesa_validate_image_units(struct gl_context *ctx)
{
unsigned i;
for (i = 0; i < ctx->Const.MaxImageUnits; ++i) {
struct gl_image_unit *u = &ctx->ImageUnits[i];
u->_Valid = validate_image_unit(ctx, u);
}
}
static GLboolean
validate_bind_image_texture(struct gl_context *ctx, GLuint unit,
GLuint texture, GLint level, GLboolean layered,
@@ -567,7 +557,6 @@ _mesa_BindImageTexture(GLuint unit, GLuint texture, GLint level,
u->Access = access;
u->Format = format;
u->_ActualFormat = _mesa_get_shader_image_format(format);
u->_Valid = validate_image_unit(ctx, u);
if (u->TexObj && _mesa_tex_target_is_layered(u->TexObj->Target)) {
u->Layered = layered;
@@ -707,7 +696,6 @@ _mesa_BindImageTextures(GLuint first, GLsizei count, const GLuint *textures)
u->Access = GL_READ_WRITE;
u->Format = tex_format;
u->_ActualFormat = _mesa_get_shader_image_format(tex_format);
u->_Valid = validate_image_unit(ctx, u);
} else {
/* Unbind the texture from the unit */
_mesa_reference_texobj(&u->TexObj, NULL);
@@ -717,7 +705,6 @@ _mesa_BindImageTextures(GLuint first, GLsizei count, const GLuint *textures)
u->Access = GL_READ_ONLY;
u->Format = GL_R8;
u->_ActualFormat = MESA_FORMAT_R_UNORM8;
u->_Valid = GL_FALSE;
}
/* Pass the BindImageTexture call down to the device driver */

Some files were not shown because too many files have changed in this diff Show More