Compare commits

88 Commits

Author SHA1 Message Date
Ian Romanick
456cdb6d01 mesa: Bump version to 9.1-rc2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-17 14:49:02 -08:00
Eric Anholt
aaee862305 i965/fs: Use a helper function for checking for flow control instructions.
In 2 of our checks, we were missing BREAK and CONTINUE.

NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bf91f0b039)
2013-02-17 14:20:39 -08:00
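The helper-function idea in aaee862305 can be sketched as follows. The opcode enum and the function name are hypothetical and are not taken from the i965 source; the point is only that a single predicate lists every flow control opcode, so no call site can leave out BREAK or CONTINUE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode list for illustration; not the driver's real enum. */
enum sketch_opcode {
    SKETCH_OP_ADD,
    SKETCH_OP_MOV,
    SKETCH_OP_IF,
    SKETCH_OP_ELSE,
    SKETCH_OP_ENDIF,
    SKETCH_OP_DO,
    SKETCH_OP_WHILE,
    SKETCH_OP_BREAK,
    SKETCH_OP_CONTINUE
};

/* One place that decides what counts as flow control. Every check
 * calls this instead of repeating (and possibly truncating) the list. */
static bool sketch_is_control_flow(enum sketch_opcode op)
{
    switch (op) {
    case SKETCH_OP_IF:
    case SKETCH_OP_ELSE:
    case SKETCH_OP_ENDIF:
    case SKETCH_OP_DO:
    case SKETCH_OP_WHILE:
    case SKETCH_OP_BREAK:
    case SKETCH_OP_CONTINUE:
        return true;
    default:
        return false;
    }
}
```

A caller would write `if (sketch_is_control_flow(op))` wherever it previously compared against individual opcodes.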
bma
b84d9aa0c6 shaderapi: Fix AttachShader error
Detect a duplicate Shader type as an error instead of silently allowing
it; restrict the check to the ES2 API.

v2: Tapani Pälli <tapani.palli@intel.com>
    - make the check run time instead of compile time

v3: chadv
    - Quote spec on which error to generate.

Signed-off-by: bma <Bo.Ma@windriver.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit ce3dfa19ab)
2013-02-17 14:20:34 -08:00
Eric Anholt
bb4b1494e3 i965: Re-enable the -RHW workaround for original gen4 chips.
Fixes broken clipping in supertuxkart and presumably many other applications.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51471
NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit cb4616d32d)
2013-02-17 14:20:27 -08:00
Eric Anholt
321abaaa8d i965/gen4: Work around missing sRGB RGB DXT1 support.
The hardware just doesn't support it.  I suspect this was a regression from
the move to fixed MESA_FORMATs for compressed textures and that previously we
were storing uncompressed for this or something.

Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor
swizzled" on my GM965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ddc2b453d0)
2013-02-17 14:20:22 -08:00
Ian Romanick
95f1203a7c mesa: Add .cherry-ignore for 9.1
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-17 14:16:22 -08:00
Christopher James Halse Rogers
96fb4d61fb i965: Fix leak in blorp CopyTexSubImage2D
_mesa_delete_renderbuffer does not call the driver-specific
renderbuffer delete function, so the blorp code was leaking the
Intel-specific bits, including some GEM objects.

Call the renderbuffer's ->Delete() method instead, which does the
right thing.

Fixes Unity rapidly sending the machine into the arms of the OOM-killer

Note: This is a candidate for the 9.1 branch.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit dd599188d2)
2013-02-17 14:13:27 -08:00
Brian Paul
d8a0439c65 st/mesa: fix format query for GL_ARB_texture_rg
The GL_ARB_texture_rg spec says that we need to support both texturing
and rendering for the GL_RED and GL_RG formats.  So move the format
check up into the rendertarget_mapping[] list.  Also, add
PIPE_FORMAT_R8_UNORM to the list of formats required.

Note: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 4be5a06752)
2013-02-17 14:13:13 -08:00
Eric Anholt
c785315f3d i965/gen7: Set up all samplers even if samplers are sparsely used.
In GLSL, sampler indices are allocated contiguously from 0.  But in the
case of ARB_fragment_program (and possibly fixed function), an app that
uses texture 0 and 2 will use sampler indices 0 and 2, so we were only
allocating space for samplers 0 and 1 and setting up sampler 0.  We
would read garbage for sampler 2, resulting in flickering textures and
an angry simulator.

Fixes bad rendering in 0 A.D. and ETQW.  This was fixed for pre-gen7 by
28f4be9eb9

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable branches.
(cherry picked from commit 5bb05c6e6d)
2013-02-17 14:12:47 -08:00
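The sparse-sampler problem in c785315f3d can be illustrated with a small sketch. It assumes, purely for illustration, that sampler usage is tracked as a bitmask; both function names are hypothetical. Counting the used samplers gives 2 for an app that uses textures 0 and 2, whereas the table has to cover every index up to the highest one in use, which is 3 slots.

```c
#include <assert.h>
#include <stdint.h>

/* Number of samplers in use (a popcount). Sizing the table from this
 * value is too small when the used indices are not contiguous. */
static unsigned sketch_used_sampler_count(uint32_t used_mask)
{
    unsigned count = 0;
    while (used_mask) {
        count += used_mask & 1u;
        used_mask >>= 1;
    }
    return count;
}

/* Number of slots needed so that every used index is covered:
 * the highest used index plus one. */
static unsigned sketch_sampler_slots_needed(uint32_t used_mask)
{
    unsigned slots = 0;
    while (used_mask) {
        slots++;
        used_mask >>= 1;
    }
    return slots;
}
```

With a mask of `0x5` (samplers 0 and 2), the first function yields 2 and the second yields 3; the difference is the slot that was previously left uninitialized.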
Kenneth Graunke
0e3c755ca3 i965: Use derived state for Haswell's 3DSTATE_VF packet.
Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX.

Fixes gles3conform's primitive_restart_mode test.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8cabe26f5d)
2013-02-17 14:12:10 -08:00
Brian Paul
4d11454e90 util: fix incorrect Z bit masking in util_clear_depth_stencil()
For PIPE_FORMAT_Z24_UNORM_S8_UINT, the Z bits are in the 24
least significant bits.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60527
and http://bugs.freedesktop.org/show_bug.cgi?id=60524
and http://bugs.freedesktop.org/show_bug.cgi?id=60047

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4bfdef87e6)
2013-02-17 14:10:55 -08:00
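The masking rule stated in 4d11454e90 can be sketched as below. The macro and function names are hypothetical; the sketch assumes a 32-bit packed pixel with depth in the 24 least significant bits (as the commit says) and stencil in the remaining top 8 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of a packed Z24/S8 pixel: depth low, stencil high. */
#define SKETCH_Z24_MASK 0x00ffffffu
#define SKETCH_S8_MASK  0xff000000u

/* Clear only the depth bits and leave the stencil byte untouched. */
static uint32_t sketch_clear_depth_only(uint32_t pixel, uint32_t zvalue)
{
    return (pixel & ~SKETCH_Z24_MASK) | (zvalue & SKETCH_Z24_MASK);
}

/* Clear only the stencil byte and leave the depth bits untouched. */
static uint32_t sketch_clear_stencil_only(uint32_t pixel, uint8_t svalue)
{
    return (pixel & ~SKETCH_S8_MASK) | ((uint32_t)svalue << 24);
}
```

Swapping the two masks would overwrite stencil data on a depth-only clear, which is the kind of error the commit title describes.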
Marek Olšák
9838215f3c mesa: fix GetTexImage if mesa format and internal format don't match
Tested with softpipe only exposing RGBA formats.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cb6470775c)
2013-02-17 14:10:40 -08:00
Marek Olšák
2e4473d9e3 mesa: don't use memcpy fast path for GetTexImage if base format is different
The Mesa format can be RGBA8888_REV, the format/type can be
GL_RGBA/GL_UNSIGNED_BYTE, but the actual texture internal format can be
LUMINANCE_ALPHA, INTENSITY, etc. Therefore we should look at the base
internal format as well.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit c8379204ab)
2013-02-17 14:10:28 -08:00
Marek Olšák
11eb644cc9 mesa: don't use _mesa_base_tex_format for format parameter of GetTexImage
_mesa_base_tex_format doesn't accept GL_BGR and GL_ABGR_EXT, etc.

v2: add a (now hopefully complete) helper function to deal with this

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 09a99867ab)
2013-02-17 14:09:55 -08:00
Ian Romanick
60bad0ddc3 intel: Do not expose OES_compressed_ETC1_RGB8_texture or ARB_texture_rgb10_a2ui pre-GEN4
Older hardware cannot do ARB_texture_rgb10_a2ui, and the translation
code for OES_compressed_ETC1_RGB8_texture was never implemented in the
i915 driver.

NOTE: This is a candidate for all stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0e2f26d5ea)
2013-02-17 14:09:09 -08:00
Roland Scheidegger
d41e9b4d14 softpipe: fix using optimized filter function
This optimized filter (when using repeat wrap modes,
linear min/mag/mip filters, pot textures) only applies to 2d textures,
but nothing prevented it from being used for other textures (likely
leading to very bogus sample results).

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 66b6d51214)
2013-02-17 14:09:01 -08:00
Kristian Høgsberg
60f05f0eef egl-wayland: Make sure we allocate a back buffer even if nothing was rendered
At eglSwapBuffer time, we blindly assume we have a back buffer, but the
back buffer only gets allocated when somebody tries to render something.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

https://bugs.freedesktop.org/show_bug.cgi?id=60086
(cherry picked from commit 1fe007399c)
2013-02-17 14:08:41 -08:00
Brian Paul
714d8b3f8c svga: fix sRGB rendering
We weren't emitting the SVGA_RS_OUTPUTGAMMA state so sRGB rendering
didn't work properly.

Fixes piglit's framebuffer-srgb test.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit ff60509157)
2013-02-17 14:08:22 -08:00
Brian Paul
ac22dffaf6 st/mesa: don't choose DXT formats if we can't do DXT compression
If we call gl[Copy]TexImage2D() with a generic compression format
(e.g. intFormat=GL_COMPRESSED_RGBA) we can't choose a DXT format if
we don't have the external DXT compression library.

We weren't actually enforcing this before since the
pipe_screen::is_format_supported(DXT) query has no dependency on
the DXT compression library.

Now if we're given a generic compressed format and we can't do DXT
compression we'll fall back to a non-compressed format.

v2: use util_format_is_s3tc() function and add more comments about
the allow_dxt parameter.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4df42890c5)
2013-02-17 14:08:16 -08:00
Brian Paul
4ec3843a54 mesa: don't use format chooser code for glCompressedTexImage
When glCompressedTexImage is called the internalFormat is a specific
format for the incoming image and the hardware format should be
the same (since we never do format transcoding).  So use the simpler
_mesa_glenum_to_compressed_format() function.  This change is also
needed for the next patch.

Note: This is a candidate for the stable branches.
(cherry picked from commit 478056b81a)
2013-02-17 14:07:37 -08:00
Michel Dänzer
30ae2f97c5 configure.ac: GLX cannot work without OpenGL
GLX uses mapi/glapi/libglapi.la, which is only built for OpenGL.

If the user specified --enable-xlib-glx --disable-opengl, error out, as these
cannot be both observed at the same time. If the user just specified
--disable-opengl but not --disable-glx, print a warning and disable GLX as
well.

NOTE: This is a candidate for the stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59364

Tested-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 3b888f534c)
2013-02-17 14:07:22 -08:00
Stéphane Marchesin
b289e639e4 glx: Check that swap_buffers_reply is non-NULL before using it
Check that the return value from xcb_dri2_swap_buffers_reply is
non-NULL before accessing the struct members.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 67e7263e45)
2013-02-17 14:02:29 -08:00
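The pattern behind b289e639e4 is a plain NULL check before dereferencing a reply. The sketch below uses a hypothetical stand-in struct and function rather than the real XCB types, so it shows only the shape of the guard, not the actual GLX code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a reply that carries a 64-bit swap count
 * split into two 32-bit halves. */
struct sketch_swap_reply {
    uint32_t swap_hi;
    uint32_t swap_lo;
};

/* Returns false without touching *count when the reply is missing,
 * for example because the request failed. */
static bool sketch_read_swap_count(const struct sketch_swap_reply *reply,
                                   uint64_t *count)
{
    if (reply == NULL)
        return false;

    *count = ((uint64_t)reply->swap_hi << 32) | reply->swap_lo;
    return true;
}
```

The caller decides what to do on `false`; the sketch deliberately does not guess at the error handling the real code uses.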
Brian Paul
c7e5e9ddce st/mesa: only enable GL_EXT_framebuffer_multisample if GL_MAX_SAMPLES >= 2
We never really have multisampling with one sample per pixel.
See also http://bugs.freedesktop.org/show_bug.cgi?id=59873

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

(cherry picked from commit c80bacba2e)
2013-02-17 14:01:37 -08:00
Brian Paul
7b9e99f45b mesa: don't enable GL_EXT_framebuffer_multisample for software drivers
Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 8f3c81d018)
2013-02-17 14:00:11 -08:00
Brian Paul
9cea40321c osmesa: use _mesa_generate_mipmap() for mipmap generation, not meta
See previous commit for more info.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 2180f32972)
2013-02-17 13:59:52 -08:00
Brian Paul
29e63455aa xlib: use _mesa_generate_mipmap() for mipmap generation, not meta
The swrast fragment program interpreter has trouble computing the
right texture LOD because it doesn't have easy access to input
derivatives.  This causes the GLSL-based meta generate mipmap code
to fetch texels from the wrong mipmap level.

One possible fix would be to set the GL_TEXTURE_MIN/MAX_LOD parameters
to limit sampling from the right level.  But let's just use the
_mesa_generate_mipmap() fallback since it's a lot faster than using
the fragment shader interpreter.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=54240

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 89551ae04f)
2013-02-17 13:59:43 -08:00
Paul Berry
632a5a3a5b glsl: don't allow non-flat integral types in varying structs/arrays.
In the GLSL 1.30 spec, section 4.3.6 ("Outputs") says:

    "If a vertex output is a signed or unsigned integer or integer
    vector, then it must be qualified with the interpolation qualifier
    flat."

The GLSL ES 3.00 spec further clarifies, in section 4.3.6 ("Output
Variables"):

    "Vertex shader outputs that are, *or contain*, signed or unsigned
    integers or integer vectors must be qualified with the
    interpolation qualifier flat."

(Emphasis mine.)

The language in the GLSL ES 3.00 spec is clearly correct and should be
applied to all shading language versions, since varyings that contain
ints can't be interpolated, regardless of which shading language
version is in use.

(Note that in GLSL 1.50 the restriction is changed to apply to
fragment shader inputs rather than vertex shader outputs, to
accommodate the fact that in the presence of geometry shaders, vertex
shader outputs are not necessarily interpolated.  That will be
addressed by a future patch).

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 93c913485e)
2013-02-15 13:28:01 -08:00
Paul Berry
2cd4824fbc glsl: Allow default precision qualifiers to be set for sampler types.
From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"):

    "The precision statement

        precision precision-qualifier type;

    can be used to establish a default precision qualifier. The type
    field can be either int or float or any of the sampler types, and
    the precision-qualifier can be lowp, mediump, or highp."

GLSL ES 1.00 has similar language.  GLSL 1.30 doesn't allow precision
qualifiers on sampler types, but this seems like an oversight (since
the intention of including these in GLSL 1.30 is to allow
compatibility with ES shaders).

Previously, Mesa followed GLSL 1.30 and only allowed default precision
qualifiers to be set for float and int.  This patch makes it follow
GLSL ES rules in all cases.

Fixes Piglit tests default-precision-sampler.{vert,frag}.

Partially addresses https://bugs.freedesktop.org/show_bug.cgi?id=60737.

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d5948f2f5e)
2013-02-15 13:27:48 -08:00
Michel Dänzer
05de84442b radeonsi: Handle TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS
8 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit c840270ebe)
2013-02-15 18:47:28 +01:00
Michel Dänzer
14372a70ec radeonsi: Fix array indices for detecting integer vertex formats
(cherry picked from commit f34ad85765)
2013-02-15 18:47:21 +01:00
Christian König
baa9070346 radeonsi: remove constant index limitation v3
With the llvm patches, fixing 14 piglit tests in total.

v2: increase the const limit
v3: document the const limit

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8c80894fb3)
2013-02-15 18:46:32 +01:00
Christian König
f50e4e21f4 radeonsi: support constants as TEX coordinates
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8514f5ac01)
2013-02-15 18:45:29 +01:00
Tom Stellard
38e728498b configure.ac: Add components to LLVM_COMPONENTS when using llvm shared libs
This is required when LLVM is built with CMake, which creates one
shared library for each component.
(cherry picked from commit 0898047e7b)
2013-02-13 17:02:12 -05:00
Matt Turner
fb2eb65126 Revert "mesa: Return INVALID_OPERATION when type is known but not allowed"
This reverts commit 2906e2034c.

Fixes a regression in the glean depthStencil test.

Reverting this does not affect any tests in es3conform, so a more recent
patch must have also fixed the failure this one was intended to fix.

Reported-by: lu hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59494
(cherry picked from commit a527b2192e)
2013-02-13 15:31:50 -05:00
Tom Stellard
741a249cbf r600g: Handle SET*_DX10 instructions in r600_bytecode_get_num_operands()
2013-02-13 15:31:34 -05:00
Jerome Glisse
3ae8678f81 r600g: fix lockup when hyperz & alpha test are enabled together. v3
It seems that enabling alpha test confuses the GPU about the order in
which it should perform Z testing. So force the order programmed
through db shader control.

v2: Only force z order when alpha test is enabled
v3: Update db shader when binding new dsa + spelling fix

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 974b482aca)
2013-02-12 17:06:36 -05:00
Jordan Justen
85604f3d48 CopyTexImage: Don't check sRGB vs LINEAR for desktop GL
In OpenGL 4.3, new language was added that would require
this check. But, if this check results in broken applications
then perhaps it will be reversed.

For now, remove this check and re-evaluate when
desktop GL 4.3 is closer.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-12 11:25:15 -08:00
Quentin Glidic
9119b4e8ee configure.ac: Fix --with-llvm-shared-libs
The third argument of AC_ARG_WITH is evaluated for any provided value,
not only on --with-, so it must not force-enable the feature
Also, setting $with_llvm_shared_libs in the opencl check was overriding
the user switch

https://bugs.freedesktop.org/show_bug.cgi?id=59851

Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
(cherry picked from commit 1e857130f0)
2013-02-12 15:58:22 +00:00
Tom Stellard
f4f306b8ba r600g/llvm: Select the correct GPU type for RV670
RV670 belongs in the R600 chip class

https://bugs.freedesktop.org/show_bug.cgi?id=58666

NOTE: This is a candidate for the 9.1 branch
(cherry picked from commit 257006e2a4)
2013-02-12 15:58:04 +00:00
Jerome Glisse
99adec8a88 r600g: make sure async blit is done 8 * pitch at a time v2
The blit must be aligned on 8 horizontal blocks.

v2: no need to align the remainder

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 323a448825)
2013-02-11 18:44:55 -05:00
Martin Andersson
3b609f12f6 winsys/radeon: fix bo with virtual address referencing mismatch
If the same context tries to flink and open the object, use the
same bo struct instead of opening a new gem handle for the object.
This way we avoid having 2 different handles pointing to the
same kernel object, which can later lead to trouble with virtual
addresses.

Fix:
https://bugs.freedesktop.org/show_bug.cgi?id=60200

Signed-off-by: Martin Andersson <g02maran@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit a37835c8ed)
2013-02-11 18:41:28 -05:00
Andreas Boll
ecd310bd67 docs: document removal of makedepend build dependency
Build dependency removed with
424f200881

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 44a5d7371c)
2013-02-11 18:12:30 +01:00
Matt Turner
2a7affc1d5 builtin_compiler/build: Don't use *_FOR_BUILD when not cross compiling
Previously we were relying on CFLAGS_FOR_BUILD to be the same as CFLAGS
when not cross compiling, but this assumption didn't take into
consideration 32-bit builds on 64-bit systems. More generally, not
honoring CFLAGS is bad.

Automake is evidently too stupid to accept

if CROSS_COMPILING
CC = @CC_FOR_BUILD@
...
else
CC = @CC@
endif

without warning that CC has been already defined. The warnings are
harmless, but I'd prefer to avoid future reports about them, so define
proxy variables, which are assigned inside the conditional and then
unconditionally assigned to CC et al.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59737
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60038
(cherry picked from commit 2db1f73849)
2013-02-11 12:28:06 +01:00
Quentin Glidic
c684e3b53e gallium/egl: Fix include dirs for VPATH build
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
(cherry picked from commit 11bd1b0f58)
2013-02-11 11:48:54 +01:00
Andreas Boll
b94eeffe60 mesa: Bump version to 9.1-rc1
2013-02-11 09:21:54 +01:00
Jerome Glisse
a0528269a3 winsys/radeon: improve debugging printing
Make sure one can identify virtual address failure from allocation
failure.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 9a47684564)
2013-02-08 20:33:22 -05:00
Jerome Glisse
18ef6b1265 xorg: fix exa finish access
The exa core will already set the pointer to NULL prior to calling
the callback function. So don't bail out in the callback if it's
already NULL.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 3310acdf47)
2013-02-08 19:01:51 -05:00
Paul Berry
0419b7a3a1 glsl: Support transform feedback of varying structs.
Since transform feedback needs to be able to access individual fields
of varying structs, we can no longer match up the arguments to
glTransformFeedbackVaryings() with variables in the vertex shader.

Instead, we build up a hashtable which records information about each
possible name that is a candidate for transform feedback, and then
match up the arguments to glTransformFeedbackVaryings() with the
contents of that hashtable.

Populating the hashtable uses the program_resource_visitor
infrastructure, so the logic is shared with how we handle uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 99b78337e3)
2013-02-08 11:17:33 -08:00
Paul Berry
5be2e14393 glsl: Use parse_program_resource_name to parse transform feedback varyings.
Previously, transform feedback varyings were parsed in an ad-hoc
fashion that wasn't compatible with structs (or array of structs).
This patch makes it use parse_program_resource_name(), which correctly
handles both.

Note that parse_program_resource_name()'s technique for handling
mal-formed input strings is to simply let them through and rely on the
fact that a future name lookup will fail.  Because of this,
tfeedback_decl::init() no longer needs to return a boolean error
code--it always succeeds, and if the input was mal-formed the error
will be detected later.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 53febac02c)
2013-02-08 11:17:28 -08:00
Paul Berry
11e4347bff glsl: Rename uniform_field_visitor to program_resource_visitor.
There's actually nothing uniform-specific in uniform_field_visitor.
It is potentially useful for all kinds of program resources (in
particular, future patches will use it for transform feedback
varyings).

This patch renames it to program_resource_visitor, and clarifies
several comments, to reflect the fact that it is useful for more than
just uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b4db34cc4c)
2013-02-08 11:17:23 -08:00
Paul Berry
49a5f829f7 mesa/glsl: Separate parsing logic from _mesa_get_uniform_location.
The parsing logic is moved to a new function in the GLSL module,
parse_program_resource_name().  This name was chosen because it should
eventually be useful for handling everything that OpenGL 4.3 calls
"program resources" (e.g. uniforms, vertex inputs, fragment outputs,
and transform feedback varyings).

Future patches will make use of this function for linking transform
feedback varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b92900d26a)
2013-02-08 11:17:17 -08:00
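To make the parsing step in 49a5f829f7 concrete, here is a minimal sketch of splitting a trailing array subscript off a resource name. The function name, signature and return convention are hypothetical and are not claimed to match parse_program_resource_name(); the sketch only handles a single trailing "[N]".

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: if the name ends in "[N]", return N and store
 * the length of the part before '[' in *base_len. Otherwise return -1
 * and store the full length, leaving the name to fail a later lookup. */
static long sketch_parse_resource_name(const char *name, size_t *base_len)
{
    size_t len = strlen(name);
    size_t i;

    *base_len = len;
    if (len < 3 || name[len - 1] != ']')
        return -1;

    /* Walk backwards over the digits that precede the ']'. */
    i = len - 1;
    while (i > 0 && isdigit((unsigned char)name[i - 1]))
        i--;

    /* Require at least one digit and a '[' immediately before it. */
    if (i == len - 1 || i == 0 || name[i - 1] != '[')
        return -1;

    *base_len = i - 1;
    return strtol(name + i, NULL, 10);
}
```

For example, "color[12]" yields index 12 with a base length of 5, while "color" and "color[]" both yield -1.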
Kenneth Graunke
5265c42e52 i965/blorp: Support blits between ARGB and XRGB formats.
Now that we have support for overriding alpha to 1.0, we can handle
blitting between these formats in either direction.

For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and
MESA_FORMAT_RGBX8888_REV.  Most places only appear to worry about the
former, so ignore the latter for now.  We can always add it later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit 7d467f3c15)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
3114f5acd3 i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal.  However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.

For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors.  This is fairly
straightforward with blending.

For now, this code is never used, as the source and destination formats
still must be equal.  The next patch will relax that restriction.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit c0554141a9)
2013-02-07 22:31:29 -08:00
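The two conversion directions described in 3114f5acd3 can be shown with a CPU-side sketch. This is only an illustration of the arithmetic: the commit does the override on the GPU with blending, and the function names and the assumed layout (8-bit alpha or X in the top byte of a 32-bit word) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* XRGB -> ARGB: the X byte holds no meaningful data, so force alpha
 * to 1.0, which is 0xff for an 8-bit channel. */
static uint32_t sketch_xrgb_to_argb(uint32_t xrgb)
{
    return xrgb | 0xff000000u;
}

/* ARGB -> XRGB: nothing to do, because readers of an XRGB surface
 * already treat the missing alpha as 1.0. */
static uint32_t sketch_argb_to_xrgb(uint32_t argb)
{
    return argb;
}
```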
Kenneth Graunke
332c50b666 i965: Implement CopyTexSubImage2D via BLORP (and use it by default).
The BLT engine has many limitations.  Currently, it can only blit
X-tiled buffers (since we don't have a kernel API to whack the BLT
tiling mode register), which means all depth/stencil operations get
punted to meta code, which can be very CPU-intensive.

Even if we used the BLT engine, it can't blit between buffers with
different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture
and a Y-tiled CMS ARGB8888 renderbuffer.  This is a fundamental
limitation, and the only way around that is to use BLORP.

Previously, BLORP only handled BlitFramebuffer.  This patch adds an
additional frontend for doing CopyTexSubImage.  It also makes it the
default.  This is partly to increase testing and avoid hiding bugs,
and partly because the BLORP path can already handle more cases.  With
trivial extensions, it should be able to handle everything the BLT can.

This helps PlaneShift massively, which tries to CopyTexSubImage2D
between depth buffers whenever a player casts a spell.  Since these
are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU
while delivering ~1 FPS.  This is particularly bad in an MMO setting
because people cast spells all the time.

It also helps Xonotic in 4X MSAA mode.  At default power management
settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5).
(This data is from v1 of the patch.)

No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).

v2: Create a fake intel_renderbuffer to wrap the destination texture
    image and then reuse do_blorp_blit rather than reimplementing most
    of it.  Remove unnecessary clipping code and conditional rendering
    check.

v3: Reuse formats_match() to centralize checks; delete temporary
    renderbuffers.  Reorganize the code.

v4: Actually copy stencil when dealing with separate stencil buffers but
    packed depth/stencil formats.  Tested by a new Piglit test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3]
Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2]
Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]
(cherry picked from commit 0b3bebbaac)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
55e3f79d55 mesa: Put extern "C" guards in renderbuffer.h.
I need to use this from C++ code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 29aef6cce8)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
1d2ef43032 i965: Fix the SF Vertex URB Read Length calculation for Gen7 platforms.
Ivybridge doesn't appear to have the same errata as Sandybridge; no
corruption was observed by setting it to more than the minimal correct
value.  It's possible that we were simply lucky, since the URB entries
are 1024-bit on Ivybridge vs. 512-bit Sandybridge.  Or perhaps the
underlying hardware issue is fixed.

Either way, we may as well program the minimum value since it's now
readily available, likely to be more efficient, and possibly more
correct.

v2: Use GEN7_SBE_* defines rather than GEN6_SF_*.  (A copy and paste
    mistake.)  They're the same, but using the right names is better.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44aa2e15f6)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
3acd5ed75b i965: Fix the SF Vertex URB Read Length calculation for Sandybridge.
(This commit message was primarily written by Paul Berry, who explained
 what's going on far better than I would have.)

Previous to this patch, we thought that the only restrictions on
3DSTATE_SF's URB read length were (a) it needs to be large enough to
read all the VUE data that the SF needs, and (b) it can't be so large
that it tries to read VUE data that doesn't exist.  Since the VUE map
already tells us how much VUE data exists, we didn't bother worrying
about restriction (a); we just did the easy thing and programmed the
read length to satisfy restriction (b).

However, we didn't notice this erratum in the hardware docs: "[errata]
Corruption/Hang possible if length programmed larger than recommended".
Judging by the context surrounding this erratum, it's pretty clear that
it means "URB read length must be exactly the size necessary to read all
the VUE data that the SF needs, and no larger".  Which means that we
can't program the read length based on restriction (b)--we have to
program it based on restriction (a).

The URB read size needs to precisely match the amount of data that the
SF consumes; it doesn't work to simply base it on the size of the VUE.

Thankfully, the PRM contains the precise formula the hardware expects.

Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
corruption in PlaneShift, and Piglit's fbo-5-varyings test.

NOTE: This is a candidate for all stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 09fbc29828)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
697f8e56dc i965: Compute the maximum SF source attribute.
The maximum SF source attribute is necessary to compute the Vertex URB
read length properly, which will be done in the next commit.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5e9bc7bd12)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
45ae093e5c i965: Refactor Gen6+ SF attribute override code.
The next patch will benefit from easy access to the source attribute
number and whether or not we're swizzling.  It doesn't want the final
attr_override DWord form, however.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b3efc5bea8)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
535e95299a i965: Add chipset limits for Haswell GT1/GT2.
The maximum number of URB entries comes from the 3DSTATE_URB_VS and
3DSTATE_URB_GS state packet documentation; the thread count information
comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
(cherry picked from commit 9add4e8038)
2013-02-07 22:31:28 -08:00
Vinson Lee
a7e2c615f1 i965: Fix assignment instead of comparison in asserts.
Fixes side effect in assertion defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 1559994cba)
2013-02-07 22:31:28 -08:00
Paul Berry
5611a5a387 mesa: Don't check (offset + size <= bufObj->Size) in BindBufferRange.
In the documentation for BindBufferRange, OpenGL specs from 3.0
through 4.1 contain this language:

    "The error INVALID_VALUE is generated if size is less than or
    equal to zero or if offset + size is greater than the value of
    BUFFER_SIZE."

This text was dropped from OpenGL 4.2, and it does not appear in the
GLES 3.0 spec.

Presumably the reason for the change is that some clients change
the size of the buffer after calling BindBufferRange.  We don't want
to generate an error at the time of the BindBufferRange call just
because the old size of the buffer was too small, when the buffer is
about to be resized.

Since this is a deliberate relaxation of error conditions in order to
allow clients to work, it seems sensible to apply it to all versions
of GL, not just GL 4.2 and above.

(Note that there is no danger of this change allowing a client to
access data beyond the end of a buffer.  We already have code to
ensure that that doesn't happen in the case where the client shrinks
the buffer after calling BindBufferRange).

Eliminates a spurious error message in the gles3 conformance test
"transform_feedback_offset_size".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 04f0d6cc22)
2013-02-07 21:20:32 -08:00
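The relaxed check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the Mesa code: `sketch_buffer`, `sketch_bind_range_is_valid` and `sketch_accessible_bytes` are hypothetical names, and the sketch assumes that the "size is less than or equal to zero" check stays in place while the clamp against the current buffer size happens when the range is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer object; not Mesa's gl_buffer_object. */
struct sketch_buffer {
   int64_t size; /* current size of the data store, in bytes */
};

/* Bind-time validation after the change (assumption): only the size is
 * checked; offset + size is no longer compared with the buffer size. */
static bool sketch_bind_range_is_valid(int64_t offset, int64_t size)
{
   (void)offset;
   return size > 0;
}

/* Use-time clamp: how many bytes of the bound range lie inside the
 * buffer's current storage. This is what keeps a client from reading or
 * writing past the end if the buffer is smaller than the bound range. */
static int64_t sketch_accessible_bytes(const struct sketch_buffer *buf,
                                       int64_t offset, int64_t size)
{
   assert(offset >= 0);
   if (offset >= buf->size)
      return 0;
   int64_t avail = buf->size - offset;
   return size < avail ? size : avail;
}
```

With a 100-byte buffer, binding offset 64 and size 128 is accepted at bind time, and only the 36 bytes that actually exist are accessible until the client resizes the buffer.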
Ian Romanick
a48e5526c2 i965: Set UniformBufferOffsetAlignment to sizeof(vec4)
This matches the behavior of the Windows driver, but a bspec reference
would be nice.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f29ab4ece5)
2013-02-07 21:20:16 -08:00
Matt Turner
c59808c700 mesa: Allow glGet* queries of MAX_VARYING_COMPONENTS in ES 3
Should have been done in d9948e49 but I missed it because
MAX_VARYING_FLOATS doesn't appear in the ES 3 spec, but is the same
value as MAX_VARYING_COMPONENTS.

NOTE: Candidate for the 9.1 branch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-07 17:54:16 -08:00
Michel Dänzer
ad62f424b3 radeonsi: Handle scaled and integer formats for samplers and vertex elements.
Also, add assertions to stress that render targets don't support scaled
formats.

20 more little piglits.
(cherry picked from commit 46dd16bca8b4526e46badc9cb1d7c058a8e6173e)
2013-02-07 19:11:30 +01:00
Michel Dänzer
fc04455533 radeonsi: Don't advertise PIPE_FORMAT_L8A8_SRGB support.
The hardware can't do it.
(cherry picked from commit f6e9430da2d3510f84baefa0fdf26ec5c457f146)
2013-02-07 19:11:19 +01:00
Michel Dänzer
6799bddf6b radeonsi: Remove incorrect (and dead) assignment in tex_fetch_args().
The proper return type is assigned at the end of the function.
(cherry picked from commit 180db2bcb28e94bb1ce18d76b2b3a5818d76262c)
2013-02-07 19:11:09 +01:00
Michel Dänzer
93f61addb5 radeonsi: Use unique names for referring to texture sampling intrinsics.
Append the overloaded vector type used for passing in the addressing
parameters.

Without this, LLVM uses the same function signature for all those types,
which cannot work.

Fixes problems e.g. with FlightGear and Red Eclipse.
(cherry picked from commit 1b3afea30de757815555d9eb1d6e72e2586d6a0c)
2013-02-07 19:10:17 +01:00
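The naming scheme described in this commit message (appending the overloaded vector type to the intrinsic name) might look roughly like the sketch below. The base name `llvm.sketch.sample`, the `.v<N>i32` suffix format and both helper functions are assumptions made for illustration; the real radeonsi intrinsic names may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Build "<base>.v<N>i32" so that each address vector width gets its own
 * function name. Returns false if the result did not fit in 'out'. */
static bool sketch_intrinsic_name(char *out, size_t out_size,
                                  const char *base, unsigned vec_len)
{
   int n = snprintf(out, out_size, "%s.v%ui32", base, vec_len);
   return n >= 0 && (size_t)n < out_size;
}

/* Two overloads can coexist only if their names differ. */
static bool sketch_names_differ(const char *a, const char *b)
{
   return strcmp(a, b) != 0;
}
```

Without a suffix, a 4-component and an 8-component address vector would both refer to the same function name, which is the conflict the commit message describes.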
Jerome Glisse
d04b50b4de r600g: fix slice tile max for compressed texture and async dma
Was using the pixel size instead of the number of blocks for the slice
tile max computation, which resulted in DMA writing to the wrong address.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-07 10:43:37 -05:00
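The corrected computation can be sketched as a small helper that mirrors the arithmetic in the diff hunks for this commit. The function name is hypothetical, and reading the shift by 6 as "64 elements per tile" is an assumption drawn from the code rather than from hardware documentation.

```c
#include <assert.h>

/* Tiles per slice minus one, computed from block counts instead of pixel
 * dimensions. The guard avoids wrapping to UINT_MAX when the surface
 * covers less than one full 64-element tile. */
static unsigned sketch_slice_tile_max(unsigned nblk_x, unsigned nblk_y)
{
   assert(nblk_x > 0 && nblk_y > 0);
   unsigned tiles = (nblk_x * nblk_y) >> 6;
   return tiles ? tiles - 1 : 0;
}
```

For a compressed format with 4x4 pixel blocks, a 256x256 level has 64x64 blocks, so the helper returns 63; a computation based on the pixel dimensions would count 16 times as many elements.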
Marek Olšák
f1c46c8418 r300g: fix blending with blend color and RGBA formats
NOTE: This is a candidate for the stable branches.
(cherry picked from commit f40a7fc34a)
2013-02-06 22:24:04 +01:00
Michel Dänzer
4bc85f9aac Require libdrm_radeon 2.4.42 for radeonsi.
It has new PCI IDs and an important tiled surface layout fix.
(cherry picked from commit 02a423b239)
2013-02-05 15:15:49 +01:00
Alex Deucher
e1d798a901 radeonsi: add Oland pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
(cherry picked from commit 4161d70bba)
2013-02-04 17:20:22 -05:00
Alex Deucher
6b0fa537a9 radeonsi: default PA_SC_RASTER_CONFIG to 0
That should work in all cases.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
(cherry picked from commit af0af75881)
2013-02-04 17:20:03 -05:00
Alex Deucher
0cc0097bb0 radeonsi: add support for Oland chips
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch
(cherry picked from commit 83e4407f44)
2013-02-04 17:19:43 -05:00
Michel Dänzer
7f90de5414 radeonsi: Fix draws using user index buffer.
Was broken since commit bf469f4edc
('gallium: add void *user_buffer in pipe_index_buffer').

Fixes 11 piglit tests and lots of missing geometry e.g. in TORCS.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit a8a5055f2d)
2013-02-04 17:54:03 +01:00
Michel Dänzer
8cd237bcbe radeonsi: Remove spurious traces of R16G16B16 support.
The hardware can't do it, and these were causing warnings in some piglit tests.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 6455d40b7e)
2013-02-04 17:28:18 +01:00
Michel Dänzer
5ca77c27a6 radeonsi: Enable texture arrays.
28/30 piglit tests pass.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 6bcb823844)
2013-02-04 17:28:14 +01:00
Michel Dänzer
b104d151f1 radeonsi: Improve packing of texture address parameters.
In particular, the LOD bias and depth comparison values are packed before the
'normal' texture coordinates, and the array slice and LOD values are appended.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 120efeef8b)
2013-02-04 17:27:43 +01:00
Michel Dänzer
5f9f3f381f radeonsi: Adapt to sample intrinsics changes.
Fix up intrinsic names, and bitcast texture address parameters to integers.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit e5fb7347a7)
2013-02-04 17:27:34 +01:00
Marek Olšák
b127ad3489 mesa: don't expose IBM_rasterpos_clip in a core context
glRasterPos doesn't exist in the core profile.

NOTE: This is a candidate for the stable branches (9.0 and 9.1).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cc5fdaf2dc)
2013-02-01 16:35:24 +01:00
Marek Olšák
1003652a7f r300g: always put MSAA resources in VRAM
This along with the latest drm-fixes branch should help with bad performance
of MSAA. Remember: Nx MSAA can't be more than N times slower (where N=2,4,6).

Anyway, I recommend at least 512 MB of VRAM for Full HD 6x MSAA.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit a06f03d795)
2013-02-01 16:35:18 +01:00
Jerome Glisse
9d8a866db3 r600g: add cs memory usage accounting and limit it v3
We are now seeing cs that can go over the vram+gtt size. To avoid
failing, flush early any cs that goes over 70% (gtt+vram) usage. 70%
is used to allow some fragmentation.

The idea is to compute a gross estimate of memory requirement of
each draw call. After each draw call, memory will be precisely
accounted. So the uncertainty is only on the current draw call.
In practice this gave a very good estimate (+/- 10% of the target
memory limit).

v2: Remove left over from testing version, remove useless NULL
    checking. Improve commit message.
v3: Add comment to code on memory accounting precision

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-31 14:25:30 -05:00
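The accounting idea in this commit message can be sketched as below. This is an illustration under assumptions: the struct, its field names and the exact comparison are hypothetical, and the winsys function `cs_memory_below_limit` that appears in the diff may be implemented differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accounting state; names are illustrative only. */
struct sketch_cs_mem {
   uint64_t used_vram;  /* precisely accounted usage of emitted draws */
   uint64_t used_gtt;
   uint64_t total_vram; /* sizes of the two memory pools */
   uint64_t total_gtt;
};

/* True while the accounted usage plus the gross estimate for the pending
 * draw stays within 70% of vram + gtt. When this returns false, the
 * caller would flush the cs early instead of letting submission fail. */
static bool sketch_cs_memory_below_limit(const struct sketch_cs_mem *m,
                                         uint64_t est_vram, uint64_t est_gtt)
{
   assert(m->total_vram + m->total_gtt > 0);
   uint64_t usage = m->used_vram + m->used_gtt + est_vram + est_gtt;
   uint64_t limit = (m->total_vram + m->total_gtt) / 10 * 7;
   return usage <= limit;
}
```

Only the estimate for the current draw is uncertain; once the draw's relocations are emitted, its memory moves into the precisely accounted totals and the estimate counters are reset.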
Marek Olšák
3b8d4f941f r600g: fix htile buffer leak
NOTE: This is a candidate for the 9.1 branch.
2013-01-31 14:25:10 -05:00
Matt Turner
ff515c4e7c build: Add missing comma in AS_IF
Reported-by: Lauri Kasanen <curaga@operamail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47248#c15
2013-01-29 15:06:47 -08:00
Marek Olšák
d7ca04a7c3 docs/relnotes-9.1: document new features in radeon drivers
(cherry picked from commit 845130951f)
2013-01-29 17:38:14 +01:00
Matt Turner
48af880f81 docs: List new extensions added in Mesa 9.1
I did not list the *_get_program_binary extensions since they're not
useful to anyone with their current implementation (that supports 0
binary formats).
2013-01-28 16:49:24 -08:00
Jerome Glisse
af2d8f8072 r600g: use uint64_t instead of unsigned long for proper 32bits cpu support
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 19:10:29 -05:00
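The motivation for the type change can be illustrated with a short sketch: `unsigned long` is typically 32 bits wide on 32-bit CPUs, so an address above 4 GiB would be truncated, while `uint64_t` is 64 bits on every ABI. The function names here are hypothetical and not part of r600g.

```c
#include <assert.h>
#include <stdint.h>

/* Address of a byte inside a buffer, kept in a type that is 64 bits wide
 * regardless of the CPU's word size. */
static uint64_t sketch_gpu_addr(uint64_t base, uint64_t offset)
{
   assert(offset <= UINT64_MAX - base); /* no 64-bit wraparound */
   return base + offset;
}

/* What a 32-bit-wide type (such as unsigned long on many 32-bit ABIs)
 * would retain of the same sum. */
static uint32_t sketch_truncated_addr(uint64_t base, uint64_t offset)
{
   return (uint32_t)(base + offset);
}
```

With a base of 0xFFFFF000 and an offset of 0x2000, the 64-bit result is 0x100001000, while the truncated value is 0x1000, which would point the DMA engine at the wrong location.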
Jerome Glisse
d8d17441e2 r600g: real fix for non 3.8 kernel
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 17:44:49 -05:00
87 changed files with 1490 additions and 549 deletions

View File

@@ -36,7 +36,7 @@ check-local:
# Rules for making release tarballs
PACKAGE_VERSION=9.1-devel
PACKAGE_VERSION=9.1-rc2
PACKAGE_DIR = Mesa-$(PACKAGE_VERSION)
PACKAGE_NAME = MesaLib-$(PACKAGE_VERSION)

3
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,3 @@
d60da27273d2cdb68bc32cae2ca66718dab15f27 st/mesa: set ctx->Const.MaxSamples = 0, not 1
5c86a728d4f688c0fe7fbf9f4b8f88060b65c4ee r600g: fix htile buffer leak
496928a442cec980b534bc5da2523b3632b21b61 CopyTexImage: Don't check sRGB vs LINEAR for desktop GL

View File

@@ -30,7 +30,7 @@ AC_SUBST([OSMESA_VERSION])
dnl Versions for external dependencies
LIBDRM_REQUIRED=2.4.24
LIBDRM_RADEON_REQUIRED=2.4.40
LIBDRM_RADEON_REQUIRED=2.4.42
LIBDRM_INTEL_REQUIRED=2.4.38
LIBDRM_NVVIEUX_REQUIRED=2.4.33
LIBDRM_NOUVEAU_REQUIRED="2.4.33 libdrm >= 2.4.41"
@@ -57,10 +57,10 @@ LT_PREREQ([2.2])
LT_INIT([disable-static])
AX_PROG_BISON([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"]
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"],
[AC_MSG_ERROR([bison not found - unable to compile glcpp-parse.y])]))
AX_PROG_FLEX([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"]
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"],
[AC_MSG_ERROR([flex not found - unable to compile glcpp-lex.l])]))
AC_PATH_PROG([PERL], [perl])
@@ -611,7 +611,7 @@ AC_ARG_ENABLE([opencl],
[enable OpenCL library NOTE: Enabling this option will also enable
--with-llvm-shared-libs
@<:@default=no@:>@])],
[enable_opencl="$enableval" with_llvm_shared_libs="$enableval"],
[],
[enable_opencl=no])
AC_ARG_ENABLE([xlib_glx],
[AS_HELP_STRING([--enable-xlib-glx],
@@ -701,6 +701,16 @@ if test "x$enable_dri$enable_xlib_glx" = xyesyes; then
AC_MSG_ERROR([DRI and Xlib-GLX cannot be built together])
fi
if test "x$enable_opengl$enable_xlib_glx" = xnoyes; then
AC_MSG_ERROR([Xlib-GLX cannot be built without OpenGL])
fi
# Disable GLX if OpenGL is not enabled
if test "x$enable_glx$enable_opengl" = xyesno; then
AC_MSG_WARN([OpenGL not enabled, disabling GLX])
enable_glx=no
fi
# Disable GLX if DRI and Xlib-GLX are not enabled
if test "x$enable_glx" = xyes -a \
"x$enable_dri" = xno -a \
@@ -1619,8 +1629,13 @@ AC_ARG_ENABLE([gallium-llvm],
AC_ARG_WITH([llvm-shared-libs],
[AS_HELP_STRING([--with-llvm-shared-libs],
[link with LLVM shared libraries @<:@default=disabled@:>@])],
[with_llvm_shared_libs=yes],
[],
[with_llvm_shared_libs=no])
AS_IF([test x$enable_opencl = xyes],
[
AC_MSG_WARN([OpenCL required, forcing LLVM shared libraries])
with_llvm_shared_libs=yes
])
AC_ARG_WITH([llvm-prefix],
[AS_HELP_STRING([--with-llvm-prefix],
@@ -1662,16 +1677,14 @@ if test "x$enable_gallium_llvm" = xyes; then
if test "x$LLVM_CONFIG" != xno; then
LLVM_VERSION=`$LLVM_CONFIG --version | sed 's/svn.*//g'`
LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 's/\([[0-9]]\)\.\([[0-9]]\)/\10\2/g'`
if test "x$with_llvm_shared_libs" != xyes; then
LLVM_COMPONENTS="engine bitwriter"
if $LLVM_CONFIG --components | grep -q '\<mcjit\>'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
fi
LLVM_COMPONENTS="engine bitwriter"
if $LLVM_CONFIG --components | grep -q '\<mcjit\>'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
fi
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
fi
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_BINDIR=`$LLVM_CONFIG --bindir`
LLVM_CPPFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cppflags"`
@@ -1840,6 +1853,9 @@ if test "x$with_gallium_drivers" != x; then
if test "x$enable_r600_llvm" = xyes; then
USE_R600_LLVM_COMPILER=yes;
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
fi
gallium_check_st "radeon/drm" "dri-r600" "xorg-r600" "" "xvmc-r600" "vdpau-r600"
;;
xradeonsi)

View File

@@ -44,9 +44,18 @@ Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ANGLE_texture_compression_dxt3</li>
<li>GL_ANGLE_texture_compression_dxt5</li>
<li>GL_ARB_ES3_compatibility</li>
<li>GL_ARB_internalformat_query</li>
<li>GL_ARB_map_buffer_alignment</li>
<li>GL_ARB_texture_cube_map_array</li>
<li>GL_ARB_shading_language_packing</li>
<li>GL_ARB_texture_buffer_object_rgb32</li>
<li>GL_ARB_texture_cube_map_array</li>
<li>GL_EXT_color_buffer_float</li>
<li>GL_OES_depth_texture_cube_map</li>
<li>OpenGL 3.1 core profile support on Radeon HD2000 up to HD6000 series </li>
<li>Multisample anti-aliasing support on Radeon X1000 series</li>
</ul>
@@ -63,6 +72,7 @@ Note: some of the new features are only available with certain drivers.
<li>Removed swrast support for GL_NV_vertex_program</li>
<li>Removed swrast support for GL_NV_fragment_program</li>
<li>Removed OpenVMS support (unmaintained and broken)</li>
<li>Removed makedepend build dependency</li>
</ul>
</div>

View File

@@ -46,3 +46,17 @@ CHIPSET(0x6839, VERDE_6839, VERDE)
CHIPSET(0x683B, VERDE_683B, VERDE)
CHIPSET(0x683D, VERDE_683D, VERDE)
CHIPSET(0x683F, VERDE_683F, VERDE)
CHIPSET(0x6600, OLAND_6600, OLAND)
CHIPSET(0x6601, OLAND_6601, OLAND)
CHIPSET(0x6602, OLAND_6602, OLAND)
CHIPSET(0x6603, OLAND_6603, OLAND)
CHIPSET(0x6606, OLAND_6606, OLAND)
CHIPSET(0x6607, OLAND_6607, OLAND)
CHIPSET(0x6610, OLAND_6610, OLAND)
CHIPSET(0x6611, OLAND_6611, OLAND)
CHIPSET(0x6613, OLAND_6613, OLAND)
CHIPSET(0x6620, OLAND_6620, OLAND)
CHIPSET(0x6621, OLAND_6621, OLAND)
CHIPSET(0x6623, OLAND_6623, OLAND)
CHIPSET(0x6631, OLAND_6631, OLAND)

View File

@@ -530,7 +530,7 @@ def generate(env):
env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])
env.PkgCheckModules('DRM', ['libdrm >= 2.4.24'])
env.PkgCheckModules('DRM_INTEL', ['libdrm_intel >= 2.4.30'])
env.PkgCheckModules('DRM_RADEON', ['libdrm_radeon >= 2.4.40'])
env.PkgCheckModules('DRM_RADEON', ['libdrm_radeon >= 2.4.42'])
env.PkgCheckModules('XORG', ['xorg-server >= 1.6.0'])
env.PkgCheckModules('KMS', ['libkms >= 2.4.24'])
env.PkgCheckModules('UDEV', ['libudev > 150'])

View File

@@ -446,6 +446,7 @@ dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
{
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw);
__DRIbuffer buffer;
int i, ret = 0;
while (dri2_surf->frame_callback && ret != -1)
@@ -463,6 +464,13 @@ dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
if (dri2_surf->color_buffers[i].age > 0)
dri2_surf->color_buffers[i].age++;
/* Make sure we have a back buffer in case we're swapping without ever
* rendering. */
if (get_back_bo(dri2_surf, &buffer) < 0) {
_eglError(EGL_BAD_ALLOC, "dri2_swap_buffers");
return EGL_FALSE;
}
dri2_surf->back->age = 1;
dri2_surf->current = dri2_surf->back;
dri2_surf->back = NULL;

View File

@@ -421,10 +421,10 @@ util_clear_depth_stencil(struct pipe_context *pipe,
else {
uint32_t dst_mask;
if (format == PIPE_FORMAT_Z24_UNORM_S8_UINT)
dst_mask = 0xffffff00;
dst_mask = 0x00ffffff;
else {
assert(format == PIPE_FORMAT_S8_UINT_Z24_UNORM);
dst_mask = 0xffffff;
dst_mask = 0xffffff00;
}
if (clear_flags & PIPE_CLEAR_DEPTH)
dst_mask = ~dst_mask;
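/* Illustrative sketch of how the corrected masks select bits, assuming the
 * gallium convention that Z24_UNORM_S8_UINT keeps depth in the low 24 bits
 * and S8_UINT_Z24_UNORM keeps it in the high 24 bits. The enum and helper
 * names are hypothetical and not part of u_surface.c. */

```c
#include <assert.h>
#include <stdint.h>

enum sketch_format { SKETCH_Z24_S8, SKETCH_S8_Z24 };

/* Bits of the old value that must survive a partial clear. The mask starts
 * out covering the depth bits and is inverted when depth is the component
 * being cleared, as in the hunk above. */
static uint32_t sketch_keep_mask(enum sketch_format format, int clear_depth)
{
   uint32_t mask;
   if (format == SKETCH_Z24_S8)
      mask = 0x00ffffff;
   else {
      assert(format == SKETCH_S8_Z24);
      mask = 0xffffff00;
   }
   return clear_depth ? ~mask : mask;
}

/* Merge the clear value into one packed 32-bit depth/stencil word. */
static uint32_t sketch_masked_clear(uint32_t old, uint32_t zstencil,
                                    uint32_t keep_mask)
{
   return (old & keep_mask) | (zstencil & ~keep_mask);
}
```

/* Clearing depth to 0x00ffffff in a Z24_S8 word 0xAB123456 keeps the stencil
 * byte 0xAB and yields 0xABFFFFFF; with the two masks swapped, the stencil
 * bits would be overwritten instead. */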

View File

@@ -487,6 +487,7 @@ static void r300_set_blend_color(struct pipe_context* pipe,
(struct r300_blend_color_state*)r300->blend_color_state.state;
struct pipe_blend_color c;
enum pipe_format format = fb->nr_cbufs ? fb->cbufs[0]->format : 0;
float tmp;
CB_LOCALS;
state->state = *color; /* Save it, so that we can reuse it in set_fb_state */
@@ -513,6 +514,13 @@ static void r300_set_blend_color(struct pipe_context* pipe,
c.color[2] = c.color[3];
break;
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
tmp = c.color[0];
c.color[0] = c.color[2];
c.color[2] = tmp;
break;
default:;
}
}
@@ -919,6 +927,9 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
/* Need to reset clamping or colormask. */
r300_mark_atom_dirty(r300, &r300->blend_state);
/* Re-swizzle the blend color. */
r300_set_blend_color(pipe, &((struct r300_blend_color_state*)r300->blend_color_state.state)->state);
/* If zsbuf is set from NULL to non-NULL or vice versa.. */
if (!!old_state->zsbuf != !!state->zsbuf) {
r300_mark_atom_dirty(r300, &r300->dsa_state);

View File

@@ -978,9 +978,9 @@ r300_texture_create_object(struct r300_screen *rscreen,
tex->tex.microtile = microtile;
tex->tex.macrotile[0] = macrotile;
tex->tex.stride_in_bytes_override = stride_in_bytes_override;
tex->domain = base->flags & R300_RESOURCE_FLAG_TRANSFER ?
RADEON_DOMAIN_GTT :
RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT;
tex->domain = base->flags & R300_RESOURCE_FLAG_TRANSFER ? RADEON_DOMAIN_GTT :
base->nr_samples > 1 ? RADEON_DOMAIN_VRAM :
RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT;
tex->buf = buffer;
r300_texture_desc_init(rscreen, tex, base);

View File

@@ -243,9 +243,9 @@ void evergreen_set_streamout_enable(struct r600_context *ctx, unsigned buffer_en
void evergreen_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size)
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size)
{
struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
unsigned i, ncopy, csize, sub_cmd, shift;

View File

@@ -1668,6 +1668,8 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
surf = (struct r600_surface*)state->cbufs[i];
rtex = (struct r600_texture*)surf->base.texture;
r600_context_add_resource_size(ctx, state->cbufs[i]->texture);
if (!surf->color_initialized) {
evergreen_init_color_surface(rctx, surf);
}
@@ -1699,6 +1701,8 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
if (state->zsbuf) {
surf = (struct r600_surface*)state->zsbuf;
r600_context_add_resource_size(ctx, state->zsbuf->texture);
if (!surf->depth_initialized) {
evergreen_init_depth_surface(rctx, surf);
}
@@ -2221,6 +2225,13 @@ static void evergreen_emit_db_misc_state(struct r600_context *rctx, struct r600_
if (rctx->db_state.rsurf && rctx->db_state.rsurf->htile_enabled) {
/* FORCE_OFF means HiZ/HiS are determined by DB_SHADER_CONTROL */
db_render_override |= S_02800C_FORCE_HIZ_ENABLE(V_02800C_FORCE_OFF);
/* This is to fix a lockup when hyperz and alpha test are enabled at
* the same time somehow GPU get confuse on which order to pick for
* z test
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_render_override |= S_02800C_FORCE_SHADER_Z_ORDER(1);
}
} else {
db_render_override |= S_02800C_FORCE_HIZ_ENABLE(V_02800C_FORCE_DISABLE);
}
@@ -3210,7 +3221,7 @@ void evergreen_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_pipe_state *rstate = &shader->rstate;
struct r600_shader *rshader = &shader->shader;
unsigned i, exports_ps, num_cout, spi_ps_in_control_0, spi_input_z, spi_ps_in_control_1, db_shader_control;
unsigned i, exports_ps, num_cout, spi_ps_in_control_0, spi_input_z, spi_ps_in_control_1, db_shader_control = 0;
int pos_index = -1, face_index = -1;
int ninterp = 0;
boolean have_linear = FALSE, have_centroid = FALSE, have_perspective = FALSE;
@@ -3220,7 +3231,6 @@ void evergreen_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader
rstate->nregs = 0;
db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
for (i = 0; i < rshader->ninput; i++) {
/* evergreen NUM_INTERP only contains values interpolated into the LDS,
POSITION goes via GPRs from the SC so isn't counted */
@@ -3454,6 +3464,21 @@ void evergreen_update_db_shader_control(struct r600_context * rctx)
V_02880C_EXPORT_DB_FULL) |
S_02880C_ALPHA_TO_MASK_DISABLE(rctx->framebuffer.cb0_is_integer);
/* When alpha test is enabled we can't trust the hw to make the proper
* decision on the order in which ztest should be run related to fragment
* shader execution.
*
* If alpha test is enabled perform early z rejection (RE_Z) but don't early
* write to the zbuffer. Write to zbuffer is delayed after fragment shader
* execution and thus after alpha test so if discarded by the alpha test
* the z value is not written.
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_RE_Z);
} else {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
}
if (db_shader_control != rctx->db_misc_state.db_shader_control) {
rctx->db_misc_state.db_shader_control = db_shader_control;
rctx->db_misc_state.atom.dirty = true;
@@ -3481,7 +3506,7 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
unsigned array_mode, lbpp, pitch_tile_max, slice_tile_max, size;
unsigned ncopy, height, cheight, detile, i, x, y, z, src_mode, dst_mode;
unsigned sub_cmd, bank_h, bank_w, mt_aspect, nbanks, tile_split;
unsigned long base, addr;
uint64_t base, addr;
/* make sure that the dma ring is only one active */
rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC);
@@ -3502,7 +3527,8 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
if (dst_mode == RADEON_SURF_MODE_LINEAR) {
/* T2L */
array_mode = evergreen_array_mode(src_mode);
slice_tile_max = (((pitch * rsrc->surface.level[src_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rsrc->surface.level[src_level].nblk_x * rsrc->surface.level[src_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3526,7 +3552,8 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
} else {
/* L2T */
array_mode = evergreen_array_mode(dst_mode);
slice_tile_max = (((pitch * rdst->surface.level[dst_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rdst->surface.level[dst_level].nblk_x * rdst->surface.level[dst_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3625,7 +3652,7 @@ boolean evergreen_dma_blit(struct pipe_context *ctx,
}
if (src_mode == dst_mode) {
unsigned long dst_offset, src_offset;
uint64_t dst_offset, src_offset;
/* simple dma blit would do NOTE code here assume :
* src_box.x/y == 0
* dst_x/y == 0

View File

@@ -174,9 +174,9 @@ void r600_need_dma_space(struct r600_context *ctx, unsigned num_dw);
void r600_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size);
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size);
boolean r600_dma_blit(struct pipe_context *ctx,
struct pipe_resource *dst,
unsigned dst_level,
@@ -187,9 +187,9 @@ boolean r600_dma_blit(struct pipe_context *ctx,
void evergreen_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size);
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size);
boolean evergreen_dma_blit(struct pipe_context *ctx,
struct pipe_resource *dst,
unsigned dst_level,

View File

@@ -68,13 +68,17 @@ static inline unsigned int r600_bytecode_get_num_operands(struct r600_bytecode *
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MAX_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MIN_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_UINT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_UINT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_PRED_SETE:
@@ -150,13 +154,17 @@ static inline unsigned int r600_bytecode_get_num_operands(struct r600_bytecode *
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MAX_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MIN_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_UINT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_UINT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_PRED_SETE:

View File

@@ -359,6 +359,16 @@ out_err:
void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw,
boolean count_draw_in)
{
if (!ctx->ws->cs_memory_below_limit(ctx->rings.gfx.cs, ctx->vram, ctx->gtt)) {
ctx->gtt = 0;
ctx->vram = 0;
ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC);
return;
}
/* all will be accounted once relocation are emited */
ctx->gtt = 0;
ctx->vram = 0;
/* The number of dwords we already used in the CS so far. */
num_dw += ctx->rings.gfx.cs->cdw;
@@ -784,6 +794,8 @@ void r600_begin_new_cs(struct r600_context *ctx)
ctx->pm4_dirty_cdwords = 0;
ctx->flags = 0;
ctx->gtt = 0;
ctx->vram = 0;
/* Begin a new CS. */
r600_emit_command_buffer(ctx->rings.gfx.cs, &ctx->start_cs_cmd);
@@ -1160,9 +1172,9 @@ void r600_need_dma_space(struct r600_context *ctx, unsigned num_dw)
void r600_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size)
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size)
{
struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
unsigned i, ncopy, csize, shift;

View File

@@ -537,6 +537,7 @@ const char * r600_llvm_gpu_string(enum radeon_family family)
case CHIP_RV630:
case CHIP_RV620:
case CHIP_RV635:
case CHIP_RV670:
case CHIP_RS780:
case CHIP_RS880:
gpu_family = "r600";
@@ -547,7 +548,6 @@ const char * r600_llvm_gpu_string(enum radeon_family family)
case CHIP_RV730:
gpu_family = "rv730";
break;
case CHIP_RV670:
case CHIP_RV740:
case CHIP_RV770:
gpu_family = "rv770";

View File

@@ -447,6 +447,10 @@ struct r600_context {
unsigned backend_mask;
unsigned max_db; /* for OQ */
/* current unaccounted memory usage */
uint64_t vram;
uint64_t gtt;
/* Miscellaneous state objects. */
void *custom_dsa_flush;
void *custom_blend_resolve;
@@ -869,9 +873,11 @@ static INLINE unsigned r600_context_bo_reloc(struct r600_context *ctx,
* look serialized from driver pov
*/
if (!ring->flushing) {
if (ring == &ctx->rings.gfx && ctx->rings.dma.cs) {
/* flush dma ring */
ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC);
if (ring == &ctx->rings.gfx) {
if (ctx->rings.dma.cs) {
/* flush dma ring */
ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC);
}
} else {
/* flush gfx ring */
ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC);
@@ -996,4 +1002,28 @@ static INLINE unsigned u_max_layer(struct pipe_resource *r, unsigned level)
}
}
static INLINE void r600_context_add_resource_size(struct pipe_context *ctx, struct pipe_resource *r)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_resource *rr = (struct r600_resource *)r;
if (r == NULL) {
return;
}
/*
* The idea is to compute a gross estimate of memory requirement of
* each draw call. After each draw call, memory will be precisely
* accounted. So the uncertainty is only on the current draw call.
* In practice this gave very good estimate (+/- 10% of the target
* memory limit).
*/
if (rr->domains & RADEON_DOMAIN_GTT) {
rctx->gtt += rr->buf->size;
}
if (rr->domains & RADEON_DOMAIN_VRAM) {
rctx->vram += rr->buf->size;
}
}
#endif

View File

@@ -1544,6 +1544,7 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
surf = (struct r600_surface*)state->cbufs[i];
rtex = (struct r600_texture*)surf->base.texture;
r600_context_add_resource_size(ctx, state->cbufs[i]->texture);
if (!surf->color_initialized || force_cmask_fmask) {
r600_init_color_surface(rctx, surf, force_cmask_fmask);
@@ -1576,6 +1577,8 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
if (state->zsbuf) {
surf = (struct r600_surface*)state->zsbuf;
r600_context_add_resource_size(ctx, state->zsbuf->texture);
if (!surf->depth_initialized) {
r600_init_depth_surface(rctx, surf);
}
@@ -1937,6 +1940,13 @@ static void r600_emit_db_misc_state(struct r600_context *rctx, struct r600_atom
if (rctx->db_state.rsurf && rctx->db_state.rsurf->htile_enabled) {
/* FORCE_OFF means HiZ/HiS are determined by DB_SHADER_CONTROL */
db_render_override |= S_028D10_FORCE_HIZ_ENABLE(V_028D10_FORCE_OFF);
/* This is to fix a lockup when hyperz and alpha test are enabled at
* the same time somehow GPU get confuse on which order to pick for
* z test
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_render_override |= S_028D10_FORCE_SHADER_Z_ORDER(1);
}
} else {
db_render_override |= S_028D10_FORCE_HIZ_ENABLE(V_028D10_FORCE_DISABLE);
}
@@ -2745,7 +2755,7 @@ void r600_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader *shad
tmp);
}
db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
db_shader_control = 0;
for (i = 0; i < rshader->noutput; i++) {
if (rshader->output[i].name == TGSI_SEMANTIC_POSITION)
z_export = 1;
@@ -2940,6 +2950,19 @@ void r600_update_db_shader_control(struct r600_context * rctx)
unsigned db_shader_control = rctx->ps_shader->current->db_shader_control |
S_02880C_DUAL_EXPORT_ENABLE(dual_export);
/* When alpha test is enabled we can't trust the hw to make the proper
* decision on the order in which ztest should be run related to fragment
* shader execution.
*
* If alpha test is enabled perform z test after fragment. RE_Z (early
* z test but no write to the zbuffer) seems to cause lockup on r6xx/r7xx
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
} else {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
}
if (db_shader_control != rctx->db_misc_state.db_shader_control) {
rctx->db_misc_state.db_shader_control = db_shader_control;
rctx->db_misc_state.atom.dirty = true;
@@ -2979,7 +3002,7 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
struct r600_texture *rdst = (struct r600_texture*)dst;
unsigned array_mode, lbpp, pitch_tile_max, slice_tile_max, size;
unsigned ncopy, height, cheight, detile, i, x, y, z, src_mode, dst_mode;
unsigned long base, addr;
uint64_t base, addr;
/* make sure that the dma ring is the only one active */
rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC);
@@ -2998,7 +3021,8 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
if (dst_mode == RADEON_SURF_MODE_LINEAR) {
/* T2L */
array_mode = r600_array_mode(src_mode);
slice_tile_max = (((pitch * rsrc->surface.level[src_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rsrc->surface.level[src_level].nblk_x * rsrc->surface.level[src_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller height as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3016,7 +3040,8 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
} else {
/* L2T */
array_mode = r600_array_mode(dst_mode);
slice_tile_max = (((pitch * rdst->surface.level[dst_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rdst->surface.level[dst_level].nblk_x * rdst->surface.level[dst_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller height as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3037,14 +3062,15 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
return FALSE;
}
size = (copy_height * pitch) >> 2;
ncopy = (size / 0x0000ffff) + !!(size % 0x0000ffff);
/* It's a r6xx/r7xx limitation: the number of lines in the blit must be a
* multiple of 8. Compute the largest multiple of 8 lines we can copy within
* the size limit.
*/
cheight = ((0x0000ffff << 2) / pitch) & 0xfffffff8;
ncopy = (copy_height / cheight) + !!(copy_height % cheight);
r600_need_dma_space(rctx, ncopy * 7);
for (i = 0; i < ncopy; i++) {
cheight = copy_height;
if (((cheight * pitch) >> 2) > 0x0000ffff) {
cheight = (0x0000ffff << 2) / pitch;
}
cheight = cheight > copy_height ? copy_height : cheight;
size = (cheight * pitch) >> 2;
/* emit reloc before writing cs so that cs is always in consistent state */
r600_context_bo_reloc(rctx, &rctx->rings.dma, &rsrc->resource, RADEON_USAGE_READ);
@@ -3109,7 +3135,7 @@ boolean r600_dma_blit(struct pipe_context *ctx,
}
if (src_mode == dst_mode) {
unsigned long dst_offset, src_offset, size;
uint64_t dst_offset, src_offset, size;
/* A simple dma blit would do. NOTE: the code here assumes:
* src_box.x/y == 0
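
The chunking arithmetic in the r600_dma_copy_tile hunks above can be tried in isolation. The following is a minimal sketch, not the driver code: the helper names and the MAX_DMA_DWORDS constant are hypothetical, and only the expressions (the 0x0000ffff dword limit, the multiple-of-8 mask, the ceiling division, and the clamped slice_tile_max) are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the 0x0000ffff literal in the hunk: the largest
 * DMA packet size, in dwords. */
#define MAX_DMA_DWORDS 0x0000ffffu

/* Largest number of lines one packet can carry for a given pitch (bytes),
 * rounded down to a multiple of 8 as the r6xx/r7xx comment requires. */
static unsigned max_lines_per_copy(unsigned pitch)
{
    assert(pitch > 0);
    return ((MAX_DMA_DWORDS << 2) / pitch) & 0xfffffff8u;
}

/* Number of packets needed to copy copy_height lines (ceiling division). */
static unsigned copy_count(unsigned copy_height, unsigned pitch)
{
    unsigned cheight = max_lines_per_copy(pitch);
    /* A pitch so large that fewer than 8 lines fit is not handled here. */
    assert(cheight > 0);
    return (copy_height / cheight) + !!(copy_height % cheight);
}

/* Blocks per slice divided by 64, minus one, clamped so it never wraps
 * below zero for very small surfaces. */
static unsigned slice_tile_max(unsigned nblk_x, unsigned nblk_y)
{
    unsigned tiles = (nblk_x * nblk_y) >> 6;
    return tiles ? tiles - 1 : 0;
}
```

With a 4096-byte pitch, max_lines_per_copy gives 56 lines per packet, so copying 100 lines takes two packets. The clamp in slice_tile_max matters for surfaces smaller than 64 blocks, where the unclamped subtraction would wrap around.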

@@ -293,6 +293,11 @@ static void r600_bind_dsa_state(struct pipe_context *ctx, void *state)
rctx->alphatest_state.sx_alpha_test_control = dsa->sx_alpha_test_control;
rctx->alphatest_state.sx_alpha_ref = dsa->alpha_ref;
rctx->alphatest_state.atom.dirty = true;
if (rctx->chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
}
}
@@ -479,7 +484,8 @@ static void r600_set_index_buffer(struct pipe_context *ctx,
if (ib) {
pipe_resource_reference(&rctx->index_buffer.buffer, ib->buffer);
memcpy(&rctx->index_buffer, ib, sizeof(*ib));
memcpy(&rctx->index_buffer, ib, sizeof(*ib));
r600_context_add_resource_size(ctx, ib->buffer);
} else {
pipe_resource_reference(&rctx->index_buffer.buffer, NULL);
}
@@ -516,6 +522,7 @@ static void r600_set_vertex_buffers(struct pipe_context *ctx,
vb[i].buffer_offset = input[i].buffer_offset;
pipe_resource_reference(&vb[i].buffer, input[i].buffer);
new_buffer_mask |= 1 << i;
r600_context_add_resource_size(ctx, input[i].buffer);
} else {
pipe_resource_reference(&vb[i].buffer, NULL);
disable_mask |= 1 << i;
@@ -613,6 +620,7 @@ static void r600_set_sampler_views(struct pipe_context *pipe, unsigned shader,
pipe_sampler_view_reference((struct pipe_sampler_view **)&dst->views.views[i], views[i]);
new_mask |= 1 << i;
r600_context_add_resource_size(pipe, views[i]->texture);
} else {
pipe_sampler_view_reference((struct pipe_sampler_view **)&dst->views.views[i], NULL);
disable_mask |= 1 << i;
@@ -806,6 +814,8 @@ static void r600_bind_ps_state(struct pipe_context *ctx, void *state)
rctx->ps_shader = (struct r600_pipe_shader_selector *)state;
r600_context_pipe_state_set(rctx, &rctx->ps_shader->current->rstate);
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->ps_shader->current->bo);
if (rctx->chip_class <= R700) {
bool multiwrite = rctx->ps_shader->current->shader.fs_write_all;
@@ -835,6 +845,8 @@ static void r600_bind_vs_state(struct pipe_context *ctx, void *state)
if (state) {
r600_context_pipe_state_set(rctx, &rctx->vs_shader->current->rstate);
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->vs_shader->current->bo);
/* Update clip misc state. */
if (rctx->vs_shader->current->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->vs_shader->current->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
@@ -938,10 +950,13 @@ static void r600_set_constant_buffer(struct pipe_context *ctx, uint shader, uint
} else {
u_upload_data(rctx->uploader, 0, input->buffer_size, ptr, &cb->buffer_offset, &cb->buffer);
}
/* account it in gtt */
rctx->gtt += input->buffer_size;
} else {
/* Setup the hw buffer. */
cb->buffer_offset = input->buffer_offset;
pipe_resource_reference(&cb->buffer, input->buffer);
r600_context_add_resource_size(ctx, input->buffer);
}
state->enabled_mask |= 1 << index;
@@ -1004,6 +1019,7 @@ static void r600_set_so_targets(struct pipe_context *ctx,
/* Set the new targets. */
for (i = 0; i < num_targets; i++) {
pipe_so_target_reference((struct pipe_stream_output_target**)&rctx->so_targets[i], targets[i]);
r600_context_add_resource_size(ctx, targets[i]->buffer);
}
for (; i < rctx->num_so_targets; i++) {
pipe_so_target_reference((struct pipe_stream_output_target**)&rctx->so_targets[i], NULL);

@@ -270,6 +270,7 @@ static void r600_texture_destroy(struct pipe_screen *screen,
if (rtex->flushed_depth_texture)
pipe_resource_reference((struct pipe_resource **)&rtex->flushed_depth_texture, NULL);
pipe_resource_reference((struct pipe_resource**)&rtex->htile, NULL);
pb_reference(&resource->buf, NULL);
FREE(rtex);
}

@@ -155,7 +155,7 @@ static inline LLVMValueRef bitcast(
void radeon_llvm_emit_prepare_cube_coords(struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data,
unsigned coord_arg);
LLVMValueRef *coords_arg);
void radeon_llvm_context_init(struct radeon_llvm_context * ctx);

@@ -531,7 +531,7 @@ static void kil_emit(
void radeon_llvm_emit_prepare_cube_coords(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data,
unsigned coord_arg)
LLVMValueRef *coords_arg)
{
unsigned target = emit_data->inst->Texture.Texture;
@@ -542,11 +542,13 @@ void radeon_llvm_emit_prepare_cube_coords(
LLVMValueRef coords[4];
LLVMValueRef mad_args[3];
LLVMValueRef idx;
struct LLVMOpaqueValue *cube_vec;
LLVMValueRef v;
unsigned i;
LLVMValueRef v = build_intrinsic(builder, "llvm.AMDGPU.cube",
LLVMVectorType(type, 4),
&emit_data->args[coord_arg], 1, LLVMReadNoneAttribute);
cube_vec = lp_build_gather_values(bld_base->base.gallivm, coords_arg, 4);
v = build_intrinsic(builder, "llvm.AMDGPU.cube", LLVMVectorType(type, 4),
&cube_vec, 1, LLVMReadNoneAttribute);
for (i = 0; i < 4; ++i) {
idx = lp_build_const_int32(gallivm, i);
@@ -579,18 +581,14 @@ void radeon_llvm_emit_prepare_cube_coords(
if (target != TGSI_TEXTURE_CUBE ||
opcode != TGSI_OPCODE_TEX) {
/* load source coord.w component - array_index for cube arrays or
* compare value for SHADOWCUBE */
idx = lp_build_const_int32(gallivm, 3);
coords[3] = LLVMBuildExtractElement(builder,
emit_data->args[coord_arg], idx, "");
/* for cube arrays coord.z = coord.w(array_index) * 8 + face */
if (target == TGSI_TEXTURE_CUBE_ARRAY ||
target == TGSI_TEXTURE_SHADOWCUBE_ARRAY) {
/* coords_arg.w component - array_index for cube arrays or
* compare value for SHADOWCUBE */
coords[2] = lp_build_emit_llvm_ternary(bld_base, TGSI_OPCODE_MAD,
coords[3], lp_build_const_float(gallivm, 8.0), coords[2]);
coords_arg[3], lp_build_const_float(gallivm, 8.0), coords[2]);
}
/* for instructions that need additional src (compare/lod/bias),
@@ -598,12 +596,11 @@ void radeon_llvm_emit_prepare_cube_coords(
if (opcode == TGSI_OPCODE_TEX2 ||
opcode == TGSI_OPCODE_TXB2 ||
opcode == TGSI_OPCODE_TXL2) {
coords[3] = emit_data->args[coord_arg + 1];
coords[3] = coords_arg[4];
}
}
emit_data->args[coord_arg] =
lp_build_gather_values(bld_base->base.gallivm, coords, 4);
memcpy(coords_arg, coords, sizeof(coords));
}
static void txd_fetch_args(
@@ -645,9 +642,6 @@ static void txp_fetch_args(
TGSI_OPCODE_DIV, arg, src_w);
}
coords[3] = bld_base->base.one;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->arg_count = 1;
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
inst->Texture.Texture == TGSI_TEXTURE_CUBE_ARRAY ||
@@ -655,8 +649,12 @@ static void txp_fetch_args(
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE_ARRAY) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ_LZ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 0);
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
}
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->arg_count = 1;
}
static void tex_fetch_args(
@@ -673,17 +671,12 @@ static void tex_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
unsigned chan;
for (chan = 0; chan < 4; chan++) {
coords[chan] = lp_build_emit_fetch(bld_base, inst, 0, chan);
}
emit_data->arg_count = 1;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
if (inst->Instruction.Opcode == TGSI_OPCODE_TEX2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXB2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXL2) {
@@ -692,7 +685,7 @@ static void tex_fetch_args(
* That operand should be passed as a float value in the args array
* right after the coord vector. After packing it's not used anymore,
* that's why arg_count is not increased */
emit_data->args[1] = lp_build_emit_fetch(bld_base, inst, 1, 0);
coords[4] = lp_build_emit_fetch(bld_base, inst, 1, 0);
}
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
@@ -701,8 +694,13 @@ static void tex_fetch_args(
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE_ARRAY) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ_LZ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 0);
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
}
emit_data->arg_count = 1;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
}
static void txf_fetch_args(

@@ -280,6 +280,7 @@ static const char *r600_get_family_name(enum radeon_family family)
case CHIP_TAHITI: return "AMD TAHITI";
case CHIP_PITCAIRN: return "AMD PITCAIRN";
case CHIP_VERDE: return "AMD CAPE VERDE";
case CHIP_OLAND: return "AMD OLAND";
default: return "AMD unknown";
}
}
@@ -379,7 +380,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS:
return 15;
case PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS:
return /*rscreen->info.drm_minor >= 9 ? 16384 :*/ 0;
return 16384;
case PIPE_CAP_MAX_COMBINED_SAMPLERS:
return 32;
@@ -458,7 +459,7 @@ static int r600_get_shader_param(struct pipe_screen* pscreen, unsigned shader, e
/* FIXME Isn't this equal to TEMPS? */
return 1; /* Max native address registers */
case PIPE_SHADER_CAP_MAX_CONSTS:
return 64;
return 4096; /* actually only memory limits this */
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
return 1;
case PIPE_SHADER_CAP_MAX_PREDS:

@@ -433,6 +433,15 @@ static LLVMValueRef fetch_constant(
LLVMValueRef offset;
LLVMValueRef load;
if (swizzle == LP_CHAN_ALL) {
unsigned chan;
LLVMValueRef values[4];
for (chan = 0; chan < TGSI_NUM_CHANNELS; ++chan)
values[chan] = fetch_constant(bld_base, reg, type, chan);
return lp_build_gather_values(bld_base->base.gallivm, values, 4);
}
/* currently not supported */
if (reg->Register.Indirect) {
assert(0);
@@ -446,12 +455,6 @@ static LLVMValueRef fetch_constant(
* CONST[0].x will have an offset of 0 and CONST[1].x will have an
* offset of 4. */
idx = (reg->Register.Index * 4) + swizzle;
/* index loads above 255 are currently not supported */
if (idx > 255) {
assert(0);
idx = 0;
}
offset = lp_build_const_int32(base->gallivm, idx);
load = build_indexed_load(base->gallivm, const_ptr, offset);
@@ -612,6 +615,12 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
int i;
tgsi_parse_token(parse);
if (parse->FullToken.Token.Type == TGSI_TOKEN_TYPE_PROPERTY &&
parse->FullToken.FullProperty.Property.PropertyName ==
TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS)
shader->fs_write_all = TRUE;
if (parse->FullToken.Token.Type != TGSI_TOKEN_TYPE_DECLARATION)
continue;
@@ -775,6 +784,29 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
last_args[1] = lp_build_const_int32(base->gallivm,
si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT);
if (shader->fs_write_all && shader->nr_cbufs > 1) {
int i;
/* Specify that this is not yet the last export */
last_args[2] = lp_build_const_int32(base->gallivm, 0);
for (i = 1; i < shader->nr_cbufs; i++) {
/* Specify the target we are exporting */
last_args[3] = lp_build_const_int32(base->gallivm,
V_008DFC_SQ_EXP_MRT + i);
lp_build_intrinsic(base->gallivm->builder,
"llvm.SI.export",
LLVMVoidTypeInContext(base->gallivm->context),
last_args, 9);
si_shader_ctx->shader->spi_shader_col_format |=
si_shader_ctx->shader->spi_shader_col_format << 4;
}
last_args[3] = lp_build_const_int32(base->gallivm, V_008DFC_SQ_EXP_MRT);
}
/* Specify that this is the last export */
last_args[2] = lp_build_const_int32(base->gallivm, 1);
@@ -791,54 +823,127 @@ static void tex_fetch_args(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
struct gallivm_state *gallivm = bld_base->base.gallivm;
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned opcode = inst->Instruction.Opcode;
unsigned target = inst->Texture.Texture;
LLVMValueRef ptr;
LLVMValueRef offset;
LLVMValueRef coords[4];
LLVMValueRef address[16];
unsigned count = 0;
unsigned chan;
/* WriteMask */
/* XXX: should be optimized using emit_data->inst->Dst[0].Register.WriteMask*/
emit_data->args[0] = lp_build_const_int32(bld_base->base.gallivm, 0xf);
/* Coordinates */
/* XXX: Not all sample instructions need 4 address arguments. */
if (inst->Instruction.Opcode == TGSI_OPCODE_TXP) {
LLVMValueRef src_w;
unsigned chan;
LLVMValueRef coords[4];
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
src_w = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);
for (chan = 0; chan < 3; chan++ ) {
LLVMValueRef arg = lp_build_emit_fetch(bld_base,
emit_data->inst, 0, chan);
/* Fetch and project texture coordinates */
coords[3] = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);
for (chan = 0; chan < 3; chan++ ) {
coords[chan] = lp_build_emit_fetch(bld_base,
emit_data->inst, 0,
chan);
if (opcode == TGSI_OPCODE_TXP)
coords[chan] = lp_build_emit_llvm_binary(bld_base,
TGSI_OPCODE_DIV,
arg, src_w);
}
coords[chan],
coords[3]);
}
if (opcode == TGSI_OPCODE_TXP)
coords[3] = bld_base->base.one;
emit_data->args[1] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
} else
emit_data->args[1] = lp_build_emit_fetch(bld_base, emit_data->inst,
0, LP_CHAN_ALL);
if (inst->Instruction.Opcode == TGSI_OPCODE_TEX2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXB2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXL2) {
/* These instructions have additional operand that should be packed
* into the cube coord vector by radeon_llvm_emit_prepare_cube_coords.
* That operand should be passed as a float value in the args array
* right after the coord vector. After packing it's not used anymore,
* that's why arg_count is not increased */
emit_data->args[2] = lp_build_emit_fetch(bld_base, inst, 1, 0);
/* Pack LOD bias value */
if (opcode == TGSI_OPCODE_TXB)
address[count++] = coords[3];
if ((target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE) &&
opcode != TGSI_OPCODE_TXQ)
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
/* Pack depth comparison value */
switch (target) {
case TGSI_TEXTURE_SHADOW1D:
case TGSI_TEXTURE_SHADOW1D_ARRAY:
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
address[count++] = coords[2];
break;
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
address[count++] = coords[3];
break;
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = lp_build_emit_fetch(bld_base, inst, 1, 0);
}
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 1);
/* Pack texture coordinates */
address[count++] = coords[0];
switch (target) {
case TGSI_TEXTURE_2D:
case TGSI_TEXTURE_2D_ARRAY:
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
case TGSI_TEXTURE_RECT:
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_2D_MSAA:
case TGSI_TEXTURE_2D_ARRAY_MSAA:
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[1];
}
switch (target) {
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[2];
}
/* Pack array slice */
switch (target) {
case TGSI_TEXTURE_1D_ARRAY:
address[count++] = coords[1];
}
switch (target) {
case TGSI_TEXTURE_2D_ARRAY:
case TGSI_TEXTURE_2D_ARRAY_MSAA:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
address[count++] = coords[2];
}
switch (target) {
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOW1D_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[3];
}
/* Pack LOD */
if (opcode == TGSI_OPCODE_TXL)
address[count++] = coords[3];
if (count > 16) {
assert(!"Cannot handle more than 16 texture address parameters");
count = 16;
}
for (chan = 0; chan < count; chan++ ) {
address[chan] = LLVMBuildBitCast(gallivm->builder,
address[chan],
LLVMInt32TypeInContext(gallivm->context),
"");
}
/* Pad to power of two vector */
while (count < util_next_power_of_two(count))
address[count++] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
emit_data->args[1] = lp_build_gather_values(gallivm, address, count);
/* Resource */
ptr = use_sgpr(bld_base->base.gallivm, SGPR_CONST_PTR_V8I32, SI_SGPR_RESOURCE);
@@ -855,8 +960,7 @@ static void tex_fetch_args(
ptr, offset);
/* Dimensions */
emit_data->args[4] = lp_build_const_int32(bld_base->base.gallivm,
emit_data->inst->Texture.Texture);
emit_data->args[4] = lp_build_const_int32(bld_base->base.gallivm, target);
emit_data->arg_count = 5;
/* XXX: To optimize, we could use a float or v2f32, if the last bits of
@@ -866,22 +970,37 @@ static void tex_fetch_args(
4);
}
static void build_tex_intrinsic(const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
struct lp_build_context * base = &bld_base->base;
char intr_name[23];
sprintf(intr_name, "%sv%ui32", action->intr_name,
LLVMGetVectorSize(LLVMTypeOf(emit_data->args[1])));
emit_data->output[emit_data->chan] = lp_build_intrinsic(
base->gallivm->builder, intr_name, emit_data->dst_type,
emit_data->args, emit_data->arg_count);
}
static const struct lp_build_tgsi_action tex_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.sample."
};
static const struct lp_build_tgsi_action txb_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample.bias"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.sampleb."
};
static const struct lp_build_tgsi_action txl_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample.lod"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.samplel."
};
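
The name built by build_tex_intrinsic above is the action's prefix followed by the width of the address vector, which tex_fetch_args pads to a power of two. A minimal sketch of that naming scheme follows; tex_intrinsic_name and next_pow2 are hypothetical helpers (next_pow2 stands in for util_next_power_of_two), and snprintf is used here in place of the sprintf in the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for util_next_power_of_two: smallest power of two
 * that is greater than or equal to n, for n >= 1. */
static unsigned next_pow2(unsigned n)
{
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Format "<prefix>v<N>i32", where N is the padded address count.
 * Returns the number of characters the full name needs, excluding the
 * terminating NUL, as snprintf does. */
static int tex_intrinsic_name(char *out, size_t out_size,
                              const char *prefix, unsigned count)
{
    assert(out && prefix && count > 0);
    return snprintf(out, out_size, "%sv%ui32", prefix, next_pow2(count));
}
```

The longest name this scheme can produce from the three prefixes in the diff is "llvm.SI.sampleb.v16i32" (or the equally long "llvm.SI.samplel.v16i32"), which is 22 characters, so it fits exactly in the 23-byte intr_name buffer including the terminator.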

@@ -720,7 +720,6 @@ static uint32_t si_translate_colorformat(enum pipe_format format)
case PIPE_FORMAT_L8A8_SNORM:
case PIPE_FORMAT_L8A8_UINT:
case PIPE_FORMAT_L8A8_SINT:
case PIPE_FORMAT_L8A8_SRGB:
case PIPE_FORMAT_R8G8_SNORM:
case PIPE_FORMAT_R8G8_UNORM:
case PIPE_FORMAT_R8G8_UINT:
@@ -804,15 +803,12 @@ static uint32_t si_translate_colorformat(enum pipe_format format)
return V_028C70_COLOR_10_11_11;
/* 64-bit buffers. */
case PIPE_FORMAT_R16G16B16_USCALED:
case PIPE_FORMAT_R16G16B16_SSCALED:
case PIPE_FORMAT_R16G16B16A16_UINT:
case PIPE_FORMAT_R16G16B16A16_SINT:
case PIPE_FORMAT_R16G16B16A16_USCALED:
case PIPE_FORMAT_R16G16B16A16_SSCALED:
case PIPE_FORMAT_R16G16B16A16_UNORM:
case PIPE_FORMAT_R16G16B16A16_SNORM:
case PIPE_FORMAT_R16G16B16_FLOAT:
case PIPE_FORMAT_R16G16B16A16_FLOAT:
return V_028C70_COLOR_16_16_16_16;
@@ -898,7 +894,6 @@ static uint32_t si_translate_colorswap(enum pipe_format format)
case PIPE_FORMAT_L8A8_SNORM:
case PIPE_FORMAT_L8A8_UINT:
case PIPE_FORMAT_L8A8_SINT:
case PIPE_FORMAT_L8A8_SRGB:
return V_028C70_SWAP_ALT;
case PIPE_FORMAT_R8G8_SNORM:
case PIPE_FORMAT_R8G8_UNORM:
@@ -1172,6 +1167,8 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen,
goto out_unknown; /* TODO */
case UTIL_FORMAT_COLORSPACE_SRGB:
if (desc->nr_channels != 4 && desc->nr_channels != 1)
goto out_unknown;
break;
default:
@@ -1624,15 +1621,19 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB)
ntype = V_028C70_NUMBER_SRGB;
else if (desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED) {
if (desc->channel[i].normalized)
ntype = V_028C70_NUMBER_SNORM;
else if (desc->channel[i].pure_integer)
if (desc->channel[i].pure_integer) {
ntype = V_028C70_NUMBER_SINT;
} else {
assert(desc->channel[i].normalized);
ntype = V_028C70_NUMBER_SNORM;
}
} else if (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED) {
if (desc->channel[i].normalized)
ntype = V_028C70_NUMBER_UNORM;
else if (desc->channel[i].pure_integer)
if (desc->channel[i].pure_integer) {
ntype = V_028C70_NUMBER_UINT;
} else {
assert(desc->channel[i].normalized);
ntype = V_028C70_NUMBER_UNORM;
}
}
}
@@ -2093,16 +2094,31 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
first_non_void = util_format_get_first_non_void_channel(pipe_format);
if (first_non_void < 0) {
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
} else switch (desc->channel[first_non_void].type) {
case UTIL_FORMAT_TYPE_FLOAT:
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
break;
case UTIL_FORMAT_TYPE_SIGNED:
num_format = V_008F14_IMG_NUM_FORMAT_SNORM;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
default:
} else if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB) {
num_format = V_008F14_IMG_NUM_FORMAT_SRGB;
} else {
num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
switch (desc->channel[first_non_void].type) {
case UTIL_FORMAT_TYPE_FLOAT:
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
break;
case UTIL_FORMAT_TYPE_SIGNED:
if (desc->channel[first_non_void].normalized)
num_format = V_008F14_IMG_NUM_FORMAT_SNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F14_IMG_NUM_FORMAT_SINT;
else
num_format = V_008F14_IMG_NUM_FORMAT_SSCALED;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
if (desc->channel[first_non_void].normalized)
num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F14_IMG_NUM_FORMAT_UINT;
else
num_format = V_008F14_IMG_NUM_FORMAT_USCALED;
}
}
format = si_translate_texformat(ctx->screen, pipe_format, desc, first_non_void);
@@ -2476,10 +2492,20 @@ static void *si_create_vertex_elements(struct pipe_context *ctx,
num_format = V_008F0C_BUF_NUM_FORMAT_USCALED; /* XXX */
break;
case UTIL_FORMAT_TYPE_SIGNED:
num_format = V_008F0C_BUF_NUM_FORMAT_SNORM;
if (desc->channel[first_non_void].normalized)
num_format = V_008F0C_BUF_NUM_FORMAT_SNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F0C_BUF_NUM_FORMAT_SINT;
else
num_format = V_008F0C_BUF_NUM_FORMAT_SSCALED;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
num_format = V_008F0C_BUF_NUM_FORMAT_UNORM;
if (desc->channel[first_non_void].normalized)
num_format = V_008F0C_BUF_NUM_FORMAT_UNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F0C_BUF_NUM_FORMAT_UINT;
else
num_format = V_008F0C_BUF_NUM_FORMAT_USCALED;
break;
case UTIL_FORMAT_TYPE_FLOAT:
default:
@@ -2665,9 +2691,14 @@ void si_init_config(struct r600_context *rctx)
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x2a00126a);
break;
case CHIP_VERDE:
default:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x0000124a);
break;
case CHIP_OLAND:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000082);
break;
default:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000000);
break;
}
si_pm4_set_state(rctx, init, pm4);
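
The si_create_vertex_elements hunk above replaces a fixed SNORM/UNORM choice with a three-way decision per channel: normalized, pure integer, or scaled. A minimal sketch of that decision, using hypothetical enum and struct names rather than the util_format and V_008F0C_* definitions from the driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for UTIL_FORMAT_TYPE_* and V_008F0C_BUF_NUM_FORMAT_*. */
enum chan_type { CHAN_SIGNED, CHAN_UNSIGNED, CHAN_FLOAT };
enum num_format {
    NUM_UNORM, NUM_SNORM, NUM_USCALED, NUM_SSCALED,
    NUM_UINT, NUM_SINT, NUM_FLOAT
};

/* Hypothetical subset of a channel description. */
struct chan_desc {
    enum chan_type type;
    bool normalized;
    bool pure_integer;
};

/* Normalized wins, then pure integer, otherwise the value is scaled. */
static enum num_format pick_num_format(const struct chan_desc *c)
{
    assert(c);
    switch (c->type) {
    case CHAN_SIGNED:
        if (c->normalized)
            return NUM_SNORM;
        return c->pure_integer ? NUM_SINT : NUM_SSCALED;
    case CHAN_UNSIGNED:
        if (c->normalized)
            return NUM_UNORM;
        return c->pure_integer ? NUM_UINT : NUM_USCALED;
    case CHAN_FLOAT:
    default:
        return NUM_FLOAT;
    }
}
```

Under these assumptions a signed pure-integer channel maps to NUM_SINT, whereas the code before the change would have reported it as SNORM.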

@@ -524,10 +524,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
struct pipe_index_buffer ib = {};
uint32_t cp_coher_cntl;
if ((!info->count && (info->indexed || !info->count_from_stream_output)) ||
(info->indexed && !rctx->index_buffer.buffer)) {
if (!info->count && (info->indexed || !info->count_from_stream_output))
return;
}
if (!rctx->ps_shader || !rctx->vs_shader)
return;
@@ -538,13 +536,14 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
if (info->indexed) {
/* Initialize the index buffer struct. */
pipe_resource_reference(&ib.buffer, rctx->index_buffer.buffer);
ib.user_buffer = rctx->index_buffer.user_buffer;
ib.index_size = rctx->index_buffer.index_size;
ib.offset = rctx->index_buffer.offset + info->start * ib.index_size;
/* Translate or upload, if needed. */
r600_translate_index_buffer(rctx, &ib, info->count);
if (ib.user_buffer) {
if (ib.user_buffer && !ib.buffer) {
r600_upload_index_buffer(rctx, &ib, info->count);
}

@@ -2963,6 +2963,7 @@ sp_create_sampler_variant( const struct pipe_sampler_state *sampler,
case PIPE_TEX_MIPFILTER_LINEAR:
if (key.bits.is_pot &&
key.bits.target == PIPE_TEXTURE_2D &&
sampler->min_img_filter == sampler->mag_img_filter &&
sampler->normalized_coords &&
sampler->wrap_s == PIPE_TEX_WRAP_REPEAT &&

@@ -23,6 +23,7 @@
*
**********************************************************/
#include "util/u_format.h"
#include "util/u_inlines.h"
#include "util/u_memory.h"
#include "pipe/p_defines.h"
@@ -248,6 +249,16 @@ emit_rss(struct svga_context *svga, unsigned dirty)
EMIT_RS_FLOAT( svga, bias, DEPTHBIAS, fail );
}
if (dirty & SVGA_NEW_FRAME_BUFFER) {
/* XXX: we only look at the first color buffer's sRGB state */
float gamma = 1.0f;
if (svga->curr.framebuffer.cbufs[0] &&
util_format_is_srgb(svga->curr.framebuffer.cbufs[0]->format)) {
gamma = 2.2f;
}
EMIT_RS_FLOAT(svga, gamma, OUTPUTGAMMA, fail);
}
if (dirty & SVGA_NEW_RAST) {
/* bitmask of the enabled clip planes */
unsigned enabled = svga->curr.rast->templ.clip_plane_enable;

@@ -72,6 +72,7 @@ AM_CPPFLAGS += \
-I$(top_srcdir)/src/gallium/winsys \
-I$(top_srcdir)/src/egl/wayland/wayland-egl \
-I$(top_srcdir)/src/egl/wayland/wayland-drm \
-I$(top_builddir)/src/egl/wayland/wayland-drm \
-DHAVE_WAYLAND_BACKEND
endif

@@ -318,7 +318,7 @@ ExaFinishAccess(PixmapPtr pPix, int index)
if (!priv)
return;
if (!priv->map_transfer || pPix->devPrivate.ptr == NULL)
if (!priv->map_transfer)
return;
exa_debug_printf("ExaFinishAccess %d\n", index);

@@ -593,10 +593,11 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr,
va.offset = bo->va;
r = drmCommandWriteRead(rws->fd, DRM_RADEON_GEM_VA, &va, sizeof(va));
if (r && va.operation == RADEON_VA_RESULT_ERROR) {
fprintf(stderr, "radeon: Failed to allocate a buffer:\n");
fprintf(stderr, "radeon: Failed to allocate virtual address for buffer:\n");
fprintf(stderr, "radeon: size : %d bytes\n", size);
fprintf(stderr, "radeon: alignment : %d bytes\n", desc->alignment);
fprintf(stderr, "radeon: domains : %d\n", args.initial_domain);
fprintf(stderr, "radeon: va : 0x%016llx\n", (unsigned long long)bo->va);
radeon_bo_destroy(&bo->base);
return NULL;
}
@@ -962,6 +963,10 @@ static boolean radeon_winsys_bo_get_handle(struct pb_buffer *buffer,
whandle->handle = bo->handle;
}
pipe_mutex_lock(bo->mgr->bo_handles_mutex);
util_hash_table_set(bo->mgr->bo_handles, (void*)(uintptr_t)whandle->handle, bo);
pipe_mutex_unlock(bo->mgr->bo_handles_mutex);
whandle->stride = stride;
return TRUE;
}

@@ -383,6 +383,16 @@ static boolean radeon_drm_cs_validate(struct radeon_winsys_cs *rcs)
return status;
}
static boolean radeon_drm_cs_memory_below_limit(struct radeon_winsys_cs *rcs, uint64_t vram, uint64_t gtt)
{
struct radeon_drm_cs *cs = radeon_drm_cs(rcs);
boolean status =
(cs->csc->used_gart + gtt) < cs->ws->info.gart_size * 0.7 &&
(cs->csc->used_vram + vram) < cs->ws->info.vram_size * 0.7;
return status;
}
static void radeon_drm_cs_write_reloc(struct radeon_winsys_cs *rcs,
struct radeon_winsys_cs_handle *buf)
{
@@ -575,6 +585,7 @@ void radeon_drm_cs_init_functions(struct radeon_drm_winsys *ws)
ws->base.cs_destroy = radeon_drm_cs_destroy;
ws->base.cs_add_reloc = radeon_drm_cs_add_reloc;
ws->base.cs_validate = radeon_drm_cs_validate;
ws->base.cs_memory_below_limit = radeon_drm_cs_memory_below_limit;
ws->base.cs_write_reloc = radeon_drm_cs_write_reloc;
ws->base.cs_flush = radeon_drm_cs_flush;
ws->base.cs_set_flush_callback = radeon_drm_cs_set_flush;
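
The new cs_memory_below_limit hook above checks the memory already referenced by the command stream plus a pending amount against 70% of each heap. A minimal sketch of that check outside the winsys, assuming a hypothetical mem_budget struct in place of the radeon_drm_cs and winsys info fields:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the used_gart/used_vram counters and the
 * gart_size/vram_size values read in radeon_drm_cs_memory_below_limit. */
struct mem_budget {
    uint64_t used_gart, gart_size;
    uint64_t used_vram, vram_size;
};

/* True when both heaps stay under 70% of their size after adding the
 * pending vram and gtt amounts (all values in bytes). */
static bool memory_below_limit(const struct mem_budget *b,
                               uint64_t vram, uint64_t gtt)
{
    assert(b);
    return (b->used_gart + gtt) < b->gart_size * 0.7 &&
           (b->used_vram + vram) < b->vram_size * 0.7;
}
```

A caller would accumulate the sizes of newly bound resources and flush the command stream once this returns false. The comparison is done in floating point, as in the diff, so results exactly at the 70% boundary depend on rounding.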

@@ -312,6 +312,7 @@ static boolean do_winsys_init(struct radeon_drm_winsys *ws)
case CHIP_TAHITI:
case CHIP_PITCAIRN:
case CHIP_VERDE:
case CHIP_OLAND:
ws->info.chip_class = TAHITI;
break;
}

@@ -123,6 +123,7 @@ enum radeon_family {
CHIP_TAHITI,
CHIP_PITCAIRN,
CHIP_VERDE,
CHIP_OLAND,
CHIP_LAST,
};
@@ -392,6 +393,16 @@ struct radeon_winsys {
*/
boolean (*cs_validate)(struct radeon_winsys_cs *cs);
/**
* Return TRUE if there is enough memory in VRAM and GTT for the relocs
* added so far.
*
* \param cs A command stream to validate.
* \param vram VRAM memory size pending to be used
* \param gtt GTT memory size pending to be used
*/
boolean (*cs_memory_below_limit)(struct radeon_winsys_cs *cs, uint64_t vram, uint64_t gtt);
/**
* Write a relocated dword to a command buffer.
*

@@ -2829,9 +2829,9 @@ ast_declarator_list::hir(exec_list *instructions,
* flat."
*
* From section 4.3.4 of the GLSL 3.00 ES spec:
* "Fragment shader inputs that are signed or unsigned integers or
* integer vectors must be qualified with the interpolation qualifier
* flat."
* "Fragment shader inputs that are, or contain, signed or unsigned
* integers or integer vectors must be qualified with the
* interpolation qualifier flat."
*
* Since vertex outputs and fragment inputs must have matching
* qualifiers, these two requirements are equivalent.
@@ -2839,12 +2839,12 @@ ast_declarator_list::hir(exec_list *instructions,
if (state->is_version(130, 300)
&& state->target == vertex_shader
&& state->current_function == NULL
&& var->type->is_integer()
&& var->type->contains_integer()
&& var->mode == ir_var_shader_out
&& var->interpolation != INTERP_QUALIFIER_FLAT) {
_mesa_glsl_error(&loc, state, "If a vertex output is an integer, "
"then it must be qualified with 'flat'");
_mesa_glsl_error(&loc, state, "If a vertex output is (or contains) "
"an integer, then it must be qualified with 'flat'");
}
@@ -3967,6 +3967,47 @@ ast_iteration_statement::hir(exec_list *instructions,
}
/**
* Determine if the given type is valid for establishing a default precision
* qualifier.
*
* From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"):
*
* "The precision statement
*
* precision precision-qualifier type;
*
* can be used to establish a default precision qualifier. The type field
* can be either int or float or any of the sampler types, and the
* precision-qualifier can be lowp, mediump, or highp."
*
* GLSL ES 1.00 has similar language. GLSL 1.30 doesn't allow precision
* qualifiers on sampler types, but this seems like an oversight (since the
* intention of including these in GLSL 1.30 is to allow compatibility with ES
* shaders). So we allow int, float, and all sampler types regardless of GLSL
* version.
*/
static bool
is_valid_default_precision_type(const struct _mesa_glsl_parse_state *state,
const char *type_name)
{
const struct glsl_type *type = state->symbols->get_type(type_name);
if (type == NULL)
return false;
switch (type->base_type) {
case GLSL_TYPE_INT:
case GLSL_TYPE_FLOAT:
/* "int" and "float" are valid, but vectors and matrices are not. */
return type->vector_elements == 1 && type->matrix_columns == 1;
case GLSL_TYPE_SAMPLER:
return true;
default:
return false;
}
}
ir_rvalue *
ast_type_specifier::hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state)
@@ -4007,11 +4048,10 @@ ast_type_specifier::hir(exec_list *instructions,
"arrays");
return NULL;
}
if (strcmp(this->type_name, "float") != 0 &&
strcmp(this->type_name, "int") != 0) {
if (!is_valid_default_precision_type(state, this->type_name)) {
_mesa_glsl_error(&loc, state,
"default precision statements apply only to types "
"float and int");
"float, int, and sampler types");
return NULL;
}


@@ -20,23 +20,44 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
CC = @CC_FOR_BUILD@
CFLAGS = @CFLAGS_FOR_BUILD@
CPP = @CPP_FOR_BUILD@
CPPFLAGS = @CPPFLAGS_FOR_BUILD@
CXX = @CXX_FOR_BUILD@
CXXFLAGS = @CXXFLAGS_FOR_BUILD@
LD = @LD_FOR_BUILD@
LDFLAGS = @LDFLAGS_FOR_BUILD@
AM_CFLAGS = \
-I $(top_srcdir)/include \
-I $(top_srcdir)/src/mapi \
-I $(top_srcdir)/src/mesa \
-I $(GLSL_SRCDIR) \
-I $(GLSL_SRCDIR)/glcpp \
-I $(GLSL_BUILDDIR) \
$(DEFINES_FOR_BUILD)
-I $(GLSL_BUILDDIR)
if CROSS_COMPILING
proxyCC = @CC_FOR_BUILD@
proxyCFLAGS = @CFLAGS_FOR_BUILD@
proxyCPP = @CPP_FOR_BUILD@
proxyCPPFLAGS = @CPPFLAGS_FOR_BUILD@
proxyCXX = @CXX_FOR_BUILD@
proxyCXXFLAGS = @CXXFLAGS_FOR_BUILD@
proxyLD = @LD_FOR_BUILD@
proxyLDFLAGS = @LDFLAGS_FOR_BUILD@
AM_CFLAGS += $(DEFINES_FOR_BUILD)
else
proxyCC = @CC@
proxyCFLAGS = @CFLAGS@
proxyCPP = @CPP@
proxyCPPFLAGS = @CPPFLAGS@
proxyCXX = @CXX@
proxyCXXFLAGS = @CXXFLAGS@
proxyLD = @LD@
proxyLDFLAGS = @LDFLAGS@
AM_CFLAGS += $(DEFINES)
endif
CC = $(proxyCC)
CFLAGS = $(proxyCFLAGS)
CPP = $(proxyCPP)
CPPFLAGS = $(proxyCPPFLAGS)
CXX = $(proxyCXX)
CXXFLAGS = $(proxyCXXFLAGS)
LD = $(proxyLD)
LDFLAGS = $(proxyLDFLAGS)
AM_CXXFLAGS = $(AM_CFLAGS)


@@ -156,6 +156,24 @@ glsl_type::contains_sampler() const
}
}
bool
glsl_type::contains_integer() const
{
if (this->is_array()) {
return this->fields.array->contains_integer();
} else if (this->is_record()) {
for (unsigned int i = 0; i < this->length; i++) {
if (this->fields.structure[i].type->contains_integer())
return true;
}
return false;
} else {
return this->is_integer();
}
}
gl_texture_index
glsl_type::sampler_index() const
{
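The recursive check added by contains_integer() above can be sketched in isolation. This is a minimal illustration, assuming a simplified, hypothetical Type struct in place of Mesa's real glsl_type; the Base enum and the field names are invented for the example.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for glsl_type (not Mesa's real class).
enum class Base { Int, Uint, Float, Array, Struct };

struct Type {
    Base base;
    const Type *element;               // element type when base == Base::Array
    std::vector<const Type *> fields;  // member types when base == Base::Struct
};

// Mirrors the shape of the diff: arrays recurse into their element type,
// structs recurse into each field, and scalars fall back to an integer test.
bool contains_integer(const Type &t)
{
    switch (t.base) {
    case Base::Array:
        return t.element != nullptr && contains_integer(*t.element);
    case Base::Struct:
        for (const Type *field : t.fields) {
            if (contains_integer(*field))
                return true;
        }
        return false;
    default:
        return t.base == Base::Int || t.base == Base::Uint;
    }
}
```

With this sketch, a struct holding a float and an array of ints reports true, which is the case the old is_integer() test in the 'flat' qualifier check would have missed.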


@@ -359,6 +359,12 @@ struct glsl_type {
return (base_type == GLSL_TYPE_UINT) || (base_type == GLSL_TYPE_INT);
}
/**
* Query whether or not type is an integral type, or for struct and array
* types, contains an integral type.
*/
bool contains_integer() const;
/**
* Query whether or not a type is a float type
*/


@@ -29,7 +29,7 @@
#include "main/hash_table.h"
#include "program.h"
class ubo_visitor : public uniform_field_visitor {
class ubo_visitor : public program_resource_visitor {
public:
ubo_visitor(void *mem_ctx, gl_uniform_buffer_variable *variables,
unsigned num_variables)
@@ -44,7 +44,7 @@ public:
this->offset = 0;
this->buffer_size = 0;
this->is_array_instance = strchr(name, ']') != NULL;
this->uniform_field_visitor::process(type, name);
this->program_resource_visitor::process(type, name);
}
unsigned index;
@@ -112,7 +112,7 @@ private:
}
};
class count_block_size : public uniform_field_visitor {
class count_block_size : public program_resource_visitor {
public:
count_block_size() : num_active_uniforms(0)
{


@@ -52,7 +52,7 @@ values_for_type(const glsl_type *type)
}
void
uniform_field_visitor::process(const glsl_type *type, const char *name)
program_resource_visitor::process(const glsl_type *type, const char *name)
{
assert(type->is_record()
|| (type->is_array() && type->fields.array->is_record())
@@ -65,7 +65,7 @@ uniform_field_visitor::process(const glsl_type *type, const char *name)
}
void
uniform_field_visitor::process(ir_variable *var)
program_resource_visitor::process(ir_variable *var)
{
const glsl_type *t = var->type;
@@ -93,8 +93,8 @@ uniform_field_visitor::process(ir_variable *var)
}
void
uniform_field_visitor::recursion(const glsl_type *t, char **name,
size_t name_length, bool row_major)
program_resource_visitor::recursion(const glsl_type *t, char **name,
size_t name_length, bool row_major)
{
/* Records need to have each field processed individually.
*
@@ -110,7 +110,7 @@ uniform_field_visitor::recursion(const glsl_type *t, char **name,
if (t->fields.structure[i].type->is_record())
this->visit_field(&t->fields.structure[i]);
/* Append '.field' to the current uniform name. */
/* Append '.field' to the current variable name. */
if (name_length == 0) {
ralloc_asprintf_rewrite_tail(name, &new_length, "%s", field);
} else {
@@ -125,7 +125,7 @@ uniform_field_visitor::recursion(const glsl_type *t, char **name,
for (unsigned i = 0; i < t->length; i++) {
size_t new_length = name_length;
/* Append the subscript to the current uniform name */
/* Append the subscript to the current variable name */
ralloc_asprintf_rewrite_tail(name, &new_length, "[%u]", i);
recursion(t->fields.array, name, new_length,
@@ -137,7 +137,7 @@ uniform_field_visitor::recursion(const glsl_type *t, char **name,
}
void
uniform_field_visitor::visit_field(const glsl_struct_field *field)
program_resource_visitor::visit_field(const glsl_struct_field *field)
{
(void) field;
/* empty */
@@ -153,7 +153,7 @@ uniform_field_visitor::visit_field(const glsl_struct_field *field)
* If the same uniform is added multiple times (i.e., once for each shader
* target), it will only be accounted once.
*/
class count_uniform_size : public uniform_field_visitor {
class count_uniform_size : public program_resource_visitor {
public:
count_uniform_size(struct string_to_uint_map *map)
: num_active_uniforms(0), num_values(0), num_shader_samplers(0),
@@ -171,10 +171,10 @@ public:
void process(ir_variable *var)
{
if (var->is_interface_instance())
uniform_field_visitor::process(var->interface_type,
var->interface_type->name);
program_resource_visitor::process(var->interface_type,
var->interface_type->name);
else
uniform_field_visitor::process(var);
program_resource_visitor::process(var);
}
/**
@@ -258,7 +258,7 @@ private:
* the \c gl_uniform_storage and \c gl_constant_value arrays are "big
* enough."
*/
class parcel_out_uniform_storage : public uniform_field_visitor {
class parcel_out_uniform_storage : public program_resource_visitor {
public:
parcel_out_uniform_storage(struct string_to_uint_map *map,
struct gl_uniform_storage *uniforms,


@@ -35,6 +35,8 @@
#include "linker.h"
#include "link_varyings.h"
#include "main/macros.h"
#include "program/hash_table.h"
#include "program.h"
/**
@@ -154,10 +156,13 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
/**
* Initialize this object based on a string that was passed to
* glTransformFeedbackVaryings. If there is a parse error, the error is
* reported using linker_error(), and false is returned.
* glTransformFeedbackVaryings.
*
* If the input is mal-formed, this call still succeeds, but it sets
* this->var_name to a mal-formed input, so tfeedback_decl::find_output_var()
* will fail to find any matching variable.
*/
bool
void
tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
const void *mem_ctx, const char *input)
{
@@ -170,12 +175,13 @@ tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
this->is_clip_distance_mesa = false;
this->skip_components = 0;
this->next_buffer_separator = false;
this->matched_candidate = NULL;
if (ctx->Extensions.ARB_transform_feedback3) {
/* Parse gl_NextBuffer. */
if (strcmp(input, "gl_NextBuffer") == 0) {
this->next_buffer_separator = true;
return true;
return;
}
/* Parse gl_SkipComponents. */
@@ -189,21 +195,17 @@ tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
this->skip_components = 4;
if (this->skip_components)
return true;
return;
}
/* Parse a declaration. */
const char *bracket = strrchr(input, '[');
if (bracket) {
this->var_name = ralloc_strndup(mem_ctx, input, bracket - input);
if (sscanf(bracket, "[%u]", &this->array_subscript) != 1) {
linker_error(prog, "Cannot parse transform feedback varying %s", input);
return false;
}
const char *base_name_end;
long subscript = parse_program_resource_name(input, &base_name_end);
this->var_name = ralloc_strndup(mem_ctx, input, base_name_end - input);
if (subscript >= 0) {
this->array_subscript = subscript;
this->is_subscripted = true;
} else {
this->var_name = ralloc_strdup(mem_ctx, input);
this->is_subscripted = false;
}
@@ -215,8 +217,6 @@ tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
strcmp(this->var_name, "gl_ClipDistance") == 0) {
this->is_clip_distance_mesa = true;
}
return true;
}
@@ -240,27 +240,32 @@ tfeedback_decl::is_same(const tfeedback_decl &x, const tfeedback_decl &y)
/**
* Assign a location for this tfeedback_decl object based on the location
* assignment in output_var.
* Assign a location for this tfeedback_decl object based on the transform
* feedback candidate found by find_candidate.
*
* If an error occurs, the error is reported through linker_error() and false
* is returned.
*/
bool
tfeedback_decl::assign_location(struct gl_context *ctx,
struct gl_shader_program *prog,
ir_variable *output_var)
struct gl_shader_program *prog)
{
assert(this->is_varying());
if (output_var->type->is_array()) {
unsigned fine_location
= this->matched_candidate->toplevel_var->location * 4
+ this->matched_candidate->toplevel_var->location_frac
+ this->matched_candidate->offset;
if (this->matched_candidate->type->is_array()) {
/* Array variable */
const unsigned matrix_cols =
output_var->type->fields.array->matrix_columns;
this->matched_candidate->type->fields.array->matrix_columns;
const unsigned vector_elements =
output_var->type->fields.array->vector_elements;
this->matched_candidate->type->fields.array->vector_elements;
unsigned actual_array_size = this->is_clip_distance_mesa ?
prog->Vert.ClipDistanceArraySize : output_var->type->array_size();
prog->Vert.ClipDistanceArraySize :
this->matched_candidate->type->array_size();
if (this->is_subscripted) {
/* Check array bounds. */
@@ -271,22 +276,11 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
actual_array_size);
return false;
}
if (this->is_clip_distance_mesa) {
this->location =
output_var->location + this->array_subscript / 4;
this->location_frac = this->array_subscript % 4;
} else {
unsigned fine_location
= output_var->location * 4 + output_var->location_frac;
unsigned array_elem_size = vector_elements * matrix_cols;
fine_location += array_elem_size * this->array_subscript;
this->location = fine_location / 4;
this->location_frac = fine_location % 4;
}
unsigned array_elem_size = this->is_clip_distance_mesa ?
1 : vector_elements * matrix_cols;
fine_location += array_elem_size * this->array_subscript;
this->size = 1;
} else {
this->location = output_var->location;
this->location_frac = output_var->location_frac;
this->size = actual_array_size;
}
this->vector_elements = vector_elements;
@@ -294,7 +288,7 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
if (this->is_clip_distance_mesa)
this->type = GL_FLOAT;
else
this->type = output_var->type->fields.array->gl_type;
this->type = this->matched_candidate->type->fields.array->gl_type;
} else {
/* Regular variable (scalar, vector, or matrix) */
if (this->is_subscripted) {
@@ -303,13 +297,13 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
this->orig_name, this->var_name);
return false;
}
this->location = output_var->location;
this->location_frac = output_var->location_frac;
this->size = 1;
this->vector_elements = output_var->type->vector_elements;
this->matrix_columns = output_var->type->matrix_columns;
this->type = output_var->type->gl_type;
this->vector_elements = this->matched_candidate->type->vector_elements;
this->matrix_columns = this->matched_candidate->type->matrix_columns;
this->type = this->matched_candidate->type->gl_type;
}
this->location = fine_location / 4;
this->location_frac = fine_location % 4;
/* From GL_EXT_transform_feedback:
* A program will fail to link if:
@@ -404,35 +398,26 @@ tfeedback_decl::store(struct gl_context *ctx, struct gl_shader_program *prog,
}
ir_variable *
tfeedback_decl::find_output_var(gl_shader_program *prog,
gl_shader *producer) const
const tfeedback_candidate *
tfeedback_decl::find_candidate(gl_shader_program *prog,
hash_table *tfeedback_candidates)
{
const char *name = this->is_clip_distance_mesa
? "gl_ClipDistanceMESA" : this->var_name;
ir_variable *var = producer->symbols->get_variable(name);
if (var && var->mode == ir_var_shader_out) {
const glsl_type *type = var->type;
while (type->base_type == GLSL_TYPE_ARRAY)
type = type->fields.array;
if (type->base_type == GLSL_TYPE_STRUCT) {
linker_error(prog, "Transform feedback of varying structs not "
"implemented yet.");
return NULL;
}
return var;
this->matched_candidate = (const tfeedback_candidate *)
hash_table_find(tfeedback_candidates, name);
if (!this->matched_candidate) {
/* From GL_EXT_transform_feedback:
* A program will fail to link if:
*
* * any variable name specified in the <varyings> array is not
* declared as an output in the geometry shader (if present) or
* the vertex shader (if no geometry shader is present);
*/
linker_error(prog, "Transform feedback varying %s undeclared.",
this->orig_name);
}
/* From GL_EXT_transform_feedback:
* A program will fail to link if:
*
* * any variable name specified in the <varyings> array is not
* declared as an output in the geometry shader (if present) or
* the vertex shader (if no geometry shader is present);
*/
linker_error(prog, "Transform feedback varying %s undeclared.",
this->orig_name);
return NULL;
return this->matched_candidate;
}
@@ -449,8 +434,7 @@ parse_tfeedback_decls(struct gl_context *ctx, struct gl_shader_program *prog,
char **varying_names, tfeedback_decl *decls)
{
for (unsigned i = 0; i < num_names; ++i) {
if (!decls[i].init(ctx, prog, mem_ctx, varying_names[i]))
return false;
decls[i].init(ctx, prog, mem_ctx, varying_names[i]);
if (!decls[i].is_varying())
continue;
@@ -870,6 +854,80 @@ is_varying_var(GLenum shaderType, const ir_variable *var)
}
/**
* Visitor class that generates tfeedback_candidate structs describing all
* possible targets of transform feedback.
*
* tfeedback_candidate structs are stored in the hash table
* tfeedback_candidates, which is passed to the constructor. This hash table
* maps varying names to instances of the tfeedback_candidate struct.
*/
class tfeedback_candidate_generator : public program_resource_visitor
{
public:
tfeedback_candidate_generator(void *mem_ctx,
hash_table *tfeedback_candidates)
: mem_ctx(mem_ctx),
tfeedback_candidates(tfeedback_candidates)
{
}
void process(ir_variable *var)
{
this->toplevel_var = var;
this->varying_floats = 0;
if (var->is_interface_instance())
program_resource_visitor::process(var->interface_type,
var->interface_type->name);
else
program_resource_visitor::process(var);
}
private:
virtual void visit_field(const glsl_type *type, const char *name,
bool row_major)
{
assert(!type->is_record());
assert(!(type->is_array() && type->fields.array->is_record()));
assert(!type->is_interface());
assert(!(type->is_array() && type->fields.array->is_interface()));
(void) row_major;
tfeedback_candidate *candidate
= rzalloc(this->mem_ctx, tfeedback_candidate);
candidate->toplevel_var = this->toplevel_var;
candidate->type = type;
candidate->offset = this->varying_floats;
hash_table_insert(this->tfeedback_candidates, candidate,
ralloc_strdup(this->mem_ctx, name));
this->varying_floats += type->component_slots();
}
/**
* Memory context used to allocate hash table keys and values.
*/
void * const mem_ctx;
/**
* Hash table in which tfeedback_candidate objects should be stored.
*/
hash_table * const tfeedback_candidates;
/**
* Pointer to the toplevel variable that is being traversed.
*/
ir_variable *toplevel_var;
/**
* Total number of varying floats that have been visited so far. This is
* used to determine the offset to each varying within the toplevel
* variable.
*/
unsigned varying_floats;
};
/**
* Assign locations for all variables that are produced in one pipeline stage
* (the "producer") and consumed in the next stage (the "consumer").
@@ -902,6 +960,8 @@ assign_varying_locations(struct gl_context *ctx,
const unsigned producer_base = VERT_RESULT_VAR0;
const unsigned consumer_base = FRAG_ATTRIB_VAR0;
varying_matches matches(ctx->Const.DisableVaryingPacking);
hash_table *tfeedback_candidates
= hash_table_ctor(0, hash_table_string_hash, hash_table_string_compare);
/* Operate in a total of three passes.
*
@@ -920,6 +980,9 @@ assign_varying_locations(struct gl_context *ctx,
if ((output_var == NULL) || (output_var->mode != ir_var_shader_out))
continue;
tfeedback_candidate_generator g(mem_ctx, tfeedback_candidates);
g.process(output_var);
ir_variable *input_var =
consumer ? consumer->symbols->get_variable(output_var->name) : NULL;
@@ -935,15 +998,16 @@ assign_varying_locations(struct gl_context *ctx,
if (!tfeedback_decls[i].is_varying())
continue;
ir_variable *output_var
= tfeedback_decls[i].find_output_var(prog, producer);
const tfeedback_candidate *matched_candidate
= tfeedback_decls[i].find_candidate(prog, tfeedback_candidates);
if (output_var == NULL)
if (matched_candidate == NULL) {
hash_table_dtor(tfeedback_candidates);
return false;
if (output_var->is_unmatched_generic_inout) {
matches.record(output_var, NULL);
}
if (matched_candidate->toplevel_var->is_unmatched_generic_inout)
matches.record(matched_candidate->toplevel_var, NULL);
}
const unsigned slots_used = matches.assign_locations();
@@ -953,13 +1017,14 @@ assign_varying_locations(struct gl_context *ctx,
if (!tfeedback_decls[i].is_varying())
continue;
ir_variable *output_var
= tfeedback_decls[i].find_output_var(prog, producer);
if (!tfeedback_decls[i].assign_location(ctx, prog, output_var))
if (!tfeedback_decls[i].assign_location(ctx, prog)) {
hash_table_dtor(tfeedback_candidates);
return false;
}
}
hash_table_dtor(tfeedback_candidates);
if (ctx->Const.DisableVaryingPacking) {
/* Transform feedback code assumes varyings are packed, so if the driver
* has disabled varying packing, make sure it does not support transform


@@ -41,6 +41,49 @@ struct gl_shader;
class ir_variable;
/**
* Data structure describing a varying which is available for use in transform
* feedback.
*
* For example, if the vertex shader contains:
*
* struct S {
* vec4 foo;
* float[3] bar;
* };
*
* varying S[2] v;
*
* Then there would be tfeedback_candidate objects corresponding to the
* following varyings:
*
* v[0].foo
* v[0].bar
* v[1].foo
* v[1].bar
*/
struct tfeedback_candidate
{
/**
* Toplevel variable containing this varying. In the above example, this
* would point to the declaration of the varying v.
*/
ir_variable *toplevel_var;
/**
* Type of this varying. In the above example, this would point to the
* glsl_type for "vec4" or "float[3]".
*/
const glsl_type *type;
/**
* Offset within the toplevel variable where this varying occurs (counted
* in multiples of the size of a float).
*/
unsigned offset;
};
/**
* Data structure tracking information about a transform feedback declaration
* during linking.
@@ -48,17 +91,17 @@ class ir_variable;
class tfeedback_decl
{
public:
bool init(struct gl_context *ctx, struct gl_shader_program *prog,
void init(struct gl_context *ctx, struct gl_shader_program *prog,
const void *mem_ctx, const char *input);
static bool is_same(const tfeedback_decl &x, const tfeedback_decl &y);
bool assign_location(struct gl_context *ctx, struct gl_shader_program *prog,
ir_variable *output_var);
bool assign_location(struct gl_context *ctx,
struct gl_shader_program *prog);
unsigned get_num_outputs() const;
bool store(struct gl_context *ctx, struct gl_shader_program *prog,
struct gl_transform_feedback_info *info, unsigned buffer,
const unsigned max_outputs) const;
ir_variable *find_output_var(gl_shader_program *prog,
gl_shader *producer) const;
const tfeedback_candidate *find_candidate(gl_shader_program *prog,
hash_table *tfeedback_candidates);
bool is_next_buffer_separator() const
{
@@ -158,6 +201,12 @@ private:
* Whether this is gl_NextBuffer from ARB_transform_feedback3.
*/
bool next_buffer_separator;
/**
* If find_candidate() has been called, pointer to the tfeedback_candidate
* data structure that was found. Otherwise NULL.
*/
const tfeedback_candidate *matched_candidate;
};
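The offset bookkeeping described for tfeedback_candidate, and the way assign_location() turns it into a slot and component, can be worked through with a small sketch. It assumes hypothetical Leaf, Candidate and SlotLocation types and a made-up toplevel location; only the arithmetic (running float offset, then fine_location / 4 and fine_location % 4) is taken from the diff.

```cpp
#include <string>
#include <vector>

// Hypothetical leaf description: a name plus its size in float components.
struct Leaf { std::string name; unsigned component_slots; };

// Hypothetical reduced form of tfeedback_candidate: name and float offset.
struct Candidate { std::string name; unsigned offset; };

// Each leaf gets the running total of floats visited before it, in the same
// way the generator's varying_floats counter is used in the diff.
std::vector<Candidate> assign_offsets(const std::vector<Leaf> &leaves)
{
    std::vector<Candidate> out;
    unsigned varying_floats = 0;
    for (const Leaf &leaf : leaves) {
        out.push_back({leaf.name, varying_floats});
        varying_floats += leaf.component_slots;
    }
    return out;
}

struct SlotLocation { unsigned location; unsigned location_frac; };

// fine_location counts individual float components; a slot holds four.
SlotLocation split_fine_location(unsigned toplevel_location,
                                 unsigned toplevel_location_frac,
                                 unsigned candidate_offset,
                                 unsigned array_elem_size,
                                 unsigned array_subscript)
{
    unsigned fine_location = toplevel_location * 4
                           + toplevel_location_frac
                           + candidate_offset;
    fine_location += array_elem_size * array_subscript;
    return {fine_location / 4, fine_location % 4};
}
```

For the struct S example in the comment above, the leaves v[0].foo, v[0].bar, v[1].foo and v[1].bar have sizes 4, 3, 4 and 3 floats, so their offsets are 0, 4, 7 and 11. If v were placed at location 3 (an assumed value), capturing v[1].bar[2] gives a fine location of 12 + 11 + 2 = 25, that is, location 6 and component 1.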


@@ -200,6 +200,65 @@ linker_warning(gl_shader_program *prog, const char *fmt, ...)
}
/**
* Given a string identifying a program resource, break it into a base name
* and an optional array index in square brackets.
*
* If an array index is present, \c out_base_name_end is set to point to the
* "[" that precedes the array index, and the array index itself is returned
* as a long.
*
* If no array index is present (or if the array index is negative or
* mal-formed), \c out_base_name_end is set to point to the null terminator
* at the end of the input string, and -1 is returned.
*
* Only the final array index is parsed; if the string contains other array
* indices (or structure field accesses), they are left in the base name.
*
* No attempt is made to check that the base name is properly formed;
* typically the caller will look up the base name in a hash table, so
* ill-formed base names simply turn into hash table lookup failures.
*/
long
parse_program_resource_name(const GLchar *name,
const GLchar **out_base_name_end)
{
/* Section 7.3.1 ("Program Interfaces") of the OpenGL 4.3 spec says:
*
* "When an integer array element or block instance number is part of
* the name string, it will be specified in decimal form without a "+"
* or "-" sign or any extra leading zeroes. Additionally, the name
* string will not include white space anywhere in the string."
*/
const size_t len = strlen(name);
*out_base_name_end = name + len;
if (len == 0 || name[len-1] != ']')
return -1;
/* Walk backwards over the string looking for a non-digit character. This
* had better be the opening bracket for an array index.
*
* Initially, i specifies the location of the ']'. Since the string may
* contain only the ']' character, walk backwards very carefully.
*/
unsigned i;
for (i = len - 1; (i > 0) && isdigit(name[i-1]); --i)
/* empty */ ;
if ((i == 0) || name[i-1] != '[')
return -1;
long array_index = strtol(&name[i], NULL, 10);
if (array_index < 0)
return -1;
*out_base_name_end = name + (i - 1);
return array_index;
}
void
link_invalidate_variable_locations(gl_shader *sh, int input_base,
int output_base)
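The behaviour of parse_program_resource_name() above is easiest to see on a few inputs. The following is a standalone sketch that copies the logic from the diff, with GLchar replaced by plain char and a hypothetical function name so that it builds without any Mesa headers.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Standalone copy of the parsing logic shown in the diff. Returns the final
// array index, or -1 if the name does not end in "[<digits>]".
long parse_resource_name(const char *name, const char **out_base_name_end)
{
    const std::size_t len = std::strlen(name);
    *out_base_name_end = name + len;

    if (len == 0 || name[len - 1] != ']')
        return -1;

    // Walk backwards from the ']' over any decimal digits.
    std::size_t i;
    for (i = len - 1;
         i > 0 && std::isdigit(static_cast<unsigned char>(name[i - 1]));
         --i)
        /* empty */ ;

    // The character before the digits must be the opening bracket.
    if (i == 0 || name[i - 1] != '[')
        return -1;

    long array_index = std::strtol(&name[i], nullptr, 10);
    if (array_index < 0)
        return -1;

    *out_base_name_end = name + (i - 1);
    return array_index;
}
```

Only the last subscript is split off: for "v[1].bar[2]" the function returns 2 and the base name is "v[1].bar". A name such as "a[-1]" is rejected because the '-' is not a digit, so the character before the digit run is not '['.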


@@ -61,38 +61,39 @@ link_uniform_blocks(void *mem_ctx,
struct gl_uniform_block **blocks_ret);
/**
* Class for processing all of the leaf fields of an uniform
* Class for processing all of the leaf fields of a variable that corresponds
* to a program resource.
*
* Leaves are, roughly speaking, the parts of the uniform that the application
* could query with \c glGetUniformLocation (or that could be returned by
* \c glGetActiveUniforms).
* The leaf fields are all the parts of the variable that the application
* could query using \c glGetProgramResourceIndex (or that could be returned
* by \c glGetProgramResourceName).
*
* Classes may derive from this class to implement specific functionality.
* This class only provides the mechanism to iterate over the leaves. Derived
* classes must implement \c ::visit_field and may override \c ::process.
*/
class uniform_field_visitor {
class program_resource_visitor {
public:
/**
* Begin processing a uniform
* Begin processing a variable
*
* Classes that overload this function should call \c ::process from the
* base class to start the recursive processing of the uniform.
* base class to start the recursive processing of the variable.
*
* \param var The uniform variable that is to be processed
* \param var The variable that is to be processed
*
* Calls \c ::visit_field for each leaf of the uniform.
* Calls \c ::visit_field for each leaf of the variable.
*
* \warning
* This entry should only be used with uniform blocks in cases where the
* row / column ordering of matrices in the block does not matter. For
* example, enumerating the names of members of the block, but not for
* determining the offsets of members.
* When processing a uniform block, this entry should only be used in cases
* where the row / column ordering of matrices in the block does not
* matter. For example, enumerating the names of members of the block, but
* not for determining the offsets of members.
*/
void process(ir_variable *var);
/**
* Begin processing a uniform of a structured type.
* Begin processing a variable of a structured type.
*
* This flavor of \c process should be used to handle structured types
* (i.e., structures, interfaces, or arrays thereof) that need special
@@ -100,7 +101,7 @@ public:
* (instead of the instance name) is used for an interface block.
*
* \param type Type that is to be processed, associated with \c name
* \param name Base name of the structured uniform being processed
* \param name Base name of the structured variable being processed
*
* \note
* \c type must be \c GLSL_TYPE_RECORD, \c GLSL_TYPE_INTERFACE, or an array
@@ -110,7 +111,7 @@ public:
protected:
/**
* Method invoked for each leaf of the uniform
* Method invoked for each leaf of the variable
*
* \param type Type of the field.
* \param name Fully qualified name of the field.


@@ -33,3 +33,7 @@ linker_error(gl_shader_program *prog, const char *fmt, ...)
extern void
linker_warning(gl_shader_program *prog, const char *fmt, ...)
PRINTFLIKE(2, 3);
extern long
parse_program_resource_name(const GLchar *name,
const GLchar **out_base_name_end);


@@ -789,9 +789,11 @@ dri2XcbSwapBuffers(Display *dpy,
swap_buffers_reply =
xcb_dri2_swap_buffers_reply(c, swap_buffers_cookie, NULL);
ret = merge_counter(swap_buffers_reply->swap_hi,
swap_buffers_reply->swap_lo);
free(swap_buffers_reply);
if (swap_buffers_reply) {
ret = merge_counter(swap_buffers_reply->swap_hi,
swap_buffers_reply->swap_lo);
free(swap_buffers_reply);
}
return ret;
}


@@ -23,6 +23,7 @@
#include "main/teximage.h"
#include "main/fbobject.h"
#include "main/renderbuffer.h"
#include "glsl/ralloc.h"
@@ -183,10 +184,19 @@ formats_match(GLbitfield buffer_bit, struct intel_renderbuffer *src_irb,
gl_format src_format = find_miptree(buffer_bit, src_irb)->format;
gl_format dst_format = find_miptree(buffer_bit, dst_irb)->format;
return _mesa_get_srgb_format_linear(src_format) ==
_mesa_get_srgb_format_linear(dst_format);
}
gl_format linear_src_format = _mesa_get_srgb_format_linear(src_format);
gl_format linear_dst_format = _mesa_get_srgb_format_linear(dst_format);
/* Normally, we require the formats to be equal. However, we also support
* blitting from ARGB to XRGB (discarding alpha), and from XRGB to ARGB
* (overriding alpha to 1.0 via blending).
*/
return linear_src_format == linear_dst_format ||
(linear_src_format == MESA_FORMAT_XRGB8888 &&
linear_dst_format == MESA_FORMAT_ARGB8888) ||
(linear_src_format == MESA_FORMAT_ARGB8888 &&
linear_dst_format == MESA_FORMAT_XRGB8888);
}
static bool
try_blorp_blit(struct intel_context *intel,
@@ -295,6 +305,93 @@ try_blorp_blit(struct intel_context *intel,
return true;
}
bool
brw_blorp_copytexsubimage(struct intel_context *intel,
struct gl_renderbuffer *src_rb,
struct gl_texture_image *dst_image,
int srcX0, int srcY0,
int dstX0, int dstY0,
int width, int height)
{
struct gl_context *ctx = &intel->ctx;
struct intel_renderbuffer *src_irb = intel_renderbuffer(src_rb);
struct intel_renderbuffer *dst_irb;
/* BLORP is not supported before Gen6. */
if (intel->gen < 6)
return false;
/* Create a fake/wrapper renderbuffer to allow us to use do_blorp_blit(). */
dst_irb = intel_create_fake_renderbuffer_wrapper(intel, dst_image);
if (!dst_irb)
return false;
struct gl_renderbuffer *dst_rb = &dst_irb->Base.Base;
/* Unlike BlitFramebuffer, CopyTexSubImage doesn't have a buffer bit.
* It's only used by find_miptree() to decide whether to dereference the
* separate stencil miptree. In the case of packed depth/stencil, core
* Mesa hands us the depth attachment as src_rb (not stencil), so assume
* non-stencil for now. A buffer bit of 0 works for both color and depth.
*/
GLbitfield buffer_bit = 0;
if (!formats_match(buffer_bit, src_irb, dst_irb)) {
dst_rb->Delete(ctx, dst_rb);
return false;
}
/* Source clipping shouldn't be necessary, since copytexsubimage (in
* src/mesa/main/teximage.c) calls _mesa_clip_copytexsubimage() which
* takes care of it.
*
* Destination clipping shouldn't be necessary since the restrictions on
* glCopyTexSubImage prevent the user from specifying a destination rectangle
* that falls outside the bounds of the destination texture.
* See error_check_subtexture_dimensions().
*/
int srcY1 = srcY0 + height;
int dstX1 = dstX0 + width;
int dstY1 = dstY0 + height;
/* Sync up the state of window system buffers. We need to do this before
* we go looking for the buffers.
*/
intel_prepare_render(intel);
/* Account for the fact that in the system framebuffer, the origin is at
* the lower left.
*/
bool mirror_y = false;
if (_mesa_is_winsys_fbo(ctx->ReadBuffer)) {
GLint tmp = src_rb->Height - srcY0;
srcY0 = src_rb->Height - srcY1;
srcY1 = tmp;
mirror_y = true;
}
do_blorp_blit(intel, buffer_bit, src_irb, dst_irb,
srcX0, srcY0, dstX0, dstY0, dstX1, dstY1, false, mirror_y);
/* If we're copying a packed depth stencil texture, the above do_blorp_blit
* copied depth (since buffer_bit != GL_STENCIL_BUFFER_BIT). Now copy stencil as
* well. There's no need to do a formats_match() check because the separate
* stencil buffer is always S8.
*/
src_rb = ctx->ReadBuffer->Attachment[BUFFER_STENCIL].Renderbuffer;
if (_mesa_get_format_bits(dst_image->TexFormat, GL_STENCIL_BITS) > 0 &&
src_rb != NULL) {
src_irb = intel_renderbuffer(src_rb);
do_blorp_blit(intel, GL_STENCIL_BUFFER_BIT, src_irb, dst_irb,
srcX0, srcY0, dstX0, dstY0, dstX1, dstY1, false, mirror_y);
}
dst_rb->Delete(ctx, dst_rb);
return true;
}
GLbitfield
brw_blorp_framebuffer(struct intel_context *intel,
GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
@@ -1642,17 +1739,6 @@ brw_blorp_blit_params::brw_blorp_blit_params(struct brw_context *brw,
src.set(brw, src_mt, src_level, src_layer);
dst.set(brw, dst_mt, dst_level, dst_layer);
/* If we are blitting from sRGB to linear or vice versa, we still want the
* blit to be a direct copy, so we need source and destination to use the
* same format. However, we want the destination sRGB/linear state to be
* correct (so that sRGB blending is used when doing an MSAA resolve to an
* sRGB surface, and linear blending is used when doing an MSAA resolve to
* a linear surface). Since blorp blits don't support any format
* conversion (except between sRGB and linear), we can accomplish this by
* simply setting up the source to use the same format as the destination.
*/
assert(_mesa_get_srgb_format_linear(src_mt->format) ==
_mesa_get_srgb_format_linear(dst_mt->format));
src.brw_surfaceformat = dst.brw_surfaceformat;
use_wm_prog = true;

View File

@@ -278,7 +278,23 @@ brwCreateContext(int api,
}
/* WM maximum threads is number of EUs times number of threads per EU. */
if (intel->gen >= 7) {
assert(intel->gen <= 7);
if (intel->is_haswell) {
if (intel->gt == 1) {
brw->max_wm_threads = 102;
brw->max_vs_threads = 70;
brw->urb.size = 128;
brw->urb.max_vs_entries = 640;
brw->urb.max_gs_entries = 256;
} else if (intel->gt == 2) {
brw->max_wm_threads = 204;
brw->max_vs_threads = 280;
brw->urb.size = 256;
brw->urb.max_vs_entries = 1664;
brw->urb.max_gs_entries = 640;
}
} else if (intel->gen == 7) {
if (intel->gt == 1) {
brw->max_wm_threads = 48;
brw->max_vs_threads = 36;
@@ -360,6 +376,7 @@ brwCreateContext(int api,
ctx->Const.NativeIntegers = true;
ctx->Const.UniformBooleanTrue = 1;
ctx->Const.UniformBufferOffsetAlignment = 16;
ctx->Const.ForceGLSLExtensionsWarn = driQueryOptionb(&intel->optionCache, "force_glsl_extensions_warn");

View File

@@ -1217,6 +1217,14 @@ brw_blorp_framebuffer(struct intel_context *intel,
GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
GLbitfield mask, GLenum filter);
bool
brw_blorp_copytexsubimage(struct intel_context *intel,
struct gl_renderbuffer *src_rb,
struct gl_texture_image *dst_image,
int srcX0, int srcY0,
int dstX0, int dstY0,
int width, int height);
/* gen6_multisample_state.c */
void
gen6_emit_3dstate_multisample(struct brw_context *brw,

View File

@@ -327,6 +327,23 @@ fs_inst::is_math()
opcode == SHADER_OPCODE_POW);
}
bool
fs_inst::is_control_flow()
{
switch (opcode) {
case BRW_OPCODE_DO:
case BRW_OPCODE_WHILE:
case BRW_OPCODE_IF:
case BRW_OPCODE_ELSE:
case BRW_OPCODE_ENDIF:
case BRW_OPCODE_BREAK:
case BRW_OPCODE_CONTINUE:
return true;
default:
return false;
}
}
bool
fs_inst::is_send_from_grf()
{
@@ -2070,16 +2087,12 @@ fs_visitor::compute_to_mrf()
break;
}
/* We don't handle flow control here. Most computation of
/* We don't handle control flow here. Most computation of
* values that end up in MRFs are shortly before the MRF
* write anyway.
*/
if (scan_inst->opcode == BRW_OPCODE_DO ||
scan_inst->opcode == BRW_OPCODE_WHILE ||
scan_inst->opcode == BRW_OPCODE_ELSE ||
scan_inst->opcode == BRW_OPCODE_ENDIF) {
if (scan_inst->is_control_flow() && scan_inst->opcode != BRW_OPCODE_IF)
break;
}
/* You can't read from an MRF, so if someone else reads our
* MRF's source GRF that we wanted to rewrite, that stops us.
@@ -2163,16 +2176,8 @@ fs_visitor::remove_duplicate_mrf_writes()
foreach_list_safe(node, &this->instructions) {
fs_inst *inst = (fs_inst *)node;
switch (inst->opcode) {
case BRW_OPCODE_DO:
case BRW_OPCODE_WHILE:
case BRW_OPCODE_IF:
case BRW_OPCODE_ELSE:
case BRW_OPCODE_ENDIF:
if (inst->is_control_flow()) {
memset(last_mrf_move, 0, sizeof(last_mrf_move));
continue;
default:
break;
}
if (inst->opcode == BRW_OPCODE_MOV &&

View File

@@ -178,6 +178,7 @@ public:
bool overwrites_reg(const fs_reg &reg);
bool is_tex();
bool is_math();
bool is_control_flow();
bool is_send_from_grf();
fs_reg dst;

View File

@@ -951,8 +951,8 @@ fs_generator::generate_pack_half_2x16_split(fs_inst *inst,
{
assert(intel->gen >= 7);
assert(dst.type == BRW_REGISTER_TYPE_UD);
assert(x.type = BRW_REGISTER_TYPE_F);
assert(y.type = BRW_REGISTER_TYPE_F);
assert(x.type == BRW_REGISTER_TYPE_F);
assert(y.type == BRW_REGISTER_TYPE_F);
/* From the Ivybridge PRM, Vol4, Part3, Section 6.27 f32to16:
*

View File

@@ -816,15 +816,8 @@ fs_visitor::schedule_instructions(bool post_reg_alloc)
next_block_header = (fs_inst *)next_block_header->next;
sched.add_inst(inst);
if (inst->opcode == BRW_OPCODE_IF ||
inst->opcode == BRW_OPCODE_ELSE ||
inst->opcode == BRW_OPCODE_ENDIF ||
inst->opcode == BRW_OPCODE_DO ||
inst->opcode == BRW_OPCODE_WHILE ||
inst->opcode == BRW_OPCODE_BREAK ||
inst->opcode == BRW_OPCODE_CONTINUE) {
if (inst->is_control_flow())
break;
}
}
sched.calculate_deps();
sched.schedule_instructions(next_block_header);
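
The hunks above replace several hand-written opcode lists with one predicate, which is how the missing BREAK and CONTINUE cases were caught. A minimal standalone sketch of the same idea follows; the `enum opcode` values and `first_block_length()` are hypothetical stand-ins for illustration, not the driver's own definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode set standing in for the BRW_OPCODE_* values. */
enum opcode {
   OP_MOV, OP_ADD,
   OP_DO, OP_WHILE, OP_IF, OP_ELSE, OP_ENDIF, OP_BREAK, OP_CONTINUE
};

/* Single source of truth for "does this opcode change control flow?" */
static bool
is_control_flow(enum opcode op)
{
   switch (op) {
   case OP_DO:
   case OP_WHILE:
   case OP_IF:
   case OP_ELSE:
   case OP_ENDIF:
   case OP_BREAK:
   case OP_CONTINUE:
      return true;
   default:
      return false;
   }
}

/* Number of instructions in the first basic block, counting the
 * control flow instruction that terminates it (if there is one).
 */
static int
first_block_length(const enum opcode *insts, int count)
{
   assert(count >= 0);
   for (int i = 0; i < count; i++) {
      if (is_control_flow(insts[i]))
         return i + 1;
   }
   return count;
}
```

Every caller that needs to stop at a block boundary asks the same predicate, so adding a new control flow opcode later only means touching one switch.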

View File

@@ -196,11 +196,11 @@ haswell_upload_cut_index(struct brw_context *brw)
return;
const unsigned cut_index_setting =
ctx->Array.PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0;
ctx->Array._PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0;
BEGIN_BATCH(2);
OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2));
OUT_BATCH(ctx->Array.RestartIndex);
OUT_BATCH(ctx->Array._RestartIndex);
ADVANCE_BATCH();
}

View File

@@ -197,7 +197,8 @@ uint32_t brw_format_for_mesa_format(gl_format mesa_format);
GLuint translate_tex_target(GLenum target);
GLuint translate_tex_format(gl_format mesa_format,
GLuint translate_tex_format(struct intel_context *intel,
gl_format mesa_format,
GLenum internal_format,
GLenum depth_mode,
GLenum srgb_decode);
@@ -225,7 +226,7 @@ void upload_default_color(struct brw_context *brw,
/* gen6_sf_state.c */
uint32_t
get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
int fs_attr, bool two_side_color);
int fs_attr, bool two_side_color, uint32_t *max_source_attr);
#ifdef __cplusplus
}

View File

@@ -2420,18 +2420,14 @@ vec4_visitor::emit_psiz_and_flags(struct brw_reg reg)
* clipped against all fixed planes.
*/
if (brw->has_negative_rhw_bug) {
#if 0
/* FINISHME */
brw_CMP(p,
vec8(brw_null_reg()),
BRW_CONDITIONAL_L,
brw_swizzle1(output_reg[BRW_VERT_RESULT_NDC], 3),
brw_imm_f(0));
brw_OR(p, brw_writemask(header1, WRITEMASK_W), header1, brw_imm_ud(1<<6));
brw_MOV(p, output_reg[BRW_VERT_RESULT_NDC], brw_imm_f(0));
brw_set_predicate_control(p, BRW_PREDICATE_NONE);
#endif
src_reg ndc_w = src_reg(output_reg[BRW_VERT_RESULT_NDC]);
ndc_w.swizzle = BRW_SWIZZLE_WWWW;
emit(CMP(dst_null_f(), ndc_w, src_reg(0.0f), BRW_CONDITIONAL_L));
vec4_instruction *inst;
inst = emit(OR(header1_w, src_reg(header1_w), src_reg(1u << 6)));
inst->predicate = BRW_PREDICATE_NORMAL;
inst = emit(MOV(output_reg[BRW_VERT_RESULT_NDC], src_reg(0.0f)));
inst->predicate = BRW_PREDICATE_NORMAL;
}
emit(MOV(retype(reg, BRW_REGISTER_TYPE_UD), src_reg(header1)));

View File

@@ -622,7 +622,8 @@ brw_render_target_supported(struct intel_context *intel,
}
GLuint
translate_tex_format(gl_format mesa_format,
translate_tex_format(struct intel_context *intel,
gl_format mesa_format,
GLenum internal_format,
GLenum depth_mode,
GLenum srgb_decode)
@@ -651,6 +652,17 @@ translate_tex_format(gl_format mesa_format,
*/
return BRW_SURFACEFORMAT_R32G32B32A32_FLOAT;
case MESA_FORMAT_SRGB_DXT1:
if (intel->gen == 4 && !intel->is_g4x) {
/* Work around missing SRGB DXT1 support on original gen4 by just
* skipping SRGB decode. It's not worth not supporting sRGB in
* general to prevent this.
*/
WARN_ONCE(true, "Demoting sRGB DXT1 texture to non-sRGB\n");
mesa_format = MESA_FORMAT_RGB_DXT1;
}
return brw_format_for_mesa_format(mesa_format);
default:
assert(brw_format_for_mesa_format(mesa_format) != 0);
return brw_format_for_mesa_format(mesa_format);
@@ -829,6 +841,7 @@ brw_update_texture_surface(struct gl_context *ctx,
uint32_t *binding_table,
unsigned surf_index)
{
struct intel_context *intel = intel_context(ctx);
struct brw_context *brw = brw_context(ctx);
struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
struct intel_texture_object *intelObj = intel_texture_object(tObj);
@@ -851,7 +864,8 @@ brw_update_texture_surface(struct gl_context *ctx,
surf[0] = (translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT |
BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
BRW_SURFACE_CUBEFACE_ENABLES |
(translate_tex_format(mt->format,
(translate_tex_format(intel,
mt->format,
firstImage->InternalFormat,
tObj->DepthMode,
sampler->sRGBDecode) <<

View File

@@ -283,6 +283,25 @@ gen6_blorp_emit_blend_state(struct brw_context *brw,
blend->blend1.write_disable_b = false;
blend->blend1.write_disable_a = false;
/* When blitting from an XRGB source to an ARGB destination, we need to
* interpret the missing channel as 1.0. Blending can do that for us:
* we simply use the RGB values from the fragment shader ("source RGB"),
* but smash the alpha channel to 1.
*/
if (_mesa_get_format_bits(params->dst.mt->format, GL_ALPHA_BITS) > 0 &&
_mesa_get_format_bits(params->src.mt->format, GL_ALPHA_BITS) == 0) {
blend->blend0.blend_enable = 1;
blend->blend0.ia_blend_enable = 1;
blend->blend0.blend_func = BRW_BLENDFUNCTION_ADD;
blend->blend0.ia_blend_func = BRW_BLENDFUNCTION_ADD;
blend->blend0.source_blend_factor = BRW_BLENDFACTOR_SRC_COLOR;
blend->blend0.dest_blend_factor = BRW_BLENDFACTOR_ZERO;
blend->blend0.ia_source_blend_factor = BRW_BLENDFACTOR_ONE;
blend->blend0.ia_dest_blend_factor = BRW_BLENDFACTOR_ZERO;
}
return cc_blend_state_offset;
}

View File

@@ -54,9 +54,8 @@
*/
uint32_t
get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
int fs_attr, bool two_side_color)
int fs_attr, bool two_side_color, uint32_t *max_source_attr)
{
int attr_override, slot;
int vs_attr = _mesa_frag_attrib_to_vert_result(fs_attr);
if (vs_attr < 0 || vs_attr == VERT_RESULT_HPOS) {
/* These attributes will be overwritten by the fragment shader's
@@ -67,7 +66,7 @@ get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
}
/* Find the VUE slot for this attribute. */
slot = vue_map->vert_result_to_slot[vs_attr];
int slot = vue_map->vert_result_to_slot[vs_attr];
/* If there was only a back color written but not front, use back
* as the color instead of undefined
@@ -89,23 +88,29 @@ get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
* Each increment of urb_entry_read_offset represents a 256-bit value, so
* it counts for two 128-bit VUE slots.
*/
attr_override = slot - 2 * urb_entry_read_offset;
assert (attr_override >= 0 && attr_override < 32);
int source_attr = slot - 2 * urb_entry_read_offset;
assert(source_attr >= 0 && source_attr < 32);
/* If we are doing two-sided color, and the VUE slot following this one
* represents a back-facing color, then we need to instruct the SF unit to
* do back-facing swizzling.
*/
if (two_side_color) {
if (vue_map->slot_to_vert_result[slot] == VERT_RESULT_COL0 &&
vue_map->slot_to_vert_result[slot+1] == VERT_RESULT_BFC0)
attr_override |= (ATTRIBUTE_SWIZZLE_INPUTATTR_FACING << ATTRIBUTE_SWIZZLE_SHIFT);
else if (vue_map->slot_to_vert_result[slot] == VERT_RESULT_COL1 &&
vue_map->slot_to_vert_result[slot+1] == VERT_RESULT_BFC1)
attr_override |= (ATTRIBUTE_SWIZZLE_INPUTATTR_FACING << ATTRIBUTE_SWIZZLE_SHIFT);
bool swizzling = two_side_color &&
((vue_map->slot_to_vert_result[slot] == VERT_RESULT_COL0 &&
vue_map->slot_to_vert_result[slot+1] == VERT_RESULT_BFC0) ||
(vue_map->slot_to_vert_result[slot] == VERT_RESULT_COL1 &&
vue_map->slot_to_vert_result[slot+1] == VERT_RESULT_BFC1));
/* Update max_source_attr. If swizzling, the SF will read this slot + 1. */
if (*max_source_attr < source_attr + swizzling)
*max_source_attr = source_attr + swizzling;
if (swizzling) {
return source_attr |
(ATTRIBUTE_SWIZZLE_INPUTATTR_FACING << ATTRIBUTE_SWIZZLE_SHIFT);
}
return attr_override;
return source_attr;
}
static void
@@ -113,7 +118,6 @@ upload_sf_state(struct brw_context *brw)
{
struct intel_context *intel = &brw->intel;
struct gl_context *ctx = &intel->ctx;
uint32_t urb_entry_read_length;
/* BRW_NEW_FRAGMENT_PROGRAM */
uint32_t num_outputs = _mesa_bitcount_64(brw->fragment_program->Base.InputsRead);
/* _NEW_LIGHT */
@@ -130,21 +134,7 @@ upload_sf_state(struct brw_context *brw)
uint16_t attr_overrides[FRAG_ATTRIB_MAX];
uint32_t point_sprite_origin;
/* CACHE_NEW_VS_PROG */
urb_entry_read_length = ((brw->vs.prog_data->vue_map.num_slots + 1) / 2 -
urb_entry_read_offset);
if (urb_entry_read_length == 0) {
/* Setting the URB entry read length to 0 causes undefined behavior, so
* if we have no URB data to read, set it to 1.
*/
urb_entry_read_length = 1;
}
dw1 =
GEN6_SF_SWIZZLE_ENABLE |
num_outputs << GEN6_SF_NUM_OUTPUTS_SHIFT |
urb_entry_read_length << GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT;
dw1 = GEN6_SF_SWIZZLE_ENABLE | num_outputs << GEN6_SF_NUM_OUTPUTS_SHIFT;
dw2 = GEN6_SF_STATISTICS_ENABLE |
GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;
@@ -280,6 +270,7 @@ upload_sf_state(struct brw_context *brw)
/* Create the mapping from the FS inputs we produce to the VS outputs
* they source from.
*/
uint32_t max_source_attr = 0;
for (; attr < FRAG_ATTRIB_MAX; attr++) {
enum glsl_interp_qualifier interp_qualifier =
brw->fragment_program->InterpQualifier[attr];
@@ -315,12 +306,30 @@ upload_sf_state(struct brw_context *brw)
attr_overrides[input_index++] =
get_attr_override(&brw->vs.prog_data->vue_map,
urb_entry_read_offset, attr,
ctx->VertexProgram._TwoSideEnabled);
ctx->VertexProgram._TwoSideEnabled,
&max_source_attr);
}
for (; input_index < FRAG_ATTRIB_MAX; input_index++)
attr_overrides[input_index] = 0;
/* From the Sandy Bridge PRM, Volume 2, Part 1, documentation for
* 3DSTATE_SF DWord 1 bits 15:11, "Vertex URB Entry Read Length":
*
* "This field should be set to the minimum length required to read the
* maximum source attribute. The maximum source attribute is indicated
* by the maximum value of the enabled Attribute # Source Attribute if
* Attribute Swizzle Enable is set, Number of Output Attributes-1 if
* enable is not set.
* read_length = ceiling((max_source_attr + 1) / 2)
*
* [errata] Corruption/Hang possible if length programmed larger than
* recommended"
*/
uint32_t urb_entry_read_length = ALIGN(max_source_attr + 1, 2) / 2;
dw1 |= urb_entry_read_length << GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT;
BEGIN_BATCH(20);
OUT_BATCH(_3DSTATE_SF << 16 | (20 - 2));
OUT_BATCH(dw1);
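
The PRM formula quoted above, read_length = ceiling((max_source_attr + 1) / 2), can be checked in isolation. The sketch below assumes a power-of-two round-up macro equivalent to the driver's ALIGN(); `ALIGN_UP` and `urb_read_length()` are illustrative names, not Mesa functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to a multiple of alignment (alignment must be a power
 * of two). Assumed to behave like the ALIGN() macro used in the hunk.
 */
#define ALIGN_UP(value, alignment) \
   (((value) + (alignment) - 1) & ~((alignment) - 1))

/* Each unit of read length covers two 128-bit VUE slots, so reading up
 * to and including slot max_source_attr needs ceil((max + 1) / 2) units.
 */
static uint32_t
urb_read_length(uint32_t max_source_attr)
{
   assert(max_source_attr < 32);
   return ALIGN_UP(max_source_attr + 1, 2u) / 2;
}
```

Because max_source_attr starts at 0, the result is never 0, which is why the old special case that forced a zero read length up to 1 is no longer needed.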

View File

@@ -196,7 +196,7 @@ gen7_upload_samplers(struct brw_context *brw)
GLbitfield SamplersUsed = vs->SamplersUsed | fs->SamplersUsed;
brw->sampler.count = _mesa_bitcount(SamplersUsed);
brw->sampler.count = _mesa_fls(SamplersUsed);
if (brw->sampler.count == 0)
return;
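
The change from a population count to a find-last-set matters when the used sampler units are not contiguous: the sampler state table is indexed by unit number, so its size has to be the highest used unit plus one. A small sketch with portable stand-ins (`bit_count()` and `find_last_set()` are hypothetical helpers, not the `_mesa_*` functions themselves):

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits in mask. */
static unsigned
bit_count(uint32_t mask)
{
   unsigned n = 0;
   while (mask) {
      mask &= mask - 1;   /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* 1-based position of the highest set bit, or 0 if mask is 0. */
static unsigned
find_last_set(uint32_t mask)
{
   unsigned pos = 0;
   while (mask) {
      mask >>= 1;
      pos++;
   }
   return pos;
}
```

With units 0 and 2 in use (mask 0x5), the bit count is 2 but the table needs 3 entries; sizing by bit count would leave unit 2 without state.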

View File

@@ -34,7 +34,6 @@ upload_sbe_state(struct brw_context *brw)
{
struct intel_context *intel = &brw->intel;
struct gl_context *ctx = &intel->ctx;
uint32_t urb_entry_read_length;
/* BRW_NEW_FRAGMENT_PROGRAM */
uint32_t num_outputs = _mesa_bitcount_64(brw->fragment_program->Base.InputsRead);
/* _NEW_LIGHT */
@@ -48,22 +47,8 @@ upload_sbe_state(struct brw_context *brw)
bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
uint32_t point_sprite_origin;
/* CACHE_NEW_VS_PROG */
urb_entry_read_length = ((brw->vs.prog_data->vue_map.num_slots + 1) / 2 -
urb_entry_read_offset);
if (urb_entry_read_length == 0) {
/* Setting the URB entry read length to 0 causes undefined behavior, so
* if we have no URB data to read, set it to 1.
*/
urb_entry_read_length = 1;
}
/* FINISHME: Attribute Swizzle Control Mode? */
dw1 =
GEN7_SBE_SWIZZLE_ENABLE |
num_outputs << GEN7_SBE_NUM_OUTPUTS_SHIFT |
urb_entry_read_length << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT;
dw1 = GEN7_SBE_SWIZZLE_ENABLE | num_outputs << GEN7_SBE_NUM_OUTPUTS_SHIFT;
/* _NEW_POINT
*
@@ -84,6 +69,7 @@ upload_sbe_state(struct brw_context *brw)
/* Create the mapping from the FS inputs we produce to the VS outputs
* they source from.
*/
uint32_t max_source_attr = 0;
for (; attr < FRAG_ATTRIB_MAX; attr++) {
enum glsl_interp_qualifier interp_qualifier =
brw->fragment_program->InterpQualifier[attr];
@@ -118,9 +104,25 @@ upload_sbe_state(struct brw_context *brw)
attr_overrides[input_index++] =
get_attr_override(&brw->vs.prog_data->vue_map,
urb_entry_read_offset, attr,
ctx->VertexProgram._TwoSideEnabled);
ctx->VertexProgram._TwoSideEnabled,
&max_source_attr);
}
/* From the Ivy Bridge PRM, Volume 2, Part 1, documentation for
* 3DSTATE_SBE DWord 1 bits 15:11, "Vertex URB Entry Read Length":
*
* "This field should be set to the minimum length required to read the
* maximum source attribute. The maximum source attribute is indicated
* by the maximum value of the enabled Attribute # Source Attribute if
* Attribute Swizzle Enable is set, Number of Output Attributes-1 if
* enable is not set.
*
* read_length = ceiling((max_source_attr + 1) / 2)"
*/
uint32_t urb_entry_read_length = ALIGN(max_source_attr + 1, 2) / 2;
dw1 |= urb_entry_read_length << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT;
for (; input_index < FRAG_ATTRIB_MAX; input_index++)
attr_overrides[input_index] = 0;

View File

@@ -308,7 +308,8 @@ gen7_update_texture_surface(struct gl_context *ctx,
8 * 4, 32, &binding_table[surf_index]);
memset(surf, 0, 8 * 4);
uint32_t tex_format = translate_tex_format(mt->format,
uint32_t tex_format = translate_tex_format(intel,
mt->format,
firstImage->InternalFormat,
tObj->DepthMode,
sampler->sRGBDecode);

View File

@@ -86,8 +86,6 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.TDFX_texture_compression_FXT1 = true;
ctx->Extensions.OES_EGL_image = true;
ctx->Extensions.OES_draw_texture = true;
ctx->Extensions.OES_compressed_ETC1_RGB8_texture = true;
ctx->Extensions.ARB_texture_rgb10_a2ui = true;
if (intel->gen >= 6)
ctx->Const.GLSLVersion = 140;
@@ -144,6 +142,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.EXT_packed_float = true;
ctx->Extensions.ARB_texture_compression_rgtc = true;
ctx->Extensions.ARB_texture_rg = true;
ctx->Extensions.ARB_texture_rgb10_a2ui = true;
ctx->Extensions.ARB_vertex_type_2_10_10_10_rev = true;
ctx->Extensions.EXT_draw_buffers2 = true;
ctx->Extensions.EXT_framebuffer_sRGB = true;
@@ -157,6 +156,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.ATI_envmap_bumpmap = true;
ctx->Extensions.MESA_texture_array = true;
ctx->Extensions.NV_conditional_render = true;
ctx->Extensions.OES_compressed_ETC1_RGB8_texture = true;
ctx->Extensions.OES_standard_derivatives = true;
}

View File

@@ -531,6 +531,36 @@ intel_renderbuffer_update_wrapper(struct intel_context *intel,
return true;
}
/**
* Create a fake intel_renderbuffer that wraps a gl_texture_image.
*/
struct intel_renderbuffer *
intel_create_fake_renderbuffer_wrapper(struct intel_context *intel,
struct gl_texture_image *image)
{
struct gl_context *ctx = &intel->ctx;
struct intel_renderbuffer *irb;
struct gl_renderbuffer *rb;
irb = CALLOC_STRUCT(intel_renderbuffer);
if (!irb) {
_mesa_error(ctx, GL_OUT_OF_MEMORY, "creating renderbuffer");
return NULL;
}
rb = &irb->Base.Base;
_mesa_init_renderbuffer(rb, 0);
rb->ClassID = INTEL_RB_CLASS;
if (!intel_renderbuffer_update_wrapper(intel, irb, image, image->Face)) {
intel_delete_renderbuffer(ctx, rb);
return NULL;
}
return irb;
}
void
intel_renderbuffer_set_draw_offset(struct intel_renderbuffer *irb)
{

View File

@@ -140,6 +140,10 @@ intel_create_wrapped_renderbuffer(struct gl_context * ctx,
int width, int height,
gl_format format);
struct intel_renderbuffer *
intel_create_fake_renderbuffer_wrapper(struct intel_context *intel,
struct gl_texture_image *image);
extern void
intel_fbo_init(struct intel_context *intel);

View File

@@ -41,6 +41,9 @@
#include "intel_fbo.h"
#include "intel_tex.h"
#include "intel_blit.h"
#ifndef I915
#include "brw_context.h"
#endif
#define FILE_DEBUG_FLAG DEBUG_TEXTURE
@@ -177,15 +180,28 @@ intelCopyTexSubImage(struct gl_context *ctx, GLuint dims,
GLint x, GLint y,
GLsizei width, GLsizei height)
{
if (dims == 3 || !intel_copy_texsubimage(intel_context(ctx),
intel_texture_image(texImage),
xoffset, yoffset,
intel_renderbuffer(rb), x, y, width, height)) {
fallback_debug("%s - fallback to swrast\n", __FUNCTION__);
_mesa_meta_CopyTexSubImage(ctx, dims, texImage,
xoffset, yoffset, zoffset,
rb, x, y, width, height);
struct intel_context *intel = intel_context(ctx);
if (dims != 3) {
#ifndef I915
/* Try BLORP first. It can handle almost everything. */
if (brw_blorp_copytexsubimage(intel, rb, texImage, x, y,
xoffset, yoffset, width, height))
return;
#endif
/* Next, try the BLT engine. */
if (intel_copy_texsubimage(intel_context(ctx),
intel_texture_image(texImage),
xoffset, yoffset,
intel_renderbuffer(rb), x, y, width, height))
return;
}
/* Finally, fall back to meta. This will likely be slow. */
fallback_debug("%s - fallback to swrast\n", __FUNCTION__);
_mesa_meta_CopyTexSubImage(ctx, dims, texImage,
xoffset, yoffset, zoffset,
rb, x, y, width, height);
}

View File

@@ -42,6 +42,7 @@
#include "main/framebuffer.h"
#include "main/imports.h"
#include "main/macros.h"
#include "main/mipmap.h"
#include "main/mtypes.h"
#include "main/renderbuffer.h"
#include "main/version.h"
@@ -783,6 +784,8 @@ OSMesaCreateContextExt( GLenum format, GLint depthBits, GLint stencilBits,
ctx->Driver.MapRenderbuffer = osmesa_MapRenderbuffer;
ctx->Driver.UnmapRenderbuffer = osmesa_UnmapRenderbuffer;
ctx->Driver.GenerateMipmap = _mesa_generate_mipmap;
/* Extend the software rasterizer with our optimized line and triangle
* drawing functions.
*/

View File

@@ -34,6 +34,7 @@
#include "main/colormac.h"
#include "main/fbobject.h"
#include "main/macros.h"
#include "main/mipmap.h"
#include "main/image.h"
#include "main/imports.h"
#include "main/mtypes.h"
@@ -869,6 +870,8 @@ xmesa_init_driver_functions( XMesaVisual xmvisual,
driver->MapRenderbuffer = xmesa_MapRenderbuffer;
driver->UnmapRenderbuffer = xmesa_UnmapRenderbuffer;
driver->GenerateMipmap = _mesa_generate_mipmap;
#if ENABLE_EXT_timer_query
driver->NewQueryObject = xmesa_new_query_object;
driver->BeginQuery = xmesa_begin_query;

View File

@@ -2152,13 +2152,6 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
(int) size);
return;
}
if (offset + size > bufObj->Size) {
_mesa_error(ctx, GL_INVALID_VALUE,
"glBindBufferRange(offset + size %d > buffer size %d)",
(int) (offset + size), (int) (bufObj->Size));
return;
}
}
switch (target) {

View File

@@ -297,7 +297,7 @@ static const struct extension extension_table[] = {
{ "GL_ATI_texture_float", o(ARB_texture_float), GL, 2002 },
{ "GL_ATI_texture_mirror_once", o(ATI_texture_mirror_once), GL, 2006 },
{ "GL_IBM_multimode_draw_arrays", o(dummy_true), GL, 1998 },
{ "GL_IBM_rasterpos_clip", o(dummy_true), GL, 1996 },
{ "GL_IBM_rasterpos_clip", o(dummy_true), GLL, 1996 },
{ "GL_IBM_texture_mirrored_repeat", o(dummy_true), GLL, 1998 },
{ "GL_INGR_blend_func_separate", o(EXT_blend_func_separate), GLL, 1999 },
{ "GL_MESA_pack_invert", o(MESA_pack_invert), GL, 2002 },
@@ -418,7 +418,6 @@ _mesa_enable_sw_extensions(struct gl_context *ctx)
ctx->Extensions.EXT_fog_coord = GL_TRUE;
ctx->Extensions.EXT_framebuffer_object = GL_TRUE;
ctx->Extensions.EXT_framebuffer_blit = GL_TRUE;
ctx->Extensions.EXT_framebuffer_multisample = GL_TRUE;
ctx->Extensions.EXT_packed_depth_stencil = GL_TRUE;
ctx->Extensions.EXT_pixel_buffer_object = GL_TRUE;
ctx->Extensions.EXT_point_parameters = GL_TRUE;

View File

@@ -225,9 +225,6 @@ descriptor=[
# GL_OES_point_sprite
[ "POINT_SPRITE_NV", "CONTEXT_BOOL(Point.PointSprite), extra_NV_point_sprite_ARB_point_sprite" ],
# GL_ARB_vertex_shader
[ "MAX_VARYING_FLOATS_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_ARB_vertex_shader" ],
]},
@@ -362,6 +359,7 @@ descriptor=[
# GL_ARB_vertex_shader
[ "MAX_VERTEX_UNIFORM_COMPONENTS_ARB", "CONTEXT_INT(Const.VertexProgram.MaxUniformComponents), extra_ARB_vertex_shader" ],
[ "MAX_VARYING_FLOATS_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_ARB_vertex_shader" ],
# GL_EXT_framebuffer_blit
# NOTE: GL_DRAW_FRAMEBUFFER_BINDING_EXT == GL_FRAMEBUFFER_BINDING_EXT

View File

@@ -1485,18 +1485,8 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx,
else if (ctx->Extensions.ARB_depth_buffer_float &&
type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV)
return GL_NO_ERROR;
switch (type) {
case GL_BYTE:
case GL_UNSIGNED_BYTE:
case GL_SHORT:
case GL_UNSIGNED_SHORT:
case GL_INT:
case GL_UNSIGNED_INT:
case GL_FLOAT:
return GL_INVALID_OPERATION;
default:
else
return GL_INVALID_ENUM;
}
case GL_DUDV_ATI:
case GL_DU8DV8_ATI:

View File

@@ -6027,6 +6027,20 @@ _mesa_rebase_rgba_float(GLuint n, GLfloat rgba[][4], GLenum baseFormat)
rgba[i][ACOMP] = 1.0F;
}
break;
case GL_RG:
for (i = 0; i < n; i++) {
rgba[i][BCOMP] = 0.0F;
rgba[i][ACOMP] = 1.0F;
}
break;
case GL_RED:
for (i = 0; i < n; i++) {
rgba[i][GCOMP] = 0.0F;
rgba[i][BCOMP] = 0.0F;
rgba[i][ACOMP] = 1.0F;
}
break;
default:
/* no-op */
;
@@ -6070,6 +6084,18 @@ _mesa_rebase_rgba_uint(GLuint n, GLuint rgba[][4], GLenum baseFormat)
rgba[i][ACOMP] = 1;
}
break;
case GL_RG:
for (i = 0; i < n; i++) {
rgba[i][BCOMP] = 0;
rgba[i][ACOMP] = 1;
}
break;
case GL_RED:
for (i = 0; i < n; i++) {
rgba[i][GCOMP] = 0;
rgba[i][BCOMP] = 0;
rgba[i][ACOMP] = 1;
}
break;
default:
/* no-op */
;
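
The new GL_RG and GL_RED cases force the channels that the base format does not carry to their constant values (0 for color, 1 for alpha). A reduced sketch of that rebasing step is shown below; the `enum base_format` values and `rebase_rgba()` are hypothetical and only mirror the float variant for these two formats.

```c
#include <assert.h>

enum { RCOMP = 0, GCOMP = 1, BCOMP = 2, ACOMP = 3 };

/* Hypothetical subset of base formats, standing in for the GL enums. */
enum base_format { FMT_RGBA, FMT_RG, FMT_RED };

/* Overwrite channels that are absent from base_format with the values
 * GL defines for missing components.
 */
static void
rebase_rgba(unsigned n, float rgba[][4], enum base_format base_format)
{
   for (unsigned i = 0; i < n; i++) {
      switch (base_format) {
      case FMT_RED:
         rgba[i][GCOMP] = 0.0f;
         rgba[i][BCOMP] = 0.0f;
         rgba[i][ACOMP] = 1.0f;
         break;
      case FMT_RG:
         rgba[i][BCOMP] = 0.0f;
         rgba[i][ACOMP] = 1.0f;
         break;
      default:
         /* all four channels are meaningful */
         break;
      }
   }
}
```

This is the step that keeps stale data out of a readback when, for example, a red-only texture is stored in an RGBA hardware format.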

View File

@@ -29,6 +29,10 @@
#include "glheader.h"
#include "mtypes.h"
#ifdef __cplusplus
extern "C" {
#endif
struct gl_context;
struct gl_framebuffer;
struct gl_renderbuffer;
@@ -62,6 +66,8 @@ _mesa_reference_renderbuffer(struct gl_renderbuffer **ptr,
_mesa_reference_renderbuffer_(ptr, rb);
}
#ifdef __cplusplus
}
#endif
#endif /* RENDERBUFFER_H */

View File

@@ -207,6 +207,8 @@ attach_shader(struct gl_context *ctx, GLuint program, GLuint shader)
struct gl_shader *sh;
GLuint i, n;
const bool same_type_disallowed = _mesa_is_gles(ctx);
shProg = _mesa_lookup_shader_program_err(ctx, program, "glAttachShader");
if (!shProg)
return;
@@ -227,6 +229,18 @@ attach_shader(struct gl_context *ctx, GLuint program, GLuint shader)
*/
_mesa_error(ctx, GL_INVALID_OPERATION, "glAttachShader");
return;
} else if (same_type_disallowed &&
shProg->Shaders[i]->Type == sh->Type) {
/* Shader with the same type is already attached to this program,
* OpenGL ES 2.0 and 3.0 specs say:
*
* "Multiple shader objects of the same type may not be attached
* to a single program object. [...] The error INVALID_OPERATION
* is generated if [...] another shader object of the same type
* as shader is already attached to program."
*/
_mesa_error(ctx, GL_INVALID_OPERATION, "glAttachShader");
return;
}
}
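
The added branch rejects a second shader of the same type only when the context is an ES one, and the decision is made at run time. A simplified sketch of that rule follows; `struct program`, `MAX_SHADERS` and `attach_shader_type()` are hypothetical, and the real function reports GL_INVALID_OPERATION rather than returning a flag.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SHADERS 8

/* Hypothetical, reduced program object: just the attached shader types. */
struct program {
   int types[MAX_SHADERS];
   int count;
};

/* Attach a shader of the given type. Returns false where the real code
 * would raise an error: the list is full, or same_type_disallowed is set
 * (the ES case) and a shader of this type is already attached.
 */
static bool
attach_shader_type(struct program *prog, int type, bool same_type_disallowed)
{
   assert(prog != 0);
   if (prog->count >= MAX_SHADERS)
      return false;

   if (same_type_disallowed) {
      for (int i = 0; i < prog->count; i++) {
         if (prog->types[i] == type)
            return false;
      }
   }

   prog->types[prog->count++] = type;
   return true;
}
```

Passing the restriction as a flag keeps the desktop GL behavior, where several shaders of one type may be attached, on the same code path.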

View File

@@ -310,6 +310,41 @@ get_tex_rgba_compressed(struct gl_context *ctx, GLuint dimensions,
}
/**
* Return a base GL format given the user-requested format
* for glGetTexImage().
*/
static GLenum
_mesa_base_pack_format(GLenum format)
{
switch (format) {
case GL_ABGR_EXT:
case GL_BGRA:
case GL_BGRA_INTEGER:
case GL_RGBA_INTEGER:
return GL_RGBA;
case GL_BGR:
case GL_BGR_INTEGER:
case GL_RGB_INTEGER:
return GL_RGB;
case GL_RED_INTEGER:
return GL_RED;
case GL_GREEN_INTEGER:
return GL_GREEN;
case GL_BLUE_INTEGER:
return GL_BLUE;
case GL_ALPHA_INTEGER:
return GL_ALPHA;
case GL_LUMINANCE_INTEGER_EXT:
return GL_LUMINANCE;
case GL_LUMINANCE_ALPHA_INTEGER_EXT:
return GL_LUMINANCE_ALPHA;
default:
return format;
}
}
/**
* Get an uncompressed color texture image.
*/
@@ -323,7 +358,7 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint dimensions,
const gl_format texFormat =
_mesa_get_srgb_format_linear(texImage->TexFormat);
const GLuint width = texImage->Width;
const GLenum destBaseFormat = _mesa_base_tex_format(ctx, format);
GLenum destBaseFormat = _mesa_base_pack_format(format);
GLenum rebaseFormat = GL_NONE;
GLuint height = texImage->Height;
GLuint depth = texImage->Depth;
@@ -332,6 +367,7 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint dimensions,
GLuint (*rgba_uint)[4];
GLboolean tex_is_integer = _mesa_is_format_integer_color(texImage->TexFormat);
GLboolean tex_is_uint = _mesa_is_format_unsigned(texImage->TexFormat);
GLenum texBaseFormat = _mesa_get_format_base_format(texImage->TexFormat);
/* Allocate buffer for one row of texels */
rgba = malloc(4 * width * sizeof(GLfloat));
@@ -368,6 +404,50 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint dimensions,
*/
rebaseFormat = GL_LUMINANCE_ALPHA; /* this covers GL_LUMINANCE too */
}
else if (texImage->_BaseFormat != texBaseFormat) {
/* The internal format and the real format differ, so we can't rely
* on the unpack functions setting the correct constant values.
* (e.g. reading back GL_RGB8 which is actually RGBA won't set alpha=1)
*/
switch (texImage->_BaseFormat) {
case GL_RED:
if ((texBaseFormat == GL_RGBA ||
texBaseFormat == GL_RGB ||
texBaseFormat == GL_RG) &&
(destBaseFormat == GL_RGBA ||
destBaseFormat == GL_RGB ||
destBaseFormat == GL_RG ||
destBaseFormat == GL_GREEN)) {
rebaseFormat = texImage->_BaseFormat;
break;
}
/* fall through */
case GL_RG:
if ((texBaseFormat == GL_RGBA ||
texBaseFormat == GL_RGB) &&
(destBaseFormat == GL_RGBA ||
destBaseFormat == GL_RGB ||
destBaseFormat == GL_BLUE)) {
rebaseFormat = texImage->_BaseFormat;
break;
}
/* fall through */
case GL_RGB:
if (texBaseFormat == GL_RGBA &&
(destBaseFormat == GL_RGBA ||
destBaseFormat == GL_ALPHA ||
destBaseFormat == GL_LUMINANCE_ALPHA)) {
rebaseFormat = texImage->_BaseFormat;
}
break;
case GL_ALPHA:
if (destBaseFormat != GL_ALPHA) {
rebaseFormat = texImage->_BaseFormat;
}
break;
}
}
for (img = 0; img < depth; img++) {
GLubyte *srcMap;
@@ -467,16 +547,18 @@ get_tex_memcpy(struct gl_context *ctx, GLenum format, GLenum type,
{
const GLenum target = texImage->TexObject->Target;
GLboolean memCopy = GL_FALSE;
GLenum texBaseFormat = _mesa_get_format_base_format(texImage->TexFormat);
/*
* Check if we can use memcpy to copy from the hardware texture
* format to the user's format/type.
* Note that GL's pixel transfer ops don't apply to glGetTexImage()
*/
if (target == GL_TEXTURE_1D ||
target == GL_TEXTURE_2D ||
target == GL_TEXTURE_RECTANGLE ||
_mesa_is_cube_face(target)) {
if ((target == GL_TEXTURE_1D ||
target == GL_TEXTURE_2D ||
target == GL_TEXTURE_RECTANGLE ||
_mesa_is_cube_face(target)) &&
texBaseFormat == texImage->_BaseFormat) {
memCopy = _mesa_format_matches_format_and_type(texImage->TexFormat,
format, type,
ctx->Pack.SwapBytes);

View File

@@ -2454,9 +2454,7 @@ copytexture_error_check( struct gl_context *ctx, GLuint dimensions,
}
}
if ((_mesa_is_desktop_gl(ctx) &&
ctx->Extensions.ARB_framebuffer_object) ||
_mesa_is_gles3(ctx)) {
if (_mesa_is_gles3(ctx)) {
bool rb_is_srgb = false;
bool dst_is_srgb = false;
@@ -2470,22 +2468,16 @@ copytexture_error_check( struct gl_context *ctx, GLuint dimensions,
}
if (rb_is_srgb != dst_is_srgb) {
/* Page 190 (page 211 of the PDF) in section 8.6 of the OpenGL 4.3
* Core Profile spec says:
/* Page 137 (page 149 of the PDF) in section 3.8.5 of the
* OpenGLES 3.0.0 spec says:
*
* "An INVALID_OPERATION error is generated under any of the
* following conditions:
*
* ...
*
* - if the value of FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING
* for the framebuffer attachment corresponding to the read
* buffer is LINEAR (see section 9.2.3) and internalformat
* is one of the sRGB formats in table 8.23
* - if the value of FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING
* for the framebuffer attachment corresponding to the read
* buffer is SRGB and internalformat is not one of the sRGB
* formats. in table 8.23."
* "The error INVALID_OPERATION is also generated if the
* value of FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING for the
* framebuffer attachment corresponding to the read buffer
* is LINEAR (see section 6.1.13) and internalformat is
* one of the sRGB formats described in section 3.8.16, or
* if the value of FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING is
* SRGB and internalformat is not one of the sRGB formats."
*/
_mesa_error(ctx, GL_INVALID_OPERATION,
"glCopyTexImage%dD(srgb usage mismatch)", dimensions);
@@ -3004,8 +2996,18 @@ teximage(struct gl_context *ctx, GLboolean compressed, GLuint dims,
texObj = _mesa_get_current_tex_object(ctx, target);
assert(texObj);
texFormat = _mesa_choose_texture_format(ctx, texObj, target, level,
internalFormat, format, type);
if (compressed) {
/* For glCompressedTexImage() the driver has no choice about the
* texture format since we'll never transcode the user's compressed
* image data. The internalFormat was error checked earlier.
*/
texFormat = _mesa_glenum_to_compressed_format(internalFormat);
}
else {
texFormat = _mesa_choose_texture_format(ctx, texObj, target, level,
internalFormat, format, type);
}
assert(texFormat != MESA_FORMAT_NONE);
/* check that width, height, depth are legal for the mipmap level */

View File

@@ -929,6 +929,7 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg,
_mesa_propagate_uniforms_to_driver_storage(uni, offset, count);
}
/**
* Called via glGetUniformLocation().
*
@@ -944,73 +945,35 @@ _mesa_get_uniform_location(struct gl_context *ctx,
const GLchar *name,
unsigned *out_offset)
{
const size_t len = strlen(name);
long offset;
bool array_lookup;
/* Page 80 (page 94 of the PDF) of the OpenGL 2.1 spec says:
*
* "The first element of a uniform array is identified using the
* name of the uniform array appended with "[0]". Except if the last
* part of the string name indicates a uniform array, then the
* location of the first element of that array can be retrieved by
* either using the name of the uniform array, or the name of the
* uniform array appended with "[0]"."
*
* Note: since uniform names are not allowed to use whitespace, and array
* indices within uniform names are not allowed to use "+", "-", or leading
* zeros, it follows that each uniform has a unique name up to the possible
* ambiguity with "[0]" noted above. Therefore we don't need to worry
* about malformed inputs--they will properly fail when we try to look up
* the uniform name in shProg->UniformHash.
*/
const GLchar *base_name_end;
long offset = parse_program_resource_name(name, &base_name_end);
bool array_lookup = offset >= 0;
char *name_copy;
/* If the name ends with a ']', assume that it refers to some element of an
* array. Malformed array references will fail the hash table look up
* below, so it doesn't matter that they are not caught here. This code
* only wants to catch the "leaf" array references so that arrays of
* structures containing arrays will be handled correctly.
*/
if (name[len-1] == ']') {
unsigned i;
/* Walk backwards over the string looking for a non-digit character.
* This had better be the opening bracket for an array index.
*
* Initially, i specifies the location of the ']'. Since the string may
* contain only the ']' character, walk backwards very carefully.
*/
for (i = len - 1; (i > 0) && isdigit(name[i-1]); --i)
/* empty */ ;
/* Page 80 (page 94 of the PDF) of the OpenGL 2.1 spec says:
*
* "The first element of a uniform array is identified using the
* name of the uniform array appended with "[0]". Except if the last
* part of the string name indicates a uniform array, then the
* location of the first element of that array can be retrieved by
* either using the name of the uniform array, or the name of the
* uniform array appended with "[0]"."
*
* Page 79 (page 93 of the PDF) of the OpenGL 2.1 spec says:
*
* "name must be a null terminated string, without white space."
*
* Return an error if there is no opening '[' to match the closing ']'.
* An error will also be returned if there is intervening white space
* (or other non-digit characters) before the opening '['.
*/
if ((i == 0) || name[i-1] != '[')
return GL_INVALID_INDEX;
/* Return an error if there are no digits between the opening '[' and
* the closing ']'.
*/
if (i == (len - 1))
return GL_INVALID_INDEX;
/* Make a new string that is a copy of the old string up to (but not
* including) the '[' character.
*/
name_copy = (char *) malloc(i);
memcpy(name_copy, name, i - 1);
name_copy[i-1] = '\0';
offset = strtol(&name[i], NULL, 10);
if (offset < 0) {
free(name_copy);
return GL_INVALID_INDEX;
}
array_lookup = true;
if (array_lookup) {
name_copy = (char *) malloc(base_name_end - name + 1);
memcpy(name_copy, name, base_name_end - name);
name_copy[base_name_end - name] = '\0';
} else {
name_copy = (char *) name;
offset = 0;
array_lookup = false;
}
unsigned location = 0;
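
The hunk above replaces the hand-rolled bracket parsing with a call to parse_program_resource_name(), whose body is not part of this diff. As a rough illustration only, the sketch below shows one way such a helper could behave, inferred from the comments in the hunk (trailing "[N]" only, no signs, no leading zeros). The name sketch_parse_resource_name and every detail of its behavior are assumptions, not Mesa's actual code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for parse_program_resource_name(); NOT Mesa's code.
 *
 * For "base[N]" it returns N and points *out_base_name_end at the '['.
 * For anything else it returns -1 and points *out_base_name_end at the
 * terminating NUL, so the caller can look up the whole string as-is.
 */
static long
sketch_parse_resource_name(const char *name, const char **out_base_name_end)
{
   size_t len, i;
   long index;

   assert(name != NULL && out_base_name_end != NULL);

   len = strlen(name);
   *out_base_name_end = name + len;

   /* Only names ending in ']' can be array element references. */
   if (len == 0 || name[len - 1] != ']')
      return -1;

   /* Walk backwards over the digits that precede the ']'. */
   for (i = len - 1; i > 0 && isdigit((unsigned char) name[i - 1]); --i)
      /* empty */ ;

   /* Require a '[' immediately before the digits and at least one digit. */
   if (i == 0 || name[i - 1] != '[' || i == len - 1)
      return -1;

   /* Reject leading zeros such as "[01]"; a lone "[0]" is fine. */
   if (name[i] == '0' && i + 1 != len - 1)
      return -1;

   index = strtol(&name[i], NULL, 10);
   if (index < 0)
      return -1;

   *out_base_name_end = name + i - 1;
   return index;
}
```

With a helper shaped like this, the caller only needs to copy the bytes between name and *out_base_name_end when the return value is non-negative, which matches the malloc/memcpy branch in the new code above.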

View File

@@ -167,6 +167,10 @@ _mesa_GetActiveUniformsiv(GLuint program,
void GLAPIENTRY
_mesa_GetUniformiv(GLhandleARB, GLint, GLint *);
long
_mesa_parse_program_resource_name(const GLchar *name,
const GLchar **out_base_name_end);
unsigned
_mesa_get_uniform_location(struct gl_context *ctx, struct gl_shader_program *shProg,
const GLchar *name, unsigned *offset);

View File

@@ -35,7 +35,7 @@ struct gl_context;
#define MESA_MAJOR 9
#define MESA_MINOR 1
#define MESA_PATCH 0
#define MESA_VERSION_STRING "9.1-devel"
#define MESA_VERSION_STRING "9.1-rc2"
/* To make version comparison easy */
#define MESA_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))
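
For readers unfamiliar with the packing used by MESA_VERSION, the short sketch below repeats the macro from the hunk above and shows how packed values compare. The helper sketch_version_at_least is a made-up name for illustration and does not appear in Mesa.

```c
/* Same definition as in the hunk above, repeated so the sketch stands alone. */
#define MESA_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical helper: is version (major, minor, patch) at least
 * (a, b, c)?  Valid while minor and patch each fit in 8 bits, since each
 * component occupies its own byte of the packed value.
 */
static int
sketch_version_at_least(int major, int minor, int patch,
                        int a, int b, int c)
{
   return MESA_VERSION(major, minor, patch) >= MESA_VERSION(a, b, c);
}
```

Because the major number lands in the highest bits, an ordinary integer comparison orders versions the same way a component-by-component comparison would.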

View File

@@ -2375,7 +2375,7 @@ print_program(struct prog_instruction *mesa_instructions,
}
}
class add_uniform_to_shader : public uniform_field_visitor {
class add_uniform_to_shader : public program_resource_visitor {
public:
add_uniform_to_shader(struct gl_shader_program *shader_program,
struct gl_program_parameter_list *params)
@@ -2387,7 +2387,7 @@ public:
void process(ir_variable *var)
{
this->idx = -1;
this->uniform_field_visitor::process(var);
this->program_resource_visitor::process(var);
var->location = this->idx;
}

View File

@@ -1499,14 +1499,16 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
if (type == GL_DEPTH) {
texFormat = st_choose_format(screen, GL_DEPTH_COMPONENT,
GL_NONE, GL_NONE, st->internal_target,
sample_count, PIPE_BIND_DEPTH_STENCIL);
sample_count, PIPE_BIND_DEPTH_STENCIL,
FALSE);
assert(texFormat != PIPE_FORMAT_NONE);
}
else {
/* default color format */
texFormat = st_choose_format(screen, GL_RGBA,
GL_NONE, GL_NONE, st->internal_target,
sample_count, PIPE_BIND_SAMPLER_VIEW);
sample_count, PIPE_BIND_SAMPLER_VIEW,
FALSE);
assert(texFormat != PIPE_FORMAT_NONE);
}
}

View File

@@ -597,7 +597,7 @@ decompress_with_blit(struct gl_context * ctx,
/* Find the best match for the format+type combo. */
pipe_format = st_choose_format(pipe->screen, GL_RGBA8, format, type,
pipe_target, 0, bind);
pipe_target, 0, bind, FALSE);
if (pipe_format == PIPE_FORMAT_NONE) {
/* unable to get an rgba format!?! */
_mesa_problem(ctx, "%s: cannot find a supported format", __func__);

View File

@@ -421,6 +421,10 @@ void st_init_extensions(struct st_context *st)
{ { o(EXT_texture_integer) },
{ PIPE_FORMAT_R32G32B32A32_UINT,
PIPE_FORMAT_R32G32B32A32_SINT } },
{ { o(ARB_texture_rg) },
{ PIPE_FORMAT_R8_UNORM,
PIPE_FORMAT_R8G8_UNORM } },
};
/* Required: depth stencil and sampler support */
@@ -444,9 +448,6 @@ void st_init_extensions(struct st_context *st)
PIPE_FORMAT_RGTC2_UNORM,
PIPE_FORMAT_RGTC2_SNORM } },
{ { o(ARB_texture_rg) },
{ PIPE_FORMAT_R8G8_UNORM } },
{ { o(EXT_texture_compression_latc) },
{ PIPE_FORMAT_LATC1_UNORM,
PIPE_FORMAT_LATC1_SNORM,
@@ -534,7 +535,6 @@ void st_init_extensions(struct st_context *st)
ctx->Extensions.EXT_blend_minmax = GL_TRUE;
ctx->Extensions.EXT_framebuffer_blit = GL_TRUE;
ctx->Extensions.EXT_framebuffer_object = GL_TRUE;
ctx->Extensions.EXT_framebuffer_multisample = GL_TRUE;
ctx->Extensions.EXT_fog_coord = GL_TRUE;
ctx->Extensions.EXT_gpu_program_parameters = GL_TRUE;
ctx->Extensions.EXT_pixel_buffer_object = GL_TRUE;
@@ -654,6 +654,10 @@ void st_init_extensions(struct st_context *st)
}
}
if (ctx->Const.MaxSamples >= 2) {
ctx->Extensions.EXT_framebuffer_multisample = GL_TRUE;
}
if (ctx->Const.MaxDualSourceDrawBuffers > 0)
ctx->Extensions.ARB_blend_func_extended = GL_TRUE;

View File

@@ -1398,18 +1398,25 @@ static const struct format_mapping format_map[] = {
/**
* Return first supported format from the given list.
* \param allow_dxt indicates whether it's OK to return a DXT format.
*/
static enum pipe_format
find_supported_format(struct pipe_screen *screen,
const enum pipe_format formats[],
enum pipe_texture_target target,
unsigned sample_count,
unsigned tex_usage)
unsigned tex_usage,
boolean allow_dxt)
{
uint i;
for (i = 0; formats[i]; i++) {
if (screen->is_format_supported(screen, formats[i], target,
sample_count, tex_usage)) {
if (!allow_dxt && util_format_is_s3tc(formats[i])) {
/* we can't return a dxt format, continue searching */
continue;
}
return formats[i];
}
}
@@ -1514,12 +1521,16 @@ find_exact_format(GLint internalFormat, GLenum format, GLenum type)
* \param internalFormat the user value passed to glTexImage2D
* \param target one of PIPE_TEXTURE_x
* \param bindings bitmask of PIPE_BIND_x flags.
* \param allow_dxt indicates whether it's OK to return a DXT format. This
* only matters when internalFormat names a generic or
* specific compressed format. And that should only happen
* when we're getting called from gl[Copy]TexImage().
*/
enum pipe_format
st_choose_format(struct pipe_screen *screen, GLenum internalFormat,
GLenum format, GLenum type,
enum pipe_texture_target target, unsigned sample_count,
unsigned bindings)
unsigned bindings, boolean allow_dxt)
{
GET_CURRENT_CONTEXT(ctx); /* XXX this should be a function parameter */
int i, j;
@@ -1547,7 +1558,8 @@ st_choose_format(struct pipe_screen *screen, GLenum internalFormat,
* which is supported by the driver.
*/
return find_supported_format(screen, mapping->pipeFormats,
target, sample_count, bindings);
target, sample_count, bindings,
allow_dxt);
}
}
}
@@ -1569,8 +1581,8 @@ st_choose_renderbuffer_format(struct pipe_screen *screen,
usage = PIPE_BIND_DEPTH_STENCIL;
else
usage = PIPE_BIND_RENDER_TARGET;
return st_choose_format(screen, internalFormat, GL_NONE, GL_NONE, PIPE_TEXTURE_2D,
sample_count, usage);
return st_choose_format(screen, internalFormat, GL_NONE, GL_NONE,
PIPE_TEXTURE_2D, sample_count, usage, FALSE);
}
@@ -1597,12 +1609,13 @@ st_ChooseTextureFormat_renderable(struct gl_context *ctx, GLint internalFormat,
}
pFormat = st_choose_format(screen, internalFormat, format, type,
PIPE_TEXTURE_2D, 0, bindings);
PIPE_TEXTURE_2D, 0, bindings, ctx->Mesa_DXTn);
if (pFormat == PIPE_FORMAT_NONE) {
/* try choosing format again, this time without render target bindings */
pFormat = st_choose_format(screen, internalFormat, format, type,
PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW);
PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW,
ctx->Mesa_DXTn);
}
if (pFormat == PIPE_FORMAT_NONE) {
@@ -1661,7 +1674,7 @@ st_QuerySamplesForFormat(struct gl_context *ctx, GLenum internalFormat,
/* Set sample counts in descending order. */
for (i = 16; i > 1; i--) {
format = st_choose_format(screen, internalFormat, GL_NONE, GL_NONE,
PIPE_TEXTURE_2D, i, bind);
PIPE_TEXTURE_2D, i, bind, FALSE);
if (format != PIPE_FORMAT_NONE) {
samples[num_sample_counts++] = i;
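
The allow_dxt filtering added to find_supported_format() above can be pictured with a reduced, self-contained sketch. Everything prefixed with sketch_ is a hypothetical stand-in (the real code uses enum pipe_format, screen->is_format_supported() and util_format_is_s3tc()); only the control flow is meant to mirror the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few enum pipe_format values. */
enum sketch_format {
   SKETCH_FORMAT_NONE = 0,   /* also terminates format lists */
   SKETCH_FORMAT_DXT1_RGBA,
   SKETCH_FORMAT_RGBA8
};

/* Stand-in for the driver's is_format_supported() callback. */
typedef bool (*sketch_supported_fn)(enum sketch_format format);

/* Stand-in for util_format_is_s3tc(). */
static bool
sketch_is_s3tc(enum sketch_format format)
{
   return format == SKETCH_FORMAT_DXT1_RGBA;
}

/* Return the first supported format in the list, skipping S3TC/DXT
 * formats unless allow_dxt is set.
 */
static enum sketch_format
sketch_find_supported(const enum sketch_format formats[],
                      sketch_supported_fn is_supported,
                      bool allow_dxt)
{
   size_t i;

   for (i = 0; formats[i] != SKETCH_FORMAT_NONE; i++) {
      if (!is_supported(formats[i]))
         continue;
      if (!allow_dxt && sketch_is_s3tc(formats[i]))
         continue;   /* can't return a DXT format, keep searching */
      return formats[i];
   }
   return SKETCH_FORMAT_NONE;
}

/* Toy driver that claims to support every format. */
static bool
sketch_everything_supported(enum sketch_format format)
{
   (void) format;
   return true;
}
```

Given a list that starts with a DXT format followed by an uncompressed fallback, passing false for allow_dxt makes the search fall through to the uncompressed entry, which is the behavior the callers above that pass FALSE appear to rely on.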

View File

@@ -51,7 +51,7 @@ extern enum pipe_format
st_choose_format(struct pipe_screen *screen, GLenum internalFormat,
GLenum format, GLenum type,
enum pipe_texture_target target, unsigned sample_count,
unsigned bindings);
unsigned bindings, boolean allow_dxt);
extern enum pipe_format
st_choose_renderbuffer_format(struct pipe_screen *screen,

View File

@@ -398,7 +398,8 @@ st_create_color_map_texture(struct gl_context *ctx)
/* find an RGBA texture format */
format = st_choose_format(pipe->screen, GL_RGBA, GL_NONE, GL_NONE,
PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW);
PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW,
FALSE);
/* create texture for color map/table */
pt = st_texture_create(st, PIPE_TEXTURE_2D, format, 0,