Compare commits

...

116 Commits

Author SHA1 Message Date
Ian Romanick
17493b8848 docs: Update relelase notes 2013-02-22 17:46:24 -08:00
Ian Romanick
3ea699ff3c mesa: Bump version to 9.1 (final)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-22 17:46:23 -08:00
Ian Romanick
ea38e7e2e3 i965: Enable OpenGL ES 3.0 on Sandy Bridge
Regardless of what we put in the screen structure, all of the extensions
that compute_version_es2 checks are present and 3.0 will be exposed
anyway.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7ae6864f0d)
2013-02-22 17:46:23 -08:00
Alex Deucher
4212dbae1c r600g: fixup PS_PARTIAL_FLUSH flag handling for cayman
So we don't emit it twice if we ever use the flag on
cayman.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8b5acad0e9)
2013-02-22 18:45:40 -05:00
Alex Deucher
a650092fd6 r600g: r6xx deadlock workaround (v6)
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=50655
https://bugs.freedesktop.org/show_bug.cgi?id=47116

v2: flush along with workaround.
v3: just need a flush
v4: try WAIT_UNTIL
v5: switch to PS partial flush
v6: rework patch

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8442b67f5f)
2013-02-22 18:27:37 -05:00
Alex Deucher
11d9f75f01 r600g: add PS_PARTIAL_FLUSH flag
PS_PARTIAL flushes seems to be required in certain
cases to prevent hangs, especially on r6xx.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ebf83f109)
2013-02-22 18:27:24 -05:00
Lauri Kasanen
6427e1609e configure: Fix build with automake < 1.11
Commit 86d30dea3c broke building with older
automake versions with this error:

Makefile:769: *** Recursive variable am__v_YACC_ references itself (eventually).  Stop.

This patch fixes it. Fix stolen from xorg-macros.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
(cherry picked from commit 0a82828ad5)
2013-02-22 13:15:40 -08:00
Michel Dänzer
47f7f803ae radeonsi: Fix PIPE_FORMAT_X32_S8X24_UINT sampler hardware format
4 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 9c1107b3e1)
2013-02-22 20:24:33 +01:00
Michel Dänzer
cb8bacd87a radeonsi: Use stencil surface level information for stencil texturing
7 more little dwarves^W piglits.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 8356962853)
2013-02-22 20:24:24 +01:00
Michel Dänzer
0d08abd461 radeonsi: properly implement S8Z24 depth-stencil format
Based on r600g commit 2b9659c9e6 .

Fixes crashes with 4 piglit tests which are now hitting these formats.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit f9adf79876)
2013-02-22 20:24:17 +01:00
Michel Dänzer
5baa8ec737 radeonsi: Fix w component of TGSI_SEMANTIC_POSITION fragment shader inputs.
It's the reciprocal of the register value.

Fixes piglit fragcoord_w and glsl-fs-fragcoord-zw-perspective.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 954bc4ac34)
2013-02-22 20:15:17 +01:00
Michel Dänzer
0c3b96a6c6 radeonsi: Fix blending using destination alpha factor but non-alpha destination
11 more little piglits.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 95bced5929)
2013-02-22 20:13:45 +01:00
Marek Olšák
24e8ad6204 radeonsi: implement 3D transfers
That means we can map and read multiple slices with one transfer_map call.

[ Cherry-picked from r600g commit 1aebb6911e ]

11 more little piglits on master, 1 more on the 9.1 branch (Marek's
glTex(Sub)Image improvements on master broke the other 10).

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 72f4490b55)
2013-02-22 20:13:27 +01:00
Marek Olšák
8013101c2d radeonsi: add assertions to prevent creation of invalid surfaces
[ Cherry-picked from r600g commit ef11ed61a0 ]

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a84c4edeed)
2013-02-22 20:13:18 +01:00
Marek Olšák
6894c127d9 radeonsi: use u_box_origin_2d helper function
[ Cherry-picked from r600g commit b278aba423 ]

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c4faab63c4)
2013-02-22 20:13:04 +01:00
Marek Olšák
f0f3ebb777 st/mesa: don't do sRGB conversion in CopyTexSubImage
Assuming I understand EXT_texture_sRGB correctly.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6520a86c67)
2013-02-22 13:20:10 +01:00
Marek Olšák
deeb4b056f r600g: fix random corruption with CP DMA in TF2
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit aac8138744)
2013-02-22 12:50:27 +01:00
Andreas Boll
8818d01d33 llvmpipe/build: add DLOPEN_LIBS and PTHREAD_LIBS to the lp_test_* targets
Fixes undefined symbols.

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61052
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit c1f2c3a80f)
2013-02-22 10:24:53 +01:00
Andreas Boll
ea51f870f6 targets/xa-vmwgfx: Force c++ linker to fix undefined symbols
NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61200
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit c1eb585f3d)
2013-02-22 10:24:47 +01:00
Tom Stellard
dfc35a4fc5 r300g/compiler: Fix bug in OMOD folding
The OMOD value was only being folded to one instruction in cases where
the MUL instruction was reading a value written by more than one
instruction.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 10bcc843f8)
2013-02-21 22:09:51 -05:00
Tom Stellard
c7fdb6a861 r300g/tests: Add helper functions for creating a full program
Now you can convert assembly strings into a full struct radeon_compiler
object and use it to test individual compiler pases.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 5e1321ddf4)
2013-02-21 22:09:42 -05:00
Tom Stellard
74790900ca r300g/tests: Exit test runner with a valid status code
This way make check can report whether or not the tests pass.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit bcf2e157ca)
2013-02-21 22:09:33 -05:00
Tom Stellard
42cfb1ef47 r300g/complier: Make r300_vertprog_swizzle_caps visible in other files
This will be used by the test suite in later commits.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 5355fc1e87)
2013-02-21 22:09:22 -05:00
Tom Stellard
754447d81f r300g/compiler: Add missing license headers
These are all files that I authored, but forgot to add the license
headers.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 27d140b960)
2013-02-21 22:09:11 -05:00
Alex Deucher
e41bdc223e r600g: don't enable ReZ mode on evergreen
Can cause lockups in certain cases when
zfunc/zenable/zwrite change without a flush
in between.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60969
and lockups on Civ4 with wine.

This is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 2e4ef989a2)
2013-02-21 12:02:15 -05:00
Ian Romanick
6ff7080a4c mesa: Don't install glEvalMesh in the beginend dispatch table
NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59740
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8b586322e7)
2013-02-20 15:25:55 -08:00
Zack Rusin
bdffccf91e DRI2: Don't disable GLX_INTEL_swap_event unconditionally
GLX_INTEL_swap_event is broken on the server side, where it's
currently unconditionally enabled. This completely breaks
systems running on drivers which don't support that extension.
There's no way to test for its presence on this side, so instead
of disabling it uncondtionally, just disable it for drivers
which are known to not support it. It makes sense because
most drivers do support it right now.
We'll be able to remove this once Xserver properly advertises
GLX_INTEL_swap_event.

Note: This is a candidate for stable branch branches.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60052
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 076403c30d)
2013-02-19 12:50:55 -08:00
Stefan Brüns
c41ba5bda9 glx: fix glGetTexLevelParameteriv for indirect rendering
A single element in a GLX reply is contained in the header itself.
The number of elements is denoted in the "n" field of the reply.
If "n" is 1, the length of additional data is 0.
The XXX_data_length() function of xcb does not return the length of
the (optional, n>1) data but the number of elements.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=59876

Note: This is a candidate for the stable branches.

Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5876a5dbc0)
2013-02-19 12:17:26 -08:00
Ian Romanick
456cdb6d01 mesa: Bump version to 9.1-rc2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-17 14:49:02 -08:00
Eric Anholt
aaee862305 i965/fs: Use a helper function for checking for flow control instructions.
In 2 of our checks, we were missing BREAK and CONTINUE.

NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bf91f0b039)
2013-02-17 14:20:39 -08:00
bma
b84d9aa0c6 shaderapi: Fix AttachShader error
Detect a duplicate Shader type as and error instead of silently allowing
it, restrict to ES2 API.

v2: Tapani Pälli <tapani.palli@intel.com>
    - make the check run time instead of compile time

v3: chadv
    - Quote spec on which error to generate.

Signed-off-by: bma <Bo.Ma@windriver.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit ce3dfa19ab)
2013-02-17 14:20:34 -08:00
Eric Anholt
bb4b1494e3 i965: Re-enable the -RHW workaround for original gen4 chips.
Fixes broken clipping in supertuxkart and presumably many other applications.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51471
NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit cb4616d32d)
2013-02-17 14:20:27 -08:00
Eric Anholt
321abaaa8d i965/gen4: Work around missing sRGB RGB DXT1 support.
The hardware just doesn't support it.  I suspect this was a regression from
the move to fixed MESA_FORMATs for compressed textures and that previously we
were storing uncompressed for this or something.

Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor
swizzled" on my GM965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ddc2b453d0)
2013-02-17 14:20:22 -08:00
Ian Romanick
95f1203a7c mesa: Add .cherry-ignore for 9.1
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-17 14:16:22 -08:00
Christopher James Halse Rogers
96fb4d61fb i965: Fix leak in blorp CopyTexSubImage2D
_mesa_delete_renderbuffer does not call the driver-specific
renderbuffer delete function, so the blorp code was leaking the
Intel-specific bits, including some GEM objects.

Call the renderbuffer's ->Delete() method instead, which does the
right thing.

Fixes Unity rapidly sending the machine into the arms of the OOM-killer

Note: This is a candidate for the 9.1 branch.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit dd599188d2)
2013-02-17 14:13:27 -08:00
Brian Paul
d8a0439c65 st/mesa: fix format query for GL_ARB_texture_rg
The GL_ARB_texture_rg spec says that we need to support both texturing
and rendering for the GL_RED and GL_RG formats.  So move the format
check up into the rendertarget_mapping[] list.  Also, add
PIPE_FORMAT_R8_UNORM to the list of formats required.

Note: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 4be5a06752)
2013-02-17 14:13:13 -08:00
Eric Anholt
c785315f3d i965/gen7: Set up all samplers even if samplers are sparsely used.
In GLSL, sampler indices are allocated contiguously from 0.  But in the
case of ARB_fragment_program (and possibly fixed function), an app that
uses texture 0 and 2 will use sampler indices 0 and 2, so we were only
allocating space for samplers 0 and 1 and setting up sampler 0.  We
would read garbage for sampler 2, resulting in flickering textures and
an angry simulator.

Fixes bad rendering in 0 A.D. and ETQW.  This was fixed for pre-gen7 by
28f4be9eb9

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable branches.
(cherry picked from commit 5bb05c6e6d)
2013-02-17 14:12:47 -08:00
Kenneth Graunke
0e3c755ca3 i965: Use derived state for Haswell's 3DSTATE_VF packet.
Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX.

Fixes gles3conform's primitive_restart_mode test.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8cabe26f5d)
2013-02-17 14:12:10 -08:00
Brian Paul
4d11454e90 util: fix incorrect Z bit masking in util_clear_depth_stencil()
For PIPE_FORMAT_Z24_UNORM_S8_UINT, the Z bits are in the 24
least significant bits.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60527
and http://bugs.freedesktop.org/show_bug.cgi?id=60524
and http://bugs.freedesktop.org/show_bug.cgi?id=60047

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4bfdef87e6)
2013-02-17 14:10:55 -08:00
Marek Olšák
9838215f3c mesa: fix GetTexImage if mesa format and internal format don't match
Tested with softpipe only exposing RGBA formats.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cb6470775c)
2013-02-17 14:10:40 -08:00
Marek Olšák
2e4473d9e3 mesa: don't use memcpy fast path for GetTexImage if base format is different
The Mesa format can be RGBA8888_REV, the format/type can be
GL_RGBA/GL_UNSIGNED_BYTE, but the actual texture internal format can be
LUMINANCE_ALPHA, INTENSITY, etc. Therefore we should look at the base
internal format as well.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit c8379204ab)
2013-02-17 14:10:28 -08:00
Marek Olšák
11eb644cc9 mesa: don't use _mesa_base_tex_format for format parameter of GetTexImage
_mesa_base_tex_format doesn't accept GL_BGR and GL_ABGR_EXT, etc.

v2: add a (now hopefully complete) helper function to deal with this

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 09a99867ab)
2013-02-17 14:09:55 -08:00
Ian Romanick
60bad0ddc3 intel: Do not expose OES_compressed_ETC1_RGB8_texture or ARB_texture_rgb10_a2ui pre-GEN4
Older hardware cannot do ARB_texture_rgb10_a2ui, and the translation
code for OES_compressed_ETC1_RGB8_texture was never implemented in the
i915 driver.

NOTE: This is a candidate for all stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0e2f26d5ea)
2013-02-17 14:09:09 -08:00
Roland Scheidegger
d41e9b4d14 softpipe: fix using optimized filter function
This optimized filter (when using repeat wrap modes,
linear min/mag/mip filters, pot textures) only applies to 2d textures,
but nothing prevented it from being used for other textures (likely
leading to very bogus sample results).

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 66b6d51214)
2013-02-17 14:09:01 -08:00
Kristian Høgsberg
60f05f0eef egl-wayland: Make sure we allocate a back buffer even if nothing was rendered
At eglSwapBuffer time, we blindly assume we have a back buffer, but the
back buffer only gets allocated when somebody tries to render something.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

https://bugs.freedesktop.org/show_bug.cgi?id=60086
(cherry picked from commit 1fe007399c)
2013-02-17 14:08:41 -08:00
Brian Paul
714d8b3f8c svga: fix sRGB rendering
We weren't emitting the SVGA_RS_OUTPUTGAMMA state so sRGB rendering
didn't work properly.

Fixes piglit's framebuffer-srgb test.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit ff60509157)
2013-02-17 14:08:22 -08:00
Brian Paul
ac22dffaf6 st/mesa: don't choose DXT formats if we can't do DXT compression
If we call gl[Copy]TexImage2D() with a generic compression format
(e.g. intFormat=GL_COMPRESSED_RGBA) we can't choose a DXT format if
we don't have the external DXT compression library.

We weren't actually enforcing this before since the
pipe_screen::is_format_supported(DXT) query has no dependency on
the DXT compression library.

Now if we're given a generic compressed format and we can't do DXT
compression we'll fall back to a non-compressed format.

v2: use util_format_is_s3tc() function and add more comments about
the allow_dxt parameter.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4df42890c5)
2013-02-17 14:08:16 -08:00
Brian Paul
4ec3843a54 mesa: don't use format chooser code for glCompressedTexImage
When glCompressedTexImage is called the internalFormat is a specific
format for the incoming image and the the hardware format should be
the same (since we never do format transcoding).  So use the simpler
_mesa_glenum_to_compressed_format() function.  This change is also
needed for the next patch.

Note: This is a candidate for the stable branches.
(cherry picked from commit 478056b81a)
2013-02-17 14:07:37 -08:00
Michel Dänzer
30ae2f97c5 configure.ac: GLX cannot work without OpenGL
GLX uses mapi/glapi/libglapi.la, which is only built for OpenGL.

If the user specified --enable-xlib-glx --disable-opengl, error out, as these
cannot be both observed at the same time. If the user just specified
--disable-opengl but not --disable-glx, print a warning and disable GLX as
well.

NOTE: This is a candidate for the stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59364

Tested-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 3b888f534c)
2013-02-17 14:07:22 -08:00
Stéphane Marchesin
b289e639e4 glx: Check that swap_buffers_reply is non-NULL before using it
Check that the return value from xcb_dri2_swap_buffers_reply is
non-NULL before accessing the struct members.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 67e7263e45)
2013-02-17 14:02:29 -08:00
Brian Paul
c7e5e9ddce st/mesa: only enable GL_EXT_framebuffer_multisample if GL_MAX_SAMPLES >= 2
We never really have multisampling with one sample per pixel.
See also http://bugs.freedesktop.org/show_bug.cgi?id=59873

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

(cherry picked from commit c80bacba2e)
2013-02-17 14:01:37 -08:00
Brian Paul
7b9e99f45b mesa: don't enable GL_EXT_framebuffer_multisample for software drivers
Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 8f3c81d018)
2013-02-17 14:00:11 -08:00
Brian Paul
9cea40321c osmesa: use _mesa_generate_mipmap() for mipmap generation, not meta
See previous commit for more info.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 2180f32972)
2013-02-17 13:59:52 -08:00
Brian Paul
29e63455aa xlib: use _mesa_generate_mipmap() for mipmap generation, not meta
The swrast fragment program interpreter has trouble computing the
right texture LOD because it doesn't have easy access to input
derivatives.  This causes the GLSL-based meta generate mipmap code
to fetch texels from the wrong mipmap level.

One possible fix would be to set the GL_TEXTURE_MIN/MAX_LOD parameters
to limit sampling from the right level.  But let's just use the
_mesa_generate_mipmap() fallback since it's a lot faster than using
the fragment shader interpreter.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=54240

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 89551ae04f)
2013-02-17 13:59:43 -08:00
Paul Berry
632a5a3a5b glsl: don't allow non-flat integral types in varying structs/arrays.
In the GLSL 1.30 spec, section 4.3.6 ("Outputs") says:

    "If a vertex output is a signed or unsigned integer or integer
    vector, then it must be qualified with the interpolation qualifier
    flat."

The GLSL ES 3.00 spec further clarifies, in section 4.3.6 ("Output
Variables"):

    "Vertex shader outputs that are, *or contain*, signed or unsigned
    integers or integer vectors must be qualified with the
    interpolation qualifier flat."

(Emphasis mine.)

The language in the GLSL ES 3.00 spec is clearly correct and should be
applied to all shading language versions, since varyings that contain
ints can't be interpolated, regardless of which shading language
version is in use.

(Note that in GLSL 1.50 the restriction is changed to apply to
fragment shader inputs rather than vertex shader outputs, to
accommodate the fact that in the presence of geometry shaders, vertex
shader outputs are not necessarily interpolated.  That will be
addressed by a future patch).

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 93c913485e)
2013-02-15 13:28:01 -08:00
Paul Berry
2cd4824fbc glsl: Allow default precision qualifiers to be set for sampler types.
From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"):

    "The precision statement

        precision precision-qualifier type;

    can be used to establish a default precision qualifier. The type
    field can be either int or float or any of the sampler types, and
    the precision-qualifier can be lowp, mediump, or highp."

GLSL ES 1.00 has similar language.  GLSL 1.30 doesn't allow precision
qualifiers on sampler types, but this seems like an oversight (since
the intention of including these in GLSL 1.30 is to allow
compatibility with ES shaders).

Previously, Mesa followed GLSL 1.30 and only allowed default precision
qualifiers to be set for float and int.  This patch makes it follow
GLSL ES rules in all cases.

Fixes Piglit tests default-precision-sampler.{vert,frag}.

Partially addresses https://bugs.freedesktop.org/show_bug.cgi?id=60737.

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d5948f2f5e)
2013-02-15 13:27:48 -08:00
Michel Dänzer
05de84442b radeonsi: Handle TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS
8 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit c840270ebe)
2013-02-15 18:47:28 +01:00
Michel Dänzer
14372a70ec radeonsi: Fix array indices for detecting integer vertex formats
(cherry picked from commit f34ad85765)
2013-02-15 18:47:21 +01:00
Christian König
baa9070346 radeonsi: remove constant index limitation v3
With the llvm patches, fixing 14 piglit tests in total.

v2: increase the const limit
v3: document the const limit

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8c80894fb3)
2013-02-15 18:46:32 +01:00
Christian König
f50e4e21f4 radeonsi: support constants as TEX coordinates
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8514f5ac01)
2013-02-15 18:45:29 +01:00
Tom Stellard
38e728498b configure.ac: Add components to LLVM_COMPONENTS when using llvm shared libs
This is required when LLVM is built with CMake, which creates one
shared library for each component.
(cherry picked from commit 0898047e7b)
2013-02-13 17:02:12 -05:00
Matt Turner
fb2eb65126 Revert "mesa: Return INVALID_OPERATION when type is known but not allowed"
This reverts commit 2906e2034c.

Fixes a regression in the glean depthStencil test.

Reverting this does not affect any tests in es3conform, so a more recent
patch must have also fixed the failure this one was intended to fix.

Reported-by: lu hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59494
(cherry picked from commit a527b2192e)
2013-02-13 15:31:50 -05:00
Tom Stellard
741a249cbf r600g: Handle SET*_DX10 instructions in r600_bytecode_get_num_operands() 2013-02-13 15:31:34 -05:00
Jerome Glisse
3ae8678f81 r600g: fix lockup when hyperz & alpha test are enabled together. v3
Seems that alpha test being enabled confuse the GPU on the order in
which it should perform the Z testing. So force the order programmed
throught db shader control.

v2: Only force z order when alpha test is enabled
v3: Update db shader when binding new dsa + spelling fix

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 974b482aca)
2013-02-12 17:06:36 -05:00
Jordan Justen
85604f3d48 CopyTexImage: Don't check sRGB vs LINEAR for desktop GL
In OpenGL 4.3, new language was added that would require
this check. But, if this check results in broken applications
then perhaps it will be reversed.

For now, remove this check and re-evaluate when
desktop GL 4.3 is closer.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-12 11:25:15 -08:00
Quentin Glidic
9119b4e8ee configure.ac: Fix --with-llvm-shared-libs
The third argument of AC_ARG_WITH is evaluated for any provided value,
not only on --with-, so it must not force-enable the feature
Also, setting $with_llvm_shared_libs in the opencl check was overriding
the user switch

https://bugs.freedesktop.org/show_bug.cgi?id=59851

Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
(cherry picked from commit 1e857130f0)
2013-02-12 15:58:22 +00:00
Tom Stellard
f4f306b8ba r600g/llvm: Select the correct GPU type for RV670
RV670 belongs in the R600 chip class

https://bugs.freedesktop.org/show_bug.cgi?id=58666

NOTE: This is a candidate for the 9.1 branch
(cherry picked from commit 257006e2a4)
2013-02-12 15:58:04 +00:00
Jerome Glisse
99adec8a88 r600g: make sure async blit is done 8 * pitch at a time v2
The blit must be aligned on 8 horizontal block.

v2: no need to align the reminder

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 323a448825)
2013-02-11 18:44:55 -05:00
Martin Andersson
3b609f12f6 winsys/radeon: fix bo with virtual address referencing mismatch
If the same context try to flink and open the object, use the
same bo struct instead of opening a new gem handle for the object.
This way we avoid avoid having 2 different handle pointing to the
same kernel object which can latter lead to trouble with virtual
address.

Fix:
https://bugs.freedesktop.org/show_bug.cgi?id=60200

Signed-off-by: Martin Andersson <g02maran@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit a37835c8ed)
2013-02-11 18:41:28 -05:00
Andreas Boll
ecd310bd67 docs: document removal of makedepend build dependency
Build dependency removed with
424f200881

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 44a5d7371c)
2013-02-11 18:12:30 +01:00
Matt Turner
2a7affc1d5 builtin_compiler/build: Don't use *_FOR_BUILD when not cross compiling
Previously we were relying on CFLAGS_FOR_BUILD to be the same as CFLAGS
when not cross compiling, but this assumption didn't take into
consideration 32-bit builds on 64-bit systems. More generally, not
honoring CFLAGS is bad.

Automake is evidently too stupid to accept

if CROSS_COMPILING
CC = @CC_FOR_BUILD@
...
else
CC = @CC@
endif

without warning that CC has been already defined. The warnings are
harmless, but I'd prefer to avoid future reports about them, so define
proxy variables, which are assigned inside the conditional and then
unconditionally assigned to CC et al.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59737
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60038
(cherry picked from commit 2db1f73849)
2013-02-11 12:28:06 +01:00
Quentin Glidic
c684e3b53e gallium/egl: Fix include dirs for VPATH build
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
(cherry picked from commit 11bd1b0f58)
2013-02-11 11:48:54 +01:00
Andreas Boll
b94eeffe60 mesa: Bump version to 9.1-rc1 2013-02-11 09:21:54 +01:00
Jerome Glisse
a0528269a3 winsys/radeon: improve debuging printing
Make sure one can identify virtual address failure from allocation
failure.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 9a47684564)
2013-02-08 20:33:22 -05:00
Jerome Glisse
18ef6b1265 xorg: fix exa finish access
The exa core will already set the pointer to NULL prior calling
the callback function. So don't bail out in the callback if it's
already NULL.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 3310acdf47)
2013-02-08 19:01:51 -05:00
Paul Berry
0419b7a3a1 glsl: Support transform feedback of varying structs.
Since transform feedback needs to be able to access individual fields
of varying structs, we can no longer match up the arguments to
glTransformFeedbackVaryings() with variables in the vertex shader.

Instead, we build up a hashtable which records information about each
possible name that is a candidate for transform feedback, and then
match up the arguments to glTransformFeedbackVaryings() with the
contents of that hashtable.

Populating the hashtable uses the program_resource_visitor
infrastructure, so the logic is shared with how we handle uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 99b78337e3)
2013-02-08 11:17:33 -08:00
Paul Berry
5be2e14393 glsl: Use parse_program_resource_name to parse transform feedback varyings.
Previously, transform feedback varyings were parsed in an ad-hoc
fashion that wasn't compatible with structs (or array of structs).
This patch makes it use parse_program_resource_name(), which correctly
handles both.

Note that parse_program_resource_name()'s technique for handling
mal-formed input strings is to simply let them through and rely on the
fact that a future name lookup will fail.  Because of this,
tfeedback_decl::init() no longer needs to return a boolean error
code--it always succeeds, and if the input was mal-formed the error
will be detected later.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 53febac02c)
2013-02-08 11:17:28 -08:00
Paul Berry
11e4347bff glsl: Rename uniform_field_visitor to program_resource_visitor.
There's actually nothing uniform-specific in uniform_field_visitor.
It is potentially useful for all kinds of program resources (in
particular, future patches will use it for transform feedback
varyings).

This patch renames it to program_resource_visitor, and clarifies
several comments, to reflect the fact that it is useful for more than
just uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b4db34cc4c)
2013-02-08 11:17:23 -08:00
Paul Berry
49a5f829f7 mesa/glsl: Separate parsing logic from _mesa_get_uniform_location.
The parsing logic is moved to a new function in the GLSL module,
parse_program_resource_name().  This name was chosen because it should
eventually be useful for handling everything that OpenGL 4.3 calls
"program resources" (e.g. uniforms, vertex inputs, fragment outputs,
and transform feedback varyings).

Future patches will make use of this function for linking transform
feedback varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b92900d26a)
2013-02-08 11:17:17 -08:00
Kenneth Graunke
5265c42e52 i965/blorp: Support blits between ARGB and XRGB formats.
Now that we have support for overriding alpha to 1.0, we can handle
blitting between these formats in either direction.

For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and
MESA_FORMAT_RGBX8888_REV.  Most places only appear to worry about the
former, so ignore the latter for now.  We can always add it later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit 7d467f3c15)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
3114f5acd3 i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal.  However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.

For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors.  This is fairly
straightforward with blending.

For now, this code is never used, as the source and destination formats
still must be equal.  The next patch will relax that restriction.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit c0554141a9)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
332c50b666 i965: Implement CopyTexSubImage2D via BLORP (and use it by default).
The BLT engine has many limitations.  Currently, it can only blit
X-tiled buffers (since we don't have a kernel API to whack the BLT
tiling mode register), which means all depth/stencil operations get
punted to meta code, which can be very CPU-intensive.

Even if we used the BLT engine, it can't blit between buffers with
different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture
and a Y-tiled CMS ARGB8888 renderbuffer.  This is a fundamental
limitation, and the only way around that is to use BLORP.

Previously, BLORP only handled BlitFramebuffer.  This patch adds an
additional frontend for doing CopyTexSubImage.  It also makes it the
default.  This is partly to increase testing and avoid hiding bugs,
and partly because the BLORP path can already handle more cases.  With
trivial extensions, it should be able to handle everything the BLT can.

This helps PlaneShift massively, which tries to CopyTexSubImage2D
between depth buffers whenever a player casts a spell.  Since these
are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU
while delivering ~1 FPS.  This is particularly bad in an MMO setting
because people cast spells all the time.

It also helps Xonotic in 4X MSAA mode.  At default power management
settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5).
(This data is from v1 of the patch.)

No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).

v2: Create a fake intel_renderbuffer to wrap the destination texture
    image and then reuse do_blorp_blit rather than reimplementing most
    of it.  Remove unnecessary clipping code and conditional rendering
    check.

v3: Reuse formats_match() to centralize checks; delete temporary
    renderbuffers.  Reorganize the code.

v4: Actually copy stencil when dealing with separate stencil buffers but
    packed depth/stencil formats.  Tested by a new Piglit test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3]
Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2]
Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]
(cherry picked from commit 0b3bebbaac)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
55e3f79d55 mesa: Put extern "C" guards in renderbuffer.h.
I need to use this from C++ code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 29aef6cce8)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
1d2ef43032 i965: Fix the SF Vertex URB Read Length calculation for Gen7 platforms.
Ivybridge doesn't appear to have the same errata as Sandybridge; no
corruption was observed by setting it to more than the minimal correct
value.  It's possible that we were simply lucky, since the URB entries
are 1024-bit on Ivybridge vs. 512-bit Sandybridge.  Or perhaps the
underlying hardware issue is fixed.

Either way, we may as well program the minimum value since it's now
readily available, likely to be more efficient, and possibly more
correct.

v2: Use GEN7_SBE_* defines rather than GEN6_SF_*.  (A copy and paste
    mistake.)  They're the same, but using the right names is better.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44aa2e15f6)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
3acd5ed75b i965: Fix the SF Vertex URB Read Length calculation for Sandybridge.
(This commit message was primarily written by Paul Berry, who explained
 what's going on far better than I would have.)

Previous to this patch, we thought that the only restrictions on
3DSTATE_SF's URB read length were (a) it needs to be large enough to
read all the VUE data that the SF needs, and (b) it can't be so large
that it tries to read VUE data that doesn't exist.  Since the VUE map
already tells us how much VUE data exists, we didn't bother worrying
about restriction (a); we just did the easy thing and programmed the
read length to satisfy restriction (b).

However, we didn't notice this erratum in the hardware docs: "[errata]
Corruption/Hang possible if length programmed larger than recommended".
Judging by the context surrounding this erratum, it's pretty clear that
it means "URB read length must be exactly the size necessary to read all
the VUE data that the SF needs, and no larger".  Which means that we
can't program the read length based on restriction (b)--we have to
program it based on restriction (a).

The URB read size needs to precisely match the amount of data that the
SF consumes; it doesn't work to simply base it on the size of the VUE.

Thankfully, the PRM contains the precise formula the hardware expects.

Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
corruption in PlaneShift, and Piglit's fbo-5-varyings test.

NOTE: This is a candidate for all stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 09fbc29828)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
697f8e56dc i965: Compute the maximum SF source attribute.
The maximum SF source attribute is necessary to compute the Vertex URB
read length properly, which will be done in the next commit.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5e9bc7bd12)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
45ae093e5c i965: Refactor Gen6+ SF attribute override code.
The next patch will benefit from easy access to the source attribute
number and whether or not we're swizzling.  It doesn't want the final
attr_override DWord form, however.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b3efc5bea8)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
535e95299a i965: Add chipset limits for Haswell GT1/GT2.
The maximum number of URB entries come from the 3DSTATE_URB_VS and
3DSTATE_URB_GS state packet documentation; the thread count information
comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
(cherry picked from commit 9add4e8038)
2013-02-07 22:31:28 -08:00
Vinson Lee
a7e2c615f1 i965: Fix assignment instead of comparison in asserts.
Fixes side effect in assertion defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 1559994cba)
2013-02-07 22:31:28 -08:00
Paul Berry
5611a5a387 mesa: Don't check (offset + size <= bufObj->Size) in BindBufferRange.
In the documentation for BindBufferRange, OpenGL specs from 3.0
through 4.1 contain this language:

    "The error INVALID_VALUE is generated if size is less than or
    equal to zero or if offset + size is greater than the value of
    BUFFER_SIZE."

This text was dropped from OpenGL 4.2, and it does not appear in the
GLES 3.0 spec.

Presumably the reason for the change is because come clients change
the size of the buffer after calling BindBufferRange.  We don't want
to generate an error at the time of the BindBufferRange call just
because the old size of the buffer was too small, when the buffer is
about to be resized.

Since this is a deliberate relaxation of error conditions in order to
allow clients to work, it seems sensible to apply it to all versions
of GL, not just GL 4.2 and above.

(Note that there is no danger of this change allowing a client to
access data beyond the end of a buffer.  We already have code to
ensure that that doesn't happen in the case where the client shrinks
the buffer after calling BindBufferRange).

Eliminates a spurious error message in the gles3 conformance test
"transform_feedback_offset_size".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 04f0d6cc22)
2013-02-07 21:20:32 -08:00
Ian Romanick
a48e5526c2 i965: Set UniformBufferOffsetAlignment to sizeof(vec4)
This matches the behavior of the Windows driver, but a bspec reference
should would be nice.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f29ab4ece5)
2013-02-07 21:20:16 -08:00
Matt Turner
c59808c700 mesa: Allow glGet* queries of MAX_VARYING_COMPONENTS in ES 3
Should have been done in d9948e49 but I missed it because
MAX_VARYING_FLOATS doesn't appear in the ES 3 spec, but is the same
value as MAX_VARYING_COMPONENTS.

NOTE: Candidate for the 9.1 branch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-07 17:54:16 -08:00
Michel Dänzer
ad62f424b3 radeonsi: Handle scaled and integer formats for samplers and vertex elements.
Also, add assertions to stress that render targets don't support scaled
formats.

20 more little piglits.
(cherry picked from commit 46dd16bca8b4526e46badc9cb1d7c058a8e6173e)
2013-02-07 19:11:30 +01:00
Michel Dänzer
fc04455533 radeonsi: Don't advertise PIPE_FORMAT_L8A8_SRGB support.
The hardware can't do it.
(cherry picked from commit f6e9430da2d3510f84baefa0fdf26ec5c457f146)
2013-02-07 19:11:19 +01:00
Michel Dänzer
6799bddf6b radeonsi: Remove incorrect (and dead) assignment in tex_fetch_args().
The proper return type is assigned at the end of the function.
(cherry picked from commit 180db2bcb28e94bb1ce18d76b2b3a5818d76262c)
2013-02-07 19:11:09 +01:00
Michel Dänzer
93f61addb5 radeonsi: Use unique names for referring to texture sampling intrinsics.
Append the overloaded vector type used for passing in the addressing
parameters.

Without this, LLVM uses the same function signature for all those types,
which cannot work.

Fixes problems e.g. with FlightGear and Red Eclipse.
(cherry picked from commit 1b3afea30de757815555d9eb1d6e72e2586d6a0c)
2013-02-07 19:10:17 +01:00
Jerome Glisse
d04b50b4de r600g: fix slice tile max for compressed texture and async dma
Was using the pixel size instead of the number of block for the slice
tile max computation which resulted in dma writing at wrong address.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-07 10:43:37 -05:00
Marek Olšák
f1c46c8418 r300g: fix blending with blend color and RGBA formats
NOTE: This is a candidate for the stable branches.
(cherry picked from commit f40a7fc34a)
2013-02-06 22:24:04 +01:00
Michel Dänzer
4bc85f9aac Require libdrm_radeon 2.4.42 for radeonsi.
It has new PCI IDs and an important tiled surface layout fix.
(cherry picked from commit 02a423b239)
2013-02-05 15:15:49 +01:00
Alex Deucher
e1d798a901 radeonsi: add Oland pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
(cherry picked from commit 4161d70bba)
2013-02-04 17:20:22 -05:00
Alex Deucher
6b0fa537a9 radeonsi: default PA_SC_RASTER_CONFIG to 0
That should work in all cases.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
(cherry picked from commit af0af75881)
2013-02-04 17:20:03 -05:00
Alex Deucher
0cc0097bb0 radeonsi: add support for Oland chips
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch
(cherry picked from commit 83e4407f44)
2013-02-04 17:19:43 -05:00
Michel Dänzer
7f90de5414 radeonsi: Fix draws using user index buffer.
Was broken since commit bf469f4edc
('gallium: add void *user_buffer in pipe_index_buffer').

Fixes 11 piglit tests and lots of missing geometry e.g. in TORCS.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit a8a5055f2d)
2013-02-04 17:54:03 +01:00
Michel Dänzer
8cd237bcbe radeonsi: Remove spurious traces of R16G16B16 support.
The hardware can't do it, and these were causing warnings in some piglit tests.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 6455d40b7e)
2013-02-04 17:28:18 +01:00
Michel Dänzer
5ca77c27a6 radeonsi: Enable texture arrays.
28/30 piglit tests pass.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 6bcb823844)
2013-02-04 17:28:14 +01:00
Michel Dänzer
b104d151f1 radeonsi: Improve packing of texture address parameters.
In particular, the LOD bias and depth comparison values are packed before the
'normal' texture coordinates, and the array slice and LOD values are appended.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 120efeef8b)
2013-02-04 17:27:43 +01:00
Michel Dänzer
5f9f3f381f radeonsi: Adapt to sample intrinsics changes.
Fix up intrinsic names, and bitcast texture address parameters to integers.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit e5fb7347a7)
2013-02-04 17:27:34 +01:00
Marek Olšák
b127ad3489 mesa: don't expose IBM_rasterpos_clip in a core context
glRasterPos doesn't exist in the core profile.

NOTE: This is a candidate for the stable branches (9.0 and 9.1).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cc5fdaf2dc)
2013-02-01 16:35:24 +01:00
Marek Olšák
1003652a7f r300g: always put MSAA resources in VRAM
This along with the latest drm-fixes branch should help with bad performance
of MSAA. Remember: Nx MSAA can't be more than N times slower (where N=2,4,6).

Anyway, I recommend at least 512 MB of VRAM for Full HD 6x MSAA.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit a06f03d795)
2013-02-01 16:35:18 +01:00
Jerome Glisse
9d8a866db3 r600g: add cs memory usage accounting and limit it v3
We are now seing cs that can go over the vram+gtt size to avoid
failing flush early cs that goes over 70% (gtt+vram) usage. 70%
is use to allow some fragmentation.

The idea is to compute a gross estimate of memory requirement of
each draw call. After each draw call, memory will be precisely
accounted. So the uncertainty is only on the current draw call.
In practice this gave very good estimate (+/- 10% of the target
memory limit).

v2: Remove left over from testing version, remove useless NULL
    checking. Improve commit message.
v3: Add comment to code on memory accounting precision

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-31 14:25:30 -05:00
Marek Olšák
3b8d4f941f r600g: fix htile buffer leak
NOTE: This is a candidate for the 9.1 branch.
2013-01-31 14:25:10 -05:00
Matt Turner
ff515c4e7c build: Add missing comma in AS_IF
Reported-by: Lauri Kasanen<curaga@operamail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47248#c15
2013-01-29 15:06:47 -08:00
Marek Olšák
d7ca04a7c3 docs/relnotes-9.1: document new features in radeon drivers
(cherry picked from commit 845130951f)
2013-01-29 17:38:14 +01:00
Matt Turner
48af880f81 docs: List new extensions added in Mesa 9.1
I did not list the *_get_program_binary extensions since they're not
useful to anyone with their current implementation (that supports 0
binary formats).
2013-01-28 16:49:24 -08:00
Jerome Glisse
af2d8f8072 r600g: use uint64_t instead of unsigned long for proper 32bits cpu support
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 19:10:29 -05:00
Jerome Glisse
d8d17441e2 r600g: real fix for non 3.8 kernel
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 17:44:49 -05:00
115 changed files with 2192 additions and 647 deletions

View File

@@ -36,7 +36,7 @@ check-local:
# Rules for making release tarballs
PACKAGE_VERSION=9.1-devel
PACKAGE_VERSION=9.1
PACKAGE_DIR = Mesa-$(PACKAGE_VERSION)
PACKAGE_NAME = MesaLib-$(PACKAGE_VERSION)

3
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,3 @@
d60da27273d2cdb68bc32cae2ca66718dab15f27 st/mesa: set ctx->Const.MaxSamples = 0, not 1
5c86a728d4f688c0fe7fbf9f4b8f88060b65c4ee r600g: fix htile buffer leak
496928a442cec980b534bc5da2523b3632b21b61 CopyTexImage: Don't check sRGB vs LINEAR for desktop GL

View File

@@ -20,7 +20,8 @@ echo \#buildapi-variable-no-builddir >/dev/null
# Support silent build rules, requires at least automake-1.11. Disable
# by either passing --disable-silent-rules to configure or passing V=1
# to make
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])],
[AC_SUBST([AM_DEFAULT_VERBOSITY], [1])])
m4_ifdef([AM_PROG_AR], [AM_PROG_AR])
@@ -30,7 +31,7 @@ AC_SUBST([OSMESA_VERSION])
dnl Versions for external dependencies
LIBDRM_REQUIRED=2.4.24
LIBDRM_RADEON_REQUIRED=2.4.40
LIBDRM_RADEON_REQUIRED=2.4.42
LIBDRM_INTEL_REQUIRED=2.4.38
LIBDRM_NVVIEUX_REQUIRED=2.4.33
LIBDRM_NOUVEAU_REQUIRED="2.4.33 libdrm >= 2.4.41"
@@ -57,10 +58,10 @@ LT_PREREQ([2.2])
LT_INIT([disable-static])
AX_PROG_BISON([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"]
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"],
[AC_MSG_ERROR([bison not found - unable to compile glcpp-parse.y])]))
AX_PROG_FLEX([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"]
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"],
[AC_MSG_ERROR([flex not found - unable to compile glcpp-lex.l])]))
AC_PATH_PROG([PERL], [perl])
@@ -611,7 +612,7 @@ AC_ARG_ENABLE([opencl],
[enable OpenCL library NOTE: Enabling this option will also enable
--with-llvm-shared-libs
@<:@default=no@:>@])],
[enable_opencl="$enableval" with_llvm_shared_libs="$enableval"],
[],
[enable_opencl=no])
AC_ARG_ENABLE([xlib_glx],
[AS_HELP_STRING([--enable-xlib-glx],
@@ -701,6 +702,16 @@ if test "x$enable_dri$enable_xlib_glx" = xyesyes; then
AC_MSG_ERROR([DRI and Xlib-GLX cannot be built together])
fi
if test "x$enable_opengl$enable_xlib_glx" = xnoyes; then
AC_MSG_ERROR([Xlib-GLX cannot be built without OpenGL])
fi
# Disable GLX if OpenGL is not enabled
if test "x$enable_glx$enable_opengl" = xyesno; then
AC_MSG_WARN([OpenGL not enabled, disabling GLX])
enable_glx=no
fi
# Disable GLX if DRI and Xlib-GLX are not enabled
if test "x$enable_glx" = xyes -a \
"x$enable_dri" = xno -a \
@@ -1619,8 +1630,13 @@ AC_ARG_ENABLE([gallium-llvm],
AC_ARG_WITH([llvm-shared-libs],
[AS_HELP_STRING([--with-llvm-shared-libs],
[link with LLVM shared libraries @<:@default=disabled@:>@])],
[with_llvm_shared_libs=yes],
[],
[with_llvm_shared_libs=no])
AS_IF([test x$enable_opencl = xyes],
[
AC_MSG_WARN([OpenCL required, forcing LLVM shared libraries])
with_llvm_shared_libs=yes
])
AC_ARG_WITH([llvm-prefix],
[AS_HELP_STRING([--with-llvm-prefix],
@@ -1662,16 +1678,14 @@ if test "x$enable_gallium_llvm" = xyes; then
if test "x$LLVM_CONFIG" != xno; then
LLVM_VERSION=`$LLVM_CONFIG --version | sed 's/svn.*//g'`
LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 's/\([[0-9]]\)\.\([[0-9]]\)/\10\2/g'`
if test "x$with_llvm_shared_libs" != xyes; then
LLVM_COMPONENTS="engine bitwriter"
if $LLVM_CONFIG --components | grep -q '\<mcjit\>'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
fi
LLVM_COMPONENTS="engine bitwriter"
if $LLVM_CONFIG --components | grep -q '\<mcjit\>'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
fi
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
fi
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_BINDIR=`$LLVM_CONFIG --bindir`
LLVM_CPPFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cppflags"`
@@ -1840,6 +1854,9 @@ if test "x$with_gallium_drivers" != x; then
if test "x$enable_r600_llvm" = xyes; then
USE_R600_LLVM_COMPILER=yes;
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
fi
gallium_check_st "radeon/drm" "dri-r600" "xorg-r600" "" "xvmc-r600" "vdpau-r600"
;;
xradeonsi)

View File

@@ -14,7 +14,7 @@
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Mesa 9.1 Release Notes / date TBD</h1>
<h1>Mesa 9.1 Release Notes / date February 22, 2013</h1>
<p>
Mesa 9.1 is a new development release.
@@ -44,9 +44,19 @@ Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ANGLE_texture_compression_dxt3</li>
<li>GL_ANGLE_texture_compression_dxt5</li>
<li>GL_ARB_ES3_compatibility</li>
<li>GL_ARB_internalformat_query</li>
<li>GL_ARB_map_buffer_alignment</li>
<li>GL_ARB_texture_cube_map_array</li>
<li>GL_ARB_shading_language_packing</li>
<li>GL_ARB_texture_buffer_object_rgb32</li>
<li>GL_ARB_texture_cube_map_array</li>
<li>GL_EXT_color_buffer_float</li>
<li>GL_OES_depth_texture_cube_map</li>
<li>OpenGL 3.1 core profile support on Radeon HD2000 up to HD6000 series </li>
<li>Multisample anti-aliasing support on Radeon X1000 series</li>
<li>OpenGL ES 3.0 support on Intel HD Graphics 2000, 2500, 3000, and 4000</li>
</ul>
@@ -63,6 +73,7 @@ Note: some of the new features are only available with certain drivers.
<li>Removed swrast support for GL_NV_vertex_program</li>
<li>Removed swrast support for GL_NV_fragment_program</li>
<li>Removed OpenVMS support (unmaintained and broken)</li>
<li>Removed makedepend build dependency</li>
</ul>
</div>

View File

@@ -46,3 +46,17 @@ CHIPSET(0x6839, VERDE_6839, VERDE)
CHIPSET(0x683B, VERDE_683B, VERDE)
CHIPSET(0x683D, VERDE_683D, VERDE)
CHIPSET(0x683F, VERDE_683F, VERDE)
CHIPSET(0x6600, OLAND_6600, OLAND)
CHIPSET(0x6601, OLAND_6601, OLAND)
CHIPSET(0x6602, OLAND_6602, OLAND)
CHIPSET(0x6603, OLAND_6603, OLAND)
CHIPSET(0x6606, OLAND_6606, OLAND)
CHIPSET(0x6607, OLAND_6607, OLAND)
CHIPSET(0x6610, OLAND_6610, OLAND)
CHIPSET(0x6611, OLAND_6611, OLAND)
CHIPSET(0x6613, OLAND_6613, OLAND)
CHIPSET(0x6620, OLAND_6620, OLAND)
CHIPSET(0x6621, OLAND_6621, OLAND)
CHIPSET(0x6623, OLAND_6623, OLAND)
CHIPSET(0x6631, OLAND_6631, OLAND)

View File

@@ -530,7 +530,7 @@ def generate(env):
env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])
env.PkgCheckModules('DRM', ['libdrm >= 2.4.24'])
env.PkgCheckModules('DRM_INTEL', ['libdrm_intel >= 2.4.30'])
env.PkgCheckModules('DRM_RADEON', ['libdrm_radeon >= 2.4.40'])
env.PkgCheckModules('DRM_RADEON', ['libdrm_radeon >= 2.4.42'])
env.PkgCheckModules('XORG', ['xorg-server >= 1.6.0'])
env.PkgCheckModules('KMS', ['libkms >= 2.4.24'])
env.PkgCheckModules('UDEV', ['libudev > 150'])

View File

@@ -446,6 +446,7 @@ dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
{
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw);
__DRIbuffer buffer;
int i, ret = 0;
while (dri2_surf->frame_callback && ret != -1)
@@ -463,6 +464,13 @@ dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
if (dri2_surf->color_buffers[i].age > 0)
dri2_surf->color_buffers[i].age++;
/* Make sure we have a back buffer in case we're swapping without ever
* rendering. */
if (get_back_bo(dri2_surf, &buffer) < 0) {
_eglError(EGL_BAD_ALLOC, "dri2_swap_buffers");
return EGL_FALSE;
}
dri2_surf->back->age = 1;
dri2_surf->current = dri2_surf->back;
dri2_surf->back = NULL;

View File

@@ -421,10 +421,10 @@ util_clear_depth_stencil(struct pipe_context *pipe,
else {
uint32_t dst_mask;
if (format == PIPE_FORMAT_Z24_UNORM_S8_UINT)
dst_mask = 0xffffff00;
dst_mask = 0x00ffffff;
else {
assert(format == PIPE_FORMAT_S8_UINT_Z24_UNORM);
dst_mask = 0xffffff;
dst_mask = 0xffffff00;
}
if (clear_flags & PIPE_CLEAR_DEPTH)
dst_mask = ~dst_mask;

View File

@@ -85,23 +85,30 @@ check_PROGRAMS = \
lp_test_printf
TESTS = $(check_PROGRAMS)
TEST_LIBS = \
libllvmpipe.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
$(LLVM_LIBS) \
$(DLOPEN_LIBS) \
$(PTHREAD_LIBS)
lp_test_format_SOURCES = lp_test_format.c lp_test_main.c
lp_test_format_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_format_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_format_SOURCES = dummy.cpp
lp_test_arit_SOURCES = lp_test_arit.c lp_test_main.c
lp_test_arit_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_arit_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_arit_SOURCES = dummy.cpp
lp_test_blend_SOURCES = lp_test_blend.c lp_test_main.c
lp_test_blend_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_blend_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_blend_SOURCES = dummy.cpp
lp_test_conv_SOURCES = lp_test_conv.c lp_test_main.c
lp_test_conv_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_conv_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_conv_SOURCES = dummy.cpp
lp_test_printf_SOURCES = lp_test_printf.c lp_test_main.c
lp_test_printf_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_printf_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_printf_SOURCES = dummy.cpp

View File

@@ -22,6 +22,7 @@ r300_compiler_tests_CPPFLAGS = \
-I$(top_srcdir)/src/gallium/drivers/r300/compiler
r300_compiler_tests_SOURCES = \
$(testdir)/r300_compiler_tests.c \
$(testdir)/radeon_compiler_optimize_tests.c \
$(testdir)/radeon_compiler_util_tests.c \
$(testdir)/rc_test_helpers.c \
$(testdir)/unit_test.c

View File

@@ -854,7 +854,7 @@ static void rc_emulate_negative_addressing(struct radeon_compiler *compiler, voi
transform_negative_addressing(c, lastARL, inst, min_offset);
}
static struct rc_swizzle_caps r300_vertprog_swizzle_caps = {
struct rc_swizzle_caps r300_vertprog_swizzle_caps = {
.IsNative = &swizzle_is_native,
.Split = 0 /* should never be called */
};

View File

@@ -1,3 +1,30 @@
/*
* Copyright 2010 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include "radeon_program_constants.h"
#ifndef RADEON_PROGRAM_UTIL_H

View File

@@ -1,4 +1,29 @@
/*
* Copyright 2010 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#ifndef RADEON_EMULATE_LOOPS_H
#define RADEON_EMULATE_LOOPS_H

View File

@@ -1,3 +1,27 @@
/*
* Copyright 2012 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include "radeon_compiler.h"
#include "radeon_compiler_util.h"

View File

@@ -816,7 +816,7 @@ static int peephole_mul_omod(
/* Rewrite the instructions */
for (var = writer_list->Item; var; var = var->Friend) {
struct rc_variable * writer = writer_list->Item;
struct rc_variable * writer = var;
unsigned conversion_swizzle = rc_make_conversion_swizzle(
writer->Inst->U.I.DstReg.WriteMask,
inst_mul->U.I.DstReg.WriteMask);

View File

@@ -1,3 +1,29 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include "radeon_compiler.h"
#include "radeon_compiler_util.h"

View File

@@ -1,3 +1,29 @@
/*
* Copyright 2010 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#ifndef RADEON_RENAME_REGS_H
#define RADEON_RENAME_REGS_H

View File

@@ -54,4 +54,6 @@ struct rc_swizzle_caps {
void (*Split)(struct rc_src_register reg, unsigned int mask, struct rc_swizzle_split * split);
};
extern struct rc_swizzle_caps r300_vertprog_swizzle_caps;
#endif /* RADEON_SWIZZLE_H */

View File

@@ -1,3 +1,27 @@
/*
* Copyright 2012 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include "radeon_compiler.h"
#include "radeon_compiler_util.h"

View File

@@ -1,6 +1,43 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include "r300_compiler_tests.h"
#include <stdlib.h>
int main(int argc, char ** argv)
{
radeon_compiler_util_run_tests();
unsigned pass = 1;
pass &= radeon_compiler_optimize_run_tests();
pass &= radeon_compiler_util_run_tests();
if (pass) {
return EXIT_SUCCESS;
} else {
return EXIT_FAILURE;
}
}

View File

@@ -1,2 +1,29 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
void radeon_compiler_util_run_tests(void);
unsigned radeon_compiler_optimize_run_tests(void);
unsigned radeon_compiler_util_run_tests(void);

View File

@@ -0,0 +1,75 @@
/*
* Copyright 2013 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include "radeon_compiler.h"
#include "radeon_dataflow.h"
#include "r300_compiler_tests.h"
#include "rc_test_helpers.h"
#include "unit_test.h"
static void test_runner_rc_optimize(struct test_result * result)
{
struct radeon_compiler c;
struct rc_instruction *inst;
struct rc_instruction *inst_list[3];
unsigned inst_count = 0;
float const0[4] = {2.0f, 0.0f, 0.0f, 0.0f};
unsigned pass = 1;
test_begin(result);
init_compiler(&c, RC_FRAGMENT_PROGRAM, 1, 0);
rc_constants_add_immediate_vec4(&c.Program.Constants, const0);
add_instruction(&c, "RCP temp[0].x, const[1].x___;");
add_instruction(&c, "RCP temp[0].y, const[1]._y__;");
add_instruction(&c, "MUL temp[1].xy, const[0].xx__, temp[0].xy__;");
add_instruction(&c, "MOV output[0].xy, temp[1].xy;" );
rc_optimize(&c, NULL);
for(inst = c.Program.Instructions.Next;
inst != &c.Program.Instructions;
inst = inst->Next, inst_count++) {
inst_list[inst_count] = inst;
}
if (inst_list[0]->U.I.Omod != RC_OMOD_MUL_2 ||
inst_list[1]->U.I.Omod != RC_OMOD_MUL_2 ||
inst_list[2]->U.I.Opcode != RC_OPCODE_MOV) {
pass = 0;
}
test_check(result, pass);
}
unsigned radeon_compiler_optimize_run_tests()
{
struct test tests[] = {
{"rc_optimize() => peephole_mul_omod()", test_runner_rc_optimize},
{NULL, NULL}
};
return run_tests(tests);
}

View File

@@ -1,3 +1,30 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
@@ -67,11 +94,11 @@ static void test_runner_rc_inst_can_use_presub(struct test_result * result)
"MAD temp[0].xyz, temp[2].xyz_, -temp[3].xxx_, input[5].xyz_;");
}
void radeon_compiler_util_run_tests()
unsigned radeon_compiler_util_run_tests()
{
struct test tests[] = {
{"rc_inst_can_use_presub()", test_runner_rc_inst_can_use_presub},
{NULL, NULL}
};
run_tests(tests);
return run_tests(tests);
}

View File

@@ -1,3 +1,32 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
* Copyright 2013 Advanced Micro Devices, Inc.
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include <errno.h>
#include <regex.h>
#include <stdlib.h>
@@ -5,9 +34,14 @@
#include <string.h>
#include <sys/types.h>
#include "../radeon_compiler_util.h"
#include "../radeon_opcodes.h"
#include "../radeon_program.h"
#include "r500_fragprog.h"
#include "r300_fragprog_swizzle.h"
#include "radeon_compiler.h"
#include "radeon_compiler_util.h"
#include "radeon_opcodes.h"
#include "radeon_program.h"
#include "radeon_regalloc.h"
#include "radeon_swizzle.h"
#include "rc_test_helpers.h"
@@ -259,6 +293,7 @@ int init_rc_normal_dst(
if (tokens.WriteMask.Length == 0) {
inst->U.I.DstReg.WriteMask = RC_MASK_XYZW;
} else {
inst->U.I.DstReg.WriteMask = 0;
/* The first character should be '.' */
if (tokens.WriteMask.String[0] != '.') {
fprintf(stderr, "1st char of writemask is not valid.\n");
@@ -311,7 +346,8 @@ struct inst_tokens {
* this string is the same that is output by rc_program_print.
* @return 1 On success, 0 on failure
*/
int init_rc_normal_instruction(
int parse_rc_normal_instruction(
struct rc_instruction * inst,
const char * inst_str)
{
@@ -320,10 +356,6 @@ int init_rc_normal_instruction(
regmatch_t matches[REGEX_INST_MATCHES];
struct inst_tokens tokens;
/* Initialize inst */
memset(inst, 0, sizeof(struct rc_instruction));
inst->Type = RC_INSTRUCTION_NORMAL;
/* Execute the regex */
if (!regex_helper(regex_str, inst_str, matches, REGEX_INST_MATCHES)) {
return 0;
@@ -340,6 +372,8 @@ int init_rc_normal_instruction(
/* Fill out the rest of the instruction. */
inst->Type = RC_INSTRUCTION_NORMAL;
for (i = 0; i < MAX_RC_OPCODE; i++) {
const struct rc_opcode_info * info = rc_get_opcode_info(i);
unsigned int first_src = 3;
@@ -378,3 +412,47 @@ int init_rc_normal_instruction(
}
return 1;
}
int init_rc_normal_instruction(
struct rc_instruction * inst,
const char * inst_str)
{
/* Initialize inst */
memset(inst, 0, sizeof(struct rc_instruction));
return parse_rc_normal_instruction(inst, inst_str);
}
void add_instruction(struct radeon_compiler *c, const char * inst_string)
{
struct rc_instruction * new_inst =
rc_insert_new_instruction(c, c->Program.Instructions.Prev);
parse_rc_normal_instruction(new_inst, inst_string);
}
void init_compiler(
struct radeon_compiler *c,
enum rc_program_type program_type,
unsigned is_r500,
unsigned is_r400)
{
struct rc_regalloc_state *rs = malloc(sizeof(struct rc_regalloc_state));
rc_init(c, rs);
c->is_r500 = is_r500;
c->max_temp_regs = is_r500 ? 128 : (is_r400 ? 64 : 32);
c->max_constants = is_r500 ? 256 : 32;
c->max_alu_insts = (is_r500 || is_r400) ? 512 : 64;
c->max_tex_insts = (is_r500 || is_r400) ? 512 : 32;
if (program_type == RC_FRAGMENT_PROGRAM) {
c->has_half_swizzles = 1;
c->has_presub = 1;
c->has_omod = 1;
c->SwizzleCaps =
is_r500 ? &r500_swizzle_caps : &r300_swizzle_caps;
} else {
c->SwizzleCaps = &r300_vertprog_swizzle_caps;
}
}

View File

@@ -1,3 +1,33 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
* Copyright 2013 Advanced Micro Devices, Inc.
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include "radeon_compiler.h"
int init_rc_normal_src(
struct rc_instruction * inst,
@@ -8,6 +38,18 @@ int init_rc_normal_dst(
struct rc_instruction * inst,
const char * dst_str);
int parse_rc_normal_instruction(
struct rc_instruction * inst,
const char * inst_str);
int init_rc_normal_instruction(
struct rc_instruction * inst,
const char * inst_str);
void add_instruction(struct radeon_compiler *c, const char * inst_string);
void init_compiler(
struct radeon_compiler *c,
enum rc_program_type program_type,
unsigned is_r500,
unsigned is_r400);

View File

@@ -1,19 +1,51 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "unit_test.h"
void run_tests(struct test tests[])
unsigned run_tests(struct test tests[])
{
int i;
unsigned pass = 1;
for (i = 0; tests[i].name; i++) {
printf("Test %s\n", tests[i].name);
memset(&tests[i].result, 0, sizeof(tests[i].result));
tests[i].test_func(&tests[i].result);
printf("Test %s (%d/%d) pass\n", tests[i].name,
tests[i].result.pass, tests[i].result.test_count);
if (tests[i].result.pass != tests[i].result.test_count) {
pass = 0;
}
}
return pass;
}
void test_begin(struct test_result * result)

View File

@@ -1,3 +1,29 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
struct test_result {
unsigned int test_count;
@@ -11,7 +37,7 @@ struct test {
struct test_result result;
};
void run_tests(struct test tests[]);
unsigned run_tests(struct test tests[]);
void test_begin(struct test_result * result);
void test_check(struct test_result * result, int cond);

View File

@@ -487,6 +487,7 @@ static void r300_set_blend_color(struct pipe_context* pipe,
(struct r300_blend_color_state*)r300->blend_color_state.state;
struct pipe_blend_color c;
enum pipe_format format = fb->nr_cbufs ? fb->cbufs[0]->format : 0;
float tmp;
CB_LOCALS;
state->state = *color; /* Save it, so that we can reuse it in set_fb_state */
@@ -513,6 +514,13 @@ static void r300_set_blend_color(struct pipe_context* pipe,
c.color[2] = c.color[3];
break;
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
tmp = c.color[0];
c.color[0] = c.color[2];
c.color[2] = tmp;
break;
default:;
}
}
@@ -919,6 +927,9 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
/* Need to reset clamping or colormask. */
r300_mark_atom_dirty(r300, &r300->blend_state);
/* Re-swizzle the blend color. */
r300_set_blend_color(pipe, &((struct r300_blend_color_state*)r300->blend_color_state.state)->state);
/* If zsbuf is set from NULL to non-NULL or vice versa.. */
if (!!old_state->zsbuf != !!state->zsbuf) {
r300_mark_atom_dirty(r300, &r300->dsa_state);

View File

@@ -978,9 +978,9 @@ r300_texture_create_object(struct r300_screen *rscreen,
tex->tex.microtile = microtile;
tex->tex.macrotile[0] = macrotile;
tex->tex.stride_in_bytes_override = stride_in_bytes_override;
tex->domain = base->flags & R300_RESOURCE_FLAG_TRANSFER ?
RADEON_DOMAIN_GTT :
RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT;
tex->domain = base->flags & R300_RESOURCE_FLAG_TRANSFER ? RADEON_DOMAIN_GTT :
base->nr_samples > 1 ? RADEON_DOMAIN_VRAM :
RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT;
tex->buf = buffer;
r300_texture_desc_init(rscreen, tex, base);

View File

@@ -243,9 +243,9 @@ void evergreen_set_streamout_enable(struct r600_context *ctx, unsigned buffer_en
void evergreen_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size)
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size)
{
struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
unsigned i, ncopy, csize, sub_cmd, shift;

View File

@@ -1668,6 +1668,8 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
surf = (struct r600_surface*)state->cbufs[i];
rtex = (struct r600_texture*)surf->base.texture;
r600_context_add_resource_size(ctx, state->cbufs[i]->texture);
if (!surf->color_initialized) {
evergreen_init_color_surface(rctx, surf);
}
@@ -1699,6 +1701,8 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
if (state->zsbuf) {
surf = (struct r600_surface*)state->zsbuf;
r600_context_add_resource_size(ctx, state->zsbuf->texture);
if (!surf->depth_initialized) {
evergreen_init_depth_surface(rctx, surf);
}
@@ -2221,6 +2225,13 @@ static void evergreen_emit_db_misc_state(struct r600_context *rctx, struct r600_
if (rctx->db_state.rsurf && rctx->db_state.rsurf->htile_enabled) {
/* FORCE_OFF means HiZ/HiS are determined by DB_SHADER_CONTROL */
db_render_override |= S_02800C_FORCE_HIZ_ENABLE(V_02800C_FORCE_OFF);
/* This is to fix a lockup when hyperz and alpha test are enabled at
* the same time somehow GPU get confuse on which order to pick for
* z test
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_render_override |= S_02800C_FORCE_SHADER_Z_ORDER(1);
}
} else {
db_render_override |= S_02800C_FORCE_HIZ_ENABLE(V_02800C_FORCE_DISABLE);
}
@@ -3210,7 +3221,7 @@ void evergreen_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_pipe_state *rstate = &shader->rstate;
struct r600_shader *rshader = &shader->shader;
unsigned i, exports_ps, num_cout, spi_ps_in_control_0, spi_input_z, spi_ps_in_control_1, db_shader_control;
unsigned i, exports_ps, num_cout, spi_ps_in_control_0, spi_input_z, spi_ps_in_control_1, db_shader_control = 0;
int pos_index = -1, face_index = -1;
int ninterp = 0;
boolean have_linear = FALSE, have_centroid = FALSE, have_perspective = FALSE;
@@ -3220,7 +3231,6 @@ void evergreen_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader
rstate->nregs = 0;
db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
for (i = 0; i < rshader->ninput; i++) {
/* evergreen NUM_INTERP only contains values interpolated into the LDS,
POSITION goes via GPRs from the SC so isn't counted */
@@ -3454,6 +3464,24 @@ void evergreen_update_db_shader_control(struct r600_context * rctx)
V_02880C_EXPORT_DB_FULL) |
S_02880C_ALPHA_TO_MASK_DISABLE(rctx->framebuffer.cb0_is_integer);
/* When alpha test is enabled we can't trust the hw to make the proper
* decision on the order in which ztest should be run related to fragment
* shader execution.
*
* If alpha test is enabled perform early z rejection (RE_Z) but don't early
* write to the zbuffer. Write to zbuffer is delayed after fragment shader
* execution and thus after alpha test so if discarded by the alpha test
* the z value is not written.
* If ReZ is enabled, and the zfunc/zenable/zwrite values change you can
* get a hang unless you flush the DB in between. For now just use
* LATE_Z.
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
} else {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
}
if (db_shader_control != rctx->db_misc_state.db_shader_control) {
rctx->db_misc_state.db_shader_control = db_shader_control;
rctx->db_misc_state.atom.dirty = true;
@@ -3481,7 +3509,7 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
unsigned array_mode, lbpp, pitch_tile_max, slice_tile_max, size;
unsigned ncopy, height, cheight, detile, i, x, y, z, src_mode, dst_mode;
unsigned sub_cmd, bank_h, bank_w, mt_aspect, nbanks, tile_split;
unsigned long base, addr;
uint64_t base, addr;
/* make sure that the dma ring is only one active */
rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC);
@@ -3502,7 +3530,8 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
if (dst_mode == RADEON_SURF_MODE_LINEAR) {
/* T2L */
array_mode = evergreen_array_mode(src_mode);
slice_tile_max = (((pitch * rsrc->surface.level[src_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rsrc->surface.level[src_level].nblk_x * rsrc->surface.level[src_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3526,7 +3555,8 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
} else {
/* L2T */
array_mode = evergreen_array_mode(dst_mode);
slice_tile_max = (((pitch * rdst->surface.level[dst_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rdst->surface.level[dst_level].nblk_x * rdst->surface.level[dst_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3625,7 +3655,7 @@ boolean evergreen_dma_blit(struct pipe_context *ctx,
}
if (src_mode == dst_mode) {
unsigned long dst_offset, src_offset;
uint64_t dst_offset, src_offset;
/* simple dma blit would do NOTE code here assume :
* src_box.x/y == 0
* dst_x/y == 0

View File

@@ -151,6 +151,7 @@ struct r600_so_target {
#define R600_CONTEXT_WAIT_CP_DMA_IDLE (1 << 3)
#define R600_CONTEXT_FLUSH_AND_INV (1 << 4)
#define R600_CONTEXT_FLUSH_AND_INV_CB_META (1 << 5)
#define R600_CONTEXT_PS_PARTIAL_FLUSH (1 << 6)
struct r600_context;
struct r600_screen;
@@ -174,9 +175,9 @@ void r600_need_dma_space(struct r600_context *ctx, unsigned num_dw);
void r600_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size);
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size);
boolean r600_dma_blit(struct pipe_context *ctx,
struct pipe_resource *dst,
unsigned dst_level,
@@ -187,9 +188,9 @@ boolean r600_dma_blit(struct pipe_context *ctx,
void evergreen_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size);
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size);
boolean evergreen_dma_blit(struct pipe_context *ctx,
struct pipe_resource *dst,
unsigned dst_level,

View File

@@ -68,13 +68,17 @@ static inline unsigned int r600_bytecode_get_num_operands(struct r600_bytecode *
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MAX_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MIN_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_UINT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_UINT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_PRED_SETE:
@@ -150,13 +154,17 @@ static inline unsigned int r600_bytecode_get_num_operands(struct r600_bytecode *
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MAX_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MIN_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_UINT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_UINT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_PRED_SETE:

View File

@@ -359,6 +359,16 @@ out_err:
void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw,
boolean count_draw_in)
{
if (!ctx->ws->cs_memory_below_limit(ctx->rings.gfx.cs, ctx->vram, ctx->gtt)) {
ctx->gtt = 0;
ctx->vram = 0;
ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC);
return;
}
/* all will be accounted once relocation are emited */
ctx->gtt = 0;
ctx->vram = 0;
/* The number of dwords we already used in the CS so far. */
num_dw += ctx->rings.gfx.cs->cdw;
@@ -623,11 +633,15 @@ void r600_flush_emit(struct r600_context *rctx)
/* Use of WAIT_UNTIL is deprecated on Cayman+ */
if (rctx->family >= CHIP_CAYMAN) {
/* emit a PS partial flush on Cayman/TN */
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
cs->buf[cs->cdw++] = EVENT_TYPE(EVENT_TYPE_PS_PARTIAL_FLUSH) | EVENT_INDEX(4);
rctx->flags |= R600_CONTEXT_PS_PARTIAL_FLUSH;
}
}
if (rctx->flags & R600_CONTEXT_PS_PARTIAL_FLUSH) {
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
cs->buf[cs->cdw++] = EVENT_TYPE(EVENT_TYPE_PS_PARTIAL_FLUSH) | EVENT_INDEX(4);
}
if (rctx->chip_class >= R700 &&
(rctx->flags & R600_CONTEXT_FLUSH_AND_INV_CB_META)) {
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
@@ -784,6 +798,8 @@ void r600_begin_new_cs(struct r600_context *ctx)
ctx->pm4_dirty_cdwords = 0;
ctx->flags = 0;
ctx->gtt = 0;
ctx->vram = 0;
/* Begin a new CS. */
r600_emit_command_buffer(ctx->rings.gfx.cs, &ctx->start_cs_cmd);
@@ -1145,6 +1161,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
src_offset += byte_count;
dst_offset += byte_count;
}
/* Invalidate the read caches. */
rctx->flags |= R600_CONTEXT_INVAL_READ_CACHES;
}
void r600_need_dma_space(struct r600_context *ctx, unsigned num_dw)
@@ -1160,9 +1179,9 @@ void r600_need_dma_space(struct r600_context *ctx, unsigned num_dw)
void r600_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size)
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size)
{
struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
unsigned i, ncopy, csize, shift;

View File

@@ -537,6 +537,7 @@ const char * r600_llvm_gpu_string(enum radeon_family family)
case CHIP_RV630:
case CHIP_RV620:
case CHIP_RV635:
case CHIP_RV670:
case CHIP_RS780:
case CHIP_RS880:
gpu_family = "r600";
@@ -547,7 +548,6 @@ const char * r600_llvm_gpu_string(enum radeon_family family)
case CHIP_RV730:
gpu_family = "rv730";
break;
case CHIP_RV670:
case CHIP_RV740:
case CHIP_RV770:
gpu_family = "rv770";

View File

@@ -447,6 +447,10 @@ struct r600_context {
unsigned backend_mask;
unsigned max_db; /* for OQ */
/* current unaccounted memory usage */
uint64_t vram;
uint64_t gtt;
/* Miscellaneous state objects. */
void *custom_dsa_flush;
void *custom_blend_resolve;
@@ -869,9 +873,11 @@ static INLINE unsigned r600_context_bo_reloc(struct r600_context *ctx,
* look serialized from driver pov
*/
if (!ring->flushing) {
if (ring == &ctx->rings.gfx && ctx->rings.dma.cs) {
/* flush dma ring */
ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC);
if (ring == &ctx->rings.gfx) {
if (ctx->rings.dma.cs) {
/* flush dma ring */
ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC);
}
} else {
/* flush gfx ring */
ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC);
@@ -996,4 +1002,28 @@ static INLINE unsigned u_max_layer(struct pipe_resource *r, unsigned level)
}
}
static INLINE void r600_context_add_resource_size(struct pipe_context *ctx, struct pipe_resource *r)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_resource *rr = (struct r600_resource *)r;
if (r == NULL) {
return;
}
/*
* The idea is to compute a gross estimate of memory requirement of
* each draw call. After each draw call, memory will be precisely
* accounted. So the uncertainty is only on the current draw call.
* In practice this gave very good estimate (+/- 10% of the target
* memory limit).
*/
if (rr->domains & RADEON_DOMAIN_GTT) {
rctx->gtt += rr->buf->size;
}
if (rr->domains & RADEON_DOMAIN_VRAM) {
rctx->vram += rr->buf->size;
}
}
#endif

View File

@@ -1544,6 +1544,7 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
surf = (struct r600_surface*)state->cbufs[i];
rtex = (struct r600_texture*)surf->base.texture;
r600_context_add_resource_size(ctx, state->cbufs[i]->texture);
if (!surf->color_initialized || force_cmask_fmask) {
r600_init_color_surface(rctx, surf, force_cmask_fmask);
@@ -1576,6 +1577,8 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
if (state->zsbuf) {
surf = (struct r600_surface*)state->zsbuf;
r600_context_add_resource_size(ctx, state->zsbuf->texture);
if (!surf->depth_initialized) {
r600_init_depth_surface(rctx, surf);
}
@@ -1937,6 +1940,13 @@ static void r600_emit_db_misc_state(struct r600_context *rctx, struct r600_atom
if (rctx->db_state.rsurf && rctx->db_state.rsurf->htile_enabled) {
/* FORCE_OFF means HiZ/HiS are determined by DB_SHADER_CONTROL */
db_render_override |= S_028D10_FORCE_HIZ_ENABLE(V_028D10_FORCE_OFF);
/* This is to fix a lockup when hyperz and alpha test are enabled at
* the same time somehow GPU get confuse on which order to pick for
* z test
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_render_override |= S_028D10_FORCE_SHADER_Z_ORDER(1);
}
} else {
db_render_override |= S_028D10_FORCE_HIZ_ENABLE(V_028D10_FORCE_DISABLE);
}
@@ -2745,7 +2755,7 @@ void r600_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader *shad
tmp);
}
db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
db_shader_control = 0;
for (i = 0; i < rshader->noutput; i++) {
if (rshader->output[i].name == TGSI_SEMANTIC_POSITION)
z_export = 1;
@@ -2940,6 +2950,19 @@ void r600_update_db_shader_control(struct r600_context * rctx)
unsigned db_shader_control = rctx->ps_shader->current->db_shader_control |
S_02880C_DUAL_EXPORT_ENABLE(dual_export);
/* When alpha test is enabled we can't trust the hw to make the proper
* decision on the order in which ztest should be run related to fragment
* shader execution.
*
* If alpha test is enabled perform z test after fragment. RE_Z (early
* z test but no write to the zbuffer) seems to cause lockup on r6xx/r7xx
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
} else {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
}
if (db_shader_control != rctx->db_misc_state.db_shader_control) {
rctx->db_misc_state.db_shader_control = db_shader_control;
rctx->db_misc_state.atom.dirty = true;
@@ -2979,7 +3002,7 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
struct r600_texture *rdst = (struct r600_texture*)dst;
unsigned array_mode, lbpp, pitch_tile_max, slice_tile_max, size;
unsigned ncopy, height, cheight, detile, i, x, y, z, src_mode, dst_mode;
unsigned long base, addr;
uint64_t base, addr;
/* make sure that the dma ring is only one active */
rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC);
@@ -2998,7 +3021,8 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
if (dst_mode == RADEON_SURF_MODE_LINEAR) {
/* T2L */
array_mode = r600_array_mode(src_mode);
slice_tile_max = (((pitch * rsrc->surface.level[src_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rsrc->surface.level[src_level].nblk_x * rsrc->surface.level[src_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3016,7 +3040,8 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
} else {
/* L2T */
array_mode = r600_array_mode(dst_mode);
slice_tile_max = (((pitch * rdst->surface.level[dst_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rdst->surface.level[dst_level].nblk_x * rdst->surface.level[dst_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3037,14 +3062,15 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
return FALSE;
}
size = (copy_height * pitch) >> 2;
ncopy = (size / 0x0000ffff) + !!(size % 0x0000ffff);
/* It's a r6xx/r7xx limitation, the blit must be on 8 boundary for number
* line in the blit. Compute max 8 line we can copy in the size limit
*/
cheight = ((0x0000ffff << 2) / pitch) & 0xfffffff8;
ncopy = (copy_height / cheight) + !!(copy_height % cheight);
r600_need_dma_space(rctx, ncopy * 7);
for (i = 0; i < ncopy; i++) {
cheight = copy_height;
if (((cheight * pitch) >> 2) > 0x0000ffff) {
cheight = (0x0000ffff << 2) / pitch;
}
cheight = cheight > copy_height ? copy_height : cheight;
size = (cheight * pitch) >> 2;
/* emit reloc before writting cs so that cs is always in consistent state */
r600_context_bo_reloc(rctx, &rctx->rings.dma, &rsrc->resource, RADEON_USAGE_READ);
@@ -3109,7 +3135,7 @@ boolean r600_dma_blit(struct pipe_context *ctx,
}
if (src_mode == dst_mode) {
unsigned long dst_offset, src_offset, size;
uint64_t dst_offset, src_offset, size;
/* simple dma blit would do NOTE code here assume :
* src_box.x/y == 0

View File

@@ -293,6 +293,11 @@ static void r600_bind_dsa_state(struct pipe_context *ctx, void *state)
rctx->alphatest_state.sx_alpha_test_control = dsa->sx_alpha_test_control;
rctx->alphatest_state.sx_alpha_ref = dsa->alpha_ref;
rctx->alphatest_state.atom.dirty = true;
if (rctx->chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
}
}
@@ -479,7 +484,8 @@ static void r600_set_index_buffer(struct pipe_context *ctx,
if (ib) {
pipe_resource_reference(&rctx->index_buffer.buffer, ib->buffer);
memcpy(&rctx->index_buffer, ib, sizeof(*ib));
memcpy(&rctx->index_buffer, ib, sizeof(*ib));
r600_context_add_resource_size(ctx, ib->buffer);
} else {
pipe_resource_reference(&rctx->index_buffer.buffer, NULL);
}
@@ -516,6 +522,7 @@ static void r600_set_vertex_buffers(struct pipe_context *ctx,
vb[i].buffer_offset = input[i].buffer_offset;
pipe_resource_reference(&vb[i].buffer, input[i].buffer);
new_buffer_mask |= 1 << i;
r600_context_add_resource_size(ctx, input[i].buffer);
} else {
pipe_resource_reference(&vb[i].buffer, NULL);
disable_mask |= 1 << i;
@@ -613,6 +620,7 @@ static void r600_set_sampler_views(struct pipe_context *pipe, unsigned shader,
pipe_sampler_view_reference((struct pipe_sampler_view **)&dst->views.views[i], views[i]);
new_mask |= 1 << i;
r600_context_add_resource_size(pipe, views[i]->texture);
} else {
pipe_sampler_view_reference((struct pipe_sampler_view **)&dst->views.views[i], NULL);
disable_mask |= 1 << i;
@@ -806,6 +814,8 @@ static void r600_bind_ps_state(struct pipe_context *ctx, void *state)
rctx->ps_shader = (struct r600_pipe_shader_selector *)state;
r600_context_pipe_state_set(rctx, &rctx->ps_shader->current->rstate);
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->ps_shader->current->bo);
if (rctx->chip_class <= R700) {
bool multiwrite = rctx->ps_shader->current->shader.fs_write_all;
@@ -835,6 +845,8 @@ static void r600_bind_vs_state(struct pipe_context *ctx, void *state)
if (state) {
r600_context_pipe_state_set(rctx, &rctx->vs_shader->current->rstate);
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->vs_shader->current->bo);
/* Update clip misc state. */
if (rctx->vs_shader->current->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->vs_shader->current->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
@@ -938,10 +950,13 @@ static void r600_set_constant_buffer(struct pipe_context *ctx, uint shader, uint
} else {
u_upload_data(rctx->uploader, 0, input->buffer_size, ptr, &cb->buffer_offset, &cb->buffer);
}
/* account it in gtt */
rctx->gtt += input->buffer_size;
} else {
/* Setup the hw buffer. */
cb->buffer_offset = input->buffer_offset;
pipe_resource_reference(&cb->buffer, input->buffer);
r600_context_add_resource_size(ctx, input->buffer);
}
state->enabled_mask |= 1 << index;
@@ -1004,6 +1019,7 @@ static void r600_set_so_targets(struct pipe_context *ctx,
/* Set the new targets. */
for (i = 0; i < num_targets; i++) {
pipe_so_target_reference((struct pipe_stream_output_target**)&rctx->so_targets[i], targets[i]);
r600_context_add_resource_size(ctx, targets[i]->buffer);
}
for (; i < rctx->num_so_targets; i++) {
pipe_so_target_reference((struct pipe_stream_output_target**)&rctx->so_targets[i], NULL);
@@ -1343,6 +1359,12 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
rctx->vgt_state.atom.dirty = true;
}
/* Workaround for hardware deadlock on certain R600 ASICs: write into a CB register. */
if (rctx->chip_class == R600) {
rctx->flags |= R600_CONTEXT_PS_PARTIAL_FLUSH;
rctx->cb_misc_state.atom.dirty = true;
}
/* Emit states. */
r600_need_cs_space(rctx, ib.user_buffer ? 5 : 0, TRUE);
r600_flush_emit(rctx);

View File

@@ -270,6 +270,7 @@ static void r600_texture_destroy(struct pipe_screen *screen,
if (rtex->flushed_depth_texture)
pipe_resource_reference((struct pipe_resource **)&rtex->flushed_depth_texture, NULL);
pipe_resource_reference((struct pipe_resource**)&rtex->htile, NULL);
pb_reference(&resource->buf, NULL);
FREE(rtex);
}

View File

@@ -155,7 +155,7 @@ static inline LLVMValueRef bitcast(
void radeon_llvm_emit_prepare_cube_coords(struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data,
unsigned coord_arg);
LLVMValueRef *coords_arg);
void radeon_llvm_context_init(struct radeon_llvm_context * ctx);

View File

@@ -531,7 +531,7 @@ static void kil_emit(
void radeon_llvm_emit_prepare_cube_coords(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data,
unsigned coord_arg)
LLVMValueRef *coords_arg)
{
unsigned target = emit_data->inst->Texture.Texture;
@@ -542,11 +542,13 @@ void radeon_llvm_emit_prepare_cube_coords(
LLVMValueRef coords[4];
LLVMValueRef mad_args[3];
LLVMValueRef idx;
struct LLVMOpaqueValue *cube_vec;
LLVMValueRef v;
unsigned i;
LLVMValueRef v = build_intrinsic(builder, "llvm.AMDGPU.cube",
LLVMVectorType(type, 4),
&emit_data->args[coord_arg], 1, LLVMReadNoneAttribute);
cube_vec = lp_build_gather_values(bld_base->base.gallivm, coords_arg, 4);
v = build_intrinsic(builder, "llvm.AMDGPU.cube", LLVMVectorType(type, 4),
&cube_vec, 1, LLVMReadNoneAttribute);
for (i = 0; i < 4; ++i) {
idx = lp_build_const_int32(gallivm, i);
@@ -579,18 +581,14 @@ void radeon_llvm_emit_prepare_cube_coords(
if (target != TGSI_TEXTURE_CUBE ||
opcode != TGSI_OPCODE_TEX) {
/* load source coord.w component - array_index for cube arrays or
* compare value for SHADOWCUBE */
idx = lp_build_const_int32(gallivm, 3);
coords[3] = LLVMBuildExtractElement(builder,
emit_data->args[coord_arg], idx, "");
/* for cube arrays coord.z = coord.w(array_index) * 8 + face */
if (target == TGSI_TEXTURE_CUBE_ARRAY ||
target == TGSI_TEXTURE_SHADOWCUBE_ARRAY) {
/* coords_arg.w component - array_index for cube arrays or
* compare value for SHADOWCUBE */
coords[2] = lp_build_emit_llvm_ternary(bld_base, TGSI_OPCODE_MAD,
coords[3], lp_build_const_float(gallivm, 8.0), coords[2]);
coords_arg[3], lp_build_const_float(gallivm, 8.0), coords[2]);
}
/* for instructions that need additional src (compare/lod/bias),
@@ -598,12 +596,11 @@ void radeon_llvm_emit_prepare_cube_coords(
if (opcode == TGSI_OPCODE_TEX2 ||
opcode == TGSI_OPCODE_TXB2 ||
opcode == TGSI_OPCODE_TXL2) {
coords[3] = emit_data->args[coord_arg + 1];
coords[3] = coords_arg[4];
}
}
emit_data->args[coord_arg] =
lp_build_gather_values(bld_base->base.gallivm, coords, 4);
memcpy(coords_arg, coords, sizeof(coords));
}
static void txd_fetch_args(
@@ -645,9 +642,6 @@ static void txp_fetch_args(
TGSI_OPCODE_DIV, arg, src_w);
}
coords[3] = bld_base->base.one;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->arg_count = 1;
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
inst->Texture.Texture == TGSI_TEXTURE_CUBE_ARRAY ||
@@ -655,8 +649,12 @@ static void txp_fetch_args(
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE_ARRAY) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ_LZ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 0);
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
}
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->arg_count = 1;
}
static void tex_fetch_args(
@@ -673,17 +671,12 @@ static void tex_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
unsigned chan;
for (chan = 0; chan < 4; chan++) {
coords[chan] = lp_build_emit_fetch(bld_base, inst, 0, chan);
}
emit_data->arg_count = 1;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
if (inst->Instruction.Opcode == TGSI_OPCODE_TEX2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXB2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXL2) {
@@ -692,7 +685,7 @@ static void tex_fetch_args(
* That operand should be passed as a float value in the args array
* right after the coord vector. After packing it's not used anymore,
* that's why arg_count is not increased */
emit_data->args[1] = lp_build_emit_fetch(bld_base, inst, 1, 0);
coords[4] = lp_build_emit_fetch(bld_base, inst, 1, 0);
}
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
@@ -701,8 +694,13 @@ static void tex_fetch_args(
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE_ARRAY) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ_LZ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 0);
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
}
emit_data->arg_count = 1;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
}
static void txf_fetch_args(

View File

@@ -98,21 +98,6 @@ static void r600_blitter_end(struct pipe_context *ctx)
r600_context_queries_resume(rctx);
}
static unsigned u_max_layer(struct pipe_resource *r, unsigned level)
{
switch (r->target) {
case PIPE_TEXTURE_CUBE:
return 6 - 1;
case PIPE_TEXTURE_3D:
return u_minify(r->depth0, level) - 1;
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
return r->array_size - 1;
default:
return 0;
}
}
void si_blit_uncompress_depth(struct pipe_context *ctx,
struct r600_resource_texture *texture,
struct r600_resource_texture *staging,

View File

@@ -55,11 +55,8 @@ static void r600_copy_from_staging_texture(struct pipe_context *ctx, struct r600
struct pipe_resource *texture = transfer->resource;
struct pipe_box sbox;
sbox.x = sbox.y = sbox.z = 0;
sbox.width = transfer->box.width;
sbox.height = transfer->box.height;
/* XXX that might be wrong */
sbox.depth = 1;
u_box_3d(0, 0, 0, transfer->box.width, transfer->box.height, transfer->box.depth, &sbox);
ctx->resource_copy_region(ctx, texture, transfer->level,
transfer->box.x, transfer->box.y, transfer->box.z,
rtransfer->staging,
@@ -153,8 +150,7 @@ static int r600_init_surface(struct r600_screen *rscreen,
surface->flags |= RADEON_SURF_SCANOUT;
}
if ((ptex->bind & PIPE_BIND_DEPTH_STENCIL) &&
!is_flushed_depth && is_depth) {
if (!is_flushed_depth && is_depth) {
surface->flags |= RADEON_SURF_ZBUFFER;
if (is_stencil) {
@@ -239,7 +235,6 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_resource_texture *rtex = (struct r600_resource_texture*)texture;
struct pipe_resource resource;
struct r600_transfer *trans;
boolean use_staging_texture = FALSE;
struct radeon_winsys_cs_handle *buf;
@@ -299,42 +294,52 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
level, level,
box->z, box->z + box->depth - 1);
trans->transfer.stride = staging_depth->surface.level[level].pitch_bytes;
trans->transfer.layer_stride = staging_depth->surface.level[level].slice_size;
trans->offset = r600_texture_get_offset(staging_depth, level, box->z);
trans->staging = &staging_depth->resource.b.b;
} else if (use_staging_texture) {
resource.target = PIPE_TEXTURE_2D;
struct pipe_resource resource;
struct r600_resource_texture *staging;
memset(&resource, 0, sizeof(resource));
resource.format = texture->format;
resource.width0 = box->width;
resource.height0 = box->height;
resource.depth0 = 1;
resource.array_size = 1;
resource.last_level = 0;
resource.nr_samples = 0;
resource.usage = PIPE_USAGE_STAGING;
resource.bind = 0;
resource.flags = R600_RESOURCE_FLAG_TRANSFER;
/* For texture reading, the temporary (detiled) texture is used as
* a render target when blitting from a tiled texture. */
if (usage & PIPE_TRANSFER_READ) {
resource.bind |= PIPE_BIND_RENDER_TARGET;
}
/* For texture writing, the temporary texture is used as a sampler
* when blitting into a tiled texture. */
if (usage & PIPE_TRANSFER_WRITE) {
resource.bind |= PIPE_BIND_SAMPLER_VIEW;
/* We must set the correct texture target and dimensions if needed for a 3D transfer. */
if (box->depth > 1 && u_max_layer(texture, level) > 0)
resource.target = texture->target;
else
resource.target = PIPE_TEXTURE_2D;
switch (resource.target) {
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
case PIPE_TEXTURE_CUBE_ARRAY:
resource.array_size = box->depth;
break;
case PIPE_TEXTURE_3D:
resource.depth0 = box->depth;
break;
default:;
}
/* Create the temporary texture. */
trans->staging = ctx->screen->resource_create(ctx->screen, &resource);
if (trans->staging == NULL) {
staging = (struct r600_resource_texture*)ctx->screen->resource_create(ctx->screen, &resource);
if (staging == NULL) {
R600_ERR("failed to create temporary texture to hold untiled copy\n");
pipe_resource_reference(&trans->transfer.resource, NULL);
FREE(trans);
return NULL;
}
trans->transfer.stride = ((struct r600_resource_texture *)trans->staging)
->surface.level[0].pitch_bytes;
trans->staging = &staging->resource.b.b;
trans->transfer.stride = staging->surface.level[0].pitch_bytes;
trans->transfer.layer_stride = staging->surface.level[0].slice_size;
if (usage & PIPE_TRANSFER_READ) {
r600_copy_to_staging_texture(ctx, trans);
/* Always referenced in the blit. */
@@ -349,7 +354,7 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
if (trans->staging) {
buf = si_resource(trans->staging)->cs_buf;
} else {
buf = si_resource(trans->transfer.resource)->cs_buf;
buf = rtex->resource.cs_buf;
}
if (rtex->is_depth || !trans->staging)
@@ -549,6 +554,8 @@ static struct pipe_surface *r600_create_surface(struct pipe_context *pipe,
struct r600_surface *surface = CALLOC_STRUCT(r600_surface);
unsigned level = surf_tmpl->u.tex.level;
assert(surf_tmpl->u.tex.first_layer <= u_max_layer(texture, surf_tmpl->u.tex.level));
assert(surf_tmpl->u.tex.last_layer <= u_max_layer(texture, surf_tmpl->u.tex.level));
assert(surf_tmpl->u.tex.first_layer == surf_tmpl->u.tex.last_layer);
if (surface == NULL)
return NULL;

View File

@@ -280,6 +280,7 @@ static const char *r600_get_family_name(enum radeon_family family)
case CHIP_TAHITI: return "AMD TAHITI";
case CHIP_PITCAIRN: return "AMD PITCAIRN";
case CHIP_VERDE: return "AMD CAPE VERDE";
case CHIP_OLAND: return "AMD OLAND";
default: return "AMD unknown";
}
}
@@ -379,7 +380,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS:
return 15;
case PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS:
return /*rscreen->info.drm_minor >= 9 ? 16384 :*/ 0;
return 16384;
case PIPE_CAP_MAX_COMBINED_SAMPLERS:
return 32;
@@ -458,7 +459,7 @@ static int r600_get_shader_param(struct pipe_screen* pscreen, unsigned shader, e
/* FIXME Isn't this equal to TEMPS? */
return 1; /* Max native address registers */
case PIPE_SHADER_CAP_MAX_CONSTS:
return 64;
return 4096; /* actually only memory limits this */
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
return 1;
case PIPE_SHADER_CAP_MAX_PREDS:

View File

@@ -277,4 +277,20 @@ static INLINE uint64_t r600_resource_va(struct pipe_screen *screen, struct pipe_
return rscreen->ws->buffer_get_virtual_address(rresource->cs_buf);
}
static INLINE unsigned u_max_layer(struct pipe_resource *r, unsigned level)
{
switch (r->target) {
case PIPE_TEXTURE_CUBE:
return 6 - 1;
case PIPE_TEXTURE_3D:
return u_minify(r->depth0, level) - 1;
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
case PIPE_TEXTURE_CUBE_ARRAY:
return r->array_size - 1;
default:
return 0;
}
}
#endif

View File

@@ -263,6 +263,14 @@ static void declare_input_fs(
build_intrinsic(base->gallivm->builder,
"llvm.SI.fs.read.pos", input_type,
args, 1, LLVMReadNoneAttribute);
if (chan == 3)
/* RCP for fragcoord.w */
si_shader_ctx->radeon_bld.inputs[soa_index] =
LLVMBuildFDiv(gallivm->builder,
lp_build_const_float(gallivm, 1.0f),
si_shader_ctx->radeon_bld.inputs[soa_index],
"");
}
return;
}
@@ -433,6 +441,15 @@ static LLVMValueRef fetch_constant(
LLVMValueRef offset;
LLVMValueRef load;
if (swizzle == LP_CHAN_ALL) {
unsigned chan;
LLVMValueRef values[4];
for (chan = 0; chan < TGSI_NUM_CHANNELS; ++chan)
values[chan] = fetch_constant(bld_base, reg, type, chan);
return lp_build_gather_values(bld_base->base.gallivm, values, 4);
}
/* currently not supported */
if (reg->Register.Indirect) {
assert(0);
@@ -446,12 +463,6 @@ static LLVMValueRef fetch_constant(
* CONST[0].x will have an offset of 0 and CONST[1].x will have an
* offset of 4. */
idx = (reg->Register.Index * 4) + swizzle;
/* index loads above 255 are currently not supported */
if (idx > 255) {
assert(0);
idx = 0;
}
offset = lp_build_const_int32(base->gallivm, idx);
load = build_indexed_load(base->gallivm, const_ptr, offset);
@@ -612,6 +623,12 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
int i;
tgsi_parse_token(parse);
if (parse->FullToken.Token.Type == TGSI_TOKEN_TYPE_PROPERTY &&
parse->FullToken.FullProperty.Property.PropertyName ==
TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS)
shader->fs_write_all = TRUE;
if (parse->FullToken.Token.Type != TGSI_TOKEN_TYPE_DECLARATION)
continue;
@@ -775,6 +792,29 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
last_args[1] = lp_build_const_int32(base->gallivm,
si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT);
if (shader->fs_write_all && shader->nr_cbufs > 1) {
int i;
/* Specify that this is not yet the last export */
last_args[2] = lp_build_const_int32(base->gallivm, 0);
for (i = 1; i < shader->nr_cbufs; i++) {
/* Specify the target we are exporting */
last_args[3] = lp_build_const_int32(base->gallivm,
V_008DFC_SQ_EXP_MRT + i);
lp_build_intrinsic(base->gallivm->builder,
"llvm.SI.export",
LLVMVoidTypeInContext(base->gallivm->context),
last_args, 9);
si_shader_ctx->shader->spi_shader_col_format |=
si_shader_ctx->shader->spi_shader_col_format << 4;
}
last_args[3] = lp_build_const_int32(base->gallivm, V_008DFC_SQ_EXP_MRT);
}
/* Specify that this is the last export */
last_args[2] = lp_build_const_int32(base->gallivm, 1);
@@ -791,54 +831,127 @@ static void tex_fetch_args(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
struct gallivm_state *gallivm = bld_base->base.gallivm;
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned opcode = inst->Instruction.Opcode;
unsigned target = inst->Texture.Texture;
LLVMValueRef ptr;
LLVMValueRef offset;
LLVMValueRef coords[4];
LLVMValueRef address[16];
unsigned count = 0;
unsigned chan;
/* WriteMask */
/* XXX: should be optimized using emit_data->inst->Dst[0].Register.WriteMask*/
emit_data->args[0] = lp_build_const_int32(bld_base->base.gallivm, 0xf);
/* Coordinates */
/* XXX: Not all sample instructions need 4 address arguments. */
if (inst->Instruction.Opcode == TGSI_OPCODE_TXP) {
LLVMValueRef src_w;
unsigned chan;
LLVMValueRef coords[4];
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
src_w = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);
for (chan = 0; chan < 3; chan++ ) {
LLVMValueRef arg = lp_build_emit_fetch(bld_base,
emit_data->inst, 0, chan);
/* Fetch and project texture coordinates */
coords[3] = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);
for (chan = 0; chan < 3; chan++ ) {
coords[chan] = lp_build_emit_fetch(bld_base,
emit_data->inst, 0,
chan);
if (opcode == TGSI_OPCODE_TXP)
coords[chan] = lp_build_emit_llvm_binary(bld_base,
TGSI_OPCODE_DIV,
arg, src_w);
}
coords[chan],
coords[3]);
}
if (opcode == TGSI_OPCODE_TXP)
coords[3] = bld_base->base.one;
emit_data->args[1] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
} else
emit_data->args[1] = lp_build_emit_fetch(bld_base, emit_data->inst,
0, LP_CHAN_ALL);
if (inst->Instruction.Opcode == TGSI_OPCODE_TEX2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXB2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXL2) {
/* These instructions have additional operand that should be packed
* into the cube coord vector by radeon_llvm_emit_prepare_cube_coords.
* That operand should be passed as a float value in the args array
* right after the coord vector. After packing it's not used anymore,
* that's why arg_count is not increased */
emit_data->args[2] = lp_build_emit_fetch(bld_base, inst, 1, 0);
/* Pack LOD bias value */
if (opcode == TGSI_OPCODE_TXB)
address[count++] = coords[3];
if ((target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE) &&
opcode != TGSI_OPCODE_TXQ)
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
/* Pack depth comparison value */
switch (target) {
case TGSI_TEXTURE_SHADOW1D:
case TGSI_TEXTURE_SHADOW1D_ARRAY:
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
address[count++] = coords[2];
break;
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
address[count++] = coords[3];
break;
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = lp_build_emit_fetch(bld_base, inst, 1, 0);
}
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 1);
/* Pack texture coordinates */
address[count++] = coords[0];
switch (target) {
case TGSI_TEXTURE_2D:
case TGSI_TEXTURE_2D_ARRAY:
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
case TGSI_TEXTURE_RECT:
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_2D_MSAA:
case TGSI_TEXTURE_2D_ARRAY_MSAA:
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[1];
}
switch (target) {
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[2];
}
/* Pack array slice */
switch (target) {
case TGSI_TEXTURE_1D_ARRAY:
address[count++] = coords[1];
}
switch (target) {
case TGSI_TEXTURE_2D_ARRAY:
case TGSI_TEXTURE_2D_ARRAY_MSAA:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
address[count++] = coords[2];
}
switch (target) {
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOW1D_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[3];
}
/* Pack LOD */
if (opcode == TGSI_OPCODE_TXL)
address[count++] = coords[3];
if (count > 16) {
assert(!"Cannot handle more than 16 texture address parameters");
count = 16;
}
for (chan = 0; chan < count; chan++ ) {
address[chan] = LLVMBuildBitCast(gallivm->builder,
address[chan],
LLVMInt32TypeInContext(gallivm->context),
"");
}
/* Pad to power of two vector */
while (count < util_next_power_of_two(count))
address[count++] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
emit_data->args[1] = lp_build_gather_values(gallivm, address, count);
/* Resource */
ptr = use_sgpr(bld_base->base.gallivm, SGPR_CONST_PTR_V8I32, SI_SGPR_RESOURCE);
@@ -855,8 +968,7 @@ static void tex_fetch_args(
ptr, offset);
/* Dimensions */
emit_data->args[4] = lp_build_const_int32(bld_base->base.gallivm,
emit_data->inst->Texture.Texture);
emit_data->args[4] = lp_build_const_int32(bld_base->base.gallivm, target);
emit_data->arg_count = 5;
/* XXX: To optimize, we could use a float or v2f32, if the last bits of
@@ -866,22 +978,37 @@ static void tex_fetch_args(
4);
}
static void build_tex_intrinsic(const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
struct lp_build_context * base = &bld_base->base;
char intr_name[23];
sprintf(intr_name, "%sv%ui32", action->intr_name,
LLVMGetVectorSize(LLVMTypeOf(emit_data->args[1])));
emit_data->output[emit_data->chan] = lp_build_intrinsic(
base->gallivm->builder, intr_name, emit_data->dst_type,
emit_data->args, emit_data->arg_count);
}
static const struct lp_build_tgsi_action tex_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.sample."
};
static const struct lp_build_tgsi_action txb_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample.bias"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.sampleb."
};
static const struct lp_build_tgsi_action txl_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample.lod"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.samplel."
};

View File

@@ -314,6 +314,8 @@ static void si_update_fb_rs_state(struct r600_context *rctx)
offset_units = rctx->queued.named.rasterizer->offset_units;
switch (rctx->framebuffer.zsbuf->texture->format) {
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
depth = -24;
@@ -720,7 +722,6 @@ static uint32_t si_translate_colorformat(enum pipe_format format)
case PIPE_FORMAT_L8A8_SNORM:
case PIPE_FORMAT_L8A8_UINT:
case PIPE_FORMAT_L8A8_SINT:
case PIPE_FORMAT_L8A8_SRGB:
case PIPE_FORMAT_R8G8_SNORM:
case PIPE_FORMAT_R8G8_UNORM:
case PIPE_FORMAT_R8G8_UINT:
@@ -775,6 +776,7 @@ static uint32_t si_translate_colorformat(enum pipe_format format)
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return V_028C70_COLOR_8_24;
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
return V_028C70_COLOR_24_8;
@@ -804,15 +806,12 @@ static uint32_t si_translate_colorformat(enum pipe_format format)
return V_028C70_COLOR_10_11_11;
/* 64-bit buffers. */
case PIPE_FORMAT_R16G16B16_USCALED:
case PIPE_FORMAT_R16G16B16_SSCALED:
case PIPE_FORMAT_R16G16B16A16_UINT:
case PIPE_FORMAT_R16G16B16A16_SINT:
case PIPE_FORMAT_R16G16B16A16_USCALED:
case PIPE_FORMAT_R16G16B16A16_SSCALED:
case PIPE_FORMAT_R16G16B16A16_UNORM:
case PIPE_FORMAT_R16G16B16A16_SNORM:
case PIPE_FORMAT_R16G16B16_FLOAT:
case PIPE_FORMAT_R16G16B16A16_FLOAT:
return V_028C70_COLOR_16_16_16_16;
@@ -898,7 +897,6 @@ static uint32_t si_translate_colorswap(enum pipe_format format)
case PIPE_FORMAT_L8A8_SNORM:
case PIPE_FORMAT_L8A8_UINT:
case PIPE_FORMAT_L8A8_SINT:
case PIPE_FORMAT_L8A8_SRGB:
return V_028C70_SWAP_ALT;
case PIPE_FORMAT_R8G8_SNORM:
case PIPE_FORMAT_R8G8_UNORM:
@@ -955,9 +953,10 @@ static uint32_t si_translate_colorswap(enum pipe_format format)
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return V_028C70_SWAP_STD;
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
return V_028C70_SWAP_STD;
return V_028C70_SWAP_STD_REV;
case PIPE_FORMAT_R10G10B10A2_UNORM:
case PIPE_FORMAT_R10G10B10X2_SNORM:
@@ -1119,9 +1118,11 @@ static uint32_t si_translate_dbformat(enum pipe_format format)
switch (format) {
case PIPE_FORMAT_Z16_UNORM:
return V_028040_Z_16;
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return V_028040_Z_24; /* XXX no longer supported on SI */
return V_028040_Z_24; /* deprecated on SI */
case PIPE_FORMAT_Z32_FLOAT:
case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
return V_028040_Z_32_FLOAT;
@@ -1154,14 +1155,14 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen,
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return V_008F14_IMG_DATA_FORMAT_8_24;
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
return V_008F14_IMG_DATA_FORMAT_24_8;
case PIPE_FORMAT_X32_S8X24_UINT:
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_S8_UINT:
return V_008F14_IMG_DATA_FORMAT_8;
case PIPE_FORMAT_Z32_FLOAT:
return V_008F14_IMG_DATA_FORMAT_32;
case PIPE_FORMAT_X32_S8X24_UINT:
case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
return V_008F14_IMG_DATA_FORMAT_X24_8_32;
default:
@@ -1172,6 +1173,8 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen,
goto out_unknown; /* TODO */
case UTIL_FORMAT_COLORSPACE_SRGB:
if (desc->nr_channels != 4 && desc->nr_channels != 1)
goto out_unknown;
break;
default:
@@ -1523,6 +1526,8 @@ static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned
switch (rtex->real_format) {
case PIPE_FORMAT_Z16_UNORM:
return 5;
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
case PIPE_FORMAT_Z32_FLOAT:
@@ -1586,7 +1591,7 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
struct r600_surface *surf;
unsigned level = state->cbufs[cb]->u.tex.level;
unsigned pitch, slice;
unsigned color_info;
unsigned color_info, color_attrib;
unsigned tile_mode_index;
unsigned format, swap, ntype, endian;
uint64_t offset;
@@ -1624,15 +1629,19 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB)
ntype = V_028C70_NUMBER_SRGB;
else if (desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED) {
if (desc->channel[i].normalized)
ntype = V_028C70_NUMBER_SNORM;
else if (desc->channel[i].pure_integer)
if (desc->channel[i].pure_integer) {
ntype = V_028C70_NUMBER_SINT;
} else {
assert(desc->channel[i].normalized);
ntype = V_028C70_NUMBER_SNORM;
}
} else if (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED) {
if (desc->channel[i].normalized)
ntype = V_028C70_NUMBER_UNORM;
else if (desc->channel[i].pure_integer)
if (desc->channel[i].pure_integer) {
ntype = V_028C70_NUMBER_UINT;
} else {
assert(desc->channel[i].normalized);
ntype = V_028C70_NUMBER_UNORM;
}
}
}
@@ -1670,6 +1679,9 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
S_028C70_NUMBER_TYPE(ntype) |
S_028C70_ENDIAN(endian);
color_attrib = S_028C74_TILE_MODE_INDEX(tile_mode_index) |
S_028C74_FORCE_DST_ALPHA_1(desc->swizzle[3] == UTIL_FORMAT_SWIZZLE_1);
offset += r600_resource_va(rctx->context.screen, state->cbufs[cb]->texture);
offset >>= 8;
@@ -1687,8 +1699,7 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
S_028C6C_SLICE_MAX(state->cbufs[cb]->u.tex.last_layer));
}
si_pm4_set_reg(pm4, R_028C70_CB_COLOR0_INFO + cb * 0x3C, color_info);
si_pm4_set_reg(pm4, R_028C74_CB_COLOR0_ATTRIB + cb * 0x3C,
S_028C74_TILE_MODE_INDEX(tile_mode_index));
si_pm4_set_reg(pm4, R_028C74_CB_COLOR0_ATTRIB + cb * 0x3C, color_attrib);
/* Determine pixel shader export format */
max_comp_size = si_colorformat_max_comp_size(format);
@@ -2053,6 +2064,7 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
unsigned char state_swizzle[4], swizzle[4];
unsigned height, depth, width;
enum pipe_format pipe_format = state->format;
struct radeon_surface_level *surflevel;
int first_non_void;
uint64_t va;
@@ -2072,37 +2084,84 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
state_swizzle[2] = state->swizzle_b;
state_swizzle[3] = state->swizzle_a;
surflevel = tmp->surface.level;
/* Texturing with separate depth and stencil. */
if (tmp->is_depth && !tmp->is_flushing_texture) {
switch (pipe_format) {
case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
pipe_format = PIPE_FORMAT_Z32_FLOAT;
break;
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
/* Z24 is always stored like this. */
pipe_format = PIPE_FORMAT_Z24X8_UNORM;
break;
case PIPE_FORMAT_X24S8_UINT:
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_X32_S8X24_UINT:
pipe_format = PIPE_FORMAT_S8_UINT;
surflevel = tmp->surface.stencil_level;
break;
default:;
}
}
desc = util_format_description(pipe_format);
util_format_compose_swizzles(desc->swizzle, state_swizzle, swizzle);
if (desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
const unsigned char swizzle_xxxx[4] = {0, 0, 0, 0};
const unsigned char swizzle_yyyy[4] = {1, 1, 1, 1};
switch (pipe_format) {
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
case PIPE_FORMAT_X24S8_UINT:
case PIPE_FORMAT_X32_S8X24_UINT:
case PIPE_FORMAT_X8Z24_UNORM:
util_format_compose_swizzles(swizzle_yyyy, state_swizzle, swizzle);
break;
default:
util_format_compose_swizzles(swizzle_xxxx, state_swizzle, swizzle);
}
} else {
util_format_compose_swizzles(desc->swizzle, state_swizzle, swizzle);
}
first_non_void = util_format_get_first_non_void_channel(pipe_format);
if (first_non_void < 0) {
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
} else switch (desc->channel[first_non_void].type) {
case UTIL_FORMAT_TYPE_FLOAT:
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
break;
case UTIL_FORMAT_TYPE_SIGNED:
num_format = V_008F14_IMG_NUM_FORMAT_SNORM;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
default:
switch (pipe_format) {
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
break;
default:
if (first_non_void < 0) {
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
} else if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB) {
num_format = V_008F14_IMG_NUM_FORMAT_SRGB;
} else {
num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
switch (desc->channel[first_non_void].type) {
case UTIL_FORMAT_TYPE_FLOAT:
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
break;
case UTIL_FORMAT_TYPE_SIGNED:
if (desc->channel[first_non_void].normalized)
num_format = V_008F14_IMG_NUM_FORMAT_SNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F14_IMG_NUM_FORMAT_SINT;
else
num_format = V_008F14_IMG_NUM_FORMAT_SSCALED;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
if (desc->channel[first_non_void].normalized)
num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F14_IMG_NUM_FORMAT_UINT;
else
num_format = V_008F14_IMG_NUM_FORMAT_USCALED;
}
}
}
format = si_translate_texformat(ctx->screen, pipe_format, desc, first_non_void);
@@ -2115,10 +2174,10 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
/* not supported any more */
//endian = si_colorformat_endian_swap(format);
width = tmp->surface.level[0].npix_x;
height = tmp->surface.level[0].npix_y;
depth = tmp->surface.level[0].npix_z;
pitch = tmp->surface.level[0].nblk_x * util_format_get_blockwidth(pipe_format);
width = surflevel[0].npix_x;
height = surflevel[0].npix_y;
depth = surflevel[0].npix_z;
pitch = surflevel[0].nblk_x * util_format_get_blockwidth(pipe_format);
if (texture->target == PIPE_TEXTURE_1D_ARRAY) {
height = 1;
@@ -2128,7 +2187,7 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
}
va = r600_resource_va(ctx->screen, texture);
va += tmp->surface.level[0].offset;
va += surflevel[0].offset;
view->state[0] = va >> 8;
view->state[1] = (S_008F14_BASE_ADDRESS_HI(va >> 40) |
S_008F14_DATA_FORMAT(format) |
@@ -2476,10 +2535,20 @@ static void *si_create_vertex_elements(struct pipe_context *ctx,
num_format = V_008F0C_BUF_NUM_FORMAT_USCALED; /* XXX */
break;
case UTIL_FORMAT_TYPE_SIGNED:
num_format = V_008F0C_BUF_NUM_FORMAT_SNORM;
if (desc->channel[first_non_void].normalized)
num_format = V_008F0C_BUF_NUM_FORMAT_SNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F0C_BUF_NUM_FORMAT_SINT;
else
num_format = V_008F0C_BUF_NUM_FORMAT_SSCALED;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
num_format = V_008F0C_BUF_NUM_FORMAT_UNORM;
if (desc->channel[first_non_void].normalized)
num_format = V_008F0C_BUF_NUM_FORMAT_UNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F0C_BUF_NUM_FORMAT_UINT;
else
num_format = V_008F0C_BUF_NUM_FORMAT_USCALED;
break;
case UTIL_FORMAT_TYPE_FLOAT:
default:
@@ -2665,9 +2734,14 @@ void si_init_config(struct r600_context *rctx)
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x2a00126a);
break;
case CHIP_VERDE:
default:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x0000124a);
break;
case CHIP_OLAND:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000082);
break;
default:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000000);
break;
}
si_pm4_set_state(rctx, init, pm4);

View File

@@ -524,10 +524,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
struct pipe_index_buffer ib = {};
uint32_t cp_coher_cntl;
if ((!info->count && (info->indexed || !info->count_from_stream_output)) ||
(info->indexed && !rctx->index_buffer.buffer)) {
if (!info->count && (info->indexed || !info->count_from_stream_output))
return;
}
if (!rctx->ps_shader || !rctx->vs_shader)
return;
@@ -538,13 +536,14 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
if (info->indexed) {
/* Initialize the index buffer struct. */
pipe_resource_reference(&ib.buffer, rctx->index_buffer.buffer);
ib.user_buffer = rctx->index_buffer.user_buffer;
ib.index_size = rctx->index_buffer.index_size;
ib.offset = rctx->index_buffer.offset + info->start * ib.index_size;
/* Translate or upload, if needed. */
r600_translate_index_buffer(rctx, &ib, info->count);
if (ib.user_buffer) {
if (ib.user_buffer && !ib.buffer) {
r600_upload_index_buffer(rctx, &ib, info->count);
}

View File

@@ -2963,6 +2963,7 @@ sp_create_sampler_variant( const struct pipe_sampler_state *sampler,
case PIPE_TEX_MIPFILTER_LINEAR:
if (key.bits.is_pot &&
key.bits.target == PIPE_TEXTURE_2D &&
sampler->min_img_filter == sampler->mag_img_filter &&
sampler->normalized_coords &&
sampler->wrap_s == PIPE_TEX_WRAP_REPEAT &&

View File

@@ -23,6 +23,7 @@
*
**********************************************************/
#include "util/u_format.h"
#include "util/u_inlines.h"
#include "util/u_memory.h"
#include "pipe/p_defines.h"
@@ -248,6 +249,16 @@ emit_rss(struct svga_context *svga, unsigned dirty)
EMIT_RS_FLOAT( svga, bias, DEPTHBIAS, fail );
}
if (dirty & SVGA_NEW_FRAME_BUFFER) {
/* XXX: we only look at the first color buffer's sRGB state */
float gamma = 1.0f;
if (svga->curr.framebuffer.cbufs[0] &&
util_format_is_srgb(svga->curr.framebuffer.cbufs[0]->format)) {
gamma = 2.2f;
}
EMIT_RS_FLOAT(svga, gamma, OUTPUTGAMMA, fail);
}
if (dirty & SVGA_NEW_RAST) {
/* bitmask of the enabled clip planes */
unsigned enabled = svga->curr.rast->templ.clip_plane_enable;

View File

@@ -72,6 +72,7 @@ AM_CPPFLAGS += \
-I$(top_srcdir)/src/gallium/winsys \
-I$(top_srcdir)/src/egl/wayland/wayland-egl \
-I$(top_srcdir)/src/egl/wayland/wayland-drm \
-I$(top_builddir)/src/egl/wayland/wayland-drm \
-DHAVE_WAYLAND_BACKEND
endif

View File

@@ -318,7 +318,7 @@ ExaFinishAccess(PixmapPtr pPix, int index)
if (!priv)
return;
if (!priv->map_transfer || pPix->devPrivate.ptr == NULL)
if (!priv->map_transfer)
return;
exa_debug_printf("ExaFinishAccess %d\n", index);

View File

@@ -47,6 +47,8 @@ libxatracker_la_LIBADD = \
$(top_builddir)/src/gallium/drivers/trace/libtrace.la \
$(top_builddir)/src/gallium/drivers/rbug/librbug.la
nodist_EXTRA_libxatracker_la_SOURCES = dummy.cpp
if HAVE_MESA_LLVM
libxatracker_la_LDFLAGS += $(LLVM_LDFLAGS)
libxatracker_la_LIBADD += $(LLVM_LIBS)

View File

@@ -593,10 +593,11 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr,
va.offset = bo->va;
r = drmCommandWriteRead(rws->fd, DRM_RADEON_GEM_VA, &va, sizeof(va));
if (r && va.operation == RADEON_VA_RESULT_ERROR) {
fprintf(stderr, "radeon: Failed to allocate a buffer:\n");
fprintf(stderr, "radeon: Failed to allocate virtual address for buffer:\n");
fprintf(stderr, "radeon: size : %d bytes\n", size);
fprintf(stderr, "radeon: alignment : %d bytes\n", desc->alignment);
fprintf(stderr, "radeon: domains : %d\n", args.initial_domain);
fprintf(stderr, "radeon: va : 0x%016llx\n", (unsigned long long)bo->va);
radeon_bo_destroy(&bo->base);
return NULL;
}
@@ -962,6 +963,10 @@ static boolean radeon_winsys_bo_get_handle(struct pb_buffer *buffer,
whandle->handle = bo->handle;
}
pipe_mutex_lock(bo->mgr->bo_handles_mutex);
util_hash_table_set(bo->mgr->bo_handles, (void*)(uintptr_t)whandle->handle, bo);
pipe_mutex_unlock(bo->mgr->bo_handles_mutex);
whandle->stride = stride;
return TRUE;
}

View File

@@ -383,6 +383,16 @@ static boolean radeon_drm_cs_validate(struct radeon_winsys_cs *rcs)
return status;
}
static boolean radeon_drm_cs_memory_below_limit(struct radeon_winsys_cs *rcs, uint64_t vram, uint64_t gtt)
{
struct radeon_drm_cs *cs = radeon_drm_cs(rcs);
boolean status =
(cs->csc->used_gart + gtt) < cs->ws->info.gart_size * 0.7 &&
(cs->csc->used_vram + vram) < cs->ws->info.vram_size * 0.7;
return status;
}
static void radeon_drm_cs_write_reloc(struct radeon_winsys_cs *rcs,
struct radeon_winsys_cs_handle *buf)
{
@@ -575,6 +585,7 @@ void radeon_drm_cs_init_functions(struct radeon_drm_winsys *ws)
ws->base.cs_destroy = radeon_drm_cs_destroy;
ws->base.cs_add_reloc = radeon_drm_cs_add_reloc;
ws->base.cs_validate = radeon_drm_cs_validate;
ws->base.cs_memory_below_limit = radeon_drm_cs_memory_below_limit;
ws->base.cs_write_reloc = radeon_drm_cs_write_reloc;
ws->base.cs_flush = radeon_drm_cs_flush;
ws->base.cs_set_flush_callback = radeon_drm_cs_set_flush;

View File

@@ -312,6 +312,7 @@ static boolean do_winsys_init(struct radeon_drm_winsys *ws)
case CHIP_TAHITI:
case CHIP_PITCAIRN:
case CHIP_VERDE:
case CHIP_OLAND:
ws->info.chip_class = TAHITI;
break;
}

View File

@@ -123,6 +123,7 @@ enum radeon_family {
CHIP_TAHITI,
CHIP_PITCAIRN,
CHIP_VERDE,
CHIP_OLAND,
CHIP_LAST,
};
@@ -392,6 +393,16 @@ struct radeon_winsys {
*/
boolean (*cs_validate)(struct radeon_winsys_cs *cs);
/**
* Return TRUE if there is enough memory in VRAM and GTT for the relocs
* added so far.
*
* \param cs A command stream to validate.
* \param vram VRAM memory size pending to be use
* \param gtt GTT memory size pending to be use
*/
boolean (*cs_memory_below_limit)(struct radeon_winsys_cs *cs, uint64_t vram, uint64_t gtt);
/**
* Write a relocated dword to a command buffer.
*

View File

@@ -2829,9 +2829,9 @@ ast_declarator_list::hir(exec_list *instructions,
* flat."
*
* From section 4.3.4 of the GLSL 3.00 ES spec:
* "Fragment shader inputs that are signed or unsigned integers or
* integer vectors must be qualified with the interpolation qualifier
* flat."
* "Fragment shader inputs that are, or contain, signed or unsigned
* integers or integer vectors must be qualified with the
* interpolation qualifier flat."
*
* Since vertex outputs and fragment inputs must have matching
* qualifiers, these two requirements are equivalent.
@@ -2839,12 +2839,12 @@ ast_declarator_list::hir(exec_list *instructions,
if (state->is_version(130, 300)
&& state->target == vertex_shader
&& state->current_function == NULL
&& var->type->is_integer()
&& var->type->contains_integer()
&& var->mode == ir_var_shader_out
&& var->interpolation != INTERP_QUALIFIER_FLAT) {
_mesa_glsl_error(&loc, state, "If a vertex output is an integer, "
"then it must be qualified with 'flat'");
_mesa_glsl_error(&loc, state, "If a vertex output is (or contains) "
"an integer, then it must be qualified with 'flat'");
}
@@ -3967,6 +3967,47 @@ ast_iteration_statement::hir(exec_list *instructions,
}
/**
* Determine if the given type is valid for establishing a default precision
* qualifier.
*
* From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"):
*
* "The precision statement
*
* precision precision-qualifier type;
*
* can be used to establish a default precision qualifier. The type field
* can be either int or float or any of the sampler types, and the
* precision-qualifier can be lowp, mediump, or highp."
*
* GLSL ES 1.00 has similar language. GLSL 1.30 doesn't allow precision
* qualifiers on sampler types, but this seems like an oversight (since the
* intention of including these in GLSL 1.30 is to allow compatibility with ES
* shaders). So we allow int, float, and all sampler types regardless of GLSL
* version.
*/
static bool
is_valid_default_precision_type(const struct _mesa_glsl_parse_state *state,
const char *type_name)
{
const struct glsl_type *type = state->symbols->get_type(type_name);
if (type == NULL)
return false;
switch (type->base_type) {
case GLSL_TYPE_INT:
case GLSL_TYPE_FLOAT:
/* "int" and "float" are valid, but vectors and matrices are not. */
return type->vector_elements == 1 && type->matrix_columns == 1;
case GLSL_TYPE_SAMPLER:
return true;
default:
return false;
}
}
ir_rvalue *
ast_type_specifier::hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state)
@@ -4007,11 +4048,10 @@ ast_type_specifier::hir(exec_list *instructions,
"arrays");
return NULL;
}
if (strcmp(this->type_name, "float") != 0 &&
strcmp(this->type_name, "int") != 0) {
if (!is_valid_default_precision_type(state, this->type_name)) {
_mesa_glsl_error(&loc, state,
"default precision statements apply only to types "
"float and int");
"float, int, and sampler types");
return NULL;
}

View File

@@ -20,23 +20,44 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
CC = @CC_FOR_BUILD@
CFLAGS = @CFLAGS_FOR_BUILD@
CPP = @CPP_FOR_BUILD@
CPPFLAGS = @CPPFLAGS_FOR_BUILD@
CXX = @CXX_FOR_BUILD@
CXXFLAGS = @CXXFLAGS_FOR_BUILD@
LD = @LD_FOR_BUILD@
LDFLAGS = @LDFLAGS_FOR_BUILD@
AM_CFLAGS = \
-I $(top_srcdir)/include \
-I $(top_srcdir)/src/mapi \
-I $(top_srcdir)/src/mesa \
-I $(GLSL_SRCDIR) \
-I $(GLSL_SRCDIR)/glcpp \
-I $(GLSL_BUILDDIR) \
$(DEFINES_FOR_BUILD)
-I $(GLSL_BUILDDIR)
if CROSS_COMPILING
proxyCC = @CC_FOR_BUILD@
proxyCFLAGS = @CFLAGS_FOR_BUILD@
proxyCPP = @CPP_FOR_BUILD@
proxyCPPFLAGS = @CPPFLAGS_FOR_BUILD@
proxyCXX = @CXX_FOR_BUILD@
proxyCXXFLAGS = @CXXFLAGS_FOR_BUILD@
proxyLD = @LD_FOR_BUILD@
proxyLDFLAGS = @LDFLAGS_FOR_BUILD@
AM_CFLAGS += $(DEFINES_FOR_BUILD)
else
proxyCC = @CC@
proxyCFLAGS = @CFLAGS@
proxyCPP = @CPP@
proxyCPPFLAGS = @CPPFLAGS@
proxyCXX = @CXX@
proxyCXXFLAGS = @CXXFLAGS@
proxyLD = @LD@
proxyLDFLAGS = @LDFLAGS@
AM_CFLAGS += $(DEFINES)
endif
CC = $(proxyCC)
CFLAGS = $(proxyCFLAGS)
CPP = $(proxyCPP)
CPPFLAGS = $(proxyCPPFLAGS)
CXX = $(proxyCXX)
CXXFLAGS = $(proxyCXXFLAGS)
LD = $(proxyLD)
LDFLAGS = $(proxyLDFLAGS)
AM_CXXFLAGS = $(AM_CFLAGS)

View File

@@ -156,6 +156,24 @@ glsl_type::contains_sampler() const
}
}
bool
glsl_type::contains_integer() const
{
if (this->is_array()) {
return this->fields.array->contains_integer();
} else if (this->is_record()) {
for (unsigned int i = 0; i < this->length; i++) {
if (this->fields.structure[i].type->contains_integer())
return true;
}
return false;
} else {
return this->is_integer();
}
}
gl_texture_index
glsl_type::sampler_index() const
{

View File

@@ -359,6 +359,12 @@ struct glsl_type {
return (base_type == GLSL_TYPE_UINT) || (base_type == GLSL_TYPE_INT);
}
/**
* Query whether or not type is an integral type, or for struct and array
* types, contains an integral type.
*/
bool contains_integer() const;
/**
* Query whether or not a type is a float type
*/

View File

@@ -29,7 +29,7 @@
#include "main/hash_table.h"
#include "program.h"
class ubo_visitor : public uniform_field_visitor {
class ubo_visitor : public program_resource_visitor {
public:
ubo_visitor(void *mem_ctx, gl_uniform_buffer_variable *variables,
unsigned num_variables)
@@ -44,7 +44,7 @@ public:
this->offset = 0;
this->buffer_size = 0;
this->is_array_instance = strchr(name, ']') != NULL;
this->uniform_field_visitor::process(type, name);
this->program_resource_visitor::process(type, name);
}
unsigned index;
@@ -112,7 +112,7 @@ private:
}
};
class count_block_size : public uniform_field_visitor {
class count_block_size : public program_resource_visitor {
public:
count_block_size() : num_active_uniforms(0)
{

View File

@@ -52,7 +52,7 @@ values_for_type(const glsl_type *type)
}
void
uniform_field_visitor::process(const glsl_type *type, const char *name)
program_resource_visitor::process(const glsl_type *type, const char *name)
{
assert(type->is_record()
|| (type->is_array() && type->fields.array->is_record())
@@ -65,7 +65,7 @@ uniform_field_visitor::process(const glsl_type *type, const char *name)
}
void
uniform_field_visitor::process(ir_variable *var)
program_resource_visitor::process(ir_variable *var)
{
const glsl_type *t = var->type;
@@ -93,8 +93,8 @@ uniform_field_visitor::process(ir_variable *var)
}
void
uniform_field_visitor::recursion(const glsl_type *t, char **name,
size_t name_length, bool row_major)
program_resource_visitor::recursion(const glsl_type *t, char **name,
size_t name_length, bool row_major)
{
/* Records need to have each field processed individually.
*
@@ -110,7 +110,7 @@ uniform_field_visitor::recursion(const glsl_type *t, char **name,
if (t->fields.structure[i].type->is_record())
this->visit_field(&t->fields.structure[i]);
/* Append '.field' to the current uniform name. */
/* Append '.field' to the current variable name. */
if (name_length == 0) {
ralloc_asprintf_rewrite_tail(name, &new_length, "%s", field);
} else {
@@ -125,7 +125,7 @@ uniform_field_visitor::recursion(const glsl_type *t, char **name,
for (unsigned i = 0; i < t->length; i++) {
size_t new_length = name_length;
/* Append the subscript to the current uniform name */
/* Append the subscript to the current variable name */
ralloc_asprintf_rewrite_tail(name, &new_length, "[%u]", i);
recursion(t->fields.array, name, new_length,
@@ -137,7 +137,7 @@ uniform_field_visitor::recursion(const glsl_type *t, char **name,
}
void
uniform_field_visitor::visit_field(const glsl_struct_field *field)
program_resource_visitor::visit_field(const glsl_struct_field *field)
{
(void) field;
/* empty */
@@ -153,7 +153,7 @@ uniform_field_visitor::visit_field(const glsl_struct_field *field)
* If the same uniform is added multiple times (i.e., once for each shader
* target), it will only be accounted once.
*/
class count_uniform_size : public uniform_field_visitor {
class count_uniform_size : public program_resource_visitor {
public:
count_uniform_size(struct string_to_uint_map *map)
: num_active_uniforms(0), num_values(0), num_shader_samplers(0),
@@ -171,10 +171,10 @@ public:
void process(ir_variable *var)
{
if (var->is_interface_instance())
uniform_field_visitor::process(var->interface_type,
var->interface_type->name);
program_resource_visitor::process(var->interface_type,
var->interface_type->name);
else
uniform_field_visitor::process(var);
program_resource_visitor::process(var);
}
/**
@@ -258,7 +258,7 @@ private:
* the \c gl_uniform_storage and \c gl_constant_value arrays are "big
* enough."
*/
class parcel_out_uniform_storage : public uniform_field_visitor {
class parcel_out_uniform_storage : public program_resource_visitor {
public:
parcel_out_uniform_storage(struct string_to_uint_map *map,
struct gl_uniform_storage *uniforms,

View File

@@ -35,6 +35,8 @@
#include "linker.h"
#include "link_varyings.h"
#include "main/macros.h"
#include "program/hash_table.h"
#include "program.h"
/**
@@ -154,10 +156,13 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
/**
* Initialize this object based on a string that was passed to
* glTransformFeedbackVaryings. If there is a parse error, the error is
* reported using linker_error(), and false is returned.
* glTransformFeedbackVaryings.
*
* If the input is mal-formed, this call still succeeds, but it sets
* this->var_name to a mal-formed input, so tfeedback_decl::find_output_var()
* will fail to find any matching variable.
*/
bool
void
tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
const void *mem_ctx, const char *input)
{
@@ -170,12 +175,13 @@ tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
this->is_clip_distance_mesa = false;
this->skip_components = 0;
this->next_buffer_separator = false;
this->matched_candidate = NULL;
if (ctx->Extensions.ARB_transform_feedback3) {
/* Parse gl_NextBuffer. */
if (strcmp(input, "gl_NextBuffer") == 0) {
this->next_buffer_separator = true;
return true;
return;
}
/* Parse gl_SkipComponents. */
@@ -189,21 +195,17 @@ tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
this->skip_components = 4;
if (this->skip_components)
return true;
return;
}
/* Parse a declaration. */
const char *bracket = strrchr(input, '[');
if (bracket) {
this->var_name = ralloc_strndup(mem_ctx, input, bracket - input);
if (sscanf(bracket, "[%u]", &this->array_subscript) != 1) {
linker_error(prog, "Cannot parse transform feedback varying %s", input);
return false;
}
const char *base_name_end;
long subscript = parse_program_resource_name(input, &base_name_end);
this->var_name = ralloc_strndup(mem_ctx, input, base_name_end - input);
if (subscript >= 0) {
this->array_subscript = subscript;
this->is_subscripted = true;
} else {
this->var_name = ralloc_strdup(mem_ctx, input);
this->is_subscripted = false;
}
@@ -215,8 +217,6 @@ tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
strcmp(this->var_name, "gl_ClipDistance") == 0) {
this->is_clip_distance_mesa = true;
}
return true;
}
@@ -240,27 +240,32 @@ tfeedback_decl::is_same(const tfeedback_decl &x, const tfeedback_decl &y)
/**
* Assign a location for this tfeedback_decl object based on the location
* assignment in output_var.
* Assign a location for this tfeedback_decl object based on the transform
* feedback candidate found by find_candidate.
*
* If an error occurs, the error is reported through linker_error() and false
* is returned.
*/
bool
tfeedback_decl::assign_location(struct gl_context *ctx,
struct gl_shader_program *prog,
ir_variable *output_var)
struct gl_shader_program *prog)
{
assert(this->is_varying());
if (output_var->type->is_array()) {
unsigned fine_location
= this->matched_candidate->toplevel_var->location * 4
+ this->matched_candidate->toplevel_var->location_frac
+ this->matched_candidate->offset;
if (this->matched_candidate->type->is_array()) {
/* Array variable */
const unsigned matrix_cols =
output_var->type->fields.array->matrix_columns;
this->matched_candidate->type->fields.array->matrix_columns;
const unsigned vector_elements =
output_var->type->fields.array->vector_elements;
this->matched_candidate->type->fields.array->vector_elements;
unsigned actual_array_size = this->is_clip_distance_mesa ?
prog->Vert.ClipDistanceArraySize : output_var->type->array_size();
prog->Vert.ClipDistanceArraySize :
this->matched_candidate->type->array_size();
if (this->is_subscripted) {
/* Check array bounds. */
@@ -271,22 +276,11 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
actual_array_size);
return false;
}
if (this->is_clip_distance_mesa) {
this->location =
output_var->location + this->array_subscript / 4;
this->location_frac = this->array_subscript % 4;
} else {
unsigned fine_location
= output_var->location * 4 + output_var->location_frac;
unsigned array_elem_size = vector_elements * matrix_cols;
fine_location += array_elem_size * this->array_subscript;
this->location = fine_location / 4;
this->location_frac = fine_location % 4;
}
unsigned array_elem_size = this->is_clip_distance_mesa ?
1 : vector_elements * matrix_cols;
fine_location += array_elem_size * this->array_subscript;
this->size = 1;
} else {
this->location = output_var->location;
this->location_frac = output_var->location_frac;
this->size = actual_array_size;
}
this->vector_elements = vector_elements;
@@ -294,7 +288,7 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
if (this->is_clip_distance_mesa)
this->type = GL_FLOAT;
else
this->type = output_var->type->fields.array->gl_type;
this->type = this->matched_candidate->type->fields.array->gl_type;
} else {
/* Regular variable (scalar, vector, or matrix) */
if (this->is_subscripted) {
@@ -303,13 +297,13 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
this->orig_name, this->var_name);
return false;
}
this->location = output_var->location;
this->location_frac = output_var->location_frac;
this->size = 1;
this->vector_elements = output_var->type->vector_elements;
this->matrix_columns = output_var->type->matrix_columns;
this->type = output_var->type->gl_type;
this->vector_elements = this->matched_candidate->type->vector_elements;
this->matrix_columns = this->matched_candidate->type->matrix_columns;
this->type = this->matched_candidate->type->gl_type;
}
this->location = fine_location / 4;
this->location_frac = fine_location % 4;
/* From GL_EXT_transform_feedback:
* A program will fail to link if:
@@ -404,35 +398,26 @@ tfeedback_decl::store(struct gl_context *ctx, struct gl_shader_program *prog,
}
ir_variable *
tfeedback_decl::find_output_var(gl_shader_program *prog,
gl_shader *producer) const
const tfeedback_candidate *
tfeedback_decl::find_candidate(gl_shader_program *prog,
hash_table *tfeedback_candidates)
{
const char *name = this->is_clip_distance_mesa
? "gl_ClipDistanceMESA" : this->var_name;
ir_variable *var = producer->symbols->get_variable(name);
if (var && var->mode == ir_var_shader_out) {
const glsl_type *type = var->type;
while (type->base_type == GLSL_TYPE_ARRAY)
type = type->fields.array;
if (type->base_type == GLSL_TYPE_STRUCT) {
linker_error(prog, "Transform feedback of varying structs not "
"implemented yet.");
return NULL;
}
return var;
this->matched_candidate = (const tfeedback_candidate *)
hash_table_find(tfeedback_candidates, name);
if (!this->matched_candidate) {
/* From GL_EXT_transform_feedback:
* A program will fail to link if:
*
* * any variable name specified in the <varyings> array is not
* declared as an output in the geometry shader (if present) or
* the vertex shader (if no geometry shader is present);
*/
linker_error(prog, "Transform feedback varying %s undeclared.",
this->orig_name);
}
/* From GL_EXT_transform_feedback:
* A program will fail to link if:
*
* * any variable name specified in the <varyings> array is not
* declared as an output in the geometry shader (if present) or
* the vertex shader (if no geometry shader is present);
*/
linker_error(prog, "Transform feedback varying %s undeclared.",
this->orig_name);
return NULL;
return this->matched_candidate;
}
@@ -449,8 +434,7 @@ parse_tfeedback_decls(struct gl_context *ctx, struct gl_shader_program *prog,
char **varying_names, tfeedback_decl *decls)
{
for (unsigned i = 0; i < num_names; ++i) {
if (!decls[i].init(ctx, prog, mem_ctx, varying_names[i]))
return false;
decls[i].init(ctx, prog, mem_ctx, varying_names[i]);
if (!decls[i].is_varying())
continue;
@@ -870,6 +854,80 @@ is_varying_var(GLenum shaderType, const ir_variable *var)
}
/**
* Visitor class that generates tfeedback_candidate structs describing all
* possible targets of transform feedback.
*
* tfeedback_candidate structs are stored in the hash table
* tfeedback_candidates, which is passed to the constructor. This hash table
* maps varying names to instances of the tfeedback_candidate struct.
*/
class tfeedback_candidate_generator : public program_resource_visitor
{
public:
tfeedback_candidate_generator(void *mem_ctx,
hash_table *tfeedback_candidates)
: mem_ctx(mem_ctx),
tfeedback_candidates(tfeedback_candidates)
{
}
void process(ir_variable *var)
{
this->toplevel_var = var;
this->varying_floats = 0;
if (var->is_interface_instance())
program_resource_visitor::process(var->interface_type,
var->interface_type->name);
else
program_resource_visitor::process(var);
}
private:
virtual void visit_field(const glsl_type *type, const char *name,
bool row_major)
{
assert(!type->is_record());
assert(!(type->is_array() && type->fields.array->is_record()));
assert(!type->is_interface());
assert(!(type->is_array() && type->fields.array->is_interface()));
(void) row_major;
tfeedback_candidate *candidate
= rzalloc(this->mem_ctx, tfeedback_candidate);
candidate->toplevel_var = this->toplevel_var;
candidate->type = type;
candidate->offset = this->varying_floats;
hash_table_insert(this->tfeedback_candidates, candidate,
ralloc_strdup(this->mem_ctx, name));
this->varying_floats += type->component_slots();
}
/**
* Memory context used to allocate hash table keys and values.
*/
void * const mem_ctx;
/**
* Hash table in which tfeedback_candidate objects should be stored.
*/
hash_table * const tfeedback_candidates;
/**
* Pointer to the toplevel variable that is being traversed.
*/
ir_variable *toplevel_var;
/**
* Total number of varying floats that have been visited so far. This is
* used to determine the offset to each varying within the toplevel
* variable.
*/
unsigned varying_floats;
};
/**
* Assign locations for all variables that are produced in one pipeline stage
* (the "producer") and consumed in the next stage (the "consumer").
@@ -902,6 +960,8 @@ assign_varying_locations(struct gl_context *ctx,
const unsigned producer_base = VERT_RESULT_VAR0;
const unsigned consumer_base = FRAG_ATTRIB_VAR0;
varying_matches matches(ctx->Const.DisableVaryingPacking);
hash_table *tfeedback_candidates
= hash_table_ctor(0, hash_table_string_hash, hash_table_string_compare);
/* Operate in a total of three passes.
*
@@ -920,6 +980,9 @@ assign_varying_locations(struct gl_context *ctx,
if ((output_var == NULL) || (output_var->mode != ir_var_shader_out))
continue;
tfeedback_candidate_generator g(mem_ctx, tfeedback_candidates);
g.process(output_var);
ir_variable *input_var =
consumer ? consumer->symbols->get_variable(output_var->name) : NULL;
@@ -935,15 +998,16 @@ assign_varying_locations(struct gl_context *ctx,
if (!tfeedback_decls[i].is_varying())
continue;
ir_variable *output_var
= tfeedback_decls[i].find_output_var(prog, producer);
const tfeedback_candidate *matched_candidate
= tfeedback_decls[i].find_candidate(prog, tfeedback_candidates);
if (output_var == NULL)
if (matched_candidate == NULL) {
hash_table_dtor(tfeedback_candidates);
return false;
if (output_var->is_unmatched_generic_inout) {
matches.record(output_var, NULL);
}
if (matched_candidate->toplevel_var->is_unmatched_generic_inout)
matches.record(matched_candidate->toplevel_var, NULL);
}
const unsigned slots_used = matches.assign_locations();
@@ -953,13 +1017,14 @@ assign_varying_locations(struct gl_context *ctx,
if (!tfeedback_decls[i].is_varying())
continue;
ir_variable *output_var
= tfeedback_decls[i].find_output_var(prog, producer);
if (!tfeedback_decls[i].assign_location(ctx, prog, output_var))
if (!tfeedback_decls[i].assign_location(ctx, prog)) {
hash_table_dtor(tfeedback_candidates);
return false;
}
}
hash_table_dtor(tfeedback_candidates);
if (ctx->Const.DisableVaryingPacking) {
/* Transform feedback code assumes varyings are packed, so if the driver
* has disabled varying packing, make sure it does not support transform

View File

@@ -41,6 +41,49 @@ struct gl_shader;
class ir_variable;
/**
* Data structure describing a varying which is available for use in transform
* feedback.
*
* For example, if the vertex shader contains:
*
* struct S {
* vec4 foo;
* float[3] bar;
* };
*
* varying S[2] v;
*
* Then there would be tfeedback_candidate objects corresponding to the
* following varyings:
*
* v[0].foo
* v[0].bar
* v[1].foo
* v[1].bar
*/
struct tfeedback_candidate
{
/**
* Toplevel variable containing this varying. In the above example, this
* would point to the declaration of the varying v.
*/
ir_variable *toplevel_var;
/**
* Type of this varying. In the above example, this would point to the
* glsl_type for "vec4" or "float[3]".
*/
const glsl_type *type;
/**
* Offset within the toplevel variable where this varying occurs (counted
* in multiples of the size of a float).
*/
unsigned offset;
};
/**
* Data structure tracking information about a transform feedback declaration
* during linking.
@@ -48,17 +91,17 @@ class ir_variable;
class tfeedback_decl
{
public:
bool init(struct gl_context *ctx, struct gl_shader_program *prog,
void init(struct gl_context *ctx, struct gl_shader_program *prog,
const void *mem_ctx, const char *input);
static bool is_same(const tfeedback_decl &x, const tfeedback_decl &y);
bool assign_location(struct gl_context *ctx, struct gl_shader_program *prog,
ir_variable *output_var);
bool assign_location(struct gl_context *ctx,
struct gl_shader_program *prog);
unsigned get_num_outputs() const;
bool store(struct gl_context *ctx, struct gl_shader_program *prog,
struct gl_transform_feedback_info *info, unsigned buffer,
const unsigned max_outputs) const;
ir_variable *find_output_var(gl_shader_program *prog,
gl_shader *producer) const;
const tfeedback_candidate *find_candidate(gl_shader_program *prog,
hash_table *tfeedback_candidates);
bool is_next_buffer_separator() const
{
@@ -158,6 +201,12 @@ private:
* Whether this is gl_NextBuffer from ARB_transform_feedback3.
*/
bool next_buffer_separator;
/**
* If find_candidate() has been called, pointer to the tfeedback_candidate
* data structure that was found. Otherwise NULL.
*/
const tfeedback_candidate *matched_candidate;
};

View File

@@ -200,6 +200,65 @@ linker_warning(gl_shader_program *prog, const char *fmt, ...)
}
/**
* Given a string identifying a program resource, break it into a base name
* and an optional array index in square brackets.
*
* If an array index is present, \c out_base_name_end is set to point to the
* "[" that precedes the array index, and the array index itself is returned
* as a long.
*
* If no array index is present (or if the array index is negative or
* mal-formed), \c out_base_name_end, is set to point to the null terminator
* at the end of the input string, and -1 is returned.
*
* Only the final array index is parsed; if the string contains other array
* indices (or structure field accesses), they are left in the base name.
*
* No attempt is made to check that the base name is properly formed;
* typically the caller will look up the base name in a hash table, so
* ill-formed base names simply turn into hash table lookup failures.
*/
long
parse_program_resource_name(const GLchar *name,
const GLchar **out_base_name_end)
{
/* Section 7.3.1 ("Program Interfaces") of the OpenGL 4.3 spec says:
*
* "When an integer array element or block instance number is part of
* the name string, it will be specified in decimal form without a "+"
* or "-" sign or any extra leading zeroes. Additionally, the name
* string will not include white space anywhere in the string."
*/
const size_t len = strlen(name);
*out_base_name_end = name + len;
if (len == 0 || name[len-1] != ']')
return -1;
/* Walk backwards over the string looking for a non-digit character. This
* had better be the opening bracket for an array index.
*
* Initially, i specifies the location of the ']'. Since the string may
* contain only the ']' charcater, walk backwards very carefully.
*/
unsigned i;
for (i = len - 1; (i > 0) && isdigit(name[i-1]); --i)
/* empty */ ;
if ((i == 0) || name[i-1] != '[')
return -1;
long array_index = strtol(&name[i], NULL, 10);
if (array_index < 0)
return -1;
*out_base_name_end = name + (i - 1);
return array_index;
}
void
link_invalidate_variable_locations(gl_shader *sh, int input_base,
int output_base)

View File

@@ -61,38 +61,39 @@ link_uniform_blocks(void *mem_ctx,
struct gl_uniform_block **blocks_ret);
/**
* Class for processing all of the leaf fields of an uniform
* Class for processing all of the leaf fields of a variable that corresponds
* to a program resource.
*
* Leaves are, roughly speaking, the parts of the uniform that the application
* could query with \c glGetUniformLocation (or that could be returned by
* \c glGetActiveUniforms).
* The leaf fields are all the parts of the variable that the application
* could query using \c glGetProgramResourceIndex (or that could be returned
* by \c glGetProgramResourceName).
*
* Classes my derive from this class to implement specific functionality.
* This class only provides the mechanism to iterate over the leaves. Derived
* classes must implement \c ::visit_field and may override \c ::process.
*/
class uniform_field_visitor {
class program_resource_visitor {
public:
/**
* Begin processing a uniform
* Begin processing a variable
*
* Classes that overload this function should call \c ::process from the
* base class to start the recursive processing of the uniform.
* base class to start the recursive processing of the variable.
*
* \param var The uniform variable that is to be processed
* \param var The variable that is to be processed
*
* Calls \c ::visit_field for each leaf of the uniform.
* Calls \c ::visit_field for each leaf of the variable.
*
* \warning
* This entry should only be used with uniform blocks in cases where the
* row / column ordering of matrices in the block does not matter. For
* example, enumerating the names of members of the block, but not for
* determining the offsets of members.
* When processing a uniform block, this entry should only be used in cases
* where the row / column ordering of matrices in the block does not
* matter. For example, enumerating the names of members of the block, but
* not for determining the offsets of members.
*/
void process(ir_variable *var);
/**
* Begin processing a uniform of a structured type.
* Begin processing a variable of a structured type.
*
* This flavor of \c process should be used to handle structured types
* (i.e., structures, interfaces, or arrays there of) that need special
@@ -100,7 +101,7 @@ public:
* (instead of the instance name) is used for an interface block.
*
* \param type Type that is to be processed, associated with \c name
* \param name Base name of the structured uniform being processed
* \param name Base name of the structured variable being processed
*
* \note
* \c type must be \c GLSL_TYPE_RECORD, \c GLSL_TYPE_INTERFACE, or an array
@@ -110,7 +111,7 @@ public:
protected:
/**
* Method invoked for each leaf of the uniform
* Method invoked for each leaf of the variable
*
* \param type Type of the field.
* \param name Fully qualified name of the field.

View File

@@ -33,3 +33,7 @@ linker_error(gl_shader_program *prog, const char *fmt, ...)
extern void
linker_warning(gl_shader_program *prog, const char *fmt, ...)
PRINTFLIKE(2, 3);
extern long
parse_program_resource_name(const GLchar *name,
const GLchar **out_base_name_end);

View File

@@ -789,9 +789,11 @@ dri2XcbSwapBuffers(Display *dpy,
swap_buffers_reply =
xcb_dri2_swap_buffers_reply(c, swap_buffers_cookie, NULL);
ret = merge_counter(swap_buffers_reply->swap_hi,
swap_buffers_reply->swap_lo);
free(swap_buffers_reply);
if (swap_buffers_reply) {
ret = merge_counter(swap_buffers_reply->swap_hi,
swap_buffers_reply->swap_lo);
free(swap_buffers_reply);
}
return ret;
}
@@ -1053,7 +1055,8 @@ static const struct glx_context_vtable dri2_context_vtable = {
};
static void
dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions)
dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions,
const char *driverName)
{
int i;
@@ -1062,7 +1065,15 @@ dri2BindExtensions(struct dri2_screen *psc, const __DRIextension **extensions)
__glXEnableDirectExtension(&psc->base, "GLX_MESA_swap_control");
__glXEnableDirectExtension(&psc->base, "GLX_SGI_make_current_read");
if (psc->dri2->base.version >= 4) {
/*
* GLX_INTEL_swap_event is broken on the server side, where it's
* currently unconditionally enabled. This completely breaks
* systems running on drivers which don't support that extension.
* There's no way to test for its presence on this side, so instead
* of disabling it uncondtionally, just disable it for drivers
* which are known to not support it.
*/
if (strcmp(driverName, "vmwgfx") != 0) {
__glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event");
}
@@ -1206,7 +1217,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
}
extensions = psc->core->getExtensions(psc->driScreen);
dri2BindExtensions(psc, extensions);
dri2BindExtensions(psc, extensions, driverName);
configs = driConvertConfigs(psc->core, psc->base.configs, driver_configs);
visuals = driConvertConfigs(psc->core, psc->base.visuals, driver_configs);

View File

@@ -700,7 +700,9 @@ generic_%u_byte( GLint rop, const void * ptr )
if f.reply_always_array:
print ' (void)memcpy(%s, %s_data(reply), %s_data_length(reply) * sizeof(%s));' % (output.name, xcb_name, xcb_name, output.get_base_type_string())
else:
print ' if (%s_data_length(reply) == 0)' % (xcb_name)
print ' /* the XXX_data_length() xcb function name is misleading, it returns the number */'
print ' /* of elements, not the length of the data part. A single element is embedded. */'
print ' if (%s_data_length(reply) == 1)' % (xcb_name)
print ' (void)memcpy(%s, &reply->datum, sizeof(reply->datum));' % (output.name)
print ' else'
print ' (void)memcpy(%s, %s_data(reply), %s_data_length(reply) * sizeof(%s));' % (output.name, xcb_name, xcb_name, output.get_base_type_string())

View File

@@ -23,6 +23,7 @@
#include "main/teximage.h"
#include "main/fbobject.h"
#include "main/renderbuffer.h"
#include "glsl/ralloc.h"
@@ -183,10 +184,19 @@ formats_match(GLbitfield buffer_bit, struct intel_renderbuffer *src_irb,
gl_format src_format = find_miptree(buffer_bit, src_irb)->format;
gl_format dst_format = find_miptree(buffer_bit, dst_irb)->format;
return _mesa_get_srgb_format_linear(src_format) ==
_mesa_get_srgb_format_linear(dst_format);
}
gl_format linear_src_format = _mesa_get_srgb_format_linear(src_format);
gl_format linear_dst_format = _mesa_get_srgb_format_linear(dst_format);
/* Normally, we require the formats to be equal. However, we also support
* blitting from ARGB to XRGB (discarding alpha), and from XRGB to ARGB
* (overriding alpha to 1.0 via blending).
*/
return linear_src_format == linear_dst_format ||
(linear_src_format == MESA_FORMAT_XRGB8888 &&
linear_dst_format == MESA_FORMAT_ARGB8888) ||
(linear_src_format == MESA_FORMAT_ARGB8888 &&
linear_dst_format == MESA_FORMAT_XRGB8888);
}
static bool
try_blorp_blit(struct intel_context *intel,
@@ -295,6 +305,93 @@ try_blorp_blit(struct intel_context *intel,
return true;
}
bool
brw_blorp_copytexsubimage(struct intel_context *intel,
struct gl_renderbuffer *src_rb,
struct gl_texture_image *dst_image,
int srcX0, int srcY0,
int dstX0, int dstY0,
int width, int height)
{
struct gl_context *ctx = &intel->ctx;
struct intel_renderbuffer *src_irb = intel_renderbuffer(src_rb);
struct intel_renderbuffer *dst_irb;
/* BLORP is not supported before Gen6. */
if (intel->gen < 6)
return false;
/* Create a fake/wrapper renderbuffer to allow us to use do_blorp_blit(). */
dst_irb = intel_create_fake_renderbuffer_wrapper(intel, dst_image);
if (!dst_irb)
return false;
struct gl_renderbuffer *dst_rb = &dst_irb->Base.Base;
/* Unlike BlitFramebuffer, CopyTexSubImage doesn't have a buffer bit.
* It's only used by find_miptee() to decide whether to dereference the
* separate stencil miptree. In the case of packed depth/stencil, core
* Mesa hands us the depth attachment as src_rb (not stencil), so assume
* non-stencil for now. A buffer bit of 0 works for both color and depth.
*/
GLbitfield buffer_bit = 0;
if (!formats_match(buffer_bit, src_irb, dst_irb)) {
dst_rb->Delete(ctx, dst_rb);
return false;
}
/* Source clipping shouldn't be necessary, since copytexsubimage (in
* src/mesa/main/teximage.c) calls _mesa_clip_copytexsubimage() which
* takes care of it.
*
* Destination clipping shouldn't be necessary since the restrictions on
* glCopyTexSubImage prevent the user from specifying a destination rectangle
* that falls outside the bounds of the destination texture.
* See error_check_subtexture_dimensions().
*/
int srcY1 = srcY0 + height;
int dstX1 = dstX0 + width;
int dstY1 = dstY0 + height;
/* Sync up the state of window system buffers. We need to do this before
* we go looking for the buffers.
*/
intel_prepare_render(intel);
/* Account for the fact that in the system framebuffer, the origin is at
* the lower left.
*/
bool mirror_y = false;
if (_mesa_is_winsys_fbo(ctx->ReadBuffer)) {
GLint tmp = src_rb->Height - srcY0;
srcY0 = src_rb->Height - srcY1;
srcY1 = tmp;
mirror_y = true;
}
do_blorp_blit(intel, buffer_bit, src_irb, dst_irb,
srcX0, srcY0, dstX0, dstY0, dstX1, dstY1, false, mirror_y);
/* If we're copying a packed depth stencil texture, the above do_blorp_blit
* copied depth (since buffer_bit != GL_STENCIL_BIT). Now copy stencil as
* well. There's no need to do a formats_match() check because the separate
* stencil buffer is always S8.
*/
src_rb = ctx->ReadBuffer->Attachment[BUFFER_STENCIL].Renderbuffer;
if (_mesa_get_format_bits(dst_image->TexFormat, GL_STENCIL_BITS) > 0 &&
src_rb != NULL) {
src_irb = intel_renderbuffer(src_rb);
do_blorp_blit(intel, GL_STENCIL_BUFFER_BIT, src_irb, dst_irb,
srcX0, srcY0, dstX0, dstY0, dstX1, dstY1, false, mirror_y);
}
dst_rb->Delete(ctx, dst_rb);
return true;
}
GLbitfield
brw_blorp_framebuffer(struct intel_context *intel,
GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
@@ -1642,17 +1739,6 @@ brw_blorp_blit_params::brw_blorp_blit_params(struct brw_context *brw,
src.set(brw, src_mt, src_level, src_layer);
dst.set(brw, dst_mt, dst_level, dst_layer);
/* If we are blitting from sRGB to linear or vice versa, we still want the
* blit to be a direct copy, so we need source and destination to use the
* same format. However, we want the destination sRGB/linear state to be
* correct (so that sRGB blending is used when doing an MSAA resolve to an
* sRGB surface, and linear blending is used when doing an MSAA resolve to
* a linear surface). Since blorp blits don't support any format
* conversion (except between sRGB and linear), we can accomplish this by
* simply setting up the source to use the same format as the destination.
*/
assert(_mesa_get_srgb_format_linear(src_mt->format) ==
_mesa_get_srgb_format_linear(dst_mt->format));
src.brw_surfaceformat = dst.brw_surfaceformat;
use_wm_prog = true;

View File

@@ -278,7 +278,23 @@ brwCreateContext(int api,
}
/* WM maximum threads is number of EUs times number of threads per EU. */
if (intel->gen >= 7) {
assert(intel->gen <= 7);
if (intel->is_haswell) {
if (intel->gt == 1) {
brw->max_wm_threads = 102;
brw->max_vs_threads = 70;
brw->urb.size = 128;
brw->urb.max_vs_entries = 640;
brw->urb.max_gs_entries = 256;
} else if (intel->gt == 2) {
brw->max_wm_threads = 204;
brw->max_vs_threads = 280;
brw->urb.size = 256;
brw->urb.max_vs_entries = 1664;
brw->urb.max_gs_entries = 640;
}
} else if (intel->gen == 7) {
if (intel->gt == 1) {
brw->max_wm_threads = 48;
brw->max_vs_threads = 36;
@@ -360,6 +376,7 @@ brwCreateContext(int api,
ctx->Const.NativeIntegers = true;
ctx->Const.UniformBooleanTrue = 1;
ctx->Const.UniformBufferOffsetAlignment = 16;
ctx->Const.ForceGLSLExtensionsWarn = driQueryOptionb(&intel->optionCache, "force_glsl_extensions_warn");

View File

@@ -1217,6 +1217,14 @@ brw_blorp_framebuffer(struct intel_context *intel,
GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
GLbitfield mask, GLenum filter);
bool
brw_blorp_copytexsubimage(struct intel_context *intel,
struct gl_renderbuffer *src_rb,
struct gl_texture_image *dst_image,
int srcX0, int srcY0,
int dstX0, int dstY0,
int width, int height);
/* gen6_multisample_state.c */
void
gen6_emit_3dstate_multisample(struct brw_context *brw,

View File

@@ -327,6 +327,23 @@ fs_inst::is_math()
opcode == SHADER_OPCODE_POW);
}
bool
fs_inst::is_control_flow()
{
switch (opcode) {
case BRW_OPCODE_DO:
case BRW_OPCODE_WHILE:
case BRW_OPCODE_IF:
case BRW_OPCODE_ELSE:
case BRW_OPCODE_ENDIF:
case BRW_OPCODE_BREAK:
case BRW_OPCODE_CONTINUE:
return true;
default:
return false;
}
}
bool
fs_inst::is_send_from_grf()
{
@@ -2070,16 +2087,12 @@ fs_visitor::compute_to_mrf()
break;
}
/* We don't handle flow control here. Most computation of
/* We don't handle control flow here. Most computation of
* values that end up in MRFs are shortly before the MRF
* write anyway.
*/
if (scan_inst->opcode == BRW_OPCODE_DO ||
scan_inst->opcode == BRW_OPCODE_WHILE ||
scan_inst->opcode == BRW_OPCODE_ELSE ||
scan_inst->opcode == BRW_OPCODE_ENDIF) {
if (scan_inst->is_control_flow() && scan_inst->opcode != BRW_OPCODE_IF)
break;
}
/* You can't read from an MRF, so if someone else reads our
* MRF's source GRF that we wanted to rewrite, that stops us.
@@ -2163,16 +2176,8 @@ fs_visitor::remove_duplicate_mrf_writes()
foreach_list_safe(node, &this->instructions) {
fs_inst *inst = (fs_inst *)node;
switch (inst->opcode) {
case BRW_OPCODE_DO:
case BRW_OPCODE_WHILE:
case BRW_OPCODE_IF:
case BRW_OPCODE_ELSE:
case BRW_OPCODE_ENDIF:
if (inst->is_control_flow()) {
memset(last_mrf_move, 0, sizeof(last_mrf_move));
continue;
default:
break;
}
if (inst->opcode == BRW_OPCODE_MOV &&

View File

@@ -178,6 +178,7 @@ public:
bool overwrites_reg(const fs_reg &reg);
bool is_tex();
bool is_math();
bool is_control_flow();
bool is_send_from_grf();
fs_reg dst;

View File

@@ -951,8 +951,8 @@ fs_generator::generate_pack_half_2x16_split(fs_inst *inst,
{
assert(intel->gen >= 7);
assert(dst.type == BRW_REGISTER_TYPE_UD);
assert(x.type = BRW_REGISTER_TYPE_F);
assert(y.type = BRW_REGISTER_TYPE_F);
assert(x.type == BRW_REGISTER_TYPE_F);
assert(y.type == BRW_REGISTER_TYPE_F);
/* From the Ivybridge PRM, Vol4, Part3, Section 6.27 f32to16:
*

View File

@@ -816,15 +816,8 @@ fs_visitor::schedule_instructions(bool post_reg_alloc)
next_block_header = (fs_inst *)next_block_header->next;
sched.add_inst(inst);
if (inst->opcode == BRW_OPCODE_IF ||
inst->opcode == BRW_OPCODE_ELSE ||
inst->opcode == BRW_OPCODE_ENDIF ||
inst->opcode == BRW_OPCODE_DO ||
inst->opcode == BRW_OPCODE_WHILE ||
inst->opcode == BRW_OPCODE_BREAK ||
inst->opcode == BRW_OPCODE_CONTINUE) {
if (inst->is_control_flow())
break;
}
}
sched.calculate_deps();
sched.schedule_instructions(next_block_header);

View File

@@ -196,11 +196,11 @@ haswell_upload_cut_index(struct brw_context *brw)
return;
const unsigned cut_index_setting =
ctx->Array.PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0;
ctx->Array._PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0;
BEGIN_BATCH(2);
OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2));
OUT_BATCH(ctx->Array.RestartIndex);
OUT_BATCH(ctx->Array._RestartIndex);
ADVANCE_BATCH();
}

View File

@@ -197,7 +197,8 @@ uint32_t brw_format_for_mesa_format(gl_format mesa_format);
GLuint translate_tex_target(GLenum target);
GLuint translate_tex_format(gl_format mesa_format,
GLuint translate_tex_format(struct intel_context *intel,
gl_format mesa_format,
GLenum internal_format,
GLenum depth_mode,
GLenum srgb_decode);
@@ -225,7 +226,7 @@ void upload_default_color(struct brw_context *brw,
/* gen6_sf_state.c */
uint32_t
get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
int fs_attr, bool two_side_color);
int fs_attr, bool two_side_color, uint32_t *max_source_attr);
#ifdef __cplusplus
}

View File

@@ -2420,18 +2420,14 @@ vec4_visitor::emit_psiz_and_flags(struct brw_reg reg)
* clipped against all fixed planes.
*/
if (brw->has_negative_rhw_bug) {
#if 0
/* FINISHME */
brw_CMP(p,
vec8(brw_null_reg()),
BRW_CONDITIONAL_L,
brw_swizzle1(output_reg[BRW_VERT_RESULT_NDC], 3),
brw_imm_f(0));
brw_OR(p, brw_writemask(header1, WRITEMASK_W), header1, brw_imm_ud(1<<6));
brw_MOV(p, output_reg[BRW_VERT_RESULT_NDC], brw_imm_f(0));
brw_set_predicate_control(p, BRW_PREDICATE_NONE);
#endif
src_reg ndc_w = src_reg(output_reg[BRW_VERT_RESULT_NDC]);
ndc_w.swizzle = BRW_SWIZZLE_WWWW;
emit(CMP(dst_null_f(), ndc_w, src_reg(0.0f), BRW_CONDITIONAL_L));
vec4_instruction *inst;
inst = emit(OR(header1_w, src_reg(header1_w), src_reg(1u << 6)));
inst->predicate = BRW_PREDICATE_NORMAL;
inst = emit(MOV(output_reg[BRW_VERT_RESULT_NDC], src_reg(0.0f)));
inst->predicate = BRW_PREDICATE_NORMAL;
}
emit(MOV(retype(reg, BRW_REGISTER_TYPE_UD), src_reg(header1)));

View File

@@ -622,7 +622,8 @@ brw_render_target_supported(struct intel_context *intel,
}
GLuint
translate_tex_format(gl_format mesa_format,
translate_tex_format(struct intel_context *intel,
gl_format mesa_format,
GLenum internal_format,
GLenum depth_mode,
GLenum srgb_decode)
@@ -651,6 +652,17 @@ translate_tex_format(gl_format mesa_format,
*/
return BRW_SURFACEFORMAT_R32G32B32A32_FLOAT;
case MESA_FORMAT_SRGB_DXT1:
if (intel->gen == 4 && !intel->is_g4x) {
/* Work around missing SRGB DXT1 support on original gen4 by just
* skipping SRGB decode. It's not worth not supporting sRGB in
* general to prevent this.
*/
WARN_ONCE(true, "Demoting sRGB DXT1 texture to non-sRGB\n");
mesa_format = MESA_FORMAT_RGB_DXT1;
}
return brw_format_for_mesa_format(mesa_format);
default:
assert(brw_format_for_mesa_format(mesa_format) != 0);
return brw_format_for_mesa_format(mesa_format);
@@ -829,6 +841,7 @@ brw_update_texture_surface(struct gl_context *ctx,
uint32_t *binding_table,
unsigned surf_index)
{
struct intel_context *intel = intel_context(ctx);
struct brw_context *brw = brw_context(ctx);
struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
struct intel_texture_object *intelObj = intel_texture_object(tObj);
@@ -851,7 +864,8 @@ brw_update_texture_surface(struct gl_context *ctx,
surf[0] = (translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT |
BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
BRW_SURFACE_CUBEFACE_ENABLES |
(translate_tex_format(mt->format,
(translate_tex_format(intel,
mt->format,
firstImage->InternalFormat,
tObj->DepthMode,
sampler->sRGBDecode) <<

View File

@@ -283,6 +283,25 @@ gen6_blorp_emit_blend_state(struct brw_context *brw,
blend->blend1.write_disable_b = false;
blend->blend1.write_disable_a = false;
/* When blitting from an XRGB source to a ARGB destination, we need to
* interpret the missing channel as 1.0. Blending can do that for us:
* we simply use the RGB values from the fragment shader ("source RGB"),
* but smash the alpha channel to 1.
*/
if (_mesa_get_format_bits(params->dst.mt->format, GL_ALPHA_BITS) > 0 &&
_mesa_get_format_bits(params->src.mt->format, GL_ALPHA_BITS) == 0) {
blend->blend0.blend_enable = 1;
blend->blend0.ia_blend_enable = 1;
blend->blend0.blend_func = BRW_BLENDFUNCTION_ADD;
blend->blend0.ia_blend_func = BRW_BLENDFUNCTION_ADD;
blend->blend0.source_blend_factor = BRW_BLENDFACTOR_SRC_COLOR;
blend->blend0.dest_blend_factor = BRW_BLENDFACTOR_ZERO;
blend->blend0.ia_source_blend_factor = BRW_BLENDFACTOR_ONE;
blend->blend0.ia_dest_blend_factor = BRW_BLENDFACTOR_ZERO;
}
return cc_blend_state_offset;
}

View File

@@ -54,9 +54,8 @@
*/
uint32_t
get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
int fs_attr, bool two_side_color)
int fs_attr, bool two_side_color, uint32_t *max_source_attr)
{
int attr_override, slot;
int vs_attr = _mesa_frag_attrib_to_vert_result(fs_attr);
if (vs_attr < 0 || vs_attr == VERT_RESULT_HPOS) {
/* These attributes will be overwritten by the fragment shader's
@@ -67,7 +66,7 @@ get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
}
/* Find the VUE slot for this attribute. */
slot = vue_map->vert_result_to_slot[vs_attr];
int slot = vue_map->vert_result_to_slot[vs_attr];
/* If there was only a back color written but not front, use back
* as the color instead of undefined
@@ -89,23 +88,29 @@ get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset,
* Each increment of urb_entry_read_offset represents a 256-bit value, so
* it counts for two 128-bit VUE slots.
*/
attr_override = slot - 2 * urb_entry_read_offset;
assert (attr_override >= 0 && attr_override < 32);
int source_attr = slot - 2 * urb_entry_read_offset;
assert(source_attr >= 0 && source_attr < 32);
/* If we are doing two-sided color, and the VUE slot following this one
* represents a back-facing color, then we need to instruct the SF unit to
* do back-facing swizzling.
*/
if (two_side_color) {
if (vue_map->slot_to_vert_result[slot] == VERT_RESULT_COL0 &&
vue_map->slot_to_vert_result[slot+1] == VERT_RESULT_BFC0)
attr_override |= (ATTRIBUTE_SWIZZLE_INPUTATTR_FACING << ATTRIBUTE_SWIZZLE_SHIFT);
else if (vue_map->slot_to_vert_result[slot] == VERT_RESULT_COL1 &&
vue_map->slot_to_vert_result[slot+1] == VERT_RESULT_BFC1)
attr_override |= (ATTRIBUTE_SWIZZLE_INPUTATTR_FACING << ATTRIBUTE_SWIZZLE_SHIFT);
bool swizzling = two_side_color &&
((vue_map->slot_to_vert_result[slot] == VERT_RESULT_COL0 &&
vue_map->slot_to_vert_result[slot+1] == VERT_RESULT_BFC0) ||
(vue_map->slot_to_vert_result[slot] == VERT_RESULT_COL1 &&
vue_map->slot_to_vert_result[slot+1] == VERT_RESULT_BFC1));
/* Update max_source_attr. If swizzling, the SF will read this slot + 1. */
if (*max_source_attr < source_attr + swizzling)
*max_source_attr = source_attr + swizzling;
if (swizzling) {
return source_attr |
(ATTRIBUTE_SWIZZLE_INPUTATTR_FACING << ATTRIBUTE_SWIZZLE_SHIFT);
}
return attr_override;
return source_attr;
}
static void
@@ -113,7 +118,6 @@ upload_sf_state(struct brw_context *brw)
{
struct intel_context *intel = &brw->intel;
struct gl_context *ctx = &intel->ctx;
uint32_t urb_entry_read_length;
/* BRW_NEW_FRAGMENT_PROGRAM */
uint32_t num_outputs = _mesa_bitcount_64(brw->fragment_program->Base.InputsRead);
/* _NEW_LIGHT */
@@ -130,21 +134,7 @@ upload_sf_state(struct brw_context *brw)
uint16_t attr_overrides[FRAG_ATTRIB_MAX];
uint32_t point_sprite_origin;
/* CACHE_NEW_VS_PROG */
urb_entry_read_length = ((brw->vs.prog_data->vue_map.num_slots + 1) / 2 -
urb_entry_read_offset);
if (urb_entry_read_length == 0) {
/* Setting the URB entry read length to 0 causes undefined behavior, so
* if we have no URB data to read, set it to 1.
*/
urb_entry_read_length = 1;
}
dw1 =
GEN6_SF_SWIZZLE_ENABLE |
num_outputs << GEN6_SF_NUM_OUTPUTS_SHIFT |
urb_entry_read_length << GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT;
dw1 = GEN6_SF_SWIZZLE_ENABLE | num_outputs << GEN6_SF_NUM_OUTPUTS_SHIFT;
dw2 = GEN6_SF_STATISTICS_ENABLE |
GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;
@@ -280,6 +270,7 @@ upload_sf_state(struct brw_context *brw)
/* Create the mapping from the FS inputs we produce to the VS outputs
* they source from.
*/
uint32_t max_source_attr = 0;
for (; attr < FRAG_ATTRIB_MAX; attr++) {
enum glsl_interp_qualifier interp_qualifier =
brw->fragment_program->InterpQualifier[attr];
@@ -315,12 +306,30 @@ upload_sf_state(struct brw_context *brw)
attr_overrides[input_index++] =
get_attr_override(&brw->vs.prog_data->vue_map,
urb_entry_read_offset, attr,
ctx->VertexProgram._TwoSideEnabled);
ctx->VertexProgram._TwoSideEnabled,
&max_source_attr);
}
for (; input_index < FRAG_ATTRIB_MAX; input_index++)
attr_overrides[input_index] = 0;
/* From the Sandy Bridge PRM, Volume 2, Part 1, documentation for
* 3DSTATE_SF DWord 1 bits 15:11, "Vertex URB Entry Read Length":
*
* "This field should be set to the minimum length required to read the
* maximum source attribute. The maximum source attribute is indicated
* by the maximum value of the enabled Attribute # Source Attribute if
* Attribute Swizzle Enable is set, Number of Output Attributes-1 if
* enable is not set.
* read_length = ceiling((max_source_attr + 1) / 2)
*
* [errata] Corruption/Hang possible if length programmed larger than
* recommended"
*/
uint32_t urb_entry_read_length = ALIGN(max_source_attr + 1, 2) / 2;
dw1 |= urb_entry_read_length << GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT;
BEGIN_BATCH(20);
OUT_BATCH(_3DSTATE_SF << 16 | (20 - 2));
OUT_BATCH(dw1);

View File

@@ -196,7 +196,7 @@ gen7_upload_samplers(struct brw_context *brw)
GLbitfield SamplersUsed = vs->SamplersUsed | fs->SamplersUsed;
brw->sampler.count = _mesa_bitcount(SamplersUsed);
brw->sampler.count = _mesa_fls(SamplersUsed);
if (brw->sampler.count == 0)
return;

View File

@@ -34,7 +34,6 @@ upload_sbe_state(struct brw_context *brw)
{
struct intel_context *intel = &brw->intel;
struct gl_context *ctx = &intel->ctx;
uint32_t urb_entry_read_length;
/* BRW_NEW_FRAGMENT_PROGRAM */
uint32_t num_outputs = _mesa_bitcount_64(brw->fragment_program->Base.InputsRead);
/* _NEW_LIGHT */
@@ -48,22 +47,8 @@ upload_sbe_state(struct brw_context *brw)
bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
uint32_t point_sprite_origin;
/* CACHE_NEW_VS_PROG */
urb_entry_read_length = ((brw->vs.prog_data->vue_map.num_slots + 1) / 2 -
urb_entry_read_offset);
if (urb_entry_read_length == 0) {
/* Setting the URB entry read length to 0 causes undefined behavior, so
* if we have no URB data to read, set it to 1.
*/
urb_entry_read_length = 1;
}
/* FINISHME: Attribute Swizzle Control Mode? */
dw1 =
GEN7_SBE_SWIZZLE_ENABLE |
num_outputs << GEN7_SBE_NUM_OUTPUTS_SHIFT |
urb_entry_read_length << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT;
dw1 = GEN7_SBE_SWIZZLE_ENABLE | num_outputs << GEN7_SBE_NUM_OUTPUTS_SHIFT;
/* _NEW_POINT
*
@@ -84,6 +69,7 @@ upload_sbe_state(struct brw_context *brw)
/* Create the mapping from the FS inputs we produce to the VS outputs
* they source from.
*/
uint32_t max_source_attr = 0;
for (; attr < FRAG_ATTRIB_MAX; attr++) {
enum glsl_interp_qualifier interp_qualifier =
brw->fragment_program->InterpQualifier[attr];
@@ -118,9 +104,25 @@ upload_sbe_state(struct brw_context *brw)
attr_overrides[input_index++] =
get_attr_override(&brw->vs.prog_data->vue_map,
urb_entry_read_offset, attr,
ctx->VertexProgram._TwoSideEnabled);
ctx->VertexProgram._TwoSideEnabled,
&max_source_attr);
}
/* From the Ivy Bridge PRM, Volume 2, Part 1, documentation for
* 3DSTATE_SBE DWord 1 bits 15:11, "Vertex URB Entry Read Length":
*
* "This field should be set to the minimum length required to read the
* maximum source attribute. The maximum source attribute is indicated
* by the maximum value of the enabled Attribute # Source Attribute if
* Attribute Swizzle Enable is set, Number of Output Attributes-1 if
* enable is not set.
*
* read_length = ceiling((max_source_attr + 1) / 2)"
*/
uint32_t urb_entry_read_length = ALIGN(max_source_attr + 1, 2) / 2;
dw1 |= urb_entry_read_length << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT;
for (; input_index < FRAG_ATTRIB_MAX; input_index++)
attr_overrides[input_index] = 0;

View File

@@ -308,7 +308,8 @@ gen7_update_texture_surface(struct gl_context *ctx,
8 * 4, 32, &binding_table[surf_index]);
memset(surf, 0, 8 * 4);
uint32_t tex_format = translate_tex_format(mt->format,
uint32_t tex_format = translate_tex_format(intel,
mt->format,
firstImage->InternalFormat,
tObj->DepthMode,
sampler->sRGBDecode);

View File

@@ -86,8 +86,6 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.TDFX_texture_compression_FXT1 = true;
ctx->Extensions.OES_EGL_image = true;
ctx->Extensions.OES_draw_texture = true;
ctx->Extensions.OES_compressed_ETC1_RGB8_texture = true;
ctx->Extensions.ARB_texture_rgb10_a2ui = true;
if (intel->gen >= 6)
ctx->Const.GLSLVersion = 140;
@@ -144,6 +142,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.EXT_packed_float = true;
ctx->Extensions.ARB_texture_compression_rgtc = true;
ctx->Extensions.ARB_texture_rg = true;
ctx->Extensions.ARB_texture_rgb10_a2ui = true;
ctx->Extensions.ARB_vertex_type_2_10_10_10_rev = true;
ctx->Extensions.EXT_draw_buffers2 = true;
ctx->Extensions.EXT_framebuffer_sRGB = true;
@@ -157,6 +156,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.ATI_envmap_bumpmap = true;
ctx->Extensions.MESA_texture_array = true;
ctx->Extensions.NV_conditional_render = true;
ctx->Extensions.OES_compressed_ETC1_RGB8_texture = true;
ctx->Extensions.OES_standard_derivatives = true;
}

View File

@@ -531,6 +531,36 @@ intel_renderbuffer_update_wrapper(struct intel_context *intel,
return true;
}
/**
* Create a fake intel_renderbuffer that wraps a gl_texture_image.
*/
struct intel_renderbuffer *
intel_create_fake_renderbuffer_wrapper(struct intel_context *intel,
struct gl_texture_image *image)
{
struct gl_context *ctx = &intel->ctx;
struct intel_renderbuffer *irb;
struct gl_renderbuffer *rb;
irb = CALLOC_STRUCT(intel_renderbuffer);
if (!irb) {
_mesa_error(ctx, GL_OUT_OF_MEMORY, "creating renderbuffer");
return NULL;
}
rb = &irb->Base.Base;
_mesa_init_renderbuffer(rb, 0);
rb->ClassID = INTEL_RB_CLASS;
if (!intel_renderbuffer_update_wrapper(intel, irb, image, image->Face)) {
intel_delete_renderbuffer(ctx, rb);
return NULL;
}
return irb;
}
void
intel_renderbuffer_set_draw_offset(struct intel_renderbuffer *irb)
{

View File

@@ -140,6 +140,10 @@ intel_create_wrapped_renderbuffer(struct gl_context * ctx,
int width, int height,
gl_format format);
struct intel_renderbuffer *
intel_create_fake_renderbuffer_wrapper(struct intel_context *intel,
struct gl_texture_image *image);
extern void
intel_fbo_init(struct intel_context *intel);

View File

@@ -1056,7 +1056,7 @@ set_max_gl_versions(struct intel_screen *screen)
screen->max_gl_core_version = 31;
screen->max_gl_compat_version = 30;
screen->max_gl_es1_version = 11;
screen->max_gl_es2_version = 20;
screen->max_gl_es2_version = 30;
break;
case 5:
case 4:

View File

@@ -41,6 +41,9 @@
#include "intel_fbo.h"
#include "intel_tex.h"
#include "intel_blit.h"
#ifndef I915
#include "brw_context.h"
#endif
#define FILE_DEBUG_FLAG DEBUG_TEXTURE
@@ -177,15 +180,28 @@ intelCopyTexSubImage(struct gl_context *ctx, GLuint dims,
GLint x, GLint y,
GLsizei width, GLsizei height)
{
if (dims == 3 || !intel_copy_texsubimage(intel_context(ctx),
intel_texture_image(texImage),
xoffset, yoffset,
intel_renderbuffer(rb), x, y, width, height)) {
fallback_debug("%s - fallback to swrast\n", __FUNCTION__);
_mesa_meta_CopyTexSubImage(ctx, dims, texImage,
xoffset, yoffset, zoffset,
rb, x, y, width, height);
struct intel_context *intel = intel_context(ctx);
if (dims != 3) {
#ifndef I915
/* Try BLORP first. It can handle almost everything. */
if (brw_blorp_copytexsubimage(intel, rb, texImage, x, y,
xoffset, yoffset, width, height))
return;
#endif
/* Next, try the BLT engine. */
if (intel_copy_texsubimage(intel_context(ctx),
intel_texture_image(texImage),
xoffset, yoffset,
intel_renderbuffer(rb), x, y, width, height))
return;
}
/* Finally, fall back to meta. This will likely be slow. */
fallback_debug("%s - fallback to swrast\n", __FUNCTION__);
_mesa_meta_CopyTexSubImage(ctx, dims, texImage,
xoffset, yoffset, zoffset,
rb, x, y, width, height);
}

View File

@@ -42,6 +42,7 @@
#include "main/framebuffer.h"
#include "main/imports.h"
#include "main/macros.h"
#include "main/mipmap.h"
#include "main/mtypes.h"
#include "main/renderbuffer.h"
#include "main/version.h"
@@ -783,6 +784,8 @@ OSMesaCreateContextExt( GLenum format, GLint depthBits, GLint stencilBits,
ctx->Driver.MapRenderbuffer = osmesa_MapRenderbuffer;
ctx->Driver.UnmapRenderbuffer = osmesa_UnmapRenderbuffer;
ctx->Driver.GenerateMipmap = _mesa_generate_mipmap;
/* Extend the software rasterizer with our optimized line and triangle
* drawing functions.
*/

View File

@@ -34,6 +34,7 @@
#include "main/colormac.h"
#include "main/fbobject.h"
#include "main/macros.h"
#include "main/mipmap.h"
#include "main/image.h"
#include "main/imports.h"
#include "main/mtypes.h"
@@ -869,6 +870,8 @@ xmesa_init_driver_functions( XMesaVisual xmvisual,
driver->MapRenderbuffer = xmesa_MapRenderbuffer;
driver->UnmapRenderbuffer = xmesa_UnmapRenderbuffer;
driver->GenerateMipmap = _mesa_generate_mipmap;
#if ENABLE_EXT_timer_query
driver->NewQueryObject = xmesa_new_query_object;
driver->BeginQuery = xmesa_begin_query;

View File

@@ -2152,13 +2152,6 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
(int) size);
return;
}
if (offset + size > bufObj->Size) {
_mesa_error(ctx, GL_INVALID_VALUE,
"glBindBufferRange(offset + size %d > buffer size %d)",
(int) (offset + size), (int) (bufObj->Size));
return;
}
}
switch (target) {

View File

@@ -824,7 +824,8 @@ _mesa_MapGrid2d( GLint un, GLdouble u1, GLdouble u2,
void
_mesa_install_eval_vtxfmt(struct _glapi_table *disp,
const GLvertexformat *vfmt)
const GLvertexformat *vfmt,
bool beginend)
{
SET_EvalCoord1f(disp, vfmt->EvalCoord1f);
SET_EvalCoord1fv(disp, vfmt->EvalCoord1fv);
@@ -833,8 +834,12 @@ _mesa_install_eval_vtxfmt(struct _glapi_table *disp,
SET_EvalPoint1(disp, vfmt->EvalPoint1);
SET_EvalPoint2(disp, vfmt->EvalPoint2);
SET_EvalMesh1(disp, vfmt->EvalMesh1);
SET_EvalMesh2(disp, vfmt->EvalMesh2);
/* glEvalMesh1 and glEvalMesh2 are not allowed between glBegin and glEnd.
*/
if (!beginend) {
SET_EvalMesh1(disp, vfmt->EvalMesh1);
SET_EvalMesh2(disp, vfmt->EvalMesh2);
}
}

View File

@@ -39,6 +39,7 @@
#include "main/mfeatures.h"
#include "main/mtypes.h"
#include <stdbool.h>
#define _MESA_INIT_EVAL_VTXFMT(vfmt, impl) \
@@ -76,7 +77,8 @@ extern GLfloat *_mesa_copy_map_points2d(GLenum target,
extern void
_mesa_install_eval_vtxfmt(struct _glapi_table *disp,
const GLvertexformat *vfmt);
const GLvertexformat *vfmt,
bool beginend);
extern void _mesa_init_eval( struct gl_context *ctx );
extern void _mesa_free_eval_data( struct gl_context *ctx );

View File

@@ -297,7 +297,7 @@ static const struct extension extension_table[] = {
{ "GL_ATI_texture_float", o(ARB_texture_float), GL, 2002 },
{ "GL_ATI_texture_mirror_once", o(ATI_texture_mirror_once), GL, 2006 },
{ "GL_IBM_multimode_draw_arrays", o(dummy_true), GL, 1998 },
{ "GL_IBM_rasterpos_clip", o(dummy_true), GL, 1996 },
{ "GL_IBM_rasterpos_clip", o(dummy_true), GLL, 1996 },
{ "GL_IBM_texture_mirrored_repeat", o(dummy_true), GLL, 1998 },
{ "GL_INGR_blend_func_separate", o(EXT_blend_func_separate), GLL, 1999 },
{ "GL_MESA_pack_invert", o(MESA_pack_invert), GL, 2002 },
@@ -418,7 +418,6 @@ _mesa_enable_sw_extensions(struct gl_context *ctx)
ctx->Extensions.EXT_fog_coord = GL_TRUE;
ctx->Extensions.EXT_framebuffer_object = GL_TRUE;
ctx->Extensions.EXT_framebuffer_blit = GL_TRUE;
ctx->Extensions.EXT_framebuffer_multisample = GL_TRUE;
ctx->Extensions.EXT_packed_depth_stencil = GL_TRUE;
ctx->Extensions.EXT_pixel_buffer_object = GL_TRUE;
ctx->Extensions.EXT_point_parameters = GL_TRUE;

View File

@@ -225,9 +225,6 @@ descriptor=[
# GL_OES_point_sprite
[ "POINT_SPRITE_NV", "CONTEXT_BOOL(Point.PointSprite), extra_NV_point_sprite_ARB_point_sprite" ],
# GL_ARB_vertex_shader
[ "MAX_VARYING_FLOATS_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_ARB_vertex_shader" ],
]},
@@ -362,6 +359,7 @@ descriptor=[
# GL_ARB_vertex_shader
[ "MAX_VERTEX_UNIFORM_COMPONENTS_ARB", "CONTEXT_INT(Const.VertexProgram.MaxUniformComponents), extra_ARB_vertex_shader" ],
[ "MAX_VARYING_FLOATS_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_ARB_vertex_shader" ],
# GL_EXT_framebuffer_blit
# NOTE: GL_DRAW_FRAMEBUFFER_BINDING_EXT == GL_FRAMEBUFFER_BINDING_EXT

View File

@@ -1485,18 +1485,8 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx,
else if (ctx->Extensions.ARB_depth_buffer_float &&
type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV)
return GL_NO_ERROR;
switch (type) {
case GL_BYTE:
case GL_UNSIGNED_BYTE:
case GL_SHORT:
case GL_UNSIGNED_SHORT:
case GL_INT:
case GL_UNSIGNED_INT:
case GL_FLOAT:
return GL_INVALID_OPERATION;
default:
else
return GL_INVALID_ENUM;
}
case GL_DUDV_ATI:
case GL_DU8DV8_ATI:

View File

@@ -6027,6 +6027,20 @@ _mesa_rebase_rgba_float(GLuint n, GLfloat rgba[][4], GLenum baseFormat)
rgba[i][ACOMP] = 1.0F;
}
break;
case GL_RG:
for (i = 0; i < n; i++) {
rgba[i][BCOMP] = 0.0F;
rgba[i][ACOMP] = 1.0F;
}
break;
case GL_RED:
for (i = 0; i < n; i++) {
rgba[i][GCOMP] = 0.0F;
rgba[i][BCOMP] = 0.0F;
rgba[i][ACOMP] = 1.0F;
}
break;
default:
/* no-op */
;
@@ -6070,6 +6084,18 @@ _mesa_rebase_rgba_uint(GLuint n, GLuint rgba[][4], GLenum baseFormat)
rgba[i][ACOMP] = 1;
}
break;
case GL_RG:
for (i = 0; i < n; i++) {
rgba[i][BCOMP] = 0;
rgba[i][ACOMP] = 1;
}
break;
case GL_RED:
for (i = 0; i < n; i++) {
rgba[i][GCOMP] = 0;
rgba[i][BCOMP] = 0;
rgba[i][ACOMP] = 1;
}
default:
/* no-op */
;

Some files were not shown because too many files have changed in this diff Show More