Compare commits

...

294 Commits

Author SHA1 Message Date
Ian Romanick
f32ec82a8c docs: 9.1.3 release notes
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-21 12:59:17 -07:00
Ian Romanick
e9be1f7ce5 mesa: Bump version to 9.1.3
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-21 12:49:28 -07:00
Ian Romanick
caeab4d170 mesa: Note that a824692 is already back ported
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-21 12:47:32 -07:00
Paul Berry
cbe0e50247 intel: Do a depth resolve before copying images between miptrees.
When intel_finalize_mipmap_tree() calls intel_miptree_copy_teximage()
to reassemble a depth miptree that has been broken apart into pieces
(to deal with misalignment of levels/layers within the miptree), it
just copies the depth data, not the HiZ data.  This is reasonable,
since the alignment restrictions of HiZ are a large part of the reason
why the miptree had to be broken apart in the first place.  However,
in order for the depth copy to be sufficient, we need to do a depth
resolve first, to make sure any deferred depth writes that are in the
HiZ buffer get performed.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=64662 and
https://bugs.freedesktop.org/show_bug.cgi?id=64659.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 46ea804107)
2013-05-16 17:16:36 -07:00
Martin Andersson
c3eb301a3a r600g: Fix UMAD on Cayman
The multiplication part of tgsi_umad did not work on Cayman, because it did
not populate the correct vector slots.

This fixed hardlocks in the EXT_transform_feedback/order tests.

NOTE: This is a candidate for the stable branches.
(might not be easy to cherry-pick though)

Signed-off-by: Marek Olšák <maraeo@gmail.com>
Stable backport:
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-05-15 02:14:07 +01:00
Chad Versace
4969960105 intel: Allocate hiz in intel_renderbuffer_move_to_temp()
When moving the renderbuffer to a new miptree, we neglected to allocate
the hiz buffer for the new miptree. Oops.

Fixes all Piglit depthstencil-render-miplevels tests from crash to pass on
Sandybridge.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit aa391976df)
2013-05-14 11:24:49 -07:00
Eric Anholt
22f7bcd44f i965: Disable write masking when setting up texturing m0.
v2/Kayden: Also disable write masking in the vec4 backend.

Fixes 78 oglconform glsl-bif-tex-* subcases.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net> [v2]
(cherry picked from commit 86536a321d)
2013-05-13 09:17:06 -07:00
Chad Versace
3933e65328 egl/dri2: Fix min/max swap interval of configs
The commit below exposed a bug in dri2_add_config.

    commit 3998f8c6b5
    Author: Ralf Jung <post@ralfj.de>
    Date:   Tue Apr 9 14:09:50 2013 +0200

	egl/x11: Fix initialisation of swap_interval

This little code snippet near the bottom of dri2_add_config,

    if (double_buffer) {
       ...
       conf->base.MinSwapInterval = dri2_dpy->min_swap_interval;
       conf->base.MaxSwapInterval = dri2_dpy->max_swap_interval;
    }

it never did what it claimed to do. The assignment never changed the value
of conf->base.MaxSwapInterval, because dri2_dpy->max_swap_interval was,
until the above exposing commit, unitialized here. That is,
conf->base.MaxSwapInterval was 0 before and after assignment. Ditto for
the min swap interval.

Above the troublesome code snippet, the call to _eglFilterArray rejects
the config as unmatching if its swap interval bounds differ from the base
config's.  Before the exposing commit, at the call to _eglFilterArray, the
swap interval bounds were always [0,0], and hence no config was rejected
due to swap interval.

After the exposing commit, _eglFilterArray incorrectly rejected some
configs, which prevented dri2_egl_config::dri_double_config from getting
set for the rejected config, which resulted in a NULL pointer getting
passed into dri2CreateNewDrawable, and then segfault.

The solution: set the swap interval bounds before _eglFilterArray.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63447
Tested-by: Lu Hua <huax.lu@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit d3dfce3276)
2013-05-13 09:14:24 -07:00
Andreas Boll
1e043ebe03 mesa: add usage examples to get-pick-list and shortlog scripts
NOTE: This is a candidate for the stable branches.
(cherry picked from commit b8e41db053)
2013-05-10 16:41:33 -07:00
Andreas Boll
8487315e6e mesa: Add a script to generate the list of fixed bugs
This list appears in the fixed bugs section of the release notes.

v2: Add usage examples

NOTE: This is a candidate for the stable branches.
(cherry picked from commit ca79b72c00)
2013-05-10 16:41:33 -07:00
Eric Anholt
7881aae604 i965: Fix hangs on HSW since the gen6 blorp fix.
The constant packets for gen6 are too small for gen7, and while IVB seems
happy with them HSW blows up.  Fix it by emitting the correct packets on
gen7, for all stages.

v2: Include the packets instead of just skipping them.
NOTE: This is a candidate for the stable branches.
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5d06c9ea0f)
2013-05-10 16:41:33 -07:00
Eric Anholt
23fb93a918 i965: Fix SNB GPU hangs when a blorp batch is the first thing to execute.
The GPU apparently goes looking for constants even though there are no
shader stages enabled, and gets stuck because we haven't told it there are
no constants to collect.  If any other user of the 3D pipeline had run
(even the Render accel of the X server!) since power on, then the in-GPU
constant buffers would have been set up with some contents we didn't use,
and we would succeed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56416
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Dave Airlie <airlied@redhat.com>
NOTE: This is a candidate for the stable branches.
(cherry picked from commit 1dfea559c3)
2013-05-10 16:41:33 -07:00
Kenneth Graunke
49fa2f1135 i965/vs: Fix textureGrad() with shadow samplers on Haswell.
The shadow comparitor needs to be loaded into the Z component of the
last DWord.

Fixes es3conform's shadow_execution_vert and oglconform's
shadow-grad advanced.textureGrad.1D tests on Haswell.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b5b6460c40)
2013-05-10 16:41:32 -07:00
Kenneth Graunke
6907c00059 i965: Lower textureGrad() for samplerCubeShadow.
According to the Ivybridge PRM, Volume 4 Part 1, page 130, in the
section for the sample_d message: "The r coordinate contains the faceid,
and the r gradients are ignored by hardware."

This doesn't match GLSL, which provides gradients for all of the
coordinates.  So we would need to do some math to compute the face ID
before using sample_d.  We currently don't have any code to do that.

However, we do have a lowering pass that converts textureGrad to
textureLod, which solves this problem.  Since textureGrad on three
components is sufficiently obscure, it's not a performance path.

For now, only handle samplerCubeShadow; we need tests for samplerCube
and samplerCubeArray.

Fixes es3conform's shadow_comparison_frag test on Haswell.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e2f887b243)
2013-05-10 16:41:32 -07:00
Kenneth Graunke
79f89b46e1 glsl: Ignore redundant prototypes after a function's been defined.
Consider the following shader:

    vec4 f(vec4 v) { return v; }
    vec4 f(vec4 v);

The prototype exactly matches the signature of the earlier definition,
so there's absolutely no point in it.  However, it doesn't appear to
be illegal.  The GLSL 4.30 specification offers two relevant quotes:

"If a function name is declared twice with the same parameter types,
 then the return types and all qualifiers must also match, and it is the
 same function being declared."

"User-defined functions can have multiple declarations, but only one
 definition."

In this case the same function was declared twice, and there's only one
definition, which fits both pieces of text.  There doesn't appear to be
any text saying late prototypes are illegal, so presumably it's valid.

Unfortunately, it currently triggers an assertion failure:
ir_dereference_variable @ <p1> specifies undeclared variable `v' @ <p2>

When we process the second line, we look for an existing exact match so
we can enforce the one-definition rule.  We then leave sig set to that
existing function, and hit sig->replace_parameters(&hir_parameters),
unfortunately nuking our existing definition's parameters (which have
actual dereferences) with the prototype's bogus unused parameters.

Simply bailing out and ignoring such late prototypes is the safest
thing to do.

Fixes Piglit's late-proto.vert as well as 3DMark/Ice Storm for Android.

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
(cherry picked from commit 6c5cf8baa1)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39251
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61773
2013-05-10 16:41:32 -07:00
Alex Deucher
3f0ed60f93 radeonsi: add new SI pci ids
Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b5145ca2a8)
2013-05-10 16:41:32 -07:00
Alex Deucher
b517272a5d r600g: add new richland pci ids
Note: this is a candidate for the stable branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b3a856dfa9)
2013-05-10 16:41:32 -07:00
José Fonseca
3fb9f18be9 winsys/sw/xlib: Prevent shared memory segment leakage.
Running piglit with this was causing all sort of weird stuff happening
to my desktop (Chromium webpages become blank, Qt Creator flickered,
etc).  I tracked this down to shared memory segment leakage when GL is
not shutdown properly. The segments can be seen running `ipcs` and
looking for nattch==0.

This changes fixes this by calling shmctl(IPC_RMID) soon after creation
(which does not remove the segment immediately, but simply marks it for
removal when no more processes are attached).

This matches src/mesa/drivers/x11/xm_buffer.c behaviour.

v2:
- move shmctl(IPC_RMID) after XShmAttach() for *BSD, per Chris Wilson
- remove stray debug printfs, spotted by Ian Romanick

NOTE: This is a candidate for stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit e29525f79f)
2013-05-10 16:41:32 -07:00
Kenneth Graunke
e0e885ab86 mesa: Add unpack functions for A/I/L/LA [U]INT8/16/32 formats.
NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cef31bb290)
2013-05-10 16:41:32 -07:00
Kenneth Graunke
07671a5627 mesa: Add unpack functions for R/RG/RGB [U]INT8/16/32 formats.
NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 995051ee34)
2013-05-10 16:41:32 -07:00
Kenneth Graunke
9f66038b5b mesa: Add an unpack function for ARGB2101010_UINT.
v2: Remove extra parenthesis (suggested by Brian).

NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 531be501de)
2013-05-10 16:41:32 -07:00
Kenneth Graunke
d28e375a3e mesa: Fix unpack function for ETC2_SRGB8_PUNCHTHROUGH_ALPHA1.
We accidentally set MESA_FORMAT_ETC2_RGB8_PUNCHTHROUGH_ALPHA1 twice,
rather than setting the RGB8 and SRGB8 formats.

NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit b1fded54c9)
2013-05-10 16:41:31 -07:00
Marek Olšák
9548c93768 st/mesa: depth-stencil-alpha state also depends on _NEW_BUFFERS
because the code looks at the visual if there is a depth or stencil buffer
before enabling depth or stencil, respectively.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit d23c7455ae)
2013-05-10 16:41:31 -07:00
Marek Olšák
54cfc3fd3a r600g: initialize CMASK and HTILE with the GPU using streamout
This fixes a crash when a resource cannot be mapped to the CPU's address space
because it's too big.

This puts a global pipe_context in r600_screen, which is guarded by a mutex,
so that we can use pipe_context when there isn't one around.
Hopefully our multi-context support is solid.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit b692076420)
2013-05-10 16:41:31 -07:00
Marek Olšák
a040faa894 gallium/u_blitter: implement buffer clearing
Although this might be useful for ARB_clear_buffer_object,
I need it for initializating resources in r600g.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: comment cleanups

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 1ba46bbb4c)
2013-05-10 16:41:31 -07:00
Vadim Girlin
2f4134f7b5 gallium: handle drirc disable_glsl_line_continuations option
NOTE: This is a candidate for the 9.1 branch

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f732036f12)
2013-05-10 16:41:31 -07:00
Brian Paul
7088218b9c mesa: enable GL_ARB_texture_float if TEXTURE_FLOAT_ENABLED is defined
Per message on mesa-users list, this wasn't working before.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 877e3c1d42)
2013-05-10 16:41:31 -07:00
Dave Airlie
ed00ea3444 ralloc: don't write to memory in case of alloc fail.
For some reason I made this happen under indirect rendering,
I think we might have a leak, valgrind gave out, so I said I'd
fix the basic problem.

NOTE: This is a candidate for stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 47bd6e46fe)
2013-05-10 16:41:31 -07:00
Ian Romanick
1665f29c29 intel: Don't dereference a NULL pointer of calloc fails
The caller of NewTextureObject does the right thing if NULL is returned,
so this function should do the right thing too.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 505ac6ddc6)
2013-05-10 16:41:31 -07:00
Ian Romanick
9ef8b94e22 mesa/swrast: Move free calls outside the attachment loop
This was originally discovered by Klocwork analysis:

    Possible memory leak. Dynamic memory stored in 'srcBuffer0'
    allocated through function 'malloc' at line 566 can be lost at line
    746

However, I think the problem is actually much worse.  Since the memory
is freed after the first pass through the loop, the released buffer may
be used on the next iteration!

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a27c6e1aea)
2013-05-10 16:41:31 -07:00
Ian Romanick
4a6a33d398 mesa/swrast: Refactor no-memory error checking in blit_linear
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6758498eb7)
2013-05-10 16:41:30 -07:00
Roland Scheidegger
713e321fcf gallivm: fix small but severe bug in handling multiple lod level strides
Inserting the value for the second quad in the wrong place for the
following shuffle. This meant the row or image stride was undefined which is
quite catastrophic, can lead to bogus texels fetched or just segfault.
This code is only hit for SoA path currently, still surprising it
didn't crash more or caused more visible issues (I think llvm used a
broadcast shuffle for the undefined parts of the vector, hence the undefined
value for the second quad was just the same as that from the first quad,
so as long as both quads hit the same mip level everything was fine, and since
lower mips always have the same large stride it made it less likely to
hit out-of-bound memory in case of differing lods).

Note: this is a candidate for stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 1d6eb23f2d)
2013-05-10 16:41:30 -07:00
Ian Romanick
a61650ab83 mesa: Don't leak gl_context::BeginEnd at context destruction
The other dispatch tables (Exec and Save) are freed, but BeginEnd is
never freed.  This was found by inspection why investigating the leak of
shared state in _mesa_initialize_context.

NOTE: This is a candidate for stable branches

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 1faaa411c7)
2013-05-10 16:41:30 -07:00
Ian Romanick
13f7cd25f3 mesa: Don't leak shared state when context initialization fails
Back up at line 1017 (not shown in patch), we add a reference to the
shared state.  Several places after that may divert to the error
handler, but, as far as I can tell, nothing ever unreferences the shared
state.

Fixes issue identified by Klocwork analysis:

    Resource acquired to 'shared->TexMutex' at line 1012 may be lost
    here. Also there is one similar error on line 1087.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6e06550e4e)
2013-05-10 16:41:30 -07:00
Ian Romanick
e928a3059c egl/dri2: NULL check value returned by dri2_create_surface
dri2_create_surface can fail for a variety of reasons, including bad
input data.  Dereferencing the NULL pointer and crashing is not okay.

Fixes issue identified by Klocwork analysis:

    Pointer 'surf' returned from call to function 'dri2_create_surface'
    at line 285 may be NULL and will be dereferenced at line 291.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f730c210b8)
2013-05-10 16:41:30 -07:00
Ian Romanick
93613693ad mesa: NULL check the pointer before trying to dereference it
Duh.

Fixes issues identified by Klocwork analysis:

    Pointer 'table' returned from call to function 'calloc' at line 115
    may be NULL and will be dereferenced at line 117.

and

    Suspicious dereference of pointer 'table' before NULL check at line
    119.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2cc0b3294a)
2013-05-10 16:41:30 -07:00
Dave Airlie
377213b3ee st/mesa: fix UBO offsets.
Reported and tested by degasus on #radeon.

Note: This is a candidate for the 9.1 branch

Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cb12bf7606)
2013-05-10 16:41:30 -07:00
Ralf Jung
d1b4165fcf egl/x11: Fix initialisation of swap_interval
The EGLConfig attributes EGL_MIN/MAX_SWAP_INTERVAL were incorrectly set to
0 and 0. This prevented clients from setting the swap interval to a
reasonable value, like 1 or 2.

Swap interval worked correctly in Mesa 9.0. The commit below introduced
the bug.

    commit 7e9bd2b2ed
    Author: Eric Anholt <eric@anholt.net>
    Date:   Tue Sep 25 14:05:30 2012 -0700
	egl: Add support for driconf control of swapinterval.

Note: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63078
[chadv: Wrote commit message]
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3998f8c6b5)
2013-05-10 16:41:30 -07:00
Eric Anholt
04d7b718c6 i965/gen6: Reduce updates of transform feedback offsets with HW contexts.
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we were actually
updating it with a bogus value if the batch wrapped and we emitted the
packet again during a single transform feedback.  By reducing state
emission, we avoid the bug.

Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <stereotype441@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 007a88ed24)
2013-05-10 16:41:30 -07:00
Eric Anholt
d47e7a76a3 i965/gen7: Skip resetting SOL offsets at batch start with HW contexts.
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we can't reliably
compute offsets for our buffer pointers after a batch flush.  Thanks to HW
contexts, our transform feedback offsets are now saved, so we can just
keep using the ones from before the batch wrap.

Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <stereotype441@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 62a18da341)
2013-05-10 16:41:29 -07:00
Marek Olšák
4ae5638864 mesa: fix glGet queries depending on derived framebuffer state (v2)
"ctx->DrawBuffer->Visual" might be invalid if (NewState &_NEW_BUFFERS) != 0.

v2: also fix:
    - RGBA_INTEGER_MODE_EXT
    - RGBA_FLOAT_MODE_ARB (also check API support)
    - FRAMEBUFFER_SRGB_CAPABLE_EXT

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b6475f9437)
2013-05-10 16:41:29 -07:00
Paul Berry
7c6d62a670 glsl/linker: Reduce scope of non-flat integer varying fix.
In the mailing list discussion of "glsl/linker: fix varying packing
for non-flat integer varyings." (commit 7862bde), we concluded that
since the bug only applies to integral variables, it is safer to just
apply the bug fix to integer varyings.  I forgot to make the change
before pushing the patch upstream.  (Note: we aren't aware of any bugs
in commit 7862bde; it just seems wise to be on the safe side).

This patch makes the change.  Assuming commit 7862bde gets
cherry-picked back to 9.1, this commit should be cherry-picked too.

NOTE: This is a candidate for the 9.1 release branch.
(cherry picked from commit 5306af2113)
2013-05-10 16:41:29 -07:00
Paul Berry
6e0960e726 glsl/linker: Adapt flat varying handling in preparation for geometry shaders.
When a varying is consumed by transform feedback, but is not used by
the fragment shader, assign_varying_locations() sets its interpolation
type to "flat" in order to ensure that lower_packed_varyings never has
to deal with non-flat integral varyings (the GLSL spec doesn't require
integral vertex outputs to be flat if they aren't consumed by the
fragment shader).

A similar situation will arise when geometry shader support is added,
since the GLSL spec only requires integral vertex shader outputs to be
flat when they are consumed by the fragment shader.  This patch
modifies the linker to handle this situation too.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 32d2b2aa2c)
2013-05-10 16:41:29 -07:00
Paul Berry
0b2fa0eec6 glsl: Document lower_packed_varyings' "flat" requirement with an assert.
To minimize the variety of type conversions that lower_packed_varyings
needs to perform, it assumes that integral varyings are always
qualified as "flat".  link_varyings.cpp takes care of ensuring that
this is the case (even in the circumstances where GLSL doesn't require
it).

This patch documents the assumption with an assertion, for ease in
future debugging.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 8687c40c2d)
2013-05-10 16:41:29 -07:00
Paul Berry
bd8d257ef3 glsl/linker: fix varying packing for non-flat integer varyings.
Commit dfb57e7 (glsl: Fix error checking on "flat" keyword to match
GLSL ES 3.00, GLSL 1.50) relaxed the rules for integral varyings: they
only need to be declared as "flat" if they are a fragment shader
inputs.  This allowed for the possibility of a vertex shader output
being a non-flat integer, provided that it was not matched to a
fragment shader input.  A non-contrived situation where this might
arise is if a vertex shader generates some integral outputs which are
consumed by tranform feedback, but not by the fragment shader.

Unfortunately, lower_packed_varyings assumes that *all* integral
varyings are flat, regardless of whether they are consumed by the
fragment shader.  As a result, attempting to create a non-flat
integral vertex output of a size that required packing (i.e. a size
other than ivec4 or uvec4) would cause an assertion failure in
lower_packed_varyings.

This patch prevents the assertion failure by forcing vertex shader
outputs to be "flat" whenever they are not consumed by the fragment
shader.  This should have no effect on rendering since the "flat"
keyword only affects the behaviour of fragment shader inputs.

Fixes piglit test "spec/EXT_transform_feedback/nonflat-integral".

NOTE: This is a candidate for the 9.1 release branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 7862bde8af)
2013-05-10 16:41:29 -07:00
Chris Forbes
509054eb25 mesa: don't memcmp() off the end of a cache key.
Reported-by: `per` in #intel-gfx

The size of the cache key varies, so store the actual size as well as
the key blob itself, rather than just assuming it's the same as the size
passed in.

NOTE: This is a candidate for stable branches.

V2: Don't leave silly holes in structure; use unsigned instead of GLuint.
V3: Fix missing case for `last` match.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit c4629ad3f9)
2013-05-10 16:41:29 -07:00
Brian Paul
df4e6650e3 gallium/u_blitter: fix is_blit_generic_supported() stencil checking
Don't check if there's sampler support for stencil if we're not
going to actually blit/copy stencil values.  Fixes the case where
we mistakenly said we can't support a blit of depth values from
S8Z24 to X8Z24.

Also, rename the is_stencil variable to dst_has_stencil to improve
readability.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit de99b6d117)
2013-05-10 16:41:29 -07:00
Alexander Monakov
cc53944c26 Honor GLX_DONT_CARE in MATCH_MASK
NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47478
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62999
Bugzilla: http://bugs.winehq.org/show_bug.cgi?id=26763
(cherry picked from commit 9cda356004)
2013-05-10 16:41:29 -07:00
Kenneth Graunke
acc3561cca i965: Fix stencil write enable flag in 3DSTATE_DEPTH_BUFFER on Gen7+.
ctx->Stencil.WriteMask is a statically sized array of 3 elements.
Checking it against 0 actually is a NULL check, and can never fail,
which meant that we always said stencil writes were enabled.

Use the new core Mesa derived state flag to fix this.

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit 01bd29d681)
2013-05-10 16:41:28 -07:00
Paul Berry
671e4e6b9e i965: Reduce code duplication in handling of depth, stencil, and HiZ.
This patch consolidates duplicate code in the brw_depthbuffer and
gen7_depthbuffer state atoms.  Previously, these state atoms contained
5 chunks of code for emitting the _3DSTATE_DEPTH_BUFFER packet (3 for
Gen4-6 and 2 for Gen7).  Also a lot of logic for determining the
appropriate buffer setup was duplicated between the Gen4-6 and Gen7
functions.

This refactor splits the code into three separate functions:
brw_emit_depthbuffer(), which determines the appropriate buffer setup
in a mostly generation-independent way, brw_emit_depth_stencil_hiz(),
which emits the appropriate state packets for Gen4-6, and
gen7_emit_depth_stencil_hiz(), which emits the appropriate state
packets for Gen7.

Tested using Piglit on Gen5-7 (no regressions).

v2: Re-word some comments.  Fix an assertion that incorrectly
prohibited packed depth/stencil formats on Gen6 (these are allowed
provided that HiZ is disabled).

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 41e4bccc75)
2013-05-10 16:41:28 -07:00
Kenneth Graunke
ae79402dba mesa: Add new ctx->Stencil._WriteEnabled derived state flag.
i965 needs to know whether stencil writes are enabled in several places,
and gets the test wrong sometimes.  While we could create a function to
compute this, it seems generally useful enough to warrant a new piece of
derived state.  Also, all the plumbing is already in place.

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit 1e3235d36e)
2013-05-10 16:41:28 -07:00
Marek Olšák
2708dc5e88 radeonsi: add more cases for copying unsupported formats to resource_copy_region
Ported from r600g commit:

8891b2f9c9

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit ff01e0db0e)
2013-05-10 16:41:28 -07:00
Paul Berry
c5a1eabaf2 glsl: Fix array indexing when constant folding built-in functions.
Mesa constant-folds built-in functions by using a miniature GLSL
interpreter (see
ir_function_signature::constant_expression_evaluate_expression_list()).
This interpreter had a bug in its handling of array indexing, which
caused expressions like "m[i][j]" (where m is a matrix) to be handled
incorrectly.  Specifically, it incorrectly treated j as indexing into
the whole matrix (rather than indexing just into the vector m[i]); as
a result the offset computed for m[i] was lost and m[i][j] was treated
as m[j][0].

Fixes piglit tests inverse-mat[234].{vert,frag}.

NOTE: This is a candidate for the 9.1 and 9.0 branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57436
(cherry picked from commit 7d4f1e6467)
2013-05-10 16:41:28 -07:00
Michel Dänzer
7c6472410a radeonsi: Handle arbitrary 2-byte formats in resource_copy_region
Fixes mplayer -vo vdpau OSD.

NOTE: This is a candidate for the 9.1 branch.

Reported-by: Igor Vagulin <igor.vagulin@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit c6efb4870b)
2013-05-10 16:41:28 -07:00
Maarten Lankhorst
09f5ee9918 nvc0: Fix fd leak in nvc0_create_decoder
NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
(cherry picked from commit 6d20c646d6)
2013-05-10 16:41:28 -07:00
Aras Pranckevicius
46ac963a23 GLSL: fix lower_jumps to report progress properly
A fix for lower_jumps progress reporting, very much like similar in
c1e591eed.

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit b2eee0869f)
2013-05-10 16:41:28 -07:00
Eric Anholt
ee561e0927 i965/fs: Clean up the setup of gen4 simd16 message destinations.
I think this makes it much more obvious what's going on here.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8edc7cbe64)
2013-05-10 16:41:28 -07:00
Eric Anholt
724269bb32 i965/fs: Do CSE on gen7's varying-index pull constant loads.
This is our first CSE on a regs_written() > 1 instruction, so it takes a
bit of extra fixup.  Reduces the number of loads on kwin's Lanczos shader
from 12 to 2.

v2: Fix compiler warning (false positive on possibly-uninitialized variable)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 9f43b84928)
2013-05-10 16:41:27 -07:00
Eric Anholt
f523c0fb21 i965/fs: Avoid inappropriate optimization with regs_written > 1.
Right now we don't have anything with regs_written() > 1 and !inst->mlen,
but that's about to change.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bc0e1591f6)
2013-05-10 15:40:28 -07:00
Eric Anholt
52bf09d52c i965: Make the constant surface interface take a normal byte size.
This puts the rounding-up logic into the function itself instead of all
the callers having to manage it.  Also drop an "unused" comment in gen4,
as the stride *is* used for texbos (and will be for uniforms soon).

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2f41a60145)
2013-05-10 13:43:11 -07:00
Eric Anholt
7f2a65d896 i965/fs: Move varying uniform offset compuation into the helper func.
I'm going to want to change the math for gen7 using sampler LD
instructions in a way that gets CSE to occur like we'd hope.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8c694dfe64)
2013-05-10 13:43:11 -07:00
Eric Anholt
d61b1fdad6 i965/fs: Remove creation of a MOV instruction that's never used.
We weren't inserting it into the list, so it did nothing.  This line was
replaced by the MOV/MUL block above.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 59e858861c)
2013-05-10 13:43:11 -07:00
Haixia Shi
627e2669ab ACTIVE_UNIFORM_MAX_LENGTH should include 3 extra characters for arrays.
If the active uniform is an array, then the length of the uniform name should
include the three extra characters for the "[0]" suffix, which is required by
the GL 4.2 spec to be appended to the uniform name in glGetActiveUniform().

This avoids the situation where the output buffer does not have enough space
to hold the "[0]" suffix, resulting in an incomplete array specification like
"foobar[0".

NOTE: This is a candidate for the 9.1 branch.

Change-Id: I41e87ba347a7169eec8c575596cc3416adbe0728
Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit bc0cc2944f)
2013-05-10 13:43:11 -07:00
Brian Paul
44d35d70e3 mesa: remove platform checks around __builtin_ffs, __builtin_ffsll
Use the __builtin_ffs, __builtin_ffsll functions whenever we have GCC,
not just for specific platforms.  Fixes Solaris build.

Note: This is a candidate for the stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62868
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 95df2b2883)
2013-05-10 13:43:11 -07:00
Ian Romanick
f5887e4d3f mesa: Note that patch 0967c36 shouldn't actually get picked to the 9.1 branch
The code didn't apply cleanly due to a number of refactors, so a
different solution was needed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-10 13:43:11 -07:00
Chris Forbes
34a4fc5989 i965/fs: Don't try to use bogus interpolation modes pre-Gen6.
Interpolation modes other than perspective-barycentric-pixel-center (and
their associated coefficients in the WM payload) only exist in Gen6 and
later.

Unfortunately, if a varying was declared as `centroid`, we would blindly
read the nonexistant values, and so produce all manner of bad behavior
-- texture swimming, snow, etc.

Fixes rendering in Counter-Strike Source and Team Fortress 2 on
Ironlake.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 79f786f936)
2013-05-08 15:39:25 -07:00
Ian Romanick
f81eea3f1f docs: Add 9.1.2 release md5sums
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-30 15:25:57 -07:00
Ian Romanick
8c2981b8e0 docs: 9.1.2 release notes
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-30 15:18:53 -07:00
Ian Romanick
f9abbcacaa mesa: Bump version to 9.1.2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-30 15:17:47 -07:00
Chris Forbes
251c87d884 i965/vs: Fix Gen4/5 VUE map inconsistency with gl_ClipVertex
This is roughly a backport of Eric's commit 0967c362.

We avoided assigning a slot in the VUE map for gl_ClipVertex, but left
the bit set in outputs_written, producing horrible confusion further
down the pipe.

Mostly fixes rendering in source games, and probably in Freespace 2 SCP.

No Piglit regressions on Ironlake.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>

V2: Mask out the bit, not its index. Strangely, the game still worked
with that wrong, but rendering of pretty much anything else was
completely trashed.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-30 07:16:02 +12:00
Adam Jackson
3cff41c7e4 linux: Don't emit a .note.ABI-tag section anymore (#26663)
We don't support pre-2.6 kernels anyway - the install docs say 2.6.28
for DRI - and apparently this confuses ld.so's sorting when multiple
libGLs are installed.  Just remove it.

Note: this is a candidate for the stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 904b03824b)
2013-04-27 17:22:51 +10:00
Alex Deucher
e78b553195 r600g: disable hyperz by default on 9.1
There are too many cases were we end up with lockups.
Once we sort out the remaining issues on master, they
can be backported and hyperz can be re-enabled on 9.1

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-22 12:16:51 -04:00
Tom Stellard
f0440493c2 r300g: Fix bug in OMOD optimization
https://bugs.freedesktop.org/show_bug.cgi?id=60503

NOTE: This is a candidate for the stable branches.
(cherry picked from commit c6a86fb563)
2013-04-12 09:35:00 -07:00
Carl Worth
4f44146226 i965: Avoid segfault in gen6_upload_state
This fixes a bug introduced in commit 258453716f and
triggered whenever "rb" is NULL.

Fixes at least one cause bug #59445:

	[SNB/IVB/HSW Bisected]Oglc draw-buffers2(advanced.blending.none) segfault
	https://bugs.freedesktop.org/show_bug.cgi?id=59445

(Though segfaults are still possible in that test case, but they have been
present since before commit 258453716f which is what's being fixed here.)

Reviewed-by: Eric Anholt <eric@anholt.net>
[jordan.l.justen@intel.com: fixes Anomaly Warzone Earth crash at title screen]
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-10 19:48:56 -07:00
Ian Romanick
39bb794aba mesa: Note that patch dbf94d1 should't actually get picked to the 9.1 branch
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-08 15:01:07 -07:00
Ian Romanick
c18d48da41 glsl: Add missing bool case in glsl_type::get_scalar_type
Since the case was missing bec4->get_scalar_type() would return bvec4,
but vec4->get_scalar_type() would return float.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit c770faea0a)
2013-04-08 14:49:58 -07:00
Martin Andersson
830bc1cbe6 r600g: Use virtual address for PIPE_QUERY_SO* in r600_emit_query_end
Virtual address is used for PIPE_QUERY_SO* queries in
r600_emit_query_begin, but not in r600_emit_query_end.

This will trigger a GPU fault when one of those queries is
made and virtual address is enabled.

Note: this is a candidate for the 9.1 branch

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 92855bcc95)
2013-04-08 14:49:53 -07:00
Eric Anholt
c589071fb2 mesa: Disable validate_ir_tree() on release builds.
Since half of ir_validate uses asserts() (the other using printf() then
abort()), there's not much use to calling it in a release build.  Cuts
6.3% of the startup time of TF2.

NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 712bac1f41)
2013-04-08 14:49:48 -07:00
Marek Olšák
80092d8869 mesa: handle HALF_FLOAT like FLOAT in get_tex_rgba
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit b2a4573c14)
2013-04-08 14:49:44 -07:00
Matt Turner
c7720a24be mesa: Implement TEXTURE_IMMUTABLE_LEVELS for ES 3.0.
NOTE: This is a candidate for the 9.1 branch.
Fixes piglit's texture-immutable-levels test.
Reported-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 12dc4be8a6)
2013-04-05 19:01:10 -07:00
Adam Jackson
82ac970d37 glx: Build with VISIBILITY_CFLAGS in automake
Note: This is a candidate for the stable branches.

Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 38aa8ec937)
2013-04-05 19:01:09 -07:00
Michel Dänzer
e0af764882 radeonsi: Emit pixel shader state even when only the vertex shader changed
Fixes random failures with piglit glsl-max-varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 032e5548b3)
2013-04-05 19:01:09 -07:00
Kenneth Graunke
0c5fa7ae0e i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.
Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
RED, RG, and RGB textures in order to force alpha to 1.0 in case we
actually stored the texture as RGBA.

This had a unforseen performance implication: the shader precompile
assumes that the texture swizzle mode will be XYZW for non-shadow
sampler types.  By setting it to XYZ1, this means every shader used with
a RED, RG, or RGB texture has to be recompiled.  This is a very common
case.

Unfortunately, there's no way to improve the precompile, since RGBA
textures still need XYZW, and there's no way to know by looking at
the shader source what texture formats might be used.

However, we only need to smash alpha to 1.0 if the texture's memory
format actually has alpha bits.  If not, the sampler already returns 1.0
for us without any special swizzling.  XRGB8888, for example, is a very
common case where this occurs.

This partially fixes a performance regression since commit 33599433c7.
More work is required to fully fix it in all cases.  This at least helps
Warsow.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d86efc075e)
2013-04-05 19:01:09 -07:00
Maarten Lankhorst
725c671d61 radeon/llvm: Do not link against libgallium when building statically.
NOTE: This is a candidate for the 9.1 branch.

Tested-by: Vincent Lejeune <vljn@ovi.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
(cherry picked from commit 7c3d8301af)
2013-04-05 19:01:09 -07:00
Andreas Boll
4205bd4b9b gallium/egl: fix out-of-tree build
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/15-fix-oot-build.diff;h=7040999a22d3937d0578cfd85ee2c71d7dc614bb;hb=refs/heads/ubuntu%2B1

NOTE: This is a candidate for the 9.1 branch.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 182895c4e6)
2013-04-05 19:01:09 -07:00
Andreas Boll
6e8f8a959b osmesa: fix out-of-tree build
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/14-fix-osmesa-build.diff;h=00581d0e1833c5492d9050e1bf3d5e658cad782e;hb=refs/heads/ubuntu%2B1

v2: Move the added line immediately after -I$(top_srcdir)/src/mapi

NOTE: This is a candidate for the 9.1 and 9.0 branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 92e6260c19)
2013-04-05 19:01:09 -07:00
Andreas Boll
0c0e72f756 build: Enable x86 assembler on Hurd.
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/10-hurd-configure-tweaks.diff;h=984e17df1b8afdf8e4b36bee96aa5ab6a5691021;hb=refs/heads/ubuntu%2B1

Thanks to Pino Toscano.

v2: Don't bother with x86_64. AFAICT GNU/Hurd doesn't support it so far.

NOTE: This is a candidate for stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 06fff296e9)
2013-04-05 19:01:09 -07:00
Andreas Boll
60e5696de3 mesa: use ieee fp on s390 and m68k
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/02_use-ieee-fp-on-s390-and-m68k.patch;h=d3d6c1d7fec3c72ecf320706167deb61c52636c3;hb=refs/heads/ubuntu%2B1

Fixes Debian bug #349437.

Patch written by David Nusinow.

NOTE: This is a candidate for stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7962f28c43)
2013-04-05 19:01:09 -07:00
Roland Scheidegger
7067d65e56 gallivm: fix return opcode handling in main function of a shader
If we're in some conditional or loop we must not return, or the code
after the condition is never executed.
(v2): And, we also can't just continue as nothing happened, since the
mask update code would later check if we actually have a mask, so we
need to remember that there was a return in main where we didn't exit
(to illustrate this, a ret in a if clause would cause a mask update
which is still ok as we're in a conditional, but after the endif the
mask update code would drop the mask hence bringing execution back to
pixels which should have their execution mask set to zero by the ret).
Thanks to Christoph Bumiller for figuring this out.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 5af7b45986)
2013-04-05 19:01:09 -07:00
Andreas Boll
4999f0a84e radeon/llvm: Link against libgallium.la to fix an undefined symbol
Ported from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/119-libllvmradeon-link.patch;h=ee47f8a07dbf33c32f8b57faed923680ed6648fb;hb=refs/heads/ubuntu%2B1

Fixes a regression introduced with
f70c385351

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62434
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
(cherry picked from commit 36320bfa54)
2013-04-05 19:01:08 -07:00
Maarten Lankhorst
70f7138754 gallium/build: Fix visibility CFLAGS in automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Fix formatting - use one CFLAG per line

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59238
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
(cherry picked from commit f70c385351)
2013-04-05 19:01:08 -07:00
Paul Berry
0756ab9c85 i965: Apply depthstencil alignment workaround when doing fast clears.
Fast depth clears have the same depth/stencil alignment requirements
as other drawing operations.  Therefore, we need to call
brw_workaround_depthstencil_alignment() from both the clear and
drawing paths.

Without this fix, we get image corruption if the following conditions
hold: (a) the first ever drawing operation to a depth miplevel (or the
first drawing operation after having used the texture for sampling) is
a clear, (b) the depth miplevel has a size that is eligible for fast
depth clears, and (c) the depth miplevel has an offset within the
miptree that isn't 8x8 aligned.

Fixes piglit "depthstencil-render-miplevels" tests with size 273.

NOTE: This is a candidate for stable branches

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit c5d5827951)
2013-04-05 19:01:08 -07:00
Kenneth Graunke
6e6dcd451e i965: Make INTEL_DEBUG=shader_time use the RAW surface format.
Untyped Atomic Operation messages are illegal for non-RAW formats.  The
IVB hardware proceeds happily (after all, who cares what the format of the
surface is if you're doing untyped ops on it?), but later hardware
apparently doesn't.  The simulator for gen7 does complain, though.

v2: Rebase against updates to previous patches. (by anholt)

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 91df4d746b)
2013-04-05 19:01:08 -07:00
Kenneth Graunke
0d9f849ddf i965: Specialize SURFACE_STATE creation for shader time.
This is basically a copy and paste of gen7_create_constant_surface, but
with the parameters filled in to offer a simpler interface.

It will diverge shortly.

I didn't bother adding it to the vtable for now since shader time is only
exposed on Gen7+.

v2: Replace tabs in the new code (by anholt)
    Add back dropped memset() and add a comment about HSW channel selects.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 125b34cffb)
2013-04-05 19:01:08 -07:00
Kenneth Graunke
f32e776efb i965: Fix INTEL_DEBUG=shader_time for Haswell.
Haswell's "Data Cache" data port is a single unit, but split into two
SFIDs to allow for more message types without adding more bits in the
message descriptor.

Untyped Atomic Operations are now message 0010 in the second data cache
data port, rather than 6 in the first.

v2: Use the #defines from the previous commit. (by anholt)

NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
(cherry picked from commit f27a220cad)
2013-04-05 19:01:08 -07:00
Eric Anholt
74e8838179 i965: Add definitions for gen7+ data cache messages.
We were sparsely using some of these message types, but I'll just fill
them all in now.  It will be used for fixing shader_time on HSW.

v2: Add missing MEDIA_BLOCK_READ.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a2d08f170a)
2013-04-05 19:01:08 -07:00
Anuj Phogat
02a8f04de1 mesa: Fix FB blitting in case of zero size src or dst rect
Framebuffer blitting operation should be skipped if any of the
dimensions (width/height) of src/dst rect is zero.

V2: Move the dimension check after error checking in _mesa_BlitFramebuffer.

Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform
https://bugs.freedesktop.org/show_bug.cgi?id=59495

Note: Candidate for all the stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit d78dcdf103)
2013-04-05 19:01:08 -07:00
José Fonseca
038a29b5c5 include: Fix build with VS 11 (i.e, 2012).
NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 57cd1d1454)
2013-04-05 19:01:08 -07:00
Ian Romanick
4626e6e270 mesa: Add previously picked commit to .cherry-ignore
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-05 19:01:07 -07:00
José Fonseca
fe72a61382 mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all.
This is a squash-commit of the two commits listed below.  The first
introduced a 'make check' failure, and the second fixed it.

    mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all.

    We were in four already...

    NOTE: Candidate for the stable branches.

    Reviewed-by: Brian Paul <brianp@vmware.com>
    (cherry picked from commit 70fe7c6d3e)

And:

    tests: Add $(top_srcdir)/include to AM_CPPFLAGS.

    Fixes this build error with make check.

      CC     collision.o
    In file included from ../../../../../src/mesa/main/hash_table.h:34:0,
                     from collision.c:31:
    ../../../../../src/mesa/main/compiler.h:51:53: fatal error: c99_compat.h: No such file or directory

    Signed-off-by: Vinson Lee <vlee@freedesktop.org>
    (cherry picked from commit a6bb7a9495)
2013-04-05 19:01:07 -07:00
José Fonseca
e2695d53c7 autotools: Add missing top-level include dir.
Fixes autotools build failure.  Not sure if there are more, as I have
difficulties in building the full tree.
(cherry picked from commit 7bff1cc3f6)
2013-04-05 19:01:07 -07:00
Christoph Bumiller
923bb2d8ed nvc0: fix for 2d engine R source formats writing RRR1 and not R001 2013-04-04 13:03:52 +02:00
Christoph Bumiller
ac4be46279 nv50,nvc0: fix 3d blits, restore viewport after blit
Conflicts:
	src/gallium/drivers/nvc0/nvc0_surface.c
2013-04-04 13:03:42 +02:00
Christoph Bumiller
5ba62ee201 nv50,nvc0: disable DEPTH_RANGE_NEAR/FAR clipping during blit
We send position.z == 0, DEPTH_RANGE may be some arbitrary range
not including 0 (for exmaple in piglit's hiz tests).
2013-04-04 12:54:41 +02:00
Christoph Bumiller
7410ba1265 nv50: fix 3D render target setup 2013-04-04 12:54:31 +02:00
Marek Olšák
980f84c392 gallium/tgsi: fix valgrind warning
"Conditional jump or move depends on uninitialised value(s)"

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 9cec5edea7)
2013-03-26 03:23:59 +01:00
Eric Anholt
ff27e18834 i965/fs: Also do the gen4 SEND dependency workaround against other SENDs.
We were handling the the dependency workaround for the first written reg
of a send preceding the one we're fixing up, but didn't consider the other
regs.  Thus if you had two sampler calls that got allocated to the same
set of regs, one might, rarely, ovewrite the other.  This was occurring in
XBMC's GLSL shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44567
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 4dc7e6dcbf)
2013-03-25 14:19:23 -07:00
Eric Anholt
c5bc65bd62 i965/fs: Fix broken rendering in large shaders with UBO loads.
The lowering process creates a new vgrf on gen7 that should be represented
in live interval analysis.  As-is, it was getting a conflicting allocation
with gl_FragDepth in the dolphin emulator, producing broken rendering.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1323772543)
2013-03-25 14:19:19 -07:00
Eric Anholt
6dcccf7131 i965/fs: Fix register allocation for uniform pull constants in 16-wide.
We were allowing a compressed instruction to write a register that
contained the last use of a uniform pull constant (either UBO load or push
constant spillover), so it would get half its values smashed.

Since we need to see the actual instruction to decide this, move the
pre-gen6 pixel_x/y logic here, which should improve the performance of
register allocation since virtual_grf_interferes() is called more than
once per instruction.

NOTE: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f10f5e4980)
2013-03-25 14:18:29 -07:00
Marek Olšák
b3c8a250e6 mesa: don't allocate a texture if width or height is 0 in CopyTexImage
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 4b69c1a92d)
2013-03-25 14:17:28 -07:00
Jan de Groot
5aacecb08c dri/nouveau: fix crash in nouveau_flush
https://bugs.freedesktop.org/show_bug.cgi?id=61947

Note: this is a candidate for the stable branches
(cherry picked from commit 17f1cb1d99)
2013-03-25 14:17:23 -07:00
Alan Hourihane
d095747749 mesa: fix glGetInteger*(GL_SAMPLER_BINDING).
If the sampler object has been deleted on another context, an
alternative context may reference the old sampler. So ensure the sampler
object still exists.

Note: this is a candidate for the stable branch.

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 5984a911f9)
2013-03-25 14:15:59 -07:00
Alan Hourihane
cbbea7cc58 Unreference sampler object when it's currently bound to texture unit.
This change specifically unbinds a sampler object from the texture unit
if it's bound to a unit. The spec calls for default object when deleting
sampler objects which are currently bound.

Note: this is a candidate for the stable branches

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cf0b4a30fc)
2013-03-25 14:15:54 -07:00
Brian Paul
68e25de7d4 llvmpipe: add some scene limit sanity check assertions
Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit a51b81558f)
2013-03-25 14:15:48 -07:00
Brian Paul
7ddda67835 llvmpipe: tweak CMD_BLOCK_MAX and LP_SCENE_MAX_SIZE
We advertise a max texture/surfaces size of 8K x 8K but the old values
for these limits didn't actually allow us to handle that surface size.

For 8K x 8K we'll have 16384 bins.  Each bin needs at least one cmd_block
object which was 2192 bytes in size.  Since 16384 * 2192 exceeded
LP_SCENE_MAX_SIZE we'd silently fail in lp_scene_new_data_block() and not
draw the complete scene.

By reducing CMD_BLOCK_MAX to 29 we get nice 512-byte cmd_blocks.  And
by increasing LP_SCENE_MAX_SIZE to 9 MB we can allocate enough command
blocks for 8K x 8K, plus a few regular data blocks.

Fixes the (improved) piglit fbo-maxsize test.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit a31ebdffa0)
2013-03-25 14:15:45 -07:00
Marcin Slusarz
efd094d052 dri/nouveau: NV17_3D class is not available for NV1a chipset
Should fix https://bugs.freedesktop.org/show_bug.cgi?id=60510

Note: this is a candidate for the stable branches

Acked-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit f4ebcd133b)
2013-03-25 14:15:38 -07:00
Matt Turner
7f0014f64a configure.ac: Remove stale comment about --x-* arguments.
Should have been removed with e273ed37.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 523b07e320)
2013-03-25 14:15:33 -07:00
Matt Turner
8d43e33ab2 configure.ac: Don't check for X11 unconditionally.
X11 is already checked conditionally below.

Fixes OSMesa-only configurations to not require X11.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 35189d768b)
2013-03-25 14:15:29 -07:00
Alan Hourihane
7f44f9ddc3 Add missing GL_TEXTURE_CUBE_MAP entry in _mesa_legal_texture_dimensions
This was hit on the glTexStorage2D() path.

Note: this is a candidate for the stable branches

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 196443f3f5)
2013-03-25 14:15:21 -07:00
Tapani Pälli
a31c9c3fa9 intel: Fix regression in intel_create_image_from_name stride handling
Strangely, the DRIimage interface we have passes the pitch in pixels
instead of bytes, which anholt missed in the change to using bytes for
region pitch.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61197
(cherry picked from commit e062a4187d)
2013-03-25 14:07:41 -07:00
Brian Paul
e6616948b7 vbo: fix crash found with shared display lists
This fixes a crash when a display list is created in one context
but executed from a second one.  The vbo_save_context::vertex_store
memeber will be NULL if we never created a display list with the
context.  Just check for that before dereferencing the pointer.

Fixes http://bugzilla.redhat.com/show_bug.cgi?id=918661

Note: This is a candidate for the stable branches.
(cherry picked from commit c2665aacdd)
2013-03-20 08:24:55 -06:00
Brian Paul
55cb78f082 mesa: flush current state when querying GL_EDGE_FLAG
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61395

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit b1390c7992)
2013-03-20 08:24:55 -06:00
Ian Romanick
3fe900840e docs: Add 9.1.1 release md5sums
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-19 17:26:24 -07:00
Ian Romanick
1e5e805fd0 mesa: Bump version to 9.1.1
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-19 17:14:38 -07:00
Ian Romanick
9e36e41034 docs: 9.1.1 release notes
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-19 17:11:41 -07:00
Alex Deucher
d0ccb5b911 r600g: Use blitter rather than DMA for 128bpp on cayman (v3)
On cayman, 128bpp surfaces require non_disp ordering for hw
access to both linear and tiled surfaces.  When we use the 3D
engine we can set the non_disp ordering on both the tiled and
linear sides (via CB or texture), but when we use the DMA
engine, we can only set the non_disp ordering on the tiled
side, so after a L2T operation with the DMA engine, the data
ends up in the wrong order on the tiled side.

v2: cayman/TN only

v3: fix comments

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4409758a04)
2013-03-18 09:39:28 -04:00
Alex Deucher
61e7c043ea r600g: add Richland APU pci ids
Note: this is a candidate for the stable branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 03eef7f8ef)
2013-03-18 09:38:59 -04:00
José Fonseca
231247df02 scons: Warn when using MSVS versions prior to 2012.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-15 19:55:26 +00:00
José Fonseca
5d66947d66 scons: Define _ALLOW_KEYWORD_MACROS on MSVC builds.
scons/llvm.py defines inline globally to workaround issues with LLVM C
binding headers, so the only way to is to avoid
aggravating xkeycheck.h errors is to set _ALLOW_KEYWORD_MACROS.

This fixes MSVC 2012 build with LLVM.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-15 19:55:26 +00:00
José Fonseca
0acc79322b scons: Allows choosing VS 10 or 11.
NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-15 19:55:26 +00:00
Michel Dänzer
5ccaa67204 radeonsi: Fix off-by-one for maximum vertex element index in some cases
In cases where the vertex element size is smaller than the vertex buffer
stride, the previous calculation could end up 1 too low. This would result
in the GPU using index 0 instead of the maximum index for those elements,
which would be visible as intermittent distorted triangles.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4dca602521)
2013-03-15 19:00:41 +01:00
Frank Henigman
ed29a987fd i965: Link i965_dri.so with C++ linker.
Force C++ linking of i965_dri.so by adding a dummy C++ source file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-08 21:25:25 -08:00
Matt Turner
e3f1b34fbe mesa: Allow ETC2/EAC formats with ARB_ES3_compatibility.
Fixes piglit's oes_compressed_etc2_texture-miptree tests on Desktop GL.
Reported-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-03-08 21:25:14 -08:00
Marek Olšák
09199c6862 r600g: pad the DMA CS to a multiple of 8 dwords
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit c77917d35f)
2013-03-05 18:43:11 -05:00
Vincent Lejeune
1dc162d52f r600g: Check comp_mask before merging export instructions
Fixes a llvm uncovered (rare) bug where consecutive exports were
merged even if they have incompatible mask.
(cherry picked from commit 83e7d111af)
2013-03-05 18:32:53 -05:00
Vadim Girlin
9a5f513773 r600g: fix check_and_set_bank_swizzle for cayman
Tested-by: Vincent Lejeune <vljn at ovi.com>
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
(cherry picked from commit 138b5b9a12)
2013-03-05 18:32:28 -05:00
Kenneth Graunke
26e827b309 i965: Fix Crystal Well PCI IDs.
The second digit was off by one, which meant we accidentally treated
GTn as GT(n-1).  This also meant no support for GT1 at all.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b88f74d63d)
2013-03-05 14:58:04 -08:00
Brian Paul
44a5b5d161 svga: always link with C++
Even when we don't have LLVM since there's other C++ code
in the resulting DRI driver object.

Note: This is a candidate for the stable branches.

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit a99eb5c83f)
2013-03-05 14:58:04 -08:00
Marek Olšák
f6765c6d20 r600g: always map uninitialized buffer range as unsynchronized
Any driver can implement this simple and efficient optimization.
Team Fortress 2 hits it always. The DISCARD_RANGE codepath is not even used
with TF2 anymore, so we avoid a ton of useless buffer copies.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 89e2898e9e)
2013-03-05 14:58:04 -08:00
Marek Olšák
2c3efd2ee4 gallium/util: add helper code for 1D integer range
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: cosmetic changes based on Brian's review

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch. (the next patch depends on it)
(cherry picked from commit 44f37261fc)
2013-03-05 14:58:04 -08:00
Marek Olšák
f5b22eb09f r600g: flush and invalidate htile cache when appropriate
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit e5a250fdf9)
2013-03-05 14:58:04 -08:00
Marek Olšák
391e7ed51e r600g: use async DMA with a non-zero src offset
probably a typo

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 9dd18f43a4)
2013-03-05 14:58:04 -08:00
Jordan Justen
db5492cae3 attrib: push/pop FRAGMENT_PROGRAM_ARB state
This requirement was added by ARB_fragment_program

When the Steam overlay is enabled, this fixes:
* Menu corruption with the Puddle game
* The screen going black on Rochard when
  the Steam overlay is accessed

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6f1538f8b4)
2013-03-05 14:57:10 -08:00
Keith Kriewall
65baaa070c scons: Fix Windows build with LLVM 3.2
Fixes fdo bug 61299

NOTE: This is a candidate for the stable branches.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit efd8311a54)
2013-03-05 14:57:10 -08:00
Adam Sampson
9d0df82076 autotools: oprofilejit should be included in the list of LLVM components required
NOTE: This is a candidate for the stable branch.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 2506b03503)
2013-03-05 14:57:09 -08:00
Ian Romanick
049f343e8a egl: Allow 24-bit visuals for 32-bit RGBA8888 configs
Previously only the 32-bit X visual would match the 32-bit RGBA8888
configs.  This resulted in every config with alpha getting the "magic"
visual whose alpha is used by the compositor.  This also resulted in no
multisample visuals being advertised.  How many ways could we lose?

This patch inverts the problem... now you can't get the visual with
alpha used by the compositor even if you want it.  I think we need to
invent a new value for EGL_TRANSPARENT_TYPE that apps can use to get
this.  I'm surprised that there isn't already a choice for
EGL_TRANSPARENT_ALPHA.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Tian Ye <yex.tian@intel.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59783
(cherry picked from commit 68a147e9a9)
2013-03-05 14:57:09 -08:00
Eric Anholt
0b494917b1 i965: Fix the W value of deprecated pointcoords on pre-gen6.
When you didn't have a texcoord array bound (or a non-1 current w
attrib), we were telling the fragment shader that it could just use "1"
instead of doing expensive pre-gen6 math to invert it.  If you drew the
point with a non-1 W value, then you'd get the right size (since all the
vertex computations worked), but we'd mis-interpolate the coordinate
across the face.

Fixes the mesa pointsprite demo on GM45.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30232
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Note: This is a candidate for the stable branches.
(cherry picked from commit 50a5d5dea0)
2013-03-05 14:57:09 -08:00
Tapani Pälli
fa23151f43 mesa/es: NULL check in EGLImageTargetTexture2DOES
check that pointer passed is valid and return error if not.

Note: This is a candidate for the stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 3cdb548bfb)
2013-03-05 14:57:09 -08:00
Tapani Pälli
b1b2443ade mesa: add missing case in _mesa_GetTexParameterfv()
missing case GL_REQUIRED_TEXTURE_IMAGE_UNITS_OES is required
by OES_EGL_image_external extension.

Note: This is a candidate for the stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 331967c773)
2013-03-05 14:57:09 -08:00
John Kåre Alsaker
2b8a431d39 llvmpipe: Fix creation of shared and scanout textures.
NOTE: This is a candidate for the stable branches.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 65aa1a194d)
2013-03-05 14:57:09 -08:00
Brian Paul
ab883bb8a4 llvmpipe: add missing checks for polygon offset point/line modes
The llvm pipeline handles regular filled triangle offsets, but it
doesn't handle offsets for triangles drawn in point or line mode.

Fixes failures found with new piglit polygon-mode-offset test.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit f93c580063)
2013-03-05 14:57:09 -08:00
Brian Paul
2ac679bc87 draw: fix broken polygon offset stage
There were several issues.  We weren't handling different front/back
polygon fill modes.  We weren't checking whether the offset applied to
fill mode vs. line mode vs. point mode.

Fixes problems found with the Visualization Toolkit (VTK) test suite.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit d6b8b116ee)
2013-03-05 14:57:09 -08:00
Brian Paul
f64664de7f st/mesa: fix polygon offset state translation logic
The old logic was kind of twisted, but seemed to work in practice.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit a2c105e31e)
2013-03-05 14:57:09 -08:00
Brian Paul
1505ce833a st/mesa: check for dummy programs in destroy_program_variants()
When we destroy an ARB vp/fp whose ID was gen'd but not otherwise used we
get a pointer to the dummy/placeholder program.  We can't destroy that one
so just skip it.  This only failed during context tear-down because
glDeleteProgramsARB() was already aware of dummy programs.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=38086

Note: This is a candidate for the stable branches.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
(cherry picked from commit 8bb291b0f5)
2013-03-05 14:57:09 -08:00
Brian Paul
6bc298a5b0 st/mesa: fix trimming of GL_QUAD_STRIP
We sometimes convert GL_QUAD_STRIP prims into GL_TRIANGLE_STRIP, but
that changes the results of the u_trim_pipe_prim() call.  We need to
pass the original primitive type to the trim function.

Note that OpenGL's GL_x prim type values match Gallium's PIPE_PRIM_x values.

Fixes a failure in the new piglit degenerate-prims test.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 8589cc41b3)
2013-03-05 14:57:09 -08:00
Anuj Phogat
b382f1dbeb meta: Allocate texture before initializing texture coordinates
tex->Sright and tex->Ttop are initialized during texture allocation.
This fixes depth buffer blitting failures in khronos conformance tests
when run on desktop GL 3.0.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=59495

Note: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit cff862f90d)
2013-03-05 14:57:09 -08:00
Eric Anholt
2893cd2843 mesa: Fix setup of ctx->Point.PointSprite for GLES2.
The recent change for GL core broke the older setup, which broke
gl_PointCoord on pre-gen6 (where gl_PointCoord is undefined if point
sprites are disabled).  Fixes the new piglit GLES-2.0/glsl-fs-pointcoord
test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32429
Note: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 92a204b493)
2013-03-05 14:57:09 -08:00
Eric Anholt
a11201fd2f i965/fs: Fix broken math on values loaded from uniform buffers on gen6.
In a debug build this led to assertion failures, but on a non-debug
build the hardware would just reference the whole vec8 instead of the
same channel 8 times.

Fixes the new piglit glsl-1.40/uniform-buffer/fs-exp2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57121
Note: This is a candidate for the stable branches
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7b0731d940)
2013-03-05 14:57:09 -08:00
Michel Dänzer
ba4f4cead4 r600g/Cayman: Fix blending using destination alpha factor but non-alpha dest
Only compile tested, but should fix at least some piglit fbo-blending tests.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 73bf626713)
2013-03-05 14:57:09 -08:00
Eric Anholt
6d60f8cfc4 i965/fs: Only do CSE when the dst types match.
We could potentially do some CSE even when the dst types aren't the same
on gen6 where there is no implicit dst type conversion iirc, or in the
case of uniform pull constant loads where the dst type doesn't impact
what's stored.  But it's not worth worrying about.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit c2a6e529c3)
2013-03-05 14:57:09 -08:00
Eric Anholt
808e01236b i965/fs: Delay setup of uniform loads until after pre-regalloc scheduling.
This should fix the register allocation explosion on the GLES 3.0 test
on gen6.  It also gives us an instruction that will fit our CSE handling.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit aebd3f46e3)
2013-03-05 14:57:09 -08:00
Eric Anholt
c5f63bedac i965/fs: Fix copy propagation with smearing.
We were correctly relaying the smear from MOV's src, but if the MOV
didn't do a smear, we don't want to smash the smear value from the
instruction being propagated into.  Prevents a regression in the
upcoming UBO change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 49bdebad38)
2013-03-05 14:57:08 -08:00
Brian Paul
952ca6a795 st/xlib: initialize the drawable size in create_xmesa_buffer()
Otherwise, the PBuffer's size was never set.  This also initializes
the buffer size for windows, pixmaps, etc.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61012

Note: This is a candidate for the stable branches.
(cherry picked from commit e2091f64cb)
2013-03-05 14:57:08 -08:00
Brian Paul
5c76f85a92 st/mesa: implement glBitmap unpacking from a PBO, for the cache path
We weren't mapping the PBO when using the bitmap cache (but we had
the PBO code for the non-cache path.)

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61026

Note: This is a candidate for the stable branches.
(cherry picked from commit 63c30d7e4f)
2013-03-05 14:57:08 -08:00
Brian Paul
475548f6c9 draw: fix non-perspective interpolation in interp()
This fixes a regression from ab74fee5e1.
When we use the clip coordinate to compute the screen-space interpolation
factor, we need to first apply the divide-by-W step to the clip
coordinate.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60938

Note: This is a candidate for the 9.1 branch.
(cherry picked from commit 5da967aff5)
2013-03-05 14:57:08 -08:00
Vincent Lejeune
9071c094e8 r600g/llvm: Add support for UBO
NOTE: This is a candidate for the Mesa stable branch.

Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
(cherry picked from commit ef8fde6acb)
2013-03-05 14:57:08 -08:00
Eric Anholt
597d98bb2c i965/fs: Do a general SEND dependency workaround for the original 965.
We'd been ad-hoc inserting instructions in some SEND messages with no
knowledge of when it was required (so extra instructions), but not all SENDs
(so not often enough).  This should do much better than that, though it's
still flow-control-ignorant.

v2: Use BRW_MAX_MRF instead of magic numbers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58960
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: Candidate for the stable branches.
(cherry picked from commit c37992c54d)
2013-03-05 14:57:08 -08:00
Ian Romanick
783f76e3d5 mesa: Modify candidate search string
Several commits on master for the 9.1 branch had "NOTE" messages in a
slightly different format.

NOTE: This is a candidate for stable branches

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 674f9239b9)
2013-03-05 14:55:02 -08:00
Ian Romanick
465ce417cf mesa: Add previously picked commit to .cherry-ignore
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-05 14:54:47 -08:00
Michel Dänzer
a5c79b7426 radeonsi: Fix up and enable flat shading.
Requires corresponding LLVM R600 backend fix to work correctly, but even
without that it doesn't hang anymore.

13 more little piglits.

Depends on LLVM: r175193, r175733

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 18272c9b1b)
2013-03-04 18:50:31 +01:00
Jakub Bogusz
4aa2b5120a vdpau-softpipe: Build correct source file - vl_winsys_xsp.c
Copy-and-paste problem introduced by commit 7f24483e.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-03 22:57:04 -08:00
Martin Andersson
cd77c77bb9 winsys/radeon: Only add bo to hash table when creating flink
The problem is that we mix bo handles and flinked names in the hash
table. Because kms type handles are not flinked they should not be
added to the hash table. If we do that we will sooner or later
get a situation where we will overwrite a correct entry because
the bo handle was the same as a flinked name.

Note: this is a candidate for the stable branches.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d96d8ed910)
2013-03-01 17:53:59 -05:00
Jerome Glisse
b199a6414d r600g: workaround hyperz lockup on evergreen
This work around disable hyperz if write to zbuffer is disabled. Somehow
using hyperz when not writting to the zbuffer trigger GPU lockup. See :

https://bugs.freedesktop.org/show_bug.cgi?id=60848

Candidate for 9.1

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 6bc7605745)
2013-02-28 09:49:12 -05:00
Daniel van Vugt
012329e83b gbm: Remember to init format on gbm_dri_bo_create.
https://bugs.freedesktop.org/show_bug.cgi?id=60143
(cherry picked from commit 6e226ab5ac)
2013-02-27 10:35:21 +01:00
Brian Paul
7a1612a54b docs: insert links to the 9.0.3 release 2013-02-25 08:10:35 -07:00
Andreas Boll
7024ee6b20 docs: add news item for 9.1 release 2013-02-25 10:42:21 +01:00
Andreas Boll
e0e59ceb3c docs: Add 9.1 release md5sums 2013-02-25 10:39:11 +01:00
Brian Paul
5b19631f7c docs: remove stray 'date' text 2013-02-23 06:33:16 -07:00
Ian Romanick
17493b8848 docs: Update relelase notes 2013-02-22 17:46:24 -08:00
Ian Romanick
3ea699ff3c mesa: Bump version to 9.1 (final)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-22 17:46:23 -08:00
Ian Romanick
ea38e7e2e3 i965: Enable OpenGL ES 3.0 on Sandy Bridge
Regardless of what we put in the screen structure, all of the extensions
that compute_version_es2 checks are present and 3.0 will be exposed
anyway.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7ae6864f0d)
2013-02-22 17:46:23 -08:00
Alex Deucher
4212dbae1c r600g: fixup PS_PARTIAL_FLUSH flag handling for cayman
So we don't emit it twice if we ever use the flag on
cayman.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8b5acad0e9)
2013-02-22 18:45:40 -05:00
Alex Deucher
a650092fd6 r600g: r6xx deadlock workaround (v6)
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=50655
https://bugs.freedesktop.org/show_bug.cgi?id=47116

v2: flush along with workaround.
v3: just need a flush
v4: try WAIT_UNTIL
v5: switch to PS partial flush
v6: rework patch

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8442b67f5f)
2013-02-22 18:27:37 -05:00
Alex Deucher
11d9f75f01 r600g: add PS_PARTIAL_FLUSH flag
PS_PARTIAL flushes seems to be required in certain
cases to prevent hangs, especially on r6xx.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ebf83f109)
2013-02-22 18:27:24 -05:00
Lauri Kasanen
6427e1609e configure: Fix build with automake < 1.11
Commit 86d30dea3c broke building with older
automake versions with this error:

Makefile:769: *** Recursive variable am__v_YACC_ references itself (eventually).  Stop.

This patch fixes it. Fix stolen from xorg-macros.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
(cherry picked from commit 0a82828ad5)
2013-02-22 13:15:40 -08:00
Michel Dänzer
47f7f803ae radeonsi: Fix PIPE_FORMAT_X32_S8X24_UINT sampler hardware format
4 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 9c1107b3e1)
2013-02-22 20:24:33 +01:00
Michel Dänzer
cb8bacd87a radeonsi: Use stencil surface level information for stencil texturing
7 more little dwarves^W piglits.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 8356962853)
2013-02-22 20:24:24 +01:00
Michel Dänzer
0d08abd461 radeonsi: properly implement S8Z24 depth-stencil format
Based on r600g commit 2b9659c9e6 .

Fixes crashes with 4 piglit tests which are now hitting these formats.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit f9adf79876)
2013-02-22 20:24:17 +01:00
Michel Dänzer
5baa8ec737 radeonsi: Fix w component of TGSI_SEMANTIC_POSITION fragment shader inputs.
It's the reciprocal of the register value.

Fixes piglit fragcoord_w and glsl-fs-fragcoord-zw-perspective.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 954bc4ac34)
2013-02-22 20:15:17 +01:00
Michel Dänzer
0c3b96a6c6 radeonsi: Fix blending using destination alpha factor but non-alpha destination
11 more little piglits.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 95bced5929)
2013-02-22 20:13:45 +01:00
Marek Olšák
24e8ad6204 radeonsi: implement 3D transfers
That means we can map and read multiple slices with one transfer_map call.

[ Cherry-picked from r600g commit 1aebb6911e ]

11 more little piglits on master, 1 more on the 9.1 branch (Marek's
glTex(Sub)Image improvements on master broke the other 10).

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 72f4490b55)
2013-02-22 20:13:27 +01:00
Marek Olšák
8013101c2d radeonsi: add assertions to prevent creation of invalid surfaces
[ Cherry-picked from r600g commit ef11ed61a0 ]

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a84c4edeed)
2013-02-22 20:13:18 +01:00
Marek Olšák
6894c127d9 radeonsi: use u_box_origin_2d helper function
[ Cherry-picked from r600g commit b278aba423 ]

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c4faab63c4)
2013-02-22 20:13:04 +01:00
Marek Olšák
f0f3ebb777 st/mesa: don't do sRGB conversion in CopyTexSubImage
Assuming I understand EXT_texture_sRGB correctly.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6520a86c67)
2013-02-22 13:20:10 +01:00
Marek Olšák
deeb4b056f r600g: fix random corruption with CP DMA in TF2
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit aac8138744)
2013-02-22 12:50:27 +01:00
Andreas Boll
8818d01d33 llvmpipe/build: add DLOPEN_LIBS and PTHREAD_LIBS to the lp_test_* targets
Fixes undefined symbols.

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61052
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit c1f2c3a80f)
2013-02-22 10:24:53 +01:00
Andreas Boll
ea51f870f6 targets/xa-vmwgfx: Force c++ linker to fix undefined symbols
NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61200
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit c1eb585f3d)
2013-02-22 10:24:47 +01:00
Tom Stellard
dfc35a4fc5 r300g/compiler: Fix bug in OMOD folding
The OMOD value was only being folded to one instruction in cases where
the MUL instruction was reading a value written by more than one
instruction.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 10bcc843f8)
2013-02-21 22:09:51 -05:00
Tom Stellard
c7fdb6a861 r300g/tests: Add helper functions for creating a full program
Now you can convert assembly strings into a full struct radeon_compiler
object and use it to test individual compiler pases.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 5e1321ddf4)
2013-02-21 22:09:42 -05:00
Tom Stellard
74790900ca r300g/tests: Exit test runner with a valid status code
This way make check can report whether or not the tests pass.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit bcf2e157ca)
2013-02-21 22:09:33 -05:00
Tom Stellard
42cfb1ef47 r300g/complier: Make r300_vertprog_swizzle_caps visible in other files
This will be used by the test suite in later commits.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 5355fc1e87)
2013-02-21 22:09:22 -05:00
Tom Stellard
754447d81f r300g/compiler: Add missing license headers
These are all files that I authored, but forgot to add the license
headers.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 27d140b960)
2013-02-21 22:09:11 -05:00
Alex Deucher
e41bdc223e r600g: don't enable ReZ mode on evergreen
Can cause lockups in certain cases when
zfunc/zenable/zwrite change without a flush
in between.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60969
and lockups on Civ4 with wine.

This is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 2e4ef989a2)
2013-02-21 12:02:15 -05:00
Ian Romanick
6ff7080a4c mesa: Don't install glEvalMesh in the beginend dispatch table
NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59740
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8b586322e7)
2013-02-20 15:25:55 -08:00
Zack Rusin
bdffccf91e DRI2: Don't disable GLX_INTEL_swap_event unconditionally
GLX_INTEL_swap_event is broken on the server side, where it's
currently unconditionally enabled. This completely breaks
systems running on drivers which don't support that extension.
There's no way to test for its presence on this side, so instead
of disabling it uncondtionally, just disable it for drivers
which are known to not support it. It makes sense because
most drivers do support it right now.
We'll be able to remove this once Xserver properly advertises
GLX_INTEL_swap_event.

Note: This is a candidate for stable branch branches.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60052
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 076403c30d)
2013-02-19 12:50:55 -08:00
Stefan Brüns
c41ba5bda9 glx: fix glGetTexLevelParameteriv for indirect rendering
A single element in a GLX reply is contained in the header itself.
The number of elements is denoted in the "n" field of the reply.
If "n" is 1, the length of additional data is 0.
The XXX_data_length() function of xcb does not return the length of
the (optional, n>1) data but the number of elements.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=59876

Note: This is a candidate for the stable branches.

Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5876a5dbc0)
2013-02-19 12:17:26 -08:00
Ian Romanick
456cdb6d01 mesa: Bump version to 9.1-rc2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-17 14:49:02 -08:00
Eric Anholt
aaee862305 i965/fs: Use a helper function for checking for flow control instructions.
In 2 of our checks, we were missing BREAK and CONTINUE.

NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bf91f0b039)
2013-02-17 14:20:39 -08:00
bma
b84d9aa0c6 shaderapi: Fix AttachShader error
Detect a duplicate Shader type as and error instead of silently allowing
it, restrict to ES2 API.

v2: Tapani Pälli <tapani.palli@intel.com>
    - make the check run time instead of compile time

v3: chadv
    - Quote spec on which error to generate.

Signed-off-by: bma <Bo.Ma@windriver.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit ce3dfa19ab)
2013-02-17 14:20:34 -08:00
Eric Anholt
bb4b1494e3 i965: Re-enable the -RHW workaround for original gen4 chips.
Fixes broken clipping in supertuxkart and presumably many other applications.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51471
NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit cb4616d32d)
2013-02-17 14:20:27 -08:00
Eric Anholt
321abaaa8d i965/gen4: Work around missing sRGB RGB DXT1 support.
The hardware just doesn't support it.  I suspect this was a regression from
the move to fixed MESA_FORMATs for compressed textures and that previously we
were storing uncompressed for this or something.

Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor
swizzled" on my GM965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ddc2b453d0)
2013-02-17 14:20:22 -08:00
Ian Romanick
95f1203a7c mesa: Add .cherry-ignore for 9.1
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-17 14:16:22 -08:00
Christopher James Halse Rogers
96fb4d61fb i965: Fix leak in blorp CopyTexSubImage2D
_mesa_delete_renderbuffer does not call the driver-specific
renderbuffer delete function, so the blorp code was leaking the
Intel-specific bits, including some GEM objects.

Call the renderbuffer's ->Delete() method instead, which does the
right thing.

Fixes Unity rapidly sending the machine into the arms of the OOM-killer

Note: This is a candidate for the 9.1 branch.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit dd599188d2)
2013-02-17 14:13:27 -08:00
Brian Paul
d8a0439c65 st/mesa: fix format query for GL_ARB_texture_rg
The GL_ARB_texture_rg spec says that we need to support both texturing
and rendering for the GL_RED and GL_RG formats.  So move the format
check up into the rendertarget_mapping[] list.  Also, add
PIPE_FORMAT_R8_UNORM to the list of formats required.

Note: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 4be5a06752)
2013-02-17 14:13:13 -08:00
Eric Anholt
c785315f3d i965/gen7: Set up all samplers even if samplers are sparsely used.
In GLSL, sampler indices are allocated contiguously from 0.  But in the
case of ARB_fragment_program (and possibly fixed function), an app that
uses texture 0 and 2 will use sampler indices 0 and 2, so we were only
allocating space for samplers 0 and 1 and setting up sampler 0.  We
would read garbage for sampler 2, resulting in flickering textures and
an angry simulator.

Fixes bad rendering in 0 A.D. and ETQW.  This was fixed for pre-gen7 by
28f4be9eb9

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable branches.
(cherry picked from commit 5bb05c6e6d)
2013-02-17 14:12:47 -08:00
Kenneth Graunke
0e3c755ca3 i965: Use derived state for Haswell's 3DSTATE_VF packet.
Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX.

Fixes gles3conform's primitive_restart_mode test.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8cabe26f5d)
2013-02-17 14:12:10 -08:00
Brian Paul
4d11454e90 util: fix incorrect Z bit masking in util_clear_depth_stencil()
For PIPE_FORMAT_Z24_UNORM_S8_UINT, the Z bits are in the 24
least significant bits.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60527
and http://bugs.freedesktop.org/show_bug.cgi?id=60524
and http://bugs.freedesktop.org/show_bug.cgi?id=60047

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4bfdef87e6)
2013-02-17 14:10:55 -08:00
Marek Olšák
9838215f3c mesa: fix GetTexImage if mesa format and internal format don't match
Tested with softpipe only exposing RGBA formats.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cb6470775c)
2013-02-17 14:10:40 -08:00
Marek Olšák
2e4473d9e3 mesa: don't use memcpy fast path for GetTexImage if base format is different
The Mesa format can be RGBA8888_REV, the format/type can be
GL_RGBA/GL_UNSIGNED_BYTE, but the actual texture internal format can be
LUMINANCE_ALPHA, INTENSITY, etc. Therefore we should look at the base
internal format as well.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit c8379204ab)
2013-02-17 14:10:28 -08:00
Marek Olšák
11eb644cc9 mesa: don't use _mesa_base_tex_format for format parameter of GetTexImage
_mesa_base_tex_format doesn't accept GL_BGR and GL_ABGR_EXT, etc.

v2: add a (now hopefully complete) helper function to deal with this

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 09a99867ab)
2013-02-17 14:09:55 -08:00
Ian Romanick
60bad0ddc3 intel: Do not expose OES_compressed_ETC1_RGB8_texture or ARB_texture_rgb10_a2ui pre-GEN4
Older hardware cannot do ARB_texture_rgb10_a2ui, and the translation
code for OES_compressed_ETC1_RGB8_texture was never implemented in the
i915 driver.

NOTE: This is a candidate for all stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0e2f26d5ea)
2013-02-17 14:09:09 -08:00
Roland Scheidegger
d41e9b4d14 softpipe: fix using optimized filter function
This optimized filter (when using repeat wrap modes,
linear min/mag/mip filters, pot textures) only applies to 2d textures,
but nothing prevented it from being used for other textures (likely
leading to very bogus sample results).

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 66b6d51214)
2013-02-17 14:09:01 -08:00
Kristian Høgsberg
60f05f0eef egl-wayland: Make sure we allocate a back buffer even if nothing was rendered
At eglSwapBuffer time, we blindly assume we have a back buffer, but the
back buffer only gets allocated when somebody tries to render something.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

https://bugs.freedesktop.org/show_bug.cgi?id=60086
(cherry picked from commit 1fe007399c)
2013-02-17 14:08:41 -08:00
Brian Paul
714d8b3f8c svga: fix sRGB rendering
We weren't emitting the SVGA_RS_OUTPUTGAMMA state so sRGB rendering
didn't work properly.

Fixes piglit's framebuffer-srgb test.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit ff60509157)
2013-02-17 14:08:22 -08:00
Brian Paul
ac22dffaf6 st/mesa: don't choose DXT formats if we can't do DXT compression
If we call gl[Copy]TexImage2D() with a generic compression format
(e.g. intFormat=GL_COMPRESSED_RGBA) we can't choose a DXT format if
we don't have the external DXT compression library.

We weren't actually enforcing this before since the
pipe_screen::is_format_supported(DXT) query has no dependency on
the DXT compression library.

Now if we're given a generic compressed format and we can't do DXT
compression we'll fall back to a non-compressed format.

v2: use util_format_is_s3tc() function and add more comments about
the allow_dxt parameter.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4df42890c5)
2013-02-17 14:08:16 -08:00
Brian Paul
4ec3843a54 mesa: don't use format chooser code for glCompressedTexImage
When glCompressedTexImage is called the internalFormat is a specific
format for the incoming image and the the hardware format should be
the same (since we never do format transcoding).  So use the simpler
_mesa_glenum_to_compressed_format() function.  This change is also
needed for the next patch.

Note: This is a candidate for the stable branches.
(cherry picked from commit 478056b81a)
2013-02-17 14:07:37 -08:00
Michel Dänzer
30ae2f97c5 configure.ac: GLX cannot work without OpenGL
GLX uses mapi/glapi/libglapi.la, which is only built for OpenGL.

If the user specified --enable-xlib-glx --disable-opengl, error out, as these
cannot be both observed at the same time. If the user just specified
--disable-opengl but not --disable-glx, print a warning and disable GLX as
well.

NOTE: This is a candidate for the stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59364

Tested-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 3b888f534c)
2013-02-17 14:07:22 -08:00
Stéphane Marchesin
b289e639e4 glx: Check that swap_buffers_reply is non-NULL before using it
Check that the return value from xcb_dri2_swap_buffers_reply is
non-NULL before accessing the struct members.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 67e7263e45)
2013-02-17 14:02:29 -08:00
Brian Paul
c7e5e9ddce st/mesa: only enable GL_EXT_framebuffer_multisample if GL_MAX_SAMPLES >= 2
We never really have multisampling with one sample per pixel.
See also http://bugs.freedesktop.org/show_bug.cgi?id=59873

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

(cherry picked from commit c80bacba2e)
2013-02-17 14:01:37 -08:00
Brian Paul
7b9e99f45b mesa: don't enable GL_EXT_framebuffer_multisample for software drivers
Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 8f3c81d018)
2013-02-17 14:00:11 -08:00
Brian Paul
9cea40321c osmesa: use _mesa_generate_mipmap() for mipmap generation, not meta
See previous commit for more info.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 2180f32972)
2013-02-17 13:59:52 -08:00
Brian Paul
29e63455aa xlib: use _mesa_generate_mipmap() for mipmap generation, not meta
The swrast fragment program interpreter has trouble computing the
right texture LOD because it doesn't have easy access to input
derivatives.  This causes the GLSL-based meta generate mipmap code
to fetch texels from the wrong mipmap level.

One possible fix would be to set the GL_TEXTURE_MIN/MAX_LOD parameters
to limit sampling from the right level.  But let's just use the
_mesa_generate_mipmap() fallback since it's a lot faster than using
the fragment shader interpreter.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=54240

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 89551ae04f)
2013-02-17 13:59:43 -08:00
Paul Berry
632a5a3a5b glsl: don't allow non-flat integral types in varying structs/arrays.
In the GLSL 1.30 spec, section 4.3.6 ("Outputs") says:

    "If a vertex output is a signed or unsigned integer or integer
    vector, then it must be qualified with the interpolation qualifier
    flat."

The GLSL ES 3.00 spec further clarifies, in section 4.3.6 ("Output
Variables"):

    "Vertex shader outputs that are, *or contain*, signed or unsigned
    integers or integer vectors must be qualified with the
    interpolation qualifier flat."

(Emphasis mine.)

The language in the GLSL ES 3.00 spec is clearly correct and should be
applied to all shading language versions, since varyings that contain
ints can't be interpolated, regardless of which shading language
version is in use.

(Note that in GLSL 1.50 the restriction is changed to apply to
fragment shader inputs rather than vertex shader outputs, to
accommodate the fact that in the presence of geometry shaders, vertex
shader outputs are not necessarily interpolated.  That will be
addressed by a future patch).

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 93c913485e)
2013-02-15 13:28:01 -08:00
Paul Berry
2cd4824fbc glsl: Allow default precision qualifiers to be set for sampler types.
From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"):

    "The precision statement

        precision precision-qualifier type;

    can be used to establish a default precision qualifier. The type
    field can be either int or float or any of the sampler types, and
    the precision-qualifier can be lowp, mediump, or highp."

GLSL ES 1.00 has similar language.  GLSL 1.30 doesn't allow precision
qualifiers on sampler types, but this seems like an oversight (since
the intention of including these in GLSL 1.30 is to allow
compatibility with ES shaders).

Previously, Mesa followed GLSL 1.30 and only allowed default precision
qualifiers to be set for float and int.  This patch makes it follow
GLSL ES rules in all cases.

Fixes Piglit tests default-precision-sampler.{vert,frag}.

Partially addresses https://bugs.freedesktop.org/show_bug.cgi?id=60737.

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d5948f2f5e)
2013-02-15 13:27:48 -08:00
Michel Dänzer
05de84442b radeonsi: Handle TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS
8 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit c840270ebe)
2013-02-15 18:47:28 +01:00
Michel Dänzer
14372a70ec radeonsi: Fix array indices for detecting integer vertex formats
(cherry picked from commit f34ad85765)
2013-02-15 18:47:21 +01:00
Christian König
baa9070346 radeonsi: remove constant index limitation v3
With the llvm patches, fixing 14 piglit tests in total.

v2: increase the const limit
v3: document the const limit

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8c80894fb3)
2013-02-15 18:46:32 +01:00
Christian König
f50e4e21f4 radeonsi: support constants as TEX coordinates
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8514f5ac01)
2013-02-15 18:45:29 +01:00
Tom Stellard
38e728498b configure.ac: Add components to LLVM_COMPONENTS when using llvm shared libs
This is required when LLVM is built with CMake, which creates one
shared library for each component.
(cherry picked from commit 0898047e7b)
2013-02-13 17:02:12 -05:00
Matt Turner
fb2eb65126 Revert "mesa: Return INVALID_OPERATION when type is known but not allowed"
This reverts commit 2906e2034c.

Fixes a regression in the glean depthStencil test.

Reverting this does not affect any tests in es3conform, so a more recent
patch must have also fixed the failure this one was intended to fix.

Reported-by: lu hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59494
(cherry picked from commit a527b2192e)
2013-02-13 15:31:50 -05:00
Tom Stellard
741a249cbf r600g: Handle SET*_DX10 instructions in r600_bytecode_get_num_operands() 2013-02-13 15:31:34 -05:00
Jerome Glisse
3ae8678f81 r600g: fix lockup when hyperz & alpha test are enabled together. v3
Seems that alpha test being enabled confuse the GPU on the order in
which it should perform the Z testing. So force the order programmed
throught db shader control.

v2: Only force z order when alpha test is enabled
v3: Update db shader when binding new dsa + spelling fix

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 974b482aca)
2013-02-12 17:06:36 -05:00
Jordan Justen
85604f3d48 CopyTexImage: Don't check sRGB vs LINEAR for desktop GL
In OpenGL 4.3, new language was added that would require
this check. But, if this check results in broken applications
then perhaps it will be reversed.

For now, remove this check and re-evaluate when
desktop GL 4.3 is closer.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-12 11:25:15 -08:00
Quentin Glidic
9119b4e8ee configure.ac: Fix --with-llvm-shared-libs
The third argument of AC_ARG_WITH is evaluated for any provided value,
not only on --with-, so it must not force-enable the feature
Also, setting $with_llvm_shared_libs in the opencl check was overriding
the user switch

https://bugs.freedesktop.org/show_bug.cgi?id=59851

Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
(cherry picked from commit 1e857130f0)
2013-02-12 15:58:22 +00:00
Tom Stellard
f4f306b8ba r600g/llvm: Select the correct GPU type for RV670
RV670 belongs in the R600 chip class

https://bugs.freedesktop.org/show_bug.cgi?id=58666

NOTE: This is a candidate for the 9.1 branch
(cherry picked from commit 257006e2a4)
2013-02-12 15:58:04 +00:00
Jerome Glisse
99adec8a88 r600g: make sure async blit is done 8 * pitch at a time v2
The blit must be aligned on 8 horizontal block.

v2: no need to align the reminder

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 323a448825)
2013-02-11 18:44:55 -05:00
Martin Andersson
3b609f12f6 winsys/radeon: fix bo with virtual address referencing mismatch
If the same context try to flink and open the object, use the
same bo struct instead of opening a new gem handle for the object.
This way we avoid avoid having 2 different handle pointing to the
same kernel object which can latter lead to trouble with virtual
address.

Fix:
https://bugs.freedesktop.org/show_bug.cgi?id=60200

Signed-off-by: Martin Andersson <g02maran@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit a37835c8ed)
2013-02-11 18:41:28 -05:00
Andreas Boll
ecd310bd67 docs: document removal of makedepend build dependency
Build dependency removed with
424f200881

Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 44a5d7371c)
2013-02-11 18:12:30 +01:00
Matt Turner
2a7affc1d5 builtin_compiler/build: Don't use *_FOR_BUILD when not cross compiling
Previously we were relying on CFLAGS_FOR_BUILD to be the same as CFLAGS
when not cross compiling, but this assumption didn't take into
consideration 32-bit builds on 64-bit systems. More generally, not
honoring CFLAGS is bad.

Automake is evidently too stupid to accept

if CROSS_COMPILING
CC = @CC_FOR_BUILD@
...
else
CC = @CC@
endif

without warning that CC has been already defined. The warnings are
harmless, but I'd prefer to avoid future reports about them, so define
proxy variables, which are assigned inside the conditional and then
unconditionally assigned to CC et al.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59737
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60038
(cherry picked from commit 2db1f73849)
2013-02-11 12:28:06 +01:00
Quentin Glidic
c684e3b53e gallium/egl: Fix include dirs for VPATH build
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
(cherry picked from commit 11bd1b0f58)
2013-02-11 11:48:54 +01:00
Andreas Boll
b94eeffe60 mesa: Bump version to 9.1-rc1 2013-02-11 09:21:54 +01:00
Jerome Glisse
a0528269a3 winsys/radeon: improve debuging printing
Make sure one can identify virtual address failure from allocation
failure.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 9a47684564)
2013-02-08 20:33:22 -05:00
Jerome Glisse
18ef6b1265 xorg: fix exa finish access
The exa core will already set the pointer to NULL prior calling
the callback function. So don't bail out in the callback if it's
already NULL.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 3310acdf47)
2013-02-08 19:01:51 -05:00
Paul Berry
0419b7a3a1 glsl: Support transform feedback of varying structs.
Since transform feedback needs to be able to access individual fields
of varying structs, we can no longer match up the arguments to
glTransformFeedbackVaryings() with variables in the vertex shader.

Instead, we build up a hashtable which records information about each
possible name that is a candidate for transform feedback, and then
match up the arguments to glTransformFeedbackVaryings() with the
contents of that hashtable.

Populating the hashtable uses the program_resource_visitor
infrastructure, so the logic is shared with how we handle uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 99b78337e3)
2013-02-08 11:17:33 -08:00
Paul Berry
5be2e14393 glsl: Use parse_program_resource_name to parse transform feedback varyings.
Previously, transform feedback varyings were parsed in an ad-hoc
fashion that wasn't compatible with structs (or array of structs).
This patch makes it use parse_program_resource_name(), which correctly
handles both.

Note that parse_program_resource_name()'s technique for handling
mal-formed input strings is to simply let them through and rely on the
fact that a future name lookup will fail.  Because of this,
tfeedback_decl::init() no longer needs to return a boolean error
code--it always succeeds, and if the input was mal-formed the error
will be detected later.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 53febac02c)
2013-02-08 11:17:28 -08:00
Paul Berry
11e4347bff glsl: Rename uniform_field_visitor to program_resource_visitor.
There's actually nothing uniform-specific in uniform_field_visitor.
It is potentially useful for all kinds of program resources (in
particular, future patches will use it for transform feedback
varyings).

This patch renames it to program_resource_visitor, and clarifies
several comments, to reflect the fact that it is useful for more than
just uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b4db34cc4c)
2013-02-08 11:17:23 -08:00
Paul Berry
49a5f829f7 mesa/glsl: Separate parsing logic from _mesa_get_uniform_location.
The parsing logic is moved to a new function in the GLSL module,
parse_program_resource_name().  This name was chosen because it should
eventually be useful for handling everything that OpenGL 4.3 calls
"program resources" (e.g. uniforms, vertex inputs, fragment outputs,
and transform feedback varyings).

Future patches will make use of this function for linking transform
feedback varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b92900d26a)
2013-02-08 11:17:17 -08:00
Kenneth Graunke
5265c42e52 i965/blorp: Support blits between ARGB and XRGB formats.
Now that we have support for overriding alpha to 1.0, we can handle
blitting between these formats in either direction.

For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and
MESA_FORMAT_RGBX8888_REV.  Most places only appear to worry about the
former, so ignore the latter for now.  We can always add it later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit 7d467f3c15)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
3114f5acd3 i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal.  However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.

For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors.  This is fairly
straightforward with blending.

For now, this code is never used, as the source and destination formats
still must be equal.  The next patch will relax that restriction.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit c0554141a9)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
332c50b666 i965: Implement CopyTexSubImage2D via BLORP (and use it by default).
The BLT engine has many limitations.  Currently, it can only blit
X-tiled buffers (since we don't have a kernel API to whack the BLT
tiling mode register), which means all depth/stencil operations get
punted to meta code, which can be very CPU-intensive.

Even if we used the BLT engine, it can't blit between buffers with
different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture
and a Y-tiled CMS ARGB8888 renderbuffer.  This is a fundamental
limitation, and the only way around that is to use BLORP.

Previously, BLORP only handled BlitFramebuffer.  This patch adds an
additional frontend for doing CopyTexSubImage.  It also makes it the
default.  This is partly to increase testing and avoid hiding bugs,
and partly because the BLORP path can already handle more cases.  With
trivial extensions, it should be able to handle everything the BLT can.

This helps PlaneShift massively, which tries to CopyTexSubImage2D
between depth buffers whenever a player casts a spell.  Since these
are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU
while delivering ~1 FPS.  This is particularly bad in an MMO setting
because people cast spells all the time.

It also helps Xonotic in 4X MSAA mode.  At default power management
settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5).
(This data is from v1 of the patch.)

No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).

v2: Create a fake intel_renderbuffer to wrap the destination texture
    image and then reuse do_blorp_blit rather than reimplementing most
    of it.  Remove unnecessary clipping code and conditional rendering
    check.

v3: Reuse formats_match() to centralize checks; delete temporary
    renderbuffers.  Reorganize the code.

v4: Actually copy stencil when dealing with separate stencil buffers but
    packed depth/stencil formats.  Tested by a new Piglit test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3]
Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2]
Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]
(cherry picked from commit 0b3bebbaac)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
55e3f79d55 mesa: Put extern "C" guards in renderbuffer.h.
I need to use this from C++ code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 29aef6cce8)
2013-02-07 22:31:29 -08:00
Kenneth Graunke
1d2ef43032 i965: Fix the SF Vertex URB Read Length calculation for Gen7 platforms.
Ivybridge doesn't appear to have the same errata as Sandybridge; no
corruption was observed by setting it to more than the minimal correct
value.  It's possible that we were simply lucky, since the URB entries
are 1024-bit on Ivybridge vs. 512-bit Sandybridge.  Or perhaps the
underlying hardware issue is fixed.

Either way, we may as well program the minimum value since it's now
readily available, likely to be more efficient, and possibly more
correct.

v2: Use GEN7_SBE_* defines rather than GEN6_SF_*.  (A copy and paste
    mistake.)  They're the same, but using the right names is better.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44aa2e15f6)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
3acd5ed75b i965: Fix the SF Vertex URB Read Length calculation for Sandybridge.
(This commit message was primarily written by Paul Berry, who explained
 what's going on far better than I would have.)

Previous to this patch, we thought that the only restrictions on
3DSTATE_SF's URB read length were (a) it needs to be large enough to
read all the VUE data that the SF needs, and (b) it can't be so large
that it tries to read VUE data that doesn't exist.  Since the VUE map
already tells us how much VUE data exists, we didn't bother worrying
about restriction (a); we just did the easy thing and programmed the
read length to satisfy restriction (b).

However, we didn't notice this erratum in the hardware docs: "[errata]
Corruption/Hang possible if length programmed larger than recommended".
Judging by the context surrounding this erratum, it's pretty clear that
it means "URB read length must be exactly the size necessary to read all
the VUE data that the SF needs, and no larger".  Which means that we
can't program the read length based on restriction (b)--we have to
program it based on restriction (a).

The URB read size needs to precisely match the amount of data that the
SF consumes; it doesn't work to simply base it on the size of the VUE.

Thankfully, the PRM contains the precise formula the hardware expects.

Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
corruption in PlaneShift, and Piglit's fbo-5-varyings test.

NOTE: This is a candidate for all stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 09fbc29828)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
697f8e56dc i965: Compute the maximum SF source attribute.
The maximum SF source attribute is necessary to compute the Vertex URB
read length properly, which will be done in the next commit.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5e9bc7bd12)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
45ae093e5c i965: Refactor Gen6+ SF attribute override code.
The next patch will benefit from easy access to the source attribute
number and whether or not we're swizzling.  It doesn't want the final
attr_override DWord form, however.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b3efc5bea8)
2013-02-07 22:31:28 -08:00
Kenneth Graunke
535e95299a i965: Add chipset limits for Haswell GT1/GT2.
The maximum number of URB entries come from the 3DSTATE_URB_VS and
3DSTATE_URB_GS state packet documentation; the thread count information
comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
(cherry picked from commit 9add4e8038)
2013-02-07 22:31:28 -08:00
Vinson Lee
a7e2c615f1 i965: Fix assignment instead of comparison in asserts.
Fixes side effect in assertion defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 1559994cba)
2013-02-07 22:31:28 -08:00
Paul Berry
5611a5a387 mesa: Don't check (offset + size <= bufObj->Size) in BindBufferRange.
In the documentation for BindBufferRange, OpenGL specs from 3.0
through 4.1 contain this language:

    "The error INVALID_VALUE is generated if size is less than or
    equal to zero or if offset + size is greater than the value of
    BUFFER_SIZE."

This text was dropped from OpenGL 4.2, and it does not appear in the
GLES 3.0 spec.

Presumably the reason for the change is because come clients change
the size of the buffer after calling BindBufferRange.  We don't want
to generate an error at the time of the BindBufferRange call just
because the old size of the buffer was too small, when the buffer is
about to be resized.

Since this is a deliberate relaxation of error conditions in order to
allow clients to work, it seems sensible to apply it to all versions
of GL, not just GL 4.2 and above.

(Note that there is no danger of this change allowing a client to
access data beyond the end of a buffer.  We already have code to
ensure that that doesn't happen in the case where the client shrinks
the buffer after calling BindBufferRange).

Eliminates a spurious error message in the gles3 conformance test
"transform_feedback_offset_size".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 04f0d6cc22)
2013-02-07 21:20:32 -08:00
Ian Romanick
a48e5526c2 i965: Set UniformBufferOffsetAlignment to sizeof(vec4)
This matches the behavior of the Windows driver, but a bspec reference
should would be nice.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f29ab4ece5)
2013-02-07 21:20:16 -08:00
Matt Turner
c59808c700 mesa: Allow glGet* queries of MAX_VARYING_COMPONENTS in ES 3
Should have been done in d9948e49 but I missed it because
MAX_VARYING_FLOATS doesn't appear in the ES 3 spec, but is the same
value as MAX_VARYING_COMPONENTS.

NOTE: Candidate for the 9.1 branch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-07 17:54:16 -08:00
Michel Dänzer
ad62f424b3 radeonsi: Handle scaled and integer formats for samplers and vertex elements.
Also, add assertions to stress that render targets don't support scaled
formats.

20 more little piglits.
(cherry picked from commit 46dd16bca8b4526e46badc9cb1d7c058a8e6173e)
2013-02-07 19:11:30 +01:00
Michel Dänzer
fc04455533 radeonsi: Don't advertise PIPE_FORMAT_L8A8_SRGB support.
The hardware can't do it.
(cherry picked from commit f6e9430da2d3510f84baefa0fdf26ec5c457f146)
2013-02-07 19:11:19 +01:00
Michel Dänzer
6799bddf6b radeonsi: Remove incorrect (and dead) assignment in tex_fetch_args().
The proper return type is assigned at the end of the function.
(cherry picked from commit 180db2bcb28e94bb1ce18d76b2b3a5818d76262c)
2013-02-07 19:11:09 +01:00
Michel Dänzer
93f61addb5 radeonsi: Use unique names for referring to texture sampling intrinsics.
Append the overloaded vector type used for passing in the addressing
parameters.

Without this, LLVM uses the same function signature for all those types,
which cannot work.

Fixes problems e.g. with FlightGear and Red Eclipse.
(cherry picked from commit 1b3afea30de757815555d9eb1d6e72e2586d6a0c)
2013-02-07 19:10:17 +01:00
Jerome Glisse
d04b50b4de r600g: fix slice tile max for compressed texture and async dma
Was using the pixel size instead of the number of block for the slice
tile max computation which resulted in dma writing at wrong address.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-07 10:43:37 -05:00
Marek Olšák
f1c46c8418 r300g: fix blending with blend color and RGBA formats
NOTE: This is a candidate for the stable branches.
(cherry picked from commit f40a7fc34a)
2013-02-06 22:24:04 +01:00
Michel Dänzer
4bc85f9aac Require libdrm_radeon 2.4.42 for radeonsi.
It has new PCI IDs and an important tiled surface layout fix.
(cherry picked from commit 02a423b239)
2013-02-05 15:15:49 +01:00
Alex Deucher
e1d798a901 radeonsi: add Oland pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
(cherry picked from commit 4161d70bba)
2013-02-04 17:20:22 -05:00
Alex Deucher
6b0fa537a9 radeonsi: default PA_SC_RASTER_CONFIG to 0
That should work in all cases.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
(cherry picked from commit af0af75881)
2013-02-04 17:20:03 -05:00
Alex Deucher
0cc0097bb0 radeonsi: add support for Oland chips
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch
(cherry picked from commit 83e4407f44)
2013-02-04 17:19:43 -05:00
Michel Dänzer
7f90de5414 radeonsi: Fix draws using user index buffer.
Was broken since commit bf469f4edc
('gallium: add void *user_buffer in pipe_index_buffer').

Fixes 11 piglit tests and lots of missing geometry e.g. in TORCS.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit a8a5055f2d)
2013-02-04 17:54:03 +01:00
Michel Dänzer
8cd237bcbe radeonsi: Remove spurious traces of R16G16B16 support.
The hardware can't do it, and these were causing warnings in some piglit tests.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 6455d40b7e)
2013-02-04 17:28:18 +01:00
Michel Dänzer
5ca77c27a6 radeonsi: Enable texture arrays.
28/30 piglit tests pass.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 6bcb823844)
2013-02-04 17:28:14 +01:00
Michel Dänzer
b104d151f1 radeonsi: Improve packing of texture address parameters.
In particular, the LOD bias and depth comparison values are packed before the
'normal' texture coordinates, and the array slice and LOD values are appended.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 120efeef8b)
2013-02-04 17:27:43 +01:00
Michel Dänzer
5f9f3f381f radeonsi: Adapt to sample intrinsics changes.
Fix up intrinsic names, and bitcast texture address parameters to integers.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit e5fb7347a7)
2013-02-04 17:27:34 +01:00
Marek Olšák
b127ad3489 mesa: don't expose IBM_rasterpos_clip in a core context
glRasterPos doesn't exist in the core profile.

NOTE: This is a candidate for the stable branches (9.0 and 9.1).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cc5fdaf2dc)
2013-02-01 16:35:24 +01:00
Marek Olšák
1003652a7f r300g: always put MSAA resources in VRAM
This along with the latest drm-fixes branch should help with bad performance
of MSAA. Remember: Nx MSAA can't be more than N times slower (where N=2,4,6).

Anyway, I recommend at least 512 MB of VRAM for Full HD 6x MSAA.

NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit a06f03d795)
2013-02-01 16:35:18 +01:00
Jerome Glisse
9d8a866db3 r600g: add cs memory usage accounting and limit it v3
We are now seing cs that can go over the vram+gtt size to avoid
failing flush early cs that goes over 70% (gtt+vram) usage. 70%
is use to allow some fragmentation.

The idea is to compute a gross estimate of memory requirement of
each draw call. After each draw call, memory will be precisely
accounted. So the uncertainty is only on the current draw call.
In practice this gave very good estimate (+/- 10% of the target
memory limit).

v2: Remove left over from testing version, remove useless NULL
    checking. Improve commit message.
v3: Add comment to code on memory accounting precision

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-31 14:25:30 -05:00
Marek Olšák
3b8d4f941f r600g: fix htile buffer leak
NOTE: This is a candidate for the 9.1 branch.
2013-01-31 14:25:10 -05:00
Matt Turner
ff515c4e7c build: Add missing comma in AS_IF
Reported-by: Lauri Kasanen<curaga@operamail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47248#c15
2013-01-29 15:06:47 -08:00
Marek Olšák
d7ca04a7c3 docs/relnotes-9.1: document new features in radeon drivers
(cherry picked from commit 845130951f)
2013-01-29 17:38:14 +01:00
Matt Turner
48af880f81 docs: List new extensions added in Mesa 9.1
I did not list the *_get_program_binary extensions since they're not
useful to anyone with their current implementation (that supports 0
binary formats).
2013-01-28 16:49:24 -08:00
Jerome Glisse
af2d8f8072 r600g: use uint64_t instead of unsigned long for proper 32bits cpu support
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 19:10:29 -05:00
Jerome Glisse
d8d17441e2 r600g: real fix for non 3.8 kernel
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 17:44:49 -05:00
241 changed files with 5884 additions and 1575 deletions

View File

@@ -36,7 +36,7 @@ check-local:
# Rules for making release tarballs
PACKAGE_VERSION=9.1-devel
PACKAGE_VERSION=9.1.3
PACKAGE_DIR = Mesa-$(PACKAGE_VERSION)
PACKAGE_NAME = MesaLib-$(PACKAGE_VERSION)

16
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,16 @@
d60da27273d2cdb68bc32cae2ca66718dab15f27 st/mesa: set ctx->Const.MaxSamples = 0, not 1
5c86a728d4f688c0fe7fbf9f4b8f88060b65c4ee r600g: fix htile buffer leak
496928a442cec980b534bc5da2523b3632b21b61 CopyTexImage: Don't check sRGB vs LINEAR for desktop GL
3ee602314fc22054f69ee476f2e1037653d269bc mesa: Allow glGet* queries of MAX_VARYING_COMPONENTS in ES 3
# Already cherry picked without -x
96b3ca89b153f358de74059151d2b0e8bd884dfa scons: Allows choosing VS 10 or 11.
# This patch is superceded by 7d4f1e6
dbf94d105a48b7aafb2c8cf64d8b4392d87efea1 glsl: Replace constant-index vector array accesses with swizzles
# This patch is superceded by 34a4fc5
0967c362bf378b7415c30ca6d9523d3b2a3a7f5d i965: Fix an inconsistency inb the VUE map with gl_ClipVertex on gen4/5.
# This patch was backported as c3eb301
a8246927e35a49097f70cffb7fa8dd05ec1365e1 r600g: Fix UMAD on Cayman

52
bin/bugzilla_mesa.sh Executable file
View File

@@ -0,0 +1,52 @@
#!/bin/bash
# This script is used to generate the list of fixed bugs that
# appears in the release notes files, with HTML formatting.
#
# Note: This script could take a while until all details have
# been fetched from bugzilla.
#
# Usage examples:
#
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes
# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | wc -l
# regex pattern: trim before url
trim_before='s/.*\(http\)/\1/'
# regex pattern: trim after url
trim_after='s/\(show_bug.cgi?id=[0-9]*\).*/\1/'
# regex pattern: always use https
use_https='s/http:/https:/'
# extract fdo urls from commit log
urls=$(git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before -e $trim_after -e $use_https | sort | uniq)
# if DRYRUN is set to "yes", simply print the URLs and don't fetch the
# details from fdo bugzilla.
#DRYRUN=yes
if [ "x$DRYRUN" = xyes ]; then
for i in $urls
do
echo $i
done
else
echo "<ul>"
echo ""
for i in $urls
do
id=$(echo $i | cut -d'=' -f2)
summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>Bug [0-9]\+ &ndash; \(.*\)<\/title>/\1/')
echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"
echo ""
done
echo "</ul>"
fi

View File

@@ -1,6 +1,12 @@
#!/bin/sh
# Script for generating a list of candidates for cherry-picking to a stable branch
#
# Usage examples:
#
# $ bin/get-pick-list.sh
# $ bin/get-pick-list.sh > picklist
# $ bin/get-pick-list.sh | tee picklist
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
@@ -8,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^[[:space:]]*NOTE: This is a candidate' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^[[:space:]]*NOTE: .*[Cc]andidate' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.

View File

@@ -2,6 +2,12 @@
# This script is used to generate the list of changes that
# appears in the release notes files, with HTML formatting.
#
# Usage examples:
#
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 > changes
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes
typeset -i in_log=0

View File

@@ -100,4 +100,4 @@ def AddOptions(opts):
opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))
opts.Add(BoolOption('texture_float', 'enable floating-point textures and renderbuffers', 'no'))
if host_platform == 'windows':
opts.Add(EnumOption('MSVS_VERSION', 'MS Visual C++ version', None, allowed_values=('7.1', '8.0', '9.0')))
opts.Add(EnumOption('MSVC_VERSION', 'MS Visual C++ version', None, allowed_values=('7.1', '8.0', '9.0', '10.0', '11.0')))

View File

@@ -6,7 +6,7 @@ dnl Tell the user about autoconf.html in the --help output
m4_divert_once([HELP_END], [
See docs/autoconf.html for more details on the options for Mesa.])
AC_INIT([Mesa], [9.1.0],
AC_INIT([Mesa], [9.1.3],
[https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa])
AC_CONFIG_AUX_DIR([bin])
AC_CONFIG_MACRO_DIR([m4])
@@ -20,7 +20,8 @@ echo \#buildapi-variable-no-builddir >/dev/null
# Support silent build rules, requires at least automake-1.11. Disable
# by either passing --disable-silent-rules to configure or passing V=1
# to make
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])],
[AC_SUBST([AM_DEFAULT_VERBOSITY], [1])])
m4_ifdef([AM_PROG_AR], [AM_PROG_AR])
@@ -30,7 +31,7 @@ AC_SUBST([OSMESA_VERSION])
dnl Versions for external dependencies
LIBDRM_REQUIRED=2.4.24
LIBDRM_RADEON_REQUIRED=2.4.40
LIBDRM_RADEON_REQUIRED=2.4.42
LIBDRM_INTEL_REQUIRED=2.4.38
LIBDRM_NVVIEUX_REQUIRED=2.4.33
LIBDRM_NOUVEAU_REQUIRED="2.4.33 libdrm >= 2.4.41"
@@ -57,10 +58,10 @@ LT_PREREQ([2.2])
LT_INIT([disable-static])
AX_PROG_BISON([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"]
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"],
[AC_MSG_ERROR([bison not found - unable to compile glcpp-parse.y])]))
AX_PROG_FLEX([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"]
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"],
[AC_MSG_ERROR([flex not found - unable to compile glcpp-lex.l])]))
AC_PATH_PROG([PERL], [perl])
@@ -451,6 +452,9 @@ if test "x$enable_asm" = xyes; then
linux* | *freebsd* | dragonfly* | *netbsd*)
test "x$enable_64bit" = xyes && asm_arch=x86_64 || asm_arch=x86
;;
gnu*)
asm_arch=x86
;;
esac
;;
x86_64)
@@ -611,7 +615,7 @@ AC_ARG_ENABLE([opencl],
[enable OpenCL library NOTE: Enabling this option will also enable
--with-llvm-shared-libs
@<:@default=no@:>@])],
[enable_opencl="$enableval" with_llvm_shared_libs="$enableval"],
[],
[enable_opencl=no])
AC_ARG_ENABLE([xlib_glx],
[AS_HELP_STRING([--enable-xlib-glx],
@@ -701,6 +705,16 @@ if test "x$enable_dri$enable_xlib_glx" = xyesyes; then
AC_MSG_ERROR([DRI and Xlib-GLX cannot be built together])
fi
if test "x$enable_opengl$enable_xlib_glx" = xnoyes; then
AC_MSG_ERROR([Xlib-GLX cannot be built without OpenGL])
fi
# Disable GLX if OpenGL is not enabled
if test "x$enable_glx$enable_opengl" = xyesno; then
AC_MSG_WARN([OpenGL not enabled, disabling GLX])
enable_glx=no
fi
# Disable GLX if DRI and Xlib-GLX are not enabled
if test "x$enable_glx" = xyes -a \
"x$enable_dri" = xno -a \
@@ -815,20 +829,6 @@ if test "x$enable_dri" = xyes; then
fi
fi
dnl Find out if X is available.
PKG_CHECK_MODULES([X11], [x11], [no_x=no], [no_x=yes])
dnl Try to tell the user that the --x-* options are only used when
dnl pkg-config is not available. This must be right after AC_PATH_XTRA.
m4_divert_once([HELP_BEGIN],
[These options are only used when the X libraries cannot be found by the
pkg-config utility.])
dnl We need X for xlib and dri, so bomb now if it's not found
if test "x$enable_glx" = xyes -a "x$no_x" = xyes; then
AC_MSG_ERROR([X11 development libraries needed for GLX])
fi
dnl Direct rendering or just indirect rendering
case "$host_os" in
gnu*)
@@ -1619,8 +1619,13 @@ AC_ARG_ENABLE([gallium-llvm],
AC_ARG_WITH([llvm-shared-libs],
[AS_HELP_STRING([--with-llvm-shared-libs],
[link with LLVM shared libraries @<:@default=disabled@:>@])],
[with_llvm_shared_libs=yes],
[],
[with_llvm_shared_libs=no])
AS_IF([test x$enable_opencl = xyes],
[
AC_MSG_WARN([OpenCL required, forcing LLVM shared libraries])
with_llvm_shared_libs=yes
])
AC_ARG_WITH([llvm-prefix],
[AS_HELP_STRING([--with-llvm-prefix],
@@ -1662,16 +1667,17 @@ if test "x$enable_gallium_llvm" = xyes; then
if test "x$LLVM_CONFIG" != xno; then
LLVM_VERSION=`$LLVM_CONFIG --version | sed 's/svn.*//g'`
LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 's/\([[0-9]]\)\.\([[0-9]]\)/\10\2/g'`
if test "x$with_llvm_shared_libs" != xyes; then
LLVM_COMPONENTS="engine bitwriter"
if $LLVM_CONFIG --components | grep -q '\<mcjit\>'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
fi
LLVM_COMPONENTS="engine bitwriter"
if $LLVM_CONFIG --components | grep -q '\<mcjit\>'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
fi
if $LLVM_CONFIG --components | grep -q '\<oprofilejit\>'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} oprofilejit"
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
fi
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
fi
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_BINDIR=`$LLVM_CONFIG --bindir`
LLVM_CPPFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cppflags"`
@@ -1840,6 +1846,9 @@ if test "x$with_gallium_drivers" != x; then
if test "x$enable_r600_llvm" = xyes; then
USE_R600_LLVM_COMPILER=yes;
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
fi
gallium_check_st "radeon/drm" "dri-r600" "xorg-r600" "" "xvmc-r600" "vdpau-r600"
;;
xradeonsi)

View File

@@ -16,6 +16,23 @@
<h1>News</h1>
<h2>February 22, 2013</h2>
<p>
<a href="relnotes-9.1.html">Mesa 9.1</a> is released.
This is a new development release.
See the release notes for more information about the release.
</p>
<h2>February 21, 2013</h2>
<p>
<a href="relnotes-9.0.3.html">Mesa 9.0.3</a> is released.
This is a bug fix release.
</p>
<h2>January 22, 2013</h2>
<p>

235
docs/relnotes-9.1.1.html Normal file
View File

@@ -0,0 +1,235 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Mesa 9.1.1 Release Notes / March 19th, 2013</h1>
<p>
Mesa 9.1.1 is a bug fix release which fixes bugs found since the 9.1 release.
</p>
<p>
Mesa 9.1 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
6508d9882d8dce7106717f365632700c MesaLib-9.1.1.tar.gz
6ea2bdc3b7ecfb4257b39814b4182580 MesaLib-9.1.1.tar.bz2
3434c0eb47849a08c53cd32833d10d13 MesaLib-9.1.1.zip
</pre>
<h2>New features</h2>
<p>None.</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=30232">Bug 30232</a> - [GM45] mesa demos spriteblast render incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=32429">Bug 32429</a> - [gles2] Ironlake: gl_PointCoord takes no effect for point sprites</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=38086">Bug 38086</a> - Mesa 7.11-devel implementation error: Unexpected program target in destroy_program_variants_cb()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=57121">Bug 57121</a> - [snb] corrupted GLSL built-in function results when using Uniform Buffer contents as arguments</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58042">Bug 58042</a> - [bisected] Garbled UI in Team Fortress 2 and Counter-Strike: Source</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58960">Bug 58960</a> - Texture flicker with fragment shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59495">Bug 59495</a> - [i965 Bisected]Oglc fbblit(advanced.blitFb-3d-cube.mirror.both) fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59783">Bug 59783</a> - [IVB bisected] 3DMMES2.0 Taiji performance reduced by ~13% with gnome-session enable compositing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60121">Bug 60121</a> - build - libvdpau_softpipe fails at runtime.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60143">Bug 60143</a> - gbm_dri_bo_create fails to initialize bo-&gt;base.base.format</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60802">Bug 60802</a> - Corruption with DMA ring on cayman</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60848">Bug 60848</a> - [bisected] r600g: add htile support cause gpu lockup in Dishonored wine.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60938">Bug 60938</a> - [softpipe] piglit interpolation-noperspective-gl_BackColor-flat-fixed regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61012">Bug 61012</a> - alloc_layout_array tx * ty assertion failure when making pbuffer current</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61026">Bug 61026</a> - Segfault in glBitmap when called with PBO source</li>
<!-- <li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=">Bug </a> - </li> -->
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.1..mesa-9.1.1
</pre>
<p>Adam Sampson (1):</p>
<ul>
<li>autotools: oprofilejit should be included in the list of LLVM components required</li>
</ul>
<p>Alex Deucher (2):</p>
<ul>
<li>r600g: add Richland APU pci ids</li>
<li>r600g: Use blitter rather than DMA for 128bpp on cayman (v3)</li>
</ul>
<p>Andreas Boll (2):</p>
<ul>
<li>docs: Add 9.1 release md5sums</li>
<li>docs: add news item for 9.1 release</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>meta: Allocate texture before initializing texture coordinates</li>
</ul>
<p>Brian Paul (11):</p>
<ul>
<li>docs: remove stray 'date' text</li>
<li>docs: insert links to the 9.0.3 release</li>
<li>draw: fix non-perspective interpolation in interp()</li>
<li>st/mesa: implement glBitmap unpacking from a PBO, for the cache path</li>
<li>st/xlib: initialize the drawable size in create_xmesa_buffer()</li>
<li>st/mesa: fix trimming of GL_QUAD_STRIP</li>
<li>st/mesa: check for dummy programs in destroy_program_variants()</li>
<li>st/mesa: fix polygon offset state translation logic</li>
<li>draw: fix broken polygon offset stage</li>
<li>llvmpipe: add missing checks for polygon offset point/line modes</li>
<li>svga: always link with C++</li>
</ul>
<p>Daniel van Vugt (1):</p>
<ul>
<li>gbm: Remember to init format on gbm_dri_bo_create.</li>
</ul>
<p>Eric Anholt (7):</p>
<ul>
<li>i965/fs: Do a general SEND dependency workaround for the original 965.</li>
<li>i965/fs: Fix copy propagation with smearing.</li>
<li>i965/fs: Delay setup of uniform loads until after pre-regalloc scheduling.</li>
<li>i965/fs: Only do CSE when the dst types match.</li>
<li>i965/fs: Fix broken math on values loaded from uniform buffers on gen6.</li>
<li>mesa: Fix setup of ctx-&gt;Point.PointSprite for GLES2.</li>
<li>i965: Fix the W value of deprecated pointcoords on pre-gen6.</li>
</ul>
<p>Frank Henigman (1):</p>
<ul>
<li>i965: Link i965_dri.so with C++ linker.</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>mesa: Add previously picked commit to .cherry-ignore</li>
<li>mesa: Modify candidate search string</li>
<li>egl: Allow 24-bit visuals for 32-bit RGBA8888 configs</li>
</ul>
<p>Jakub Bogusz (1):</p>
<ul>
<li>vdpau-softpipe: Build correct source file - vl_winsys_xsp.c</li>
</ul>
<p>Jerome Glisse (1):</p>
<ul>
<li>r600g: workaround hyperz lockup on evergreen</li>
</ul>
<p>John Kåre Alsaker (1):</p>
<ul>
<li>llvmpipe: Fix creation of shared and scanout textures.</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>attrib: push/pop FRAGMENT_PROGRAM_ARB state</li>
</ul>
<p>José Fonseca (3):</p>
<ul>
<li>scons: Allows choosing VS 10 or 11.</li>
<li>scons: Define _ALLOW_KEYWORD_MACROS on MSVC builds.</li>
<li>scons: Warn when using MSVS versions prior to 2012.</li>
</ul>
<p>Keith Kriewall (1):</p>
<ul>
<li>scons: Fix Windows build with LLVM 3.2</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Fix Crystal Well PCI IDs.</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>r600g: use async DMA with a non-zero src offset</li>
<li>r600g: flush and invalidate htile cache when appropriate</li>
<li>gallium/util: add helper code for 1D integer range</li>
<li>r600g: always map uninitialized buffer range as unsynchronized</li>
<li>r600g: pad the DMA CS to a multiple of 8 dwords</li>
</ul>
<p>Martin Andersson (1):</p>
<ul>
<li>winsys/radeon: Only add bo to hash table when creating flink</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>mesa: Allow ETC2/EAC formats with ARB_ES3_compatibility.</li>
</ul>
<p>Michel Dänzer (3):</p>
<ul>
<li>radeonsi: Fix up and enable flat shading.</li>
<li>r600g/Cayman: Fix blending using destination alpha factor but non-alpha dest</li>
<li>radeonsi: Fix off-by-one for maximum vertex element index in some cases</li>
</ul>
<p>Tapani Pälli (2):</p>
<ul>
<li>mesa: add missing case in _mesa_GetTexParameterfv()</li>
<li>mesa/es: NULL check in EGLImageTargetTexture2DOES</li>
</ul>
<p>Vadim Girlin (1):</p>
<ul>
<li>r600g: fix check_and_set_bank_swizzle for cayman</li>
</ul>
<p>Vincent Lejeune (2):</p>
<ul>
<li>r600g/llvm: Add support for UBO</li>
<li>r600g: Check comp_mask before merging export instructions</li>
</ul>
</div>
</body>
</html>

237
docs/relnotes-9.1.2.html Normal file
View File

@@ -0,0 +1,237 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Mesa 9.1.2 Release Notes / April 30th, 2013</h1>
<p>
Mesa 9.1.2 is a bug fix release which fixes bugs found since the 9.1.1 release.
</p>
<p>
Mesa 9.1 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
df2aab86ff4a510ce5b0d074caa0a59f MesaLib-9.1.2.tar.bz2
415c2bc3a9eb571aafbfa474ebf5a2e0 MesaLib-9.1.2.tar.gz
b1ae5a4d9255953980bc9254f5323420 MesaLib-9.1.2.zip
</pre>
<h2>New features</h2>
<p>None.</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=44567">Bug 44567</a> - [965gm] green artifacts when using GLSL in XBMC</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59238">Bug 59238</a> - many new symbols in libxatracker after recent automake work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59445">Bug 59445</a> - [SNB/IVB/HSW Bisected]Oglc draw-buffers2(advanced.blending.none) segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59495">Bug 59495</a> - [i965 Bisected]Oglc fbblit(advanced.blitFb-3d-cube.mirror.both) fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60503">Bug 60503</a> - [r300g] Unigine Heaven 3.0: all objects are black</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60510">Bug 60510</a> - Firefox 18.0.2 Crash On Nvidia GeForce2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61197">Bug 61197</a> - [SNB Bisected] kwin_gles screen corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61317">Bug 61317</a> - [IVB] corrupt rendering with UBOs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61395">Bug 61395</a> - glEdgeFlag can't be set to false</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61947">Bug 61947</a> - nullpointer dereference causes xorg-server segfault when nouveau DRI driver is loaded</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62357">Bug 62357</a> - llvmpipe: Fragment Shader with &quot;return&quot; in main causes back output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62434">Bug 62434</a> - [bisected] 3284.073] (EE) AIGLX error: dlopen of /usr/lib/xorg/modules/dri/r600_dri.so failed (/usr/lib/libllvmradeon9.2.0.so: undefined symbol: lp_build_tgsi_intrinsic)</li>
<li><a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=349437">Debian bug #349437</a> - mesa - FTBFS: error: 'IEEE_ONE' undeclared</li>
<li><a href="http://bugzilla.redhat.com/show_bug.cgi?id=918661">Redhat bug #918661</a> - crash in routine Avogadro UI manipulation</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.1.1..mesa-9.1.2
</pre>
<p>Adam Jackson (2):</p>
<ul>
<li>glx: Build with VISIBILITY_CFLAGS in automake</li>
<li>linux: Don't emit a .note.ABI-tag section anymore (#26663)</li>
</ul>
<p>Alan Hourihane (3):</p>
<ul>
<li>Add missing GL_TEXTURE_CUBE_MAP entry in _mesa_legal_texture_dimensions</li>
<li>Unreference sampler object when it's currently bound to texture unit.</li>
<li>mesa: fix glGetInteger*(GL_SAMPLER_BINDING).</li>
</ul>
<p>Alex Deucher (1):</p>
<ul>
<li>r600g: disable hyperz by default on 9.1</li>
</ul>
<p>Andreas Boll (5):</p>
<ul>
<li>radeon/llvm: Link against libgallium.la to fix an undefined symbol</li>
<li>mesa: use ieee fp on s390 and m68k</li>
<li>build: Enable x86 assembler on Hurd.</li>
<li>osmesa: fix out-of-tree build</li>
<li>gallium/egl: fix out-of-tree build</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>mesa: Fix FB blitting in case of zero size src or dst rect</li>
</ul>
<p>Brian Paul (4):</p>
<ul>
<li>mesa: flush current state when querying GL_EDGE_FLAG</li>
<li>vbo: fix crash found with shared display lists</li>
<li>llvmpipe: tweak CMD_BLOCK_MAX and LP_SCENE_MAX_SIZE</li>
<li>llvmpipe: add some scene limit sanity check assertions</li>
</ul>
<p>Carl Worth (1):</p>
<ul>
<li>i965: Avoid segfault in gen6_upload_state</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965/vs: Fix Gen4/5 VUE map inconsistency with gl_ClipVertex</li>
</ul>
<p>Christoph Bumiller (4):</p>
<ul>
<li>nv50: fix 3D render target setup</li>
<li>nv50,nvc0: disable DEPTH_RANGE_NEAR/FAR clipping during blit</li>
<li>nv50,nvc0: fix 3d blits, restore viewport after blit</li>
<li>nvc0: fix for 2d engine R source formats writing RRR1 and not R001</li>
</ul>
<p>Eric Anholt (5):</p>
<ul>
<li>i965/fs: Fix register allocation for uniform pull constants in 16-wide.</li>
<li>i965/fs: Fix broken rendering in large shaders with UBO loads.</li>
<li>i965/fs: Also do the gen4 SEND dependency workaround against other SENDs.</li>
<li>i965: Add definitions for gen7+ data cache messages.</li>
<li>mesa: Disable validate_ir_tree() on release builds.</li>
</ul>
<p>Ian Romanick (5):</p>
<ul>
<li>docs: Add 9.1.1 release md5sums</li>
<li>mesa: Add previously picked commit to .cherry-ignore</li>
<li>glsl: Add missing bool case in glsl_type::get_scalar_type</li>
<li>mesa: Note that patch dbf94d1 should't actually get picked to the 9.1 branch</li>
<li>mesa: Bump version to 9.1.2</li>
</ul>
<p>Jan de Groot (1):</p>
<ul>
<li>dri/nouveau: fix crash in nouveau_flush</li>
</ul>
<p>José Fonseca (3):</p>
<ul>
<li>autotools: Add missing top-level include dir.</li>
<li>mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all.</li>
<li>include: Fix build with VS 11 (i.e, 2012).</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>i965: Fix INTEL_DEBUG=shader_time for Haswell.</li>
<li>i965: Specialize SURFACE_STATE creation for shader time.</li>
<li>i965: Make INTEL_DEBUG=shader_time use the RAW surface format.</li>
<li>i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.</li>
</ul>
<p>Maarten Lankhorst (2):</p>
<ul>
<li>gallium/build: Fix visibility CFLAGS in automake</li>
<li>radeon/llvm: Do not link against libgallium when building statically.</li>
</ul>
<p>Marcin Slusarz (1):</p>
<ul>
<li>dri/nouveau: NV17_3D class is not available for NV1a chipset</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>mesa: don't allocate a texture if width or height is 0 in CopyTexImage</li>
<li>gallium/tgsi: fix valgrind warning</li>
<li>mesa: handle HALF_FLOAT like FLOAT in get_tex_rgba</li>
</ul>
<p>Martin Andersson (1):</p>
<ul>
<li>r600g: Use virtual address for PIPE_QUERY_SO* in r600_emit_query_end</li>
</ul>
<p>Matt Turner (3):</p>
<ul>
<li>configure.ac: Don't check for X11 unconditionally.</li>
<li>configure.ac: Remove stale comment about --x-* arguments.</li>
<li>mesa: Implement TEXTURE_IMMUTABLE_LEVELS for ES 3.0.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Emit pixel shader state even when only the vertex shader changed</li>
</ul>
<p>Paul Berry (1):</p>
<ul>
<li>i965: Apply depthstencil alignment workaround when doing fast clears.</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: fix return opcode handling in main function of a shader</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>intel: Fix regression in intel_create_image_from_name stride handling</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>r300g: Fix bug in OMOD optimization</li>
</ul>
</div>
</body>
</html>

228
docs/relnotes-9.1.3.html Normal file
View File

@@ -0,0 +1,228 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Mesa 9.1.3 Release Notes / May 21st, 2013</h1>
<p>
Mesa 9.1.3 is a bug fix release which fixes bugs found since the 9.1.1 release.
</p>
<p>
Mesa 9.1 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None.</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=39251">Bug 39251</a> - Second Life viewers from release 2.7.4.235167 to the last 3.4.0.264911 crash on start.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=47478">Bug 47478</a> - [wine] GLX_DONT_CARE does not work for GLX_DRAWABLE_TYPE or GLX_RENDER_TYPE</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=56416">Bug 56416</a> - [SNB bisected] SNB hang with rc6 and hiz on glxgears (and other GL apps) immediately after xinit.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=57436">Bug 57436</a> - [GLSL1.40 IVB/HSW]Piglit spec/glsl-1.40/compiler_built-in-functions/inverse-mat2.frag fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61554">Bug 61554</a> - [ivb] Mesa 9.1 performance regression on KWin's Lanczos shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61773">Bug 61773</a> - abort is an incredibly not-smart way to handle IR validation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62868">Bug 62868</a> - solaris build broken with missing ffsll</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62999">Bug 62999</a> - glXChooseFBConfig with GLX_DRAWABLE_TYPE, GLX_DONT_CARE fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=63078">Bug 63078</a> - EGL X11 Regression: Maximum swap interval is 0 (worked with 9.0)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=63447">Bug 63447</a> - [i965 Bisected]Ogles1conform/Ogles2conform/Ogles3conform cases segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64662">Bug 64662</a> - [SNB 9.1 Bisected]Ogles2conform GL2ExtensionTests/depth_texture_cube_map/depth_texture_cube_map.test fail</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.1.2..mesa-9.1.3
</pre>
<p>Alex Deucher (2):</p>
<ul>
<li>r600g: add new richland pci ids</li>
<li>radeonsi: add new SI pci ids</li>
</ul>
<p>Alexander Monakov (1):</p>
<ul>
<li>Honor GLX_DONT_CARE in MATCH_MASK</li>
</ul>
<p>Andreas Boll (2):</p>
<ul>
<li>mesa: Add a script to generate the list of fixed bugs</li>
<li>mesa: add usage examples to get-pick-list and shortlog scripts</li>
</ul>
<p>Aras Pranckevicius (1):</p>
<ul>
<li>GLSL: fix lower_jumps to report progress properly</li>
</ul>
<p>Brian Paul (3):</p>
<ul>
<li>mesa: remove platform checks around __builtin_ffs, __builtin_ffsll</li>
<li>gallium/u_blitter: fix is_blit_generic_supported() stencil checking</li>
<li>mesa: enable GL_ARB_texture_float if TEXTURE_FLOAT_ENABLED is defined</li>
</ul>
<p>Chad Versace (2):</p>
<ul>
<li>egl/dri2: Fix min/max swap interval of configs</li>
<li>intel: Allocate hiz in intel_renderbuffer_move_to_temp()</li>
</ul>
<p>Chris Forbes (2):</p>
<ul>
<li>i965/fs: Don't try to use bogus interpolation modes pre-Gen6.</li>
<li>mesa: don't memcmp() off the end of a cache key.</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>st/mesa: fix UBO offsets.</li>
<li>ralloc: don't write to memory in case of alloc fail.</li>
</ul>
<p>Eric Anholt (11):</p>
<ul>
<li>i965/fs: Remove creation of a MOV instruction that's never used.</li>
<li>i965/fs: Move varying uniform offset compuation into the helper func.</li>
<li>i965: Make the constant surface interface take a normal byte size.</li>
<li>i965/fs: Avoid inappropriate optimization with regs_written &gt; 1.</li>
<li>i965/fs: Do CSE on gen7's varying-index pull constant loads.</li>
<li>i965/fs: Clean up the setup of gen4 simd16 message destinations.</li>
<li>i965/gen7: Skip resetting SOL offsets at batch start with HW contexts.</li>
<li>i965/gen6: Reduce updates of transform feedback offsets with HW contexts.</li>
<li>i965: Fix SNB GPU hangs when a blorp batch is the first thing to execute.</li>
<li>i965: Fix hangs on HSW since the gen6 blorp fix.</li>
<li>i965: Disable write masking when setting up texturing m0.</li>
</ul>
<p>Haixia Shi (1):</p>
<ul>
<li>ACTIVE_UNIFORM_MAX_LENGTH should include 3 extra characters for arrays.</li>
</ul>
<p>Ian Romanick (11):</p>
<ul>
<li>docs: Add 9.1.2 release md5sums</li>
<li>mesa: Note that patch 0967c36 shouldn't actually get picked to the 9.1 branch</li>
<li>mesa: NULL check the pointer before trying to dereference it</li>
<li>egl/dri2: NULL check value returned by dri2_create_surface</li>
<li>mesa: Don't leak shared state when context initialization fails</li>
<li>mesa: Don't leak gl_context::BeginEnd at context destruction</li>
<li>mesa/swrast: Refactor no-memory error checking in blit_linear</li>
<li>mesa/swrast: Move free calls outside the attachment loop</li>
<li>intel: Don't dereference a NULL pointer of calloc fails</li>
<li>mesa: Note that a824692 is already back ported</li>
<li>mesa: Bump version to 9.1.3</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>winsys/sw/xlib: Prevent shared memory segment leakage.</li>
</ul>
<p>Kenneth Graunke (9):</p>
<ul>
<li>mesa: Add new ctx-&gt;Stencil._WriteEnabled derived state flag.</li>
<li>i965: Fix stencil write enable flag in 3DSTATE_DEPTH_BUFFER on Gen7+.</li>
<li>mesa: Fix unpack function for ETC2_SRGB8_PUNCHTHROUGH_ALPHA1.</li>
<li>mesa: Add an unpack function for ARGB2101010_UINT.</li>
<li>mesa: Add unpack functions for R/RG/RGB [U]INT8/16/32 formats.</li>
<li>mesa: Add unpack functions for A/I/L/LA [U]INT8/16/32 formats.</li>
<li>glsl: Ignore redundant prototypes after a function's been defined.</li>
<li>i965: Lower textureGrad() for samplerCubeShadow.</li>
<li>i965/vs: Fix textureGrad() with shadow samplers on Haswell.</li>
</ul>
<p>Maarten Lankhorst (1):</p>
<ul>
<li>nvc0: Fix fd leak in nvc0_create_decoder</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>radeonsi: add more cases for copying unsupported formats to resource_copy_region</li>
<li>mesa: fix glGet queries depending on derived framebuffer state (v2)</li>
<li>gallium/u_blitter: implement buffer clearing</li>
<li>r600g: initialize CMASK and HTILE with the GPU using streamout</li>
<li>st/mesa: depth-stencil-alpha state also depends on _NEW_BUFFERS</li>
</ul>
<p>Martin Andersson (1):</p>
<ul>
<li>r600g: Fix UMAD on Cayman</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Handle arbitrary 2-byte formats in resource_copy_region</li>
</ul>
<p>Paul Berry (7):</p>
<ul>
<li>glsl: Fix array indexing when constant folding built-in functions.</li>
<li>i965: Reduce code duplication in handling of depth, stencil, and HiZ.</li>
<li>glsl/linker: fix varying packing for non-flat integer varyings.</li>
<li>glsl: Document lower_packed_varyings' "flat" requirement with an assert.</li>
<li>glsl/linker: Adapt flat varying handling in preparation for geometry shaders.</li>
<li>glsl/linker: Reduce scope of non-flat integer varying fix.</li>
<li>intel: Do a depth resolve before copying images between miptrees.</li>
</ul>
<p>Ralf Jung (1):</p>
<ul>
<li>egl/x11: Fix initialisation of swap_interval</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: fix small but severe bug in handling multiple lod level strides</li>
</ul>
<p>Vadim Girlin (1):</p>
<ul>
<li>gallium: handle drirc disable_glsl_line_continuations option</li>
</ul>
</div>
</body>
</html>

View File

@@ -14,7 +14,7 @@
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Mesa 9.1 Release Notes / date TBD</h1>
<h1>Mesa 9.1 Release Notes / February 22, 2013</h1>
<p>
Mesa 9.1 is a new development release.
@@ -33,7 +33,9 @@ because GL_ARB_compatibility is not supported.
<h2>MD5 checksums</h2>
<pre>
tbd
86d40f3056f89949368764bf84aff55e MesaLib-9.1.tar.gz
d3891e02215422e120271d976ff1947e MesaLib-9.1.tar.bz2
01645f28f53351c23b0beb6c688911d8 MesaLib-9.1.zip
</pre>
@@ -44,9 +46,19 @@ Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ANGLE_texture_compression_dxt3</li>
<li>GL_ANGLE_texture_compression_dxt5</li>
<li>GL_ARB_ES3_compatibility</li>
<li>GL_ARB_internalformat_query</li>
<li>GL_ARB_map_buffer_alignment</li>
<li>GL_ARB_texture_cube_map_array</li>
<li>GL_ARB_shading_language_packing</li>
<li>GL_ARB_texture_buffer_object_rgb32</li>
<li>GL_ARB_texture_cube_map_array</li>
<li>GL_EXT_color_buffer_float</li>
<li>GL_OES_depth_texture_cube_map</li>
<li>OpenGL 3.1 core profile support on Radeon HD2000 up to HD6000 series </li>
<li>Multisample anti-aliasing support on Radeon X1000 series</li>
<li>OpenGL ES 3.0 support on Intel HD Graphics 2000, 2500, 3000, and 4000</li>
</ul>
@@ -63,6 +75,7 @@ Note: some of the new features are only available with certain drivers.
<li>Removed swrast support for GL_NV_vertex_program</li>
<li>Removed swrast support for GL_NV_fragment_program</li>
<li>Removed OpenVMS support (unmaintained and broken)</li>
<li>Removed makedepend build dependency</li>
</ul>
</div>

View File

@@ -22,6 +22,7 @@ The release notes summarize what's new or changed in each Mesa release.
<ul>
<li><a href="relnotes-9.1.html">9.1 release notes</a>
<li><a href="relnotes-9.0.3.html">9.0.3 release notes</a>
<li><a href="relnotes-9.0.2.html">9.0.2 release notes</a>
<li><a href="relnotes-9.0.1.html">9.0.1 release notes</a>
<li><a href="relnotes-9.0.html">9.0 release notes</a>

147
include/c99_compat.h Normal file
View File

@@ -0,0 +1,147 @@
/**************************************************************************
*
* Copyright 2007-2013 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
#ifndef _C99_COMPAT_H_
#define _C99_COMPAT_H_
/*
* MSVC hacks.
*/
#if defined(_MSC_VER)
/*
* Visual Studio 2012 will complain if we define the `inline` keyword, but
* actually it only supports the keyword on C++.
*
* We could skip this check by defining _ALLOW_KEYWORD_MACROS, but there is
* probably value in checking this for other keywords. So simply include
* the checking before we define it below.
*/
# if _MSC_VER >= 1700
# include <xkeycheck.h>
# endif
/*
* XXX: MSVC has a `__restrict` keyword, but it also has a
* `__declspec(restrict)` modifier, so it is impossible to define a
* `restrict` macro without interfering with the latter. Furthermore the
* MSVC standard library uses __declspec(restrict) under the _CRTRESTRICT
* macro. For now resolve this issue by redefining _CRTRESTRICT, but going
* forward we should probably should stop using restrict, especially
* considering that our code does not obbey strict aliasing rules any way.
*/
# include <crtdefs.h>
# undef _CRTRESTRICT
# define _CRTRESTRICT
#endif
/*
* C99 inline keyword
*/
#ifndef inline
# ifdef __cplusplus
/* C++ supports inline keyword */
# elif defined(__GNUC__)
# define inline __inline__
# elif defined(_MSC_VER)
# define inline __inline
# elif defined(__ICL)
# define inline __inline
# elif defined(__INTEL_COMPILER)
/* Intel compiler supports inline keyword */
# elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
# define inline __inline
# elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
/* C99 supports inline keyword */
# elif (__STDC_VERSION__ >= 199901L)
/* C99 supports inline keyword */
# else
# define inline
# endif
#endif
/*
* C99 restrict keyword
*
* See also:
* - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
*/
#ifndef restrict
# if (__STDC_VERSION__ >= 199901L)
/* C99 */
# elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
/* C99 */
# elif defined(__GNUC__)
# define restrict __restrict__
# elif defined(_MSC_VER)
# define restrict __restrict
# else
# define restrict /* */
# endif
#endif
/*
* C99 __func__ macro
*/
#ifndef __func__
# if (__STDC_VERSION__ >= 199901L)
/* C99 */
# elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
/* C99 */
# elif defined(__GNUC__)
# if __GNUC__ >= 2
# define __func__ __FUNCTION__
# else
# define __func__ "<unknown>"
# endif
# elif defined(_MSC_VER)
# if _MSC_VER >= 1300
# define __func__ __FUNCTION__
# else
# define __func__ "<unknown>"
# endif
# else
# define __func__ "<unknown>"
# endif
#endif
/* Simple test case for debugging */
#if 0
static inline const char *
test_c99_compat_h(const void * restrict a,
const void * restrict b)
{
return __func__;
}
#endif
#endif /* _C99_COMPAT_H_ */

View File

@@ -53,12 +53,12 @@ CHIPSET(0x0A26, HASWELL_ULT_M_GT2_PLUS, hsw_gt2)
CHIPSET(0x0A0A, HASWELL_ULT_S_GT1, hsw_gt1)
CHIPSET(0x0A1A, HASWELL_ULT_S_GT2, hsw_gt2)
CHIPSET(0x0A2A, HASWELL_ULT_S_GT2_PLUS, hsw_gt2)
CHIPSET(0x0D12, HASWELL_CRW_GT1, hsw_gt1)
CHIPSET(0x0D22, HASWELL_CRW_GT2, hsw_gt2)
CHIPSET(0x0D32, HASWELL_CRW_GT2_PLUS, hsw_gt2)
CHIPSET(0x0D16, HASWELL_CRW_M_GT1, hsw_gt1)
CHIPSET(0x0D26, HASWELL_CRW_M_GT2, hsw_gt2)
CHIPSET(0x0D36, HASWELL_CRW_M_GT2_PLUS, hsw_gt2)
CHIPSET(0x0D1A, HASWELL_CRW_S_GT1, hsw_gt1)
CHIPSET(0x0D2A, HASWELL_CRW_S_GT2, hsw_gt2)
CHIPSET(0x0D3A, HASWELL_CRW_S_GT2_PLUS, hsw_gt2)
CHIPSET(0x0D02, HASWELL_CRW_GT1, hsw_gt1)
CHIPSET(0x0D12, HASWELL_CRW_GT2, hsw_gt2)
CHIPSET(0x0D22, HASWELL_CRW_GT2_PLUS, hsw_gt2)
CHIPSET(0x0D06, HASWELL_CRW_M_GT1, hsw_gt1)
CHIPSET(0x0D16, HASWELL_CRW_M_GT2, hsw_gt2)
CHIPSET(0x0D26, HASWELL_CRW_M_GT2_PLUS, hsw_gt2)
CHIPSET(0x0D0A, HASWELL_CRW_S_GT1, hsw_gt1)
CHIPSET(0x0D1A, HASWELL_CRW_S_GT2, hsw_gt2)
CHIPSET(0x0D2A, HASWELL_CRW_S_GT2_PLUS, hsw_gt2)

View File

@@ -298,6 +298,10 @@ CHIPSET(0x9907, ARUBA_9907, ARUBA)
CHIPSET(0x9908, ARUBA_9908, ARUBA)
CHIPSET(0x9909, ARUBA_9909, ARUBA)
CHIPSET(0x990A, ARUBA_990A, ARUBA)
CHIPSET(0x990B, ARUBA_990B, ARUBA)
CHIPSET(0x990C, ARUBA_990C, ARUBA)
CHIPSET(0x990D, ARUBA_990D, ARUBA)
CHIPSET(0x990E, ARUBA_990E, ARUBA)
CHIPSET(0x990F, ARUBA_990F, ARUBA)
CHIPSET(0x9910, ARUBA_9910, ARUBA)
CHIPSET(0x9913, ARUBA_9913, ARUBA)
@@ -309,6 +313,15 @@ CHIPSET(0x9991, ARUBA_9991, ARUBA)
CHIPSET(0x9992, ARUBA_9992, ARUBA)
CHIPSET(0x9993, ARUBA_9993, ARUBA)
CHIPSET(0x9994, ARUBA_9994, ARUBA)
CHIPSET(0x9995, ARUBA_9995, ARUBA)
CHIPSET(0x9996, ARUBA_9996, ARUBA)
CHIPSET(0x9997, ARUBA_9997, ARUBA)
CHIPSET(0x9998, ARUBA_9998, ARUBA)
CHIPSET(0x9999, ARUBA_9999, ARUBA)
CHIPSET(0x999A, ARUBA_999A, ARUBA)
CHIPSET(0x999B, ARUBA_999B, ARUBA)
CHIPSET(0x999C, ARUBA_999C, ARUBA)
CHIPSET(0x999D, ARUBA_999D, ARUBA)
CHIPSET(0x99A0, ARUBA_99A0, ARUBA)
CHIPSET(0x99A2, ARUBA_99A2, ARUBA)
CHIPSET(0x99A4, ARUBA_99A4, ARUBA)

View File

@@ -28,6 +28,7 @@ CHIPSET(0x684C, PITCAIRN_684C, PITCAIRN)
CHIPSET(0x6820, VERDE_6820, VERDE)
CHIPSET(0x6821, VERDE_6821, VERDE)
CHIPSET(0x6822, VERDE_6822, VERDE)
CHIPSET(0x6823, VERDE_6823, VERDE)
CHIPSET(0x6824, VERDE_6824, VERDE)
CHIPSET(0x6825, VERDE_6825, VERDE)
@@ -35,14 +36,30 @@ CHIPSET(0x6826, VERDE_6826, VERDE)
CHIPSET(0x6827, VERDE_6827, VERDE)
CHIPSET(0x6828, VERDE_6828, VERDE)
CHIPSET(0x6829, VERDE_6829, VERDE)
CHIPSET(0x682A, VERDE_682A, VERDE)
CHIPSET(0x682B, VERDE_682B, VERDE)
CHIPSET(0x682D, VERDE_682D, VERDE)
CHIPSET(0x682F, VERDE_682F, VERDE)
CHIPSET(0x6830, VERDE_6830, VERDE)
CHIPSET(0x6831, VERDE_6831, VERDE)
CHIPSET(0x6835, VERDE_6835, VERDE)
CHIPSET(0x6837, VERDE_6837, VERDE)
CHIPSET(0x6838, VERDE_6838, VERDE)
CHIPSET(0x6839, VERDE_6839, VERDE)
CHIPSET(0x683B, VERDE_683B, VERDE)
CHIPSET(0x683D, VERDE_683D, VERDE)
CHIPSET(0x683F, VERDE_683F, VERDE)
CHIPSET(0x6600, OLAND_6600, OLAND)
CHIPSET(0x6601, OLAND_6601, OLAND)
CHIPSET(0x6602, OLAND_6602, OLAND)
CHIPSET(0x6603, OLAND_6603, OLAND)
CHIPSET(0x6606, OLAND_6606, OLAND)
CHIPSET(0x6607, OLAND_6607, OLAND)
CHIPSET(0x6610, OLAND_6610, OLAND)
CHIPSET(0x6611, OLAND_6611, OLAND)
CHIPSET(0x6613, OLAND_6613, OLAND)
CHIPSET(0x6620, OLAND_6620, OLAND)
CHIPSET(0x6621, OLAND_6621, OLAND)
CHIPSET(0x6623, OLAND_6623, OLAND)
CHIPSET(0x6631, OLAND_6631, OLAND)

View File

@@ -289,6 +289,7 @@ def generate(env):
'_CRT_SECURE_NO_DEPRECATE',
'_SCL_SECURE_NO_WARNINGS',
'_SCL_SECURE_NO_DEPRECATE',
'_ALLOW_KEYWORD_MACROS',
]
if env['build'] in ('debug', 'checked'):
cppdefines += ['_DEBUG']
@@ -401,6 +402,8 @@ def generate(env):
'/Oi', # enable intrinsic functions
]
else:
if distutils.version.LooseVersion(env['MSVC_VERSION']) < distutils.version.LooseVersion('11.0'):
print 'scons: warning: Visual Studio versions prior to 2012 are known to produce incorrect code when optimizations are enabled ( https://bugs.freedesktop.org/show_bug.cgi?id=58718 )'
ccflags += [
'/O2', # optimize for speed
]
@@ -530,7 +533,7 @@ def generate(env):
env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])
env.PkgCheckModules('DRM', ['libdrm >= 2.4.24'])
env.PkgCheckModules('DRM_INTEL', ['libdrm_intel >= 2.4.30'])
env.PkgCheckModules('DRM_RADEON', ['libdrm_radeon >= 2.4.40'])
env.PkgCheckModules('DRM_RADEON', ['libdrm_radeon >= 2.4.42'])
env.PkgCheckModules('XORG', ['xorg-server >= 1.6.0'])
env.PkgCheckModules('KMS', ['libkms >= 2.4.24'])
env.PkgCheckModules('UDEV', ['libudev > 150'])

View File

@@ -92,7 +92,19 @@ def generate(env):
'HAVE_STDINT_H',
])
env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
if llvm_version >= distutils.version.LooseVersion('3.0'):
if llvm_version >= distutils.version.LooseVersion('3.2'):
# 3.2
env.Prepend(LIBS = [
'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMX86Desc', 'LLVMSelectionDAG',
'LLVMAsmPrinter', 'LLVMMCParser', 'LLVMX86AsmPrinter',
'LLVMX86Utils', 'LLVMX86Info', 'LLVMJIT',
'LLVMExecutionEngine', 'LLVMCodeGen', 'LLVMScalarOpts',
'LLVMInstCombine', 'LLVMTransformUtils', 'LLVMipa',
'LLVMAnalysis', 'LLVMTarget', 'LLVMMC', 'LLVMCore',
'LLVMSupport', 'LLVMRuntimeDyld', 'LLVMObject'
])
elif llvm_version >= distutils.version.LooseVersion('3.0'):
# 3.0
env.Prepend(LIBS = [
'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',

View File

@@ -195,7 +195,14 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
for (i = 0; attr_list[i] != EGL_NONE; i += 2)
_eglSetConfigKey(&base, attr_list[i], attr_list[i+1]);
if (depth > 0 && depth != base.BufferSize)
/* Allow a 24-bit RGB visual to match a 32-bit RGBA EGLConfig. Otherwise
* it will only match a 32-bit RGBA visual. On a composited window manager
* on X11, this will make all of the EGLConfigs with destination alpha get
* blended by the compositor. This is probably not what the application
* wants... especially on drivers that only have 32-bit RGBA EGLConfigs!
*/
if (depth > 0 && depth != base.BufferSize
&& !(depth == 24 && base.BufferSize == 32))
return NULL;
if (rgba_masks && memcmp(rgba_masks, dri_masks, sizeof(dri_masks)))
@@ -214,6 +221,9 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
base.RenderableType = disp->ClientAPIs;
base.Conformant = disp->ClientAPIs;
base.MinSwapInterval = dri2_dpy->min_swap_interval;
base.MaxSwapInterval = dri2_dpy->max_swap_interval;
if (!_eglValidateConfig(&base, EGL_FALSE)) {
_eglLog(_EGL_DEBUG, "DRI2: failed to validate config %d", id);
return NULL;
@@ -261,9 +271,6 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
if (double_buffer) {
surface_type &= ~EGL_PIXMAP_BIT;
conf->base.MinSwapInterval = dri2_dpy->min_swap_interval;
conf->base.MaxSwapInterval = dri2_dpy->max_swap_interval;
}
conf->base.SurfaceType |= surface_type;

View File

@@ -446,6 +446,7 @@ dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
{
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw);
__DRIbuffer buffer;
int i, ret = 0;
while (dri2_surf->frame_callback && ret != -1)
@@ -463,6 +464,13 @@ dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
if (dri2_surf->color_buffers[i].age > 0)
dri2_surf->color_buffers[i].age++;
/* Make sure we have a back buffer in case we're swapping without ever
* rendering. */
if (get_back_bo(dri2_surf, &buffer) < 0) {
_eglError(EGL_BAD_ALLOC, "dri2_swap_buffers");
return EGL_FALSE;
}
dri2_surf->back->age = 1;
dri2_surf->current = dri2_surf->back;
dri2_surf->back = NULL;

View File

@@ -284,14 +284,15 @@ dri2_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
surf = dri2_create_surface(drv, disp, EGL_WINDOW_BIT, conf,
window, attrib_list);
if (surf != NULL) {
/* When we first create the DRI2 drawable, its swap interval on the
* server side is 1.
*/
surf->SwapInterval = 1;
/* When we first create the DRI2 drawable, its swap interval on the server
* side is 1.
*/
surf->SwapInterval = 1;
/* Override that with a driconf-set value. */
drv->API.SwapInterval(drv, disp, surf, dri2_dpy->default_swap_interval);
/* Override that with a driconf-set value. */
drv->API.SwapInterval(drv, disp, surf, dri2_dpy->default_swap_interval);
}
return surf;
}
@@ -1162,6 +1163,8 @@ dri2_initialize_x11_dri2(_EGLDriver *drv, _EGLDisplay *disp)
if (!dri2_create_screen(disp))
goto cleanup_fd;
dri2_setup_swap_interval(dri2_dpy);
if (dri2_dpy->conn) {
if (!dri2_add_configs_for_visuals(dri2_dpy, disp))
goto cleanup_configs;
@@ -1181,8 +1184,6 @@ dri2_initialize_x11_dri2(_EGLDriver *drv, _EGLDisplay *disp)
disp->VersionMajor = 1;
disp->VersionMinor = 4;
dri2_setup_swap_interval(dri2_dpy);
return EGL_TRUE;
cleanup_configs:

View File

@@ -31,6 +31,9 @@
#define EGLCOMPILER_INCLUDED
#include "c99_compat.h" /* inline, __func__, etc. */
/**
* Get standard integer types
*/
@@ -62,30 +65,7 @@
#endif
/**
* Function inlining
*/
#ifndef inline
# ifdef __cplusplus
/* C++ supports inline keyword */
# elif defined(__GNUC__)
# define inline __inline__
# elif defined(_MSC_VER)
# define inline __inline
# elif defined(__ICL)
# define inline __inline
# elif defined(__INTEL_COMPILER)
/* Intel compiler supports inline keyword */
# elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
# define inline __inline
# elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
/* C99 supports inline keyword */
# elif (__STDC_VERSION__ >= 199901L)
/* C99 supports inline keyword */
# else
# define inline
# endif
#endif
/* XXX: Use standard `inline` keyword instead */
#ifndef INLINE
# define INLINE inline
#endif
@@ -104,21 +84,9 @@
# endif
#endif
/**
* The __FUNCTION__ gcc variable is generally only used for debugging.
* If we're not using gcc, define __FUNCTION__ as a cpp symbol here.
* Don't define it if using a newer Windows compiler.
*/
/* XXX: Use standard `__func__` instead */
#ifndef __FUNCTION__
# if (!defined __GNUC__) && (!defined __xlC__) && \
(!defined(_MSC_VER) || _MSC_VER < 1300)
# if (__STDC_VERSION__ >= 199901L) /* C99 */ || \
(defined(__SUNPRO_C) && defined(__C99FEATURES__))
# define __FUNCTION__ __func__
# else
# define __FUNCTION__ "<unknown>"
# endif
# endif
# define __FUNCTION__ __func__
#endif
#endif /* EGLCOMPILER_INCLUDED */

View File

@@ -7,7 +7,10 @@ noinst_LTLIBRARIES = libgallium.la
AM_CFLAGS = \
-I$(top_srcdir)/src/gallium/auxiliary/util \
$(GALLIUM_CFLAGS)
$(GALLIUM_CFLAGS) \
$(VISIBILITY_CFLAGS)
AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
libgallium_la_SOURCES = \
$(C_SOURCES) \
@@ -18,7 +21,7 @@ if HAVE_MESA_LLVM
AM_CFLAGS += \
$(LLVM_CFLAGS)
AM_CXXFLAGS = \
AM_CXXFLAGS += \
$(GALLIUM_CFLAGS) \
$(LLVM_CXXFLAGS)

View File

@@ -167,12 +167,17 @@ static void interp( const struct clip_stage *clip,
{
int k;
t_nopersp = t;
for (k = 0; k < 2; k++)
/* find either in.x != out.x or in.y != out.y */
for (k = 0; k < 2; k++) {
if (in->clip[k] != out->clip[k]) {
t_nopersp = (dst->clip[k] - out->clip[k]) /
(in->clip[k] - out->clip[k]);
/* do divide by W, then compute linear interpolation factor */
float in_coord = in->clip[k] / in->clip[3];
float out_coord = out->clip[k] / out->clip[3];
float dst_coord = dst->clip[k] / dst->clip[3];
t_nopersp = (dst_coord - out_coord) / (in_coord - out_coord);
break;
}
}
}
/* Other attributes

View File

@@ -127,10 +127,44 @@ static void offset_first_tri( struct draw_stage *stage,
struct prim_header *header )
{
struct offset_stage *offset = offset_stage(stage);
const struct pipe_rasterizer_state *rast = stage->draw->rasterizer;
unsigned fill_mode = rast->fill_front;
boolean do_offset;
if (rast->fill_back != rast->fill_front) {
/* Need to check for back-facing triangle */
boolean ccw = header->det < 0.0f;
if (ccw != rast->front_ccw)
fill_mode = rast->fill_back;
}
/* Now determine if we need to do offsetting for the point/line/fill mode */
switch (fill_mode) {
case PIPE_POLYGON_MODE_FILL:
do_offset = rast->offset_tri;
break;
case PIPE_POLYGON_MODE_LINE:
do_offset = rast->offset_line;
break;
case PIPE_POLYGON_MODE_POINT:
do_offset = rast->offset_point;
break;
default:
assert(!"invalid fill_mode in offset_first_tri()");
do_offset = rast->offset_tri;
}
if (do_offset) {
offset->scale = rast->offset_scale;
offset->clamp = rast->offset_clamp;
offset->units = (float) (rast->offset_units * stage->draw->mrd);
}
else {
offset->scale = 0.0f;
offset->clamp = 0.0f;
offset->units = 0.0f;
}
offset->units = (float) (stage->draw->rasterizer->offset_units * stage->draw->mrd);
offset->scale = stage->draw->rasterizer->offset_scale;
offset->clamp = stage->draw->rasterizer->offset_clamp;
stage->tri = offset_tri;
stage->tri( stage, header );

View File

@@ -867,7 +867,7 @@ lp_build_get_level_stride_vec(struct lp_build_sample_context *bld,
stride = bld->int_coord_bld.undef;
for (i = 0; i < bld->num_lods; i++) {
LLVMValueRef indexi = lp_build_const_int32(bld->gallivm, i);
LLVMValueRef indexo = lp_build_const_int32(bld->gallivm, i);
LLVMValueRef indexo = lp_build_const_int32(bld->gallivm, 4 * i);
indexes[1] = LLVMBuildExtractElement(builder, level, indexi, "");
stride1 = LLVMBuildGEP(builder, stride_array, indexes, 2, "");
stride1 = LLVMBuildLoad(builder, stride1, "");

View File

@@ -240,6 +240,7 @@ struct lp_exec_mask {
struct lp_build_context *bld;
boolean has_mask;
boolean ret_in_main;
LLVMTypeRef int_vec_type;

View File

@@ -73,6 +73,7 @@ static void lp_exec_mask_init(struct lp_exec_mask *mask, struct lp_build_context
mask->bld = bld;
mask->has_mask = FALSE;
mask->ret_in_main = FALSE;
mask->cond_stack_size = 0;
mask->loop_stack_size = 0;
mask->call_stack_size = 0;
@@ -108,7 +109,7 @@ static void lp_exec_mask_update(struct lp_exec_mask *mask)
} else
mask->exec_mask = mask->cond_mask;
if (mask->call_stack_size) {
if (mask->call_stack_size || mask->ret_in_main) {
mask->exec_mask = LLVMBuildAnd(builder,
mask->exec_mask,
mask->ret_mask,
@@ -117,7 +118,8 @@ static void lp_exec_mask_update(struct lp_exec_mask *mask)
mask->has_mask = (mask->cond_stack_size > 0 ||
mask->loop_stack_size > 0 ||
mask->call_stack_size > 0);
mask->call_stack_size > 0 ||
mask->ret_in_main);
}
static void lp_exec_mask_cond_push(struct lp_exec_mask *mask,
@@ -348,11 +350,23 @@ static void lp_exec_mask_ret(struct lp_exec_mask *mask, int *pc)
LLVMBuilderRef builder = mask->bld->gallivm->builder;
LLVMValueRef exec_mask;
if (mask->call_stack_size == 0) {
if (mask->cond_stack_size == 0 &&
mask->loop_stack_size == 0 &&
mask->call_stack_size == 0) {
/* returning from main() */
*pc = -1;
return;
}
if (mask->call_stack_size == 0) {
/*
* This requires special handling since we need to ensure
* we don't drop the mask even if we have no call stack
* (e.g. after a ret in a if clause after the endif)
*/
mask->ret_in_main = TRUE;
}
exec_mask = LLVMBuildNot(builder,
mask->exec_mask,
"ret");

View File

@@ -1569,7 +1569,7 @@ tgsi_text_translate(
struct tgsi_token *tokens,
uint num_tokens )
{
struct translate_ctx ctx;
struct translate_ctx ctx = {0};
ctx.text = text;
ctx.cur = text;

View File

@@ -100,7 +100,7 @@ struct blitter_context_priv
void *velem_state;
void *velem_uint_state;
void *velem_sint_state;
void *velem_state_readbuf;
void *velem_state_readbuf[4]; /**< X, XY, XYZ, XYZW */
/* Sampler state. */
void *sampler_state, *sampler_state_linear;
@@ -277,9 +277,19 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe)
}
if (ctx->has_stream_out) {
velem[0].src_format = PIPE_FORMAT_R32_UINT;
velem[0].vertex_buffer_index = ctx->base.vb_slot;
ctx->velem_state_readbuf = pipe->create_vertex_elements_state(pipe, 1, &velem[0]);
static enum pipe_format formats[4] = {
PIPE_FORMAT_R32_UINT,
PIPE_FORMAT_R32G32_UINT,
PIPE_FORMAT_R32G32B32_UINT,
PIPE_FORMAT_R32G32B32A32_UINT
};
for (i = 0; i < 4; i++) {
velem[0].src_format = formats[i];
velem[0].vertex_buffer_index = ctx->base.vb_slot;
ctx->velem_state_readbuf[i] =
pipe->create_vertex_elements_state(pipe, 1, &velem[0]);
}
}
/* fragment shaders are created on-demand */
@@ -344,8 +354,11 @@ void util_blitter_destroy(struct blitter_context *blitter)
pipe->delete_vertex_elements_state(pipe, ctx->velem_sint_state);
pipe->delete_vertex_elements_state(pipe, ctx->velem_uint_state);
}
if (ctx->velem_state_readbuf)
pipe->delete_vertex_elements_state(pipe, ctx->velem_state_readbuf);
for (i = 0; i < 4; i++) {
if (ctx->velem_state_readbuf[i]) {
pipe->delete_vertex_elements_state(pipe, ctx->velem_state_readbuf[i]);
}
}
for (i = 0; i < PIPE_MAX_TEXTURE_TYPES; i++) {
if (ctx->fs_texfetch_col[i])
@@ -1120,18 +1133,17 @@ static boolean is_blit_generic_supported(struct blitter_context *blitter,
if (dst) {
unsigned bind;
boolean is_stencil;
const struct util_format_description *desc =
util_format_description(dst_format);
is_stencil = util_format_has_stencil(desc);
boolean dst_has_stencil = util_format_has_stencil(desc);
/* Stencil export must be supported for stencil copy. */
if ((mask & PIPE_MASK_S) && is_stencil && !ctx->has_stencil_export) {
if ((mask & PIPE_MASK_S) && dst_has_stencil &&
!ctx->has_stencil_export) {
return FALSE;
}
if (is_stencil || util_format_has_depth(desc))
if (dst_has_stencil || util_format_has_depth(desc))
bind = PIPE_BIND_DEPTH_STENCIL;
else
bind = PIPE_BIND_RENDER_TARGET;
@@ -1153,15 +1165,18 @@ static boolean is_blit_generic_supported(struct blitter_context *blitter,
}
/* Check stencil sampler support for stencil copy. */
if (util_format_has_stencil(util_format_description(src_format))) {
enum pipe_format stencil_format =
if (mask & PIPE_MASK_S) {
if (util_format_has_stencil(util_format_description(src_format))) {
enum pipe_format stencil_format =
util_format_stencil_only(src_format);
assert(stencil_format != PIPE_FORMAT_NONE);
assert(stencil_format != PIPE_FORMAT_NONE);
if (stencil_format != src_format &&
!screen->is_format_supported(screen, stencil_format, src->target,
src->nr_samples, PIPE_BIND_SAMPLER_VIEW)) {
return FALSE;
if (stencil_format != src_format &&
!screen->is_format_supported(screen, stencil_format,
src->target, src->nr_samples,
PIPE_BIND_SAMPLER_VIEW)) {
return FALSE;
}
}
}
}
@@ -1714,7 +1729,7 @@ void util_blitter_copy_buffer(struct blitter_context *blitter,
vb.stride = 4;
pipe->set_vertex_buffers(pipe, ctx->base.vb_slot, 1, &vb);
pipe->bind_vertex_elements_state(pipe, ctx->velem_state_readbuf);
pipe->bind_vertex_elements_state(pipe, ctx->velem_state_readbuf[0]);
pipe->bind_vs_state(pipe, ctx->vs_pos_only);
if (ctx->has_geometry_shader)
pipe->bind_gs_state(pipe, NULL);
@@ -1731,6 +1746,66 @@ void util_blitter_copy_buffer(struct blitter_context *blitter,
pipe_so_target_reference(&so_target, NULL);
}
void util_blitter_clear_buffer(struct blitter_context *blitter,
struct pipe_resource *dst,
unsigned offset, unsigned size,
unsigned num_channels,
const union pipe_color_union *clear_value)
{
struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
struct pipe_context *pipe = ctx->base.pipe;
struct pipe_vertex_buffer vb = {0};
struct pipe_stream_output_target *so_target;
assert(num_channels >= 1);
assert(num_channels <= 4);
/* IMPORTANT: DON'T DO ANY BOUNDS CHECKING HERE!
*
* R600 uses this to initialize texture resources, so width0 might not be
* what you think it is.
*/
/* Streamout is required. */
if (!ctx->has_stream_out) {
assert(!"Streamout unsupported in util_blitter_clear_buffer()");
return;
}
/* Some alignment is required. */
if (offset % 4 != 0 || size % 4 != 0) {
assert(!"Bad alignment in util_blitter_clear_buffer()");
return;
}
u_upload_data(ctx->upload, 0, num_channels*4, clear_value,
&vb.buffer_offset, &vb.buffer);
vb.stride = 0;
blitter_set_running_flag(ctx);
blitter_check_saved_vertex_states(ctx);
blitter_disable_render_cond(ctx);
pipe->set_vertex_buffers(pipe, ctx->base.vb_slot, 1, &vb);
pipe->bind_vertex_elements_state(pipe,
ctx->velem_state_readbuf[num_channels-1]);
pipe->bind_vs_state(pipe, ctx->vs_pos_only);
if (ctx->has_geometry_shader)
pipe->bind_gs_state(pipe, NULL);
pipe->bind_rasterizer_state(pipe, ctx->rs_discard_state);
so_target = pipe->create_stream_output_target(pipe, dst, offset, size);
pipe->set_stream_output_targets(pipe, 1, &so_target, 0);
util_draw_arrays(pipe, PIPE_PRIM_POINTS, 0, size / 4);
blitter_restore_vertex_states(ctx);
blitter_restore_render_cond(ctx);
blitter_unset_running_flag(ctx);
pipe_so_target_reference(&so_target, NULL);
pipe_resource_reference(&vb.buffer, NULL);
}
/* probably radeon specific */
void util_blitter_custom_resolve_color(struct blitter_context *blitter,
struct pipe_resource *dst,

View File

@@ -276,7 +276,7 @@ void util_blitter_default_src_texture(struct pipe_sampler_view *src_templ,
/**
* Copy data from one buffer to another using the Stream Output functionality.
* Some alignment is required, otherwise software fallback is used.
* 4-byte alignment is required, otherwise software fallback is used.
*/
void util_blitter_copy_buffer(struct blitter_context *blitter,
struct pipe_resource *dst,
@@ -285,6 +285,22 @@ void util_blitter_copy_buffer(struct blitter_context *blitter,
unsigned srcx,
unsigned size);
/**
* Clear the contents of a buffer using the Stream Output functionality.
* 4-byte alignment is required.
*
* "num_channels" can be 1, 2, 3, or 4, and specifies if the clear value is
* R, RG, RGB, or RGBA.
*
* For each element, only "num_channels" components of "clear_value" are
* copied to the buffer, then the offset is incremented by num_channels*4.
*/
void util_blitter_clear_buffer(struct blitter_context *blitter,
struct pipe_resource *dst,
unsigned offset, unsigned size,
unsigned num_channels,
const union pipe_color_union *clear_value);
/**
* Clear a region of a (color) surface to a constant value.
*

View File

@@ -0,0 +1,89 @@
/*
* Copyright 2013 Marek Olšák <maraeo@gmail.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
/**
* @file
* 1D integer range, capable of the union and intersection operations.
*
* It only maintains a single interval which is extended when the union is
* done. This implementation is partially thread-safe (readers are not
* protected by a lock).
*
* @author Marek Olšák
*/
#ifndef U_RANGE_H
#define U_RANGE_H
#include "os/os_thread.h"
struct util_range {
unsigned start; /* inclusive */
unsigned end; /* exclusive */
/* for the range to be consistent with multiple contexts: */
pipe_mutex write_mutex;
};
static INLINE void
util_range_set_empty(struct util_range *range)
{
range->start = ~0;
range->end = 0;
}
/* This is like a union of two sets. */
static INLINE void
util_range_add(struct util_range *range, unsigned start, unsigned end)
{
if (start < range->start || end > range->end) {
pipe_mutex_lock(range->write_mutex);
range->start = MIN2(start, range->start);
range->end = MAX2(end, range->end);
pipe_mutex_unlock(range->write_mutex);
}
}
static INLINE boolean
util_ranges_intersect(struct util_range *range, unsigned start, unsigned end)
{
return MAX2(start, range->start) < MIN2(end, range->end);
}
/* Init/deinit */
static INLINE void
util_range_init(struct util_range *range)
{
pipe_mutex_init(range->write_mutex);
util_range_set_empty(range);
}
static INLINE void
util_range_destroy(struct util_range *range)
{
pipe_mutex_destroy(range->write_mutex);
}
#endif

View File

@@ -421,10 +421,10 @@ util_clear_depth_stencil(struct pipe_context *pipe,
else {
uint32_t dst_mask;
if (format == PIPE_FORMAT_Z24_UNORM_S8_UINT)
dst_mask = 0xffffff00;
dst_mask = 0x00ffffff;
else {
assert(format == PIPE_FORMAT_S8_UINT_Z24_UNORM);
dst_mask = 0xffffff;
dst_mask = 0xffffff00;
}
if (clear_flags & PIPE_CLEAR_DEPTH)
dst_mask = ~dst_mask;

View File

@@ -1,6 +1,7 @@
AUTOMAKE_OPTIONS = subdir-objects
AM_CPPFLAGS = \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/drivers \

View File

@@ -85,23 +85,30 @@ check_PROGRAMS = \
lp_test_printf
TESTS = $(check_PROGRAMS)
TEST_LIBS = \
libllvmpipe.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
$(LLVM_LIBS) \
$(DLOPEN_LIBS) \
$(PTHREAD_LIBS)
lp_test_format_SOURCES = lp_test_format.c lp_test_main.c
lp_test_format_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_format_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_format_SOURCES = dummy.cpp
lp_test_arit_SOURCES = lp_test_arit.c lp_test_main.c
lp_test_arit_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_arit_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_arit_SOURCES = dummy.cpp
lp_test_blend_SOURCES = lp_test_blend.c lp_test_main.c
lp_test_blend_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_blend_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_blend_SOURCES = dummy.cpp
lp_test_conv_SOURCES = lp_test_conv.c lp_test_main.c
lp_test_conv_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_conv_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_conv_SOURCES = dummy.cpp
lp_test_printf_SOURCES = lp_test_printf.c lp_test_main.c
lp_test_printf_LDADD = libllvmpipe.la ../../auxiliary/libgallium.la $(LLVM_LIBS)
lp_test_printf_LDADD = $(TEST_LIBS)
nodist_EXTRA_lp_test_printf_SOURCES = dummy.cpp

View File

@@ -64,6 +64,28 @@ lp_scene_create( struct pipe_context *pipe )
pipe_mutex_init(scene->mutex);
#ifdef DEBUG
/* Do some scene limit sanity checks here */
{
size_t maxBins = TILES_X * TILES_Y;
size_t maxCommandBytes = sizeof(struct cmd_block) * maxBins;
size_t maxCommandPlusData = maxCommandBytes + DATA_BLOCK_SIZE;
/* We'll need at least one command block per bin. Make sure that's
* less than the max allowed scene size.
*/
assert(maxCommandBytes < LP_SCENE_MAX_SIZE);
/* We'll also need space for at least one other data block */
assert(maxCommandPlusData <= LP_SCENE_MAX_SIZE);
/* Ideally, the size of a cmd_block object will be a power of two
* in order to avoid wasting space when we allocation them from
* data blocks (which are power of two also).
*/
assert(sizeof(struct cmd_block) ==
util_next_power_of_two(sizeof(struct cmd_block)));
}
#endif
return scene;
}

View File

@@ -49,12 +49,18 @@ struct lp_rast_state;
#define TILES_Y (LP_MAX_HEIGHT / TILE_SIZE)
#define CMD_BLOCK_MAX 128
/* Commands per command block (ideally so sizeof(cmd_block) is a power of
* two in size.)
*/
#define CMD_BLOCK_MAX 29
/* Bytes per data block.
*/
#define DATA_BLOCK_SIZE (64 * 1024)
/* Scene temporary storage is clamped to this size:
*/
#define LP_SCENE_MAX_SIZE (4*1024*1024)
#define LP_SCENE_MAX_SIZE (9*1024*1024)
/* The maximum amount of texture storage referenced by a scene is
* clamped ot this size:

View File

@@ -46,6 +46,10 @@ clear_flags(struct pipe_rasterizer_state *rast)
{
rast->light_twoside = 0;
rast->offset_tri = 0;
rast->offset_line = 0;
rast->offset_point = 0;
rast->offset_units = 0.0f;
rast->offset_scale = 0.0f;
}
@@ -74,6 +78,8 @@ llvmpipe_create_rasterizer_state(struct pipe_context *pipe,
*/
need_pipeline = (rast->fill_front != PIPE_POLYGON_MODE_FILL ||
rast->fill_back != PIPE_POLYGON_MODE_FILL ||
rast->offset_point ||
rast->offset_line ||
rast->point_smooth ||
rast->line_smooth ||
rast->line_stipple_enable ||

View File

@@ -295,7 +295,9 @@ llvmpipe_resource_create(struct pipe_screen *_screen,
/* assert(lpr->base.bind); */
if (resource_is_texture(&lpr->base)) {
if (lpr->base.bind & PIPE_BIND_DISPLAY_TARGET) {
if (lpr->base.bind & (PIPE_BIND_DISPLAY_TARGET |
PIPE_BIND_SCANOUT |
PIPE_BIND_SHARED)) {
/* displayable surface */
if (!llvmpipe_displaytarget_layout(screen, lpr))
goto fail;

View File

@@ -180,4 +180,44 @@ nv50_blit_eng2d_get_mask(const struct pipe_blit_info *info)
return mask;
}
#if NOUVEAU_DRIVER == 0xc0
# define nv50_format_table nvc0_format_table
#endif
/* return TRUE for formats that can be converted among each other by NVC0_2D */
static INLINE boolean
nv50_2d_dst_format_faithful(enum pipe_format format)
{
const uint64_t mask =
NV50_ENG2D_SUPPORTED_FORMATS &
~NV50_ENG2D_NOCONVERT_FORMATS;
uint8_t id = nv50_format_table[format].rt;
return (id >= 0xc0) && (mask & (1ULL << (id - 0xc0)));
}
static INLINE boolean
nv50_2d_src_format_faithful(enum pipe_format format)
{
const uint64_t mask =
NV50_ENG2D_SUPPORTED_FORMATS &
~(NV50_ENG2D_LUMINANCE_FORMATS | NV50_ENG2D_INTENSITY_FORMATS);
uint8_t id = nv50_format_table[format].rt;
return (id >= 0xc0) && (mask & (1ULL << (id - 0xc0)));
}
static INLINE boolean
nv50_2d_format_supported(enum pipe_format format)
{
uint8_t id = nv50_format_table[format].rt;
return (id >= 0xc0) &&
(NV50_ENG2D_SUPPORTED_FORMATS & (1ULL << (id - 0xc0)));
}
static INLINE boolean
nv50_2d_dst_format_ops_supported(enum pipe_format format)
{
uint8_t id = nv50_format_table[format].rt;
return (id >= 0xc0) &&
(NV50_ENG2D_OPERATION_FORMATS & (1ULL << (id - 0xc0)));
}
#endif /* __NV50_BLIT_H__ */

View File

@@ -9,6 +9,7 @@ nv50_validate_fb(struct nv50_context *nv50)
struct pipe_framebuffer_state *fb = &nv50->framebuffer;
unsigned i;
unsigned ms_mode = NV50_3D_MULTISAMPLE_MODE_MS1;
uint32_t array_size = 0xffff, array_mode = 0;
nouveau_bufctx_reset(nv50->bufctx_3d, NV50_BIND_FB);
@@ -23,6 +24,13 @@ nv50_validate_fb(struct nv50_context *nv50)
struct nv50_surface *sf = nv50_surface(fb->cbufs[i]);
struct nouveau_bo *bo = mt->base.bo;
array_size = MIN2(array_size, sf->depth);
if (mt->layout_3d)
array_mode = NV50_3D_RT_ARRAY_MODE_MODE_3D; /* 1 << 16 */
/* can't mix 3D with ARRAY or have RTs of different depth/array_size */
assert(mt->layout_3d || !array_mode || array_size == 1);
BEGIN_NV04(push, NV50_3D(RT_ADDRESS_HIGH(i)), 5);
PUSH_DATAh(push, bo->offset + sf->offset);
PUSH_DATA (push, bo->offset + sf->offset);
@@ -34,7 +42,7 @@ nv50_validate_fb(struct nv50_context *nv50)
PUSH_DATA (push, sf->width);
PUSH_DATA (push, sf->height);
BEGIN_NV04(push, NV50_3D(RT_ARRAY_MODE), 1);
PUSH_DATA (push, sf->depth);
PUSH_DATA (push, array_mode | array_size);
} else {
PUSH_DATA (push, 0);
PUSH_DATA (push, 0);
@@ -63,7 +71,7 @@ nv50_validate_fb(struct nv50_context *nv50)
struct nv50_miptree *mt = nv50_miptree(fb->zsbuf->texture);
struct nv50_surface *sf = nv50_surface(fb->zsbuf);
struct nouveau_bo *bo = mt->base.bo;
int unk = mt->base.base.target == PIPE_TEXTURE_2D;
int unk = mt->base.base.target == PIPE_TEXTURE_3D || sf->depth == 1;
BEGIN_NV04(push, NV50_3D(ZETA_ADDRESS_HIGH), 5);
PUSH_DATAh(push, bo->offset + sf->offset);

View File

@@ -35,25 +35,22 @@
#include "nv50_context.h"
#include "nv50_resource.h"
#include "nv50_blit.h"
#include "nv50_defs.xml.h"
#include "nv50_texture.xml.h"
/* these are used in nv50_blit.h */
#define NV50_ENG2D_SUPPORTED_FORMATS 0xff0843e080608409ULL
#define NV50_ENG2D_NOCONVERT_FORMATS 0x0008402000000000ULL
#define NV50_ENG2D_LUMINANCE_FORMATS 0x0008402000000000ULL
#define NV50_ENG2D_INTENSITY_FORMATS 0x0000000000000000ULL
#define NV50_ENG2D_OPERATION_FORMATS 0x060001c000608000ULL
/* return TRUE for formats that can be converted among each other by NV50_2D */
static INLINE boolean
nv50_2d_format_faithful(enum pipe_format format)
{
uint8_t id = nv50_format_table[format].rt;
return (id >= 0xc0) &&
(NV50_ENG2D_SUPPORTED_FORMATS & (1ULL << (id - 0xc0)));
}
#define NOUVEAU_DRIVER 0x50
#include "nv50_blit.h"
static INLINE uint8_t
nv50_2d_format(enum pipe_format format)
nv50_2d_format(enum pipe_format format, boolean dst, boolean dst_src_equal)
{
uint8_t id = nv50_format_table[format].rt;
@@ -62,6 +59,7 @@ nv50_2d_format(enum pipe_format format)
*/
if ((id >= 0xc0) && (NV50_ENG2D_SUPPORTED_FORMATS & (1ULL << (id - 0xc0))))
return id;
assert(dst_src_equal);
switch (util_format_get_blocksize(format)) {
case 1:
@@ -78,7 +76,7 @@ nv50_2d_format(enum pipe_format format)
static int
nv50_2d_texture_set(struct nouveau_pushbuf *push, int dst,
struct nv50_miptree *mt, unsigned level, unsigned layer,
enum pipe_format pformat)
enum pipe_format pformat, boolean dst_src_pformat_equal)
{
struct nouveau_bo *bo = mt->base.bo;
uint32_t width, height, depth;
@@ -86,7 +84,7 @@ nv50_2d_texture_set(struct nouveau_pushbuf *push, int dst,
uint32_t mthd = dst ? NV50_2D_DST_FORMAT : NV50_2D_SRC_FORMAT;
uint32_t offset = mt->level[level].offset;
format = nv50_2d_format(pformat);
format = nv50_2d_format(pformat, dst, dst_src_pformat_equal);
if (!format) {
NOUVEAU_ERR("invalid/unsupported surface format: %s\n",
util_format_name(pformat));
@@ -155,15 +153,16 @@ nv50_2d_texture_do_copy(struct nouveau_pushbuf *push,
const enum pipe_format dfmt = dst->base.base.format;
const enum pipe_format sfmt = src->base.base.format;
int ret;
boolean eqfmt = dfmt == sfmt;
if (!PUSH_SPACE(push, 2 * 16 + 32))
return PIPE_ERROR;
ret = nv50_2d_texture_set(push, 1, dst, dst_level, dz, dfmt);
ret = nv50_2d_texture_set(push, 1, dst, dst_level, dz, dfmt, eqfmt);
if (ret)
return ret;
ret = nv50_2d_texture_set(push, 0, src, src_level, sz, sfmt);
ret = nv50_2d_texture_set(push, 0, src, src_level, sz, sfmt, eqfmt);
if (ret)
return ret;
@@ -243,8 +242,8 @@ nv50_resource_copy_region(struct pipe_context *pipe,
}
assert((src->format == dst->format) ||
(nv50_2d_format_faithful(src->format) &&
nv50_2d_format_faithful(dst->format)));
(nv50_2d_src_format_faithful(src->format) &&
nv50_2d_dst_format_faithful(dst->format)));
BCTX_REFN(nv50->bufctx, 2D, nv04_resource(src), RD);
BCTX_REFN(nv50->bufctx, 2D, nv04_resource(dst), WR);
@@ -936,7 +935,7 @@ nv50_blit_3d(struct nv50_context *nv50, const struct pipe_blit_info *info)
nv50_blit_select_fp(blit, info);
nv50_blitctx_pre_blit(blit);
nv50_blit_set_dst(blit, dst, info->dst.level, 0, info->dst.format);
nv50_blit_set_dst(blit, dst, info->dst.level, -1, info->dst.format);
nv50_blit_set_src(blit, src, info->src.level, -1, info->src.format,
blit->filter);
@@ -977,6 +976,8 @@ nv50_blit_3d(struct nv50_context *nv50, const struct pipe_blit_info *info)
BEGIN_NV04(push, NV50_3D(VIEWPORT_TRANSFORM_EN), 1);
PUSH_DATA (push, 0);
BEGIN_NV04(push, NV50_3D(VIEW_VOLUME_CLIP_CTRL), 1);
PUSH_DATA (push, 0x1);
/* Draw a large triangle in screen coordinates covering the whole
* render target, with scissors defining the destination region.
@@ -1059,7 +1060,8 @@ nv50_blit_eng2d(struct nv50_context *nv50, const struct pipe_blit_info *info)
int64_t du_dx, dv_dy;
int i;
uint32_t mode;
const uint32_t mask = nv50_blit_eng2d_get_mask(info);
uint32_t mask = nv50_blit_eng2d_get_mask(info);
boolean b;
mode = nv50_blit_get_filter(info) ?
NV50_2D_BLIT_CONTROL_FILTER_BILINEAR :
@@ -1070,8 +1072,9 @@ nv50_blit_eng2d(struct nv50_context *nv50, const struct pipe_blit_info *info)
du_dx = ((int64_t)info->src.box.width << 32) / info->dst.box.width;
dv_dy = ((int64_t)info->src.box.height << 32) / info->dst.box.height;
nv50_2d_texture_set(push, 1, dst, info->dst.level, dz, info->dst.format);
nv50_2d_texture_set(push, 0, src, info->src.level, sz, info->src.format);
b = info->dst.format == info->src.format;
nv50_2d_texture_set(push, 1, dst, info->dst.level, dz, info->dst.format, b);
nv50_2d_texture_set(push, 0, src, info->src.level, sz, info->src.format, b);
if (info->scissor_enable) {
BEGIN_NV04(push, NV50_2D(CLIP_X), 5);
@@ -1094,6 +1097,17 @@ nv50_blit_eng2d(struct nv50_context *nv50, const struct pipe_blit_info *info)
PUSH_DATA (push, 0xffffffff);
BEGIN_NV04(push, NV50_2D(OPERATION), 1);
PUSH_DATA (push, NV50_2D_OPERATION_ROP);
} else
if (info->src.format != info->dst.format) {
if (info->src.format == PIPE_FORMAT_R8_UNORM ||
info->src.format == PIPE_FORMAT_R16_UNORM ||
info->src.format == PIPE_FORMAT_R16_FLOAT ||
info->src.format == PIPE_FORMAT_R32_FLOAT) {
mask = 0xffff0000; /* also makes condition for OPERATION reset true */
BEGIN_NV04(push, NV50_2D(BETA4), 2);
PUSH_DATA (push, mask);
PUSH_DATA (push, NV50_2D_OPERATION_SRCCOPY_PREMULT);
}
}
if (src->ms_x > dst->ms_x || src->ms_y > dst->ms_y) {
@@ -1224,10 +1238,25 @@ nv50_blit(struct pipe_context *pipe, const struct pipe_blit_info *info)
debug_printf("blit: cannot filter array or cube textures in z direction");
}
if (!eng3d && info->dst.format != info->src.format)
if (!nv50_2d_format_faithful(info->dst.format) ||
!nv50_2d_format_faithful(info->src.format))
if (!eng3d && info->dst.format != info->src.format) {
if (!nv50_2d_dst_format_faithful(info->dst.format) ||
!nv50_2d_src_format_faithful(info->src.format)) {
eng3d = TRUE;
} else
if (!nv50_2d_src_format_faithful(info->src.format)) {
if (!util_format_is_luminance(info->src.format)) {
if (util_format_is_intensity(info->src.format))
eng3d = TRUE;
else
if (!nv50_2d_dst_format_ops_supported(info->dst.format))
eng3d = TRUE;
else
eng3d = !nv50_2d_format_supported(info->src.format);
}
} else
if (util_format_is_luminance_alpha(info->src.format))
eng3d = TRUE;
}
if (info->src.resource->nr_samples == 8 &&
info->dst.resource->nr_samples <= 1)

View File

@@ -1041,7 +1041,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#define NVC0_3D_VIEWPORT_TRANSFORM_EN 0x0000192c
#define NVC0_3D_VIEW_VOLUME_CLIP_CTRL 0x0000193c
#define NVC0_3D_VIEW_VOLUME_CLIP_CTRL_UNK0 0x00000001
#define NVC0_3D_VIEW_VOLUME_CLIP_CTRL_DEPTH_RANGE_0_1 0x00000001
#define NVC0_3D_VIEW_VOLUME_CLIP_CTRL_UNK1__MASK 0x00000006
#define NVC0_3D_VIEW_VOLUME_CLIP_CTRL_UNK1__SHIFT 1
#define NVC0_3D_VIEW_VOLUME_CLIP_CTRL_UNK1_UNK0 0x00000000

View File

@@ -36,29 +36,32 @@
#include "nv50/nv50_defs.xml.h"
#include "nv50/nv50_texture.xml.h"
/* these are used in nv50_blit.h */
#define NV50_ENG2D_SUPPORTED_FORMATS 0xff9ccfe1cce3ccc9ULL
#define NV50_ENG2D_NOCONVERT_FORMATS 0x009cc02000000000ULL
#define NV50_ENG2D_LUMINANCE_FORMATS 0x001cc02000000000ULL
#define NV50_ENG2D_INTENSITY_FORMATS 0x0080000000000000ULL
#define NV50_ENG2D_OPERATION_FORMATS 0x060001c000638000ULL
#define NOUVEAU_DRIVER 0xc0
#include "nv50/nv50_blit.h"
#define NVC0_ENG2D_SUPPORTED_FORMATS 0xff9ccfe1cce3ccc9ULL
/* return TRUE for formats that can be converted among each other by NVC0_2D */
static INLINE boolean
nvc0_2d_format_faithful(enum pipe_format format)
{
uint8_t id = nvc0_format_table[format].rt;
return (id >= 0xc0) && (NVC0_ENG2D_SUPPORTED_FORMATS & (1ULL << (id - 0xc0)));
}
static INLINE uint8_t
nvc0_2d_format(enum pipe_format format)
nvc0_2d_format(enum pipe_format format, boolean dst, boolean dst_src_equal)
{
uint8_t id = nvc0_format_table[format].rt;
/* A8_UNORM is treated as I8_UNORM as far as the 2D engine is concerned. */
if (!dst && unlikely(format == PIPE_FORMAT_I8_UNORM) && !dst_src_equal)
return NV50_SURFACE_FORMAT_A8_UNORM;
/* Hardware values for color formats range from 0xc0 to 0xff,
* but the 2D engine doesn't support all of them.
*/
if (nvc0_2d_format_faithful(format))
if (nv50_2d_format_supported(format))
return id;
assert(dst_src_equal);
switch (util_format_get_blocksize(format)) {
case 1:
@@ -72,6 +75,7 @@ nvc0_2d_format(enum pipe_format format)
case 16:
return NV50_SURFACE_FORMAT_RGBA32_FLOAT;
default:
assert(0);
return 0;
}
}
@@ -79,7 +83,7 @@ nvc0_2d_format(enum pipe_format format)
static int
nvc0_2d_texture_set(struct nouveau_pushbuf *push, boolean dst,
struct nv50_miptree *mt, unsigned level, unsigned layer,
enum pipe_format pformat)
enum pipe_format pformat, boolean dst_src_pformat_equal)
{
struct nouveau_bo *bo = mt->base.bo;
uint32_t width, height, depth;
@@ -87,7 +91,7 @@ nvc0_2d_texture_set(struct nouveau_pushbuf *push, boolean dst,
uint32_t mthd = dst ? NVC0_2D_DST_FORMAT : NVC0_2D_SRC_FORMAT;
uint32_t offset = mt->level[level].offset;
format = nvc0_2d_format(pformat);
format = nvc0_2d_format(pformat, dst, dst_src_pformat_equal);
if (!format) {
NOUVEAU_ERR("invalid/unsupported surface format: %s\n",
util_format_name(pformat));
@@ -157,15 +161,16 @@ nvc0_2d_texture_do_copy(struct nouveau_pushbuf *push,
const enum pipe_format dfmt = dst->base.base.format;
const enum pipe_format sfmt = src->base.base.format;
int ret;
boolean eqfmt = dfmt == sfmt;
if (!PUSH_SPACE(push, 2 * 16 + 32))
return PIPE_ERROR;
ret = nvc0_2d_texture_set(push, TRUE, dst, dst_level, dz, dfmt);
ret = nvc0_2d_texture_set(push, TRUE, dst, dst_level, dz, dfmt, eqfmt);
if (ret)
return ret;
ret = nvc0_2d_texture_set(push, FALSE, src, src_level, sz, sfmt);
ret = nvc0_2d_texture_set(push, FALSE, src, src_level, sz, sfmt, eqfmt);
if (ret)
return ret;
@@ -243,8 +248,8 @@ nvc0_resource_copy_region(struct pipe_context *pipe,
return;
}
assert(nvc0_2d_format_faithful(src->format));
assert(nvc0_2d_format_faithful(dst->format));
assert(nv50_2d_dst_format_faithful(dst->format));
assert(nv50_2d_src_format_faithful(src->format));
BCTX_REFN(nvc0->bufctx, 2D, nv04_resource(src), RD);
BCTX_REFN(nvc0->bufctx, 2D, nv04_resource(dst), WR);
@@ -490,19 +495,19 @@ nvc0_blitter_make_vp(struct nvc0_blitter *blit)
{
static const uint32_t code_nvc0[] =
{
0xfff01c66, 0x06000080, /* vfetch b128 { $r0 $r1 $r2 $r3 } a[0x80] */
0xfff11c26, 0x06000090, /* vfetch b96 { $r4 $r5 $r6 } a[0x90]*/
0x03f01c66, 0x0a7e0070, /* export b128 o[0x70] { $r0 $r1 $r2 $r3 } */
0x13f01c26, 0x0a7e0080, /* export b96 o[0x80] { $r4 $r5 $r6 } */
0xfff11c26, 0x06000080, /* vfetch b64 $r4:$r5 a[0x80] */
0xfff01c46, 0x06000090, /* vfetch b96 $r0:$r1:$r2 a[0x90] */
0x13f01c26, 0x0a7e0070, /* export b64 o[0x70] $r4:$r5 */
0x03f01c46, 0x0a7e0080, /* export b96 o[0x80] $r0:$r1:$r2 */
0x00001de7, 0x80000000, /* exit */
};
static const uint32_t code_nve4[] =
{
0x00000007, 0x20000000, /* sched */
0xfff01c66, 0x06000080, /* vfetch b128 { $r0 $r1 $r2 $r3 } a[0x80] */
0xfff11c46, 0x06000090, /* vfetch b96 { $r4 $r5 $r6 } a[0x90]*/
0x03f01c66, 0x0a7e0070, /* export b128 o[0x70] { $r0 $r1 $r2 $r3 } */
0x13f01c46, 0x0a7e0080, /* export b96 o[0x80] { $r4 $r5 $r6 } */
0xfff11c26, 0x06000080, /* vfetch b64 $r4:$r5 a[0x80] */
0xfff01c46, 0x06000090, /* vfetch b96 $r0:$r1:$r2 a[0x90] */
0x13f01c26, 0x0a7e0070, /* export b64 o[0x70] $r4:$r5 */
0x03f01c46, 0x0a7e0080, /* export b96 o[0x80] $r0:$r1:$r2 */
0x00001de7, 0x80000000, /* exit */
};
@@ -515,13 +520,13 @@ nvc0_blitter_make_vp(struct nvc0_blitter *blit)
blit->vp.code = (uint32_t *)code_nvc0; /* const_cast */
blit->vp.code_size = sizeof(code_nvc0);
}
blit->vp.max_gpr = 7;
blit->vp.max_gpr = 6;
blit->vp.vp.edgeflag = PIPE_MAX_ATTRIBS;
blit->vp.hdr[0] = 0x00020461; /* vertprog magic */
blit->vp.hdr[4] = 0x000ff000; /* no outputs read */
blit->vp.hdr[6] = 0x0000003f; /* a[0x80], a[0x90] */
blit->vp.hdr[13] = 0x0003f000; /* o[0x70], o[0x80] */
blit->vp.hdr[6] = 0x00000073; /* a[0x80].xy, a[0x90].xyz */
blit->vp.hdr[13] = 0x00073000; /* o[0x70].xy, o[0x80].xyz */
}
static void
@@ -820,7 +825,7 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct pipe_blit_info *info)
nvc0_blit_select_fp(blit, info);
nvc0_blitctx_pre_blit(blit);
nvc0_blit_set_dst(blit, dst, info->dst.level, 0, info->dst.format);
nvc0_blit_set_dst(blit, dst, info->dst.level, -1, info->dst.format);
nvc0_blit_set_src(blit, src, info->src.level, -1, info->src.format,
blit->filter);
@@ -859,6 +864,8 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct pipe_blit_info *info)
z += 0.5f * dz;
IMMED_NVC0(push, NVC0_3D(VIEWPORT_TRANSFORM_EN), 0);
IMMED_NVC0(push, NVC0_3D(VIEW_VOLUME_CLIP_CTRL), 0x2 |
NVC0_3D_VIEW_VOLUME_CLIP_CTRL_DEPTH_RANGE_0_1);
BEGIN_NVC0(push, NVC0_3D(VIEWPORT_HORIZ(0)), 2);
PUSH_DATA (push, nvc0->framebuffer.width << 16);
PUSH_DATA (push, nvc0->framebuffer.height << 16);
@@ -925,11 +932,14 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct pipe_blit_info *info)
if (info->dst.box.z + info->dst.box.depth - 1)
IMMED_NVC0(push, NVC0_3D(LAYER), 0);
/* re-enable normally constant state */
IMMED_NVC0(push, NVC0_3D(VIEWPORT_TRANSFORM_EN), 1);
nvc0_blitctx_post_blit(blit);
/* restore viewport */
BEGIN_NVC0(push, NVC0_3D(VIEWPORT_HORIZ(0)), 2);
PUSH_DATA (push, nvc0->framebuffer.width << 16);
PUSH_DATA (push, nvc0->framebuffer.height << 16);
IMMED_NVC0(push, NVC0_3D(VIEWPORT_TRANSFORM_EN), 1);
}
static void
@@ -948,7 +958,8 @@ nvc0_blit_eng2d(struct nvc0_context *nvc0, const struct pipe_blit_info *info)
int64_t du_dx, dv_dy;
int i;
uint32_t mode;
const uint32_t mask = nv50_blit_eng2d_get_mask(info);
uint32_t mask = nv50_blit_eng2d_get_mask(info);
boolean b;
mode = nv50_blit_get_filter(info) ?
NVC0_2D_BLIT_CONTROL_FILTER_BILINEAR :
@@ -959,8 +970,9 @@ nvc0_blit_eng2d(struct nvc0_context *nvc0, const struct pipe_blit_info *info)
du_dx = ((int64_t)info->src.box.width << 32) / info->dst.box.width;
dv_dy = ((int64_t)info->src.box.height << 32) / info->dst.box.height;
nvc0_2d_texture_set(push, 1, dst, info->dst.level, dz, info->dst.format);
nvc0_2d_texture_set(push, 0, src, info->src.level, sz, info->src.format);
b = info->dst.format == info->src.format;
nvc0_2d_texture_set(push, 1, dst, info->dst.level, dz, info->dst.format, b);
nvc0_2d_texture_set(push, 0, src, info->src.level, sz, info->src.format, b);
if (info->scissor_enable) {
BEGIN_NVC0(push, NVC0_2D(CLIP_X), 5);
@@ -981,6 +993,25 @@ nvc0_blit_eng2d(struct nvc0_context *nvc0, const struct pipe_blit_info *info)
PUSH_DATA (push, 0xffffffff);
PUSH_DATA (push, 0xffffffff);
IMMED_NVC0(push, NVC0_2D(OPERATION), NVC0_2D_OPERATION_ROP);
} else
if (info->src.format != info->dst.format) {
if (info->src.format == PIPE_FORMAT_R8_UNORM ||
info->src.format == PIPE_FORMAT_R8_SNORM ||
info->src.format == PIPE_FORMAT_R16_UNORM ||
info->src.format == PIPE_FORMAT_R16_SNORM ||
info->src.format == PIPE_FORMAT_R16_FLOAT ||
info->src.format == PIPE_FORMAT_R32_FLOAT) {
mask = 0xffff0000; /* also makes condition for OPERATION reset true */
BEGIN_NVC0(push, NVC0_2D(BETA4), 2);
PUSH_DATA (push, mask);
PUSH_DATA (push, NVC0_2D_OPERATION_SRCCOPY_PREMULT);
} else
if (info->src.format == PIPE_FORMAT_A8_UNORM) {
mask = 0xff000000;
BEGIN_NVC0(push, NVC0_2D(BETA4), 2);
PUSH_DATA (push, mask);
PUSH_DATA (push, NVC0_2D_OPERATION_SRCCOPY_PREMULT);
}
}
if (src->ms_x > dst->ms_x || src->ms_y > dst->ms_y) {
@@ -1106,10 +1137,24 @@ nvc0_blit(struct pipe_context *pipe, const struct pipe_blit_info *info)
debug_printf("blit: cannot filter array or cube textures in z direction");
}
if (!eng3d && info->dst.format != info->src.format)
if (!nvc0_2d_format_faithful(info->dst.format) ||
!nvc0_2d_format_faithful(info->src.format))
if (!eng3d && info->dst.format != info->src.format) {
if (!nv50_2d_dst_format_faithful(info->dst.format)) {
eng3d = TRUE;
} else
if (!nv50_2d_src_format_faithful(info->src.format)) {
if (!util_format_is_luminance(info->src.format)) {
if (util_format_is_intensity(info->src.format))
eng3d = info->src.format != PIPE_FORMAT_I8_UNORM;
else
if (!nv50_2d_dst_format_ops_supported(info->dst.format))
eng3d = TRUE;
else
eng3d = !nv50_2d_format_supported(info->src.format);
}
} else
if (util_format_is_luminance_alpha(info->src.format))
eng3d = TRUE;
}
if (info->src.resource->nr_samples == 8 &&
info->dst.resource->nr_samples <= 1)

View File

@@ -356,19 +356,19 @@ nvc0_create_decoder(struct pipe_context *context,
goto fw_fail;
}
r = read(fd, dec->fw_bo->map, 0x4000);
close(fd);
if (r < 0) {
fprintf(stderr, "reading firmware file %s failed: %m\n", path);
goto fw_fail;
}
if (r == 0x4000) {
close(fd);
fprintf(stderr, "firmware file %s too large!\n", path);
goto fw_fail;
}
if (r & 0xff) {
close(fd);
fprintf(stderr, "firmware file %s wrong size!\n", path);
goto fw_fail;
}

View File

@@ -22,6 +22,7 @@ r300_compiler_tests_CPPFLAGS = \
-I$(top_srcdir)/src/gallium/drivers/r300/compiler
r300_compiler_tests_SOURCES = \
$(testdir)/r300_compiler_tests.c \
$(testdir)/radeon_compiler_optimize_tests.c \
$(testdir)/radeon_compiler_util_tests.c \
$(testdir)/rc_test_helpers.c \
$(testdir)/unit_test.c

View File

@@ -854,7 +854,7 @@ static void rc_emulate_negative_addressing(struct radeon_compiler *compiler, voi
transform_negative_addressing(c, lastARL, inst, min_offset);
}
static struct rc_swizzle_caps r300_vertprog_swizzle_caps = {
struct rc_swizzle_caps r300_vertprog_swizzle_caps = {
.IsNative = &swizzle_is_native,
.Split = 0 /* should never be called */
};

View File

@@ -1,3 +1,30 @@
/*
* Copyright 2010 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include "radeon_program_constants.h"
#ifndef RADEON_PROGRAM_UTIL_H

View File

@@ -1,4 +1,29 @@
/*
* Copyright 2010 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#ifndef RADEON_EMULATE_LOOPS_H
#define RADEON_EMULATE_LOOPS_H

View File

@@ -1,3 +1,27 @@
/*
* Copyright 2012 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include "radeon_compiler.h"
#include "radeon_compiler_util.h"

View File

@@ -708,6 +708,7 @@ static int peephole_mul_omod(
struct rc_list * writer_list;
struct rc_variable * var;
struct peephole_mul_cb_data cb_data;
unsigned writemask_sum;
for (i = 0; i < 2; i++) {
unsigned int j;
@@ -815,10 +816,11 @@ static int peephole_mul_omod(
}
/* Rewrite the instructions */
writemask_sum = rc_variable_writemask_sum(writer_list->Item);
for (var = writer_list->Item; var; var = var->Friend) {
struct rc_variable * writer = writer_list->Item;
struct rc_variable * writer = var;
unsigned conversion_swizzle = rc_make_conversion_swizzle(
writer->Inst->U.I.DstReg.WriteMask,
writemask_sum,
inst_mul->U.I.DstReg.WriteMask);
writer->Inst->U.I.Omod = omod_op;
writer->Inst->U.I.DstReg.File = inst_mul->U.I.DstReg.File;

View File

@@ -1,3 +1,29 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include "radeon_compiler.h"
#include "radeon_compiler_util.h"

View File

@@ -1,3 +1,29 @@
/*
* Copyright 2010 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#ifndef RADEON_RENAME_REGS_H
#define RADEON_RENAME_REGS_H

View File

@@ -54,4 +54,6 @@ struct rc_swizzle_caps {
void (*Split)(struct rc_src_register reg, unsigned int mask, struct rc_swizzle_split * split);
};
extern struct rc_swizzle_caps r300_vertprog_swizzle_caps;
#endif /* RADEON_SWIZZLE_H */

View File

@@ -1,3 +1,27 @@
/*
* Copyright 2012 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include "radeon_compiler.h"
#include "radeon_compiler_util.h"

View File

@@ -1,6 +1,43 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include "r300_compiler_tests.h"
#include <stdlib.h>
int main(int argc, char ** argv)
{
radeon_compiler_util_run_tests();
unsigned pass = 1;
pass &= radeon_compiler_optimize_run_tests();
pass &= radeon_compiler_util_run_tests();
if (pass) {
return EXIT_SUCCESS;
} else {
return EXIT_FAILURE;
}
}

View File

@@ -1,2 +1,29 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
void radeon_compiler_util_run_tests(void);
unsigned radeon_compiler_optimize_run_tests(void);
unsigned radeon_compiler_util_run_tests(void);

View File

@@ -0,0 +1,75 @@
/*
* Copyright 2013 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include "radeon_compiler.h"
#include "radeon_dataflow.h"
#include "r300_compiler_tests.h"
#include "rc_test_helpers.h"
#include "unit_test.h"
static void test_runner_rc_optimize(struct test_result * result)
{
struct radeon_compiler c;
struct rc_instruction *inst;
struct rc_instruction *inst_list[3];
unsigned inst_count = 0;
float const0[4] = {2.0f, 0.0f, 0.0f, 0.0f};
unsigned pass = 1;
test_begin(result);
init_compiler(&c, RC_FRAGMENT_PROGRAM, 1, 0);
rc_constants_add_immediate_vec4(&c.Program.Constants, const0);
add_instruction(&c, "RCP temp[0].x, const[1].x___;");
add_instruction(&c, "RCP temp[0].y, const[1]._y__;");
add_instruction(&c, "MUL temp[1].xy, const[0].xx__, temp[0].xy__;");
add_instruction(&c, "MOV output[0].xy, temp[1].xy;" );
rc_optimize(&c, NULL);
for(inst = c.Program.Instructions.Next;
inst != &c.Program.Instructions;
inst = inst->Next, inst_count++) {
inst_list[inst_count] = inst;
}
if (inst_list[0]->U.I.Omod != RC_OMOD_MUL_2 ||
inst_list[1]->U.I.Omod != RC_OMOD_MUL_2 ||
inst_list[2]->U.I.Opcode != RC_OPCODE_MOV) {
pass = 0;
}
test_check(result, pass);
}
unsigned radeon_compiler_optimize_run_tests()
{
struct test tests[] = {
{"rc_optimize() => peephole_mul_omod()", test_runner_rc_optimize},
{NULL, NULL}
};
return run_tests(tests);
}

View File

@@ -1,3 +1,30 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
@@ -67,11 +94,11 @@ static void test_runner_rc_inst_can_use_presub(struct test_result * result)
"MAD temp[0].xyz, temp[2].xyz_, -temp[3].xxx_, input[5].xyz_;");
}
void radeon_compiler_util_run_tests()
unsigned radeon_compiler_util_run_tests()
{
struct test tests[] = {
{"rc_inst_can_use_presub()", test_runner_rc_inst_can_use_presub},
{NULL, NULL}
};
run_tests(tests);
return run_tests(tests);
}

View File

@@ -1,3 +1,32 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
* Copyright 2013 Advanced Micro Devices, Inc.
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include <errno.h>
#include <regex.h>
#include <stdlib.h>
@@ -5,9 +34,14 @@
#include <string.h>
#include <sys/types.h>
#include "../radeon_compiler_util.h"
#include "../radeon_opcodes.h"
#include "../radeon_program.h"
#include "r500_fragprog.h"
#include "r300_fragprog_swizzle.h"
#include "radeon_compiler.h"
#include "radeon_compiler_util.h"
#include "radeon_opcodes.h"
#include "radeon_program.h"
#include "radeon_regalloc.h"
#include "radeon_swizzle.h"
#include "rc_test_helpers.h"
@@ -259,6 +293,7 @@ int init_rc_normal_dst(
if (tokens.WriteMask.Length == 0) {
inst->U.I.DstReg.WriteMask = RC_MASK_XYZW;
} else {
inst->U.I.DstReg.WriteMask = 0;
/* The first character should be '.' */
if (tokens.WriteMask.String[0] != '.') {
fprintf(stderr, "1st char of writemask is not valid.\n");
@@ -311,7 +346,8 @@ struct inst_tokens {
* this string is the same that is output by rc_program_print.
* @return 1 On success, 0 on failure
*/
int init_rc_normal_instruction(
int parse_rc_normal_instruction(
struct rc_instruction * inst,
const char * inst_str)
{
@@ -320,10 +356,6 @@ int init_rc_normal_instruction(
regmatch_t matches[REGEX_INST_MATCHES];
struct inst_tokens tokens;
/* Initialize inst */
memset(inst, 0, sizeof(struct rc_instruction));
inst->Type = RC_INSTRUCTION_NORMAL;
/* Execute the regex */
if (!regex_helper(regex_str, inst_str, matches, REGEX_INST_MATCHES)) {
return 0;
@@ -340,6 +372,8 @@ int init_rc_normal_instruction(
/* Fill out the rest of the instruction. */
inst->Type = RC_INSTRUCTION_NORMAL;
for (i = 0; i < MAX_RC_OPCODE; i++) {
const struct rc_opcode_info * info = rc_get_opcode_info(i);
unsigned int first_src = 3;
@@ -378,3 +412,47 @@ int init_rc_normal_instruction(
}
return 1;
}
int init_rc_normal_instruction(
struct rc_instruction * inst,
const char * inst_str)
{
/* Initialize inst */
memset(inst, 0, sizeof(struct rc_instruction));
return parse_rc_normal_instruction(inst, inst_str);
}
void add_instruction(struct radeon_compiler *c, const char * inst_string)
{
struct rc_instruction * new_inst =
rc_insert_new_instruction(c, c->Program.Instructions.Prev);
parse_rc_normal_instruction(new_inst, inst_string);
}
void init_compiler(
struct radeon_compiler *c,
enum rc_program_type program_type,
unsigned is_r500,
unsigned is_r400)
{
struct rc_regalloc_state *rs = malloc(sizeof(struct rc_regalloc_state));
rc_init(c, rs);
c->is_r500 = is_r500;
c->max_temp_regs = is_r500 ? 128 : (is_r400 ? 64 : 32);
c->max_constants = is_r500 ? 256 : 32;
c->max_alu_insts = (is_r500 || is_r400) ? 512 : 64;
c->max_tex_insts = (is_r500 || is_r400) ? 512 : 32;
if (program_type == RC_FRAGMENT_PROGRAM) {
c->has_half_swizzles = 1;
c->has_presub = 1;
c->has_omod = 1;
c->SwizzleCaps =
is_r500 ? &r500_swizzle_caps : &r300_swizzle_caps;
} else {
c->SwizzleCaps = &r300_vertprog_swizzle_caps;
}
}

View File

@@ -1,3 +1,33 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
* Copyright 2013 Advanced Micro Devices, Inc.
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Author: Tom Stellard <thomas.stellard@amd.com>
*/
#include "radeon_compiler.h"
int init_rc_normal_src(
struct rc_instruction * inst,
@@ -8,6 +38,18 @@ int init_rc_normal_dst(
struct rc_instruction * inst,
const char * dst_str);
int parse_rc_normal_instruction(
struct rc_instruction * inst,
const char * inst_str);
int init_rc_normal_instruction(
struct rc_instruction * inst,
const char * inst_str);
void add_instruction(struct radeon_compiler *c, const char * inst_string);
void init_compiler(
struct radeon_compiler *c,
enum rc_program_type program_type,
unsigned is_r500,
unsigned is_r400);

View File

@@ -1,19 +1,51 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "unit_test.h"
void run_tests(struct test tests[])
unsigned run_tests(struct test tests[])
{
int i;
unsigned pass = 1;
for (i = 0; tests[i].name; i++) {
printf("Test %s\n", tests[i].name);
memset(&tests[i].result, 0, sizeof(tests[i].result));
tests[i].test_func(&tests[i].result);
printf("Test %s (%d/%d) pass\n", tests[i].name,
tests[i].result.pass, tests[i].result.test_count);
if (tests[i].result.pass != tests[i].result.test_count) {
pass = 0;
}
}
return pass;
}
void test_begin(struct test_result * result)

View File

@@ -1,3 +1,29 @@
/*
* Copyright 2011 Tom Stellard <tstellar@gmail.com>
*
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial
* portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
struct test_result {
unsigned int test_count;
@@ -11,7 +37,7 @@ struct test {
struct test_result result;
};
void run_tests(struct test tests[]);
unsigned run_tests(struct test tests[]);
void test_begin(struct test_result * result);
void test_check(struct test_result * result, int cond);

View File

@@ -487,6 +487,7 @@ static void r300_set_blend_color(struct pipe_context* pipe,
(struct r300_blend_color_state*)r300->blend_color_state.state;
struct pipe_blend_color c;
enum pipe_format format = fb->nr_cbufs ? fb->cbufs[0]->format : 0;
float tmp;
CB_LOCALS;
state->state = *color; /* Save it, so that we can reuse it in set_fb_state */
@@ -513,6 +514,13 @@ static void r300_set_blend_color(struct pipe_context* pipe,
c.color[2] = c.color[3];
break;
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
tmp = c.color[0];
c.color[0] = c.color[2];
c.color[2] = tmp;
break;
default:;
}
}
@@ -919,6 +927,9 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
/* Need to reset clamping or colormask. */
r300_mark_atom_dirty(r300, &r300->blend_state);
/* Re-swizzle the blend color. */
r300_set_blend_color(pipe, &((struct r300_blend_color_state*)r300->blend_color_state.state)->state);
/* If zsbuf is set from NULL to non-NULL or vice versa.. */
if (!!old_state->zsbuf != !!state->zsbuf) {
r300_mark_atom_dirty(r300, &r300->dsa_state);

View File

@@ -978,9 +978,9 @@ r300_texture_create_object(struct r300_screen *rscreen,
tex->tex.microtile = microtile;
tex->tex.macrotile[0] = macrotile;
tex->tex.stride_in_bytes_override = stride_in_bytes_override;
tex->domain = base->flags & R300_RESOURCE_FLAG_TRANSFER ?
RADEON_DOMAIN_GTT :
RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT;
tex->domain = base->flags & R300_RESOURCE_FLAG_TRANSFER ? RADEON_DOMAIN_GTT :
base->nr_samples > 1 ? RADEON_DOMAIN_VRAM :
RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT;
tex->buf = buffer;
r300_texture_desc_init(rscreen, tex, base);

View File

@@ -243,9 +243,9 @@ void evergreen_set_streamout_enable(struct r600_context *ctx, unsigned buffer_en
void evergreen_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size)
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size)
{
struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
unsigned i, ncopy, csize, sub_cmd, shift;
@@ -283,4 +283,7 @@ void evergreen_dma_copy(struct r600_context *rctx,
src_offset += csize << shift;
size -= csize;
}
util_range_add(&rdst->valid_buffer_range, dst_offset,
dst_offset + size);
}

View File

@@ -808,6 +808,7 @@ static void *evergreen_create_dsa_state(struct pipe_context *ctx,
dsa->valuemask[1] = state->stencil[1].valuemask;
dsa->writemask[0] = state->stencil[0].writemask;
dsa->writemask[1] = state->stencil[1].writemask;
dsa->zwritemask = state->depth.writemask;
db_depth_control = S_028800_Z_ENABLE(state->depth.enabled) |
S_028800_Z_WRITE_ENABLE(state->depth.writemask) |
@@ -1321,6 +1322,10 @@ void evergreen_init_color_surface_rat(struct r600_context *rctx,
* elements. */
surf->cb_color_dim = pipe_buffer->width0;
/* Set the buffer range the GPU will have access to: */
util_range_add(&r600_resource(pipe_buffer)->valid_buffer_range,
0, pipe_buffer->width0);
surf->cb_color_cmask = surf->cb_color_base;
surf->cb_color_cmask_slice = 0;
surf->cb_color_fmask = surf->cb_color_base;
@@ -1405,10 +1410,15 @@ void evergreen_init_color_surface(struct r600_context *rctx,
S_028C74_NON_DISP_TILING_ORDER(non_disp_tiling) |
S_028C74_FMASK_BANK_HEIGHT(fmask_bankh);
if (rctx->chip_class == CAYMAN && rtex->resource.b.b.nr_samples > 1) {
unsigned log_samples = util_logbase2(rtex->resource.b.b.nr_samples);
color_attrib |= S_028C74_NUM_SAMPLES(log_samples) |
S_028C74_NUM_FRAGMENTS(log_samples);
if (rctx->chip_class == CAYMAN) {
color_attrib |= S_028C74_FORCE_DST_ALPHA_1(desc->swizzle[3] ==
UTIL_FORMAT_SWIZZLE_1);
if (rtex->resource.b.b.nr_samples > 1) {
unsigned log_samples = util_logbase2(rtex->resource.b.b.nr_samples);
color_attrib |= S_028C74_NUM_SAMPLES(log_samples) |
S_028C74_NUM_FRAGMENTS(log_samples);
}
}
ntype = V_028C70_NUMBER_UNORM;
@@ -1647,6 +1657,11 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
}
if (rctx->framebuffer.state.zsbuf) {
rctx->flags |= R600_CONTEXT_WAIT_3D_IDLE | R600_CONTEXT_FLUSH_AND_INV;
rtex = (struct r600_texture*)rctx->framebuffer.state.zsbuf->texture;
if (rtex->htile) {
rctx->flags |= R600_CONTEXT_FLUSH_AND_INV_DB_META;
}
}
util_copy_framebuffer_state(&rctx->framebuffer.state, state);
@@ -1668,6 +1683,8 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
surf = (struct r600_surface*)state->cbufs[i];
rtex = (struct r600_texture*)surf->base.texture;
r600_context_add_resource_size(ctx, state->cbufs[i]->texture);
if (!surf->color_initialized) {
evergreen_init_color_surface(rctx, surf);
}
@@ -1699,6 +1716,8 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
if (state->zsbuf) {
surf = (struct r600_surface*)state->zsbuf;
r600_context_add_resource_size(ctx, state->zsbuf->texture);
if (!surf->depth_initialized) {
evergreen_init_depth_surface(rctx, surf);
}
@@ -2218,9 +2237,23 @@ static void evergreen_emit_db_misc_state(struct r600_context *rctx, struct r600_
}
db_render_override |= S_02800C_NOOP_CULL_DISABLE(1);
}
if (rctx->db_state.rsurf && rctx->db_state.rsurf->htile_enabled) {
/* FIXME we should be able to use hyperz even if we are not writing to
* zbuffer but somehow this trigger GPU lockup. See :
*
* https://bugs.freedesktop.org/show_bug.cgi?id=60848
*
* Disable hyperz for now if not writing to zbuffer.
*/
if (rctx->db_state.rsurf && rctx->db_state.rsurf->htile_enabled && rctx->zwritemask) {
/* FORCE_OFF means HiZ/HiS are determined by DB_SHADER_CONTROL */
db_render_override |= S_02800C_FORCE_HIZ_ENABLE(V_02800C_FORCE_OFF);
/* This is to fix a lockup when hyperz and alpha test are enabled at
* the same time somehow GPU get confuse on which order to pick for
* z test
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_render_override |= S_02800C_FORCE_SHADER_Z_ORDER(1);
}
} else {
db_render_override |= S_02800C_FORCE_HIZ_ENABLE(V_02800C_FORCE_DISABLE);
}
@@ -3210,7 +3243,7 @@ void evergreen_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_pipe_state *rstate = &shader->rstate;
struct r600_shader *rshader = &shader->shader;
unsigned i, exports_ps, num_cout, spi_ps_in_control_0, spi_input_z, spi_ps_in_control_1, db_shader_control;
unsigned i, exports_ps, num_cout, spi_ps_in_control_0, spi_input_z, spi_ps_in_control_1, db_shader_control = 0;
int pos_index = -1, face_index = -1;
int ninterp = 0;
boolean have_linear = FALSE, have_centroid = FALSE, have_perspective = FALSE;
@@ -3220,7 +3253,6 @@ void evergreen_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader
rstate->nregs = 0;
db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
for (i = 0; i < rshader->ninput; i++) {
/* evergreen NUM_INTERP only contains values interpolated into the LDS,
POSITION goes via GPRs from the SC so isn't counted */
@@ -3454,6 +3486,24 @@ void evergreen_update_db_shader_control(struct r600_context * rctx)
V_02880C_EXPORT_DB_FULL) |
S_02880C_ALPHA_TO_MASK_DISABLE(rctx->framebuffer.cb0_is_integer);
/* When alpha test is enabled we can't trust the hw to make the proper
* decision on the order in which ztest should be run related to fragment
* shader execution.
*
* If alpha test is enabled perform early z rejection (RE_Z) but don't early
* write to the zbuffer. Write to zbuffer is delayed after fragment shader
* execution and thus after alpha test so if discarded by the alpha test
* the z value is not written.
* If ReZ is enabled, and the zfunc/zenable/zwrite values change you can
* get a hang unless you flush the DB in between. For now just use
* LATE_Z.
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
} else {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
}
if (db_shader_control != rctx->db_misc_state.db_shader_control) {
rctx->db_misc_state.db_shader_control = db_shader_control;
rctx->db_misc_state.atom.dirty = true;
@@ -3481,7 +3531,7 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
unsigned array_mode, lbpp, pitch_tile_max, slice_tile_max, size;
unsigned ncopy, height, cheight, detile, i, x, y, z, src_mode, dst_mode;
unsigned sub_cmd, bank_h, bank_w, mt_aspect, nbanks, tile_split;
unsigned long base, addr;
uint64_t base, addr;
/* make sure that the dma ring is only one active */
rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC);
@@ -3502,7 +3552,8 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
if (dst_mode == RADEON_SURF_MODE_LINEAR) {
/* T2L */
array_mode = evergreen_array_mode(src_mode);
slice_tile_max = (((pitch * rsrc->surface.level[src_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rsrc->surface.level[src_level].nblk_x * rsrc->surface.level[src_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3526,7 +3577,8 @@ static void evergreen_dma_copy_tile(struct r600_context *rctx,
} else {
/* L2T */
array_mode = evergreen_array_mode(dst_mode);
slice_tile_max = (((pitch * rdst->surface.level[dst_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rdst->surface.level[dst_level].nblk_x * rdst->surface.level[dst_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3624,8 +3676,19 @@ boolean evergreen_dma_blit(struct pipe_context *ctx,
return FALSE;
}
/* 128 bpp surfaces require non_disp_tiling for both
* tiled and linear buffers on cayman. However, async
* DMA only supports it on the tiled side. As such
* the tile order is backwards after a L2T/T2L packet.
*/
if ((rctx->chip_class == CAYMAN) &&
(src_mode != dst_mode) &&
(util_format_get_blocksize(src->format) >= 16)) {
return FALSE;
}
if (src_mode == dst_mode) {
unsigned long dst_offset, src_offset;
uint64_t dst_offset, src_offset;
/* simple dma blit would do NOTE code here assume :
* src_box.x/y == 0
* dst_x/y == 0

View File

@@ -28,6 +28,7 @@
#include "../../winsys/radeon/drm/radeon_winsys.h"
#include "util/u_double_list.h"
#include "util/u_range.h"
#include "util/u_transfer.h"
#define R600_ERR(fmt, args...) \
@@ -50,6 +51,16 @@ struct r600_resource {
/* Resource state. */
unsigned domains;
/* The buffer range which is initialized (with a write transfer,
* streamout, DMA, or as a random access target). The rest of
* the buffer is considered invalid and can be mapped unsynchronized.
*
* This allows unsychronized mapping of a buffer range which hasn't
* been used yet. It's for applications which forget to use
* the unsynchronized map flag and expect the driver to figure it out.
*/
struct util_range valid_buffer_range;
};
#define R600_BLOCK_MAX_BO 32
@@ -151,6 +162,8 @@ struct r600_so_target {
#define R600_CONTEXT_WAIT_CP_DMA_IDLE (1 << 3)
#define R600_CONTEXT_FLUSH_AND_INV (1 << 4)
#define R600_CONTEXT_FLUSH_AND_INV_CB_META (1 << 5)
#define R600_CONTEXT_PS_PARTIAL_FLUSH (1 << 6)
#define R600_CONTEXT_FLUSH_AND_INV_DB_META (1 << 7)
struct r600_context;
struct r600_screen;
@@ -174,9 +187,9 @@ void r600_need_dma_space(struct r600_context *ctx, unsigned num_dw);
void r600_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size);
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size);
boolean r600_dma_blit(struct pipe_context *ctx,
struct pipe_resource *dst,
unsigned dst_level,
@@ -187,9 +200,9 @@ boolean r600_dma_blit(struct pipe_context *ctx,
void evergreen_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size);
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size);
boolean evergreen_dma_blit(struct pipe_context *ctx,
struct pipe_resource *dst,
unsigned dst_level,

View File

@@ -68,13 +68,17 @@ static inline unsigned int r600_bytecode_get_num_operands(struct r600_bytecode *
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MAX_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MIN_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_UINT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_DX10:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_INT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_UINT:
case V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_PRED_SETE:
@@ -150,13 +154,17 @@ static inline unsigned int r600_bytecode_get_num_operands(struct r600_bytecode *
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MAX_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MIN_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETNE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGT_UINT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_DX10:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_INT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_SETGE_UINT:
case EG_V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_PRED_SETE:
@@ -314,6 +322,7 @@ int r600_bytecode_add_output(struct r600_bytecode *bc, const struct r600_bytecod
output->swizzle_y == bc->cf_last->output.swizzle_y &&
output->swizzle_z == bc->cf_last->output.swizzle_z &&
output->swizzle_w == bc->cf_last->output.swizzle_w &&
output->comp_mask == bc->cf_last->output.comp_mask &&
(output->burst_count + bc->cf_last->output.burst_count) <= 16) {
if ((output->gpr + output->burst_count) == bc->cf_last->output.gpr &&
@@ -865,12 +874,6 @@ static int check_and_set_bank_swizzle(struct r600_bytecode *bc,
bank_swizzle[4] = SQ_ALU_SCL_210;
while(bank_swizzle[4] <= SQ_ALU_SCL_221) {
if (max_slots == 4) {
for (i = 0; i < max_slots; i++) {
if (bank_swizzle[i] == SQ_ALU_VEC_210)
return -1;
}
}
init_bank_swizzle(&bs);
if (scalar_only == false) {
for (i = 0; i < 4; i++) {
@@ -902,8 +905,10 @@ static int check_and_set_bank_swizzle(struct r600_bytecode *bc,
bank_swizzle[i]++;
if (bank_swizzle[i] <= SQ_ALU_VEC_210)
break;
else
else if (i < max_slots - 1)
bank_swizzle[i] = SQ_ALU_VEC_012;
else
return -1;
}
}
}

View File

@@ -523,6 +523,37 @@ void r600_copy_buffer(struct pipe_context *ctx, struct pipe_resource *dst, unsig
}
}
static void r600_clear_buffer(struct pipe_context *ctx, struct pipe_resource *dst,
unsigned offset, unsigned size, unsigned char value)
{
struct r600_context *rctx = (struct r600_context*)ctx;
if (rctx->screen->has_streamout && offset % 4 == 0 && size % 4 == 0) {
union pipe_color_union clear_value;
uint32_t v = value;
clear_value.ui[0] = v | (v << 8) | (v << 16) | (v << 24);
r600_blitter_begin(ctx, R600_DISABLE_RENDER_COND);
util_blitter_clear_buffer(rctx->blitter, dst, offset, size,
1, &clear_value);
r600_blitter_end(ctx);
} else {
char *map = r600_buffer_mmap_sync_with_rings(rctx, r600_resource(dst),
PIPE_TRANSFER_WRITE);
memset(map + offset, value, size);
}
}
void r600_screen_clear_buffer(struct r600_screen *rscreen, struct pipe_resource *dst,
unsigned offset, unsigned size, unsigned char value)
{
pipe_mutex_lock(rscreen->aux_context_lock);
r600_clear_buffer(rscreen->aux_context, dst, offset, size, value);
rscreen->aux_context->flush(rscreen->aux_context, NULL, 0);
pipe_mutex_unlock(rscreen->aux_context_lock);
}
static bool util_format_is_subsampled_2x1_32bpp(enum pipe_format format)
{
const struct util_format_description *desc = util_format_description(format);

View File

@@ -34,6 +34,7 @@ static void r600_buffer_destroy(struct pipe_screen *screen,
{
struct r600_resource *rbuffer = r600_resource(buf);
util_range_destroy(&rbuffer->valid_buffer_range);
pb_reference(&rbuffer->buf, NULL);
FREE(rbuffer);
}
@@ -98,6 +99,14 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx,
assert(box->x + box->width <= resource->width0);
/* See if the buffer range being mapped has never been initialized,
* in which case it can be mapped unsynchronized. */
if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED) &&
usage & PIPE_TRANSFER_WRITE &&
!util_ranges_intersect(&rbuffer->valid_buffer_range, box->x, box->x + box->width)) {
usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
}
if (usage & PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE &&
!(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
assert(usage & PIPE_TRANSFER_WRITE);
@@ -178,6 +187,7 @@ static void r600_buffer_transfer_unmap(struct pipe_context *pipe,
{
struct r600_context *rctx = (struct r600_context*)pipe;
struct r600_transfer *rtransfer = (struct r600_transfer*)transfer;
struct r600_resource *rbuffer = r600_resource(transfer->resource);
if (rtransfer->staging) {
struct pipe_resource *dst, *src;
@@ -189,7 +199,7 @@ static void r600_buffer_transfer_unmap(struct pipe_context *pipe,
doffset = transfer->box.x;
soffset = rtransfer->offset + transfer->box.x % R600_MAP_BUFFER_ALIGNMENT;
/* Copy the staging buffer into the original one. */
if (rctx->rings.dma.cs && !(size % 4) && !(doffset % 4) && !(soffset)) {
if (rctx->rings.dma.cs && !(size % 4) && !(doffset % 4) && !(soffset % 4)) {
if (rctx->screen->chip_class >= EVERGREEN) {
evergreen_dma_copy(rctx, dst, src, doffset, soffset, size);
} else {
@@ -203,6 +213,11 @@ static void r600_buffer_transfer_unmap(struct pipe_context *pipe,
}
pipe_resource_reference((struct pipe_resource**)&rtransfer->staging, NULL);
}
if (transfer->usage & PIPE_TRANSFER_WRITE) {
util_range_add(&rbuffer->valid_buffer_range, transfer->box.x,
transfer->box.x + transfer->box.width);
}
util_slab_free(&rctx->pool_transfers, transfer);
}
@@ -259,6 +274,7 @@ bool r600_init_resource(struct r600_screen *rscreen,
res->cs_buf = rscreen->ws->buffer_get_cs_handle(res->buf);
res->domains = domains;
util_range_set_empty(&res->valid_buffer_range);
return true;
}
@@ -275,6 +291,7 @@ struct pipe_resource *r600_buffer_create(struct pipe_screen *screen,
pipe_reference_init(&rbuffer->b.b.reference, 1);
rbuffer->b.b.screen = screen;
rbuffer->b.vtbl = &r600_buffer_vtbl;
util_range_init(&rbuffer->valid_buffer_range);
if (!r600_init_resource(rscreen, rbuffer, templ->width0, alignment, TRUE, templ->usage)) {
FREE(rbuffer);

View File

@@ -359,6 +359,16 @@ out_err:
void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw,
boolean count_draw_in)
{
if (!ctx->ws->cs_memory_below_limit(ctx->rings.gfx.cs, ctx->vram, ctx->gtt)) {
ctx->gtt = 0;
ctx->vram = 0;
ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC);
return;
}
/* all will be accounted once relocation are emited */
ctx->gtt = 0;
ctx->vram = 0;
/* The number of dwords we already used in the CS so far. */
num_dw += ctx->rings.gfx.cs->cdw;
@@ -623,17 +633,27 @@ void r600_flush_emit(struct r600_context *rctx)
/* Use of WAIT_UNTIL is deprecated on Cayman+ */
if (rctx->family >= CHIP_CAYMAN) {
/* emit a PS partial flush on Cayman/TN */
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
cs->buf[cs->cdw++] = EVENT_TYPE(EVENT_TYPE_PS_PARTIAL_FLUSH) | EVENT_INDEX(4);
rctx->flags |= R600_CONTEXT_PS_PARTIAL_FLUSH;
}
}
if (rctx->flags & R600_CONTEXT_PS_PARTIAL_FLUSH) {
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
cs->buf[cs->cdw++] = EVENT_TYPE(EVENT_TYPE_PS_PARTIAL_FLUSH) | EVENT_INDEX(4);
}
if (rctx->chip_class >= R700 &&
(rctx->flags & R600_CONTEXT_FLUSH_AND_INV_CB_META)) {
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
cs->buf[cs->cdw++] = EVENT_TYPE(EVENT_TYPE_FLUSH_AND_INV_CB_META) | EVENT_INDEX(0);
}
if (rctx->chip_class >= R700 &&
(rctx->flags & R600_CONTEXT_FLUSH_AND_INV_DB_META)) {
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
cs->buf[cs->cdw++] = EVENT_TYPE(EVENT_TYPE_FLUSH_AND_INV_DB_META) | EVENT_INDEX(0);
}
if (rctx->flags & R600_CONTEXT_FLUSH_AND_INV) {
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
cs->buf[cs->cdw++] = EVENT_TYPE(EVENT_TYPE_CACHE_FLUSH_AND_INV_EVENT) | EVENT_INDEX(0);
@@ -728,6 +748,7 @@ void r600_context_flush(struct r600_context *ctx, unsigned flags)
*/
ctx->flags |= R600_CONTEXT_FLUSH_AND_INV |
R600_CONTEXT_FLUSH_AND_INV_CB_META |
R600_CONTEXT_FLUSH_AND_INV_DB_META |
R600_CONTEXT_WAIT_3D_IDLE |
R600_CONTEXT_WAIT_CP_DMA_IDLE;
@@ -784,6 +805,8 @@ void r600_begin_new_cs(struct r600_context *ctx)
ctx->pm4_dirty_cdwords = 0;
ctx->flags = 0;
ctx->gtt = 0;
ctx->vram = 0;
/* Begin a new CS. */
r600_emit_command_buffer(ctx->rings.gfx.cs, &ctx->start_cs_cmd);
@@ -1103,6 +1126,7 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
rctx->flags |= R600_CONTEXT_INVAL_READ_CACHES |
R600_CONTEXT_FLUSH_AND_INV |
R600_CONTEXT_FLUSH_AND_INV_CB_META |
R600_CONTEXT_FLUSH_AND_INV_DB_META |
R600_CONTEXT_STREAMOUT_FLUSH |
R600_CONTEXT_WAIT_3D_IDLE;
@@ -1145,6 +1169,12 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
src_offset += byte_count;
dst_offset += byte_count;
}
/* Invalidate the read caches. */
rctx->flags |= R600_CONTEXT_INVAL_READ_CACHES;
util_range_add(&r600_resource(dst)->valid_buffer_range, dst_offset,
dst_offset + size);
}
void r600_need_dma_space(struct r600_context *ctx, unsigned num_dw)
@@ -1160,9 +1190,9 @@ void r600_need_dma_space(struct r600_context *ctx, unsigned num_dw)
void r600_dma_copy(struct r600_context *rctx,
struct pipe_resource *dst,
struct pipe_resource *src,
unsigned long dst_offset,
unsigned long src_offset,
unsigned long size)
uint64_t dst_offset,
uint64_t src_offset,
uint64_t size)
{
struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
unsigned i, ncopy, csize, shift;
@@ -1191,4 +1221,7 @@ void r600_dma_copy(struct r600_context *rctx,
src_offset += csize << shift;
size -= csize;
}
util_range_add(&rdst->valid_buffer_range, dst_offset,
dst_offset + size);
}

View File

@@ -29,7 +29,7 @@
#include "r600_pipe.h"
/* the number of CS dwords for flushing and drawing */
#define R600_MAX_FLUSH_CS_DWORDS 12
#define R600_MAX_FLUSH_CS_DWORDS 16
#define R600_MAX_DRAW_CS_DWORDS 34
#define R600_TRACE_CS_DWORDS 7

View File

@@ -38,8 +38,12 @@ static LLVMValueRef llvm_fetch_const(
LLVMValueRef index = LLVMBuildLoad(bld_base->base.gallivm->builder, bld->addr[reg->Indirect.Index][reg->Indirect.SwizzleX], "");
offset[1] = LLVMBuildAdd(bld_base->base.gallivm->builder, offset[1], index, "");
}
unsigned ConstantAddressSpace = CONSTANT_BUFFER_0_ADDR_SPACE ;
if (reg->Register.Dimension) {
ConstantAddressSpace += reg->Dimension.Index;
}
LLVMTypeRef const_ptr_type = LLVMPointerType(LLVMArrayType(LLVMVectorType(bld_base->base.elem_type, 4), 1024),
CONSTANT_BUFFER_0_ADDR_SPACE);
ConstantAddressSpace);
LLVMValueRef const_ptr = LLVMBuildIntToPtr(bld_base->base.gallivm->builder, lp_build_const_int32(bld_base->base.gallivm, 0), const_ptr_type, "");
LLVMValueRef ptr = LLVMBuildGEP(bld_base->base.gallivm->builder, const_ptr, offset, 2, "");
LLVMValueRef cvecval = LLVMBuildLoad(bld_base->base.gallivm->builder, ptr, "");
@@ -537,6 +541,7 @@ const char * r600_llvm_gpu_string(enum radeon_family family)
case CHIP_RV630:
case CHIP_RV620:
case CHIP_RV635:
case CHIP_RV670:
case CHIP_RS780:
case CHIP_RS880:
gpu_family = "r600";
@@ -547,7 +552,6 @@ const char * r600_llvm_gpu_string(enum radeon_family family)
case CHIP_RV730:
gpu_family = "rv730";
break;
case CHIP_RV670:
case CHIP_RV740:
case CHIP_RV770:
gpu_family = "rv770";

View File

@@ -22,6 +22,7 @@
*/
#include "r600_pipe.h"
#include "r600_public.h"
#include "r600d.h"
#include <errno.h>
#include "pipe/p_shader_tokens.h"
@@ -165,12 +166,23 @@ static void r600_flush_gfx_ring(void *ctx, unsigned flags)
static void r600_flush_dma_ring(void *ctx, unsigned flags)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
unsigned padding_dw, i;
if (!rctx->rings.dma.cs->cdw) {
if (!cs->cdw) {
return;
}
/* Pad the DMA CS to a multiple of 8 dwords. */
padding_dw = 8 - cs->cdw % 8;
if (padding_dw < 8) {
for (i = 0; i < padding_dw; i++) {
cs->buf[cs->cdw++] = DMA_PACKET(DMA_PACKET_NOP, 0, 0, 0);
}
}
rctx->rings.dma.flushing = true;
rctx->ws->cs_flush(rctx->rings.dma.cs, flags);
rctx->ws->cs_flush(cs, flags);
rctx->rings.dma.flushing = false;
}
@@ -822,6 +834,9 @@ static void r600_destroy_screen(struct pipe_screen* pscreen)
if (rscreen == NULL)
return;
pipe_mutex_destroy(rscreen->aux_context_lock);
rscreen->aux_context->destroy(rscreen->aux_context);
if (rscreen->global_pool) {
compute_memory_pool_delete(rscreen->global_pool);
}
@@ -1145,7 +1160,7 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
* case were triggering lockup quickly such as :
* piglit/bin/depthstencil-render-miplevels 1024 d=s=z24_s8
*/
rscreen->use_hyperz = debug_get_bool_option("R600_HYPERZ", TRUE);
rscreen->use_hyperz = debug_get_bool_option("R600_HYPERZ", FALSE);
rscreen->use_hyperz = rscreen->info.drm_minor >= 26 ? rscreen->use_hyperz : FALSE;
rscreen->global_pool = compute_memory_pool_new(rscreen);
@@ -1164,5 +1179,41 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
}
#endif
/* Create the auxiliary context. */
pipe_mutex_init(rscreen->aux_context_lock);
rscreen->aux_context = rscreen->screen.context_create(&rscreen->screen, NULL);
#if 0 /* This is for testing whether aux_context and buffer clearing work correctly. */
struct pipe_resource templ = {};
templ.width0 = 4;
templ.height0 = 2048;
templ.depth0 = 1;
templ.array_size = 1;
templ.target = PIPE_TEXTURE_2D;
templ.format = PIPE_FORMAT_R8G8B8A8_UNORM;
templ.usage = PIPE_USAGE_STATIC;
struct r600_resource *res = r600_resource(rscreen->screen.resource_create(&rscreen->screen, &templ));
unsigned char *map = ws->buffer_map(res->cs_buf, NULL, PIPE_TRANSFER_WRITE);
memset(map, 0, 256);
r600_screen_clear_buffer(rscreen, &res->b.b, 4, 4, 0xCC);
r600_screen_clear_buffer(rscreen, &res->b.b, 8, 4, 0xDD);
r600_screen_clear_buffer(rscreen, &res->b.b, 12, 4, 0xEE);
r600_screen_clear_buffer(rscreen, &res->b.b, 20, 4, 0xFF);
r600_screen_clear_buffer(rscreen, &res->b.b, 32, 20, 0x87);
ws->buffer_wait(res->buf, RADEON_USAGE_WRITE);
int i;
for (i = 0; i < 256; i++) {
printf("%02X", map[i]);
if (i % 16 == 15)
printf("\n");
}
#endif
return &rscreen->screen;
}

View File

@@ -252,6 +252,11 @@ struct r600_screen {
unsigned cs_count;
#endif
r600g_dma_blit_t dma_blit;
/* Auxiliary context. Mainly used to initialize resources.
* It must be locked prior to using and flushed before unlocking. */
struct pipe_context *aux_context;
pipe_mutex aux_context_lock;
};
struct r600_pipe_sampler_view {
@@ -298,7 +303,8 @@ struct r600_dsa_state {
unsigned alpha_ref;
ubyte valuemask[2];
ubyte writemask[2];
unsigned sx_alpha_test_control;
unsigned zwritemask;
unsigned sx_alpha_test_control;
};
struct r600_pipe_shader;
@@ -447,6 +453,10 @@ struct r600_context {
unsigned backend_mask;
unsigned max_db; /* for OQ */
/* current unaccounted memory usage */
uint64_t vram;
uint64_t gtt;
/* Miscellaneous state objects. */
void *custom_dsa_flush;
void *custom_blend_resolve;
@@ -509,6 +519,7 @@ struct r600_context {
bool alpha_to_one;
bool force_blend_disable;
boolean dual_src_blend;
unsigned zwritemask;
/* Index buffer. */
struct pipe_index_buffer index_buffer;
@@ -624,6 +635,8 @@ void evergreen_update_db_shader_control(struct r600_context * rctx);
/* r600_blit.c */
void r600_copy_buffer(struct pipe_context *ctx, struct pipe_resource *dst, unsigned dstx,
struct pipe_resource *src, const struct pipe_box *src_box);
void r600_screen_clear_buffer(struct r600_screen *rscreen, struct pipe_resource *dst,
unsigned offset, unsigned size, unsigned char value);
void r600_init_blit_functions(struct r600_context *rctx);
void r600_blit_decompress_depth(struct pipe_context *ctx,
struct r600_texture *texture,
@@ -869,9 +882,11 @@ static INLINE unsigned r600_context_bo_reloc(struct r600_context *ctx,
* look serialized from driver pov
*/
if (!ring->flushing) {
if (ring == &ctx->rings.gfx && ctx->rings.dma.cs) {
/* flush dma ring */
ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC);
if (ring == &ctx->rings.gfx) {
if (ctx->rings.dma.cs) {
/* flush dma ring */
ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC);
}
} else {
/* flush gfx ring */
ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC);
@@ -996,4 +1011,28 @@ static INLINE unsigned u_max_layer(struct pipe_resource *r, unsigned level)
}
}
static INLINE void r600_context_add_resource_size(struct pipe_context *ctx, struct pipe_resource *r)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_resource *rr = (struct r600_resource *)r;
if (r == NULL) {
return;
}
/*
* The idea is to compute a gross estimate of memory requirement of
* each draw call. After each draw call, memory will be precisely
* accounted. So the uncertainty is only on the current draw call.
* In practice this gave very good estimate (+/- 10% of the target
* memory limit).
*/
if (rr->domains & RADEON_DOMAIN_GTT) {
rctx->gtt += rr->buf->size;
}
if (rr->domains & RADEON_DOMAIN_VRAM) {
rctx->vram += rr->buf->size;
}
}
#endif

View File

@@ -186,10 +186,11 @@ static void r600_emit_query_end(struct r600_context *ctx, struct r600_query *que
case PIPE_QUERY_PRIMITIVES_GENERATED:
case PIPE_QUERY_SO_STATISTICS:
case PIPE_QUERY_SO_OVERFLOW_PREDICATE:
va += query->buffer.results_end + query->result_size/2;
cs->buf[cs->cdw++] = PKT3(PKT3_EVENT_WRITE, 2, 0);
cs->buf[cs->cdw++] = EVENT_TYPE(EVENT_TYPE_SAMPLE_STREAMOUTSTATS) | EVENT_INDEX(3);
cs->buf[cs->cdw++] = query->buffer.results_end + query->result_size/2;
cs->buf[cs->cdw++] = 0;
cs->buf[cs->cdw++] = va;
cs->buf[cs->cdw++] = (va >> 32UL) & 0xFF;
break;
case PIPE_QUERY_TIME_ELAPSED:
va += query->buffer.results_end + query->result_size/2;

View File

@@ -5760,7 +5760,7 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
{
struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
struct r600_bytecode_alu alu;
int i, j, r;
int i, j, k, r;
int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
/* src0 * src1 */
@@ -5768,21 +5768,40 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
continue;
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
if (ctx->bc->chip_class == CAYMAN) {
for (j = 0; j < 4; j++) {
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.dst.chan = j;
alu.dst.sel = ctx->temp_reg;
alu.dst.write = (j == i);
alu.dst.chan = i;
alu.dst.sel = ctx->temp_reg;
alu.dst.write = 1;
if (j == 3)
alu.last = 1;
alu.inst = CTX_INST(V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MULLO_UINT);
for (k = 0; k < inst->Instruction.NumSrcRegs; k++) {
r600_bytecode_src(&alu.src[k], &ctx->src[k], i);
}
r = r600_bytecode_add_alu(ctx->bc, &alu);
if (r)
return r;
}
} else {
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.inst = CTX_INST(V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MULLO_UINT);
for (j = 0; j < 2; j++) {
r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
}
alu.dst.chan = i;
alu.dst.sel = ctx->temp_reg;
alu.dst.write = 1;
alu.last = 1;
r = r600_bytecode_add_alu(ctx->bc, &alu);
if (r)
return r;
alu.inst = CTX_INST(V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MULLO_UINT);
for (j = 0; j < 2; j++) {
r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
}
alu.last = 1;
r = r600_bytecode_add_alu(ctx->bc, &alu);
if (r)
return r;
}
}

View File

@@ -802,6 +802,7 @@ static void *r600_create_dsa_state(struct pipe_context *ctx,
dsa->valuemask[1] = state->stencil[1].valuemask;
dsa->writemask[0] = state->stencil[0].writemask;
dsa->writemask[1] = state->stencil[1].writemask;
dsa->zwritemask = state->depth.writemask;
db_depth_control = S_028800_Z_ENABLE(state->depth.enabled) |
S_028800_Z_WRITE_ENABLE(state->depth.writemask) |
@@ -1515,6 +1516,11 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
}
if (rctx->framebuffer.state.zsbuf) {
rctx->flags |= R600_CONTEXT_WAIT_3D_IDLE | R600_CONTEXT_FLUSH_AND_INV;
rtex = (struct r600_texture*)rctx->framebuffer.state.zsbuf->texture;
if (rctx->chip_class >= R700 && rtex->htile) {
rctx->flags |= R600_CONTEXT_FLUSH_AND_INV_DB_META;
}
}
/* Set the new state. */
@@ -1544,6 +1550,7 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
surf = (struct r600_surface*)state->cbufs[i];
rtex = (struct r600_texture*)surf->base.texture;
r600_context_add_resource_size(ctx, state->cbufs[i]->texture);
if (!surf->color_initialized || force_cmask_fmask) {
r600_init_color_surface(rctx, surf, force_cmask_fmask);
@@ -1576,6 +1583,8 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
if (state->zsbuf) {
surf = (struct r600_surface*)state->zsbuf;
r600_context_add_resource_size(ctx, state->zsbuf->texture);
if (!surf->depth_initialized) {
r600_init_depth_surface(rctx, surf);
}
@@ -1937,6 +1946,13 @@ static void r600_emit_db_misc_state(struct r600_context *rctx, struct r600_atom
if (rctx->db_state.rsurf && rctx->db_state.rsurf->htile_enabled) {
/* FORCE_OFF means HiZ/HiS are determined by DB_SHADER_CONTROL */
db_render_override |= S_028D10_FORCE_HIZ_ENABLE(V_028D10_FORCE_OFF);
/* This is to fix a lockup when hyperz and alpha test are enabled at
* the same time somehow GPU get confuse on which order to pick for
* z test
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_render_override |= S_028D10_FORCE_SHADER_Z_ORDER(1);
}
} else {
db_render_override |= S_028D10_FORCE_HIZ_ENABLE(V_028D10_FORCE_DISABLE);
}
@@ -2745,7 +2761,7 @@ void r600_pipe_shader_ps(struct pipe_context *ctx, struct r600_pipe_shader *shad
tmp);
}
db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
db_shader_control = 0;
for (i = 0; i < rshader->noutput; i++) {
if (rshader->output[i].name == TGSI_SEMANTIC_POSITION)
z_export = 1;
@@ -2940,6 +2956,19 @@ void r600_update_db_shader_control(struct r600_context * rctx)
unsigned db_shader_control = rctx->ps_shader->current->db_shader_control |
S_02880C_DUAL_EXPORT_ENABLE(dual_export);
/* When alpha test is enabled we can't trust the hw to make the proper
* decision on the order in which ztest should be run related to fragment
* shader execution.
*
* If alpha test is enabled perform z test after fragment. RE_Z (early
* z test but no write to the zbuffer) seems to cause lockup on r6xx/r7xx
*/
if (rctx->alphatest_state.sx_alpha_test_control) {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
} else {
db_shader_control |= S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
}
if (db_shader_control != rctx->db_misc_state.db_shader_control) {
rctx->db_misc_state.db_shader_control = db_shader_control;
rctx->db_misc_state.atom.dirty = true;
@@ -2979,7 +3008,7 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
struct r600_texture *rdst = (struct r600_texture*)dst;
unsigned array_mode, lbpp, pitch_tile_max, slice_tile_max, size;
unsigned ncopy, height, cheight, detile, i, x, y, z, src_mode, dst_mode;
unsigned long base, addr;
uint64_t base, addr;
/* make sure that the dma ring is only one active */
rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC);
@@ -2998,7 +3027,8 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
if (dst_mode == RADEON_SURF_MODE_LINEAR) {
/* T2L */
array_mode = r600_array_mode(src_mode);
slice_tile_max = (((pitch * rsrc->surface.level[src_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rsrc->surface.level[src_level].nblk_x * rsrc->surface.level[src_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3016,7 +3046,8 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
} else {
/* L2T */
array_mode = r600_array_mode(dst_mode);
slice_tile_max = (((pitch * rdst->surface.level[dst_level].npix_y) >> 6) / bpp) - 1;
slice_tile_max = (rdst->surface.level[dst_level].nblk_x * rdst->surface.level[dst_level].nblk_y) >> 6;
slice_tile_max = slice_tile_max ? slice_tile_max - 1 : 0;
/* linear height must be the same as the slice tile max height, it's ok even
* if the linear destination/source have smaller heigh as the size of the
* dma packet will be using the copy_height which is always smaller or equal
@@ -3037,14 +3068,15 @@ static boolean r600_dma_copy_tile(struct r600_context *rctx,
return FALSE;
}
size = (copy_height * pitch) >> 2;
ncopy = (size / 0x0000ffff) + !!(size % 0x0000ffff);
/* It's a r6xx/r7xx limitation, the blit must be on 8 boundary for number
* line in the blit. Compute max 8 line we can copy in the size limit
*/
cheight = ((0x0000ffff << 2) / pitch) & 0xfffffff8;
ncopy = (copy_height / cheight) + !!(copy_height % cheight);
r600_need_dma_space(rctx, ncopy * 7);
for (i = 0; i < ncopy; i++) {
cheight = copy_height;
if (((cheight * pitch) >> 2) > 0x0000ffff) {
cheight = (0x0000ffff << 2) / pitch;
}
cheight = cheight > copy_height ? copy_height : cheight;
size = (cheight * pitch) >> 2;
/* emit reloc before writting cs so that cs is always in consistent state */
r600_context_bo_reloc(rctx, &rctx->rings.dma, &rsrc->resource, RADEON_USAGE_READ);
@@ -3109,7 +3141,7 @@ boolean r600_dma_blit(struct pipe_context *ctx,
}
if (src_mode == dst_mode) {
unsigned long dst_offset, src_offset, size;
uint64_t dst_offset, src_offset, size;
/* simple dma blit would do NOTE code here assume :
* src_box.x/y == 0

View File

@@ -284,6 +284,16 @@ static void r600_bind_dsa_state(struct pipe_context *ctx, void *state)
ref.valuemask[1] = dsa->valuemask[1];
ref.writemask[0] = dsa->writemask[0];
ref.writemask[1] = dsa->writemask[1];
if (rctx->zwritemask != dsa->zwritemask) {
rctx->zwritemask = dsa->zwritemask;
if (rctx->chip_class >= EVERGREEN) {
/* work around some issue when not writting to zbuffer
* we are having lockup on evergreen so do not enable
* hyperz when not writting zbuffer
*/
rctx->db_misc_state.atom.dirty = true;
}
}
r600_set_stencil_ref(ctx, &ref);
@@ -293,6 +303,11 @@ static void r600_bind_dsa_state(struct pipe_context *ctx, void *state)
rctx->alphatest_state.sx_alpha_test_control = dsa->sx_alpha_test_control;
rctx->alphatest_state.sx_alpha_ref = dsa->alpha_ref;
rctx->alphatest_state.atom.dirty = true;
if (rctx->chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
}
}
@@ -479,7 +494,8 @@ static void r600_set_index_buffer(struct pipe_context *ctx,
if (ib) {
pipe_resource_reference(&rctx->index_buffer.buffer, ib->buffer);
memcpy(&rctx->index_buffer, ib, sizeof(*ib));
memcpy(&rctx->index_buffer, ib, sizeof(*ib));
r600_context_add_resource_size(ctx, ib->buffer);
} else {
pipe_resource_reference(&rctx->index_buffer.buffer, NULL);
}
@@ -516,6 +532,7 @@ static void r600_set_vertex_buffers(struct pipe_context *ctx,
vb[i].buffer_offset = input[i].buffer_offset;
pipe_resource_reference(&vb[i].buffer, input[i].buffer);
new_buffer_mask |= 1 << i;
r600_context_add_resource_size(ctx, input[i].buffer);
} else {
pipe_resource_reference(&vb[i].buffer, NULL);
disable_mask |= 1 << i;
@@ -613,6 +630,7 @@ static void r600_set_sampler_views(struct pipe_context *pipe, unsigned shader,
pipe_sampler_view_reference((struct pipe_sampler_view **)&dst->views.views[i], views[i]);
new_mask |= 1 << i;
r600_context_add_resource_size(pipe, views[i]->texture);
} else {
pipe_sampler_view_reference((struct pipe_sampler_view **)&dst->views.views[i], NULL);
disable_mask |= 1 << i;
@@ -806,6 +824,8 @@ static void r600_bind_ps_state(struct pipe_context *ctx, void *state)
rctx->ps_shader = (struct r600_pipe_shader_selector *)state;
r600_context_pipe_state_set(rctx, &rctx->ps_shader->current->rstate);
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->ps_shader->current->bo);
if (rctx->chip_class <= R700) {
bool multiwrite = rctx->ps_shader->current->shader.fs_write_all;
@@ -835,6 +855,8 @@ static void r600_bind_vs_state(struct pipe_context *ctx, void *state)
if (state) {
r600_context_pipe_state_set(rctx, &rctx->vs_shader->current->rstate);
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->vs_shader->current->bo);
/* Update clip misc state. */
if (rctx->vs_shader->current->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->vs_shader->current->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
@@ -938,10 +960,13 @@ static void r600_set_constant_buffer(struct pipe_context *ctx, uint shader, uint
} else {
u_upload_data(rctx->uploader, 0, input->buffer_size, ptr, &cb->buffer_offset, &cb->buffer);
}
/* account it in gtt */
rctx->gtt += input->buffer_size;
} else {
/* Setup the hw buffer. */
cb->buffer_offset = input->buffer_offset;
pipe_resource_reference(&cb->buffer, input->buffer);
r600_context_add_resource_size(ctx, input->buffer);
}
state->enabled_mask |= 1 << index;
@@ -957,6 +982,7 @@ r600_create_so_target(struct pipe_context *ctx,
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_so_target *t;
struct r600_resource *rbuffer = (struct r600_resource*)buffer;
t = CALLOC_STRUCT(r600_so_target);
if (!t) {
@@ -976,6 +1002,9 @@ r600_create_so_target(struct pipe_context *ctx,
pipe_resource_reference(&t->b.buffer, buffer);
t->b.buffer_offset = buffer_offset;
t->b.buffer_size = buffer_size;
util_range_add(&rbuffer->valid_buffer_range, buffer_offset,
buffer_offset + buffer_size);
return &t->b;
}
@@ -1004,6 +1033,7 @@ static void r600_set_so_targets(struct pipe_context *ctx,
/* Set the new targets. */
for (i = 0; i < num_targets; i++) {
pipe_so_target_reference((struct pipe_stream_output_target**)&rctx->so_targets[i], targets[i]);
r600_context_add_resource_size(ctx, targets[i]->buffer);
}
for (; i < rctx->num_so_targets; i++) {
pipe_so_target_reference((struct pipe_stream_output_target**)&rctx->so_targets[i], NULL);
@@ -1343,6 +1373,12 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
rctx->vgt_state.atom.dirty = true;
}
/* Workaround for hardware deadlock on certain R600 ASICs: write into a CB register. */
if (rctx->chip_class == R600) {
rctx->flags |= R600_CONTEXT_PS_PARTIAL_FLUSH;
rctx->cb_misc_state.atom.dirty = true;
}
/* Emit states. */
r600_need_cs_space(rctx, ib.user_buffer ? 5 : 0, TRUE);
r600_flush_emit(rctx);

View File

@@ -270,6 +270,7 @@ static void r600_texture_destroy(struct pipe_screen *screen,
if (rtex->flushed_depth_texture)
pipe_resource_reference((struct pipe_resource **)&rtex->flushed_depth_texture, NULL);
pipe_resource_reference((struct pipe_resource**)&rtex->htile, NULL);
pb_reference(&resource->buf, NULL);
FREE(rtex);
}
@@ -479,10 +480,7 @@ r600_texture_create_object(struct pipe_screen *screen,
*/
R600_ERR("r600: failed to create bo for htile buffers\n");
} else {
void *ptr;
ptr = rscreen->ws->buffer_map(rtex->htile->cs_buf, NULL, PIPE_TRANSFER_WRITE);
memset(ptr, 0x0, htile_size);
rscreen->ws->buffer_unmap(rtex->htile->cs_buf);
r600_screen_clear_buffer(rscreen, &rtex->htile->b.b, 0, htile_size, 0);
}
}
@@ -504,9 +502,8 @@ r600_texture_create_object(struct pipe_screen *screen,
if (rtex->cmask_size) {
/* Initialize the cmask to 0xCC (= compressed state). */
char *ptr = rscreen->ws->buffer_map(resource->cs_buf, NULL, PIPE_TRANSFER_WRITE);
memset(ptr + rtex->cmask_offset, 0xCC, rtex->cmask_size);
rscreen->ws->buffer_unmap(resource->cs_buf);
r600_screen_clear_buffer(rscreen, &rtex->resource.b.b,
rtex->cmask_offset, rtex->cmask_size, 0xCC);
}
if (debug_get_option_print_texdepth() && rtex->is_depth && rtex->non_disp_tiling) {

View File

@@ -119,6 +119,7 @@
#define EVENT_TYPE_CACHE_FLUSH_AND_INV_EVENT 0x16
#define EVENT_TYPE_SO_VGTSTREAMOUT_FLUSH 0x1f
#define EVENT_TYPE_SAMPLE_STREAMOUTSTATS 0x20
#define EVENT_TYPE_FLUSH_AND_INV_DB_META 0x2c /* supported on r700+ */
#define EVENT_TYPE_FLUSH_AND_INV_CB_META 46 /* supported on r700+ */
#define EVENT_TYPE(x) ((x) << 0)
#define EVENT_INDEX(x) ((x) << 8)

View File

@@ -1,11 +1,14 @@
include Makefile.sources
include $(top_srcdir)/src/gallium/Automake.inc
LIBGALLIUM_LIBS=
if HAVE_GALLIUM_R600
if HAVE_GALLIUM_RADEONSI
lib_LTLIBRARIES = libllvmradeon@VERSION@.la
libllvmradeon@VERSION@_la_LDFLAGS = -Wl, -shared -avoid-version \
$(LLVM_LDFLAGS)
LIBGALLIUM_LIBS += $(top_builddir)/src/gallium/auxiliary/libgallium.la
else
noinst_LTLIBRARIES = libllvmradeon@VERSION@.la
endif
@@ -26,5 +29,6 @@ libllvmradeon@VERSION@_la_SOURCES = \
$(C_FILES)
libllvmradeon@VERSION@_la_LIBADD = \
$(LIBGALLIUM_LIBS) \
$(CLOCK_LIB) \
$(LLVM_LIBS)

View File

@@ -155,7 +155,7 @@ static inline LLVMValueRef bitcast(
void radeon_llvm_emit_prepare_cube_coords(struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data,
unsigned coord_arg);
LLVMValueRef *coords_arg);
void radeon_llvm_context_init(struct radeon_llvm_context * ctx);

View File

@@ -531,7 +531,7 @@ static void kil_emit(
void radeon_llvm_emit_prepare_cube_coords(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data,
unsigned coord_arg)
LLVMValueRef *coords_arg)
{
unsigned target = emit_data->inst->Texture.Texture;
@@ -542,11 +542,13 @@ void radeon_llvm_emit_prepare_cube_coords(
LLVMValueRef coords[4];
LLVMValueRef mad_args[3];
LLVMValueRef idx;
struct LLVMOpaqueValue *cube_vec;
LLVMValueRef v;
unsigned i;
LLVMValueRef v = build_intrinsic(builder, "llvm.AMDGPU.cube",
LLVMVectorType(type, 4),
&emit_data->args[coord_arg], 1, LLVMReadNoneAttribute);
cube_vec = lp_build_gather_values(bld_base->base.gallivm, coords_arg, 4);
v = build_intrinsic(builder, "llvm.AMDGPU.cube", LLVMVectorType(type, 4),
&cube_vec, 1, LLVMReadNoneAttribute);
for (i = 0; i < 4; ++i) {
idx = lp_build_const_int32(gallivm, i);
@@ -579,18 +581,14 @@ void radeon_llvm_emit_prepare_cube_coords(
if (target != TGSI_TEXTURE_CUBE ||
opcode != TGSI_OPCODE_TEX) {
/* load source coord.w component - array_index for cube arrays or
* compare value for SHADOWCUBE */
idx = lp_build_const_int32(gallivm, 3);
coords[3] = LLVMBuildExtractElement(builder,
emit_data->args[coord_arg], idx, "");
/* for cube arrays coord.z = coord.w(array_index) * 8 + face */
if (target == TGSI_TEXTURE_CUBE_ARRAY ||
target == TGSI_TEXTURE_SHADOWCUBE_ARRAY) {
/* coords_arg.w component - array_index for cube arrays or
* compare value for SHADOWCUBE */
coords[2] = lp_build_emit_llvm_ternary(bld_base, TGSI_OPCODE_MAD,
coords[3], lp_build_const_float(gallivm, 8.0), coords[2]);
coords_arg[3], lp_build_const_float(gallivm, 8.0), coords[2]);
}
/* for instructions that need additional src (compare/lod/bias),
@@ -598,12 +596,11 @@ void radeon_llvm_emit_prepare_cube_coords(
if (opcode == TGSI_OPCODE_TEX2 ||
opcode == TGSI_OPCODE_TXB2 ||
opcode == TGSI_OPCODE_TXL2) {
coords[3] = emit_data->args[coord_arg + 1];
coords[3] = coords_arg[4];
}
}
emit_data->args[coord_arg] =
lp_build_gather_values(bld_base->base.gallivm, coords, 4);
memcpy(coords_arg, coords, sizeof(coords));
}
static void txd_fetch_args(
@@ -645,9 +642,6 @@ static void txp_fetch_args(
TGSI_OPCODE_DIV, arg, src_w);
}
coords[3] = bld_base->base.one;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->arg_count = 1;
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
inst->Texture.Texture == TGSI_TEXTURE_CUBE_ARRAY ||
@@ -655,8 +649,12 @@ static void txp_fetch_args(
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE_ARRAY) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ_LZ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 0);
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
}
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->arg_count = 1;
}
static void tex_fetch_args(
@@ -673,17 +671,12 @@ static void tex_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
unsigned chan;
for (chan = 0; chan < 4; chan++) {
coords[chan] = lp_build_emit_fetch(bld_base, inst, 0, chan);
}
emit_data->arg_count = 1;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
if (inst->Instruction.Opcode == TGSI_OPCODE_TEX2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXB2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXL2) {
@@ -692,7 +685,7 @@ static void tex_fetch_args(
* That operand should be passed as a float value in the args array
* right after the coord vector. After packing it's not used anymore,
* that's why arg_count is not increased */
emit_data->args[1] = lp_build_emit_fetch(bld_base, inst, 1, 0);
coords[4] = lp_build_emit_fetch(bld_base, inst, 1, 0);
}
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
@@ -701,8 +694,13 @@ static void tex_fetch_args(
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE_ARRAY) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ_LZ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 0);
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
}
emit_data->arg_count = 1;
emit_data->args[0] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
}
static void txf_fetch_args(
@@ -768,6 +766,22 @@ static void emit_icmp(
emit_data->output[emit_data->chan] = v;
}
static void emit_ucmp(
const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
unsigned pred;
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
LLVMContextRef context = bld_base->base.gallivm->context;
LLVMValueRef v = LLVMBuildFCmp(builder, LLVMRealUGE,
emit_data->args[0], lp_build_const_float(bld_base->base.gallivm, 0.), "");
emit_data->output[emit_data->chan] = LLVMBuildSelect(builder, v, emit_data->args[2], emit_data->args[1], "");
}
static void emit_cmp(
const struct lp_build_tgsi_action *action,
struct lp_build_tgsi_context * bld_base,
@@ -1243,6 +1257,7 @@ void radeon_llvm_context_init(struct radeon_llvm_context * ctx)
bld_base->op_actions[TGSI_OPCODE_USNE].emit = emit_icmp;
bld_base->op_actions[TGSI_OPCODE_U2F].emit = emit_u2f;
bld_base->op_actions[TGSI_OPCODE_XOR].emit = emit_xor;
bld_base->op_actions[TGSI_OPCODE_UCMP].emit = emit_ucmp;
bld_base->rsq_action.emit = build_tgsi_intrinsic_nomem;
bld_base->rsq_action.intr_name = "llvm.AMDGPU.rsq";

View File

@@ -98,21 +98,6 @@ static void r600_blitter_end(struct pipe_context *ctx)
r600_context_queries_resume(rctx);
}
static unsigned u_max_layer(struct pipe_resource *r, unsigned level)
{
switch (r->target) {
case PIPE_TEXTURE_CUBE:
return 6 - 1;
case PIPE_TEXTURE_3D:
return u_minify(r->depth0, level) - 1;
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
return r->array_size - 1;
default:
return 0;
}
}
void si_blit_uncompress_depth(struct pipe_context *ctx,
struct r600_resource_texture *texture,
struct r600_resource_texture *staging,
@@ -432,12 +417,30 @@ static void r600_resource_copy_region(struct pipe_context *ctx,
r600_change_format(dst, dst_level, &orig_info[1],
PIPE_FORMAT_R8_UNORM);
break;
case 2:
r600_change_format(src, src_level, &orig_info[0],
PIPE_FORMAT_R8G8_UNORM);
r600_change_format(dst, dst_level, &orig_info[1],
PIPE_FORMAT_R8G8_UNORM);
break;
case 4:
r600_change_format(src, src_level, &orig_info[0],
PIPE_FORMAT_R8G8B8A8_UNORM);
r600_change_format(dst, dst_level, &orig_info[1],
PIPE_FORMAT_R8G8B8A8_UNORM);
break;
case 8:
r600_change_format(src, src_level, &orig_info[0],
PIPE_FORMAT_R16G16B16A16_UINT);
r600_change_format(dst, dst_level, &orig_info[1],
PIPE_FORMAT_R16G16B16A16_UINT);
break;
case 16:
r600_change_format(src, src_level, &orig_info[0],
PIPE_FORMAT_R32G32B32A32_UINT);
r600_change_format(dst, dst_level, &orig_info[1],
PIPE_FORMAT_R32G32B32A32_UINT);
break;
default:
fprintf(stderr, "Unhandled format %s with blocksize %u\n",
util_format_short_name(src->format), blocksize);

View File

@@ -55,11 +55,8 @@ static void r600_copy_from_staging_texture(struct pipe_context *ctx, struct r600
struct pipe_resource *texture = transfer->resource;
struct pipe_box sbox;
sbox.x = sbox.y = sbox.z = 0;
sbox.width = transfer->box.width;
sbox.height = transfer->box.height;
/* XXX that might be wrong */
sbox.depth = 1;
u_box_3d(0, 0, 0, transfer->box.width, transfer->box.height, transfer->box.depth, &sbox);
ctx->resource_copy_region(ctx, texture, transfer->level,
transfer->box.x, transfer->box.y, transfer->box.z,
rtransfer->staging,
@@ -153,8 +150,7 @@ static int r600_init_surface(struct r600_screen *rscreen,
surface->flags |= RADEON_SURF_SCANOUT;
}
if ((ptex->bind & PIPE_BIND_DEPTH_STENCIL) &&
!is_flushed_depth && is_depth) {
if (!is_flushed_depth && is_depth) {
surface->flags |= RADEON_SURF_ZBUFFER;
if (is_stencil) {
@@ -239,7 +235,6 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_resource_texture *rtex = (struct r600_resource_texture*)texture;
struct pipe_resource resource;
struct r600_transfer *trans;
boolean use_staging_texture = FALSE;
struct radeon_winsys_cs_handle *buf;
@@ -299,42 +294,52 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
level, level,
box->z, box->z + box->depth - 1);
trans->transfer.stride = staging_depth->surface.level[level].pitch_bytes;
trans->transfer.layer_stride = staging_depth->surface.level[level].slice_size;
trans->offset = r600_texture_get_offset(staging_depth, level, box->z);
trans->staging = &staging_depth->resource.b.b;
} else if (use_staging_texture) {
resource.target = PIPE_TEXTURE_2D;
struct pipe_resource resource;
struct r600_resource_texture *staging;
memset(&resource, 0, sizeof(resource));
resource.format = texture->format;
resource.width0 = box->width;
resource.height0 = box->height;
resource.depth0 = 1;
resource.array_size = 1;
resource.last_level = 0;
resource.nr_samples = 0;
resource.usage = PIPE_USAGE_STAGING;
resource.bind = 0;
resource.flags = R600_RESOURCE_FLAG_TRANSFER;
/* For texture reading, the temporary (detiled) texture is used as
* a render target when blitting from a tiled texture. */
if (usage & PIPE_TRANSFER_READ) {
resource.bind |= PIPE_BIND_RENDER_TARGET;
}
/* For texture writing, the temporary texture is used as a sampler
* when blitting into a tiled texture. */
if (usage & PIPE_TRANSFER_WRITE) {
resource.bind |= PIPE_BIND_SAMPLER_VIEW;
/* We must set the correct texture target and dimensions if needed for a 3D transfer. */
if (box->depth > 1 && u_max_layer(texture, level) > 0)
resource.target = texture->target;
else
resource.target = PIPE_TEXTURE_2D;
switch (resource.target) {
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
case PIPE_TEXTURE_CUBE_ARRAY:
resource.array_size = box->depth;
break;
case PIPE_TEXTURE_3D:
resource.depth0 = box->depth;
break;
default:;
}
/* Create the temporary texture. */
trans->staging = ctx->screen->resource_create(ctx->screen, &resource);
if (trans->staging == NULL) {
staging = (struct r600_resource_texture*)ctx->screen->resource_create(ctx->screen, &resource);
if (staging == NULL) {
R600_ERR("failed to create temporary texture to hold untiled copy\n");
pipe_resource_reference(&trans->transfer.resource, NULL);
FREE(trans);
return NULL;
}
trans->transfer.stride = ((struct r600_resource_texture *)trans->staging)
->surface.level[0].pitch_bytes;
trans->staging = &staging->resource.b.b;
trans->transfer.stride = staging->surface.level[0].pitch_bytes;
trans->transfer.layer_stride = staging->surface.level[0].slice_size;
if (usage & PIPE_TRANSFER_READ) {
r600_copy_to_staging_texture(ctx, trans);
/* Always referenced in the blit. */
@@ -349,7 +354,7 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
if (trans->staging) {
buf = si_resource(trans->staging)->cs_buf;
} else {
buf = si_resource(trans->transfer.resource)->cs_buf;
buf = rtex->resource.cs_buf;
}
if (rtex->is_depth || !trans->staging)
@@ -549,6 +554,8 @@ static struct pipe_surface *r600_create_surface(struct pipe_context *pipe,
struct r600_surface *surface = CALLOC_STRUCT(r600_surface);
unsigned level = surf_tmpl->u.tex.level;
assert(surf_tmpl->u.tex.first_layer <= u_max_layer(texture, surf_tmpl->u.tex.level));
assert(surf_tmpl->u.tex.last_layer <= u_max_layer(texture, surf_tmpl->u.tex.level));
assert(surf_tmpl->u.tex.first_layer == surf_tmpl->u.tex.last_layer);
if (surface == NULL)
return NULL;

View File

@@ -280,6 +280,7 @@ static const char *r600_get_family_name(enum radeon_family family)
case CHIP_TAHITI: return "AMD TAHITI";
case CHIP_PITCAIRN: return "AMD PITCAIRN";
case CHIP_VERDE: return "AMD CAPE VERDE";
case CHIP_OLAND: return "AMD OLAND";
default: return "AMD unknown";
}
}
@@ -379,7 +380,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS:
return 15;
case PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS:
return /*rscreen->info.drm_minor >= 9 ? 16384 :*/ 0;
return 16384;
case PIPE_CAP_MAX_COMBINED_SAMPLERS:
return 32;
@@ -458,7 +459,7 @@ static int r600_get_shader_param(struct pipe_screen* pscreen, unsigned shader, e
/* FIXME Isn't this equal to TEMPS? */
return 1; /* Max native address registers */
case PIPE_SHADER_CAP_MAX_CONSTS:
return 64;
return 4096; /* actually only memory limits this */
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
return 1;
case PIPE_SHADER_CAP_MAX_PREDS:

View File

@@ -277,4 +277,20 @@ static INLINE uint64_t r600_resource_va(struct pipe_screen *screen, struct pipe_
return rscreen->ws->buffer_get_virtual_address(rresource->cs_buf);
}
static INLINE unsigned u_max_layer(struct pipe_resource *r, unsigned level)
{
switch (r->target) {
case PIPE_TEXTURE_CUBE:
return 6 - 1;
case PIPE_TEXTURE_3D:
return u_minify(r->depth0, level) - 1;
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
case PIPE_TEXTURE_CUBE_ARRAY:
return r->array_size - 1;
default:
return 0;
}
}
#endif

View File

@@ -263,6 +263,14 @@ static void declare_input_fs(
build_intrinsic(base->gallivm->builder,
"llvm.SI.fs.read.pos", input_type,
args, 1, LLVMReadNoneAttribute);
if (chan == 3)
/* RCP for fragcoord.w */
si_shader_ctx->radeon_bld.inputs[soa_index] =
LLVMBuildFDiv(gallivm->builder,
lp_build_const_float(gallivm, 1.0f),
si_shader_ctx->radeon_bld.inputs[soa_index],
"");
}
return;
}
@@ -301,14 +309,8 @@ static void declare_input_fs(
/* XXX: Handle all possible interpolation modes */
switch (decl->Interp.Interpolate) {
case TGSI_INTERPOLATE_COLOR:
/* XXX: Flat shading hangs the GPU */
if (si_shader_ctx->rctx->queued.named.rasterizer &&
si_shader_ctx->rctx->queued.named.rasterizer->flatshade) {
#if 0
if (si_shader_ctx->key.flatshade) {
intr_name = "llvm.SI.fs.interp.constant";
#else
intr_name = "llvm.SI.fs.interp.linear.center";
#endif
} else {
if (decl->Interp.Centroid)
intr_name = "llvm.SI.fs.interp.persp.centroid";
@@ -317,11 +319,8 @@ static void declare_input_fs(
}
break;
case TGSI_INTERPOLATE_CONSTANT:
/* XXX: Flat shading hangs the GPU */
#if 0
intr_name = "llvm.SI.fs.interp.constant";
break;
#endif
case TGSI_INTERPOLATE_LINEAR:
if (decl->Interp.Centroid)
intr_name = "llvm.SI.fs.interp.linear.centroid";
@@ -433,6 +432,15 @@ static LLVMValueRef fetch_constant(
LLVMValueRef offset;
LLVMValueRef load;
if (swizzle == LP_CHAN_ALL) {
unsigned chan;
LLVMValueRef values[4];
for (chan = 0; chan < TGSI_NUM_CHANNELS; ++chan)
values[chan] = fetch_constant(bld_base, reg, type, chan);
return lp_build_gather_values(bld_base->base.gallivm, values, 4);
}
/* currently not supported */
if (reg->Register.Indirect) {
assert(0);
@@ -446,12 +454,6 @@ static LLVMValueRef fetch_constant(
* CONST[0].x will have an offset of 0 and CONST[1].x will have an
* offset of 4. */
idx = (reg->Register.Index * 4) + swizzle;
/* index loads above 255 are currently not supported */
if (idx > 255) {
assert(0);
idx = 0;
}
offset = lp_build_const_int32(base->gallivm, idx);
load = build_indexed_load(base->gallivm, const_ptr, offset);
@@ -612,6 +614,12 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
int i;
tgsi_parse_token(parse);
if (parse->FullToken.Token.Type == TGSI_TOKEN_TYPE_PROPERTY &&
parse->FullToken.FullProperty.Property.PropertyName ==
TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS)
shader->fs_write_all = TRUE;
if (parse->FullToken.Token.Type != TGSI_TOKEN_TYPE_DECLARATION)
continue;
@@ -775,6 +783,29 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
last_args[1] = lp_build_const_int32(base->gallivm,
si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT);
if (shader->fs_write_all && shader->nr_cbufs > 1) {
int i;
/* Specify that this is not yet the last export */
last_args[2] = lp_build_const_int32(base->gallivm, 0);
for (i = 1; i < shader->nr_cbufs; i++) {
/* Specify the target we are exporting */
last_args[3] = lp_build_const_int32(base->gallivm,
V_008DFC_SQ_EXP_MRT + i);
lp_build_intrinsic(base->gallivm->builder,
"llvm.SI.export",
LLVMVoidTypeInContext(base->gallivm->context),
last_args, 9);
si_shader_ctx->shader->spi_shader_col_format |=
si_shader_ctx->shader->spi_shader_col_format << 4;
}
last_args[3] = lp_build_const_int32(base->gallivm, V_008DFC_SQ_EXP_MRT);
}
/* Specify that this is the last export */
last_args[2] = lp_build_const_int32(base->gallivm, 1);
@@ -791,54 +822,127 @@ static void tex_fetch_args(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
struct gallivm_state *gallivm = bld_base->base.gallivm;
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned opcode = inst->Instruction.Opcode;
unsigned target = inst->Texture.Texture;
LLVMValueRef ptr;
LLVMValueRef offset;
LLVMValueRef coords[4];
LLVMValueRef address[16];
unsigned count = 0;
unsigned chan;
/* WriteMask */
/* XXX: should be optimized using emit_data->inst->Dst[0].Register.WriteMask*/
emit_data->args[0] = lp_build_const_int32(bld_base->base.gallivm, 0xf);
/* Coordinates */
/* XXX: Not all sample instructions need 4 address arguments. */
if (inst->Instruction.Opcode == TGSI_OPCODE_TXP) {
LLVMValueRef src_w;
unsigned chan;
LLVMValueRef coords[4];
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
src_w = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);
for (chan = 0; chan < 3; chan++ ) {
LLVMValueRef arg = lp_build_emit_fetch(bld_base,
emit_data->inst, 0, chan);
/* Fetch and project texture coordinates */
coords[3] = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);
for (chan = 0; chan < 3; chan++ ) {
coords[chan] = lp_build_emit_fetch(bld_base,
emit_data->inst, 0,
chan);
if (opcode == TGSI_OPCODE_TXP)
coords[chan] = lp_build_emit_llvm_binary(bld_base,
TGSI_OPCODE_DIV,
arg, src_w);
}
coords[chan],
coords[3]);
}
if (opcode == TGSI_OPCODE_TXP)
coords[3] = bld_base->base.one;
emit_data->args[1] = lp_build_gather_values(bld_base->base.gallivm,
coords, 4);
} else
emit_data->args[1] = lp_build_emit_fetch(bld_base, emit_data->inst,
0, LP_CHAN_ALL);
if (inst->Instruction.Opcode == TGSI_OPCODE_TEX2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXB2 ||
inst->Instruction.Opcode == TGSI_OPCODE_TXL2) {
/* These instructions have additional operand that should be packed
* into the cube coord vector by radeon_llvm_emit_prepare_cube_coords.
* That operand should be passed as a float value in the args array
* right after the coord vector. After packing it's not used anymore,
* that's why arg_count is not increased */
emit_data->args[2] = lp_build_emit_fetch(bld_base, inst, 1, 0);
/* Pack LOD bias value */
if (opcode == TGSI_OPCODE_TXB)
address[count++] = coords[3];
if ((target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE) &&
opcode != TGSI_OPCODE_TXQ)
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
/* Pack depth comparison value */
switch (target) {
case TGSI_TEXTURE_SHADOW1D:
case TGSI_TEXTURE_SHADOW1D_ARRAY:
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
address[count++] = coords[2];
break;
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
address[count++] = coords[3];
break;
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = lp_build_emit_fetch(bld_base, inst, 1, 0);
}
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE) &&
inst->Instruction.Opcode != TGSI_OPCODE_TXQ) {
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, 1);
/* Pack texture coordinates */
address[count++] = coords[0];
switch (target) {
case TGSI_TEXTURE_2D:
case TGSI_TEXTURE_2D_ARRAY:
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
case TGSI_TEXTURE_RECT:
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_2D_MSAA:
case TGSI_TEXTURE_2D_ARRAY_MSAA:
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[1];
}
switch (target) {
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[2];
}
/* Pack array slice */
switch (target) {
case TGSI_TEXTURE_1D_ARRAY:
address[count++] = coords[1];
}
switch (target) {
case TGSI_TEXTURE_2D_ARRAY:
case TGSI_TEXTURE_2D_ARRAY_MSAA:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
address[count++] = coords[2];
}
switch (target) {
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_SHADOW1D_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = coords[3];
}
/* Pack LOD */
if (opcode == TGSI_OPCODE_TXL)
address[count++] = coords[3];
if (count > 16) {
assert(!"Cannot handle more than 16 texture address parameters");
count = 16;
}
for (chan = 0; chan < count; chan++ ) {
address[chan] = LLVMBuildBitCast(gallivm->builder,
address[chan],
LLVMInt32TypeInContext(gallivm->context),
"");
}
/* Pad to power of two vector */
while (count < util_next_power_of_two(count))
address[count++] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
emit_data->args[1] = lp_build_gather_values(gallivm, address, count);
/* Resource */
ptr = use_sgpr(bld_base->base.gallivm, SGPR_CONST_PTR_V8I32, SI_SGPR_RESOURCE);
@@ -855,8 +959,7 @@ static void tex_fetch_args(
ptr, offset);
/* Dimensions */
emit_data->args[4] = lp_build_const_int32(bld_base->base.gallivm,
emit_data->inst->Texture.Texture);
emit_data->args[4] = lp_build_const_int32(bld_base->base.gallivm, target);
emit_data->arg_count = 5;
/* XXX: To optimize, we could use a float or v2f32, if the last bits of
@@ -866,22 +969,37 @@ static void tex_fetch_args(
4);
}
static void build_tex_intrinsic(const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
struct lp_build_context * base = &bld_base->base;
char intr_name[23];
sprintf(intr_name, "%sv%ui32", action->intr_name,
LLVMGetVectorSize(LLVMTypeOf(emit_data->args[1])));
emit_data->output[emit_data->chan] = lp_build_intrinsic(
base->gallivm->builder, intr_name, emit_data->dst_type,
emit_data->args, emit_data->arg_count);
}
static const struct lp_build_tgsi_action tex_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.sample."
};
static const struct lp_build_tgsi_action txb_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample.bias"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.sampleb."
};
static const struct lp_build_tgsi_action txl_action = {
.fetch_args = tex_fetch_args,
.emit = lp_build_tgsi_intrinsic,
.intr_name = "llvm.SI.sample.lod"
.emit = build_tex_intrinsic,
.intr_name = "llvm.SI.samplel."
};

View File

@@ -82,6 +82,7 @@ struct si_shader_key {
unsigned nr_cbufs:4;
unsigned color_two_side:1;
unsigned alpha_func:3;
unsigned flatshade:1;
float alpha_ref;
};

View File

@@ -314,6 +314,8 @@ static void si_update_fb_rs_state(struct r600_context *rctx)
offset_units = rctx->queued.named.rasterizer->offset_units;
switch (rctx->framebuffer.zsbuf->texture->format) {
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
depth = -24;
@@ -419,8 +421,7 @@ static void *si_create_rs_state(struct pipe_context *ctx,
rs->offset_units = state->offset_units;
rs->offset_scale = state->offset_scale * 12.0f;
/* XXX: Flat shading hangs the GPU */
tmp = S_0286D4_FLAT_SHADE_ENA(0);
tmp = S_0286D4_FLAT_SHADE_ENA(1);
if (state->sprite_coord_enable) {
tmp |= S_0286D4_PNT_SPRITE_ENA(1) |
S_0286D4_PNT_SPRITE_OVRD_X(V_0286D4_SPI_PNT_SPRITE_SEL_S) |
@@ -720,7 +721,6 @@ static uint32_t si_translate_colorformat(enum pipe_format format)
case PIPE_FORMAT_L8A8_SNORM:
case PIPE_FORMAT_L8A8_UINT:
case PIPE_FORMAT_L8A8_SINT:
case PIPE_FORMAT_L8A8_SRGB:
case PIPE_FORMAT_R8G8_SNORM:
case PIPE_FORMAT_R8G8_UNORM:
case PIPE_FORMAT_R8G8_UINT:
@@ -775,6 +775,7 @@ static uint32_t si_translate_colorformat(enum pipe_format format)
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return V_028C70_COLOR_8_24;
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
return V_028C70_COLOR_24_8;
@@ -804,15 +805,12 @@ static uint32_t si_translate_colorformat(enum pipe_format format)
return V_028C70_COLOR_10_11_11;
/* 64-bit buffers. */
case PIPE_FORMAT_R16G16B16_USCALED:
case PIPE_FORMAT_R16G16B16_SSCALED:
case PIPE_FORMAT_R16G16B16A16_UINT:
case PIPE_FORMAT_R16G16B16A16_SINT:
case PIPE_FORMAT_R16G16B16A16_USCALED:
case PIPE_FORMAT_R16G16B16A16_SSCALED:
case PIPE_FORMAT_R16G16B16A16_UNORM:
case PIPE_FORMAT_R16G16B16A16_SNORM:
case PIPE_FORMAT_R16G16B16_FLOAT:
case PIPE_FORMAT_R16G16B16A16_FLOAT:
return V_028C70_COLOR_16_16_16_16;
@@ -898,7 +896,6 @@ static uint32_t si_translate_colorswap(enum pipe_format format)
case PIPE_FORMAT_L8A8_SNORM:
case PIPE_FORMAT_L8A8_UINT:
case PIPE_FORMAT_L8A8_SINT:
case PIPE_FORMAT_L8A8_SRGB:
return V_028C70_SWAP_ALT;
case PIPE_FORMAT_R8G8_SNORM:
case PIPE_FORMAT_R8G8_UNORM:
@@ -955,9 +952,10 @@ static uint32_t si_translate_colorswap(enum pipe_format format)
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return V_028C70_SWAP_STD;
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
return V_028C70_SWAP_STD;
return V_028C70_SWAP_STD_REV;
case PIPE_FORMAT_R10G10B10A2_UNORM:
case PIPE_FORMAT_R10G10B10X2_SNORM:
@@ -1119,9 +1117,11 @@ static uint32_t si_translate_dbformat(enum pipe_format format)
switch (format) {
case PIPE_FORMAT_Z16_UNORM:
return V_028040_Z_16;
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return V_028040_Z_24; /* XXX no longer supported on SI */
return V_028040_Z_24; /* deprecated on SI */
case PIPE_FORMAT_Z32_FLOAT:
case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
return V_028040_Z_32_FLOAT;
@@ -1154,14 +1154,14 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen,
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return V_008F14_IMG_DATA_FORMAT_8_24;
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
return V_008F14_IMG_DATA_FORMAT_24_8;
case PIPE_FORMAT_X32_S8X24_UINT:
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_S8_UINT:
return V_008F14_IMG_DATA_FORMAT_8;
case PIPE_FORMAT_Z32_FLOAT:
return V_008F14_IMG_DATA_FORMAT_32;
case PIPE_FORMAT_X32_S8X24_UINT:
case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
return V_008F14_IMG_DATA_FORMAT_X24_8_32;
default:
@@ -1172,6 +1172,8 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen,
goto out_unknown; /* TODO */
case UTIL_FORMAT_COLORSPACE_SRGB:
if (desc->nr_channels != 4 && desc->nr_channels != 1)
goto out_unknown;
break;
default:
@@ -1523,6 +1525,8 @@ static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned
switch (rtex->real_format) {
case PIPE_FORMAT_Z16_UNORM:
return 5;
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
case PIPE_FORMAT_Z32_FLOAT:
@@ -1586,7 +1590,7 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
struct r600_surface *surf;
unsigned level = state->cbufs[cb]->u.tex.level;
unsigned pitch, slice;
unsigned color_info;
unsigned color_info, color_attrib;
unsigned tile_mode_index;
unsigned format, swap, ntype, endian;
uint64_t offset;
@@ -1624,15 +1628,19 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB)
ntype = V_028C70_NUMBER_SRGB;
else if (desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED) {
if (desc->channel[i].normalized)
ntype = V_028C70_NUMBER_SNORM;
else if (desc->channel[i].pure_integer)
if (desc->channel[i].pure_integer) {
ntype = V_028C70_NUMBER_SINT;
} else {
assert(desc->channel[i].normalized);
ntype = V_028C70_NUMBER_SNORM;
}
} else if (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED) {
if (desc->channel[i].normalized)
ntype = V_028C70_NUMBER_UNORM;
else if (desc->channel[i].pure_integer)
if (desc->channel[i].pure_integer) {
ntype = V_028C70_NUMBER_UINT;
} else {
assert(desc->channel[i].normalized);
ntype = V_028C70_NUMBER_UNORM;
}
}
}
@@ -1670,6 +1678,9 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
S_028C70_NUMBER_TYPE(ntype) |
S_028C70_ENDIAN(endian);
color_attrib = S_028C74_TILE_MODE_INDEX(tile_mode_index) |
S_028C74_FORCE_DST_ALPHA_1(desc->swizzle[3] == UTIL_FORMAT_SWIZZLE_1);
offset += r600_resource_va(rctx->context.screen, state->cbufs[cb]->texture);
offset >>= 8;
@@ -1687,8 +1698,7 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
S_028C6C_SLICE_MAX(state->cbufs[cb]->u.tex.last_layer));
}
si_pm4_set_reg(pm4, R_028C70_CB_COLOR0_INFO + cb * 0x3C, color_info);
si_pm4_set_reg(pm4, R_028C74_CB_COLOR0_ATTRIB + cb * 0x3C,
S_028C74_TILE_MODE_INDEX(tile_mode_index));
si_pm4_set_reg(pm4, R_028C74_CB_COLOR0_ATTRIB + cb * 0x3C, color_attrib);
/* Determine pixel shader export format */
max_comp_size = si_colorformat_max_comp_size(format);
@@ -1848,7 +1858,7 @@ static INLINE struct si_shader_key si_shader_selector_key(struct pipe_context *c
key.export_16bpc = rctx->export_16bpc;
if (rctx->queued.named.rasterizer) {
key.color_two_side = rctx->queued.named.rasterizer->two_side;
/*key.flatshade = rctx->queued.named.rasterizer->flatshade;*/
key.flatshade = rctx->queued.named.rasterizer->flatshade;
}
if (rctx->queued.named.dsa) {
key.alpha_func = rctx->queued.named.dsa->alpha_func;
@@ -2053,6 +2063,7 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
unsigned char state_swizzle[4], swizzle[4];
unsigned height, depth, width;
enum pipe_format pipe_format = state->format;
struct radeon_surface_level *surflevel;
int first_non_void;
uint64_t va;
@@ -2072,37 +2083,84 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
state_swizzle[2] = state->swizzle_b;
state_swizzle[3] = state->swizzle_a;
surflevel = tmp->surface.level;
/* Texturing with separate depth and stencil. */
if (tmp->is_depth && !tmp->is_flushing_texture) {
switch (pipe_format) {
case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
pipe_format = PIPE_FORMAT_Z32_FLOAT;
break;
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
/* Z24 is always stored like this. */
pipe_format = PIPE_FORMAT_Z24X8_UNORM;
break;
case PIPE_FORMAT_X24S8_UINT:
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_X32_S8X24_UINT:
pipe_format = PIPE_FORMAT_S8_UINT;
surflevel = tmp->surface.stencil_level;
break;
default:;
}
}
desc = util_format_description(pipe_format);
util_format_compose_swizzles(desc->swizzle, state_swizzle, swizzle);
if (desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
const unsigned char swizzle_xxxx[4] = {0, 0, 0, 0};
const unsigned char swizzle_yyyy[4] = {1, 1, 1, 1};
switch (pipe_format) {
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
case PIPE_FORMAT_X24S8_UINT:
case PIPE_FORMAT_X32_S8X24_UINT:
case PIPE_FORMAT_X8Z24_UNORM:
util_format_compose_swizzles(swizzle_yyyy, state_swizzle, swizzle);
break;
default:
util_format_compose_swizzles(swizzle_xxxx, state_swizzle, swizzle);
}
} else {
util_format_compose_swizzles(desc->swizzle, state_swizzle, swizzle);
}
first_non_void = util_format_get_first_non_void_channel(pipe_format);
if (first_non_void < 0) {
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
} else switch (desc->channel[first_non_void].type) {
case UTIL_FORMAT_TYPE_FLOAT:
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
break;
case UTIL_FORMAT_TYPE_SIGNED:
num_format = V_008F14_IMG_NUM_FORMAT_SNORM;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
default:
switch (pipe_format) {
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
break;
default:
if (first_non_void < 0) {
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
} else if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB) {
num_format = V_008F14_IMG_NUM_FORMAT_SRGB;
} else {
num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
switch (desc->channel[first_non_void].type) {
case UTIL_FORMAT_TYPE_FLOAT:
num_format = V_008F14_IMG_NUM_FORMAT_FLOAT;
break;
case UTIL_FORMAT_TYPE_SIGNED:
if (desc->channel[first_non_void].normalized)
num_format = V_008F14_IMG_NUM_FORMAT_SNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F14_IMG_NUM_FORMAT_SINT;
else
num_format = V_008F14_IMG_NUM_FORMAT_SSCALED;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
if (desc->channel[first_non_void].normalized)
num_format = V_008F14_IMG_NUM_FORMAT_UNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F14_IMG_NUM_FORMAT_UINT;
else
num_format = V_008F14_IMG_NUM_FORMAT_USCALED;
}
}
}
format = si_translate_texformat(ctx->screen, pipe_format, desc, first_non_void);
@@ -2115,10 +2173,10 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
/* not supported any more */
//endian = si_colorformat_endian_swap(format);
width = tmp->surface.level[0].npix_x;
height = tmp->surface.level[0].npix_y;
depth = tmp->surface.level[0].npix_z;
pitch = tmp->surface.level[0].nblk_x * util_format_get_blockwidth(pipe_format);
width = surflevel[0].npix_x;
height = surflevel[0].npix_y;
depth = surflevel[0].npix_z;
pitch = surflevel[0].nblk_x * util_format_get_blockwidth(pipe_format);
if (texture->target == PIPE_TEXTURE_1D_ARRAY) {
height = 1;
@@ -2128,7 +2186,7 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx
}
va = r600_resource_va(ctx->screen, texture);
va += tmp->surface.level[0].offset;
va += surflevel[0].offset;
view->state[0] = va >> 8;
view->state[1] = (S_008F14_BASE_ADDRESS_HI(va >> 40) |
S_008F14_DATA_FORMAT(format) |
@@ -2476,10 +2534,20 @@ static void *si_create_vertex_elements(struct pipe_context *ctx,
num_format = V_008F0C_BUF_NUM_FORMAT_USCALED; /* XXX */
break;
case UTIL_FORMAT_TYPE_SIGNED:
num_format = V_008F0C_BUF_NUM_FORMAT_SNORM;
if (desc->channel[first_non_void].normalized)
num_format = V_008F0C_BUF_NUM_FORMAT_SNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F0C_BUF_NUM_FORMAT_SINT;
else
num_format = V_008F0C_BUF_NUM_FORMAT_SSCALED;
break;
case UTIL_FORMAT_TYPE_UNSIGNED:
num_format = V_008F0C_BUF_NUM_FORMAT_UNORM;
if (desc->channel[first_non_void].normalized)
num_format = V_008F0C_BUF_NUM_FORMAT_UNORM;
else if (desc->channel[first_non_void].pure_integer)
num_format = V_008F0C_BUF_NUM_FORMAT_UINT;
else
num_format = V_008F0C_BUF_NUM_FORMAT_USCALED;
break;
case UTIL_FORMAT_TYPE_FLOAT:
default:
@@ -2665,9 +2733,14 @@ void si_init_config(struct r600_context *rctx)
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x2a00126a);
break;
case CHIP_VERDE:
default:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x0000124a);
break;
case CHIP_OLAND:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000082);
break;
default:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000000);
break;
}
si_pm4_set_state(rctx, init, pm4);

View File

@@ -128,11 +128,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
continue;
}
/* XXX: Flat shading hangs the GPU */
if (shader->shader.input[i].interpolate == TGSI_INTERPOLATE_CONSTANT ||
(shader->shader.input[i].interpolate == TGSI_INTERPOLATE_COLOR &&
rctx->queued.named.rasterizer->flatshade))
have_linear = TRUE;
if (shader->shader.input[i].interpolate == TGSI_INTERPOLATE_LINEAR)
have_linear = TRUE;
if (shader->shader.input[i].interpolate == TGSI_INTERPOLATE_PERSPECTIVE)
@@ -327,15 +322,12 @@ static void si_update_spi_map(struct r600_context *rctx)
bcolor:
tmp = 0;
#if 0
/* XXX: Flat shading hangs the GPU */
if (name == TGSI_SEMANTIC_POSITION ||
ps->input[i].interpolate == TGSI_INTERPOLATE_CONSTANT ||
(ps->input[i].interpolate == TGSI_INTERPOLATE_COLOR &&
rctx->rasterizer && rctx->rasterizer->flatshade)) {
rctx->ps_shader->current->key.flatshade)) {
tmp |= S_028644_FLAT_SHADE(1);
}
#endif
if (name == TGSI_SEMANTIC_GENERIC &&
rctx->sprite_coord_enable & (1 << ps->input[i].sid)) {
@@ -409,6 +401,11 @@ static void si_update_derived_state(struct r600_context *rctx)
}
if (si_pm4_state_changed(rctx, ps) || si_pm4_state_changed(rctx, vs)) {
/* XXX: Emitting the PS state even when only the VS changed
* fixes random failures with piglit glsl-max-varyings.
* Not sure why...
*/
rctx->emitted.named.ps = NULL;
si_update_spi_map(rctx);
}
}
@@ -453,8 +450,14 @@ static void si_vertex_buffer_update(struct r600_context *rctx)
si_pm4_sh_data_add(pm4, va & 0xFFFFFFFF);
si_pm4_sh_data_add(pm4, (S_008F04_BASE_ADDRESS_HI(va >> 32) |
S_008F04_STRIDE(vb->stride)));
si_pm4_sh_data_add(pm4, (vb->buffer->width0 - vb->buffer_offset) /
MAX2(vb->stride, 1));
if (vb->stride)
/* Round up by rounding down and adding 1 */
si_pm4_sh_data_add(pm4,
(vb->buffer->width0 - offset -
util_format_get_blocksize(ve->src_format)) /
vb->stride + 1);
else
si_pm4_sh_data_add(pm4, vb->buffer->width0 - offset);
si_pm4_sh_data_add(pm4, rctx->vertex_elements->rsrc_word3[i]);
if (!bound[ve->vertex_buffer_index]) {
@@ -524,10 +527,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
struct pipe_index_buffer ib = {};
uint32_t cp_coher_cntl;
if ((!info->count && (info->indexed || !info->count_from_stream_output)) ||
(info->indexed && !rctx->index_buffer.buffer)) {
if (!info->count && (info->indexed || !info->count_from_stream_output))
return;
}
if (!rctx->ps_shader || !rctx->vs_shader)
return;
@@ -538,13 +539,14 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
if (info->indexed) {
/* Initialize the index buffer struct. */
pipe_resource_reference(&ib.buffer, rctx->index_buffer.buffer);
ib.user_buffer = rctx->index_buffer.user_buffer;
ib.index_size = rctx->index_buffer.index_size;
ib.offset = rctx->index_buffer.offset + info->start * ib.index_size;
/* Translate or upload, if needed. */
r600_translate_index_buffer(rctx, &ib, info->count);
if (ib.user_buffer) {
if (ib.user_buffer && !ib.buffer) {
r600_upload_index_buffer(rctx, &ib, info->count);
}

View File

@@ -30,6 +30,7 @@ noinst_LTLIBRARIES = librbug.la
# preprocessor is determined by the ordering of the -I flags.
AM_CFLAGS = \
$(GALLIUM_CFLAGS) \
$(VISIBILITY_CFLAGS) \
-I$(top_srcdir)/src/gallium/drivers \
-I$(top_srcdir)/include

View File

@@ -2963,6 +2963,7 @@ sp_create_sampler_variant( const struct pipe_sampler_state *sampler,
case PIPE_TEX_MIPFILTER_LINEAR:
if (key.bits.is_pot &&
key.bits.target == PIPE_TEXTURE_2D &&
sampler->min_img_filter == sampler->mag_img_filter &&
sampler->normalized_coords &&
sampler->wrap_s == PIPE_TEX_WRAP_REPEAT &&

View File

@@ -29,6 +29,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/include \
$(GALLIUM_CFLAGS)
AM_CFLAGS = $(VISIBILITY_CFLAGS)
#On some systems -std= must be added to CFLAGS to be the last -std=
CFLAGS += -std=gnu99

View File

@@ -23,6 +23,7 @@
*
**********************************************************/
#include "util/u_format.h"
#include "util/u_inlines.h"
#include "util/u_memory.h"
#include "pipe/p_defines.h"
@@ -248,6 +249,16 @@ emit_rss(struct svga_context *svga, unsigned dirty)
EMIT_RS_FLOAT( svga, bias, DEPTHBIAS, fail );
}
if (dirty & SVGA_NEW_FRAME_BUFFER) {
/* XXX: we only look at the first color buffer's sRGB state */
float gamma = 1.0f;
if (svga->curr.framebuffer.cbufs[0] &&
util_format_is_srgb(svga->curr.framebuffer.cbufs[0]->format)) {
gamma = 2.2f;
}
EMIT_RS_FLOAT(svga, gamma, OUTPUTGAMMA, fail);
}
if (dirty & SVGA_NEW_RAST) {
/* bitmask of the enabled clip planes */
unsigned enabled = svga->curr.rast->templ.clip_plane_enable;

View File

@@ -1,7 +1,8 @@
include $(top_srcdir)/src/gallium/Automake.inc
AM_CFLAGS = \
$(GALLIUM_CFLAGS)
$(GALLIUM_CFLAGS) \
$(VISIBILITY_CFLAGS)
noinst_LTLIBRARIES = libtrace.la

View File

@@ -29,6 +29,8 @@
#define P_COMPILER_H
#include "c99_compat.h" /* inline, __func__, etc. */
#include "p_config.h"
#include <stdlib.h>
@@ -90,28 +92,7 @@ typedef unsigned char boolean;
#endif
#endif
/* Function inlining */
#ifndef inline
# ifdef __cplusplus
/* C++ supports inline keyword */
# elif defined(__GNUC__)
# define inline __inline__
# elif defined(_MSC_VER)
# define inline __inline
# elif defined(__ICL)
# define inline __inline
# elif defined(__INTEL_COMPILER)
/* Intel compiler supports inline keyword */
# elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
# define inline __inline
# elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
/* C99 supports inline keyword */
# elif (__STDC_VERSION__ >= 199901L)
/* C99 supports inline keyword */
# else
# define inline
# endif
#endif
/* XXX: Use standard `inline` keyword instead */
#ifndef INLINE
# define INLINE inline
#endif
@@ -127,26 +108,6 @@ typedef unsigned char boolean;
# endif
#endif
/*
* Define the C99 restrict keyword.
*
* See also:
* - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
*/
#ifndef restrict
# if (__STDC_VERSION__ >= 199901L)
/* C99 */
# elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
/* C99 */
# elif defined(__GNUC__)
# define restrict __restrict__
# elif defined(_MSC_VER)
# define restrict __restrict
# else
# define restrict /* */
# endif
#endif
/* Function visibility */
#ifndef PUBLIC
@@ -160,35 +121,10 @@ typedef unsigned char boolean;
#endif
/* The __FUNCTION__ gcc variable is generally only used for debugging.
* If we're not using gcc, define __FUNCTION__ as a cpp symbol here.
*/
/* XXX: Use standard `__func__` instead */
#ifndef __FUNCTION__
# if !defined(__GNUC__)
# if (__STDC_VERSION__ >= 199901L) /* C99 */ || \
(defined(__SUNPRO_C) && defined(__C99FEATURES__))
# define __FUNCTION__ __func__
# else
# define __FUNCTION__ "<unknown>"
# endif
# endif
# if defined(_MSC_VER) && _MSC_VER < 1300
# define __FUNCTION__ "<unknown>"
# endif
# define __FUNCTION__ __func__
#endif
#ifndef __func__
# if (__STDC_VERSION__ >= 199901L) || \
(defined(__SUNPRO_C) && defined(__C99FEATURES__))
/* __func__ is part of C99 */
# elif defined(_MSC_VER)
# if _MSC_VER >= 1300
# define __func__ __FUNCTION__
# else
# define __func__ "<unknown>"
# endif
# endif
#endif
/* This should match linux gcc cdecl semantics everywhere, so that we

Some files were not shown because too many files have changed in this diff Show More