Compare commits

...

47 Commits

Author SHA1 Message Date
José Fonseca
a2224cf594 mesa: Use IROUND instead of roundf.
roundf is not available on MSVC.

(cherry picked from commit bba8f10598)
2014-01-31 12:39:34 -08:00
Carl Worth
e714e26420 docs: Add md5sums for the 9.2.5 release. 2013-12-12 22:54:08 -08:00
Carl Worth
4636e87191 docs: Add release notes for 9.2.5 2013-12-12 22:50:24 -08:00
Carl Worth
a32330eb93 Bump version to 9.2.5
In preparation for the 9.2.5 release.
2013-12-12 22:27:06 -08:00
Chris Forbes
83f1f6d2ef i965/fs: Gen4-5: Implement alpha test in shader for MRT
V2: Add comment explaining what emit_alpha_test() is for;
    fix spurious temp and bogus whitespace.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit f7e15fcf56)
2013-12-12 16:00:58 -08:00
Chris Forbes
56b0e3271a i965/fs: Gen4-5: Setup discard masks for MRT alpha test
The same setup is required here as when the user-provided shader
explicitly uses KIL or discard.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit ca82ba90dd)
2013-12-12 16:00:25 -08:00
Chris Forbes
4f247c28e5 i965: Gen4-5: Include alpha func/ref in program key
V2: Better explanation of the rationale for doing this.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1080fc610e)
2013-12-12 15:59:49 -08:00
Chris Forbes
711b15ddb9 i965: Gen4-5: Don't enable hardware alpha test with MRT
We have to do this in the shader instead, since these gens lack an
independent RT0 alpha value in their render target write messages.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit dbcd633040)
2013-12-12 15:59:26 -08:00
Chí-Thanh Christopher Nguyễn
b625016802 st/xorg: Handle new DamageUnregister API which has only one argument
This fixes building against the new API in X server 1.15
Taken from xf86-video-modesetting beca4dfb0e4d11d3729214967a1fe56ee5669831 from Keith Packard
2013-12-12 15:56:19 -08:00
Ilia Mirkin
3d8f3eea52 nv50: report 15 max inputs for fragment programs
First off, nv50_program only has 16 in/out varyings. However reporting
16 makes 'm' become 68 in nv50_fp_linkage_validate with the
varying-packing-simple piglit test. (Subverting the assert makes it
compile but fail.) With this patch, varying-packing-simple passes.

See: https://bugs.freedesktop.org/show_bug.cgi?id=69155

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bad8871e52)
2013-12-12 15:45:47 -08:00
Dave Airlie
a6524c6d81 swrast: fix readback regression since inversion fix
This readback from the frontbuffer with swrast was broken, that bug
just made it more obviously broken, this fixes it by inverting the
sub image gets. Also fixes a few other piglits.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72327
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72325

(for 9.2 the patches this depends on were asked to be backported separately
 in an email).
Cc: "9.2" "10.0" mesa-stable@lists.fedoraproject.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

(cherry picked from commit 0b16042377)
2013-12-12 15:45:32 -08:00
Dave Airlie
a3b657a607 glx: don't fail out when no configs if we have visuals
GLX 1.2 servers with no SGIX_fbconfigs exist (some citrix thing),
and we fail glxinfo completely in those cases.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b01a3a9b72)
2013-12-12 15:45:20 -08:00
Dave Airlie
f5a8a37c73 mesa/swrast: fix inverted front buffer rendering with old-school swrast
I've no idea when this broke, but we have some people who wanted it fixed,
so here's my attempt.

reproducer, run readpix with swrast hit f, or run trival tri -sb things are
upside down, after this patch they aren't.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62142
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66213

Cc: <mesa-stable@lists.freedesktop.org>"
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a43b49dfb1)
2013-12-12 15:44:55 -08:00
Tom Stellard
f0dfdbfc79 r300/compiler/tests: Fix line length check in test parser
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9a5ce0c4c9)
2013-12-12 15:43:42 -08:00
Tom Stellard
961e2c8ada r300/compiler/tests: Fix segfault
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1896431f79)
2013-12-12 15:43:30 -08:00
Ian Romanick
a15aa108ea glsl: Don't emit empty declaration warning for a struct specifier
The intention is that things like

   int;

will generate a warning.  However, we were also accidentally emitting
the same warning for things like

  struct Foo { int x; };

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68838
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Aras Pranckevicius <aras@unity3d.com>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 758658850b)
2013-12-12 15:43:14 -08:00
Ilia Mirkin
767623bbf2 nv50: wait on the buf's fence before sticking it into pushbuf
This resolves some rendering issues in source games.
See https://bugs.freedesktop.org/show_bug.cgi?id=64323

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0e5bf85651)
2013-12-12 15:42:45 -08:00
Ilia Mirkin
94e7e35d6e nouveau: avoid leaking fences while waiting
This fixes a memory leak in some situations. Also avoids emitting an
extra fence if the kick handler does the call to nouveau_fence_next
itself.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ce6dd69697)
2013-12-12 15:42:23 -08:00
Ilia Mirkin
bd304ffc85 nv50: Fix GPU_READING/WRITING bit removal
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
CC: "9.1, 9.2, 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c45cf6199f)
2013-12-12 15:40:48 -08:00
Chad Versace
16c5f24141 i965: Add extra-alignment for non-msrt fast color clear for all hw (v2)
The BSpec states that the aligment for the non-msrt clear rectangle must
be doubled; the BSpec does not restricit the workaround to specific
hardware.

Commit 9a1a67b applied the workaround to Haswell GT3.  Commit 8b659ce
expanded the workaround to all Haswell variants. This commit expands it
to all hardware.

No Piglit regressions on Ivybridge 0x0166. No fixes either.

I know no Ivybridge nor Baytrail bug related to this workaround.
However, the BSpec says the extra alignment is required, so let's do it.

v2: Apply to all hardware, not just gen7.

CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org>
CC: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 998018d7be)
2013-12-12 15:40:24 -08:00
Chad Versace
b60eb2e174 i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs
Pre-patch, the workaround was applied to only HSW GT3. However, the
workaround also fixes render corruption on the HSW GT1 Chromebook,
codenamed Falco.

Also, update the BSpec quote that discusses the workaround to reflect
the latest BSpec.

The BSpec states that the workaround is required for Ivybridge and
Baytrail as well as Haswell. But, we apply the workaround to only
Haswell because (a) we suspect that is the only hardware where it is
actually required and (b) we haven't yet validated the workaround for
the other hardware.

CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org>
CC: Anuj Phogat <anuj.phogat@gmail.com>
OTC-Tracker: CHRMOS-812
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 8b659cef3a)
2013-12-12 15:39:45 -08:00
Carl Worth
ebe8e9b01f docs: Add md5sums for the 9.2.4 release
Just after tagging the 9.2.4 release and making the tar files.
2013-11-27 23:52:54 -08:00
Carl Worth
3e385d1bc3 docs: Add release notes for the 9.2.4 release. 2013-11-27 23:41:14 -08:00
Carl Worth
bd62562282 Bump version to 9.2.4
In preparation for the 9.2.4 release.
2013-11-27 23:37:50 -08:00
Paul Berry
6a022c8b03 glsl: Fix lowering of direct assignment in lower_clip_distance.
In commit 065da16 (glsl: Convert lower_clip_distance_visitor to be an
ir_rvalue_visitor), we failed to notice that since
lower_clip_distance_visitor overrides visit_leave(ir_assignment *),
ir_rvalue_visitor::visit_leave(ir_assignment *) wasn't getting called.
As a result, clip distance dereferences appearing directly on the
right hand side of an assignment (not in a subexpression) weren't
getting properly lowered.  This caused an ir_dereference_variable node
to be left in the IR that referred to the old gl_ClipDistance
variable.  However, since the lowering pass replaces gl_ClipDistance
with gl_ClipDistanceMESA, this turned into a dangling pointer when the
IR got reparented.

Prior to the introduction of geometry shaders, this bug was unlikely
to arise, because (a) reading from gl_ClipDistance[i] in the fragment
shader was rare, and (b) when it happened, it was likely that it would
either appear in a subexpression, or be hoisted into a subexpression
by tree grafting.

However, in a geometry shader, we're likely to see a statement like
this, which would trigger the bug:

    gl_ClipDistance[i] = gl_in[j].gl_ClipDistance[i];

This patch causes
lower_clip_distance_visitor::visit_leave(ir_assignment *) to call the
base class visitor, so that the right hand side of the assignment is
properly lowered.

Fixes piglit test:
- spec/glsl-1.50/execution/geometry/clip-distance-itemized-copy

Cc: Ian Romanick <idr@freedesktop.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 9dfcb05fa6)
2013-11-26 16:39:13 -08:00
Tapani Pälli
dfaf5de460 mesa: enable GL_TEXTURE_LOD_BIAS set/get
Earlier comments suggest this was removed from GL core spec but it is
still there. Enabling makes 'texture_lod_bias_getter' Khronos
conformance tests pass, also removes some errors from Metro Last Light
game which is using this API.

v2: leave NOTE comment (Ian)

Cc: "9.0 9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 7e61b44dcd)
2013-11-26 16:30:45 -08:00
Brian Paul
b9990d758b st/mesa: fix GL_FEEDBACK mode inverted Y coordinate bug
We need to check the drawbuffer's orientation before inverting Y
coordinates.  Fixes piglit feedback tests when running with the
-fbo option.

Cc: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 15d8e05e1e)
2013-11-26 16:30:07 -08:00
Paul Berry
ee5e7148e5 i965: Fix vertical alignment for multisampled buffers.
From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit
Size):

    j [vertical alignment] = 4 for any render target surface is
    multisampled (4x)

From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most
messages), under the "Surface Vertical Alignment" heading:

    This field is intended to be set to VALIGN_4 if the surface was
    rendered as a depth buffer, for a multisampled (4x) render target,
    or for a multisampled (8x) render target, since these surfaces
    support only alignment of 4.

Back in 2012 when we added multisampling support to the i965 driver,
we forgot to update the logic for computing the vertical alignment, so
we were often using a vertical alignment of 2 for multisampled
buffers, leading to subtle rendering errors.

Note that the specs also require a vertical alignment of 4 for all
Y-tiled render target surfaces; I plan to address that in a separate
patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077
Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b4c3b833ec)
2013-11-26 16:27:32 -08:00
Carl Worth
44831bff0f Merge branch '9.2-freedreno' of git://github.com/freedreno/mesa into 9.2
This allows the freedreno driver to build with current mesa.
2013-11-26 16:25:06 -08:00
Rob Clark
4fd03f26aa freedreno: updates for msm drm/kms driver
There where some small API tweaks in libdrm_freedreno to enable support
for msm drm/kms driver.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:02 -05:00
Rob Clark
4f0be333e7 freedreno/a3xx/compiler: handle sync flags better
We need to set the flag on all the .xyzw components that are written by
the instruction, not just on .x.  Otherwise a later use of rN.y (for
example) will not trigger the appropriate sync bit to be set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:02 -05:00
Rob Clark
f1998c8aa7 freedreno/a3xx/compiler: better const handling
Seems like most/all instructions have some restrictions about const src
registers.  In seems like the 2 src (cat2) instructions can take at most
one const, and the 3 src (cat3) instructions can take at most one const
in the first 2 arguments.  And so on.  Handle this properly now.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:01 -05:00
Rob Clark
0b2c5119cb freedreno/a3xx: don't leak so much
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:01 -05:00
Rob Clark
c20aa295ec freedreno/a3xx/compiler: fix SGT/SLT/etc
The cmps.f.* instruction doesn't actually seem to give a float 1.0 or
0.0 output.  It either needs a cov.u16f16 or add.s + sel.f16.  This
makes SGT/SLT/etc more similar to CMP, so handle them in trans_cmp().

This fixes a bunch of piglit tests.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:01 -05:00
Rob Clark
ca5514b851 freedreno/a3xx/compiler: bit of re-arrange/cleanup
It seems there are a number of cases where instructions have limitations
about taking reading src's from const register file, so make
get_unconst() a bit easier to use.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:01 -05:00
Rob Clark
c726a6a907 freedreno/a3xx/compiler: make compiler errors more useful
We probably should get rid of assert() entirely, but at this stage it is
more useful for things to crash where we can catch it in a debugger.
With compile_error() we have a single place to set an error flag (to
bail out and return an error on the next instruction) so that will be a
small change later when enough of the compiler bugs are sorted.

But re-arrange/cleanup the error/assert stuff so we at least get a dump
of the TGSI that triggered it.  So we see some useful output in piglit
logs.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:00 -05:00
Rob Clark
12da4c1a6a freedreno: fix segfault when no color buffer bound
Don't crash when no color buffer bound.  Something caught when starting
to run piglit, fixes a hanful of piglit tests.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:00 -05:00
Rob Clark
f3a7e28fe4 freedreno/a3xx/compiler: cat4 cannot use const reg as src
Category 4 instructions (rsq, rcp, sqrt, etc) seem to be unable to take
a const register as src.  In these cases we need to move the src to a
temporary gpr first.

This is the second case of such a restriction, where the instruction
encoding appears to support a const src, but in fact the hw appears to
ignore that bit.  So split things out into a helper that can be re-used
for any instructions which have this limitation.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:00 -05:00
Rob Clark
5394a872f3 freedreno/a3xx/compiler: use max_reg rather than file_count
Our current (rather naive) register assignment is based on mapping
different register files (INPUT, OUTPUT, TEMP, CONST, etc) based on the
max register index of the preceding file.  But in some cases, the lowest
used register in a file might not be zero.  In which case
file_count[file] != file_max[file] + 1.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:16:00 -05:00
Rob Clark
c833874386 freedreno/a3xx/compiler: handle saturate on dst
Sometimes things other than color dst need saturating, like if there is
a 'clamp(foo, 0.0, 1.0)'.  So for saturated dst add the extra
instructions to fix up dst.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:15:59 -05:00
Rob Clark
83e6532001 freedreno/a3xx/compiler: fix CMP
The 1st src to add.s needs (r) flag (repeat), otherwise it will end up:

  add.s dst.xyzw, tmp.xxxx -1

instead of:

  add.s dst.xyzw, tmp.xyzw, -1

Also, if we are using a temporary dst to avoid clobbering one of the src
registers, we actually need to use that as the dst for the sel
instruction.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:15:59 -05:00
Rob Clark
3da8868b5d freedreno/a3xx: some texture fixes
Stop hard coding bits that indicate texture type (2d/3d/cube/etc).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:15:59 -05:00
Rob Clark
e1e9f69d3c freedreno: update register headers
resync w/ rnndb database

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:15:58 -05:00
Rob Clark
8b167d34be freedreno: add debug option to disable scissor optimization
Useful for testing and debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:15:58 -05:00
Rob Clark
b2a32254d6 freedreno/a3xx: fix viewport on gmem->mem resolve
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 10:15:57 -05:00
Carl Worth
08d7c08512 docs: Add MD5 sums for the 9.2.3 release
We can do this now that the release tree has been tagged and tar files have
been generated.
2013-11-13 07:13:21 -08:00
Rob Clark
2d844be97f freedreno/a3xx: fix color inversion on mem->gmem restore
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-13 09:47:40 -05:00
51 changed files with 1727 additions and 426 deletions

View File

@@ -35,7 +35,7 @@ LOCAL_C_INCLUDES += \
# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)
LOCAL_CFLAGS += \
-DPACKAGE_VERSION=\"9.2.3\" \
-DPACKAGE_VERSION=\"9.2.5\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)

View File

@@ -70,7 +70,7 @@ if env['gles']:
# Environment setup
env.Append(CPPDEFINES = [
('PACKAGE_VERSION', '\\"9.2.3\\"'),
('PACKAGE_VERSION', '\\"9.2.5\\"'),
('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),
])

View File

@@ -6,7 +6,7 @@ dnl Tell the user about autoconf.html in the --help output
m4_divert_once([HELP_END], [
See docs/autoconf.html for more details on the options for Mesa.])
AC_INIT([Mesa], [9.2.3],
AC_INIT([Mesa], [9.2.5],
[https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa])
AC_CONFIG_AUX_DIR([bin])
AC_CONFIG_MACRO_DIR([m4])

View File

@@ -31,7 +31,9 @@ because GL_ARB_compatibility is not supported.
<h2>MD5 checksums</h2>
<pre>
TBA
66e9a33a414f801e1c33398bf627d56b MesaLib-9.2.3.tar.gz
f56b6beb556e4b9072814419f7c554e3 MesaLib-9.2.3.tar.bz2
ed852dab576faac237ac4298bf55d0a1 MesaLib-9.2.3.zip
</pre>

102
docs/relnotes/9.2.4.html Normal file
View File

@@ -0,0 +1,102 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.2.4 Release Notes / (November 27, 2013)</h1>
<p>
Mesa 9.2.4 is a bug fix release which fixes bugs found since the 9.2.3 release.
</p>
<p>
Mesa 9.2 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
28190b831b0271d69dbc44b2686eab1c MesaLib-9.2.4.tar.gz
e630c0a307cec4f0f70ddd029d2fe084 MesaLib-9.2.4.tar.bz2
8ef5e1e92e1d30fbedec31f716a7619e MesaLib-9.2.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53077">Bug 53077</a> - [IVB] Output error with msaa when both of framebuffer and source color's alpha are not 1</li>
<li>Fix freedreno to compile with recent libdrm.</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.2.3..mesa-9.2.4
</pre>
<p>Brian Paul (1):</p>
<ul>
<li>st/mesa: fix GL_FEEDBACK mode inverted Y coordinate bug</li>
</ul>
<p>Paul Berry (2):</p>
<ul>
<li>i965: Fix vertical alignment for multisampled buffers.</li>
<li>glsl: Fix lowering of direct assignment in lower_clip_distance.</li>
</ul>
<p>Rob Clark (17):</p>
<ul>
<li>freedreno/a3xx: fix color inversion on mem-&gt;gmem restore</li>
<li>freedreno/a3xx: fix viewport on gmem-&gt;mem resolve</li>
<li>freedreno: add debug option to disable scissor optimization</li>
<li>freedreno: update register headers</li>
<li>freedreno/a3xx: some texture fixes</li>
<li>freedreno/a3xx/compiler: fix CMP</li>
<li>freedreno/a3xx/compiler: handle saturate on dst</li>
<li>freedreno/a3xx/compiler: use max_reg rather than file_count</li>
<li>freedreno/a3xx/compiler: cat4 cannot use const reg as src</li>
<li>freedreno: fix segfault when no color buffer bound</li>
<li>freedreno/a3xx/compiler: make compiler errors more useful</li>
<li>freedreno/a3xx/compiler: bit of re-arrange/cleanup</li>
<li>freedreno/a3xx/compiler: fix SGT/SLT/etc</li>
<li>freedreno/a3xx: don't leak so much</li>
<li>freedreno/a3xx/compiler: better const handling</li>
<li>freedreno/a3xx/compiler: handle sync flags better</li>
<li>freedreno: updates for msm drm/kms driver</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: enable GL_TEXTURE_LOD_BIAS set/get</li>
</ul>
</div>
</body>
</html>

120
docs/relnotes/9.2.5.html Normal file
View File

@@ -0,0 +1,120 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.2.5 Release Notes / (December 12, 2013)</h1>
<p>
Mesa 9.2.5 is a bug fix release which fixes bugs found since the 9.2.4 release.
</p>
<p>
Mesa 9.2 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
9fb4de29ca1d9cfd03cbdefa123ba336 MesaLib-9.2.5.tar.bz2
1146c7c332767174f3de782b88d8e8ca MesaLib-9.2.5.tar.gz
a9a6c46dac7ea26fd272bf14894d95f3 MesaLib-9.2.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62142">Bug 62142</a> - Mesa/demo mipmap_limits upside down with running by SOFTWARE</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64323">Bug 64323</a> - Severe misrendering in Left 4 Dead 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66213">Bug 66213</a> - Certain Mesa Demos Rendering Inverted (vertically)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68838">Bug 68838</a> - GLSL: struct declarations produce a &quot;empty declaration warning&quot; in 9.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69155">Bug 69155</a> - [NV50 gallium] [piglit] bin/varying-packing-simple triggers memory corruption/failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72325">Bug 72325</a> - [swrast] piglit glean fbo regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72327">Bug 72327</a> - [swrast] piglit glean pointSprite regression</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.2.4..mesa-9.2.5
</pre>
<p>Chad Versace (2):</p>
<ul>
<li>i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs</li>
<li>i965: Add extra-alignment for non-msrt fast color clear for all hw (v2)</li>
</ul>
<p>Chris Forbes (4):</p>
<ul>
<li>i965: Gen4-5: Don't enable hardware alpha test with MRT</li>
<li>i965: Gen4-5: Include alpha func/ref in program key</li>
<li>i965/fs: Gen4-5: Setup discard masks for MRT alpha test</li>
<li>i965/fs: Gen4-5: Implement alpha test in shader for MRT</li>
</ul>
<p>Chí-Thanh Christopher Nguyễn (1):</p>
<ul>
<li>st/xorg: Handle new DamageUnregister API which has only one argument</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>mesa/swrast: fix inverted front buffer rendering with old-school swrast</li>
<li>glx: don't fail out when no configs if we have visuals</li>
<li>swrast: fix readback regression since inversion fix</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>glsl: Don't emit empty declaration warning for a struct specifier</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>nv50: Fix GPU_READING/WRITING bit removal</li>
<li>nouveau: avoid leaking fences while waiting</li>
<li>nv50: wait on the buf's fence before sticking it into pushbuf</li>
<li>nv50: report 15 max inputs for fragment programs</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>r300/compiler/tests: Fix segfault</li>
<li>r300/compiler/tests: Fix line length check in test parser</li>
</ul>
</div>
</body>
</html>

View File

@@ -8,10 +8,12 @@ http://0x04.net/cgit/index.cgi/rules-ng-ng
git clone git://0x04.net/rules-ng-ng
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/a2xx.xml ( 30127 bytes, from 2013-05-05 18:29:35)
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 327 bytes, from 2013-07-05 19:21:12)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_common.xml ( 3094 bytes, from 2013-05-05 18:29:22)
- /home/robclark/src/freedreno/envytools/rnndb/a2xx/a2xx.xml ( 30005 bytes, from 2013-07-19 21:30:48)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_common.xml ( 8983 bytes, from 2013-07-24 01:38:36)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_pm4.xml ( 9712 bytes, from 2013-05-26 15:22:37)
- /home/robclark/src/freedreno/envytools/rnndb/a3xx/a3xx.xml ( 51415 bytes, from 2013-08-03 14:26:05)
Copyright (C) 2013 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -236,56 +238,6 @@ enum sq_tex_filter {
#define REG_A2XX_CP_PFP_UCODE_DATA 0x000000c1
#define REG_A2XX_CP_RB_BASE 0x000001c0
#define REG_A2XX_CP_RB_CNTL 0x000001c1
#define REG_A2XX_CP_RB_RPTR_ADDR 0x000001c3
#define REG_A2XX_CP_RB_RPTR 0x000001c4
#define REG_A2XX_CP_RB_WPTR 0x000001c5
#define REG_A2XX_CP_RB_WPTR_DELAY 0x000001c6
#define REG_A2XX_CP_RB_RPTR_WR 0x000001c7
#define REG_A2XX_CP_RB_WPTR_BASE 0x000001c8
#define REG_A2XX_CP_QUEUE_THRESHOLDS 0x000001d5
#define REG_A2XX_SCRATCH_UMSK 0x000001dc
#define REG_A2XX_SCRATCH_ADDR 0x000001dd
#define REG_A2XX_CP_STATE_DEBUG_INDEX 0x000001ec
#define REG_A2XX_CP_STATE_DEBUG_DATA 0x000001ed
#define REG_A2XX_CP_INT_CNTL 0x000001f2
#define REG_A2XX_CP_INT_STATUS 0x000001f3
#define REG_A2XX_CP_INT_ACK 0x000001f4
#define REG_A2XX_CP_ME_CNTL 0x000001f6
#define REG_A2XX_CP_ME_STATUS 0x000001f7
#define REG_A2XX_CP_ME_RAM_WADDR 0x000001f8
#define REG_A2XX_CP_ME_RAM_RADDR 0x000001f9
#define REG_A2XX_CP_ME_RAM_DATA 0x000001fa
#define REG_A2XX_CP_DEBUG 0x000001fc
#define REG_A2XX_CP_CSQ_RB_STAT 0x000001fd
#define REG_A2XX_CP_CSQ_IB1_STAT 0x000001fe
#define REG_A2XX_CP_CSQ_IB2_STAT 0x000001ff
#define REG_A2XX_RBBM_PERFCOUNTER1_SELECT 0x00000395
#define REG_A2XX_RBBM_PERFCOUNTER1_LO 0x00000397
@@ -338,11 +290,32 @@ enum sq_tex_filter {
#define REG_A2XX_CP_STAT 0x0000047f
#define REG_A2XX_SCRATCH_REG0 0x00000578
#define REG_A2XX_SCRATCH_REG2 0x0000057a
#define REG_A2XX_RBBM_STATUS 0x000005d0
#define A2XX_RBBM_STATUS_CMDFIFO_AVAIL__MASK 0x0000001f
#define A2XX_RBBM_STATUS_CMDFIFO_AVAIL__SHIFT 0
static inline uint32_t A2XX_RBBM_STATUS_CMDFIFO_AVAIL(uint32_t val)
{
return ((val) << A2XX_RBBM_STATUS_CMDFIFO_AVAIL__SHIFT) & A2XX_RBBM_STATUS_CMDFIFO_AVAIL__MASK;
}
#define A2XX_RBBM_STATUS_TC_BUSY 0x00000020
#define A2XX_RBBM_STATUS_HIRQ_PENDING 0x00000100
#define A2XX_RBBM_STATUS_CPRQ_PENDING 0x00000200
#define A2XX_RBBM_STATUS_CFRQ_PENDING 0x00000400
#define A2XX_RBBM_STATUS_PFRQ_PENDING 0x00000800
#define A2XX_RBBM_STATUS_VGT_BUSY_NO_DMA 0x00001000
#define A2XX_RBBM_STATUS_RBBM_WU_BUSY 0x00004000
#define A2XX_RBBM_STATUS_CP_NRT_BUSY 0x00010000
#define A2XX_RBBM_STATUS_MH_BUSY 0x00040000
#define A2XX_RBBM_STATUS_MH_COHERENCY_BUSY 0x00080000
#define A2XX_RBBM_STATUS_SX_BUSY 0x00200000
#define A2XX_RBBM_STATUS_TPC_BUSY 0x00400000
#define A2XX_RBBM_STATUS_SC_CNTX_BUSY 0x01000000
#define A2XX_RBBM_STATUS_PA_BUSY 0x02000000
#define A2XX_RBBM_STATUS_VGT_BUSY 0x04000000
#define A2XX_RBBM_STATUS_SQ_CNTX17_BUSY 0x08000000
#define A2XX_RBBM_STATUS_SQ_CNTX0_BUSY 0x10000000
#define A2XX_RBBM_STATUS_RB_CNTX_BUSY 0x40000000
#define A2XX_RBBM_STATUS_GUI_ACTIVE 0x80000000
#define REG_A2XX_A220_VSC_BIN_SIZE 0x00000c01
#define A2XX_A220_VSC_BIN_SIZE_WIDTH__MASK 0x0000001f
@@ -358,13 +331,13 @@ static inline uint32_t A2XX_A220_VSC_BIN_SIZE_HEIGHT(uint32_t val)
return ((val >> 5) << A2XX_A220_VSC_BIN_SIZE_HEIGHT__SHIFT) & A2XX_A220_VSC_BIN_SIZE_HEIGHT__MASK;
}
#define REG_A2XX_VSC_PIPE(i0) (0x00000c06 + 0x3*(i0))
static inline uint32_t REG_A2XX_VSC_PIPE(uint32_t i0) { return 0x00000c06 + 0x3*i0; }
#define REG_A2XX_VSC_PIPE_CONFIG(i0) (0x00000c06 + 0x3*(i0))
static inline uint32_t REG_A2XX_VSC_PIPE_CONFIG(uint32_t i0) { return 0x00000c06 + 0x3*i0; }
#define REG_A2XX_VSC_PIPE_DATA_ADDRESS(i0) (0x00000c07 + 0x3*(i0))
static inline uint32_t REG_A2XX_VSC_PIPE_DATA_ADDRESS(uint32_t i0) { return 0x00000c07 + 0x3*i0; }
#define REG_A2XX_VSC_PIPE_DATA_LENGTH(i0) (0x00000c08 + 0x3*(i0))
static inline uint32_t REG_A2XX_VSC_PIPE_DATA_LENGTH(uint32_t i0) { return 0x00000c08 + 0x3*i0; }
#define REG_A2XX_PC_DEBUG_CNTL 0x00000c38

View File

@@ -137,7 +137,7 @@ emit_texture(struct fd_ringbuffer *ring, struct fd_context *ctx,
OUT_RING(ring, 0x00010000 + (0x6 * const_idx));
OUT_RING(ring, sampler->tex0 | view->tex0);
OUT_RELOC(ring, view->tex_resource->bo, 0, view->fmt);
OUT_RELOC(ring, view->tex_resource->bo, 0, view->fmt, 0);
OUT_RING(ring, view->tex2);
OUT_RING(ring, sampler->tex3 | view->tex3);
OUT_RING(ring, sampler->tex4);
@@ -171,7 +171,7 @@ fd2_emit_vertex_bufs(struct fd_ringbuffer *ring, uint32_t val,
OUT_RING(ring, (0x1 << 16) | (val & 0xffff));
for (i = 0; i < n; i++) {
struct fd_resource *rsc = fd_resource(vbufs[i].prsc);
OUT_RELOC(ring, rsc->bo, vbufs[i].offset, 3);
OUT_RELOC(ring, rsc->bo, vbufs[i].offset, 3, 0);
OUT_RING (ring, vbufs[i].size);
}
}

View File

@@ -70,7 +70,7 @@ emit_gmem2mem_surf(struct fd_ringbuffer *ring, uint32_t base,
OUT_PKT3(ring, CP_SET_CONSTANT, 5);
OUT_RING(ring, CP_REG(REG_A2XX_RB_COPY_CONTROL));
OUT_RING(ring, 0x00000000); /* RB_COPY_CONTROL */
OUT_RELOC(ring, rsc->bo, 0, 0); /* RB_COPY_DEST_BASE */
OUT_RELOCW(ring, rsc->bo, 0, 0, 0); /* RB_COPY_DEST_BASE */
OUT_RING(ring, rsc->pitch >> 5); /* RB_COPY_DEST_PITCH */
OUT_RING(ring, /* RB_COPY_DEST_INFO */
A2XX_RB_COPY_DEST_INFO_FORMAT(fd2_pipe2color(psurf->format)) |
@@ -199,7 +199,7 @@ emit_mem2gmem_surf(struct fd_ringbuffer *ring, uint32_t base,
A2XX_SQ_TEX_0_CLAMP_Z(SQ_TEX_WRAP) |
A2XX_SQ_TEX_0_PITCH(rsc->pitch));
OUT_RELOC(ring, rsc->bo, 0,
fd2_pipe2surface(psurf->format) | 0x800);
fd2_pipe2surface(psurf->format) | 0x800, 0);
OUT_RING(ring, A2XX_SQ_TEX_2_WIDTH(psurf->width - 1) |
A2XX_SQ_TEX_2_HEIGHT(psurf->height - 1));
OUT_RING(ring, 0x01000000 | // XXX
@@ -241,7 +241,7 @@ fd2_emit_tile_mem2gmem(struct fd_context *ctx, uint32_t xoff, uint32_t yoff,
y0 = ((float)yoff) / ((float)pfb->height);
y1 = ((float)yoff + bin_h) / ((float)pfb->height);
OUT_PKT3(ring, CP_MEM_WRITE, 9);
OUT_RELOC(ring, fd_resource(fd2_ctx->solid_vertexbuf)->bo, 0x60, 0);
OUT_RELOC(ring, fd_resource(fd2_ctx->solid_vertexbuf)->bo, 0x60, 0, 0);
OUT_RING(ring, fui(x0));
OUT_RING(ring, fui(y0));
OUT_RING(ring, fui(x1));
@@ -337,7 +337,7 @@ fd2_emit_tile_init(struct fd_context *ctx)
struct fd_ringbuffer *ring = ctx->ring;
struct pipe_framebuffer_state *pfb = &ctx->framebuffer;
struct fd_gmem_stateobj *gmem = &ctx->gmem;
enum pipe_format format = pfb->cbufs[0]->format;
enum pipe_format format = pipe_surface_format(pfb->cbufs[0]);
uint32_t reg;
OUT_PKT3(ring, CP_SET_CONSTANT, 4);
@@ -358,7 +358,7 @@ fd2_emit_tile_prep(struct fd_context *ctx, uint32_t xoff, uint32_t yoff,
{
struct fd_ringbuffer *ring = ctx->ring;
struct pipe_framebuffer_state *pfb = &ctx->framebuffer;
enum pipe_format format = pfb->cbufs[0]->format;
enum pipe_format format = pipe_surface_format(pfb->cbufs[0]);
OUT_PKT3(ring, CP_SET_CONSTANT, 2);
OUT_RING(ring, CP_REG(REG_A2XX_RB_COLOR_INFO));
@@ -379,7 +379,7 @@ fd2_emit_tile_renderprep(struct fd_context *ctx, uint32_t xoff, uint32_t yoff,
{
struct fd_ringbuffer *ring = ctx->ring;
struct pipe_framebuffer_state *pfb = &ctx->framebuffer;
enum pipe_format format = pfb->cbufs[0]->format;
enum pipe_format format = pipe_surface_format(pfb->cbufs[0]);
OUT_PKT3(ring, CP_SET_CONSTANT, 2);
OUT_RING(ring, CP_REG(REG_A2XX_RB_COLOR_INFO));

View File

@@ -8,10 +8,12 @@ http://0x04.net/cgit/index.cgi/rules-ng-ng
git clone git://0x04.net/rules-ng-ng
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/a3xx.xml ( 42578 bytes, from 2013-06-02 13:10:46)
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 327 bytes, from 2013-07-05 19:21:12)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_common.xml ( 3094 bytes, from 2013-05-05 18:29:22)
- /home/robclark/src/freedreno/envytools/rnndb/a2xx/a2xx.xml ( 30005 bytes, from 2013-07-19 21:30:48)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_common.xml ( 8983 bytes, from 2013-07-24 01:38:36)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_pm4.xml ( 9712 bytes, from 2013-05-26 15:22:37)
- /home/robclark/src/freedreno/envytools/rnndb/a3xx/a3xx.xml ( 51415 bytes, from 2013-08-03 14:26:05)
Copyright (C) 2013 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -130,6 +132,13 @@ enum a3xx_tex_fmt {
TFMT_NORM_USHORT_5551 = 6,
TFMT_NORM_USHORT_4444 = 7,
TFMT_NORM_UINT_X8Z24 = 10,
TFMT_NORM_UINT_NV12_UV_TILED = 17,
TFMT_NORM_UINT_NV12_Y_TILED = 19,
TFMT_NORM_UINT_NV12_UV = 21,
TFMT_NORM_UINT_NV12_Y = 23,
TFMT_NORM_UINT_I420_Y = 24,
TFMT_NORM_UINT_I420_U = 26,
TFMT_NORM_UINT_I420_V = 27,
TFMT_NORM_UINT_2_10_10_10 = 41,
TFMT_NORM_UINT_A8 = 44,
TFMT_NORM_UINT_L8_A8 = 47,
@@ -207,6 +216,37 @@ enum a3xx_tex_swiz {
A3XX_TEX_ONE = 5,
};
enum a3xx_tex_type {
A3XX_TEX_1D = 0,
A3XX_TEX_2D = 1,
A3XX_TEX_CUBE = 2,
A3XX_TEX_3D = 3,
};
#define A3XX_INT0_RBBM_GPU_IDLE 0x00000001
#define A3XX_INT0_RBBM_AHB_ERROR 0x00000002
#define A3XX_INT0_RBBM_REG_TIMEOUT 0x00000004
#define A3XX_INT0_RBBM_ME_MS_TIMEOUT 0x00000008
#define A3XX_INT0_RBBM_PFP_MS_TIMEOUT 0x00000010
#define A3XX_INT0_RBBM_ATB_BUS_OVERFLOW 0x00000020
#define A3XX_INT0_VFD_ERROR 0x00000040
#define A3XX_INT0_CP_SW_INT 0x00000080
#define A3XX_INT0_CP_T0_PACKET_IN_IB 0x00000100
#define A3XX_INT0_CP_OPCODE_ERROR 0x00000200
#define A3XX_INT0_CP_RESERVED_BIT_ERROR 0x00000400
#define A3XX_INT0_CP_HW_FAULT 0x00000800
#define A3XX_INT0_CP_DMA 0x00001000
#define A3XX_INT0_CP_IB2_INT 0x00002000
#define A3XX_INT0_CP_IB1_INT 0x00004000
#define A3XX_INT0_CP_RB_INT 0x00008000
#define A3XX_INT0_CP_REG_PROTECT_FAULT 0x00010000
#define A3XX_INT0_CP_RB_DONE_TS 0x00020000
#define A3XX_INT0_CP_VS_DONE_TS 0x00040000
#define A3XX_INT0_CP_PS_DONE_TS 0x00080000
#define A3XX_INT0_CACHE_FLUSH_TS 0x00100000
#define A3XX_INT0_CP_AHB_ERROR_HALT 0x00200000
#define A3XX_INT0_MISC_HANG_DETECT 0x01000000
#define A3XX_INT0_UCHE_OOB_ACCESS 0x02000000
#define REG_A3XX_RBBM_HW_VERSION 0x00000000
#define REG_A3XX_RBBM_HW_RELEASE 0x00000001
@@ -230,6 +270,27 @@ enum a3xx_tex_swiz {
#define REG_A3XX_RBBM_GPR0_CTL 0x0000002e
#define REG_A3XX_RBBM_STATUS 0x00000030
#define A3XX_RBBM_STATUS_HI_BUSY 0x00000001
#define A3XX_RBBM_STATUS_CP_ME_BUSY 0x00000002
#define A3XX_RBBM_STATUS_CP_PFP_BUSY 0x00000004
#define A3XX_RBBM_STATUS_CP_NRT_BUSY 0x00004000
#define A3XX_RBBM_STATUS_VBIF_BUSY 0x00008000
#define A3XX_RBBM_STATUS_TSE_BUSY 0x00010000
#define A3XX_RBBM_STATUS_RAS_BUSY 0x00020000
#define A3XX_RBBM_STATUS_RB_BUSY 0x00040000
#define A3XX_RBBM_STATUS_PC_DCALL_BUSY 0x00080000
#define A3XX_RBBM_STATUS_PC_VSD_BUSY 0x00100000
#define A3XX_RBBM_STATUS_VFD_BUSY 0x00200000
#define A3XX_RBBM_STATUS_VPC_BUSY 0x00400000
#define A3XX_RBBM_STATUS_UCHE_BUSY 0x00800000
#define A3XX_RBBM_STATUS_SP_BUSY 0x01000000
#define A3XX_RBBM_STATUS_TPL1_BUSY 0x02000000
#define A3XX_RBBM_STATUS_MARB_BUSY 0x04000000
#define A3XX_RBBM_STATUS_VSC_BUSY 0x08000000
#define A3XX_RBBM_STATUS_ARB_BUSY 0x10000000
#define A3XX_RBBM_STATUS_HLSQ_BUSY 0x20000000
#define A3XX_RBBM_STATUS_GPU_BUSY_NOHC 0x40000000
#define A3XX_RBBM_STATUS_GPU_BUSY 0x80000000
#define REG_A3XX_RBBM_WAIT_IDLE_CLOCKS_CTL 0x00000033
@@ -251,20 +312,202 @@ enum a3xx_tex_swiz {
#define REG_A3XX_RBBM_PERFCTR_CTL 0x00000080
#define REG_A3XX_RBBM_PERFCTR_LOAD_CMD0 0x00000081
#define REG_A3XX_RBBM_PERFCTR_LOAD_CMD1 0x00000082
#define REG_A3XX_RBBM_PERFCTR_LOAD_VALUE_LO 0x00000084
#define REG_A3XX_RBBM_PERFCTR_LOAD_VALUE_HI 0x00000085
#define REG_A3XX_RBBM_PERFCOUNTER0_SELECT 0x00000086
#define REG_A3XX_RBBM_PERFCOUNTER1_SELECT 0x00000087
#define REG_A3XX_RBBM_GPU_BUSY_MASKED 0x00000088
#define REG_A3XX_RBBM_PERFCTR_CP_0_LO 0x00000090
#define REG_A3XX_RBBM_PERFCTR_CP_0_HI 0x00000091
#define REG_A3XX_RBBM_PERFCTR_RBBM_0_LO 0x00000092
#define REG_A3XX_RBBM_PERFCTR_RBBM_0_HI 0x00000093
#define REG_A3XX_RBBM_PERFCTR_RBBM_1_LO 0x00000094
#define REG_A3XX_RBBM_PERFCTR_RBBM_1_HI 0x00000095
#define REG_A3XX_RBBM_PERFCTR_PC_0_LO 0x00000096
#define REG_A3XX_RBBM_PERFCTR_PC_0_HI 0x00000097
#define REG_A3XX_RBBM_PERFCTR_PC_1_LO 0x00000098
#define REG_A3XX_RBBM_PERFCTR_PC_1_HI 0x00000099
#define REG_A3XX_RBBM_PERFCTR_PC_2_LO 0x0000009a
#define REG_A3XX_RBBM_PERFCTR_PC_2_HI 0x0000009b
#define REG_A3XX_RBBM_PERFCTR_PC_3_LO 0x0000009c
#define REG_A3XX_RBBM_PERFCTR_PC_3_HI 0x0000009d
#define REG_A3XX_RBBM_PERFCTR_VFD_0_LO 0x0000009e
#define REG_A3XX_RBBM_PERFCTR_VFD_0_HI 0x0000009f
#define REG_A3XX_RBBM_PERFCTR_VFD_1_LO 0x000000a0
#define REG_A3XX_RBBM_PERFCTR_VFD_1_HI 0x000000a1
#define REG_A3XX_RBBM_PERFCTR_HLSQ_0_LO 0x000000a2
#define REG_A3XX_RBBM_PERFCTR_HLSQ_0_HI 0x000000a3
#define REG_A3XX_RBBM_PERFCTR_HLSQ_1_LO 0x000000a4
#define REG_A3XX_RBBM_PERFCTR_HLSQ_1_HI 0x000000a5
#define REG_A3XX_RBBM_PERFCTR_HLSQ_2_LO 0x000000a6
#define REG_A3XX_RBBM_PERFCTR_HLSQ_2_HI 0x000000a7
#define REG_A3XX_RBBM_PERFCTR_HLSQ_3_LO 0x000000a8
#define REG_A3XX_RBBM_PERFCTR_HLSQ_3_HI 0x000000a9
#define REG_A3XX_RBBM_PERFCTR_HLSQ_4_LO 0x000000aa
#define REG_A3XX_RBBM_PERFCTR_HLSQ_4_HI 0x000000ab
#define REG_A3XX_RBBM_PERFCTR_HLSQ_5_LO 0x000000ac
#define REG_A3XX_RBBM_PERFCTR_HLSQ_5_HI 0x000000ad
#define REG_A3XX_RBBM_PERFCTR_VPC_0_LO 0x000000ae
#define REG_A3XX_RBBM_PERFCTR_VPC_0_HI 0x000000af
#define REG_A3XX_RBBM_PERFCTR_VPC_1_LO 0x000000b0
#define REG_A3XX_RBBM_PERFCTR_VPC_1_HI 0x000000b1
#define REG_A3XX_RBBM_PERFCTR_TSE_0_LO 0x000000b2
#define REG_A3XX_RBBM_PERFCTR_TSE_0_HI 0x000000b3
#define REG_A3XX_RBBM_PERFCTR_TSE_1_LO 0x000000b4
#define REG_A3XX_RBBM_PERFCTR_TSE_1_HI 0x000000b5
#define REG_A3XX_RBBM_PERFCTR_RAS_0_LO 0x000000b6
#define REG_A3XX_RBBM_PERFCTR_RAS_0_HI 0x000000b7
#define REG_A3XX_RBBM_PERFCTR_RAS_1_LO 0x000000b8
#define REG_A3XX_RBBM_PERFCTR_RAS_1_HI 0x000000b9
#define REG_A3XX_RBBM_PERFCTR_UCHE_0_LO 0x000000ba
#define REG_A3XX_RBBM_PERFCTR_UCHE_0_HI 0x000000bb
#define REG_A3XX_RBBM_PERFCTR_UCHE_1_LO 0x000000bc
#define REG_A3XX_RBBM_PERFCTR_UCHE_1_HI 0x000000bd
#define REG_A3XX_RBBM_PERFCTR_UCHE_2_LO 0x000000be
#define REG_A3XX_RBBM_PERFCTR_UCHE_2_HI 0x000000bf
#define REG_A3XX_RBBM_PERFCTR_UCHE_3_LO 0x000000c0
#define REG_A3XX_RBBM_PERFCTR_UCHE_3_HI 0x000000c1
#define REG_A3XX_RBBM_PERFCTR_UCHE_4_LO 0x000000c2
#define REG_A3XX_RBBM_PERFCTR_UCHE_4_HI 0x000000c3
#define REG_A3XX_RBBM_PERFCTR_UCHE_5_LO 0x000000c4
#define REG_A3XX_RBBM_PERFCTR_UCHE_5_HI 0x000000c5
#define REG_A3XX_RBBM_PERFCTR_TP_0_LO 0x000000c6
#define REG_A3XX_RBBM_PERFCTR_TP_0_HI 0x000000c7
#define REG_A3XX_RBBM_PERFCTR_TP_1_LO 0x000000c8
#define REG_A3XX_RBBM_PERFCTR_TP_1_HI 0x000000c9
#define REG_A3XX_RBBM_PERFCTR_TP_2_LO 0x000000ca
#define REG_A3XX_RBBM_PERFCTR_TP_2_HI 0x000000cb
#define REG_A3XX_RBBM_PERFCTR_TP_3_LO 0x000000cc
#define REG_A3XX_RBBM_PERFCTR_TP_3_HI 0x000000cd
#define REG_A3XX_RBBM_PERFCTR_TP_4_LO 0x000000ce
#define REG_A3XX_RBBM_PERFCTR_TP_4_HI 0x000000cf
#define REG_A3XX_RBBM_PERFCTR_TP_5_LO 0x000000d0
#define REG_A3XX_RBBM_PERFCTR_TP_5_HI 0x000000d1
#define REG_A3XX_RBBM_PERFCTR_SP_0_LO 0x000000d2
#define REG_A3XX_RBBM_PERFCTR_SP_0_HI 0x000000d3
#define REG_A3XX_RBBM_PERFCTR_SP_1_LO 0x000000d4
#define REG_A3XX_RBBM_PERFCTR_SP_1_HI 0x000000d5
#define REG_A3XX_RBBM_PERFCTR_SP_2_LO 0x000000d6
#define REG_A3XX_RBBM_PERFCTR_SP_2_HI 0x000000d7
#define REG_A3XX_RBBM_PERFCTR_SP_3_LO 0x000000d8
#define REG_A3XX_RBBM_PERFCTR_SP_3_HI 0x000000d9
#define REG_A3XX_RBBM_PERFCTR_SP_4_LO 0x000000da
#define REG_A3XX_RBBM_PERFCTR_SP_4_HI 0x000000db
#define REG_A3XX_RBBM_PERFCTR_SP_5_LO 0x000000dc
#define REG_A3XX_RBBM_PERFCTR_SP_5_HI 0x000000dd
#define REG_A3XX_RBBM_PERFCTR_SP_6_LO 0x000000de
#define REG_A3XX_RBBM_PERFCTR_SP_6_HI 0x000000df
#define REG_A3XX_RBBM_PERFCTR_SP_7_LO 0x000000e0
#define REG_A3XX_RBBM_PERFCTR_SP_7_HI 0x000000e1
#define REG_A3XX_RBBM_PERFCTR_RB_0_LO 0x000000e2
#define REG_A3XX_RBBM_PERFCTR_RB_0_HI 0x000000e3
#define REG_A3XX_RBBM_PERFCTR_RB_1_LO 0x000000e4
#define REG_A3XX_RBBM_PERFCTR_RB_1_HI 0x000000e5
#define REG_A3XX_RBBM_PERFCTR_PWR_0_LO 0x000000ea
#define REG_A3XX_RBBM_PERFCTR_PWR_0_HI 0x000000eb
#define REG_A3XX_RBBM_PERFCTR_PWR_1_LO 0x000000ec
#define REG_A3XX_RBBM_PERFCTR_PWR_1_HI 0x000000ed
#define REG_A3XX_RBBM_RBBM_CTL 0x00000100
#define REG_A3XX_RBBM_RBBM_CTL 0x00000100
#define REG_A3XX_RBBM_DEBUG_BUS_CTL 0x00000111
#define REG_A3XX_RBBM_DEBUG_BUS_DATA_STATUS 0x00000112
@@ -287,22 +530,20 @@ enum a3xx_tex_swiz {
#define REG_A3XX_CP_MEQ_DATA 0x000001db
#define REG_A3XX_CP_PERFCOUNTER_SELECT 0x00000445
#define REG_A3XX_CP_HW_FAULT 0x0000045c
#define REG_A3XX_CP_PROTECT_CTRL 0x0000045e
#define REG_A3XX_CP_PROTECT_STATUS 0x0000045f
#define REG_A3XX_CP_PROTECT(i0) (0x00000460 + 0x1*(i0))
static inline uint32_t REG_A3XX_CP_PROTECT(uint32_t i0) { return 0x00000460 + 0x1*i0; }
#define REG_A3XX_CP_PROTECT_REG(i0) (0x00000460 + 0x1*(i0))
static inline uint32_t REG_A3XX_CP_PROTECT_REG(uint32_t i0) { return 0x00000460 + 0x1*i0; }
#define REG_A3XX_CP_AHB_FAULT 0x0000054d
#define REG_A3XX_CP_SCRATCH_REG2 0x0000057a
#define REG_A3XX_CP_SCRATCH_REG3 0x0000057b
#define REG_A3XX_GRAS_CL_CLIP_CNTL 0x00002040
#define A3XX_GRAS_CL_CLIP_CNTL_IJ_PERSP_CENTER 0x00001000
#define A3XX_GRAS_CL_CLIP_CNTL_CLIP_DISABLE 0x00010000
@@ -528,9 +769,9 @@ static inline uint32_t A3XX_RB_MSAA_CONTROL_SAMPLE_MASK(uint32_t val)
#define REG_A3XX_UNKNOWN_20C3 0x000020c3
#define REG_A3XX_RB_MRT(i0) (0x000020c4 + 0x4*(i0))
static inline uint32_t REG_A3XX_RB_MRT(uint32_t i0) { return 0x000020c4 + 0x4*i0; }
#define REG_A3XX_RB_MRT_CONTROL(i0) (0x000020c4 + 0x4*(i0))
static inline uint32_t REG_A3XX_RB_MRT_CONTROL(uint32_t i0) { return 0x000020c4 + 0x4*i0; }
#define A3XX_RB_MRT_CONTROL_READ_DEST_ENABLE 0x00000008
#define A3XX_RB_MRT_CONTROL_BLEND 0x00000010
#define A3XX_RB_MRT_CONTROL_BLEND2 0x00000020
@@ -553,7 +794,7 @@ static inline uint32_t A3XX_RB_MRT_CONTROL_COMPONENT_ENABLE(uint32_t val)
return ((val) << A3XX_RB_MRT_CONTROL_COMPONENT_ENABLE__SHIFT) & A3XX_RB_MRT_CONTROL_COMPONENT_ENABLE__MASK;
}
#define REG_A3XX_RB_MRT_BUF_INFO(i0) (0x000020c5 + 0x4*(i0))
static inline uint32_t REG_A3XX_RB_MRT_BUF_INFO(uint32_t i0) { return 0x000020c5 + 0x4*i0; }
#define A3XX_RB_MRT_BUF_INFO_COLOR_FORMAT__MASK 0x0000003f
#define A3XX_RB_MRT_BUF_INFO_COLOR_FORMAT__SHIFT 0
static inline uint32_t A3XX_RB_MRT_BUF_INFO_COLOR_FORMAT(enum a3xx_color_fmt val)
@@ -579,7 +820,7 @@ static inline uint32_t A3XX_RB_MRT_BUF_INFO_COLOR_BUF_PITCH(uint32_t val)
return ((val >> 5) << A3XX_RB_MRT_BUF_INFO_COLOR_BUF_PITCH__SHIFT) & A3XX_RB_MRT_BUF_INFO_COLOR_BUF_PITCH__MASK;
}
#define REG_A3XX_RB_MRT_BUF_BASE(i0) (0x000020c6 + 0x4*(i0))
static inline uint32_t REG_A3XX_RB_MRT_BUF_BASE(uint32_t i0) { return 0x000020c6 + 0x4*i0; }
#define A3XX_RB_MRT_BUF_BASE_COLOR_BUF_BASE__MASK 0xfffffff0
#define A3XX_RB_MRT_BUF_BASE_COLOR_BUF_BASE__SHIFT 4
static inline uint32_t A3XX_RB_MRT_BUF_BASE_COLOR_BUF_BASE(uint32_t val)
@@ -587,7 +828,7 @@ static inline uint32_t A3XX_RB_MRT_BUF_BASE_COLOR_BUF_BASE(uint32_t val)
return ((val >> 5) << A3XX_RB_MRT_BUF_BASE_COLOR_BUF_BASE__SHIFT) & A3XX_RB_MRT_BUF_BASE_COLOR_BUF_BASE__MASK;
}
#define REG_A3XX_RB_MRT_BLEND_CONTROL(i0) (0x000020c7 + 0x4*(i0))
static inline uint32_t REG_A3XX_RB_MRT_BLEND_CONTROL(uint32_t i0) { return 0x000020c7 + 0x4*i0; }
#define A3XX_RB_MRT_BLEND_CONTROL_RGB_SRC_FACTOR__MASK 0x0000001f
#define A3XX_RB_MRT_BLEND_CONTROL_RGB_SRC_FACTOR__SHIFT 0
static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_RGB_SRC_FACTOR(enum adreno_rb_blend_factor val)
@@ -627,12 +868,60 @@ static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_ALPHA_DEST_FACTOR(enum adreno_r
#define A3XX_RB_MRT_BLEND_CONTROL_CLAMP_ENABLE 0x20000000
#define REG_A3XX_RB_BLEND_RED 0x000020e4
#define A3XX_RB_BLEND_RED_UINT__MASK 0x000000ff
#define A3XX_RB_BLEND_RED_UINT__SHIFT 0
static inline uint32_t A3XX_RB_BLEND_RED_UINT(uint32_t val)
{
return ((val) << A3XX_RB_BLEND_RED_UINT__SHIFT) & A3XX_RB_BLEND_RED_UINT__MASK;
}
#define A3XX_RB_BLEND_RED_FLOAT__MASK 0xffff0000
#define A3XX_RB_BLEND_RED_FLOAT__SHIFT 16
static inline uint32_t A3XX_RB_BLEND_RED_FLOAT(float val)
{
return ((util_float_to_half(val)) << A3XX_RB_BLEND_RED_FLOAT__SHIFT) & A3XX_RB_BLEND_RED_FLOAT__MASK;
}
#define REG_A3XX_RB_BLEND_GREEN 0x000020e5
#define A3XX_RB_BLEND_GREEN_UINT__MASK 0x000000ff
#define A3XX_RB_BLEND_GREEN_UINT__SHIFT 0
static inline uint32_t A3XX_RB_BLEND_GREEN_UINT(uint32_t val)
{
return ((val) << A3XX_RB_BLEND_GREEN_UINT__SHIFT) & A3XX_RB_BLEND_GREEN_UINT__MASK;
}
#define A3XX_RB_BLEND_GREEN_FLOAT__MASK 0xffff0000
#define A3XX_RB_BLEND_GREEN_FLOAT__SHIFT 16
static inline uint32_t A3XX_RB_BLEND_GREEN_FLOAT(float val)
{
return ((util_float_to_half(val)) << A3XX_RB_BLEND_GREEN_FLOAT__SHIFT) & A3XX_RB_BLEND_GREEN_FLOAT__MASK;
}
#define REG_A3XX_RB_BLEND_BLUE 0x000020e6
#define A3XX_RB_BLEND_BLUE_UINT__MASK 0x000000ff
#define A3XX_RB_BLEND_BLUE_UINT__SHIFT 0
static inline uint32_t A3XX_RB_BLEND_BLUE_UINT(uint32_t val)
{
return ((val) << A3XX_RB_BLEND_BLUE_UINT__SHIFT) & A3XX_RB_BLEND_BLUE_UINT__MASK;
}
#define A3XX_RB_BLEND_BLUE_FLOAT__MASK 0xffff0000
#define A3XX_RB_BLEND_BLUE_FLOAT__SHIFT 16
static inline uint32_t A3XX_RB_BLEND_BLUE_FLOAT(float val)
{
return ((util_float_to_half(val)) << A3XX_RB_BLEND_BLUE_FLOAT__SHIFT) & A3XX_RB_BLEND_BLUE_FLOAT__MASK;
}
#define REG_A3XX_RB_BLEND_ALPHA 0x000020e7
#define A3XX_RB_BLEND_ALPHA_UINT__MASK 0x000000ff
#define A3XX_RB_BLEND_ALPHA_UINT__SHIFT 0
static inline uint32_t A3XX_RB_BLEND_ALPHA_UINT(uint32_t val)
{
return ((val) << A3XX_RB_BLEND_ALPHA_UINT__SHIFT) & A3XX_RB_BLEND_ALPHA_UINT__MASK;
}
#define A3XX_RB_BLEND_ALPHA_FLOAT__MASK 0xffff0000
#define A3XX_RB_BLEND_ALPHA_FLOAT__SHIFT 16
static inline uint32_t A3XX_RB_BLEND_ALPHA_FLOAT(float val)
{
return ((util_float_to_half(val)) << A3XX_RB_BLEND_ALPHA_FLOAT__SHIFT) & A3XX_RB_BLEND_ALPHA_FLOAT__MASK;
}
#define REG_A3XX_UNKNOWN_20E8 0x000020e8
@@ -1063,9 +1352,9 @@ static inline uint32_t A3XX_VFD_CONTROL_1_REGID4INST(uint32_t val)
#define REG_A3XX_VFD_INDEX_OFFSET 0x00002245
#define REG_A3XX_VFD_FETCH(i0) (0x00002246 + 0x2*(i0))
static inline uint32_t REG_A3XX_VFD_FETCH(uint32_t i0) { return 0x00002246 + 0x2*i0; }
#define REG_A3XX_VFD_FETCH_INSTR_0(i0) (0x00002246 + 0x2*(i0))
static inline uint32_t REG_A3XX_VFD_FETCH_INSTR_0(uint32_t i0) { return 0x00002246 + 0x2*i0; }
#define A3XX_VFD_FETCH_INSTR_0_FETCHSIZE__MASK 0x0000007f
#define A3XX_VFD_FETCH_INSTR_0_FETCHSIZE__SHIFT 0
static inline uint32_t A3XX_VFD_FETCH_INSTR_0_FETCHSIZE(uint32_t val)
@@ -1092,11 +1381,11 @@ static inline uint32_t A3XX_VFD_FETCH_INSTR_0_STEPRATE(uint32_t val)
return ((val) << A3XX_VFD_FETCH_INSTR_0_STEPRATE__SHIFT) & A3XX_VFD_FETCH_INSTR_0_STEPRATE__MASK;
}
#define REG_A3XX_VFD_FETCH_INSTR_1(i0) (0x00002247 + 0x2*(i0))
static inline uint32_t REG_A3XX_VFD_FETCH_INSTR_1(uint32_t i0) { return 0x00002247 + 0x2*i0; }
#define REG_A3XX_VFD_DECODE(i0) (0x00002266 + 0x1*(i0))
static inline uint32_t REG_A3XX_VFD_DECODE(uint32_t i0) { return 0x00002266 + 0x1*i0; }
#define REG_A3XX_VFD_DECODE_INSTR(i0) (0x00002266 + 0x1*(i0))
static inline uint32_t REG_A3XX_VFD_DECODE_INSTR(uint32_t i0) { return 0x00002266 + 0x1*i0; }
#define A3XX_VFD_DECODE_INSTR_WRITEMASK__MASK 0x0000000f
#define A3XX_VFD_DECODE_INSTR_WRITEMASK__SHIFT 0
static inline uint32_t A3XX_VFD_DECODE_INSTR_WRITEMASK(uint32_t val)
@@ -1173,13 +1462,13 @@ static inline uint32_t A3XX_VPC_PACK_NUMNONPOSVSVAR(uint32_t val)
return ((val) << A3XX_VPC_PACK_NUMNONPOSVSVAR__SHIFT) & A3XX_VPC_PACK_NUMNONPOSVSVAR__MASK;
}
#define REG_A3XX_VPC_VARYING_INTERP(i0) (0x00002282 + 0x1*(i0))
static inline uint32_t REG_A3XX_VPC_VARYING_INTERP(uint32_t i0) { return 0x00002282 + 0x1*i0; }
#define REG_A3XX_VPC_VARYING_INTERP_MODE(i0) (0x00002282 + 0x1*(i0))
static inline uint32_t REG_A3XX_VPC_VARYING_INTERP_MODE(uint32_t i0) { return 0x00002282 + 0x1*i0; }
#define REG_A3XX_VPC_VARYING_PS_REPL(i0) (0x00002286 + 0x1*(i0))
static inline uint32_t REG_A3XX_VPC_VARYING_PS_REPL(uint32_t i0) { return 0x00002286 + 0x1*i0; }
#define REG_A3XX_VPC_VARYING_PS_REPL_MODE(i0) (0x00002286 + 0x1*(i0))
static inline uint32_t REG_A3XX_VPC_VARYING_PS_REPL_MODE(uint32_t i0) { return 0x00002286 + 0x1*i0; }
#define REG_A3XX_VPC_VARY_CYLWRAP_ENABLE_0 0x0000228a
@@ -1293,9 +1582,9 @@ static inline uint32_t A3XX_SP_VS_PARAM_REG_TOTALVSOUTVAR(uint32_t val)
return ((val) << A3XX_SP_VS_PARAM_REG_TOTALVSOUTVAR__SHIFT) & A3XX_SP_VS_PARAM_REG_TOTALVSOUTVAR__MASK;
}
#define REG_A3XX_SP_VS_OUT(i0) (0x000022c7 + 0x1*(i0))
static inline uint32_t REG_A3XX_SP_VS_OUT(uint32_t i0) { return 0x000022c7 + 0x1*i0; }
#define REG_A3XX_SP_VS_OUT_REG(i0) (0x000022c7 + 0x1*(i0))
static inline uint32_t REG_A3XX_SP_VS_OUT_REG(uint32_t i0) { return 0x000022c7 + 0x1*i0; }
#define A3XX_SP_VS_OUT_REG_A_REGID__MASK 0x000001ff
#define A3XX_SP_VS_OUT_REG_A_REGID__SHIFT 0
static inline uint32_t A3XX_SP_VS_OUT_REG_A_REGID(uint32_t val)
@@ -1321,9 +1610,9 @@ static inline uint32_t A3XX_SP_VS_OUT_REG_B_COMPMASK(uint32_t val)
return ((val) << A3XX_SP_VS_OUT_REG_B_COMPMASK__SHIFT) & A3XX_SP_VS_OUT_REG_B_COMPMASK__MASK;
}
#define REG_A3XX_SP_VS_VPC_DST(i0) (0x000022d0 + 0x1*(i0))
static inline uint32_t REG_A3XX_SP_VS_VPC_DST(uint32_t i0) { return 0x000022d0 + 0x1*i0; }
#define REG_A3XX_SP_VS_VPC_DST_REG(i0) (0x000022d0 + 0x1*(i0))
static inline uint32_t REG_A3XX_SP_VS_VPC_DST_REG(uint32_t i0) { return 0x000022d0 + 0x1*i0; }
#define A3XX_SP_VS_VPC_DST_REG_OUTLOC0__MASK 0x000000ff
#define A3XX_SP_VS_VPC_DST_REG_OUTLOC0__SHIFT 0
static inline uint32_t A3XX_SP_VS_VPC_DST_REG_OUTLOC0(uint32_t val)
@@ -1480,9 +1769,9 @@ static inline uint32_t A3XX_SP_FS_OBJ_OFFSET_REG_SHADEROBJOFFSET(uint32_t val)
#define REG_A3XX_SP_FS_OUTPUT_REG 0x000022ec
#define REG_A3XX_SP_FS_MRT(i0) (0x000022f0 + 0x1*(i0))
static inline uint32_t REG_A3XX_SP_FS_MRT(uint32_t i0) { return 0x000022f0 + 0x1*i0; }
#define REG_A3XX_SP_FS_MRT_REG(i0) (0x000022f0 + 0x1*(i0))
static inline uint32_t REG_A3XX_SP_FS_MRT_REG(uint32_t i0) { return 0x000022f0 + 0x1*i0; }
#define A3XX_SP_FS_MRT_REG_REGID__MASK 0x000000ff
#define A3XX_SP_FS_MRT_REG_REGID__SHIFT 0
static inline uint32_t A3XX_SP_FS_MRT_REG_REGID(uint32_t val)
@@ -1491,9 +1780,9 @@ static inline uint32_t A3XX_SP_FS_MRT_REG_REGID(uint32_t val)
}
#define A3XX_SP_FS_MRT_REG_HALF_PRECISION 0x00000100
#define REG_A3XX_SP_FS_IMAGE_OUTPUT(i0) (0x000022f4 + 0x1*(i0))
static inline uint32_t REG_A3XX_SP_FS_IMAGE_OUTPUT(uint32_t i0) { return 0x000022f4 + 0x1*i0; }
#define REG_A3XX_SP_FS_IMAGE_OUTPUT_REG(i0) (0x000022f4 + 0x1*(i0))
static inline uint32_t REG_A3XX_SP_FS_IMAGE_OUTPUT_REG(uint32_t i0) { return 0x000022f4 + 0x1*i0; }
#define A3XX_SP_FS_IMAGE_OUTPUT_REG_MRTFORMAT__MASK 0x0000003f
#define A3XX_SP_FS_IMAGE_OUTPUT_REG_MRTFORMAT__SHIFT 0
static inline uint32_t A3XX_SP_FS_IMAGE_OUTPUT_REG_MRTFORMAT(enum a3xx_color_fmt val)
@@ -1607,9 +1896,9 @@ static inline uint32_t A3XX_VSC_BIN_SIZE_HEIGHT(uint32_t val)
#define REG_A3XX_VSC_SIZE_ADDRESS 0x00000c02
#define REG_A3XX_VSC_PIPE(i0) (0x00000c06 + 0x3*(i0))
static inline uint32_t REG_A3XX_VSC_PIPE(uint32_t i0) { return 0x00000c06 + 0x3*i0; }
#define REG_A3XX_VSC_PIPE_CONFIG(i0) (0x00000c06 + 0x3*(i0))
static inline uint32_t REG_A3XX_VSC_PIPE_CONFIG(uint32_t i0) { return 0x00000c06 + 0x3*i0; }
#define A3XX_VSC_PIPE_CONFIG_X__MASK 0x000003ff
#define A3XX_VSC_PIPE_CONFIG_X__SHIFT 0
static inline uint32_t A3XX_VSC_PIPE_CONFIG_X(uint32_t val)
@@ -1635,26 +1924,46 @@ static inline uint32_t A3XX_VSC_PIPE_CONFIG_H(uint32_t val)
return ((val) << A3XX_VSC_PIPE_CONFIG_H__SHIFT) & A3XX_VSC_PIPE_CONFIG_H__MASK;
}
#define REG_A3XX_VSC_PIPE_DATA_ADDRESS(i0) (0x00000c07 + 0x3*(i0))
static inline uint32_t REG_A3XX_VSC_PIPE_DATA_ADDRESS(uint32_t i0) { return 0x00000c07 + 0x3*i0; }
#define REG_A3XX_VSC_PIPE_DATA_LENGTH(i0) (0x00000c08 + 0x3*(i0))
static inline uint32_t REG_A3XX_VSC_PIPE_DATA_LENGTH(uint32_t i0) { return 0x00000c08 + 0x3*i0; }
#define REG_A3XX_UNKNOWN_0C3D 0x00000c3d
#define REG_A3XX_PC_PERFCOUNTER0_SELECT 0x00000c48
#define REG_A3XX_PC_PERFCOUNTER1_SELECT 0x00000c49
#define REG_A3XX_PC_PERFCOUNTER2_SELECT 0x00000c4a
#define REG_A3XX_PC_PERFCOUNTER3_SELECT 0x00000c4b
#define REG_A3XX_UNKNOWN_0C81 0x00000c81
#define REG_A3XX_GRAS_CL_USER_PLANE(i0) (0x00000ca0 + 0x4*(i0))
#define REG_A3XX_GRAS_PERFCOUNTER0_SELECT 0x00000c88
#define REG_A3XX_GRAS_CL_USER_PLANE_X(i0) (0x00000ca0 + 0x4*(i0))
#define REG_A3XX_GRAS_PERFCOUNTER1_SELECT 0x00000c89
#define REG_A3XX_GRAS_CL_USER_PLANE_Y(i0) (0x00000ca1 + 0x4*(i0))
#define REG_A3XX_GRAS_PERFCOUNTER2_SELECT 0x00000c8a
#define REG_A3XX_GRAS_CL_USER_PLANE_Z(i0) (0x00000ca2 + 0x4*(i0))
#define REG_A3XX_GRAS_PERFCOUNTER3_SELECT 0x00000c8b
#define REG_A3XX_GRAS_CL_USER_PLANE_W(i0) (0x00000ca3 + 0x4*(i0))
static inline uint32_t REG_A3XX_GRAS_CL_USER_PLANE(uint32_t i0) { return 0x00000ca0 + 0x4*i0; }
static inline uint32_t REG_A3XX_GRAS_CL_USER_PLANE_X(uint32_t i0) { return 0x00000ca0 + 0x4*i0; }
static inline uint32_t REG_A3XX_GRAS_CL_USER_PLANE_Y(uint32_t i0) { return 0x00000ca1 + 0x4*i0; }
static inline uint32_t REG_A3XX_GRAS_CL_USER_PLANE_Z(uint32_t i0) { return 0x00000ca2 + 0x4*i0; }
static inline uint32_t REG_A3XX_GRAS_CL_USER_PLANE_W(uint32_t i0) { return 0x00000ca3 + 0x4*i0; }
#define REG_A3XX_RB_GMEM_BASE_ADDR 0x00000cc0
#define REG_A3XX_RB_PERFCOUNTER0_SELECT 0x00000cc6
#define REG_A3XX_RB_PERFCOUNTER1_SELECT 0x00000cc7
#define REG_A3XX_RB_WINDOW_SIZE 0x00000ce0
#define A3XX_RB_WINDOW_SIZE_WIDTH__MASK 0x00003fff
#define A3XX_RB_WINDOW_SIZE_WIDTH__SHIFT 0
@@ -1669,18 +1978,46 @@ static inline uint32_t A3XX_RB_WINDOW_SIZE_HEIGHT(uint32_t val)
return ((val) << A3XX_RB_WINDOW_SIZE_HEIGHT__SHIFT) & A3XX_RB_WINDOW_SIZE_HEIGHT__MASK;
}
#define REG_A3XX_UNKNOWN_0E00 0x00000e00
#define REG_A3XX_HLSQ_PERFCOUNTER0_SELECT 0x00000e00
#define REG_A3XX_HLSQ_PERFCOUNTER1_SELECT 0x00000e01
#define REG_A3XX_HLSQ_PERFCOUNTER2_SELECT 0x00000e02
#define REG_A3XX_HLSQ_PERFCOUNTER3_SELECT 0x00000e03
#define REG_A3XX_HLSQ_PERFCOUNTER4_SELECT 0x00000e04
#define REG_A3XX_HLSQ_PERFCOUNTER5_SELECT 0x00000e05
#define REG_A3XX_UNKNOWN_0E43 0x00000e43
#define REG_A3XX_VFD_PERFCOUNTER0_SELECT 0x00000e44
#define REG_A3XX_VFD_PERFCOUNTER1_SELECT 0x00000e45
#define REG_A3XX_VPC_VPC_DEBUG_RAM_SEL 0x00000e61
#define REG_A3XX_VPC_VPC_DEBUG_RAM_READ 0x00000e62
#define REG_A3XX_VPC_PERFCOUNTER0_SELECT 0x00000e64
#define REG_A3XX_VPC_PERFCOUNTER1_SELECT 0x00000e65
#define REG_A3XX_UCHE_CACHE_MODE_CONTROL_REG 0x00000e82
#define REG_A3XX_UCHE_PERFCOUNTER0_SELECT 0x00000e84
#define REG_A3XX_UCHE_PERFCOUNTER1_SELECT 0x00000e85
#define REG_A3XX_UCHE_PERFCOUNTER2_SELECT 0x00000e86
#define REG_A3XX_UCHE_PERFCOUNTER3_SELECT 0x00000e87
#define REG_A3XX_UCHE_PERFCOUNTER4_SELECT 0x00000e88
#define REG_A3XX_UCHE_PERFCOUNTER5_SELECT 0x00000e89
#define REG_A3XX_UCHE_CACHE_INVALIDATE0_REG 0x00000ea0
#define A3XX_UCHE_CACHE_INVALIDATE0_REG_ADDR__MASK 0x0fffffff
#define A3XX_UCHE_CACHE_INVALIDATE0_REG_ADDR__SHIFT 0
@@ -1724,6 +2061,18 @@ static inline uint32_t A3XX_UCHE_CACHE_INVALIDATE1_REG_OPCODE(enum a3xx_cache_op
#define REG_A3XX_UNKNOWN_0F03 0x00000f03
#define REG_A3XX_TP_PERFCOUNTER0_SELECT 0x00000f04
#define REG_A3XX_TP_PERFCOUNTER1_SELECT 0x00000f05
#define REG_A3XX_TP_PERFCOUNTER2_SELECT 0x00000f06
#define REG_A3XX_TP_PERFCOUNTER3_SELECT 0x00000f07
#define REG_A3XX_TP_PERFCOUNTER4_SELECT 0x00000f08
#define REG_A3XX_TP_PERFCOUNTER5_SELECT 0x00000f09
#define REG_A3XX_TEX_SAMP_0 0x00000000
#define A3XX_TEX_SAMP_0_XY_MAG__MASK 0x0000000c
#define A3XX_TEX_SAMP_0_XY_MAG__SHIFT 2
@@ -1791,6 +2140,12 @@ static inline uint32_t A3XX_TEX_CONST_0_FMT(enum a3xx_tex_fmt val)
{
return ((val) << A3XX_TEX_CONST_0_FMT__SHIFT) & A3XX_TEX_CONST_0_FMT__MASK;
}
#define A3XX_TEX_CONST_0_TYPE__MASK 0xc0000000
#define A3XX_TEX_CONST_0_TYPE__SHIFT 30
static inline uint32_t A3XX_TEX_CONST_0_TYPE(enum a3xx_tex_type val)
{
return ((val) << A3XX_TEX_CONST_0_TYPE__SHIFT) & A3XX_TEX_CONST_0_TYPE__MASK;
}
#define REG_A3XX_TEX_CONST_1 0x00000001
#define A3XX_TEX_CONST_1_HEIGHT__MASK 0x00003fff

View File

@@ -62,10 +62,16 @@ static unsigned regmask_idx(struct ir3_register *reg)
return num;
}
static void regmask_set(regmask_t regmask, struct ir3_register *reg)
static void regmask_set(regmask_t regmask, struct ir3_register *reg,
unsigned wrmask)
{
unsigned idx = regmask_idx(reg);
regmask[idx / 8] |= 1 << (idx % 8);
unsigned i;
for (i = 0; i < 4; i++) {
if (wrmask & (1 << i)) {
unsigned idx = regmask_idx(reg) + i;
regmask[idx / 8] |= 1 << (idx % 8);
}
}
}
static unsigned regmask_get(regmask_t regmask, struct ir3_register *reg)
@@ -91,6 +97,7 @@ struct fd3_compile_context {
unsigned next_inloc;
unsigned num_internal_temps;
struct tgsi_src_register internal_temps[6];
/* track registers which need to synchronize w/ "complex alu" cat3
* instruction pipeline:
@@ -128,9 +135,16 @@ struct fd3_compile_context {
* up the vector operation
*/
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register tmp_src;
struct tgsi_src_register *tmp_src;
};
static void vectorize(struct fd3_compile_context *ctx,
struct ir3_instruction *instr, struct tgsi_dst_register *dst,
int nsrcs, ...);
static void create_mov(struct fd3_compile_context *ctx,
struct tgsi_dst_register *dst, struct tgsi_src_register *src);
static unsigned
compile_init(struct fd3_compile_context *ctx, struct fd3_shader_stateobj *so,
const struct tgsi_token *tokens)
@@ -154,19 +168,19 @@ compile_init(struct fd3_compile_context *ctx, struct fd3_shader_stateobj *so,
/* Immediates go after constants: */
ctx->base_reg[TGSI_FILE_CONSTANT] = 0;
ctx->base_reg[TGSI_FILE_IMMEDIATE] =
ctx->info.file_count[TGSI_FILE_CONSTANT];
ctx->info.file_max[TGSI_FILE_CONSTANT] + 1;
/* Temporaries after outputs after inputs: */
ctx->base_reg[TGSI_FILE_INPUT] = 0;
ctx->base_reg[TGSI_FILE_OUTPUT] =
ctx->info.file_count[TGSI_FILE_INPUT];
ctx->info.file_max[TGSI_FILE_INPUT] + 1;
ctx->base_reg[TGSI_FILE_TEMPORARY] =
ctx->info.file_count[TGSI_FILE_INPUT] +
ctx->info.file_count[TGSI_FILE_OUTPUT];
ctx->info.file_max[TGSI_FILE_INPUT] + 1 +
ctx->info.file_max[TGSI_FILE_OUTPUT] + 1;
so->first_immediate = ctx->base_reg[TGSI_FILE_IMMEDIATE];
ctx->immediate_idx = 4 * (ctx->info.file_count[TGSI_FILE_CONSTANT] +
ctx->info.file_count[TGSI_FILE_IMMEDIATE]);
ctx->immediate_idx = 4 * (ctx->info.file_max[TGSI_FILE_CONSTANT] + 1 +
ctx->info.file_max[TGSI_FILE_IMMEDIATE] + 1);
ret = tgsi_parse_init(&ctx->parser, tokens);
if (ret != TGSI_PARSE_OK)
@@ -177,6 +191,21 @@ compile_init(struct fd3_compile_context *ctx, struct fd3_shader_stateobj *so,
return ret;
}
static void
compile_error(struct fd3_compile_context *ctx, const char *format, ...)
{
va_list ap;
va_start(ap, format);
_debug_vprintf(format, ap);
va_end(ap);
tgsi_dump(ctx->tokens, 0);
assert(0);
}
#define compile_assert(ctx, cond) do { \
if (!(cond)) compile_error((ctx), "failed assert: "#cond"\n"); \
} while (0)
static void
compile_free(struct fd3_compile_context *ctx)
{
@@ -193,6 +222,24 @@ struct instr_translater {
unsigned arg;
};
static unsigned
src_flags(struct fd3_compile_context *ctx, struct ir3_register *reg)
{
unsigned flags = 0;
if (regmask_get(ctx->needs_ss, reg)) {
flags |= IR3_INSTR_SS;
memset(ctx->needs_ss, 0, sizeof(ctx->needs_ss));
}
if (regmask_get(ctx->needs_sy, reg)) {
flags |= IR3_INSTR_SY;
memset(ctx->needs_sy, 0, sizeof(ctx->needs_sy));
}
return flags;
}
static struct ir3_register *
add_dst_reg(struct fd3_compile_context *ctx, struct ir3_instruction *instr,
const struct tgsi_dst_register *dst, unsigned chan)
@@ -205,9 +252,8 @@ add_dst_reg(struct fd3_compile_context *ctx, struct ir3_instruction *instr,
num = dst->Index + ctx->base_reg[dst->File];
break;
default:
DBG("unsupported dst register file: %s",
compile_error(ctx, "unsupported dst register file: %s\n",
tgsi_file_name(dst->File));
assert(0);
break;
}
@@ -234,14 +280,17 @@ add_src_reg(struct fd3_compile_context *ctx, struct ir3_instruction *instr,
flags |= IR3_REG_CONST;
num = src->Index + ctx->base_reg[src->File];
break;
case TGSI_FILE_OUTPUT:
/* NOTE: we should only end up w/ OUTPUT file for things like
* clamp()'ing saturated dst instructions
*/
case TGSI_FILE_INPUT:
case TGSI_FILE_TEMPORARY:
num = src->Index + ctx->base_reg[src->File];
break;
default:
DBG("unsupported src register file: %s",
compile_error(ctx, "unsupported src register file: %s\n",
tgsi_file_name(src->File));
assert(0);
break;
}
@@ -254,15 +303,7 @@ add_src_reg(struct fd3_compile_context *ctx, struct ir3_instruction *instr,
reg = ir3_reg_create(instr, regid(num, chan), flags);
if (regmask_get(ctx->needs_ss, reg)) {
instr->flags |= IR3_INSTR_SS;
memset(ctx->needs_ss, 0, sizeof(ctx->needs_ss));
}
if (regmask_get(ctx->needs_sy, reg)) {
instr->flags |= IR3_INSTR_SY;
memset(ctx->needs_sy, 0, sizeof(ctx->needs_sy));
}
instr->flags |= src_flags(ctx, reg);
return reg;
}
@@ -285,11 +326,11 @@ src_from_dst(struct tgsi_src_register *src, struct tgsi_dst_register *dst)
/* Get internal-temp src/dst to use for a sequence of instructions
* generated by a single TGSI op.
*/
static void
static struct tgsi_src_register *
get_internal_temp(struct fd3_compile_context *ctx,
struct tgsi_dst_register *tmp_dst,
struct tgsi_src_register *tmp_src)
struct tgsi_dst_register *tmp_dst)
{
struct tgsi_src_register *tmp_src;
int n;
tmp_dst->File = TGSI_FILE_TEMPORARY;
@@ -299,23 +340,78 @@ get_internal_temp(struct fd3_compile_context *ctx,
/* assign next temporary: */
n = ctx->num_internal_temps++;
compile_assert(ctx, n < ARRAY_SIZE(ctx->internal_temps));
tmp_src = &ctx->internal_temps[n];
tmp_dst->Index = ctx->info.file_count[TGSI_FILE_TEMPORARY] + n;
tmp_dst->Index = ctx->info.file_max[TGSI_FILE_TEMPORARY] + n + 1;
src_from_dst(tmp_src, tmp_dst);
return tmp_src;
}
/* same as get_internal_temp, but w/ src.xxxx (for instructions that
* replicate their results)
*/
static void
static struct tgsi_src_register *
get_internal_temp_repl(struct fd3_compile_context *ctx,
struct tgsi_dst_register *tmp_dst,
struct tgsi_src_register *tmp_src)
struct tgsi_dst_register *tmp_dst)
{
get_internal_temp(ctx, tmp_dst, tmp_src);
struct tgsi_src_register *tmp_src =
get_internal_temp(ctx, tmp_dst);
tmp_src->SwizzleX = tmp_src->SwizzleY =
tmp_src->SwizzleZ = tmp_src->SwizzleW = TGSI_SWIZZLE_X;
return tmp_src;
}
static inline bool
is_const(struct tgsi_src_register *src)
{
return (src->File == TGSI_FILE_CONSTANT) ||
(src->File == TGSI_FILE_IMMEDIATE);
}
static type_t
get_ftype(struct fd3_compile_context *ctx)
{
return ctx->so->half_precision ? TYPE_F16 : TYPE_F32;
}
static type_t
get_utype(struct fd3_compile_context *ctx)
{
return ctx->so->half_precision ? TYPE_U16 : TYPE_U32;
}
static unsigned
src_swiz(struct tgsi_src_register *src, int chan)
{
switch (chan) {
case 0: return src->SwizzleX;
case 1: return src->SwizzleY;
case 2: return src->SwizzleZ;
case 3: return src->SwizzleW;
}
assert(0);
return 0;
}
/* for instructions that cannot take a const register as src, if needed
* generate a move to temporary gpr:
*/
static struct tgsi_src_register *
get_unconst(struct fd3_compile_context *ctx, struct tgsi_src_register *src)
{
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
compile_assert(ctx, is_const(src));
tmp_src = get_internal_temp(ctx, &tmp_dst);
create_mov(ctx, &tmp_dst, src);
return tmp_src;
}
static void
@@ -365,30 +461,11 @@ get_immediate(struct fd3_compile_context *ctx,
reg->SwizzleW = swiz2tgsi[swiz];
}
static type_t
get_type(struct fd3_compile_context *ctx)
{
return ctx->so->half_precision ? TYPE_F16 : TYPE_F32;
}
static unsigned
src_swiz(struct tgsi_src_register *src, int chan)
{
switch (chan) {
case 0: return src->SwizzleX;
case 1: return src->SwizzleY;
case 2: return src->SwizzleZ;
case 3: return src->SwizzleW;
}
assert(0);
return 0;
}
static void
create_mov(struct fd3_compile_context *ctx, struct tgsi_dst_register *dst,
struct tgsi_src_register *src)
{
type_t type_mov = get_type(ctx);
type_t type_mov = get_ftype(ctx);
unsigned i;
for (i = 0; i < 4; i++) {
@@ -404,7 +481,35 @@ create_mov(struct fd3_compile_context *ctx, struct tgsi_dst_register *dst,
ir3_instr_create(ctx->ir, 0, OPC_NOP);
}
}
}
static void
create_clamp(struct fd3_compile_context *ctx, struct tgsi_dst_register *dst,
struct tgsi_src_register *minval, struct tgsi_src_register *maxval)
{
struct ir3_instruction *instr;
struct tgsi_src_register src;
src_from_dst(&src, dst);
instr = ir3_instr_create(ctx->ir, 2, OPC_MAX_F);
vectorize(ctx, instr, dst, 2, &src, 0, minval, 0);
instr = ir3_instr_create(ctx->ir, 2, OPC_MIN_F);
vectorize(ctx, instr, dst, 2, &src, 0, maxval, 0);
}
static void
create_clamp_imm(struct fd3_compile_context *ctx,
struct tgsi_dst_register *dst,
uint32_t minval, uint32_t maxval)
{
struct tgsi_src_register minconst, maxconst;
get_immediate(ctx, &minconst, minval);
get_immediate(ctx, &maxconst, maxval);
create_clamp(ctx, dst, &minconst, &maxconst);
}
static struct tgsi_dst_register *
@@ -415,7 +520,7 @@ get_dst(struct fd3_compile_context *ctx, struct tgsi_full_instruction *inst)
for (i = 0; i < inst->Instruction.NumSrcRegs; i++) {
struct tgsi_src_register *src = &inst->Src[i].Register;
if ((src->File == dst->File) && (src->Index == dst->Index)) {
get_internal_temp(ctx, &ctx->tmp_dst, &ctx->tmp_src);
ctx->tmp_src = get_internal_temp(ctx, &ctx->tmp_dst);
ctx->tmp_dst.WriteMask = dst->WriteMask;
dst = &ctx->tmp_dst;
break;
@@ -430,7 +535,7 @@ put_dst(struct fd3_compile_context *ctx, struct tgsi_full_instruction *inst,
{
/* if necessary, add mov back into original dst: */
if (dst != &inst->Dst[0].Register) {
create_mov(ctx, &inst->Dst[0].Register, &ctx->tmp_src);
create_mov(ctx, &inst->Dst[0].Register, ctx->tmp_src);
}
}
@@ -478,6 +583,7 @@ vectorize(struct fd3_compile_context *ctx, struct ir3_instruction *instr,
cur->regs[j+1]->num =
regid(cur->regs[j+1]->num >> 2,
src_swiz(src, i));
cur->flags |= src_flags(ctx, cur->regs[j+1]);
}
va_end(ap);
}
@@ -496,6 +602,15 @@ vectorize(struct fd3_compile_context *ctx, struct ir3_instruction *instr,
* native instructions:
*/
static inline void
get_swiz(unsigned *swiz, struct tgsi_src_register *src)
{
swiz[0] = src->SwizzleX;
swiz[1] = src->SwizzleY;
swiz[2] = src->SwizzleZ;
swiz[3] = src->SwizzleW;
}
static void
trans_dotp(const struct instr_translater *t,
struct fd3_compile_context *ctx,
@@ -503,39 +618,35 @@ trans_dotp(const struct instr_translater *t,
{
struct ir3_instruction *instr;
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register tmp_src;
struct tgsi_src_register *tmp_src;
struct tgsi_dst_register *dst = &inst->Dst[0].Register;
struct tgsi_src_register *src0 = &inst->Src[0].Register;
struct tgsi_src_register *src1 = &inst->Src[1].Register;
unsigned swiz0[] = { src0->SwizzleX, src0->SwizzleY, src0->SwizzleZ, src0->SwizzleW };
unsigned swiz1[] = { src1->SwizzleX, src1->SwizzleY, src1->SwizzleZ, src1->SwizzleW };
unsigned swiz0[4];
unsigned swiz1[4];
opc_t opc_mad = ctx->so->half_precision ? OPC_MAD_F16 : OPC_MAD_F32;
unsigned n = t->arg; /* number of components */
unsigned i;
unsigned i, swapped = 0;
get_internal_temp_repl(ctx, &tmp_dst, &tmp_src);
tmp_src = get_internal_temp_repl(ctx, &tmp_dst);
/* Blob compiler never seems to use a const in src1 position for
* mad.*, although there does seem (according to disassembler
* hidden in libllvm-a3xx.so) to be a bit to indicate that src1
* is a const. Not sure if this is a hw bug, or simply that the
* disassembler lies.
/* in particular, can't handle const for src1 for cat3/mad:
*/
if ((src1->File == TGSI_FILE_IMMEDIATE) ||
(src1->File == TGSI_FILE_CONSTANT)) {
/* the mov to tmp unswizzles src1, so now we have tmp.xyzw:
*/
for (i = 0; i < 4; i++)
swiz1[i] = i;
/* the first mul.f will clobber tmp.x, but that is ok
* because after that point we no longer need tmp.x:
*/
create_mov(ctx, &tmp_dst, src1);
src1 = &tmp_src;
if (is_const(src1)) {
if (!is_const(src0)) {
struct tgsi_src_register *tmp;
tmp = src0;
src0 = src1;
src1 = tmp;
swapped = 1;
} else {
src0 = get_unconst(ctx, src0);
}
}
get_swiz(swiz0, src0);
get_swiz(swiz1, src1);
instr = ir3_instr_create(ctx->ir, 2, OPC_MUL_F);
add_dst_reg(ctx, instr, &tmp_dst, 0);
add_src_reg(ctx, instr, src0, swiz0[0]);
@@ -548,29 +659,27 @@ trans_dotp(const struct instr_translater *t,
add_dst_reg(ctx, instr, &tmp_dst, 0);
add_src_reg(ctx, instr, src0, swiz0[i]);
add_src_reg(ctx, instr, src1, swiz1[i]);
add_src_reg(ctx, instr, &tmp_src, 0);
add_src_reg(ctx, instr, tmp_src, 0);
}
/* DPH(a,b) = (a.x * b.x) + (a.y * b.y) + (a.z * b.z) + b.w */
if (t->tgsi_opc == TGSI_OPCODE_DPH) {
ir3_instr_create(ctx->ir, 0, OPC_NOP);
ir3_instr_create(ctx->ir, 0, OPC_NOP)->repeat = 1;
instr = ir3_instr_create(ctx->ir, 2, OPC_ADD_F);
add_dst_reg(ctx, instr, &tmp_dst, 0);
add_src_reg(ctx, instr, src1, swiz1[i]);
add_src_reg(ctx, instr, &tmp_src, 0);
if (swapped)
add_src_reg(ctx, instr, src0, swiz0[i]);
else
add_src_reg(ctx, instr, src1, swiz1[i]);
add_src_reg(ctx, instr, tmp_src, 0);
n++;
}
ir3_instr_create(ctx->ir, 0, OPC_NOP);
ir3_instr_create(ctx->ir, 0, OPC_NOP)->repeat = 2;
/* pad out to multiple of 4 scalar instructions: */
for (i = 2 * n; i % 4; i++) {
ir3_instr_create(ctx->ir, 0, OPC_NOP);
}
create_mov(ctx, dst, &tmp_src);
create_mov(ctx, dst, tmp_src);
}
/* LRP(a,b,c) = (a * b) + ((1 - a) * c) */
@@ -581,37 +690,39 @@ trans_lrp(const struct instr_translater *t,
{
struct ir3_instruction *instr;
struct tgsi_dst_register tmp_dst1, tmp_dst2;
struct tgsi_src_register tmp_src1, tmp_src2;
struct tgsi_src_register *tmp_src1, *tmp_src2;
struct tgsi_src_register tmp_const;
struct tgsi_src_register *src0 = &inst->Src[0].Register;
struct tgsi_src_register *src1 = &inst->Src[1].Register;
get_internal_temp(ctx, &tmp_dst1, &tmp_src1);
get_internal_temp(ctx, &tmp_dst2, &tmp_src2);
if (is_const(src0) && is_const(src1))
src0 = get_unconst(ctx, src0);
tmp_src1 = get_internal_temp(ctx, &tmp_dst1);
tmp_src2 = get_internal_temp(ctx, &tmp_dst2);
get_immediate(ctx, &tmp_const, fui(1.0));
/* tmp1 = (a * b) */
instr = ir3_instr_create(ctx->ir, 2, OPC_MUL_F);
vectorize(ctx, instr, &tmp_dst1, 2,
&inst->Src[0].Register, 0,
&inst->Src[1].Register, 0);
vectorize(ctx, instr, &tmp_dst1, 2, src0, 0, src1, 0);
/* tmp2 = (1 - a) */
instr = ir3_instr_create(ctx->ir, 2, OPC_ADD_F);
vectorize(ctx, instr, &tmp_dst2, 2,
&tmp_const, 0,
&inst->Src[0].Register, IR3_REG_NEGATE);
vectorize(ctx, instr, &tmp_dst2, 2, &tmp_const, 0,
src0, IR3_REG_NEGATE);
/* tmp2 = tmp2 * c */
instr = ir3_instr_create(ctx->ir, 2, OPC_MUL_F);
vectorize(ctx, instr, &tmp_dst2, 2,
&tmp_src2, 0,
tmp_src2, 0,
&inst->Src[2].Register, 0);
/* dst = tmp1 + tmp2 */
instr = ir3_instr_create(ctx->ir, 2, OPC_ADD_F);
vectorize(ctx, instr, &inst->Dst[0].Register, 2,
&tmp_src1, 0,
&tmp_src2, 0);
tmp_src1, 0,
tmp_src2, 0);
}
/* FRC(x) = x - FLOOR(x) */
@@ -622,9 +733,9 @@ trans_frac(const struct instr_translater *t,
{
struct ir3_instruction *instr;
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register tmp_src;
struct tgsi_src_register *tmp_src;
get_internal_temp(ctx, &tmp_dst, &tmp_src);
tmp_src = get_internal_temp(ctx, &tmp_dst);
/* tmp = FLOOR(x) */
instr = ir3_instr_create(ctx->ir, 2, OPC_FLOOR_F);
@@ -635,7 +746,7 @@ trans_frac(const struct instr_translater *t,
instr = ir3_instr_create(ctx->ir, 2, OPC_ADD_F);
vectorize(ctx, instr, &inst->Dst[0].Register, 2,
&inst->Src[0].Register, 0,
&tmp_src, IR3_REG_NEGATE);
tmp_src, IR3_REG_NEGATE);
}
/* POW(a,b) = EXP2(b * LOG2(a)) */
@@ -647,24 +758,24 @@ trans_pow(const struct instr_translater *t,
struct ir3_instruction *instr;
struct ir3_register *r;
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register tmp_src;
struct tgsi_src_register *tmp_src;
struct tgsi_dst_register *dst = &inst->Dst[0].Register;
struct tgsi_src_register *src0 = &inst->Src[0].Register;
struct tgsi_src_register *src1 = &inst->Src[1].Register;
get_internal_temp_repl(ctx, &tmp_dst, &tmp_src);
tmp_src = get_internal_temp_repl(ctx, &tmp_dst);
/* log2 Rtmp, Rsrc0 */
ir3_instr_create(ctx->ir, 0, OPC_NOP)->repeat = 5;
instr = ir3_instr_create(ctx->ir, 4, OPC_LOG2);
r = add_dst_reg(ctx, instr, &tmp_dst, 0);
add_src_reg(ctx, instr, src0, src0->SwizzleX);
regmask_set(ctx->needs_ss, r);
regmask_set(ctx->needs_ss, r, TGSI_WRITEMASK_X);
/* mul.f Rtmp, Rtmp, Rsrc1 */
instr = ir3_instr_create(ctx->ir, 2, OPC_MUL_F);
add_dst_reg(ctx, instr, &tmp_dst, 0);
add_src_reg(ctx, instr, &tmp_src, 0);
add_src_reg(ctx, instr, tmp_src, 0);
add_src_reg(ctx, instr, src1, src1->SwizzleX);
/* blob compiler seems to ensure there are at least 6 instructions
@@ -676,10 +787,10 @@ trans_pow(const struct instr_translater *t,
/* exp2 Rdst, Rtmp */
instr = ir3_instr_create(ctx->ir, 4, OPC_EXP2);
r = add_dst_reg(ctx, instr, &tmp_dst, 0);
add_src_reg(ctx, instr, &tmp_src, 0);
regmask_set(ctx->needs_ss, r);
add_src_reg(ctx, instr, tmp_src, 0);
regmask_set(ctx->needs_ss, r, TGSI_WRITEMASK_X);
create_mov(ctx, dst, &tmp_src);
create_mov(ctx, dst, tmp_src);
}
/* texture fetch/sample instructions: */
@@ -690,8 +801,6 @@ trans_samp(const struct instr_translater *t,
{
struct ir3_register *r;
struct ir3_instruction *instr;
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register tmp_src;
struct tgsi_src_register *coord = &inst->Src[0].Register;
struct tgsi_src_register *samp = &inst->Src[1].Register;
unsigned tex = inst->Texture.Texture;
@@ -711,7 +820,7 @@ trans_samp(const struct instr_translater *t,
flags |= IR3_INSTR_P;
break;
default:
assert(0);
compile_assert(ctx, 0);
break;
}
@@ -726,10 +835,13 @@ trans_samp(const struct instr_translater *t,
*/
for (i = 1; (i < 4) && (order[i] >= 0); i++) {
if (src_swiz(coord, i) != (src_swiz(coord, 0) + order[i])) {
type_t type_mov = get_type(ctx);
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
type_t type_mov = get_ftype(ctx);
/* need to move things around: */
get_internal_temp(ctx, &tmp_dst, &tmp_src);
tmp_src = get_internal_temp(ctx, &tmp_dst);
for (j = 0; (j < 4) && (order[j] >= 0); j++) {
instr = ir3_instr_create(ctx->ir, 1, 0);
@@ -740,7 +852,7 @@ trans_samp(const struct instr_translater *t,
src_swiz(coord, order[j]));
}
coord = &tmp_src;
coord = tmp_src;
if (j < 4)
ir3_instr_create(ctx->ir, 0, OPC_NOP)->repeat = 4 - j - 1;
@@ -750,7 +862,7 @@ trans_samp(const struct instr_translater *t,
}
instr = ir3_instr_create(ctx->ir, 5, t->opc);
instr->cat5.type = get_type(ctx);
instr->cat5.type = get_ftype(ctx);
instr->cat5.samp = samp->Index;
instr->cat5.tex = samp->Index;
instr->flags |= flags;
@@ -760,10 +872,42 @@ trans_samp(const struct instr_translater *t,
add_src_reg(ctx, instr, coord, coord->SwizzleX);
regmask_set(ctx->needs_sy, r);
regmask_set(ctx->needs_sy, r, r->wrmask);
}
/* CMP(a,b,c) = (a < 0) ? b : c */
/*
* SEQ(a,b) = (a == b) ? 1.0 : 0.0
* cmps.f.eq tmp0, b, a
* cov.u16f16 dst, tmp0
*
* SNE(a,b) = (a != b) ? 1.0 : 0.0
* cmps.f.eq tmp0, b, a
* add.s tmp0, tmp0, -1
* sel.f16 dst, {0.0}, tmp0, {1.0}
*
* SGE(a,b) = (a >= b) ? 1.0 : 0.0
* cmps.f.ge tmp0, a, b
* cov.u16f16 dst, tmp0
*
* SLE(a,b) = (a <= b) ? 1.0 : 0.0
* cmps.f.ge tmp0, b, a
* cov.u16f16 dst, tmp0
*
* SGT(a,b) = (a > b) ? 1.0 : 0.0
* cmps.f.ge tmp0, b, a
* add.s tmp0, tmp0, -1
* sel.f16 dst, {0.0}, tmp0, {1.0}
*
* SLT(a,b) = (a < b) ? 1.0 : 0.0
* cmps.f.ge tmp0, a, b
* add.s tmp0, tmp0, -1
* sel.f16 dst, {0.0}, tmp0, {1.0}
*
* CMP(a,b,c) = (a < 0.0) ? b : c
* cmps.f.ge tmp0, a, {0.0}
* add.s tmp0, tmp0, -1
* sel.f16 dst, c, tmp0, b
*/
static void
trans_cmp(const struct instr_translater *t,
struct fd3_compile_context *ctx,
@@ -771,35 +915,94 @@ trans_cmp(const struct instr_translater *t,
{
struct ir3_instruction *instr;
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register tmp_src;
struct tgsi_src_register constval;
/* final instruction uses original src1 and src2, so we need get_dst() */
struct tgsi_src_register *tmp_src;
struct tgsi_src_register constval0, constval1;
/* final instruction for CMP() uses orig src1 and src2: */
struct tgsi_dst_register *dst = get_dst(ctx, inst);
struct tgsi_src_register *a0, *a1;
unsigned condition;
get_internal_temp(ctx, &tmp_dst, &tmp_src);
tmp_src = get_internal_temp(ctx, &tmp_dst);
/* cmps.f.ge tmp, src0, 0.0 */
switch (t->tgsi_opc) {
case TGSI_OPCODE_SEQ:
case TGSI_OPCODE_SNE:
a0 = &inst->Src[1].Register; /* b */
a1 = &inst->Src[0].Register; /* a */
condition = IR3_COND_EQ;
break;
case TGSI_OPCODE_SGE:
case TGSI_OPCODE_SLT:
a0 = &inst->Src[0].Register; /* a */
a1 = &inst->Src[1].Register; /* b */
condition = IR3_COND_GE;
break;
case TGSI_OPCODE_SLE:
case TGSI_OPCODE_SGT:
a0 = &inst->Src[1].Register; /* b */
a1 = &inst->Src[0].Register; /* a */
condition = IR3_COND_GE;
break;
case TGSI_OPCODE_CMP:
get_immediate(ctx, &constval0, fui(0.0));
a0 = &inst->Src[0].Register; /* a */
a1 = &constval0; /* {0.0} */
condition = IR3_COND_GE;
break;
default:
compile_assert(ctx, 0);
return;
}
if (is_const(a0) && is_const(a1))
a0 = get_unconst(ctx, a0);
/* cmps.f.ge tmp, a0, a1 */
instr = ir3_instr_create(ctx->ir, 2, OPC_CMPS_F);
instr->cat2.condition = IR3_COND_GE;
get_immediate(ctx, &constval, fui(0.0));
vectorize(ctx, instr, &tmp_dst, 2,
&inst->Src[0].Register, 0,
&constval, 0);
instr->cat2.condition = condition;
vectorize(ctx, instr, &tmp_dst, 2, a0, 0, a1, 0);
/* add.s tmp, tmp, -1 */
instr = ir3_instr_create(ctx->ir, 2, OPC_ADD_S);
instr->repeat = 3;
add_dst_reg(ctx, instr, &tmp_dst, 0);
add_src_reg(ctx, instr, &tmp_src, 0);
ir3_reg_create(instr, 0, IR3_REG_IMMED)->iim_val = -1;
switch (t->tgsi_opc) {
case TGSI_OPCODE_SEQ:
case TGSI_OPCODE_SGE:
case TGSI_OPCODE_SLE:
/* cov.u16f16 dst, tmp0 */
instr = ir3_instr_create(ctx->ir, 1, 0);
instr->cat1.src_type = get_utype(ctx);
instr->cat1.dst_type = get_ftype(ctx);
vectorize(ctx, instr, dst, 1, tmp_src, 0);
break;
case TGSI_OPCODE_SNE:
case TGSI_OPCODE_SGT:
case TGSI_OPCODE_SLT:
case TGSI_OPCODE_CMP:
/* add.s tmp, tmp, -1 */
instr = ir3_instr_create(ctx->ir, 2, OPC_ADD_S);
instr->repeat = 3;
add_dst_reg(ctx, instr, &tmp_dst, 0);
add_src_reg(ctx, instr, tmp_src, 0)->flags |= IR3_REG_R;
ir3_reg_create(instr, 0, IR3_REG_IMMED)->iim_val = -1;
/* sel.{f32,f16} dst, src2, tmp, src1 */
instr = ir3_instr_create(ctx->ir, 3, ctx->so->half_precision ?
OPC_SEL_F16 : OPC_SEL_F32);
vectorize(ctx, instr, &inst->Dst[0].Register, 3,
&inst->Src[2].Register, 0,
&tmp_src, 0,
&inst->Src[1].Register, 0);
if (t->tgsi_opc == TGSI_OPCODE_CMP) {
/* sel.{f32,f16} dst, src2, tmp, src1 */
instr = ir3_instr_create(ctx->ir, 3,
ctx->so->half_precision ? OPC_SEL_F16 : OPC_SEL_F32);
vectorize(ctx, instr, dst, 3,
&inst->Src[2].Register, 0,
tmp_src, 0,
&inst->Src[1].Register, 0);
} else {
get_immediate(ctx, &constval0, fui(0.0));
get_immediate(ctx, &constval1, fui(1.0));
/* sel.{f32,f16} dst, {0.0}, tmp0, {1.0} */
instr = ir3_instr_create(ctx->ir, 3,
ctx->so->half_precision ? OPC_SEL_F16 : OPC_SEL_F32);
vectorize(ctx, instr, dst, 3,
&constval0, 0, tmp_src, 0, &constval1, 0);
}
break;
}
put_dst(ctx, inst, dst);
}
@@ -858,10 +1061,13 @@ trans_if(const struct instr_translater *t,
get_immediate(ctx, &constval, fui(0.0));
if (is_const(src))
src = get_unconst(ctx, src);
instr = ir3_instr_create(ctx->ir, 2, OPC_CMPS_F);
ir3_reg_create(instr, regid(REG_P0, 0), 0);
add_src_reg(ctx, instr, &constval, constval.SwizzleX);
add_src_reg(ctx, instr, src, src->SwizzleX);
add_src_reg(ctx, instr, &constval, constval.SwizzleX);
instr->cat2.condition = IR3_COND_EQ;
instr = ir3_instr_create(ctx->ir, 0, OPC_BR);
@@ -939,16 +1145,12 @@ instr_cat2(const struct instr_translater *t,
struct tgsi_full_instruction *inst)
{
struct tgsi_dst_register *dst = get_dst(ctx, inst);
struct tgsi_src_register *src0 = &inst->Src[0].Register;
struct tgsi_src_register *src1 = &inst->Src[1].Register;
struct ir3_instruction *instr;
unsigned src0_flags = 0;
instr = ir3_instr_create(ctx->ir, 2, t->opc);
switch (t->tgsi_opc) {
case TGSI_OPCODE_SLT:
case TGSI_OPCODE_SGE:
instr->cat2.condition = t->arg;
break;
case TGSI_OPCODE_ABS:
src0_flags = IR3_REG_ABS;
break;
@@ -970,48 +1172,65 @@ instr_cat2(const struct instr_translater *t,
case OPC_SETRM:
case OPC_CBITS_B:
/* these only have one src reg */
vectorize(ctx, instr, dst, 1,
&inst->Src[0].Register, src0_flags);
instr = ir3_instr_create(ctx->ir, 2, t->opc);
vectorize(ctx, instr, dst, 1, src0, src0_flags);
break;
default:
vectorize(ctx, instr, dst, 2,
&inst->Src[0].Register, src0_flags,
&inst->Src[1].Register, 0);
if (is_const(src0) && is_const(src1))
src0 = get_unconst(ctx, src0);
instr = ir3_instr_create(ctx->ir, 2, t->opc);
vectorize(ctx, instr, dst, 2, src0, src0_flags, src1, 0);
break;
}
put_dst(ctx, inst, dst);
}
static bool is_mad(opc_t opc)
{
switch (opc) {
case OPC_MAD_U16:
case OPC_MADSH_U16:
case OPC_MAD_S16:
case OPC_MADSH_M16:
case OPC_MAD_U24:
case OPC_MAD_S24:
case OPC_MAD_F16:
case OPC_MAD_F32:
return true;
default:
return false;
}
}
static void
instr_cat3(const struct instr_translater *t,
struct fd3_compile_context *ctx,
struct tgsi_full_instruction *inst)
{
struct tgsi_dst_register *dst = get_dst(ctx, inst);
struct tgsi_src_register *src0 = &inst->Src[0].Register;
struct tgsi_src_register *src1 = &inst->Src[1].Register;
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register tmp_src;
struct ir3_instruction *instr;
/* Blob compiler never seems to use a const in src1 position..
* although there does seem (according to disassembler hidden
* in libllvm-a3xx.so) to be a bit to indicate that src1 is a
* const. Not sure if this is a hw bug, or simply that the
* disassembler lies.
/* in particular, can't handle const for src1 for cat3..
* for mad, we can swap first two src's if needed:
*/
if ((src1->File == TGSI_FILE_CONSTANT) ||
(src1->File == TGSI_FILE_IMMEDIATE)) {
get_internal_temp(ctx, &tmp_dst, &tmp_src);
create_mov(ctx, &tmp_dst, src1);
src1 = &tmp_src;
if (is_const(src1)) {
if (is_mad(t->opc) && !is_const(src0)) {
struct tgsi_src_register *tmp;
tmp = src0;
src0 = src1;
src1 = tmp;
} else {
src0 = get_unconst(ctx, src0);
}
}
instr = ir3_instr_create(ctx->ir, 3,
ctx->so->half_precision ? t->hopc : t->opc);
vectorize(ctx, instr, dst, 3,
&inst->Src[0].Register, 0,
src1, 0,
vectorize(ctx, instr, dst, 3, src0, 0, src1, 0,
&inst->Src[2].Register, 0);
put_dst(ctx, inst, dst);
}
@@ -1022,15 +1241,20 @@ instr_cat4(const struct instr_translater *t,
struct tgsi_full_instruction *inst)
{
struct tgsi_dst_register *dst = get_dst(ctx, inst);
struct tgsi_src_register *src = &inst->Src[0].Register;
struct ir3_instruction *instr;
/* seems like blob compiler avoids const as src.. */
if (is_const(src))
src = get_unconst(ctx, src);
ir3_instr_create(ctx->ir, 0, OPC_NOP)->repeat = 5;
instr = ir3_instr_create(ctx->ir, 4, t->opc);
vectorize(ctx, instr, dst, 1,
&inst->Src[0].Register, 0);
vectorize(ctx, instr, dst, 1, src, 0);
regmask_set(ctx->needs_ss, instr->regs[0]);
regmask_set(ctx->needs_ss, instr->regs[0],
inst->Dst[0].Register.WriteMask);
put_dst(ctx, inst, dst);
}
@@ -1051,12 +1275,11 @@ static const struct instr_translater translaters[TGSI_OPCODE_LAST] = {
INSTR(DPH, trans_dotp, .arg = 3), /* almost like DP3 */
INSTR(MIN, instr_cat2, .opc = OPC_MIN_F),
INSTR(MAX, instr_cat2, .opc = OPC_MAX_F),
INSTR(SLT, instr_cat2, .opc = OPC_CMPS_F, .arg = IR3_COND_LT),
INSTR(SGE, instr_cat2, .opc = OPC_CMPS_F, .arg = IR3_COND_GE),
INSTR(MAD, instr_cat3, .opc = OPC_MAD_F32, .hopc = OPC_MAD_F16),
INSTR(LRP, trans_lrp),
INSTR(FRC, trans_frac),
INSTR(FLR, instr_cat2, .opc = OPC_FLOOR_F),
INSTR(ARL, instr_cat2, .opc = OPC_FLOOR_F),
INSTR(EX2, instr_cat4, .opc = OPC_EXP2),
INSTR(LG2, instr_cat4, .opc = OPC_LOG2),
INSTR(POW, trans_pow),
@@ -1065,6 +1288,12 @@ static const struct instr_translater translaters[TGSI_OPCODE_LAST] = {
INSTR(SIN, instr_cat4, .opc = OPC_COS),
INSTR(TEX, trans_samp, .opc = OPC_SAM, .arg = TGSI_OPCODE_TEX),
INSTR(TXP, trans_samp, .opc = OPC_SAM, .arg = TGSI_OPCODE_TXP),
INSTR(SGT, trans_cmp),
INSTR(SLT, trans_cmp),
INSTR(SGE, trans_cmp),
INSTR(SLE, trans_cmp),
INSTR(SNE, trans_cmp),
INSTR(SEQ, trans_cmp),
INSTR(CMP, trans_cmp),
INSTR(IF, trans_if),
INSTR(ELSE, trans_else),
@@ -1132,7 +1361,7 @@ decl_out(struct fd3_compile_context *ctx, struct tgsi_full_declaration *decl)
unsigned name = decl->Semantic.Name;
unsigned i;
assert(decl->Declaration.Semantic); // TODO is this ever not true?
compile_assert(ctx, decl->Declaration.Semantic); // TODO is this ever not true?
DBG("decl out[%d] -> r%d", name, decl->Range.First + base); // XXX
@@ -1152,9 +1381,8 @@ decl_out(struct fd3_compile_context *ctx, struct tgsi_full_declaration *decl)
so->outputs[so->outputs_count++].regid = regid(i + base, 0);
break;
default:
DBG("unknown VS semantic name: %s",
compile_error(ctx, "unknown VS semantic name: %s\n",
tgsi_semantic_names[name]);
assert(0);
}
} else {
switch (name) {
@@ -1162,9 +1390,8 @@ decl_out(struct fd3_compile_context *ctx, struct tgsi_full_declaration *decl)
so->color_regid = regid(decl->Range.First + base, 0);
break;
default:
DBG("unknown VS semantic name: %s",
compile_error(ctx, "unknown VS semantic name: %s\n",
tgsi_semantic_names[name]);
assert(0);
}
}
}
@@ -1223,10 +1450,19 @@ compile_instructions(struct fd3_compile_context *ctx)
t->fxn(t, ctx, inst);
ctx->num_internal_temps = 0;
} else {
debug_printf("unknown TGSI opc: %s\n",
compile_error(ctx, "unknown TGSI opc: %s\n",
tgsi_get_opcode_name(opc));
tgsi_dump(ctx->tokens, 0);
assert(0);
}
switch (inst->Instruction.Saturate) {
case TGSI_SAT_ZERO_ONE:
create_clamp_imm(ctx, &inst->Dst[0].Register,
fui(0.0), fui(1.0));
break;
case TGSI_SAT_MINUS_PLUS_ONE:
create_clamp_imm(ctx, &inst->Dst[0].Register,
fui(-1.0), fui(1.0));
break;
}
break;
@@ -1253,6 +1489,8 @@ fd3_compile_shader(struct fd3_shader_stateobj *so,
so->ir = ir3_shader_create();
assert(so->ir);
so->color_regid = regid(63,0);
so->pos_regid = regid(63,0);
so->psize_regid = regid(63,0);

View File

@@ -40,7 +40,18 @@
static void
fd3_context_destroy(struct pipe_context *pctx)
{
struct fd3_context *fd3_ctx = fd3_context(fd_context(pctx));
fd3_prog_fini(pctx);
fd_bo_del(fd3_ctx->vs_pvt_mem);
fd_bo_del(fd3_ctx->fs_pvt_mem);
fd_bo_del(fd3_ctx->vsc_size_mem);
fd_bo_del(fd3_ctx->vsc_pipe_mem);
pipe_resource_reference(&fd3_ctx->solid_vbuf, NULL);
pipe_resource_reference(&fd3_ctx->blit_texcoord_vbuf, NULL);
fd_context_destroy(pctx);
}

View File

@@ -81,7 +81,7 @@ fd3_emit_constant(struct fd_ringbuffer *ring,
if (prsc) {
struct fd_bo *bo = fd_resource(prsc)->bo;
OUT_RELOC(ring, bo, offset,
CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS));
CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS), 0);
} else {
OUT_RING(ring, CP_LOAD_STATE_1_EXT_SRC_ADDR(0) |
CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS));
@@ -212,7 +212,7 @@ emit_textures(struct fd_ringbuffer *ring,
for (i = 0; i < tex->num_textures; i++) {
struct fd3_pipe_sampler_view *view =
fd3_pipe_sampler_view(tex->textures[i]);
OUT_RELOC(ring, view->tex_resource->bo, 0, 0);
OUT_RELOC(ring, view->tex_resource->bo, 0, 0, 0);
/* I think each entry is a ptr to mipmap level.. for now, just
* pad w/ null's until I get around to actually implementing
* mipmap support..
@@ -279,9 +279,9 @@ fd3_emit_gmem_restore_tex(struct fd_ringbuffer *ring, struct pipe_surface *psurf
CP_LOAD_STATE_1_EXT_SRC_ADDR(0));
OUT_RING(ring, A3XX_TEX_CONST_0_FMT(fd3_pipe2tex(psurf->format)) |
0x40000000 | // XXX
fd3_tex_swiz(psurf->format, PIPE_SWIZZLE_BLUE, PIPE_SWIZZLE_GREEN,
PIPE_SWIZZLE_RED, PIPE_SWIZZLE_ALPHA));
OUT_RING(ring, A3XX_TEX_CONST_1_FETCHSIZE(fd3_pipe2fetchsize(psurf->format)) |
fd3_tex_swiz(psurf->format, PIPE_SWIZZLE_RED, PIPE_SWIZZLE_GREEN,
PIPE_SWIZZLE_BLUE, PIPE_SWIZZLE_ALPHA));
OUT_RING(ring, A3XX_TEX_CONST_1_FETCHSIZE(TFETCH_DISABLE) |
A3XX_TEX_CONST_1_WIDTH(psurf->width) |
A3XX_TEX_CONST_1_HEIGHT(psurf->height));
OUT_RING(ring, A3XX_TEX_CONST_2_PITCH(rsc->pitch * rsc->cpp) |
@@ -296,7 +296,7 @@ fd3_emit_gmem_restore_tex(struct fd_ringbuffer *ring, struct pipe_surface *psurf
CP_LOAD_STATE_0_NUM_UNIT(1));
OUT_RING(ring, CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS) |
CP_LOAD_STATE_1_EXT_SRC_ADDR(0));
OUT_RELOC(ring, rsc->bo, 0, 0);
OUT_RELOC(ring, rsc->bo, 0, 0, 0);
}
void
@@ -322,7 +322,7 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring,
COND(switchnext, A3XX_VFD_FETCH_INSTR_0_SWITCHNEXT) |
A3XX_VFD_FETCH_INSTR_0_INDEXCODE(i) |
A3XX_VFD_FETCH_INSTR_0_STEPRATE(1));
OUT_RELOC(ring, rsc->bo, vbufs[i].offset, 0);
OUT_RELOC(ring, rsc->bo, vbufs[i].offset, 0, 0);
OUT_PKT0(ring, REG_A3XX_VFD_DECODE_INSTR(i), 1);
OUT_RING(ring, A3XX_VFD_DECODE_INSTR_CONSTFILL |
@@ -481,12 +481,12 @@ fd3_emit_restore(struct fd_context *ctx)
OUT_PKT0(ring, REG_A3XX_SP_VS_PVT_MEM_CTRL_REG, 3);
OUT_RING(ring, 0x08000001); /* SP_VS_PVT_MEM_CTRL_REG */
OUT_RELOC(ring, fd3_ctx->vs_pvt_mem, 0, 0); /* SP_VS_PVT_MEM_ADDR_REG */
OUT_RELOC(ring, fd3_ctx->vs_pvt_mem, 0,0,0); /* SP_VS_PVT_MEM_ADDR_REG */
OUT_RING(ring, 0x00000000); /* SP_VS_PVT_MEM_SIZE_REG */
OUT_PKT0(ring, REG_A3XX_SP_FS_PVT_MEM_CTRL_REG, 3);
OUT_RING(ring, 0x08000001); /* SP_FS_PVT_MEM_CTRL_REG */
OUT_RELOC(ring, fd3_ctx->fs_pvt_mem, 0, 0); /* SP_FS_PVT_MEM_ADDR_REG */
OUT_RELOC(ring, fd3_ctx->fs_pvt_mem, 0,0,0); /* SP_FS_PVT_MEM_ADDR_REG */
OUT_RING(ring, 0x00000000); /* SP_FS_PVT_MEM_SIZE_REG */
OUT_PKT0(ring, REG_A3XX_PC_VERTEX_REUSE_BLOCK_CNTL, 1);
@@ -536,8 +536,8 @@ fd3_emit_restore(struct fd_context *ctx)
OUT_PKT0(ring, REG_A3XX_UNKNOWN_0C3D, 1);
OUT_RING(ring, 0x00000001); /* UNKNOWN_0C3D */
OUT_PKT0(ring, REG_A3XX_UNKNOWN_0E00, 1);
OUT_RING(ring, 0x00000000); /* UNKNOWN_0E00 */
OUT_PKT0(ring, REG_A3XX_HLSQ_PERFCOUNTER0_SELECT, 1);
OUT_RING(ring, 0x00000000); /* HLSQ_PERFCOUNTER0_SELECT */
OUT_PKT0(ring, REG_A3XX_HLSQ_CONST_VSPRESV_RANGE_REG, 2);
OUT_RING(ring, A3XX_HLSQ_CONST_VSPRESV_RANGE_REG_STARTENTRY(0) |
@@ -549,7 +549,7 @@ fd3_emit_restore(struct fd_context *ctx)
OUT_RING(ring, 0x00000001); /* UCHE_CACHE_MODE_CONTROL_REG */
OUT_PKT0(ring, REG_A3XX_VSC_SIZE_ADDRESS, 1);
OUT_RELOC(ring, fd3_ctx->vsc_size_mem, 0, 0); /* VSC_SIZE_ADDRESS */
OUT_RELOC(ring, fd3_ctx->vsc_size_mem, 0, 0, 0); /* VSC_SIZE_ADDRESS */
OUT_PKT0(ring, REG_A3XX_GRAS_CL_CLIP_CNTL, 1);
OUT_RING(ring, 0x00000000); /* GRAS_CL_CLIP_CNTL */

View File

@@ -89,7 +89,7 @@ emit_mrt(struct fd_ringbuffer *ring, unsigned nr_bufs,
if (bin_w || (i >= nr_bufs)) {
OUT_RING(ring, A3XX_RB_MRT_BUF_BASE_COLOR_BUF_BASE(base));
} else {
OUT_RELOCS(ring, res->bo, 0, 0, -1);
OUT_RELOCW(ring, res->bo, 0, 0, -1);
}
OUT_PKT0(ring, REG_A3XX_SP_FS_IMAGE_OUTPUT_REG(i), 1);
@@ -116,7 +116,7 @@ emit_gmem2mem_surf(struct fd_ringbuffer *ring,
OUT_RING(ring, A3XX_RB_COPY_CONTROL_MSAA_RESOLVE(MSAA_ONE) |
A3XX_RB_COPY_CONTROL_MODE(mode) |
A3XX_RB_COPY_CONTROL_GMEM_BASE(base));
OUT_RELOCS(ring, rsc->bo, 0, 0, -1); /* RB_COPY_DEST_BASE */
OUT_RELOCW(ring, rsc->bo, 0, 0, -1); /* RB_COPY_DEST_BASE */
OUT_RING(ring, A3XX_RB_COPY_DEST_PITCH_PITCH(rsc->pitch * rsc->cpp));
OUT_RING(ring, A3XX_RB_COPY_DEST_INFO_TILE(LINEAR) |
A3XX_RB_COPY_DEST_INFO_FORMAT(fd3_pipe2color(psurf->format)) |
@@ -168,6 +168,14 @@ fd3_emit_tile_gmem2mem(struct fd_context *ctx, uint32_t xoff, uint32_t yoff,
OUT_PKT0(ring, REG_A3XX_GRAS_CL_CLIP_CNTL, 1);
OUT_RING(ring, 0x00000000); /* GRAS_CL_CLIP_CNTL */
OUT_PKT0(ring, REG_A3XX_GRAS_CL_VPORT_XOFFSET, 6);
OUT_RING(ring, A3XX_GRAS_CL_VPORT_XOFFSET((float)pfb->width/2.0 - 0.5));
OUT_RING(ring, A3XX_GRAS_CL_VPORT_XSCALE((float)pfb->width/2.0));
OUT_RING(ring, A3XX_GRAS_CL_VPORT_YOFFSET((float)pfb->height/2.0 - 0.5));
OUT_RING(ring, A3XX_GRAS_CL_VPORT_YSCALE(-(float)pfb->height/2.0));
OUT_RING(ring, A3XX_GRAS_CL_VPORT_ZOFFSET(0.0));
OUT_RING(ring, A3XX_GRAS_CL_VPORT_ZSCALE(1.0));
OUT_PKT0(ring, REG_A3XX_RB_MODE_CONTROL, 1);
OUT_RING(ring, A3XX_RB_MODE_CONTROL_RENDER_MODE(RB_RESOLVE_PASS) |
A3XX_RB_MODE_CONTROL_MARB_CACHE_SPLIT_MODE);
@@ -206,8 +214,12 @@ fd3_emit_tile_gmem2mem(struct fd_context *ctx, uint32_t xoff, uint32_t yoff,
}, 1);
if (ctx->resolve & (FD_BUFFER_DEPTH | FD_BUFFER_STENCIL)) {
uint32_t base = depth_base(&ctx->gmem) *
fd_resource(pfb->cbufs[0]->texture)->cpp;
uint32_t base = 0;
if (pfb->cbufs[0]) {
struct fd_resource *rsc =
fd_resource(pfb->cbufs[0]->texture);
base = depth_base(&ctx->gmem) * rsc->cpp;
}
emit_gmem2mem_surf(ring, RB_COPY_DEPTH_STENCIL, base, pfb->zsbuf);
}
@@ -260,7 +272,7 @@ fd3_emit_tile_mem2gmem(struct fd_context *ctx, uint32_t xoff, uint32_t yoff,
y1 = ((float)yoff + bin_h) / ((float)pfb->height);
OUT_PKT3(ring, CP_MEM_WRITE, 5);
OUT_RELOC(ring, fd_resource(fd3_ctx->blit_texcoord_vbuf)->bo, 0, 0);
OUT_RELOC(ring, fd_resource(fd3_ctx->blit_texcoord_vbuf)->bo, 0, 0, 0);
OUT_RING(ring, fui(x0));
OUT_RING(ring, fui(y0));
OUT_RING(ring, fui(x1));
@@ -383,7 +395,7 @@ update_vsc_pipe(struct fd_context *ctx)
A3XX_VSC_PIPE_CONFIG_Y(0) |
A3XX_VSC_PIPE_CONFIG_W(gmem->nbins_x) |
A3XX_VSC_PIPE_CONFIG_H(gmem->nbins_y));
OUT_RELOC(ring, bo, 0, 0); /* VSC_PIPE[0].DATA_ADDRESS */
OUT_RELOC(ring, bo, 0, 0, 0); /* VSC_PIPE[0].DATA_ADDRESS */
OUT_RING(ring, fd_bo_size(bo) - 32); /* VSC_PIPE[0].DATA_LENGTH */
for (i = 1; i < 8; i++) {
@@ -402,8 +414,11 @@ static void
fd3_emit_sysmem_prep(struct fd_context *ctx)
{
struct pipe_framebuffer_state *pfb = &ctx->framebuffer;
struct fd_resource *rsc = fd_resource(pfb->cbufs[0]->texture);
struct fd_ringbuffer *ring = ctx->ring;
uint32_t pitch = 0;
if (pfb->cbufs[0])
pitch = fd_resource(pfb->cbufs[0]->texture)->pitch;
fd3_emit_restore(ctx);
@@ -414,7 +429,7 @@ fd3_emit_sysmem_prep(struct fd_context *ctx)
emit_mrt(ring, pfb->nr_cbufs, pfb->cbufs, NULL, 0);
fd3_emit_rbrc_tile_state(ring,
A3XX_RB_RENDER_CONTROL_BIN_WIDTH(rsc->pitch));
A3XX_RB_RENDER_CONTROL_BIN_WIDTH(pitch));
/* setup scissor/offset for current tile: */
OUT_PKT0(ring, REG_A3XX_PA_SC_WINDOW_OFFSET, 1);

View File

@@ -249,7 +249,7 @@ fd3_program_emit(struct fd_ringbuffer *ring,
*/
for (i = 0; i < 6; i++) {
OUT_PKT0(ring, REG_A3XX_SP_PERFCOUNTER0_SELECT, 1);
OUT_RING(ring, 0x00000000); /* SP_PERFCOUNTER4_SELECT */
OUT_RING(ring, 0x00000000); /* SP_PERFCOUNTER0_SELECT */
OUT_PKT0(ring, REG_A3XX_SP_PERFCOUNTER4_SELECT, 1);
OUT_RING(ring, 0x00000000); /* SP_PERFCOUNTER4_SELECT */
@@ -320,7 +320,7 @@ fd3_program_emit(struct fd_ringbuffer *ring,
OUT_PKT0(ring, REG_A3XX_SP_VS_OBJ_OFFSET_REG, 2);
OUT_RING(ring, A3XX_SP_VS_OBJ_OFFSET_REG_CONSTOBJECTOFFSET(0) |
A3XX_SP_VS_OBJ_OFFSET_REG_SHADEROBJOFFSET(0));
OUT_RELOC(ring, vp->bo, 0, 0); /* SP_VS_OBJ_START_REG */
OUT_RELOC(ring, vp->bo, 0, 0, 0); /* SP_VS_OBJ_START_REG */
#endif
OUT_PKT0(ring, REG_A3XX_SP_FS_LENGTH_REG, 1);
@@ -345,7 +345,7 @@ fd3_program_emit(struct fd_ringbuffer *ring,
OUT_PKT0(ring, REG_A3XX_SP_FS_OBJ_OFFSET_REG, 2);
OUT_RING(ring, A3XX_SP_FS_OBJ_OFFSET_REG_CONSTOBJECTOFFSET(128) |
A3XX_SP_FS_OBJ_OFFSET_REG_SHADEROBJOFFSET(128 - fp->instrlen));
OUT_RELOC(ring, fp->bo, 0, 0); /* SP_FS_OBJ_START_REG */
OUT_RELOC(ring, fp->bo, 0, 0, 0); /* SP_FS_OBJ_START_REG */
#endif
OUT_PKT0(ring, REG_A3XX_SP_FS_FLAT_SHAD_MODE_REG_0, 2);

View File

@@ -87,6 +87,7 @@ fd3_sampler_state_create(struct pipe_context *pctx,
so->base = *cso;
so->texsamp0 =
COND(!cso->normalized_coords, A3XX_TEX_SAMP_0_UNNORM_COORDS) |
A3XX_TEX_SAMP_0_XY_MAG(tex_filter(cso->mag_img_filter)) |
A3XX_TEX_SAMP_0_XY_MIN(tex_filter(cso->min_img_filter)) |
A3XX_TEX_SAMP_0_WRAP_S(tex_clamp(cso->wrap_s)) |
@@ -97,6 +98,28 @@ fd3_sampler_state_create(struct pipe_context *pctx,
return so;
}
static enum a3xx_tex_type
tex_type(unsigned target)
{
switch (target) {
default:
assert(0);
case PIPE_BUFFER:
case PIPE_TEXTURE_1D:
case PIPE_TEXTURE_1D_ARRAY:
return A3XX_TEX_1D;
case PIPE_TEXTURE_RECT:
case PIPE_TEXTURE_2D:
case PIPE_TEXTURE_2D_ARRAY:
return A3XX_TEX_2D;
case PIPE_TEXTURE_3D:
return A3XX_TEX_3D;
case PIPE_TEXTURE_CUBE:
case PIPE_TEXTURE_CUBE_ARRAY:
return A3XX_TEX_CUBE;
}
}
static struct pipe_sampler_view *
fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
const struct pipe_sampler_view *cso)
@@ -116,7 +139,7 @@ fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
so->tex_resource = rsc;
so->texconst0 =
0x40000000 | /* ??? */
A3XX_TEX_CONST_0_TYPE(tex_type(prsc->target)) |
A3XX_TEX_CONST_0_FMT(fd3_pipe2tex(cso->format)) |
fd3_tex_swiz(cso->format, cso->swizzle_r, cso->swizzle_g,
cso->swizzle_b, cso->swizzle_a);

View File

@@ -306,10 +306,11 @@ fd3_pipe2swap(enum pipe_format format)
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_B8G8R8X8_UNORM:
return WXYZ;
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return WZYX;
default:
return WZYX;
}

View File

@@ -166,8 +166,7 @@ struct ir3_instruction {
};
};
/* this is just large to cope w/ the large test *.asm: */
#define MAX_INSTRS 10240
#define MAX_INSTRS 1024
struct ir3_shader {
unsigned instrs_count;

View File

@@ -8,10 +8,12 @@ http://0x04.net/cgit/index.cgi/rules-ng-ng
git clone git://0x04.net/rules-ng-ng
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/a3xx.xml ( 42578 bytes, from 2013-06-02 13:10:46)
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 327 bytes, from 2013-07-05 19:21:12)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_common.xml ( 3094 bytes, from 2013-05-05 18:29:22)
- /home/robclark/src/freedreno/envytools/rnndb/a2xx/a2xx.xml ( 30005 bytes, from 2013-07-19 21:30:48)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_common.xml ( 8983 bytes, from 2013-07-24 01:38:36)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_pm4.xml ( 9712 bytes, from 2013-05-26 15:22:37)
- /home/robclark/src/freedreno/envytools/rnndb/a3xx/a3xx.xml ( 51415 bytes, from 2013-08-03 14:26:05)
Copyright (C) 2013 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -113,5 +115,318 @@ enum adreno_rb_depth_format {
DEPTHX_24_8 = 1,
};
enum adreno_mmu_clnt_beh {
BEH_NEVR = 0,
BEH_TRAN_RNG = 1,
BEH_TRAN_FLT = 2,
};
#define REG_AXXX_MH_MMU_CONFIG 0x00000040
#define AXXX_MH_MMU_CONFIG_MMU_ENABLE 0x00000001
#define AXXX_MH_MMU_CONFIG_SPLIT_MODE_ENABLE 0x00000002
#define AXXX_MH_MMU_CONFIG_RB_W_CLNT_BEHAVIOR__MASK 0x00000030
#define AXXX_MH_MMU_CONFIG_RB_W_CLNT_BEHAVIOR__SHIFT 4
static inline uint32_t AXXX_MH_MMU_CONFIG_RB_W_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_RB_W_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_RB_W_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_CP_W_CLNT_BEHAVIOR__MASK 0x000000c0
#define AXXX_MH_MMU_CONFIG_CP_W_CLNT_BEHAVIOR__SHIFT 6
static inline uint32_t AXXX_MH_MMU_CONFIG_CP_W_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_CP_W_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_CP_W_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_CP_R0_CLNT_BEHAVIOR__MASK 0x00000300
#define AXXX_MH_MMU_CONFIG_CP_R0_CLNT_BEHAVIOR__SHIFT 8
static inline uint32_t AXXX_MH_MMU_CONFIG_CP_R0_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_CP_R0_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_CP_R0_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_CP_R1_CLNT_BEHAVIOR__MASK 0x00000c00
#define AXXX_MH_MMU_CONFIG_CP_R1_CLNT_BEHAVIOR__SHIFT 10
static inline uint32_t AXXX_MH_MMU_CONFIG_CP_R1_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_CP_R1_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_CP_R1_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_CP_R2_CLNT_BEHAVIOR__MASK 0x00003000
#define AXXX_MH_MMU_CONFIG_CP_R2_CLNT_BEHAVIOR__SHIFT 12
static inline uint32_t AXXX_MH_MMU_CONFIG_CP_R2_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_CP_R2_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_CP_R2_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_CP_R3_CLNT_BEHAVIOR__MASK 0x0000c000
#define AXXX_MH_MMU_CONFIG_CP_R3_CLNT_BEHAVIOR__SHIFT 14
static inline uint32_t AXXX_MH_MMU_CONFIG_CP_R3_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_CP_R3_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_CP_R3_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_CP_R4_CLNT_BEHAVIOR__MASK 0x00030000
#define AXXX_MH_MMU_CONFIG_CP_R4_CLNT_BEHAVIOR__SHIFT 16
static inline uint32_t AXXX_MH_MMU_CONFIG_CP_R4_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_CP_R4_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_CP_R4_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_VGT_R0_CLNT_BEHAVIOR__MASK 0x000c0000
#define AXXX_MH_MMU_CONFIG_VGT_R0_CLNT_BEHAVIOR__SHIFT 18
static inline uint32_t AXXX_MH_MMU_CONFIG_VGT_R0_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_VGT_R0_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_VGT_R0_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_VGT_R1_CLNT_BEHAVIOR__MASK 0x00300000
#define AXXX_MH_MMU_CONFIG_VGT_R1_CLNT_BEHAVIOR__SHIFT 20
static inline uint32_t AXXX_MH_MMU_CONFIG_VGT_R1_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_VGT_R1_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_VGT_R1_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_TC_R_CLNT_BEHAVIOR__MASK 0x00c00000
#define AXXX_MH_MMU_CONFIG_TC_R_CLNT_BEHAVIOR__SHIFT 22
static inline uint32_t AXXX_MH_MMU_CONFIG_TC_R_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_TC_R_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_TC_R_CLNT_BEHAVIOR__MASK;
}
#define AXXX_MH_MMU_CONFIG_PA_W_CLNT_BEHAVIOR__MASK 0x03000000
#define AXXX_MH_MMU_CONFIG_PA_W_CLNT_BEHAVIOR__SHIFT 24
static inline uint32_t AXXX_MH_MMU_CONFIG_PA_W_CLNT_BEHAVIOR(enum adreno_mmu_clnt_beh val)
{
return ((val) << AXXX_MH_MMU_CONFIG_PA_W_CLNT_BEHAVIOR__SHIFT) & AXXX_MH_MMU_CONFIG_PA_W_CLNT_BEHAVIOR__MASK;
}
#define REG_AXXX_MH_MMU_VA_RANGE 0x00000041
#define REG_AXXX_MH_MMU_PT_BASE 0x00000042
#define REG_AXXX_MH_MMU_PAGE_FAULT 0x00000043
#define REG_AXXX_MH_MMU_TRAN_ERROR 0x00000044
#define REG_AXXX_MH_MMU_INVALIDATE 0x00000045
#define REG_AXXX_MH_MMU_MPU_BASE 0x00000046
#define REG_AXXX_MH_MMU_MPU_END 0x00000047
#define REG_AXXX_CP_RB_BASE 0x000001c0
#define REG_AXXX_CP_RB_CNTL 0x000001c1
#define AXXX_CP_RB_CNTL_BUFSZ__MASK 0x0000003f
#define AXXX_CP_RB_CNTL_BUFSZ__SHIFT 0
static inline uint32_t AXXX_CP_RB_CNTL_BUFSZ(uint32_t val)
{
return ((val) << AXXX_CP_RB_CNTL_BUFSZ__SHIFT) & AXXX_CP_RB_CNTL_BUFSZ__MASK;
}
#define AXXX_CP_RB_CNTL_BLKSZ__MASK 0x00003f00
#define AXXX_CP_RB_CNTL_BLKSZ__SHIFT 8
static inline uint32_t AXXX_CP_RB_CNTL_BLKSZ(uint32_t val)
{
return ((val) << AXXX_CP_RB_CNTL_BLKSZ__SHIFT) & AXXX_CP_RB_CNTL_BLKSZ__MASK;
}
#define AXXX_CP_RB_CNTL_BUF_SWAP__MASK 0x00030000
#define AXXX_CP_RB_CNTL_BUF_SWAP__SHIFT 16
static inline uint32_t AXXX_CP_RB_CNTL_BUF_SWAP(uint32_t val)
{
return ((val) << AXXX_CP_RB_CNTL_BUF_SWAP__SHIFT) & AXXX_CP_RB_CNTL_BUF_SWAP__MASK;
}
#define AXXX_CP_RB_CNTL_POLL_EN 0x00100000
#define AXXX_CP_RB_CNTL_NO_UPDATE 0x08000000
#define AXXX_CP_RB_CNTL_RPTR_WR_EN 0x80000000
#define REG_AXXX_CP_RB_RPTR_ADDR 0x000001c3
#define AXXX_CP_RB_RPTR_ADDR_SWAP__MASK 0x00000003
#define AXXX_CP_RB_RPTR_ADDR_SWAP__SHIFT 0
static inline uint32_t AXXX_CP_RB_RPTR_ADDR_SWAP(uint32_t val)
{
return ((val) << AXXX_CP_RB_RPTR_ADDR_SWAP__SHIFT) & AXXX_CP_RB_RPTR_ADDR_SWAP__MASK;
}
#define AXXX_CP_RB_RPTR_ADDR_ADDR__MASK 0xfffffffc
#define AXXX_CP_RB_RPTR_ADDR_ADDR__SHIFT 2
static inline uint32_t AXXX_CP_RB_RPTR_ADDR_ADDR(uint32_t val)
{
return ((val >> 2) << AXXX_CP_RB_RPTR_ADDR_ADDR__SHIFT) & AXXX_CP_RB_RPTR_ADDR_ADDR__MASK;
}
#define REG_AXXX_CP_RB_RPTR 0x000001c4
#define REG_AXXX_CP_RB_WPTR 0x000001c5
#define REG_AXXX_CP_RB_WPTR_DELAY 0x000001c6
#define REG_AXXX_CP_RB_RPTR_WR 0x000001c7
#define REG_AXXX_CP_RB_WPTR_BASE 0x000001c8
#define REG_AXXX_CP_QUEUE_THRESHOLDS 0x000001d5
#define AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB1_START__MASK 0x0000000f
#define AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB1_START__SHIFT 0
static inline uint32_t AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB1_START(uint32_t val)
{
return ((val) << AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB1_START__SHIFT) & AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB1_START__MASK;
}
#define AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB2_START__MASK 0x00000f00
#define AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB2_START__SHIFT 8
static inline uint32_t AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB2_START(uint32_t val)
{
return ((val) << AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB2_START__SHIFT) & AXXX_CP_QUEUE_THRESHOLDS_CSQ_IB2_START__MASK;
}
#define AXXX_CP_QUEUE_THRESHOLDS_CSQ_ST_START__MASK 0x000f0000
#define AXXX_CP_QUEUE_THRESHOLDS_CSQ_ST_START__SHIFT 16
static inline uint32_t AXXX_CP_QUEUE_THRESHOLDS_CSQ_ST_START(uint32_t val)
{
return ((val) << AXXX_CP_QUEUE_THRESHOLDS_CSQ_ST_START__SHIFT) & AXXX_CP_QUEUE_THRESHOLDS_CSQ_ST_START__MASK;
}
#define REG_AXXX_CP_MEQ_THRESHOLDS 0x000001d6
#define REG_AXXX_CP_CSQ_AVAIL 0x000001d7
#define AXXX_CP_CSQ_AVAIL_RING__MASK 0x0000007f
#define AXXX_CP_CSQ_AVAIL_RING__SHIFT 0
static inline uint32_t AXXX_CP_CSQ_AVAIL_RING(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_AVAIL_RING__SHIFT) & AXXX_CP_CSQ_AVAIL_RING__MASK;
}
#define AXXX_CP_CSQ_AVAIL_IB1__MASK 0x00007f00
#define AXXX_CP_CSQ_AVAIL_IB1__SHIFT 8
static inline uint32_t AXXX_CP_CSQ_AVAIL_IB1(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_AVAIL_IB1__SHIFT) & AXXX_CP_CSQ_AVAIL_IB1__MASK;
}
#define AXXX_CP_CSQ_AVAIL_IB2__MASK 0x007f0000
#define AXXX_CP_CSQ_AVAIL_IB2__SHIFT 16
static inline uint32_t AXXX_CP_CSQ_AVAIL_IB2(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_AVAIL_IB2__SHIFT) & AXXX_CP_CSQ_AVAIL_IB2__MASK;
}
#define REG_AXXX_CP_STQ_AVAIL 0x000001d8
#define AXXX_CP_STQ_AVAIL_ST__MASK 0x0000007f
#define AXXX_CP_STQ_AVAIL_ST__SHIFT 0
static inline uint32_t AXXX_CP_STQ_AVAIL_ST(uint32_t val)
{
return ((val) << AXXX_CP_STQ_AVAIL_ST__SHIFT) & AXXX_CP_STQ_AVAIL_ST__MASK;
}
#define REG_AXXX_CP_MEQ_AVAIL 0x000001d9
#define AXXX_CP_MEQ_AVAIL_MEQ__MASK 0x0000001f
#define AXXX_CP_MEQ_AVAIL_MEQ__SHIFT 0
static inline uint32_t AXXX_CP_MEQ_AVAIL_MEQ(uint32_t val)
{
return ((val) << AXXX_CP_MEQ_AVAIL_MEQ__SHIFT) & AXXX_CP_MEQ_AVAIL_MEQ__MASK;
}
#define REG_AXXX_SCRATCH_UMSK 0x000001dc
#define AXXX_SCRATCH_UMSK_UMSK__MASK 0x000000ff
#define AXXX_SCRATCH_UMSK_UMSK__SHIFT 0
static inline uint32_t AXXX_SCRATCH_UMSK_UMSK(uint32_t val)
{
return ((val) << AXXX_SCRATCH_UMSK_UMSK__SHIFT) & AXXX_SCRATCH_UMSK_UMSK__MASK;
}
#define AXXX_SCRATCH_UMSK_SWAP__MASK 0x00030000
#define AXXX_SCRATCH_UMSK_SWAP__SHIFT 16
static inline uint32_t AXXX_SCRATCH_UMSK_SWAP(uint32_t val)
{
return ((val) << AXXX_SCRATCH_UMSK_SWAP__SHIFT) & AXXX_SCRATCH_UMSK_SWAP__MASK;
}
#define REG_AXXX_SCRATCH_ADDR 0x000001dd
#define REG_AXXX_CP_ME_RDADDR 0x000001ea
#define REG_AXXX_CP_STATE_DEBUG_INDEX 0x000001ec
#define REG_AXXX_CP_STATE_DEBUG_DATA 0x000001ed
#define REG_AXXX_CP_INT_CNTL 0x000001f2
#define REG_AXXX_CP_INT_STATUS 0x000001f3
#define REG_AXXX_CP_INT_ACK 0x000001f4
#define REG_AXXX_CP_ME_CNTL 0x000001f6
#define REG_AXXX_CP_ME_STATUS 0x000001f7
#define REG_AXXX_CP_ME_RAM_WADDR 0x000001f8
#define REG_AXXX_CP_ME_RAM_RADDR 0x000001f9
#define REG_AXXX_CP_ME_RAM_DATA 0x000001fa
#define REG_AXXX_CP_DEBUG 0x000001fc
#define AXXX_CP_DEBUG_PREDICATE_DISABLE 0x00800000
#define AXXX_CP_DEBUG_PROG_END_PTR_ENABLE 0x01000000
#define AXXX_CP_DEBUG_MIU_128BIT_WRITE_ENABLE 0x02000000
#define AXXX_CP_DEBUG_PREFETCH_PASS_NOPS 0x04000000
#define AXXX_CP_DEBUG_DYNAMIC_CLK_DISABLE 0x08000000
#define AXXX_CP_DEBUG_PREFETCH_MATCH_DISABLE 0x10000000
#define AXXX_CP_DEBUG_SIMPLE_ME_FLOW_CONTROL 0x40000000
#define AXXX_CP_DEBUG_MIU_WRITE_PACK_DISABLE 0x80000000
#define REG_AXXX_CP_CSQ_RB_STAT 0x000001fd
#define AXXX_CP_CSQ_RB_STAT_RPTR__MASK 0x0000007f
#define AXXX_CP_CSQ_RB_STAT_RPTR__SHIFT 0
static inline uint32_t AXXX_CP_CSQ_RB_STAT_RPTR(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_RB_STAT_RPTR__SHIFT) & AXXX_CP_CSQ_RB_STAT_RPTR__MASK;
}
#define AXXX_CP_CSQ_RB_STAT_WPTR__MASK 0x007f0000
#define AXXX_CP_CSQ_RB_STAT_WPTR__SHIFT 16
static inline uint32_t AXXX_CP_CSQ_RB_STAT_WPTR(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_RB_STAT_WPTR__SHIFT) & AXXX_CP_CSQ_RB_STAT_WPTR__MASK;
}
#define REG_AXXX_CP_CSQ_IB1_STAT 0x000001fe
#define AXXX_CP_CSQ_IB1_STAT_RPTR__MASK 0x0000007f
#define AXXX_CP_CSQ_IB1_STAT_RPTR__SHIFT 0
static inline uint32_t AXXX_CP_CSQ_IB1_STAT_RPTR(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_IB1_STAT_RPTR__SHIFT) & AXXX_CP_CSQ_IB1_STAT_RPTR__MASK;
}
#define AXXX_CP_CSQ_IB1_STAT_WPTR__MASK 0x007f0000
#define AXXX_CP_CSQ_IB1_STAT_WPTR__SHIFT 16
static inline uint32_t AXXX_CP_CSQ_IB1_STAT_WPTR(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_IB1_STAT_WPTR__SHIFT) & AXXX_CP_CSQ_IB1_STAT_WPTR__MASK;
}
#define REG_AXXX_CP_CSQ_IB2_STAT 0x000001ff
#define AXXX_CP_CSQ_IB2_STAT_RPTR__MASK 0x0000007f
#define AXXX_CP_CSQ_IB2_STAT_RPTR__SHIFT 0
static inline uint32_t AXXX_CP_CSQ_IB2_STAT_RPTR(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_IB2_STAT_RPTR__SHIFT) & AXXX_CP_CSQ_IB2_STAT_RPTR__MASK;
}
#define AXXX_CP_CSQ_IB2_STAT_WPTR__MASK 0x007f0000
#define AXXX_CP_CSQ_IB2_STAT_WPTR__SHIFT 16
static inline uint32_t AXXX_CP_CSQ_IB2_STAT_WPTR(uint32_t val)
{
return ((val) << AXXX_CP_CSQ_IB2_STAT_WPTR__SHIFT) & AXXX_CP_CSQ_IB2_STAT_WPTR__MASK;
}
#define REG_AXXX_CP_SCRATCH_REG0 0x00000578
#define REG_AXXX_CP_SCRATCH_REG1 0x00000579
#define REG_AXXX_CP_SCRATCH_REG2 0x0000057a
#define REG_AXXX_CP_SCRATCH_REG3 0x0000057b
#define REG_AXXX_CP_SCRATCH_REG4 0x0000057c
#define REG_AXXX_CP_SCRATCH_REG5 0x0000057d
#define REG_AXXX_CP_SCRATCH_REG6 0x0000057e
#define REG_AXXX_CP_SCRATCH_REG7 0x0000057f
#define REG_AXXX_CP_ME_CF_EVENT_SRC 0x0000060a
#define REG_AXXX_CP_ME_CF_EVENT_ADDR 0x0000060b
#define REG_AXXX_CP_ME_CF_EVENT_DATA 0x0000060c
#define REG_AXXX_CP_ME_NRT_ADDR 0x0000060d
#define REG_AXXX_CP_ME_NRT_DATA 0x0000060e
#endif /* ADRENO_COMMON_XML */

View File

@@ -8,10 +8,12 @@ http://0x04.net/cgit/index.cgi/rules-ng-ng
git clone git://0x04.net/rules-ng-ng
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/a3xx.xml ( 42578 bytes, from 2013-06-02 13:10:46)
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 327 bytes, from 2013-07-05 19:21:12)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_common.xml ( 3094 bytes, from 2013-05-05 18:29:22)
- /home/robclark/src/freedreno/envytools/rnndb/a2xx/a2xx.xml ( 30005 bytes, from 2013-07-19 21:30:48)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_common.xml ( 8983 bytes, from 2013-07-24 01:38:36)
- /home/robclark/src/freedreno/envytools/rnndb/adreno_pm4.xml ( 9712 bytes, from 2013-05-26 15:22:37)
- /home/robclark/src/freedreno/envytools/rnndb/a3xx/a3xx.xml ( 51415 bytes, from 2013-08-03 14:26:05)
Copyright (C) 2013 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)

View File

@@ -86,7 +86,8 @@ fd_context_render(struct pipe_context *pctx)
ctx->gmem_reason = 0;
ctx->num_draws = 0;
fd_resource(pfb->cbufs[0]->texture)->dirty = false;
if (pfb->cbufs[0])
fd_resource(pfb->cbufs[0]->texture)->dirty = false;
if (pfb->zsbuf)
fd_resource(pfb->zsbuf->texture)->dirty = false;
}

View File

@@ -104,7 +104,7 @@ fd_draw_emit(struct fd_context *ctx, const struct pipe_draw_info *info)
src_sel, idx_type, IGNORE_VISIBILITY));
OUT_RING(ring, info->count); /* NumIndices */
if (info->indexed) {
OUT_RELOC(ring, idx_bo, idx_offset, 0);
OUT_RELOC(ring, idx_bo, idx_offset, 0, 0);
OUT_RING (ring, idx_size);
}
}
@@ -193,8 +193,8 @@ fd_clear(struct pipe_context *pctx, unsigned buffers,
}
DBG("%x depth=%f, stencil=%u (%s/%s)", buffers, depth, stencil,
util_format_name(pfb->cbufs[0]->format),
pfb->zsbuf ? util_format_name(pfb->zsbuf->format) : "none");
util_format_short_name(pipe_surface_format(pfb->cbufs[0])),
util_format_short_name(pipe_surface_format(pfb->zsbuf)));
ctx->clear(ctx, buffers, color, depth, stencil);

View File

@@ -71,12 +71,16 @@ calculate_tiles(struct fd_context *ctx)
{
struct fd_gmem_stateobj *gmem = &ctx->gmem;
struct pipe_scissor_state *scissor = &ctx->max_scissor;
uint32_t cpp = util_format_get_blocksize(ctx->framebuffer.cbufs[0]->format);
struct pipe_framebuffer_state *pfb = &ctx->framebuffer;
uint32_t gmem_size = ctx->screen->gmemsize_bytes;
uint32_t minx, miny, width, height;
uint32_t nbins_x = 1, nbins_y = 1;
uint32_t bin_w, bin_h;
uint32_t max_width = 992;
uint32_t cpp = 4;
if (pfb->cbufs[0])
cpp = util_format_get_blocksize(pfb->cbufs[0]->format);
if ((gmem->cpp == cpp) &&
!memcmp(&gmem->scissor, scissor, sizeof(gmem->scissor))) {
@@ -84,10 +88,17 @@ calculate_tiles(struct fd_context *ctx)
return;
}
minx = scissor->minx & ~31; /* round down to multiple of 32 */
miny = scissor->miny & ~31;
width = scissor->maxx - minx;
height = scissor->maxy - miny;
if (fd_mesa_debug & FD_DBG_DSCIS) {
minx = 0;
miny = 0;
width = pfb->width;
height = pfb->height;
} else {
minx = scissor->minx & ~31; /* round down to multiple of 32 */
miny = scissor->miny & ~31;
width = scissor->maxx - minx;
height = scissor->maxy - miny;
}
// TODO we probably could optimize this a bit if we know that
// Z or stencil is not enabled for any of the draw calls..
@@ -132,9 +143,7 @@ static void
render_tiles(struct fd_context *ctx)
{
struct fd_gmem_stateobj *gmem = &ctx->gmem;
uint32_t i, yoff = 0;
yoff= gmem->miny;
uint32_t i, yoff = gmem->miny;
ctx->emit_tile_init(ctx);
@@ -143,13 +152,13 @@ render_tiles(struct fd_context *ctx)
uint32_t bh = gmem->bin_h;
/* clip bin height: */
bh = MIN2(bh, gmem->height - yoff);
bh = MIN2(bh, gmem->miny + gmem->height - yoff);
for (j = 0; j < gmem->nbins_x; j++) {
uint32_t bw = gmem->bin_w;
/* clip bin width: */
bw = MIN2(bw, gmem->width - xoff);
bw = MIN2(bw, gmem->minx + gmem->width - xoff);
DBG("bin_h=%d, yoff=%d, bin_w=%d, xoff=%d",
bh, yoff, bw, xoff);
@@ -205,15 +214,15 @@ fd_gmem_render_tiles(struct pipe_context *pctx)
if (sysmem) {
DBG("rendering sysmem (%s/%s)",
util_format_name(pfb->cbufs[0]->format),
pfb->zsbuf ? util_format_name(pfb->zsbuf->format) : "none");
util_format_short_name(pipe_surface_format(pfb->cbufs[0])),
util_format_short_name(pipe_surface_format(pfb->zsbuf)));
render_sysmem(ctx);
} else {
struct fd_gmem_stateobj *gmem = &ctx->gmem;
DBG("rendering %dx%d tiles (%s/%s)", gmem->nbins_x, gmem->nbins_y,
util_format_name(pfb->cbufs[0]->format),
pfb->zsbuf ? util_format_name(pfb->zsbuf->format) : "none");
calculate_tiles(ctx);
DBG("rendering %dx%d tiles (%s/%s)", gmem->nbins_x, gmem->nbins_y,
util_format_short_name(pipe_surface_format(pfb->cbufs[0])),
util_format_short_name(pipe_surface_format(pfb->zsbuf)));
render_tiles(ctx);
}
@@ -225,7 +234,8 @@ fd_gmem_render_tiles(struct pipe_context *pctx)
/* update timestamps on render targets: */
timestamp = fd_ringbuffer_timestamp(ctx->ring);
fd_resource(pfb->cbufs[0]->texture)->timestamp = timestamp;
if (pfb->cbufs[0])
fd_resource(pfb->cbufs[0]->texture)->timestamp = timestamp;
if (pfb->zsbuf)
fd_resource(pfb->zsbuf->texture)->timestamp = timestamp;

View File

@@ -59,6 +59,9 @@ fd_resource_transfer_unmap(struct pipe_context *pctx,
struct pipe_transfer *ptrans)
{
struct fd_context *ctx = fd_context(pctx);
struct fd_resource *rsc = fd_resource(ptrans->resource);
if (!(ptrans->usage & PIPE_TRANSFER_UNSYNCHRONIZED))
fd_bo_cpu_fini(rsc->bo);
pipe_resource_reference(&ptrans->resource, NULL);
util_slab_free(&ctx->transfer_pool, ptrans);
}
@@ -74,12 +77,13 @@ fd_resource_transfer_map(struct pipe_context *pctx,
struct fd_resource *rsc = fd_resource(prsc);
struct pipe_transfer *ptrans = util_slab_alloc(&ctx->transfer_pool);
enum pipe_format format = prsc->format;
uint32_t op = 0;
char *buf;
if (!ptrans)
return NULL;
/* util_slap_alloc() doesn't zero: */
/* util_slab_alloc() doesn't zero: */
memset(ptrans, 0, sizeof(*ptrans));
pipe_resource_reference(&ptrans->resource, prsc);
@@ -90,7 +94,8 @@ fd_resource_transfer_map(struct pipe_context *pctx,
ptrans->layer_stride = ptrans->stride;
/* some state trackers (at least XA) don't do this.. */
fd_resource_transfer_flush_region(pctx, ptrans, box);
if (!(usage & PIPE_TRANSFER_FLUSH_EXPLICIT))
fd_resource_transfer_flush_region(pctx, ptrans, box);
buf = fd_bo_map(rsc->bo);
if (!buf) {
@@ -98,6 +103,15 @@ fd_resource_transfer_map(struct pipe_context *pctx,
return NULL;
}
if (usage & PIPE_TRANSFER_READ)
op |= DRM_FREEDRENO_PREP_READ;
if (usage & PIPE_TRANSFER_WRITE)
op |= DRM_FREEDRENO_PREP_WRITE;
if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED))
fd_bo_cpu_prep(rsc->bo, ctx->screen->pipe, op);
*pptrans = ptrans;
return buf +

View File

@@ -60,6 +60,7 @@ static const struct debug_named_value debug_options[] = {
{"disasm", FD_DBG_DISASM, "Dump TGSI and adreno shader disassembly"},
{"dclear", FD_DBG_DCLEAR, "Mark all state dirty after clear"},
{"dgmem", FD_DBG_DGMEM, "Mark all state dirty after GMEM tile pass"},
{"dscis", FD_DBG_DSCIS, "Disable scissor optimization"},
DEBUG_NAMED_VALUE_END
};

View File

@@ -120,7 +120,7 @@ fd_set_framebuffer_state(struct pipe_context *pctx,
unsigned i;
DBG("%d: cbufs[0]=%p, zsbuf=%p", ctx->needs_flush,
cso->cbufs[0], cso->zsbuf);
framebuffer->cbufs[0], framebuffer->zsbuf);
fd_context_render(pctx);

View File

@@ -33,8 +33,10 @@
#include <freedreno_ringbuffer.h>
#include "pipe/p_format.h"
#include "pipe/p_state.h"
#include "util/u_debug.h"
#include "util/u_math.h"
#include "util/u_half.h"
#include "adreno_common.xml.h"
#include "adreno_pm4.xml.h"
@@ -47,10 +49,11 @@ enum adreno_pa_su_sc_draw fd_polygon_mode(unsigned mode);
enum adreno_stencil_op fd_stencil_op(unsigned op);
#define FD_DBG_MSGS 0x1
#define FD_DBG_DISASM 0x2
#define FD_DBG_DCLEAR 0x4
#define FD_DBG_DGMEM 0x8
#define FD_DBG_MSGS 0x01
#define FD_DBG_DISASM 0x02
#define FD_DBG_DCLEAR 0x04
#define FD_DBG_DGMEM 0x08
#define FD_DBG_DSCIS 0x10
extern int fd_mesa_debug;
#define DBG(fmt, ...) \
@@ -77,6 +80,15 @@ static inline uint32_t DRAW(enum pc_di_primtype prim_type,
(1 << 14);
}
static inline enum pipe_format
pipe_surface_format(struct pipe_surface *psurf)
{
if (!psurf)
return PIPE_FORMAT_NONE;
return psurf->format;
}
#define LOG_DWORDS 0
@@ -92,25 +104,36 @@ OUT_RING(struct fd_ringbuffer *ring, uint32_t data)
static inline void
OUT_RELOC(struct fd_ringbuffer *ring, struct fd_bo *bo,
uint32_t offset, uint32_t or)
{
if (LOG_DWORDS) {
DBG("ring[%p]: OUT_RELOC %04x: %p+%u", ring,
(uint32_t)(ring->cur - ring->last_start), bo, offset);
}
fd_ringbuffer_emit_reloc(ring, bo, offset, or);
}
/* shifted reloc: */
static inline void
OUT_RELOCS(struct fd_ringbuffer *ring, struct fd_bo *bo,
uint32_t offset, uint32_t or, int32_t shift)
{
if (LOG_DWORDS) {
DBG("ring[%p]: OUT_RELOCS %04x: %p+%u << %d", ring,
DBG("ring[%p]: OUT_RELOC %04x: %p+%u << %d", ring,
(uint32_t)(ring->cur - ring->last_start), bo, offset, shift);
}
fd_ringbuffer_emit_reloc_shift(ring, bo, offset, or, shift);
fd_ringbuffer_reloc(ring, &(struct fd_reloc){
.bo = bo,
.flags = FD_RELOC_READ,
.offset = offset,
.or = or,
.shift = shift,
});
}
static inline void
OUT_RELOCW(struct fd_ringbuffer *ring, struct fd_bo *bo,
uint32_t offset, uint32_t or, int32_t shift)
{
if (LOG_DWORDS) {
DBG("ring[%p]: OUT_RELOC %04x: %p+%u << %d", ring,
(uint32_t)(ring->cur - ring->last_start), bo, offset, shift);
}
fd_ringbuffer_reloc(ring, &(struct fd_reloc){
.bo = bo,
.flags = FD_RELOC_READ | FD_RELOC_WRITE,
.offset = offset,
.or = or,
.shift = shift,
});
}
static inline void BEGIN_RING(struct fd_ringbuffer *ring, uint32_t ndwords)
@@ -143,7 +166,7 @@ OUT_IB(struct fd_ringbuffer *ring, struct fd_ringmarker *start,
struct fd_ringmarker *end)
{
OUT_PKT3(ring, CP_INDIRECT_BUFFER_PFD, 2);
fd_ringbuffer_emit_reloc_ring(ring, start);
fd_ringbuffer_emit_reloc_ring(ring, start, end);
OUT_RING(ring, fd_ringmarker_dwords(start, end));
}

View File

@@ -205,6 +205,9 @@ nouveau_transfer_write(struct nouveau_context *nv, struct nouveau_transfer *tx,
base, size / 4, (const uint32_t *)data);
else
nv->push_data(nv, buf->bo, buf->offset + base, buf->domain, size, data);
nouveau_fence_ref(nv->screen->fence.current, &buf->fence);
nouveau_fence_ref(nv->screen->fence.current, &buf->fence_wr);
}

View File

@@ -189,16 +189,15 @@ nouveau_fence_wait(struct nouveau_fence *fence)
/* wtf, someone is waiting on a fence in flush_notify handler? */
assert(fence->state != NOUVEAU_FENCE_STATE_EMITTING);
if (fence->state < NOUVEAU_FENCE_STATE_EMITTED) {
if (fence->state < NOUVEAU_FENCE_STATE_EMITTED)
nouveau_fence_emit(fence);
if (fence == screen->fence.current)
nouveau_fence_new(screen, &screen->fence.current, FALSE);
}
if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED)
if (nouveau_pushbuf_kick(screen->pushbuf, screen->pushbuf->channel))
return FALSE;
}
if (fence == screen->fence.current)
nouveau_fence_next(screen);
do {
nouveau_fence_update(screen, FALSE);

View File

@@ -221,7 +221,7 @@ nv50_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_MAX_INPUTS:
if (shader == PIPE_SHADER_VERTEX)
return 32;
return 0x300 / 16;
return 15;
case PIPE_SHADER_CAP_MAX_CONSTS:
return 65536 / 16;
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:

View File

@@ -61,7 +61,7 @@ nv50_validate_fb(struct nv50_context *nv50)
if (mt->base.status & NOUVEAU_BUFFER_STATUS_GPU_READING)
nv50->state.rt_serialize = TRUE;
mt->base.status |= NOUVEAU_BUFFER_STATUS_GPU_WRITING;
mt->base.status &= NOUVEAU_BUFFER_STATUS_GPU_READING;
mt->base.status &= ~NOUVEAU_BUFFER_STATUS_GPU_READING;
/* only register for writing, otherwise we'd always serialize here */
BCTX_REFN(nv50->bufctx_3d, FB, &mt->base, WR);
@@ -91,7 +91,7 @@ nv50_validate_fb(struct nv50_context *nv50)
if (mt->base.status & NOUVEAU_BUFFER_STATUS_GPU_READING)
nv50->state.rt_serialize = TRUE;
mt->base.status |= NOUVEAU_BUFFER_STATUS_GPU_WRITING;
mt->base.status &= NOUVEAU_BUFFER_STATUS_GPU_READING;
mt->base.status &= ~NOUVEAU_BUFFER_STATUS_GPU_READING;
BCTX_REFN(nv50->bufctx_3d, FB, &mt->base, WR);
} else {

View File

@@ -271,7 +271,7 @@ nv50_validate_tic(struct nv50_context *nv50, int s)
nv50->screen->tic.lock[tic->id / 32] |= 1 << (tic->id % 32);
res->status &= NOUVEAU_BUFFER_STATUS_GPU_WRITING;
res->status &= ~NOUVEAU_BUFFER_STATUS_GPU_WRITING;
res->status |= NOUVEAU_BUFFER_STATUS_GPU_READING;
BCTX_REFN(nv50->bufctx_3d, TEXTURES, res, RD);

View File

@@ -597,6 +597,15 @@ nv50_draw_elements(struct nv50_context *nv50, boolean shorten,
assert(nouveau_resource_mapped_by_gpu(nv50->idxbuf.buffer));
/* This shouldn't have to be here. The going theory is that the buffer
* is being filled in by PGRAPH, and it's not done yet by the time it
* gets submitted to PFIFO, which in turn starts immediately prefetching
* the not-yet-written data. Ideally this wait would only happen on
* pushbuf submit, but it's probably not a big performance difference.
*/
if (buf->fence_wr && !nouveau_fence_signalled(buf->fence_wr))
nouveau_fence_wait(buf->fence_wr);
while (instance_count--) {
BEGIN_NV04(push, NV50_3D(VERTEX_BEGIN_GL), 1);
PUSH_DATA (push, prim);

View File

@@ -79,14 +79,13 @@ static void test_runner_rc_regalloc(
static void tex_1d_swizzle(struct test_result *result)
{
struct radeon_compiler c;
struct r300_fragment_program_compiler c;
init_compiler(&c, RC_FRAGMENT_PROGRAM, 0, 0);
struct r300_fragment_program_compiler *cc =
(struct r300_fragment_program_compiler*)&c;
cc->AllocateHwInputs = dummy_allocate_hw_inputs;
memset(&c, 0, sizeof(c));
init_compiler(&c.Base, RC_FRAGMENT_PROGRAM, 0, 0);
c.AllocateHwInputs = dummy_allocate_hw_inputs;
test_runner_rc_regalloc(result, &c, "regalloc_tex_1d_swizzle.test");
test_runner_rc_regalloc(result, &c.Base, "regalloc_tex_1d_swizzle.test");
}
unsigned radeon_compiler_regalloc_run_tests()

View File

@@ -542,6 +542,7 @@ unsigned load_program(
char **string_store;
unsigned i = 0;
memset(line, 0, sizeof(line));
snprintf(path, MAX_PATH_LENGTH, "compiler/tests/%s", filename);
file = fopen(path, "r");
if (!file) {
@@ -552,7 +553,8 @@ unsigned load_program(
count = &test->num_input_lines;
while (fgets(line, MAX_LINE_LENGTH, file)){
if (line[MAX_LINE_LENGTH - 2] == '\n') {
char last_char = line[MAX_LINE_LENGTH - 1];
if (last_char && last_char != '\n') {
fprintf(stderr, "Error line cannot be longer than 100 "
"characters:\n%s\n", line);
return 0;

View File

@@ -62,6 +62,10 @@
#include "libkms/libkms.h"
#endif
#if XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,14,99,2,0)
#define DamageUnregister(d, dd) DamageUnregister(dd)
#endif
/*
* Functions and symbols exported to Xorg via pointers.
*/

View File

@@ -2682,7 +2682,7 @@ ast_declarator_list::hir(exec_list *instructions,
precision_names[this->type->qualifier.precision],
type_name);
}
} else {
} else if (this->type->specifier->structure == NULL) {
_mesa_glsl_warning(&loc, state, "empty declaration");
}
}

View File

@@ -251,6 +251,11 @@ lower_clip_distance_visitor::fix_lhs(ir_assignment *ir)
ir_visitor_status
lower_clip_distance_visitor::visit_leave(ir_assignment *ir)
{
/* First invoke the base class visitor. This causes handle_rvalue() to be
* called on ir->rhs and ir->condition.
*/
ir_rvalue_visitor::visit_leave(ir);
ir_dereference_variable *lhs_var = ir->lhs->as_dereference_variable();
ir_dereference_variable *rhs_var = ir->rhs->as_dereference_variable();
if ((lhs_var && lhs_var->var == this->old_clip_distance_var)

View File

@@ -183,7 +183,7 @@ GetGLXPrivScreenConfig(Display * dpy, int scrn, struct glx_display ** ppriv,
/* Check to see if the GL is supported on this screen */
*ppsc = (*ppriv)->screens[scrn];
if ((*ppsc)->configs == NULL) {
if ((*ppsc)->configs == NULL && (*ppsc)->visuals == NULL) {
/* No support for GL on this screen regardless of visual */
return GLX_BAD_VISUAL;
}

View File

@@ -261,24 +261,17 @@ brw_blorp_clear_params::brw_blorp_clear_params(struct brw_context *brw,
x_align *= 16;
y_align *= 32;
if (brw->is_haswell && brw->gt == 3) {
/* From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel > Pixel
* Backend > MCS Buffer for Render Target(s) [DevIVB+]:
* [DevHSW:GT3]: Clear rectangle must be aligned to two times the
* number of pixels in the table shown below...
* x_align, y_align values computed above are the relevant entries
* in the referred table.
*/
x0 = ROUND_DOWN_TO(x0, 2 * x_align);
y0 = ROUND_DOWN_TO(y0, 2 * y_align);
x1 = ALIGN(x1, 2 * x_align);
y1 = ALIGN(y1, 2 * y_align);
} else {
x0 = ROUND_DOWN_TO(x0, x_align);
y0 = ROUND_DOWN_TO(y0, y_align);
x1 = ALIGN(x1, x_align);
y1 = ALIGN(y1, y_align);
}
/* From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel > Pixel
* Backend > MCS Buffer for Render Target(s) [DevIVB+] > Table "Color
* Clear of Non-MultiSampled Render Target Restrictions":
*
* Clear rectangle must be aligned to two times the number of pixels in
* the table shown below due to 16x16 hashing across the slice.
*/
x0 = ROUND_DOWN_TO(x0, 2 * x_align);
y0 = ROUND_DOWN_TO(y0, 2 * y_align);
x1 = ALIGN(x1, 2 * x_align);
y1 = ALIGN(y1, 2 * y_align);
/* From the Ivy Bridge PRM, Vol2 Part1 11.7 "MCS Buffer for Render
* Target(s)", beneath the "Fast Color Clear" bullet (p327):

View File

@@ -187,7 +187,8 @@ static void upload_cc_unit(struct brw_context *brw)
eqA != eqRGB);
}
if (ctx->Color.AlphaEnabled) {
/* _NEW_BUFFERS */
if (ctx->Color.AlphaEnabled && ctx->DrawBuffer->_NumColorDrawBuffers <= 1) {
cc->cc3.alpha_test = 1;
cc->cc3.alpha_test_func =
intel_translate_compare_func(ctx->Color.AlphaFunc);

View File

@@ -2920,7 +2920,7 @@ fs_visitor::run()
/* We handle discards by keeping track of the still-live pixels in f0.1.
* Initialize it with the dispatched pixels.
*/
if (fp->UsesKill) {
if (fp->UsesKill || c->key.alpha_test_func) {
fs_inst *discard_init = emit(FS_OPCODE_MOV_DISPATCH_TO_FLAGS);
discard_init->flag_subreg = 1;
}
@@ -2944,6 +2944,9 @@ fs_visitor::run()
emit(FS_OPCODE_PLACEHOLDER_HALT);
if (c->key.alpha_test_func)
emit_alpha_test();
emit_fb_writes();
split_virtual_grfs();

View File

@@ -395,6 +395,7 @@ public:
fs_reg dst, fs_reg src0, fs_reg src1, fs_reg one);
void emit_color_write(int target, int index, int first_color_mrf);
void emit_alpha_test();
void emit_fb_writes();
void emit_shader_time_begin();

View File

@@ -107,7 +107,7 @@ fs_generator::generate_fb_write(fs_inst *inst)
brw_set_mask_control(p, BRW_MASK_DISABLE);
brw_set_compression_control(p, BRW_COMPRESSION_NONE);
if (fp->UsesKill) {
if (fp->UsesKill || c->key.alpha_test_func) {
struct brw_reg pixel_mask;
if (brw->gen >= 6)

View File

@@ -2228,6 +2228,60 @@ fs_visitor::emit_color_write(int target, int index, int first_color_mrf)
}
}
static int
cond_for_alpha_func(GLenum func)
{
switch(func) {
case GL_GREATER:
return BRW_CONDITIONAL_G;
case GL_GEQUAL:
return BRW_CONDITIONAL_GE;
case GL_LESS:
return BRW_CONDITIONAL_L;
case GL_LEQUAL:
return BRW_CONDITIONAL_LE;
case GL_EQUAL:
return BRW_CONDITIONAL_EQ;
case GL_NOTEQUAL:
return BRW_CONDITIONAL_NEQ;
default:
assert(!"Not reached");
return 0;
}
}
/**
* Alpha test support for when we compile it into the shader instead
* of using the normal fixed-function alpha test.
*/
void
fs_visitor::emit_alpha_test()
{
this->current_annotation = "Alpha test";
fs_inst *cmp;
if (c->key.alpha_test_func == GL_ALWAYS)
return;
if (c->key.alpha_test_func == GL_NEVER) {
/* f0.1 = 0 */
fs_reg some_reg = fs_reg(retype(brw_vec8_grf(0, 0),
BRW_REGISTER_TYPE_UW));
cmp = emit(CMP(reg_null_f, some_reg, some_reg,
BRW_CONDITIONAL_NEQ));
} else {
/* RT0 alpha */
fs_reg color = outputs[0];
color.reg_offset += 3;
/* f0.1 &= func(color, ref) */
cmp = emit(CMP(reg_null_f, color, fs_reg(c->key.alpha_test_ref),
cond_for_alpha_func(c->key.alpha_test_func)));
}
cmp->predicate = BRW_PREDICATE_NORMAL;
cmp->flag_subreg = 1;
}
void
fs_visitor::emit_fb_writes()
{

View File

@@ -94,7 +94,7 @@ intel_horizontal_texture_alignment_unit(struct brw_context *brw,
static unsigned int
intel_vertical_texture_alignment_unit(struct brw_context *brw,
gl_format format)
gl_format format, bool multisampled)
{
/**
* From the "Alignment Unit Size" section of various specs, namely:
@@ -118,8 +118,6 @@ intel_vertical_texture_alignment_unit(struct brw_context *brw,
*
* On SNB+, non-special cases can be overridden by setting the SURFACE_STATE
* "Surface Vertical Alignment" field to VALIGN_2 or VALIGN_4.
*
* We currently don't support multisampling.
*/
if (_mesa_is_format_compressed(format))
return 4;
@@ -127,6 +125,9 @@ intel_vertical_texture_alignment_unit(struct brw_context *brw,
if (format == MESA_FORMAT_S8)
return brw->gen >= 7 ? 8 : 4;
if (multisampled)
return 4;
GLenum base_format = _mesa_get_format_base_format(format);
if (brw->gen >= 6 &&
@@ -284,8 +285,10 @@ brw_miptree_layout_texture_3d(struct brw_context *brw,
void
brw_miptree_layout(struct brw_context *brw, struct intel_mipmap_tree *mt)
{
bool multisampled = mt->num_samples > 1;
mt->align_w = intel_horizontal_texture_alignment_unit(brw, mt->format);
mt->align_h = intel_vertical_texture_alignment_unit(brw, mt->format);
mt->align_h =
intel_vertical_texture_alignment_unit(brw, mt->format, multisampled);
switch (mt->target) {
case GL_TEXTURE_CUBE_MAP:

View File

@@ -287,6 +287,10 @@ brw_wm_debug_recompile(struct brw_context *brw,
old_key->drawable_height, key->drawable_height);
found |= key_debug(brw, "input slots valid",
old_key->input_slots_valid, key->input_slots_valid);
found |= key_debug(brw, "mrt alpha test function",
old_key->alpha_test_func, key->alpha_test_func);
found |= key_debug(brw, "mrt alpha test reference value",
old_key->alpha_test_ref, key->alpha_test_ref);
found |= brw_debug_recompile_sampler_key(brw, &old_key->tex, &key->tex);
@@ -467,6 +471,18 @@ static void brw_wm_populate_key( struct brw_context *brw,
if (brw->gen < 6)
key->input_slots_valid = brw->vue_map_geom_out.slots_valid;
/* _NEW_COLOR | _NEW_BUFFERS */
/* Pre-gen6, the hardware alpha test always used each render
* target's alpha to do alpha test, as opposed to render target 0's alpha
* like GL requires. Fix that by building the alpha test into the
* shader, and we'll skip enabling the fixed function alpha test.
*/
if (brw->gen < 6 && ctx->DrawBuffer->_NumColorDrawBuffers > 1 && ctx->Color.AlphaEnabled) {
key->alpha_test_func = ctx->Color.AlphaFunc;
key->alpha_test_ref = ctx->Color.AlphaRef;
}
/* The unique fragment program ID */
key->program_string_id = fp->id;
}

View File

@@ -70,6 +70,8 @@ struct brw_wm_prog_key {
GLushort drawable_height;
GLbitfield64 input_slots_valid;
GLuint program_string_id:32;
GLenum alpha_test_func; /* < For Gen4/5 MRT alpha test */
float alpha_test_ref;
struct brw_sampler_prog_key_data tex;
};

View File

@@ -397,12 +397,12 @@ swrast_map_renderbuffer(struct gl_context *ctx,
stride = w * cpp;
xrb->Base.Buffer = malloc(h * stride);
sPriv->swrast_loader->getImage(dPriv, x, y, w, h,
sPriv->swrast_loader->getImage(dPriv, x, rb->Height - y - h, w, h,
(char *) xrb->Base.Buffer,
dPriv->loaderPrivate);
*out_map = xrb->Base.Buffer;
*out_stride = stride;
*out_map = xrb->Base.Buffer + (h - 1) * stride;
*out_stride = -stride;
return;
}

View File

@@ -660,11 +660,8 @@ set_tex_parameterf(struct gl_context *ctx,
return GL_FALSE;
case GL_TEXTURE_LOD_BIAS:
/* NOTE: this is really part of OpenGL 1.4, not EXT_texture_lod_bias.
* It was removed in core-profile, and it has never existed in OpenGL
* ES.
*/
if (ctx->API != API_OPENGL_COMPAT)
/* NOTE: this is really part of OpenGL 1.4, not EXT_texture_lod_bias. */
if (_mesa_is_gles(ctx))
goto invalid_pname;
if (!target_allows_setting_sampler_parameters(texObj->Target))
@@ -1489,7 +1486,7 @@ _mesa_GetTexParameterfv( GLenum target, GLenum pname, GLfloat *params )
*params = (GLfloat) obj->DepthMode;
break;
case GL_TEXTURE_LOD_BIAS:
if (ctx->API != API_OPENGL_COMPAT)
if (_mesa_is_gles(ctx))
goto invalid_pname;
*params = obj->Sampler.LodBias;
@@ -1677,10 +1674,13 @@ _mesa_GetTexParameteriv( GLenum target, GLenum pname, GLint *params )
*params = (GLint) obj->DepthMode;
break;
case GL_TEXTURE_LOD_BIAS:
if (ctx->API != API_OPENGL_COMPAT)
if (_mesa_is_gles(ctx))
goto invalid_pname;
*params = (GLint) obj->Sampler.LodBias;
/* GL spec 'Data Conversions' section specifies that floating-point
* value in integer Get function is rounded to nearest integer
*/
*params = IROUND(obj->Sampler.LodBias);
break;
case GL_TEXTURE_CROP_RECT_OES:
if (ctx->API != API_OPENGLES || !ctx->Extensions.OES_draw_texture)

View File

@@ -85,9 +85,11 @@ feedback_vertex(struct gl_context *ctx, const struct draw_context *draw,
const GLfloat *color, *texcoord;
GLuint slot;
/* Recall that Y=0=Top of window for Gallium wincoords */
win[0] = v->data[0][0];
win[1] = ctx->DrawBuffer->Height - v->data[0][1];
if (st_fb_orientation(ctx->DrawBuffer) == Y_0_TOP)
win[1] = ctx->DrawBuffer->Height - v->data[0][1];
else
win[1] = v->data[0][1];
win[2] = v->data[0][2];
win[3] = 1.0F / v->data[0][3];