Compare commits

...

41 Commits

Author SHA1 Message Date
Ian Romanick
fcb4eabb5f mesa: Bump version to 10.1-rc2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-02-21 14:29:44 -08:00
Kenneth Graunke
7cb4dea765 i965: Create a hardware context before initializing state module.
brw_init_state() calls brw_upload_initial_gpu_state().  If hardware
contexts are enabled (brw->hw_ctx != NULL), this will upload some
initial invariant state for the GPU.  Without hardware contexts, we
rely on this state being uploaded via atoms that subscribe to the
BRW_NEW_CONTEXT bit.

Commit 46d3c2bf4d accidentally moved
the call to brw_init_state() before creating a hardware context.
This meant brw_upload_initial_gpu_state would always return early.
That would be fine, except that on Gen6+ we stopped uploading the
initial GPU state via state atoms, so the upload never happened at all.

Fixes a regression since 46d3c2bf4d.
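
A minimal sketch of the restored ordering, assuming libdrm's
drm_intel_gem_context_create() as the context-creation call (the exact
surrounding code in brw_context.c may differ):

    /* Create the hardware context first... */
    brw->hw_ctx = drm_intel_gem_context_create(brw->bufmgr);

    /* ...so that brw_init_state() -> brw_upload_initial_gpu_state()
     * sees brw->hw_ctx != NULL and actually uploads the initial
     * invariant state instead of returning early. */
    brw_init_state(brw);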

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3663bbe773)
2014-02-21 14:28:37 -08:00
Ian Romanick
b498fb9586 glsl: Only warn for macro names containing __
From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:

    "In addition, all identifiers containing two consecutive underscores
     (__) are reserved as possible future keywords."

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Names simply containing __ are dangerous to use, but should
be allowed.

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 2c85fd5a96)
2014-02-20 13:41:14 -08:00
Ian Romanick
3731a4fae4 glcpp: Only warn for macro names containing __
Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:

    "All macro names containing two consecutive underscores ( __ ) are
    reserved for future use as predefined macro names. All macro names
    prefixed with "GL_" ("GL" followed by a single underscore) are also
    reserved."

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error.  Names simply
containing __ are dangerous to use, but should be allowed.  In similar
cases, the C++ preprocessor specification says, "no diagnostic is
required."

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
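
For illustration (example shader text not taken from the commit), the
resulting behavior is:

    #define my__macro 1   // contains __: dangerous, but only a warning
    #define GL_MY_EXT 1   // "GL_" prefix: reserved, still an error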

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 0bd7892630)
2014-02-20 13:41:09 -08:00
Anuj Phogat
d623eeb37a glsl: Fix condition to generate shader link error
GL_ARB_ES2_compatibility doesn't say anything about shader linking
when one of the shaders (vertex or fragment) is absent. So, the
extension shouldn't change the behavior specified in the GLSL
specification.

Tested the behavior on the proprietary Linux drivers from NVIDIA and
AMD. Both of them allow linking a version 100 shader program in an
OpenGL context when one of the shaders is absent.

Makes the following Khronos CTS tests pass:
successfulcompilevert_linkprogram.test
successfulcompilefrag_linkprogram.test
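
As an illustration (hypothetical client code, not from the commit), a
version 100 program with only a vertex shader attached should link:

    GLuint prog = glCreateProgram();
    GLuint vs = glCreateShader(GL_VERTEX_SHADER);
    const char *src =
       "#version 100\n"
       "void main() { gl_Position = vec4(0.0); }\n";
    glShaderSource(vs, 1, &src, NULL);
    glCompileShader(vs);
    glAttachShader(prog, vs);
    glLinkProgram(prog);  /* must not fail just because no FS is attached */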

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 03597cf802)
2014-02-18 16:47:33 -08:00
Anuj Phogat
6534d80cca mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target()
Fixes failing Khronos CTS test packed_depth_stencil_init.test

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6bd2472a8b)
2014-02-18 16:47:33 -08:00
Michel Dänzer
20eb466999 r600g,radeonsi: Consolidate logic for short-circuiting flushes
Fixes radeonsi emitting command streams to the kernel even when there
have been no draw calls before a flush, potentially powering up the GPU
needlessly.

Incidentally, this also cuts the runtime of piglit gpu.py in about half
on my Kaveri system, probably because an X11 client going away no longer
always results in a command stream being submitted to the kernel via
glamor.
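
A minimal sketch of the short-circuit (field names are illustrative,
not the actual shared r600g/radeonsi code):

    static void r600_flush(struct pipe_context *ctx,
                           struct pipe_fence_handle **fence)
    {
       struct r600_common_context *rctx = (struct r600_common_context *)ctx;

       /* Nothing has been queued since the last flush and nobody is
        * waiting on a fence: skip the kernel submission entirely. */
       if (!rctx->num_draw_calls_since_flush && !fence)
          return;

       /* ...otherwise build and submit the command stream as before. */
    }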

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65761
Cc: "10.1" mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit cf0172d46a)
2014-02-18 16:47:33 -08:00
Kusanagi Kouichi
706eef0cfe targets/vdpau: Always use c++ to link
If built without llvm, the following error occurs with mplayer:

Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE
[vo/vdpau] Error when calling vdp_device_create_x11: 1

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 61f6cddef7)
2014-02-18 16:47:33 -08:00
Carl Worth
a889b8e9aa main: Avoid double-free of shader Label
As documented, the _mesa_free_shader_program_data function:

	"Frees all the data that hangs off a shader program object, but not
	the object itself."

This means that this function may be called multiple times on the same object
(and has been observed to be). Meanwhile, the shProg->Label field was not being
set to NULL after its free(). This led to a second call to free() on the same
address the next time this function was called.

Fix this by setting this field to NULL after free(), just as with all other
calls to free() in this function.
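
The fix is the usual free-and-clear idiom:

    free(shProg->Label);
    shProg->Label = NULL;  /* a later call sees NULL; free(NULL) is a no-op */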

Reviewed-by: Brian Paul <brianp@vmware.com>

CC: mesa-stable@lists.freedesktop.org
(cherry picked from commit a92581acf2)
2014-02-18 16:47:33 -08:00
Alex Deucher
02d96b7e9f radeon: reverse DBG_NO_HYPERZ logic
Change the flag to DBG_HYPERZ and reverse the logic
so that setting the flag enables the feature.  This
disables hyperz on r600g and radeonsi by default.  It
can be enabled by setting the env var.  There are just
too many issues with certain apps, so leave it disabled
for now until we sort them out.
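
A sketch of the reversed check (flag and field names are illustrative):

    /* Before: hyperz was on unless DBG_NO_HYPERZ was set.
     * After:  hyperz is off unless DBG_HYPERZ is set via the debug env var. */
    rctx->hyperz_enabled = !!(rscreen->debug_flags & DBG_HYPERZ);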

Bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=58660
https://bugs.freedesktop.org/show_bug.cgi?id=64471
https://bugs.freedesktop.org/show_bug.cgi?id=66352
https://bugs.freedesktop.org/show_bug.cgi?id=68799
https://bugs.freedesktop.org/show_bug.cgi?id=72685
https://bugs.freedesktop.org/show_bug.cgi?id=73088
https://bugs.freedesktop.org/show_bug.cgi?id=74428
https://bugs.freedesktop.org/show_bug.cgi?id=74803
https://bugs.freedesktop.org/show_bug.cgi?id=74863
https://bugs.freedesktop.org/show_bug.cgi?id=74892
https://bugzilla.kernel.org/show_bug.cgi?id=70411

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>
Acked-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 01e6371149)
2014-02-18 16:45:05 -08:00
Ilia Mirkin
a1b6aa9fe2 nouveau: fix chipset checks for nv1a by using the oclass instead
Commit f4ebcd133b ("dri/nouveau: NV17_3D class is not available for
NV1a chipset") fixed this partially by using the correct 3d class.
However, there were a lot of checks left over that compared against the
chipset.
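
A sketch of the kind of check being replaced (helper names are
illustrative, not the exact dri/nouveau code):

    /* Before: keyed off the chipset, which mis-classifies nv1a. */
    if (ctx->screen->device->chipset >= 0x17)
       use_nv17_feature(ctx);   /* hypothetical helper */

    /* After: keyed off the 3D object class that was actually created. */
    if (ctx->eng3d->oclass >= NV17_3D_CLASS)
       use_nv17_feature(ctx);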

Reported-and-tested-by: John F. Godfrey <jfgodfrey@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0c8b165366)
2014-02-18 16:45:01 -08:00
Fredrik Höglund
150b1f0aac mesa: Preserve the NewArrays state when copying a VAO
Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72895
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 9afbd04d89)
2014-02-18 16:44:58 -08:00
Matt Turner
50066dc544 glsl: Do not vectorize vector array dereferences.
Array dereferences must have scalar indices, so we cannot vectorize
them.
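
Roughly, the situation is (illustrative shader code):

    float a[2];
    v.x = a[0];
    v.y = a[1];

which must not be fused into a single vector operation: that would
require dereferencing the array with a non-scalar index, which the IR
has no way to express.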

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reported-by: Andrew Guertin <lists@dolphinling.net>
Tested-by: Andrew Guertin <lists@dolphinling.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 025d99ce3c)
2014-02-18 16:44:53 -08:00
Emil Velikov
088d642b8f dri/nouveau: Pass the API into _mesa_initialize_context
Currently we create an OPENGL_COMPAT context regardless of
what was requested by the program. Correct that by retaining
the program's request and passing it into _mesa_initialize_context.

Based on a similar commit for radeon/r200 by Ian Romanick.

Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 76d9f6d972)
2014-02-18 16:44:50 -08:00
Daniel Kurtz
69bd4ed017 glsl: Add locking to builtin_builder singleton
Consider a multithreaded program with two contexts A and B, and the
following scenario:

1. Context A calls initialize(), which allocates mem_ctx and starts
   building built-ins.
2. Context B calls initialize(), which sees mem_ctx != NULL and assumes
   everything is already set up.  It returns.
3. Context B calls find(), which fails to find the built-in since it
   hasn't been created yet.
4. Context A finally finishes initializing the built-ins.

This will break at step 3.  Adding a lock ensures that subsequent
callers of initialize() will wait until initialization is actually
complete.

Similarly, if any thread calls release() while another thread is still
initializing, or calling find(), the mem_ctx/shader would get freed out
from under it, leading to corruption or use-after-free crashes.

Fixes sporadic failures in Piglit's glx-multithread-shader-compile.
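
A minimal sketch of the locking pattern (using pthreads here for
illustration; the exact mutex type and signatures in
builtin_functions.cpp may differ):

    #include <pthread.h>

    static pthread_mutex_t builtins_lock = PTHREAD_MUTEX_INITIALIZER;

    void _mesa_glsl_initialize_builtin_functions()
    {
       pthread_mutex_lock(&builtins_lock);
       builtins.initialize();   /* safe even if another thread won the race */
       pthread_mutex_unlock(&builtins_lock);
    }

    void _mesa_glsl_release_builtin_functions()
    {
       pthread_mutex_lock(&builtins_lock);
       builtins.release();      /* can no longer free mem_ctx mid-find() */
       pthread_mutex_unlock(&builtins_lock);
    }

find() is wrapped the same way, so it cannot observe a half-built table.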

Bugzilla: https://bugs.freedesktop.org/69200
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b47d231526)
2014-02-18 16:44:46 -08:00
Kenneth Graunke
62a358892f mesa: Fix MESA_FORMAT_Z24_UNORM_S8_UINT vs. X8_UINT mix-up.
In commit eeed49f5f2, Mark accidentally
renamed MESA_FORMAT_S8_Z24 to MESA_FORMAT_Z24_UNORM_X8_UINT and
MESA_FORMAT_X8_Z24 to MESA_FORMAT_Z24_UNORM_S8_UINT, reversing their
sense.  The commit message was correct, but the sed commands that were
actually run didn't match it.

This patch swaps the two enum names, reversing them.  This should undo
the damage, but might break things if people have manually fixed a few
instances in the meantime...
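
In the sed style used below, the swap amounts to (illustrative; a
temporary name is needed to exchange the two):

    s/MESA_FORMAT_Z24_UNORM_S8_UINT\b/MESA_FORMAT_TMP/g
    s/MESA_FORMAT_Z24_UNORM_X8_UINT\b/MESA_FORMAT_Z24_UNORM_S8_UINT/g
    s/MESA_FORMAT_TMP\b/MESA_FORMAT_Z24_UNORM_X8_UINT/g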

Mark's commit also failed to mention renames:
s/MESA_FORMAT_ARGB2101010_UINT\b/MESA_FORMAT_B10G10R10A2_UINT/g
s/MESA_FORMAT_ABGR2101010\b/MESA_FORMAT_R10G10B10A2_UNORM/g
but those seem okay.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a487ef87fe)
2014-02-10 08:45:16 -08:00
Ilia Mirkin
290648b076 nouveau/video: make sure that firmware is present when checking caps
Apparently some players are ill-prepared for us claiming that a decoder
exists only to have its creation fail, and express this poor preparation
with crashes (e.g. flash). Check that the firmware is there to increase
the chances of a high correlation between reported capabilities and the
ability to create a decoder.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 40dd777b33)
2014-02-10 08:44:56 -08:00
Grigori Goronzy
7f97f1fce4 gallium: add geometry shader output limits
v2: adjust limits for radeonsi and llvmpipe
v3: add documentation

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit d34d5fddf8)
2014-02-10 08:44:52 -08:00
Ilia Mirkin
33169597f7 nv30: report 8 maximum inputs
nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for
nv40/nv30. This fixes compilation of the varying-packing tests.
Furthermore, the last 2 inputs on nv4x don't seem to work in those
tests, so just report 8 everywhere for now.

Tested on NV42, NV44. NV4B appears to have additional problems.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.1 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 356aff3a5c)
2014-02-10 08:44:48 -08:00
Christoph Bumiller
966f2d3db8 nv50/ir/ra: some register spilling fixes
Cc: 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e9ee44797)
2014-02-10 08:44:41 -08:00
Brian Paul
3cefbe5cf5 mesa: update assertion in detach_shader() for geom shaders
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74723
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
(cherry picked from commit c325ec8965)
2014-02-10 08:44:37 -08:00
Ian Romanick
1e6bba58d8 mesa: Bump version to 10.1-rc1
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-02-07 18:34:08 -08:00
Christoph Bumiller
137a0fe5c8 nvc0: handle TGSI_SEMANTIC_LAYER
Cc: 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 882e98e5e6)
2014-02-07 17:10:11 -08:00
Kristian Høgsberg
70e8ec38b5 glx: Pass NULL DRI drawables into the DRI driver for None GLX drawables
GLX_ARB_create_context allows making a GLX context current with a None
drawable and readable, but this was never implemented correctly in GLX.
We would create a __DRIdrawable for the None GLX drawable and pass that
to the DRI driver and that would somehow work.  Now it's somehow broken.

The way this should have worked is that we pass a NULL DRI drawable
to the DRI driver when the GLX user calls glXMakeContextCurrent()
with None for the drawable and readable.
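
A minimal sketch of the intended mapping (driFetchDrawable() as in
dri_common.c; surrounding details illustrative):

    /* When the application passes None, hand the driver a NULL
     * __DRIdrawable instead of a dummy one. */
    __DRIdrawable *dri_draw =
       (draw == None) ? NULL : driFetchDrawable(gc, draw);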

https://bugs.freedesktop.org/show_bug.cgi?id=74143
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
(cherry picked from commit f658150639)
2014-02-07 17:10:11 -08:00
Kristian Høgsberg
c79a7ef9a3 i965: Move intel_prepare_render() above first buffer access
The driver is supposed to ensure buffers before any drawing operation, but in
do_blit_drawpixels() and do_blit_copypixels() we inspect the buffer format
before calling intel_prepare_render().  That was covered up by the
unconditional call to intel_prepare_render() in intelMakeCurrent(), but we
now only do this on the initial intelMakeCurrent call for a context
(to get the size for the initial viewport values).

https://bugs.freedesktop.org/show_bug.cgi?id=74083

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Alexander Monakov <amonakov@gmail.com>
(cherry picked from commit 44338cd826)
2014-02-07 17:10:11 -08:00
Christoph Bumiller
17aeb3fdc9 nvc0/ir/emit: hardcode vertex output stream to 0 for now
(cherry picked from commit b7233acf78)
2014-02-07 17:10:11 -08:00
Kenneth Graunke
ecaf9259e9 glsl: Don't lose precision qualifiers when encountering "centroid".
Mesa fails to retain the precision qualifier when parsing:

   #version 300 es
   centroid in mediump vec2 v;

Consider how the parser's type_qualifier production is applied.
First, the precision_qualifier rule creates a new ast_type_qualifier:

    <precision: mediump>

Then the storage_qualifier rule creates a second one:

    <flags: in>

and calls merge_qualifier() to fold in any previous qualifications,
returning:

    <flags: in, precision: mediump>

Finally, the auxiliary_storage_qualifier creates one for "centroid":

    <flags: centroid>

It then does $$ = $1 and $$.flags |= $2.flags, resulting in:

    <flags: centroid, in>

Since precision isn't stored in the flags bitfield, it is lost.  We need
to instead call merge_qualifier to combine all the fields.
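
A sketch of the corrected parser action (illustrative; the real
glsl_parser.yy action may differ in detail):

    $$ = $1;
    if (!$$.merge_qualifier(&@1, state, $2))
       YYERROR;

so that precision (and any other non-flag fields) are folded in rather
than dropped.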

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2062f40d81)
2014-02-07 17:10:10 -08:00
Brian Paul
0fb761b404 st/mesa: avoid sw fallback for getting/decompressing textures
If st_GetTexImage() is to decompress the texture, avoid the fallback
path even if prefer_blit_based_texture_transfer = false.  For drivers
that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we
were always taking the fallback path for texture decompression rather
than rendering a quad.  The latter is a lot faster.
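
A sketch of the adjusted check (illustrative; the real condition in
st_cb_texture.c may differ):

    /* Decompression via the blit path renders a quad on the GPU, which
     * is much faster than the CPU fallback, so take it whenever the
     * texture is compressed -- not only when the driver prefers blits. */
    if (!st->prefer_blit_based_texture_transfer &&
        !util_format_is_compressed(src->format))
       goto fallback;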

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f47e596288)
2014-02-07 17:10:10 -08:00
Ilia Mirkin
31911f8d37 nv50: only over-allocate by a page for code
The pre-fetching doesn't go too far. Tested with over-allocating by only
a page, and didn't see any errors in dmesg. Saves ~512KB of VRAM.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
(cherry picked from commit f76c7ad5b1)
2014-02-07 17:10:10 -08:00
Ilia Mirkin
142f6cc0b4 nv50: fix layerid to be the fp input number rather than vp output number
In the tests they were the same so it didn't matter, but indications are
that this is the correct behaviour. Also take this opportunity to
(trivially) support using gl_Layer in fp.

Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
(cherry picked from commit 364bdd2419)
2014-02-07 17:10:10 -08:00
Ilia Mirkin
156ac628a8 nv50: rework primid logic
Functionally identical but much simpler. Should also better integrate
with future layer/viewport changes/fixes.

Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
(cherry picked from commit c7373b7dc7)
2014-02-07 17:10:10 -08:00
Matt Turner
7aa84761b6 glsl: Initialize ubo_binding_mask flags to zero.
Missed in commit e63bb298. Caused sporadic test failures, like
incorrect-in-layout-qualifier-repeated-prim.geom.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit e2ef93cf94)
2014-02-07 17:10:10 -08:00
Marek Olšák
61219adb3d st/mesa: fix crash when a shader uses a TBO and it's not bound
This binds a NULL sampler view in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74251

Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit c6dbcf10df)
2014-02-07 17:10:10 -08:00
Paul Berry
ee632e68bd glsl: Fix continue statements in do-while loops.
From the GLSL 4.40 spec, section 6.4 (Jumps):

    The continue jump is used only in loops. It skips the remainder of
    the body of the inner most loop of which it is inside. For while
    and do-while loops, this jump is to the next evaluation of the
    loop condition-expression from which the loop continues as
    previously defined.

Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.

This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR.  (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).
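
For example (illustrative shader code; doWork() is hypothetical):

    int i = 0;
    do {
       i++;
       if (i < 4)
          continue;   /* must jump to the "i < 8" test below... */
       doWork(i);
    } while (i < 8);  /* ...not back to the top of the loop body */

With this patch, the "continue" is lowered by emitting a copy of the
condition check before the jump, mirroring what is already done for the
loop expression of "for" loops.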

Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test

Cc: mesa-stable@lists.freedesktop.org

Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7f5740899f)
2014-02-07 17:10:10 -08:00
Paul Berry
b5c99be4af glsl: Make condition_to_hir() callable from outside ast_iteration_statement.
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).

This will be necessary in order to make continue statements work
properly in do-while loops.

Cc: mesa-stable@lists.freedesktop.org

Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 56790856b3)
2014-02-07 17:10:10 -08:00
Topi Pohjolainen
165868d45e i965/blorp: do not use unnecessary hw-blending support
This is really not needed, as blorp blit programs already sample
XRGB normally and get the alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message, and hence there is no need for
any additional blending support from the pixel processing pipeline.

The blending formula is broken for color components anyway: it
multiplies each color component with itself (the blend factor is the
component itself). Alpha blending, in turn, would not force the alpha
to one independently of the source, but would simply use the source
alpha as is (1.0 * src_alpha + 0.0 * dst_alpha).

Quoting Eric:

 "If we want to actually make the no-alpha-bits-present thing work,
  we need to override the bits in the surface state or in the
  generated code.  In the normal draw path, it's done for sampling
  by the swizzling code in brw_wm_surface_state.c, and the blending
  overrides is just to fix up the alpha blending stage which
  doesn't pay attention to that for the destination surface."

If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.

This is effectively a revert of c0554141a9:

    i965/blorp: Support overriding destination alpha to 1.0.

    Currently, Blorp requires the source and destination formats to be
    equal.  However, we'd really like to be able to blit between XRGB and
    ARGB formats; our BLT engine paths have supported this for a long time.

    For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
    interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
    channel to 1.0 when writing the destination colors.  This is fairly
    straightforward with blending.

    For now, this code is never used, as the source and destination formats
    still must be equal.  The next patch will relax that restriction.

    NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 933be19cdf)
2014-02-07 17:10:10 -08:00
Christian König
bbcd975881 radeon/uvd: fix feedback buffer handling v2
Without the correct feedback buffer size UVD runs
into an error on each frame, reducing the maximum FPS.

v2: address Michel's comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "10.1" "10.0" "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c3c24c3acc)
2014-02-07 17:10:10 -08:00
Brian Paul
6cfcc4fccf draw: fix incorrect color of flat-shaded clipped lines
When we clipped a line, we weren't copying the provoking vertex
color to the second vertex.  We also weren't checking for
first vs. last provoking vertex.

Fixes failures found with the new piglit line-flat-clip-color test.

Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit fc3fcd1e01)
2014-02-07 17:10:09 -08:00
Brian Paul
39a3b0313b gallium/auxiliary/indices: replace free() with FREE()
To match the CALLOC_STRUCT() call.

Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 307fd76053)
2014-02-07 17:10:09 -08:00
Dave Airlie
9e59e41266 docs: update 10.1 relnotes to note GL 3.3 on r600 and radeonsi.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-02-06 01:03:09 +00:00
Dave Airlie
1289080c4d r600g: Add GL 3.3 support for 10.1 release
All patches below are also on master, except the max samplers
bump, which was removed on master.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>

commit 57c6bb18822ebf88a98b98714c846608ff3ba42b
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Feb 6 00:48:57 2014 +0000

    bump max samplers

commit 2e4bd244493bebd41edf725a2c3c4e793282a5bb
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Jan 30 04:19:57 2014 +0000

    r600g: add support for geom shaders to r600/r700 chipsets (v2)

    This is my first attempt at enabling r600/r700 geometry shaders;
    the basic tests pass on both my rv770 and my rv635.

    It requires this kernel patch:
    http://www.spinics.net/lists/dri-devel/msg52745.html

    v2: address Alex's comments.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 0ed4f769d77c4db2259befba5fc1707f1cb5cb98
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jan 29 21:48:09 2014 +0000

    r600g: enable GLSL 3.30 on evergreen GPUs

    This throws the switch to enable GL 3.3 and GLSL 330.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit aeca8f21dd42b9ecd3932ef028fa8846036c1307
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Feb 4 10:48:42 2014 +1000

    r600g: properly propagate clip dist write value

    This moves the value from the GS shader to the copy shader so the registers
    are set up correctly.

    fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit e1bc410fe670bb17078a55876f1700a504127fef
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Feb 3 15:31:26 2014 +1000

    r600g: calculate a better value for array_size (v2)

    Attempt to calculate a better value for array_size to avoid breaking apps.

    v2: use 0xfff like streamout, suggested by Grigori

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 6f2f117dec51eb51c1b09e86e829e176a98e3bfc
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 31 03:35:51 2014 +0000

    r600g: fix CAYMAN geometry shader support

    Cayman has a different end-of-program bit, so handle that properly.

    Fixes hangs with geom shader tests on Cayman.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 305ea22fd517f83406aba3e3930d710fd42a3049
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jan 29 00:17:15 2014 +0000

    r600g: fix up shader out misc stuff for copy shader

    Set the correct values so the misc out register is set up correctly
    for the copy shader.

    This also updates the state for the gs copy shader so the hw
    gets programmed correctly.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 53630e14c8791a84798a03d74653bf46bd013fc7
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jan 28 23:15:29 2014 +0000

    r600g: port the layered surface rendering patch from radeonsi

    This just makes r600 and evergreen do what the radeonsi codepaths do
    for layered rendering. This makes the 2d amd_vertex_shader_layer test
    pass on evergreen.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit aa4cd3b9bed1ea23468fba4aa5c428153e8cddc1
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jan 28 13:04:00 2014 +1000

    r600g: initial VS output layer support

    This just adds support for emitting the proper value in the VS out misc.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 75a93f2e1e0f4d6015cdf63570ec4d3d12478b8d
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jan 28 12:06:49 2014 +1000

    r600g: setup const texture buffers for geom shaders

    This just enables the workarounds we have for vertex/pixel shaders
    for geom shaders as well.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 88697a860635aae54e56dce2d6a839a06dea0c5a
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 24 17:14:26 2014 +1000

    r600g: calculate correct cut value

    This selects the cut value depending on the shader selected.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit dfb88bef3e13112a838773e700c35052774f8a63
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 24 14:46:37 2014 +1000

    r600g: fix dynamic_input_array_index.shader_test

    This follows what fglrx does: it unpacks the input we are
    going to indirect into a bunch of registers and indirects
    inside them.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit a3c6373f8cf3aab750399654a4b77150ec30bce9
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 24 13:39:36 2014 +1000

    r600g: add support for indirect geom ring writes

    We need to be able to write to the ring using a base register
    for when we emit vertices in a loop.  In theory the SB compiler
    could collapse these indirect writes to direct writes if the
    register value is constant and known, but that is outside my
    pay grade.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit dbc6a13adf935b118eaa6b396593f50d7b7e16e6
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Dec 24 05:59:19 2013 +0000

    r600g: write proper output prim type

    Vadim's code derived it from the info.mode, but it needs
    to be taken from the geometry shader output primitive.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit f7f51b0b775f652967e2b972cf7c183482a771be
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Dec 24 05:30:37 2013 +0000

    r600g: enable instance cnt register with new enough kernel

    The instance cnt register was missing for a few kernels;
    with a new enough kernel we can output it.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 9e6ce37f66372018ec5398f74c3b43ff5f5bf309
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Dec 23 01:30:03 2013 +0000

    r600g: add primitive input support for gs

    Only enable prim id if the GS uses it.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit fa932dfc7df3cf9ff63d08fb0e1db2119fc2ac93
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Dec 19 05:17:00 2013 +0000

    r600g: emit streamout from dma copy shader

    This enables streamout with GS in the mix, from the
    VS dma shader.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 205defb542ac185b7f46508fd51a4077a4702107
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Dec 18 15:55:07 2013 +1000

    r600g/gs: fix cases where number of gs inputs != number of gs outputs

    This fixes a bunch of the geom shader built-in tests.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit d9e7ab40bc45644194c86f842599c76d0675243c
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jan 28 10:21:03 2014 +1000

    r600g: increase array base for exported parameters

    Trivial fix to Vadim's code.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 82d67fbd3b96b6b2cc0124a19b6f31b7912ec152
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 24 16:41:32 2014 +1000

    r600g: initialise the geom shader loop registers.

    As we do for vertex and pixel shaders.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 78be55d98d290d708bd1b3df3ef6cd5fa89865c7
Author: Dave Airlie <airlied@redhat.com>
Date:   Sat Nov 30 06:26:13 2013 +0000

    r600g: emit NOPs at end of shaders in more cases

    If the shader has no CF clauses at all, emit a NOP.
    If the last instruction is an ENDLOOP, add a NOP for the LOOP to go to.
    If the last instruction is CALL_FS, add a NOP.

    These fix a bunch of hangs in the geometry shader tests.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 634b2498dc73efa3cca5a6fc3ed35c5bea6bb2e9
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Nov 28 23:38:35 2013 +0000

    r600g: don't enable SB for geom shaders

    SB needs fixes for three GS instructions; it seems to raise
    them outside loops etc. despite my best efforts.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 5b61dd0e917e54625ac227b8b1c2c82955f51ab1
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Dec 24 04:56:25 2013 +0000

    r600g/sb: add MEM_RING support

    Although we don't use SB on geom shaders, the VS copy shader will use it,
    so we might as well implement MEM_RING support in SB.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 0247375aec4681c154ae4d14b8cd637e7a9e0e3e
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jan 29 04:08:43 2014 +0000

    r600g: don't fail if we can't map VS->GS ring entries

    This can happen in normal operation, so don't report an error on it,
    just continue.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 2c986600fac6cb5692e9e377cb04f9f50389172c
Author: Vadim Girlin <vadimgirlin@gmail.com>
Date:   Fri Aug 2 06:38:23 2013 +0400

    r600g: initial support for geometry shaders on evergreen (v2)

    This is Vadim's initial work with a few regression fixes squashed in.

    v2: (airlied)
    fix regression in glsl-max-varyings - need to use vs and ps_dirty
    fix regression in shader exports from rebasing.
    whitespace fixing.
    v2.1: squash fix assert

    Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit ce23c43e2b611f30964afe4d1c02c4d0361ba430
Author: Vadim Girlin <vadimgirlin@gmail.com>
Date:   Fri Aug 2 06:32:32 2013 +0400

    r600g: add hw register definitions for GS block setup

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit b0ec79c28d6373930ca0dc19168dd504204456b5
Author: Vadim Girlin <vadimgirlin@gmail.com>
Date:   Wed Jul 31 23:09:39 2013 +0400

    r600g: defer shader variant selection and dependent state updates

    [airlied: fix dropped streamout line - fix for master]

    Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit e41cbfb4d15d519f9301699f39d7dd0153f2edf4
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Jan 13 10:19:00 2014 +1000

    r600g/bc: add support for indexed memory writes.

    It looks like we need these for geom shaders in the future.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 46efb1648e883b2cb231cca38c1540e7e9ec1ecc
Author: Vadim Girlin <vadimgirlin@gmail.com>
Date:   Wed Jul 31 20:02:22 2013 +0400

    r600g: move barrier and end_of_program bits from output to cf struct (v2)

    v2: fix regression on r600 NOP instructions.

    Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

commit 42802d5d8d145f07cf3fca1bb6e8ab0cd1fd5c85
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jan 29 01:33:14 2014 +0000

    r600g: split streamout emit code into a separate function

    For geometry shaders we need to call this code from a second place.

    Just move it out for now to keep future patches cleaner.

    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-02-06 00:49:58 +00:00
119 changed files with 2305 additions and 742 deletions

View File

@@ -1 +1 @@
10.1.0-devel
10.1.0-rc2

View File

@@ -52,7 +52,7 @@ it.</li>
<li>GL_AMD_shader_trinary_minmax.</li>
<li>GL_EXT_framebuffer_blit on r200 and radeon.</li>
<li>Reduced memory usage for display lists.</li>
<li>OpenGL 3.3 support on nv50, nvc0</li>
<li>OpenGL 3.3 support on nv50, nvc0, r600 and radeonsi</li>
</ul>

View File

@@ -588,7 +588,12 @@ do_clip_line( struct draw_stage *stage,
if (v0->clipmask) {
interp( clipper, stage->tmp[0], t0, v0, v1, viewport_index );
copy_flat(stage, stage->tmp[0], v0);
if (stage->draw->rasterizer->flatshade_first) {
copy_flat(stage, stage->tmp[0], v0); /* copy v0 color to tmp[0] */
}
else {
copy_flat(stage, stage->tmp[0], v1); /* copy v1 color to tmp[0] */
}
newprim.v[0] = stage->tmp[0];
}
else {
@@ -597,6 +602,12 @@ do_clip_line( struct draw_stage *stage,
if (v1->clipmask) {
interp( clipper, stage->tmp[1], t1, v1, v0, viewport_index );
if (stage->draw->rasterizer->flatshade_first) {
copy_flat(stage, stage->tmp[1], v0); /* copy v0 color to tmp[1] */
}
else {
copy_flat(stage, stage->tmp[1], v1); /* copy v1 color to tmp[1] */
}
newprim.v[1] = stage->tmp[1];
}
else {

View File

@@ -74,7 +74,7 @@ void
util_primconvert_destroy(struct primconvert_context *pc)
{
util_primconvert_save_index_buffer(pc, NULL);
free(pc);
FREE(pc);
}
void

View File

@@ -178,6 +178,12 @@ The integer capabilities:
ARB_framebuffer_object is provided.
* ``PIPE_CAP_TGSI_VS_LAYER``: Whether TGSI_SEMANTIC_LAYER is supported
as a vertex shader output.
* ``PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES``: The maximum number of vertices
output by a single invocation of a geometry shader.
* ``PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS``: The maximum number of
vertex components output by a single invocation of a geometry shader.
This is the product of the number of attribute components per vertex and
the number of output vertices.
.. _pipe_capf:

View File

@@ -210,6 +210,11 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 0;
/* Geometry shader output, unsupported. */
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 0;
/* Texturing. */
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:

View File

@@ -264,6 +264,11 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap)
case PIPE_CAP_MAX_RENDER_TARGETS:
return 1;
/* Geometry shader output, unsupported. */
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 0;
/* Fragment coordinate conventions. */
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT:
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:

View File

@@ -374,6 +374,9 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
return ILO_MAX_SO_BINDINGS / ILO_MAX_SO_BUFFERS;
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return ILO_MAX_SO_BINDINGS;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 0;
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
if (is->dev.gen >= ILO_GEN(7))
return is->dev.has_gen7_sol_reset;

View File

@@ -189,6 +189,9 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 16*4;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 1024;
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
return 1;
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:

View File

@@ -71,7 +71,6 @@ struct nv50_ir_varying
#define NV50_SEMANTIC_CLIPDISTANCE (TGSI_SEMANTIC_COUNT + 0)
#define NV50_SEMANTIC_VIEWPORTINDEX (TGSI_SEMANTIC_COUNT + 4)
#define NV50_SEMANTIC_LAYER (TGSI_SEMANTIC_COUNT + 5)
#define NV50_SEMANTIC_INVOCATIONID (TGSI_SEMANTIC_COUNT + 6)
#define NV50_SEMANTIC_TESSFACTOR (TGSI_SEMANTIC_COUNT + 7)
#define NV50_SEMANTIC_TESSCOORD (TGSI_SEMANTIC_COUNT + 8)

View File

@@ -1488,8 +1488,13 @@ CodeEmitterNVC0::emitOUT(const Instruction *i)
// vertex stream
if (i->src(1).getFile() == FILE_IMMEDIATE) {
code[1] |= 0xc000;
code[0] |= SDATA(i->src(1)).u32 << 26;
// Using immediate encoding here triggers an invalid opcode error
// or random results when error reporting is disabled.
// TODO: figure this out when we get multiple vertex streams
assert(SDATA(i->src(1)).u32 == 0);
srcId(NULL, 26);
// code[1] |= 0xc000;
// code[0] |= SDATA(i->src(1)).u32 << 26;
} else {
srcId(i->src(1), 26);
}

View File

@@ -861,8 +861,8 @@ int Source::inferSysValDirection(unsigned sn) const
case TGSI_SEMANTIC_INSTANCEID:
case TGSI_SEMANTIC_VERTEXID:
return 1;
#if 0
case TGSI_SEMANTIC_LAYER:
#if 0
case TGSI_SEMANTIC_VIEWPORTINDEX:
return 0;
#endif

View File

@@ -284,6 +284,7 @@ public:
bool run(const std::list<ValuePair>&);
Symbol *assignSlot(const Interval&, const unsigned int size);
Symbol *offsetSlot(Symbol *, const LValue *);
inline int32_t getStackSize() const { return stackSize; }
private:
@@ -774,6 +775,7 @@ GCRA::RIG_Node::init(const RegisterSet& regs, LValue *lval)
weight = std::numeric_limits<float>::infinity();
degree = 0;
degreeLimit = regs.getFileSize(f, lval->reg.size);
degreeLimit -= relDegree[1][colors] - 1;
livei.insert(lval->livei);
}
@@ -1466,10 +1468,25 @@ SpillCodeInserter::assignSlot(const Interval &livei, const unsigned int size)
return slot.sym;
}
Symbol *
SpillCodeInserter::offsetSlot(Symbol *base, const LValue *lval)
{
if (!base || !lval->compound || (lval->compMask & 0x1))
return base;
Symbol *slot = cloneShallow(func, base);
slot->reg.data.offset += (ffs(lval->compMask) - 1) * lval->reg.size;
slot->reg.size = lval->reg.size;
return slot;
}
void
SpillCodeInserter::spill(Instruction *defi, Value *slot, LValue *lval)
{
const DataType ty = typeOfSize(slot->reg.size);
const DataType ty = typeOfSize(lval->reg.size);
slot = offsetSlot(slot->asSym(), lval);
Instruction *st;
if (slot->reg.file == FILE_MEMORY_LOCAL) {
@@ -1488,8 +1505,9 @@ SpillCodeInserter::spill(Instruction *defi, Value *slot, LValue *lval)
LValue *
SpillCodeInserter::unspill(Instruction *usei, LValue *lval, Value *slot)
{
const DataType ty = typeOfSize(slot->reg.size);
const DataType ty = typeOfSize(lval->reg.size);
slot = offsetSlot(slot->asSym(), lval);
lval = cloneShallow(func, lval);
Instruction *ld;
@@ -1506,6 +1524,16 @@ SpillCodeInserter::unspill(Instruction *usei, LValue *lval, Value *slot)
return lval;
}
// For each value that is to be spilled, go through all its definitions.
// A value can have multiple definitions if it has been coalesced before.
// For each definition, first go through all its uses and insert an unspill
// instruction before it, then replace the use with the temporary register.
// Unspill can be either a load from memory or simply a move to another
// register file.
// For "Pseudo" instructions (like PHI, SPLIT, MERGE) we can erase the use
// if we have spilled to a memory location, or simply with the new register.
// No load or conversion instruction should be needed.
bool
SpillCodeInserter::run(const std::list<ValuePair>& lst)
{
@@ -1524,12 +1552,13 @@ SpillCodeInserter::run(const std::list<ValuePair>& lst)
LValue *dval = (*d)->get()->asLValue();
Instruction *defi = (*d)->getInsn();
// handle uses first or they'll contain the spill stores
// Unspill at each use *before* inserting spill instructions,
// we don't want to have the spill instructions in the use list here.
while (!dval->uses.empty()) {
ValueRef *u = dval->uses.front();
Instruction *usei = u->getInsn();
assert(usei);
if (usei->op == OP_PHI) {
if (usei->isPseudo()) {
tmp = (slot->reg.file == FILE_MEMORY_LOCAL) ? NULL : slot;
last = NULL;
} else
@@ -1541,7 +1570,7 @@ SpillCodeInserter::run(const std::list<ValuePair>& lst)
}
assert(defi);
if (defi->op == OP_PHI) {
if (defi->isPseudo()) {
d = lval->defs.erase(d);
--d;
if (slot->reg.file == FILE_MEMORY_LOCAL)

View File

@@ -532,7 +532,7 @@ recordLocation(uint16_t *locs, uint8_t *masks,
case TGSI_SEMANTIC_INSTANCEID: locs[SV_INSTANCE_ID] = addr; break;
case TGSI_SEMANTIC_VERTEXID: locs[SV_VERTEX_ID] = addr; break;
case TGSI_SEMANTIC_PRIMID: locs[SV_PRIMITIVE_ID] = addr; break;
case NV50_SEMANTIC_LAYER: locs[SV_LAYER] = addr; break;
case TGSI_SEMANTIC_LAYER: locs[SV_LAYER] = addr; break;
case NV50_SEMANTIC_VIEWPORTINDEX: locs[SV_VIEWPORT_INDEX] = addr; break;
default:
break;

View File

@@ -49,6 +49,11 @@ struct nouveau_screen {
boolean hint_buf_keep_sysmem_copy;
struct {
unsigned profiles_checked;
unsigned profiles_present;
} firmware_info;
#ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS
union {
uint64_t v[29];

View File

@@ -21,6 +21,7 @@
*/
#include <sys/mman.h>
#include <sys/stat.h>
#include <stdio.h>
#include <fcntl.h>
@@ -350,6 +351,77 @@ nouveau_vp3_load_firmware(struct nouveau_vp3_decoder *dec,
return 0;
}
static int
firmware_present(struct pipe_screen *pscreen, enum pipe_video_profile profile)
{
struct nouveau_screen *screen = nouveau_screen(pscreen);
int chipset = screen->device->chipset;
int vp3 = chipset < 0xa3 || chipset == 0xaa || chipset == 0xac;
int vp5 = chipset >= 0xd0;
int ret;
/* For all chipsets, try to create a BSP object. Assume that if firmware
* is present for it, firmware is also present for VP/PPP */
if (!(screen->firmware_info.profiles_checked & 1)) {
struct nouveau_object *channel = NULL, *bsp = NULL;
struct nv04_fifo nv04_data = {.vram = 0xbeef0201, .gart = 0xbeef0202};
struct nvc0_fifo nvc0_args = {};
struct nve0_fifo nve0_args = {.engine = NVE0_FIFO_ENGINE_BSP};
void *data = NULL;
int size, oclass;
if (chipset < 0xc0)
oclass = 0x85b1;
else if (vp5)
oclass = 0x95b1;
else
oclass = 0x90b1;
if (chipset < 0xc0) {
data = &nv04_data;
size = sizeof(nv04_data);
} else if (chipset < 0xe0) {
data = &nvc0_args;
size = sizeof(nvc0_args);
} else {
data = &nve0_args;
size = sizeof(nve0_args);
}
/* kepler must have its own channel, so just do this for everyone */
nouveau_object_new(&screen->device->object, 0,
NOUVEAU_FIFO_CHANNEL_CLASS,
data, size, &channel);
if (channel) {
nouveau_object_new(channel, 0, oclass, NULL, 0, &bsp);
if (bsp)
screen->firmware_info.profiles_present |= 1;
nouveau_object_del(&bsp);
nouveau_object_del(&channel);
}
screen->firmware_info.profiles_checked |= 1;
}
if (!(screen->firmware_info.profiles_present & 1))
return 0;
/* For vp3/vp4 chipsets, make sure that the relevant firmware is present */
if (!vp5 && !(screen->firmware_info.profiles_checked & (1 << profile))) {
char path[PATH_MAX];
struct stat s;
if (vp3)
vp3_getpath(profile, path);
else
vp4_getpath(profile, path);
ret = stat(path, &s);
if (!ret && s.st_size > 1000)
screen->firmware_info.profiles_present |= (1 << profile);
screen->firmware_info.profiles_checked |= (1 << profile);
}
return vp5 || (screen->firmware_info.profiles_present & (1 << profile));
}
int
nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
enum pipe_video_profile profile,
@@ -363,8 +435,10 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
/* VP3 does not support MPEG4, VP4+ do. */
return profile >= PIPE_VIDEO_PROFILE_MPEG1 && (
!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4);
return entrypoint == PIPE_VIDEO_ENTRYPOINT_BITSTREAM &&
profile >= PIPE_VIDEO_PROFILE_MPEG1 &&
(!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4) &&
firmware_present(pscreen, profile);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
case PIPE_VIDEO_CAP_MAX_WIDTH:

View File

@@ -108,6 +108,8 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXEL_OFFSET:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
@@ -219,7 +221,7 @@ nv30_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH:
return 0;
case PIPE_SHADER_CAP_MAX_INPUTS:
return (eng3d->oclass >= NV40_3D_CLASS) ? 12 : 10;
return 8; /* should be possible to do 10 with nv4x */
case PIPE_SHADER_CAP_MAX_CONSTS:
return (eng3d->oclass >= NV40_3D_CLASS) ? 224 : 32;
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:

View File

@@ -104,7 +104,7 @@ nv50_vertprog_assign_slots(struct nv50_ir_prog_info *info)
prog->vp.bfc[info->out[i].si] = i;
break;
case TGSI_SEMANTIC_LAYER:
prog->gp.has_layer = true;
prog->gp.has_layer = TRUE;
prog->gp.layerid = n;
break;
default:
@@ -170,10 +170,8 @@ nv50_fragprog_assign_slots(struct nv50_ir_prog_info *info)
if (info->in[i].sn == TGSI_SEMANTIC_COLOR)
prog->vp.bfc[info->in[i].si] = j;
else if (info->in[i].sn == TGSI_SEMANTIC_PRIMID) {
else if (info->in[i].sn == TGSI_SEMANTIC_PRIMID)
prog->vp.attrs[2] |= NV50_3D_VP_GP_BUILTIN_ATTR_EN_PRIMITIVE_ID;
prog->gp.primid = j;
}
prog->in[j].id = i;
prog->in[j].mask = info->in[i].mask;
@@ -345,7 +343,6 @@ nv50_program_translate(struct nv50_program *prog, uint16_t chipset)
prog->vp.clpd[0] = map_undef;
prog->vp.clpd[1] = map_undef;
prog->vp.psiz = map_undef;
prog->gp.primid = 0x80;
prog->gp.has_layer = 0;
info->driverPriv = prog;

View File

@@ -88,9 +88,8 @@ struct nv50_program {
struct {
uint32_t vert_count;
ubyte primid; /* primitive id output register */
uint8_t prim_type; /* point, line strip or tri strip */
bool has_layer;
uint8_t has_layer;
ubyte layerid; /* hw value of layer output */
} gp;

View File

@@ -143,6 +143,9 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
return 64;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 1024;
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
return (class_3d >= NVA0_3D_CLASS) ? 1 : 0;
case PIPE_CAP_BLEND_EQUATION_SEPARATE:
@@ -741,12 +744,13 @@ nv50_screen_create(struct nouveau_device *dev)
goto fail;
}
/* This over-allocates by a whole code BO. The GP, which would execute at
* the end of the last page, would trigger faults. The going theory is that
* it prefetches up to a certain amount. This avoids dmesg spam.
/* This over-allocates by a page. The GP, which would execute at the end of
* the last page, would trigger faults. The going theory is that it
* prefetches up to a certain amount.
*/
ret = nouveau_bo_new(dev, NOUVEAU_BO_VRAM, 1 << 16,
4 << NV50_CODE_BO_SIZE_LOG2, NULL, &screen->code);
(3 << NV50_CODE_BO_SIZE_LOG2) + 0x1000,
NULL, &screen->code);
if (ret) {
NOUVEAU_ERR("Failed to allocate code bo: %d\n", ret);
goto fail;

View File

@@ -346,7 +346,7 @@ nv50_fp_linkage_validate(struct nv50_context *nv50)
struct nv50_varying dummy;
int i, n, c, m;
uint32_t primid = 0;
uint32_t layerid = vp->gp.layerid;
uint32_t layerid = 0;
uint32_t psiz = 0x000;
uint32_t interp = fp->fp.interp;
uint32_t colors = fp->fp.colors;
@@ -401,17 +401,21 @@ nv50_fp_linkage_validate(struct nv50_context *nv50)
if (vp->out[n].sn == fp->in[i].sn &&
vp->out[n].si == fp->in[i].si)
break;
if (i == fp->gp.primid) {
switch (fp->in[i].sn) {
case TGSI_SEMANTIC_PRIMID:
primid = m;
break;
case TGSI_SEMANTIC_LAYER:
layerid = m;
break;
}
m = nv50_vec4_map(map, m, lin,
&fp->in[i], (n < vp->out_nr) ? &vp->out[n] : &dummy);
}
if (vp->gp.has_layer) {
// In GL4.x, layer can be an fp input, but not in 3.x. Make sure to add
// it to the output map.
map[m++] = layerid;
if (vp->gp.has_layer && !layerid) {
layerid = m;
map[m++] = vp->gp.layerid;
}
if (nv50->rast->pipe.point_size_per_vertex) {

View File

@@ -741,16 +741,80 @@ error:
return NULL;
}
#define FIRMWARE_BSP_KERN 0x01
#define FIRMWARE_VP_KERN 0x02
#define FIRMWARE_BSP_H264 0x04
#define FIRMWARE_VP_MPEG2 0x08
#define FIRMWARE_VP_H264_1 0x10
#define FIRMWARE_VP_H264_2 0x20
#define FIRMWARE_PRESENT(val, fw) (val & FIRMWARE_ ## fw)
static int
firmware_present(struct pipe_screen *pscreen, enum pipe_video_format codec)
{
struct nouveau_screen *screen = nouveau_screen(pscreen);
struct nouveau_object *obj = NULL;
struct stat s;
int checked = screen->firmware_info.profiles_checked;
int present, ret;
if (!FIRMWARE_PRESENT(checked, VP_KERN)) {
nouveau_object_new(screen->channel, 0, 0x7476, NULL, 0, &obj);
if (obj)
screen->firmware_info.profiles_present |= FIRMWARE_VP_KERN;
nouveau_object_del(&obj);
screen->firmware_info.profiles_checked |= FIRMWARE_VP_KERN;
}
if (codec == PIPE_VIDEO_FORMAT_MPEG4_AVC) {
if (!FIRMWARE_PRESENT(checked, BSP_KERN)) {
nouveau_object_new(screen->channel, 0, 0x74b0, NULL, 0, &obj);
if (obj)
screen->firmware_info.profiles_present |= FIRMWARE_BSP_KERN;
nouveau_object_del(&obj);
screen->firmware_info.profiles_checked |= FIRMWARE_BSP_KERN;
}
if (!FIRMWARE_PRESENT(checked, VP_H264_1)) {
ret = stat("/lib/firmware/nouveau/nv84_vp-h264-1", &s);
if (!ret && s.st_size > 1000)
screen->firmware_info.profiles_present |= FIRMWARE_VP_H264_1;
screen->firmware_info.profiles_checked |= FIRMWARE_VP_H264_1;
}
/* should probably check the others, but assume that 1 means all */
present = screen->firmware_info.profiles_present;
return FIRMWARE_PRESENT(present, VP_KERN) &&
FIRMWARE_PRESENT(present, BSP_KERN) &&
FIRMWARE_PRESENT(present, VP_H264_1);
} else {
if (!FIRMWARE_PRESENT(checked, VP_MPEG2)) {
ret = stat("/lib/firmware/nouveau/nv84_vp-mpeg12", &s);
if (!ret && s.st_size > 1000)
screen->firmware_info.profiles_present |= FIRMWARE_VP_MPEG2;
screen->firmware_info.profiles_checked |= FIRMWARE_VP_MPEG2;
}
present = screen->firmware_info.profiles_present;
return FIRMWARE_PRESENT(present, VP_KERN) &&
FIRMWARE_PRESENT(present, VP_MPEG2);
}
}
int
nv84_screen_get_video_param(struct pipe_screen *pscreen,
enum pipe_video_profile profile,
enum pipe_video_entrypoint entrypoint,
enum pipe_video_cap param)
{
enum pipe_video_format codec;
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
return u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_MPEG4_AVC ||
u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_MPEG12;
codec = u_reduce_video_profile(profile);
return (codec == PIPE_VIDEO_FORMAT_MPEG4_AVC ||
codec == PIPE_VIDEO_FORMAT_MPEG12) &&
firmware_present(pscreen, codec);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
case PIPE_VIDEO_CAP_MAX_WIDTH:

View File

@@ -64,7 +64,7 @@ nvc0_shader_output_address(unsigned sn, unsigned si, unsigned ubase)
switch (sn) {
case NV50_SEMANTIC_TESSFACTOR: return 0x000 + si * 0x4;
case TGSI_SEMANTIC_PRIMID: return 0x060;
case NV50_SEMANTIC_LAYER: return 0x064;
case TGSI_SEMANTIC_LAYER: return 0x064;
case NV50_SEMANTIC_VIEWPORTINDEX: return 0x068;
case TGSI_SEMANTIC_PSIZE: return 0x06c;
case TGSI_SEMANTIC_POSITION: return 0x070;

View File

@@ -127,6 +127,9 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 128;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 1024;
case PIPE_CAP_BLEND_EQUATION_SEPARATE:
case PIPE_CAP_INDEP_BLEND_ENABLE:
case PIPE_CAP_INDEP_BLEND_FUNC:


@@ -190,7 +190,7 @@ nvc0_gmtyprog_validate(struct nvc0_context *nvc0)
/* we allow GPs with no code for specifying stream output state only */
if (gp && gp->code_size) {
const boolean gp_selects_layer = gp->hdr[13] & (1 << 9);
const boolean gp_selects_layer = !!(gp->hdr[13] & (1 << 9));
BEGIN_NVC0(push, NVC0_3D(MACRO_GP_SELECT), 1);
PUSH_DATA (push, 0x41);
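The !! added here matters because gallium's boolean is historically a narrow integer type, and assigning a flag that lives above bit 7 straight into one truncates it to zero. A hypothetical stand-alone demo, assuming a one-byte boolean:

#include <stdio.h>

typedef unsigned char boolean;	/* assumption: one-byte boolean */

int main(void)
{
	unsigned hdr13 = 1u << 9;		/* layer-select bit set */
	boolean wrong = hdr13 & (1 << 9);	/* 0x200 truncates to 0 */
	boolean right = !!(hdr13 & (1 << 9));	/* normalized to 1 */
	printf("wrong=%d right=%d\n", wrong, right);
	return 0;
}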


@@ -151,6 +151,8 @@ static int r300_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:


@@ -79,45 +79,49 @@ int eg_bytecode_cf_build(struct r600_bytecode *bc, struct r600_bytecode_cf *cf)
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD0_RW_GPR(cf->output.gpr) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ELEM_SIZE(cf->output.elem_size) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ARRAY_BASE(cf->output.array_base) |
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type);
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type) |
S_SQ_CF_ALLOC_EXPORT_WORD0_INDEX_GPR(cf->output.index_gpr);
bc->bytecode[id] =
S_SQ_CF_ALLOC_EXPORT_WORD1_BURST_COUNT(cf->output.burst_count - 1) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_X(cf->output.swizzle_x) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_Y(cf->output.swizzle_y) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_Z(cf->output.swizzle_z) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_W(cf->output.swizzle_w) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->output.barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_CF_INST(opcode);
if (bc->chip_class == EVERGREEN) /* no EOP on cayman */
bc->bytecode[id] |= S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->output.end_of_program);
bc->bytecode[id] |= S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->end_of_program);
id++;
} else if (cfop->flags & CF_STRM) {
/* MEM_STREAM instructions */
} else if (cfop->flags & CF_MEM) {
/* MEM_STREAM, MEM_RING instructions */
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD0_RW_GPR(cf->output.gpr) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ELEM_SIZE(cf->output.elem_size) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ARRAY_BASE(cf->output.array_base) |
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type);
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type) |
S_SQ_CF_ALLOC_EXPORT_WORD0_INDEX_GPR(cf->output.index_gpr);
bc->bytecode[id] = S_SQ_CF_ALLOC_EXPORT_WORD1_BURST_COUNT(cf->output.burst_count - 1) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->output.barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_CF_INST(opcode) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BUF_COMP_MASK(cf->output.comp_mask) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BUF_ARRAY_SIZE(cf->output.array_size);
if (bc->chip_class == EVERGREEN) /* no EOP on cayman */
bc->bytecode[id] |= S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->output.end_of_program);
bc->bytecode[id] |= S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->end_of_program);
id++;
} else {
/* branch, loop, call, return instructions */
/* other instructions */
bc->bytecode[id++] = S_SQ_CF_WORD0_ADDR(cf->cf_addr >> 1);
bc->bytecode[id++] = S_SQ_CF_WORD1_CF_INST(opcode)|
S_SQ_CF_WORD1_BARRIER(1) |
S_SQ_CF_WORD1_COND(cf->cond) |
S_SQ_CF_WORD1_POP_COUNT(cf->pop_count);
S_SQ_CF_WORD1_POP_COUNT(cf->pop_count) |
S_SQ_CF_WORD1_END_OF_PROGRAM(cf->end_of_program);
}
}
return 0;
}
#if 0
void eg_bytecode_export_read(struct r600_bytecode *bc,
struct r600_bytecode_output *output, uint32_t word0, uint32_t word1)
{
@@ -138,3 +142,4 @@ void eg_bytecode_export_read(struct r600_bytecode *bc,
output->array_size = G_SQ_CF_ALLOC_EXPORT_WORD1_BUF_ARRAY_SIZE(word1);
output->comp_mask = G_SQ_CF_ALLOC_EXPORT_WORD1_BUF_COMP_MASK(word1);
}
#endif


@@ -1407,7 +1407,7 @@ void evergreen_init_color_surface(struct r600_context *rctx,
struct pipe_resource *pipe_tex = surf->base.texture;
unsigned level = surf->base.u.tex.level;
unsigned pitch, slice;
unsigned color_info, color_attrib, color_dim = 0;
unsigned color_info, color_attrib, color_dim = 0, color_view;
unsigned format, swap, ntype, endian;
uint64_t offset, base_offset;
unsigned non_disp_tiling, macro_aspect, tile_split, bankh, bankw, fmask_bankh, nbanks;
@@ -1416,10 +1416,15 @@ void evergreen_init_color_surface(struct r600_context *rctx,
bool blend_clamp = 0, blend_bypass = 0;
offset = rtex->surface.level[level].offset;
if (rtex->surface.level[level].mode < RADEON_SURF_MODE_1D) {
if (rtex->surface.level[level].mode == RADEON_SURF_MODE_LINEAR) {
assert(surf->base.u.tex.first_layer == surf->base.u.tex.last_layer);
offset += rtex->surface.level[level].slice_size *
surf->base.u.tex.first_layer;
}
color_view = 0;
} else
color_view = S_028C6C_SLICE_START(surf->base.u.tex.first_layer) |
S_028C6C_SLICE_MAX(surf->base.u.tex.last_layer);
pitch = (rtex->surface.level[level].nblk_x) / 8 - 1;
slice = (rtex->surface.level[level].nblk_x * rtex->surface.level[level].nblk_y) / 64;
if (slice) {
@@ -1569,12 +1574,7 @@ void evergreen_init_color_surface(struct r600_context *rctx,
surf->cb_color_info = color_info;
surf->cb_color_pitch = S_028C64_PITCH_TILE_MAX(pitch);
surf->cb_color_slice = S_028C68_SLICE_TILE_MAX(slice);
if (rtex->surface.level[level].mode < RADEON_SURF_MODE_1D) {
surf->cb_color_view = 0;
} else {
surf->cb_color_view = S_028C6C_SLICE_START(surf->base.u.tex.first_layer) |
S_028C6C_SLICE_MAX(surf->base.u.tex.last_layer);
}
surf->cb_color_view = color_view;
surf->cb_color_attrib = color_attrib;
if (rtex->fmask.size) {
surf->cb_color_fmask = (base_offset + rtex->fmask.offset) >> 8;
@@ -1829,7 +1829,6 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
rctx->db_misc_state.atom.dirty = true;
}
evergreen_update_db_shader_control(rctx);
/* Calculate the CS size. */
rctx->framebuffer.atom.num_dw = 4; /* SCISSOR */
@@ -2519,6 +2518,7 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
struct r600_resource *rbuffer;
uint64_t va;
unsigned buffer_index = ffs(dirty_mask) - 1;
unsigned gs_ring_buffer = (buffer_index == R600_GS_RING_CONST_BUFFER);
cb = &state->cb[buffer_index];
rbuffer = (struct r600_resource*)cb->buffer;
@@ -2527,10 +2527,12 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
va = r600_resource_va(&rctx->screen->b.b, &rbuffer->b.b);
va += cb->buffer_offset;
r600_write_context_reg_flag(cs, reg_alu_constbuf_size + buffer_index * 4,
ALIGN_DIVUP(cb->buffer_size >> 4, 16), pkt_flags);
r600_write_context_reg_flag(cs, reg_alu_const_cache + buffer_index * 4, va >> 8,
pkt_flags);
if (!gs_ring_buffer) {
r600_write_context_reg_flag(cs, reg_alu_constbuf_size + buffer_index * 4,
ALIGN_DIVUP(cb->buffer_size >> 4, 16), pkt_flags);
r600_write_context_reg_flag(cs, reg_alu_const_cache + buffer_index * 4, va >> 8,
pkt_flags);
}
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0) | pkt_flags);
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READ));
@@ -2540,10 +2542,12 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
radeon_emit(cs, va); /* RESOURCEi_WORD0 */
radeon_emit(cs, rbuffer->buf->size - cb->buffer_offset - 1); /* RESOURCEi_WORD1 */
radeon_emit(cs, /* RESOURCEi_WORD2 */
S_030008_ENDIAN_SWAP(r600_endian_swap(32)) |
S_030008_STRIDE(16) |
S_030008_BASE_ADDRESS_HI(va >> 32UL));
S_030008_ENDIAN_SWAP(gs_ring_buffer ? ENDIAN_NONE : r600_endian_swap(32)) |
S_030008_STRIDE(gs_ring_buffer ? 4 : 16) |
S_030008_BASE_ADDRESS_HI(va >> 32UL) |
S_030008_DATA_FORMAT(FMT_32_32_32_32_FLOAT));
radeon_emit(cs, /* RESOURCEi_WORD3 */
S_03000C_UNCACHED(gs_ring_buffer ? 1 : 0) |
S_03000C_DST_SEL_X(V_03000C_SQ_SEL_X) |
S_03000C_DST_SEL_Y(V_03000C_SQ_SEL_Y) |
S_03000C_DST_SEL_Z(V_03000C_SQ_SEL_Z) |
@@ -2551,7 +2555,8 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
radeon_emit(cs, 0); /* RESOURCEi_WORD4 */
radeon_emit(cs, 0); /* RESOURCEi_WORD5 */
radeon_emit(cs, 0); /* RESOURCEi_WORD6 */
radeon_emit(cs, 0xc0000000); /* RESOURCEi_WORD7 */
radeon_emit(cs, /* RESOURCEi_WORD7 */
S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER));
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0) | pkt_flags);
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READ));
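ALIGN_DIVUP here is a round-up integer division (assuming the usual (n + d - 1) / d definition), so a constant buffer whose vec4 count is not a multiple of 16 still gets a large enough size programmed:

#include <assert.h>

#define ALIGN_DIVUP(n, d) (((n) + (d) - 1) / (d))	/* assumed definition */

int main(void)
{
	/* a 272-byte buffer is 17 vec4 slots (272 >> 4); round up to
	 * 2 blocks of 16 instead of truncating to 1 */
	assert(ALIGN_DIVUP(272 >> 4, 16) == 2);
	return 0;
}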
@@ -2715,6 +2720,77 @@ static void evergreen_emit_vertex_fetch_shader(struct r600_context *rctx, struct
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, shader->buffer, RADEON_USAGE_READ));
}
static void evergreen_emit_shader_stages(struct r600_context *rctx, struct r600_atom *a)
{
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_shader_stages_state *state = (struct r600_shader_stages_state*)a;
uint32_t v = 0, v2 = 0, primid = 0;
if (state->geom_enable) {
uint32_t cut_val;
if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 128)
cut_val = V_028A40_GS_CUT_128;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 256)
cut_val = V_028A40_GS_CUT_256;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 512)
cut_val = V_028A40_GS_CUT_512;
else
cut_val = V_028A40_GS_CUT_1024;
v = S_028B54_ES_EN(V_028B54_ES_STAGE_REAL) |
S_028B54_GS_EN(1) |
S_028B54_VS_EN(V_028B54_VS_STAGE_COPY_SHADER);
v2 = S_028A40_MODE(V_028A40_GS_SCENARIO_G) |
S_028A40_CUT_MODE(cut_val);
if (rctx->gs_shader->current->shader.gs_prim_id_input)
primid = 1;
}
r600_write_context_reg(cs, R_028B54_VGT_SHADER_STAGES_EN, v);
r600_write_context_reg(cs, R_028A40_VGT_GS_MODE, v2);
r600_write_context_reg(cs, R_028A84_VGT_PRIMITIVEID_EN, primid);
}
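The CUT_MODE ladder picks the smallest cut value that still covers gs_max_out_vertices; note the register values run backwards (GS_CUT_1024 is 0, GS_CUT_128 is 3, per the header additions further down). The same ladder as a hypothetical stand-alone helper:

#include <assert.h>

/* return values mirror the V_028A40_GS_CUT_* defines */
static unsigned pick_cut_mode(unsigned max_out_vertices)
{
	if (max_out_vertices <= 128) return 3;	/* V_028A40_GS_CUT_128  */
	if (max_out_vertices <= 256) return 2;	/* V_028A40_GS_CUT_256  */
	if (max_out_vertices <= 512) return 1;	/* V_028A40_GS_CUT_512  */
	return 0;				/* V_028A40_GS_CUT_1024 */
}

int main(void)
{
	assert(pick_cut_mode(129) == 2);	/* just over 128 -> CUT_256 */
	return 0;
}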
static void evergreen_emit_gs_rings(struct r600_context *rctx, struct r600_atom *a)
{
struct pipe_screen *screen = rctx->b.b.screen;
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_gs_rings_state *state = (struct r600_gs_rings_state*)a;
struct r600_resource *rbuffer;
r600_write_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_VGT_FLUSH));
if (state->enable) {
rbuffer = (struct r600_resource*)state->esgs_ring.buffer;
r600_write_config_reg(cs, R_008C40_SQ_ESGS_RING_BASE,
(r600_resource_va(screen, &rbuffer->b.b)) >> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READWRITE));
r600_write_config_reg(cs, R_008C44_SQ_ESGS_RING_SIZE,
state->esgs_ring.buffer_size >> 8);
rbuffer = (struct r600_resource*)state->gsvs_ring.buffer;
r600_write_config_reg(cs, R_008C48_SQ_GSVS_RING_BASE,
(r600_resource_va(screen, &rbuffer->b.b)) >> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READWRITE));
r600_write_config_reg(cs, R_008C4C_SQ_GSVS_RING_SIZE,
state->gsvs_ring.buffer_size >> 8);
} else {
r600_write_config_reg(cs, R_008C44_SQ_ESGS_RING_SIZE, 0);
r600_write_config_reg(cs, R_008C4C_SQ_GSVS_RING_SIZE, 0);
}
r600_write_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_VGT_FLUSH));
}
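Note the symmetry in evergreen_emit_gs_rings(): the ESGS/GSVS base and size writes are bracketed by WAIT_3D_IDLE plus a VGT_FLUSH event on both sides, so in-flight geometry drains before the rings move and no stale VGT state survives afterwards. The shape of the sequence, with hypothetical stubs standing in for the register writes above:

static void wait_3d_idle(void)  { /* R_008040_WAIT_UNTIL write */ }
static void vgt_flush(void)     { /* EVENT_WRITE(VGT_FLUSH)    */ }
static void program_rings(void) { /* ESGS/GSVS base + size     */ }

static void reprogram_gs_rings(void)
{
	wait_3d_idle();
	vgt_flush();		/* drain work using the old rings */
	program_rings();
	wait_3d_idle();
	vgt_flush();		/* nothing stale past this point */
}

int main(void) { reprogram_gs_rings(); return 0; }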
void cayman_init_common_regs(struct r600_command_buffer *cb,
enum chip_class ctx_chip_class,
enum radeon_family ctx_family,
@@ -2905,6 +2981,7 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx)
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0, 0x01000FFF);
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0 + (32 * 4), 0x01000FFF);
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0 + (64 * 4), 0x01000FFF);
}
void evergreen_init_common_regs(struct r600_command_buffer *cb,
@@ -3363,6 +3440,7 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx)
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0, 0x01000FFF);
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0 + (32 * 4), 0x01000FFF);
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0 + (64 * 4), 0x01000FFF);
}
void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
@@ -3510,6 +3588,102 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader
shader->flatshade = rctx->rasterizer->flatshade;
}
void evergreen_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_command_buffer *cb = &shader->command_buffer;
struct r600_shader *rshader = &shader->shader;
r600_init_command_buffer(cb, 32);
r600_store_context_reg(cb, R_028890_SQ_PGM_RESOURCES_ES,
S_028890_NUM_GPRS(rshader->bc.ngpr) |
S_028890_STACK_SIZE(rshader->bc.nstack));
r600_store_context_reg(cb, R_02888C_SQ_PGM_START_ES,
r600_resource_va(ctx->screen, (void *)shader->bo) >> 8);
/* After that, the NOP relocation packet must be emitted (shader->bo, RADEON_USAGE_READ). */
}
static unsigned r600_conv_prim_to_gs_out(unsigned mode)
{
static const int prim_conv[] = {
V_028A6C_OUTPRIM_TYPE_POINTLIST,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP
};
assert(mode < Elements(prim_conv));
return prim_conv[mode];
}
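The table is indexed by the gallium PIPE_PRIM_* value, collapsing every input topology onto the three strip-style output types. A hedged sketch, assuming the standard gallium numbering where PIPE_PRIM_TRIANGLES == 4:

#include <assert.h>

enum { DEMO_PRIM_POINTS = 0, DEMO_PRIM_TRIANGLES = 4 };	/* assumed values */
enum { DEMO_OUT_POINTLIST, DEMO_OUT_LINESTRIP, DEMO_OUT_TRISTRIP };

static const int demo_prim_conv[] = {
	DEMO_OUT_POINTLIST,				/* POINTS */
	DEMO_OUT_LINESTRIP, DEMO_OUT_LINESTRIP, DEMO_OUT_LINESTRIP,
	DEMO_OUT_TRISTRIP				/* TRIANGLES */
};

int main(void)
{
	assert(demo_prim_conv[DEMO_PRIM_TRIANGLES] == DEMO_OUT_TRISTRIP);
	return 0;
}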
void evergreen_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_command_buffer *cb = &shader->command_buffer;
struct r600_shader *rshader = &shader->shader;
struct r600_shader *cp_shader = &shader->gs_copy_shader->shader;
unsigned gsvs_itemsize =
(cp_shader->ring_item_size * rshader->gs_max_out_vertices) >> 2;
r600_init_command_buffer(cb, 64);
/* VGT_GS_MODE is written by evergreen_emit_shader_stages */
r600_store_context_reg(cb, R_028AB8_VGT_VTX_CNT_EN, 1);
r600_store_context_reg(cb, R_028B38_VGT_GS_MAX_VERT_OUT,
S_028B38_MAX_VERT_OUT(rshader->gs_max_out_vertices));
r600_store_context_reg(cb, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(rshader->gs_output_prim));
if (rctx->screen->b.info.drm_minor >= 35) {
r600_store_context_reg(cb, R_028B90_VGT_GS_INSTANCE_CNT,
S_028B90_CNT(0) |
S_028B90_ENABLE(0));
}
r600_store_context_reg_seq(cb, R_02891C_SQ_GS_VERT_ITEMSIZE, 4);
r600_store_value(cb, cp_shader->ring_item_size >> 2);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_context_reg(cb, R_028900_SQ_ESGS_RING_ITEMSIZE,
(rshader->ring_item_size) >> 2);
r600_store_context_reg(cb, R_028904_SQ_GSVS_RING_ITEMSIZE,
gsvs_itemsize);
r600_store_context_reg_seq(cb, R_02892C_SQ_GSVS_RING_OFFSET_1, 3);
r600_store_value(cb, gsvs_itemsize);
r600_store_value(cb, gsvs_itemsize);
r600_store_value(cb, gsvs_itemsize);
/* FIXME calculate these values somehow ??? */
r600_store_context_reg_seq(cb, R_028A54_GS_PER_ES, 3);
r600_store_value(cb, 0x80); /* GS_PER_ES */
r600_store_value(cb, 0x100); /* ES_PER_GS */
r600_store_value(cb, 0x2); /* GS_PER_VS */
r600_store_context_reg(cb, R_028878_SQ_PGM_RESOURCES_GS,
S_028878_NUM_GPRS(rshader->bc.ngpr) |
S_028878_STACK_SIZE(rshader->bc.nstack));
r600_store_context_reg(cb, R_028874_SQ_PGM_START_GS,
r600_resource_va(ctx->screen, (void *)shader->bo) >> 8);
/* After that, the NOP relocation packet must be emitted (shader->bo, RADEON_USAGE_READ). */
}
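ring_item_size is in bytes while the *_ITEMSIZE registers take dwords, hence the >> 2; the GSVS item size additionally scales by gs_max_out_vertices because one ring slot holds a whole invocation's worth of emitted vertices. A small worked check under those assumptions:

#include <assert.h>

int main(void)
{
	unsigned ring_item_size      = 64;	/* bytes per vertex (example) */
	unsigned gs_max_out_vertices = 4;
	unsigned gsvs_itemsize = (ring_item_size * gs_max_out_vertices) >> 2;
	assert(gsvs_itemsize == 64);		/* 256 bytes == 64 dwords */
	return 0;
}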
void evergreen_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_command_buffer *cb = &shader->command_buffer;
@@ -3552,7 +3726,8 @@ void evergreen_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader
S_02881C_VS_OUT_CCDIST0_VEC_ENA((rshader->clip_dist_write & 0x0F) != 0) |
S_02881C_VS_OUT_CCDIST1_VEC_ENA((rshader->clip_dist_write & 0xF0) != 0) |
S_02881C_VS_OUT_MISC_VEC_ENA(rshader->vs_out_misc_write) |
S_02881C_USE_VTX_POINT_SIZE(rshader->vs_out_point_size);
S_02881C_USE_VTX_POINT_SIZE(rshader->vs_out_point_size) |
S_02881C_USE_VTX_RENDER_TARGET_INDX(rshader->vs_out_layer);
}
void *evergreen_create_resolve_blend(struct r600_context *rctx)
@@ -3919,6 +4094,10 @@ void evergreen_init_state_functions(struct r600_context *rctx)
rctx->atoms[id++] = &rctx->b.streamout.begin_atom;
r600_init_atom(rctx, &rctx->vertex_shader.atom, id++, r600_emit_shader, 23);
r600_init_atom(rctx, &rctx->pixel_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->geometry_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->export_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->shader_stages.atom, id++, evergreen_emit_shader_stages, 6);
r600_init_atom(rctx, &rctx->gs_rings.atom, id++, evergreen_emit_gs_rings, 26);
rctx->b.b.create_blend_state = evergreen_create_blend_state;
rctx->b.b.create_depth_stencil_alpha_state = evergreen_create_dsa_state;


@@ -48,6 +48,7 @@
#define EVENT_TYPE_ZPASS_DONE 0x15
#define EVENT_TYPE_CACHE_FLUSH_AND_INV_EVENT 0x16
#define EVENT_TYPE_SO_VGTSTREAMOUT_FLUSH 0x1f
#define EVENT_TYPE_VGT_FLUSH 0x24
#define EVENT_TYPE_FLUSH_AND_INV_DB_META 0x2c
#define EVENT_TYPE(x) ((x) << 0)
@@ -274,6 +275,11 @@
#define G_008E2C_NUM_LS_LDS(x) (((x) >> 16) & 0xFFFF)
#define C_008E2C_NUM_LS_LDS(x) 0xFFFF0000
#define R_008C40_SQ_ESGS_RING_BASE 0x00008C40
#define R_008C44_SQ_ESGS_RING_SIZE 0x00008C44
#define R_008C48_SQ_GSVS_RING_BASE 0x00008C48
#define R_008C4C_SQ_GSVS_RING_SIZE 0x00008C4C
#define R_008CF0_SQ_MS_FIFO_SIZES 0x00008CF0
#define S_008CF0_CACHE_FIFO_SIZE(x) (((x) & 0xFF) << 0)
#define G_008CF0_CACHE_FIFO_SIZE(x) (((x) >> 0) & 0xFF)
@@ -821,12 +827,22 @@
#define S_028A40_MODE(x) (((x) & 0x3) << 0)
#define G_028A40_MODE(x) (((x) >> 0) & 0x3)
#define C_028A40_MODE 0xFFFFFFFC
#define V_028A40_GS_OFF 0
#define V_028A40_GS_SCENARIO_A 1
#define V_028A40_GS_SCENARIO_B 2
#define V_028A40_GS_SCENARIO_G 3
#define V_028A40_GS_SCENARIO_C 4
#define V_028A40_SPRITE_EN 5
#define S_028A40_ES_PASSTHRU(x) (((x) & 0x1) << 2)
#define G_028A40_ES_PASSTHRU(x) (((x) >> 2) & 0x1)
#define C_028A40_ES_PASSTHRU 0xFFFFFFFB
#define S_028A40_CUT_MODE(x) (((x) & 0x3) << 3)
#define G_028A40_CUT_MODE(x) (((x) >> 3) & 0x3)
#define C_028A40_CUT_MODE 0xFFFFFFE7
#define V_028A40_GS_CUT_1024 0
#define V_028A40_GS_CUT_512 1
#define V_028A40_GS_CUT_256 2
#define V_028A40_GS_CUT_128 3
#define S_028A40_COMPUTE_MODE(x) (x << 14)
#define S_028A40_PARTIAL_THD_AT_EOI(x) (x << 17)
#define R_028A6C_VGT_GS_OUT_PRIM_TYPE 0x028A6C
@@ -1201,6 +1217,7 @@
#define C_030008_ENDIAN_SWAP 0x3FFFFFFF
#define R_03000C_SQ_VTX_CONSTANT_WORD3_0 0x03000C
#define S_03000C_UNCACHED(x) (((x) & 0x1) << 2)
#define S_03000C_DST_SEL_X(x) (((x) & 0x7) << 3)
#define G_03000C_DST_SEL_X(x) (((x) >> 3) & 0x7)
#define V_03000C_SQ_SEL_X 0x00000000
@@ -1457,6 +1474,34 @@
#define G_028860_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028860_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_028878_SQ_PGM_RESOURCES_GS 0x028878
#define S_028878_NUM_GPRS(x) (((x) & 0xFF) << 0)
#define G_028878_NUM_GPRS(x) (((x) >> 0) & 0xFF)
#define C_028878_NUM_GPRS 0xFFFFFF00
#define S_028878_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_028878_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_028878_STACK_SIZE 0xFFFF00FF
#define S_028878_DX10_CLAMP(x) (((x) & 0x1) << 21)
#define G_028878_DX10_CLAMP(x) (((x) >> 21) & 0x1)
#define C_028878_DX10_CLAMP 0xFFDFFFFF
#define S_028878_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28)
#define G_028878_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028878_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_028890_SQ_PGM_RESOURCES_ES 0x028890
#define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0)
#define G_028890_NUM_GPRS(x) (((x) >> 0) & 0xFF)
#define C_028890_NUM_GPRS 0xFFFFFF00
#define S_028890_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_028890_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_028890_STACK_SIZE 0xFFFF00FF
#define S_028890_DX10_CLAMP(x) (((x) & 0x1) << 21)
#define G_028890_DX10_CLAMP(x) (((x) >> 21) & 0x1)
#define C_028890_DX10_CLAMP 0xFFDFFFFF
#define S_028890_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28)
#define G_028890_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028890_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_028864_SQ_PGM_RESOURCES_2_VS 0x028864
#define S_028864_SINGLE_ROUND(x) (((x) & 0x3) << 0)
#define G_028864_SINGLE_ROUND(x) (((x) >> 0) & 0x3)
@@ -1880,6 +1925,8 @@
#define G_02884C_EXPORT_Z(x) (((x) >> 0) & 0x1)
#define C_02884C_EXPORT_Z 0xFFFFFFFE
#define R_02885C_SQ_PGM_START_VS 0x0002885C
#define R_028874_SQ_PGM_START_GS 0x00028874
#define R_02888C_SQ_PGM_START_ES 0x0002888C
#define R_0288A4_SQ_PGM_START_FS 0x000288A4
#define R_0288D0_SQ_PGM_START_LS 0x000288d0
#define R_0288A8_SQ_PGM_RESOURCES_FS 0x000288A8
@@ -1894,6 +1941,9 @@
#define R_028920_SQ_GS_VERT_ITEMSIZE_1 0x00028920
#define R_028924_SQ_GS_VERT_ITEMSIZE_2 0x00028924
#define R_028928_SQ_GS_VERT_ITEMSIZE_3 0x00028928
#define R_02892C_SQ_GSVS_RING_OFFSET_1 0x0002892C
#define R_028930_SQ_GSVS_RING_OFFSET_2 0x00028930
#define R_028934_SQ_GSVS_RING_OFFSET_3 0x00028934
#define R_028940_ALU_CONST_CACHE_PS_0 0x00028940
#define R_028944_ALU_CONST_CACHE_PS_1 0x00028944
#define R_028980_ALU_CONST_CACHE_VS_0 0x00028980
@@ -1928,6 +1978,15 @@
#define S_028A48_VPORT_SCISSOR_ENABLE(x) (((x) & 0x1) << 1)
#define S_028A48_LINE_STIPPLE_ENABLE(x) (((x) & 0x1) << 2)
#define R_028A4C_PA_SC_MODE_CNTL_1 0x00028A4C
#define R_028A54_GS_PER_ES 0x00028A54
#define R_028A58_ES_PER_GS 0x00028A58
#define R_028A5C_GS_PER_VS 0x00028A5C
#define R_028A84_VGT_PRIMITIVEID_EN 0x028A84
#define S_028A84_PRIMITIVEID_EN(x) (((x) & 0x1) << 0)
#define G_028A84_PRIMITIVEID_EN(x) (((x) >> 0) & 0x1)
#define C_028A84_PRIMITIVEID_EN 0xFFFFFFFE
#define R_028A94_VGT_MULTI_PRIM_IB_RESET_EN 0x00028A94
#define S_028A94_RESET_EN(x) (((x) & 0x1) << 0)
#define G_028A94_RESET_EN(x) (((x) >> 0) & 0x1)
@@ -1962,11 +2021,27 @@
#define R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET 0x028B28
#define R_028B2C_VGT_STRMOUT_DRAW_OPAQUE_BUFFER_FILLED_SIZE 0x028B2C
#define R_028B30_VGT_STRMOUT_DRAW_OPAQUE_VERTEX_STRIDE 0x028B30
#define R_028B38_VGT_GS_MAX_VERT_OUT 0x028B38
#define S_028B38_MAX_VERT_OUT(x) (((x) & 0x7FF) << 0)
#define R_028B44_VGT_STRMOUT_BASE_OFFSET_HI_0 0x028B44
#define R_028B48_VGT_STRMOUT_BASE_OFFSET_HI_1 0x028B48
#define R_028B4C_VGT_STRMOUT_BASE_OFFSET_HI_2 0x028B4C
#define R_028B50_VGT_STRMOUT_BASE_OFFSET_HI_3 0x028B50
#define R_028B54_VGT_SHADER_STAGES_EN 0x00028B54
#define S_028B54_LS_EN(x) (((x) & 0x3) << 0)
#define V_028B54_LS_STAGE_OFF 0x00
#define V_028B54_LS_STAGE_ON 0x01
#define V_028B54_CS_STAGE_ON 0x02
#define S_028B54_HS_EN(x) (((x) & 0x1) << 2)
#define S_028B54_ES_EN(x) (((x) & 0x3) << 3)
#define V_028B54_ES_STAGE_OFF 0x00
#define V_028B54_ES_STAGE_DS 0x01
#define V_028B54_ES_STAGE_REAL 0x02
#define S_028B54_GS_EN(x) (((x) & 0x1) << 5)
#define S_028B54_VS_EN(x) (((x) & 0x3) << 6)
#define V_028B54_VS_STAGE_REAL 0x00
#define V_028B54_VS_STAGE_DS 0x01
#define V_028B54_VS_STAGE_COPY_SHADER 0x02
#define R_028B70_DB_ALPHA_TO_MASK 0x00028B70
#define S_028B70_ALPHA_TO_MASK_ENABLE(x) (((x) & 0x1) << 0)
#define S_028B70_ALPHA_TO_MASK_OFFSET0(x) (((x) & 0x3) << 8)
@@ -1998,12 +2073,9 @@
#define S_028B8C_OFFSET(x) (((x) & 0xFFFFFFFF) << 0)
#define G_028B8C_OFFSET(x) (((x) >> 0) & 0xFFFFFFFF)
#define C_028B8C_OFFSET 0x00000000
#define R_028B94_VGT_STRMOUT_CONFIG 0x028B94
#define S_028B94_STREAMOUT_0_EN(x) (((x) & 0x1) << 0)
#define S_028B94_STREAMOUT_1_EN(x) (((x) & 0x1) << 1)
#define S_028B94_STREAMOUT_2_EN(x) (((x) & 0x1) << 2)
#define S_028B94_STREAMOUT_3_EN(x) (((x) & 0x1) << 3)
#define S_028B94_RAST_STREAM(x) (((x) & 0x07) << 4)
#define R_028B90_VGT_GS_INSTANCE_CNT 0x00028B90
#define S_028B90_ENABLE(x) (((x) & 0x1) << 0)
#define S_028B90_CNT(x) (((x) & 0x7F) << 2)
#define R_028B98_VGT_STRMOUT_BUFFER_CONFIG 0x028B98
#define S_028B98_STREAM_0_BUFFER_EN(x) (((x) & 0x0F) << 0)
#define S_028B98_STREAM_1_BUFFER_EN(x) (((x) & 0x0F) << 4)
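These register headers follow a fixed naming scheme: R_* is the register offset, S_*(x) shifts a field value into place, G_*(x) extracts it back, C_* is the clear mask for the field, and V_* lists the legal field values. A minimal sketch of the convention with an illustrative field (not a real r600 register):

#include <assert.h>

#define S_DEMO_CUT_MODE(x)	(((x) & 0x3) << 3)	/* set   */
#define G_DEMO_CUT_MODE(x)	(((x) >> 3) & 0x3)	/* get   */
#define C_DEMO_CUT_MODE		0xFFFFFFE7		/* clear */

int main(void)
{
	unsigned reg = 0xFFFFFFFFu;
	reg = (reg & C_DEMO_CUT_MODE) | S_DEMO_CUT_MODE(2);
	assert(G_DEMO_CUT_MODE(reg) == 2);
	return 0;
}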


@@ -193,7 +193,6 @@ int r600_bytecode_add_output(struct r600_bytecode *bc,
if ((output->gpr + output->burst_count) == bc->cf_last->output.gpr &&
(output->array_base + output->burst_count) == bc->cf_last->output.array_base) {
bc->cf_last->output.end_of_program |= output->end_of_program;
bc->cf_last->op = bc->cf_last->output.op = output->op;
bc->cf_last->output.gpr = output->gpr;
bc->cf_last->output.array_base = output->array_base;
@@ -203,7 +202,6 @@ int r600_bytecode_add_output(struct r600_bytecode *bc,
} else if (output->gpr == (bc->cf_last->output.gpr + bc->cf_last->output.burst_count) &&
output->array_base == (bc->cf_last->output.array_base + bc->cf_last->output.burst_count)) {
bc->cf_last->output.end_of_program |= output->end_of_program;
bc->cf_last->op = bc->cf_last->output.op = output->op;
bc->cf_last->output.burst_count += output->burst_count;
return 0;
@@ -215,6 +213,7 @@ int r600_bytecode_add_output(struct r600_bytecode *bc,
return r;
bc->cf_last->op = output->op;
memcpy(&bc->cf_last->output, output, sizeof(struct r600_bytecode_output));
bc->cf_last->barrier = 1;
return 0;
}
@@ -1526,24 +1525,26 @@ static int r600_bytecode_cf_build(struct r600_bytecode *bc, struct r600_bytecode
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD0_RW_GPR(cf->output.gpr) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ELEM_SIZE(cf->output.elem_size) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ARRAY_BASE(cf->output.array_base) |
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type);
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type) |
S_SQ_CF_ALLOC_EXPORT_WORD0_INDEX_GPR(cf->output.index_gpr);
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD1_BURST_COUNT(cf->output.burst_count - 1) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_X(cf->output.swizzle_x) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_Y(cf->output.swizzle_y) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_Z(cf->output.swizzle_z) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_W(cf->output.swizzle_w) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->output.barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_CF_INST(opcode) |
S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->output.end_of_program);
} else if (cfop->flags & CF_STRM) {
S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->end_of_program);
} else if (cfop->flags & CF_MEM) {
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD0_RW_GPR(cf->output.gpr) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ELEM_SIZE(cf->output.elem_size) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ARRAY_BASE(cf->output.array_base) |
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type);
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type) |
S_SQ_CF_ALLOC_EXPORT_WORD0_INDEX_GPR(cf->output.index_gpr);
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD1_BURST_COUNT(cf->output.burst_count - 1) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->output.barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_CF_INST(opcode) |
S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->output.end_of_program) |
S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->end_of_program) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BUF_ARRAY_SIZE(cf->output.array_size) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BUF_COMP_MASK(cf->output.comp_mask);
} else {
@@ -1551,7 +1552,8 @@ static int r600_bytecode_cf_build(struct r600_bytecode *bc, struct r600_bytecode
bc->bytecode[id++] = S_SQ_CF_WORD1_CF_INST(opcode) |
S_SQ_CF_WORD1_BARRIER(1) |
S_SQ_CF_WORD1_COND(cf->cond) |
S_SQ_CF_WORD1_POP_COUNT(cf->pop_count);
S_SQ_CF_WORD1_POP_COUNT(cf->pop_count) |
S_SQ_CF_WORD1_END_OF_PROGRAM(cf->end_of_program);
}
return 0;
}
@@ -1932,12 +1934,12 @@ void r600_bytecode_disasm(struct r600_bytecode *bc)
print_indent(o, 67);
fprintf(stderr, " ES:%X ", cf->output.elem_size);
if (!cf->output.barrier)
if (!cf->barrier)
fprintf(stderr, "NO_BARRIER ");
if (cf->output.end_of_program)
if (cf->end_of_program)
fprintf(stderr, "EOP ");
fprintf(stderr, "\n");
} else if (r600_isa_cf(cf->op)->flags & CF_STRM) {
} else if (r600_isa_cf(cf->op)->flags & CF_MEM) {
int o = 0;
const char *exp_type[] = {"WRITE", "WRITE_IND", "WRITE_ACK",
"WRITE_IND_ACK"};
@@ -1963,14 +1965,17 @@ void r600_bytecode_disasm(struct r600_bytecode *bc)
o += print_swizzle(7);
}
if (cf->output.type == V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_WRITE_IND)
o += fprintf(stderr, " R%d", cf->output.index_gpr);
o += print_indent(o, 67);
fprintf(stderr, " ES:%i ", cf->output.elem_size);
if (cf->output.array_size != 0xFFF)
fprintf(stderr, "AS:%i ", cf->output.array_size);
if (!cf->output.barrier)
if (!cf->barrier)
fprintf(stderr, "NO_BARRIER ");
if (cf->output.end_of_program)
if (cf->end_of_program)
fprintf(stderr, "EOP ");
fprintf(stderr, "\n");
} else {
@@ -2486,6 +2491,7 @@ void r600_bytecode_alu_read(struct r600_bytecode *bc,
}
}
#if 0
void r600_bytecode_export_read(struct r600_bytecode *bc,
struct r600_bytecode_output *output, uint32_t word0, uint32_t word1)
{
@@ -2506,3 +2512,4 @@ void r600_bytecode_export_read(struct r600_bytecode *bc,
output->array_size = G_SQ_CF_ALLOC_EXPORT_WORD1_BUF_ARRAY_SIZE(word1);
output->comp_mask = G_SQ_CF_ALLOC_EXPORT_WORD1_BUF_COMP_MASK(word1);
}
#endif


@@ -115,7 +115,6 @@ struct r600_bytecode_output {
unsigned array_size;
unsigned comp_mask;
unsigned type;
unsigned end_of_program;
unsigned op;
@@ -126,7 +125,7 @@ struct r600_bytecode_output {
unsigned swizzle_z;
unsigned swizzle_w;
unsigned burst_count;
unsigned barrier;
unsigned index_gpr;
};
struct r600_bytecode_kcache {
@@ -148,6 +147,8 @@ struct r600_bytecode_cf {
struct r600_bytecode_kcache kcache[4];
unsigned r6xx_uses_waterfall;
unsigned eg_alu_extended;
unsigned barrier;
unsigned end_of_program;
struct list_head alu;
struct list_head tex;
struct list_head vtx;


@@ -59,6 +59,7 @@ static void r600_blitter_begin(struct pipe_context *ctx, enum r600_blitter_op op
util_blitter_save_vertex_buffer_slot(rctx->blitter, rctx->vertex_buffer_state.vb);
util_blitter_save_vertex_elements(rctx->blitter, rctx->vertex_fetch_shader.cso);
util_blitter_save_vertex_shader(rctx->blitter, rctx->vs_shader);
util_blitter_save_geometry_shader(rctx->blitter, rctx->gs_shader);
util_blitter_save_so_targets(rctx->blitter, rctx->b.streamout.num_targets,
(struct pipe_stream_output_target**)rctx->b.streamout.targets);
util_blitter_save_rasterizer(rctx->blitter, rctx->rasterizer_state.cso);


@@ -301,6 +301,12 @@ void r600_begin_new_cs(struct r600_context *ctx)
ctx->config_state.atom.dirty = true;
ctx->stencil_ref.atom.dirty = true;
ctx->vertex_fetch_shader.atom.dirty = true;
ctx->export_shader.atom.dirty = true;
if (ctx->gs_shader) {
ctx->geometry_shader.atom.dirty = true;
ctx->shader_stages.atom.dirty = true;
ctx->gs_rings.atom.dirty = true;
}
ctx->vertex_shader.atom.dirty = true;
ctx->viewport.atom.dirty = true;
@@ -346,7 +352,7 @@ void r600_begin_new_cs(struct r600_context *ctx)
ctx->last_primitive_type = -1;
ctx->last_start_instance = -1;
ctx->initial_gfx_cs_size = ctx->b.rings.gfx.cs->cdw;
ctx->b.initial_gfx_cs_size = ctx->b.rings.gfx.cs->cdw;
}
/* The max number of bytes to copy per packet. */


@@ -73,7 +73,7 @@ static void r600_flush(struct pipe_context *ctx, unsigned flags)
unsigned render_cond_mode = 0;
boolean render_cond_cond = FALSE;
if (rctx->b.rings.gfx.cs->cdw == rctx->initial_gfx_cs_size)
if (rctx->b.rings.gfx.cs->cdw == rctx->b.initial_gfx_cs_size)
return;
rctx->b.rings.gfx.flushing = true;
@@ -94,7 +94,7 @@ static void r600_flush(struct pipe_context *ctx, unsigned flags)
ctx->render_condition(ctx, render_cond, render_cond_cond, render_cond_mode);
}
rctx->initial_gfx_cs_size = rctx->b.rings.gfx.cs->cdw;
rctx->b.initial_gfx_cs_size = rctx->b.rings.gfx.cs->cdw;
}
static void r600_flush_from_st(struct pipe_context *ctx,
@@ -372,6 +372,11 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
return 1;
case PIPE_CAP_GLSL_FEATURE_LEVEL:
if (family >= CHIP_CEDAR)
return 330;
/* pre-evergreen geom shaders need newer kernel */
if (rscreen->b.info.drm_minor >= 37)
return 330;
return 140;
/* Supported except the original R600. */
@@ -383,6 +388,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
/* Supported on Evergreen. */
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
case PIPE_CAP_CUBE_MAP_ARRAY:
case PIPE_CAP_TGSI_VS_LAYER:
return family >= CHIP_CEDAR ? 1 : 0;
/* Unsupported features. */
@@ -392,7 +398,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
case PIPE_CAP_USER_VERTEX_BUFFERS:
case PIPE_CAP_TGSI_VS_LAYER:
return 0;
/* Stream output. */
@@ -404,6 +409,12 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 32*4;
/* Geometry shader output. */
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
return 1024;
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 16384;
/* Texturing. */
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:
@@ -416,7 +427,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
return rscreen->b.info.drm_minor >= 9 ?
(family >= CHIP_CEDAR ? 16384 : 8192) : 0;
case PIPE_CAP_MAX_COMBINED_SAMPLERS:
return 32;
return 48;
/* Render targets. */
case PIPE_CAP_MAX_RENDER_TARGETS:
@@ -449,14 +460,20 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
static int r600_get_shader_param(struct pipe_screen* pscreen, unsigned shader, enum pipe_shader_cap param)
{
struct r600_screen *rscreen = (struct r600_screen *)pscreen;
switch(shader)
{
case PIPE_SHADER_FRAGMENT:
case PIPE_SHADER_VERTEX:
case PIPE_SHADER_COMPUTE:
case PIPE_SHADER_COMPUTE:
break;
case PIPE_SHADER_GEOMETRY:
/* XXX: support and enable geometry programs */
if (rscreen->b.family >= CHIP_CEDAR)
break;
/* pre-evergreen geom shaders need newer kernel */
if (rscreen->b.info.drm_minor >= 37)
break;
return 0;
default:
/* XXX: support tessellation on Evergreen */
@@ -568,8 +585,8 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
rscreen->b.debug_flags |= DBG_COMPUTE;
if (debug_get_bool_option("R600_DUMP_SHADERS", FALSE))
rscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS | DBG_PS | DBG_CS;
if (!debug_get_bool_option("R600_HYPERZ", TRUE))
rscreen->b.debug_flags |= DBG_NO_HYPERZ;
if (debug_get_bool_option("R600_HYPERZ", FALSE))
rscreen->b.debug_flags |= DBG_HYPERZ;
if (!debug_get_bool_option("R600_LLVM", TRUE))
rscreen->b.debug_flags |= DBG_NO_LLVM;


@@ -38,7 +38,7 @@
#include "util/u_double_list.h"
#include "util/u_transfer.h"
#define R600_NUM_ATOMS 41
#define R600_NUM_ATOMS 42
/* the number of CS dwords for flushing and drawing */
#define R600_MAX_FLUSH_CS_DWORDS 16
@@ -46,13 +46,14 @@
#define R600_TRACE_CS_DWORDS 7
#define R600_MAX_USER_CONST_BUFFERS 13
#define R600_MAX_DRIVER_CONST_BUFFERS 3
#define R600_MAX_DRIVER_CONST_BUFFERS 4
#define R600_MAX_CONST_BUFFERS (R600_MAX_USER_CONST_BUFFERS + R600_MAX_DRIVER_CONST_BUFFERS)
/* start driver buffers after user buffers */
#define R600_UCP_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS)
#define R600_TXQ_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 1)
#define R600_BUFFER_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 2)
#define R600_GS_RING_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 3)
#define R600_MAX_CONST_BUFFER_SIZE 4096
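With R600_MAX_USER_CONST_BUFFERS at 13, the driver-owned slots stack directly on top of the user range: UCP at 13, TXQ at 14, buffer info at 15 and the new GS ring at 16, for 17 slots total. A compile-time check of that layout (demo names, not the real header):

#define DEMO_MAX_USER_CONST_BUFFERS	13
#define DEMO_MAX_DRIVER_CONST_BUFFERS	4
#define DEMO_GS_RING_CONST_BUFFER	(DEMO_MAX_USER_CONST_BUFFERS + 3)

/* negative array size if the layout assumption is ever violated */
typedef char demo_assert_gs_slot[(DEMO_GS_RING_CONST_BUFFER == 16) ? 1 : -1];
typedef char demo_assert_total[(DEMO_MAX_USER_CONST_BUFFERS +
				DEMO_MAX_DRIVER_CONST_BUFFERS == 17) ? 1 : -1];

int main(void) { return 0; }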
@@ -159,6 +160,7 @@ struct r600_sample_mask {
struct r600_config_state {
struct r600_atom atom;
unsigned sq_gpr_resource_mgmt_1;
unsigned sq_gpr_resource_mgmt_2;
};
struct r600_stencil_ref
@@ -179,6 +181,18 @@ struct r600_viewport_state {
struct pipe_viewport_state state;
};
struct r600_shader_stages_state {
struct r600_atom atom;
unsigned geom_enable;
};
struct r600_gs_rings_state {
struct r600_atom atom;
unsigned enable;
struct pipe_constant_buffer esgs_ring;
struct pipe_constant_buffer gsvs_ring;
};
/* This must start from 16. */
/* features */
#define DBG_NO_LLVM (1 << 17)
@@ -353,7 +367,7 @@ struct r600_fetch_shader {
struct r600_shader_state {
struct r600_atom atom;
struct r600_pipe_shader_selector *shader;
struct r600_pipe_shader *shader;
};
struct r600_context {
@@ -361,7 +375,6 @@ struct r600_context {
struct r600_screen *screen;
struct blitter_context *blitter;
struct u_suballocator *allocator_fetch_shader;
unsigned initial_gfx_cs_size;
/* Hardware info. */
boolean has_vertex_cache;
@@ -415,7 +428,11 @@ struct r600_context {
struct r600_cso_state vertex_fetch_shader;
struct r600_shader_state vertex_shader;
struct r600_shader_state pixel_shader;
struct r600_shader_state geometry_shader;
struct r600_shader_state export_shader;
struct r600_cs_shader_state cs_shader_state;
struct r600_shader_stages_state shader_stages;
struct r600_gs_rings_state gs_rings;
struct r600_constbuf_state constbuf_state[PIPE_SHADER_TYPES];
struct r600_textures_info samplers[PIPE_SHADER_TYPES];
/** Vertex buffers for fetch shaders */
@@ -427,6 +444,7 @@ struct r600_context {
unsigned compute_cb_target_mask;
struct r600_pipe_shader_selector *ps_shader;
struct r600_pipe_shader_selector *vs_shader;
struct r600_pipe_shader_selector *gs_shader;
struct r600_rasterizer_state *rasterizer;
bool alpha_to_one;
bool force_blend_disable;
@@ -506,6 +524,8 @@ void cayman_init_common_regs(struct r600_command_buffer *cb,
void evergreen_init_state_functions(struct r600_context *rctx);
void evergreen_init_atom_start_cs(struct r600_context *rctx);
void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void evergreen_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void evergreen_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void evergreen_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void *evergreen_create_db_flush_dsa(struct r600_context *rctx);
void *evergreen_create_resolve_blend(struct r600_context *rctx);
@@ -545,6 +565,8 @@ r600_create_sampler_view_custom(struct pipe_context *ctx,
void r600_init_state_functions(struct r600_context *rctx);
void r600_init_atom_start_cs(struct r600_context *rctx);
void r600_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void r600_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void r600_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void r600_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void *r600_create_db_flush_dsa(struct r600_context *rctx);
void *r600_create_resolve_blend(struct r600_context *rctx);

File diff suppressed because it is too large


@@ -37,6 +37,7 @@ struct r600_shader_io {
unsigned lds_pos; /* for evergreen */
unsigned back_color_input;
unsigned write_mask;
int ring_offset;
};
struct r600_shader {
@@ -61,12 +62,22 @@ struct r600_shader {
/* flag is set if the shader writes VS_OUT_MISC_VEC (e.g. for PSIZE) */
boolean vs_out_misc_write;
boolean vs_out_point_size;
boolean vs_out_layer;
boolean has_txq_cube_array_z_comp;
boolean uses_tex_buffers;
boolean gs_prim_id_input;
/* geometry shader properties */
unsigned gs_input_prim;
unsigned gs_output_prim;
unsigned gs_max_out_vertices;
/* size in bytes of a data item in the ring (single vertex data) */
unsigned ring_item_size;
unsigned indirect_files;
unsigned max_arrays;
unsigned num_arrays;
unsigned vs_as_es;
struct r600_shader_array * arrays;
};
@@ -74,6 +85,7 @@ struct r600_shader_key {
unsigned color_two_side:1;
unsigned alpha_to_one:1;
unsigned nr_cbufs:4;
unsigned vs_as_es:1;
};
struct r600_shader_array {
@@ -85,6 +97,8 @@ struct r600_shader_array {
struct r600_pipe_shader {
struct r600_pipe_shader_selector *selector;
struct r600_pipe_shader *next_variant;
/* for GS - corresponding copy shader (installed as VS) */
struct r600_pipe_shader *gs_copy_shader;
struct r600_shader shader;
struct r600_command_buffer command_buffer; /* register writes */
struct r600_resource *bo;


@@ -1264,6 +1264,7 @@ static void r600_init_color_surface(struct r600_context *rctx,
unsigned level = surf->base.u.tex.level;
unsigned pitch, slice;
unsigned color_info;
unsigned color_view;
unsigned format, swap, ntype, endian;
unsigned offset;
const struct util_format_description *desc;
@@ -1277,10 +1278,15 @@ static void r600_init_color_surface(struct r600_context *rctx,
}
offset = rtex->surface.level[level].offset;
if (rtex->surface.level[level].mode < RADEON_SURF_MODE_1D) {
if (rtex->surface.level[level].mode == RADEON_SURF_MODE_LINEAR) {
assert(surf->base.u.tex.first_layer == surf->base.u.tex.last_layer);
offset += rtex->surface.level[level].slice_size *
surf->base.u.tex.first_layer;
}
surf->base.u.tex.first_layer;
color_view = 0;
} else
color_view = S_028080_SLICE_START(surf->base.u.tex.first_layer) |
S_028080_SLICE_MAX(surf->base.u.tex.last_layer);
pitch = rtex->surface.level[level].nblk_x / 8 - 1;
slice = (rtex->surface.level[level].nblk_x * rtex->surface.level[level].nblk_y) / 64;
if (slice) {
@@ -1466,14 +1472,7 @@ static void r600_init_color_surface(struct r600_context *rctx,
}
surf->cb_color_info = color_info;
if (rtex->surface.level[level].mode < RADEON_SURF_MODE_1D) {
surf->cb_color_view = 0;
} else {
surf->cb_color_view = S_028080_SLICE_START(surf->base.u.tex.first_layer) |
S_028080_SLICE_MAX(surf->base.u.tex.last_layer);
}
surf->cb_color_view = color_view;
surf->color_initialized = true;
}
@@ -1667,8 +1666,6 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
rctx->alphatest_state.atom.dirty = true;
}
r600_update_db_shader_control(rctx);
/* Calculate the CS size. */
rctx->framebuffer.atom.num_dw =
10 /*COLOR_INFO*/ + 4 /*SCISSOR*/ + 3 /*SHADER_CONTROL*/ + 8 /*MSAA*/;
@@ -2067,6 +2064,7 @@ static void r600_emit_config_state(struct r600_context *rctx, struct r600_atom *
struct r600_config_state *a = (struct r600_config_state*)atom;
r600_write_config_reg(cs, R_008C04_SQ_GPR_RESOURCE_MGMT_1, a->sq_gpr_resource_mgmt_1);
r600_write_config_reg(cs, R_008C08_SQ_GPR_RESOURCE_MGMT_2, a->sq_gpr_resource_mgmt_2);
}
static void r600_emit_vertex_buffers(struct r600_context *rctx, struct r600_atom *atom)
@@ -2118,16 +2116,18 @@ static void r600_emit_constant_buffers(struct r600_context *rctx,
struct r600_resource *rbuffer;
unsigned offset;
unsigned buffer_index = ffs(dirty_mask) - 1;
unsigned gs_ring_buffer = (buffer_index == R600_GS_RING_CONST_BUFFER);
cb = &state->cb[buffer_index];
rbuffer = (struct r600_resource*)cb->buffer;
assert(rbuffer);
offset = cb->buffer_offset;
r600_write_context_reg(cs, reg_alu_constbuf_size + buffer_index * 4,
ALIGN_DIVUP(cb->buffer_size >> 4, 16));
r600_write_context_reg(cs, reg_alu_const_cache + buffer_index * 4, offset >> 8);
if (!gs_ring_buffer) {
r600_write_context_reg(cs, reg_alu_constbuf_size + buffer_index * 4,
ALIGN_DIVUP(cb->buffer_size >> 4, 16));
r600_write_context_reg(cs, reg_alu_const_cache + buffer_index * 4, offset >> 8);
}
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READ));
@@ -2137,8 +2137,8 @@ static void r600_emit_constant_buffers(struct r600_context *rctx,
radeon_emit(cs, offset); /* RESOURCEi_WORD0 */
radeon_emit(cs, rbuffer->buf->size - offset - 1); /* RESOURCEi_WORD1 */
radeon_emit(cs, /* RESOURCEi_WORD2 */
S_038008_ENDIAN_SWAP(r600_endian_swap(32)) |
S_038008_STRIDE(16));
S_038008_ENDIAN_SWAP(gs_ring_buffer ? ENDIAN_NONE : r600_endian_swap(32)) |
S_038008_STRIDE(gs_ring_buffer ? 4 : 16));
radeon_emit(cs, 0); /* RESOURCEi_WORD3 */
radeon_emit(cs, 0); /* RESOURCEi_WORD4 */
radeon_emit(cs, 0); /* RESOURCEi_WORD5 */
@@ -2323,34 +2323,124 @@ static void r600_emit_vertex_fetch_shader(struct r600_context *rctx, struct r600
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, shader->buffer, RADEON_USAGE_READ));
}
static void r600_emit_shader_stages(struct r600_context *rctx, struct r600_atom *a)
{
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_shader_stages_state *state = (struct r600_shader_stages_state*)a;
uint32_t v2 = 0, primid = 0;
if (state->geom_enable) {
uint32_t cut_val;
if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 128)
cut_val = V_028A40_GS_CUT_128;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 256)
cut_val = V_028A40_GS_CUT_256;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 512)
cut_val = V_028A40_GS_CUT_512;
else
cut_val = V_028A40_GS_CUT_1024;
v2 = S_028A40_MODE(V_028A40_GS_SCENARIO_G) |
S_028A40_CUT_MODE(cut_val);
if (rctx->gs_shader->current->shader.gs_prim_id_input)
primid = 1;
}
r600_write_context_reg(cs, R_028A40_VGT_GS_MODE, v2);
r600_write_context_reg(cs, R_028A84_VGT_PRIMITIVEID_EN, primid);
}
static void r600_emit_gs_rings(struct r600_context *rctx, struct r600_atom *a)
{
struct pipe_screen *screen = rctx->b.b.screen;
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_gs_rings_state *state = (struct r600_gs_rings_state*)a;
struct r600_resource *rbuffer;
r600_write_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_VGT_FLUSH));
if (state->enable) {
rbuffer = (struct r600_resource*)state->esgs_ring.buffer;
r600_write_config_reg(cs, R_008C40_SQ_ESGS_RING_BASE,
(r600_resource_va(screen, &rbuffer->b.b)) >> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READWRITE));
r600_write_config_reg(cs, R_008C44_SQ_ESGS_RING_SIZE,
state->esgs_ring.buffer_size >> 8);
rbuffer = (struct r600_resource*)state->gsvs_ring.buffer;
r600_write_config_reg(cs, R_008C48_SQ_GSVS_RING_BASE,
(r600_resource_va(screen, &rbuffer->b.b)) >> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READWRITE));
r600_write_config_reg(cs, R_008C4C_SQ_GSVS_RING_SIZE,
state->gsvs_ring.buffer_size >> 8);
} else {
r600_write_config_reg(cs, R_008C44_SQ_ESGS_RING_SIZE, 0);
r600_write_config_reg(cs, R_008C4C_SQ_GSVS_RING_SIZE, 0);
}
r600_write_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_VGT_FLUSH));
}
/* Adjust GPR allocation on R6xx/R7xx */
bool r600_adjust_gprs(struct r600_context *rctx)
{
unsigned num_ps_gprs = rctx->ps_shader->current->shader.bc.ngpr;
unsigned num_vs_gprs = rctx->vs_shader->current->shader.bc.ngpr;
unsigned num_vs_gprs, num_es_gprs, num_gs_gprs;
unsigned new_num_ps_gprs = num_ps_gprs;
unsigned new_num_vs_gprs = num_vs_gprs;
unsigned new_num_vs_gprs, new_num_es_gprs, new_num_gs_gprs;
unsigned cur_num_ps_gprs = G_008C04_NUM_PS_GPRS(rctx->config_state.sq_gpr_resource_mgmt_1);
unsigned cur_num_vs_gprs = G_008C04_NUM_VS_GPRS(rctx->config_state.sq_gpr_resource_mgmt_1);
unsigned cur_num_gs_gprs = G_008C08_NUM_GS_GPRS(rctx->config_state.sq_gpr_resource_mgmt_2);
unsigned cur_num_es_gprs = G_008C08_NUM_ES_GPRS(rctx->config_state.sq_gpr_resource_mgmt_2);
unsigned def_num_ps_gprs = rctx->default_ps_gprs;
unsigned def_num_vs_gprs = rctx->default_vs_gprs;
unsigned def_num_gs_gprs = 0;
unsigned def_num_es_gprs = 0;
unsigned def_num_clause_temp_gprs = rctx->r6xx_num_clause_temp_gprs;
/* hardware will reserve twice num_clause_temp_gprs */
unsigned max_gprs = def_num_ps_gprs + def_num_vs_gprs + def_num_clause_temp_gprs * 2;
unsigned tmp;
unsigned max_gprs = def_num_gs_gprs + def_num_es_gprs + def_num_ps_gprs + def_num_vs_gprs + def_num_clause_temp_gprs * 2;
unsigned tmp, tmp2;
if (rctx->gs_shader) {
num_es_gprs = rctx->vs_shader->current->shader.bc.ngpr;
num_gs_gprs = rctx->gs_shader->current->shader.bc.ngpr;
num_vs_gprs = rctx->gs_shader->current->gs_copy_shader->shader.bc.ngpr;
} else {
num_es_gprs = 0;
num_gs_gprs = 0;
num_vs_gprs = rctx->vs_shader->current->shader.bc.ngpr;
}
new_num_vs_gprs = num_vs_gprs;
new_num_es_gprs = num_es_gprs;
new_num_gs_gprs = num_gs_gprs;
/* the sum of all SQ_GPR_RESOURCE_MGMT*.NUM_*_GPRS must be <= max_gprs */
if (new_num_ps_gprs > cur_num_ps_gprs || new_num_vs_gprs > cur_num_vs_gprs) {
if (new_num_ps_gprs > cur_num_ps_gprs || new_num_vs_gprs > cur_num_vs_gprs ||
new_num_es_gprs > cur_num_es_gprs || new_num_gs_gprs > cur_num_gs_gprs) {
/* try to switch back to the defaults */
if (new_num_ps_gprs > def_num_ps_gprs || new_num_vs_gprs > def_num_vs_gprs) {
if (new_num_ps_gprs > def_num_ps_gprs || new_num_vs_gprs > def_num_vs_gprs ||
new_num_gs_gprs > def_num_gs_gprs || new_num_es_gprs > def_num_es_gprs) {
/* always privilege vs stage so that at worst we have the
* pixel stage producing wrong output (not the vertex
* stage) */
new_num_ps_gprs = max_gprs - (new_num_vs_gprs + def_num_clause_temp_gprs * 2);
new_num_ps_gprs = max_gprs - ((new_num_vs_gprs - new_num_es_gprs - new_num_gs_gprs) + def_num_clause_temp_gprs * 2);
new_num_vs_gprs = num_vs_gprs;
new_num_gs_gprs = num_gs_gprs;
new_num_es_gprs = num_es_gprs;
} else {
new_num_ps_gprs = def_num_ps_gprs;
new_num_vs_gprs = def_num_vs_gprs;
new_num_es_gprs = def_num_es_gprs;
new_num_gs_gprs = def_num_gs_gprs;
}
} else {
return true;
@@ -2362,10 +2452,11 @@ bool r600_adjust_gprs(struct r600_context *rctx)
* it will lockup. So in this case just discard the draw command
* and don't change the current gprs repartitions.
*/
if (num_ps_gprs > new_num_ps_gprs || num_vs_gprs > new_num_vs_gprs) {
R600_ERR("ps & vs shader require too many register (%d + %d) "
if (num_ps_gprs > new_num_ps_gprs || num_vs_gprs > new_num_vs_gprs ||
num_gs_gprs > new_num_gs_gprs || num_es_gprs > new_num_es_gprs) {
R600_ERR("shaders require too many register (%d + %d + %d + %d) "
"for a combined maximum of %d\n",
num_ps_gprs, num_vs_gprs, max_gprs);
num_ps_gprs, num_vs_gprs, num_es_gprs, num_gs_gprs, max_gprs);
return false;
}
@@ -2373,8 +2464,12 @@ bool r600_adjust_gprs(struct r600_context *rctx)
tmp = S_008C04_NUM_PS_GPRS(new_num_ps_gprs) |
S_008C04_NUM_VS_GPRS(new_num_vs_gprs) |
S_008C04_NUM_CLAUSE_TEMP_GPRS(def_num_clause_temp_gprs);
if (rctx->config_state.sq_gpr_resource_mgmt_1 != tmp) {
tmp2 = S_008C08_NUM_ES_GPRS(new_num_es_gprs) |
S_008C08_NUM_GS_GPRS(new_num_gs_gprs);
if (rctx->config_state.sq_gpr_resource_mgmt_1 != tmp || rctx->config_state.sq_gpr_resource_mgmt_2 != tmp2) {
rctx->config_state.sq_gpr_resource_mgmt_1 = tmp;
rctx->config_state.sq_gpr_resource_mgmt_2 = tmp2;
rctx->config_state.atom.dirty = true;
rctx->b.flags |= R600_CONTEXT_WAIT_3D_IDLE;
}
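Concretely: with the pre-GS defaults of 192 PS + 56 VS + 2*4 clause-temp GPRs, the pool is 256 GPRs, and the invariant being targeted by the final check is that the PS/VS/ES/GS allocations plus the doubled clause temps still fit in that pool, shrinking PS first so the vertex side keeps its registers. A worked check with illustrative numbers:

#include <assert.h>

int main(void)
{
	unsigned max_gprs = 192 + 56 + 0 + 0 + 4 * 2;	/* 256 */
	unsigned vs = 10, es = 12, gs = 16, temp2 = 4 * 2;
	unsigned ps = max_gprs - (vs + es + gs + temp2);	/* PS gets the rest */
	assert(ps + vs + es + gs + temp2 <= max_gprs);
	return 0;
}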
@@ -2492,19 +2587,19 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
num_es_stack_entries = 16;
break;
case CHIP_RV770:
num_ps_gprs = 192;
num_ps_gprs = 130;
num_vs_gprs = 56;
num_temp_gprs = 4;
num_gs_gprs = 0;
num_es_gprs = 0;
num_ps_threads = 188;
num_gs_gprs = 31;
num_es_gprs = 31;
num_ps_threads = 180;
num_vs_threads = 60;
num_gs_threads = 0;
num_es_threads = 0;
num_ps_stack_entries = 256;
num_vs_stack_entries = 256;
num_gs_stack_entries = 0;
num_es_stack_entries = 0;
num_gs_threads = 4;
num_es_threads = 4;
num_ps_stack_entries = 128;
num_vs_stack_entries = 128;
num_gs_stack_entries = 128;
num_es_stack_entries = 128;
break;
case CHIP_RV730:
case CHIP_RV740:
@@ -2513,10 +2608,10 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
num_temp_gprs = 4;
num_gs_gprs = 0;
num_es_gprs = 0;
num_ps_threads = 188;
num_ps_threads = 180;
num_vs_threads = 60;
num_gs_threads = 0;
num_es_threads = 0;
num_gs_threads = 4;
num_es_threads = 4;
num_ps_stack_entries = 128;
num_vs_stack_entries = 128;
num_gs_stack_entries = 0;
@@ -2528,10 +2623,10 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
num_temp_gprs = 4;
num_gs_gprs = 0;
num_es_gprs = 0;
num_ps_threads = 144;
num_ps_threads = 136;
num_vs_threads = 48;
num_gs_threads = 0;
num_es_threads = 0;
num_gs_threads = 4;
num_es_threads = 4;
num_ps_stack_entries = 128;
num_vs_stack_entries = 128;
num_gs_stack_entries = 0;
@@ -2707,9 +2802,12 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
r600_store_value(cb, 0); /* R_028240_PA_SC_GENERIC_SCISSOR_TL */
r600_store_value(cb, S_028244_BR_X(8192) | S_028244_BR_Y(8192)); /* R_028244_PA_SC_GENERIC_SCISSOR_BR */
r600_store_context_reg_seq(cb, R_0288CC_SQ_PGM_CF_OFFSET_PS, 2);
r600_store_context_reg_seq(cb, R_0288CC_SQ_PGM_CF_OFFSET_PS, 5);
r600_store_value(cb, 0); /* R_0288CC_SQ_PGM_CF_OFFSET_PS */
r600_store_value(cb, 0); /* R_0288D0_SQ_PGM_CF_OFFSET_VS */
r600_store_value(cb, 0); /* R_0288D4_SQ_PGM_CF_OFFSET_GS */
r600_store_value(cb, 0); /* R_0288D8_SQ_PGM_CF_OFFSET_ES */
r600_store_value(cb, 0); /* R_0288DC_SQ_PGM_CF_OFFSET_FS */
r600_store_context_reg(cb, R_0288E0_SQ_VTX_SEMANTIC_CLEAR, ~0);
@@ -2718,7 +2816,6 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
r600_store_value(cb, 0); /* R_028404_VGT_MIN_VTX_INDX */
r600_store_context_reg(cb, R_0288A4_SQ_PGM_RESOURCES_FS, 0);
r600_store_context_reg(cb, R_0288DC_SQ_PGM_CF_OFFSET_FS, 0);
if (rctx->b.chip_class == R700 && rctx->screen->b.has_streamout)
r600_store_context_reg(cb, R_028354_SX_SURFACE_SYNC, S_028354_SURFACE_SYNC_MASK(0xf));
@@ -2729,6 +2826,7 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
r600_store_loop_const(cb, R_03E200_SQ_LOOP_CONST_0, 0x1000FFF);
r600_store_loop_const(cb, R_03E200_SQ_LOOP_CONST_0 + (32 * 4), 0x1000FFF);
r600_store_loop_const(cb, R_03E200_SQ_LOOP_CONST_0 + (64 * 4), 0x1000FFF);
}
void r600_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
@@ -2901,6 +2999,94 @@ void r600_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
S_02881C_USE_VTX_POINT_SIZE(rshader->vs_out_point_size);
}
static unsigned r600_conv_prim_to_gs_out(unsigned mode)
{
static const int prim_conv[] = {
V_028A6C_OUTPRIM_TYPE_POINTLIST,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP
};
assert(mode < Elements(prim_conv));
return prim_conv[mode];
}
void r600_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_command_buffer *cb = &shader->command_buffer;
struct r600_shader *rshader = &shader->shader;
struct r600_shader *cp_shader = &shader->gs_copy_shader->shader;
unsigned gsvs_itemsize =
(cp_shader->ring_item_size * rshader->gs_max_out_vertices) >> 2;
r600_init_command_buffer(cb, 64);
/* VGT_GS_MODE is written by r600_emit_shader_stages */
r600_store_context_reg(cb, R_028AB8_VGT_VTX_CNT_EN, 1);
if (rctx->b.chip_class >= R700) {
r600_store_context_reg(cb, R_028B38_VGT_GS_MAX_VERT_OUT,
S_028B38_MAX_VERT_OUT(rshader->gs_max_out_vertices));
}
r600_store_context_reg(cb, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(rshader->gs_output_prim));
r600_store_context_reg_seq(cb, R_0288C8_SQ_GS_VERT_ITEMSIZE, 4);
r600_store_value(cb, cp_shader->ring_item_size >> 2);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_context_reg(cb, R_0288A8_SQ_ESGS_RING_ITEMSIZE,
(rshader->ring_item_size) >> 2);
r600_store_context_reg(cb, R_0288AC_SQ_GSVS_RING_ITEMSIZE,
gsvs_itemsize);
/* FIXME calculate these values somehow ??? */
r600_store_config_reg_seq(cb, R_0088C8_VGT_GS_PER_ES, 2);
r600_store_value(cb, 0x80); /* GS_PER_ES */
r600_store_value(cb, 0x100); /* ES_PER_GS */
r600_store_config_reg_seq(cb, R_0088E8_VGT_GS_PER_VS, 1);
r600_store_value(cb, 0x2); /* GS_PER_VS */
r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_GS,
S_02887C_NUM_GPRS(rshader->bc.ngpr) |
S_02887C_STACK_SIZE(rshader->bc.nstack));
r600_store_context_reg(cb, R_02886C_SQ_PGM_START_GS,
r600_resource_va(ctx->screen, (void *)shader->bo) >> 8);
/* After that, the NOP relocation packet must be emitted (shader->bo, RADEON_USAGE_READ). */
}
void r600_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_command_buffer *cb = &shader->command_buffer;
struct r600_shader *rshader = &shader->shader;
r600_init_command_buffer(cb, 32);
r600_store_context_reg(cb, R_028890_SQ_PGM_RESOURCES_ES,
S_028890_NUM_GPRS(rshader->bc.ngpr) |
S_028890_STACK_SIZE(rshader->bc.nstack));
r600_store_context_reg(cb, R_028880_SQ_PGM_START_ES,
r600_resource_va(ctx->screen, (void *)shader->bo) >> 8);
/* After that, the NOP relocation packet must be emitted (shader->bo, RADEON_USAGE_READ). */
}
void *r600_create_resolve_blend(struct r600_context *rctx)
{
struct pipe_blend_state blend;
@@ -3262,6 +3448,10 @@ void r600_init_state_functions(struct r600_context *rctx)
rctx->atoms[id++] = &rctx->b.streamout.begin_atom;
r600_init_atom(rctx, &rctx->vertex_shader.atom, id++, r600_emit_shader, 23);
r600_init_atom(rctx, &rctx->pixel_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->geometry_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->export_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->shader_stages.atom, id++, r600_emit_shader_stages, 0);
r600_init_atom(rctx, &rctx->gs_rings.atom, id++, r600_emit_gs_rings, 0);
rctx->b.b.create_blend_state = r600_create_blend_state;
rctx->b.b.create_depth_stencil_alpha_state = r600_create_dsa_state;

View File

@@ -301,11 +301,6 @@ static void r600_bind_dsa_state(struct pipe_context *ctx, void *state)
rctx->alphatest_state.sx_alpha_test_control = dsa->sx_alpha_test_control;
rctx->alphatest_state.sx_alpha_ref = dsa->alpha_ref;
rctx->alphatest_state.atom.dirty = true;
if (rctx->b.chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
}
}
@@ -698,6 +693,8 @@ static INLINE struct r600_shader_key r600_shader_selector_key(struct pipe_contex
/* Dual-source blending only makes sense with nr_cbufs == 1. */
if (key.nr_cbufs == 1 && rctx->dual_src_blend)
key.nr_cbufs = 2;
} else if (sel->type == PIPE_SHADER_VERTEX) {
key.vs_as_es = (rctx->gs_shader != NULL);
}
return key;
}
@@ -709,7 +706,6 @@ static int r600_shader_select(struct pipe_context *ctx,
bool *dirty)
{
struct r600_shader_key key;
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_pipe_shader * shader = NULL;
int r;
@@ -771,11 +767,6 @@ static int r600_shader_select(struct pipe_context *ctx,
shader->next_variant = sel->current;
sel->current = shader;
if (rctx->ps_shader &&
rctx->cb_misc_state.nr_ps_color_outputs != rctx->ps_shader->current->nr_ps_color_outputs) {
rctx->cb_misc_state.nr_ps_color_outputs = rctx->ps_shader->current->nr_ps_color_outputs;
rctx->cb_misc_state.atom.dirty = true;
}
return 0;
}
@@ -784,16 +775,10 @@ static void *r600_create_shader_state(struct pipe_context *ctx,
unsigned pipe_shader_type)
{
struct r600_pipe_shader_selector *sel = CALLOC_STRUCT(r600_pipe_shader_selector);
int r;
sel->type = pipe_shader_type;
sel->tokens = tgsi_dup_tokens(state->tokens);
sel->so = state->stream_output;
r = r600_shader_select(ctx, sel, NULL);
if (r)
return NULL;
return sel;
}
@@ -809,6 +794,12 @@ static void *r600_create_vs_state(struct pipe_context *ctx,
return r600_create_shader_state(ctx, state, PIPE_SHADER_VERTEX);
}
static void *r600_create_gs_state(struct pipe_context *ctx,
const struct pipe_shader_state *state)
{
return r600_create_shader_state(ctx, state, PIPE_SHADER_GEOMETRY);
}
static void r600_bind_ps_state(struct pipe_context *ctx, void *state)
{
struct r600_context *rctx = (struct r600_context *)ctx;
@@ -816,31 +807,7 @@ static void r600_bind_ps_state(struct pipe_context *ctx, void *state)
if (!state)
state = rctx->dummy_pixel_shader;
rctx->pixel_shader.shader = rctx->ps_shader = (struct r600_pipe_shader_selector *)state;
rctx->pixel_shader.atom.num_dw = rctx->ps_shader->current->command_buffer.num_dw;
rctx->pixel_shader.atom.dirty = true;
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->ps_shader->current->bo);
if (rctx->b.chip_class <= R700) {
bool multiwrite = rctx->ps_shader->current->shader.fs_write_all;
if (rctx->cb_misc_state.multiwrite != multiwrite) {
rctx->cb_misc_state.multiwrite = multiwrite;
rctx->cb_misc_state.atom.dirty = true;
}
}
if (rctx->cb_misc_state.nr_ps_color_outputs != rctx->ps_shader->current->nr_ps_color_outputs) {
rctx->cb_misc_state.nr_ps_color_outputs = rctx->ps_shader->current->nr_ps_color_outputs;
rctx->cb_misc_state.atom.dirty = true;
}
if (rctx->b.chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
rctx->ps_shader = (struct r600_pipe_shader_selector *)state;
}
static void r600_bind_vs_state(struct pipe_context *ctx, void *state)
@@ -850,19 +817,19 @@ static void r600_bind_vs_state(struct pipe_context *ctx, void *state)
if (!state)
return;
rctx->vertex_shader.shader = rctx->vs_shader = (struct r600_pipe_shader_selector *)state;
rctx->vertex_shader.atom.dirty = true;
rctx->vs_shader = (struct r600_pipe_shader_selector *)state;
rctx->b.streamout.stride_in_dw = rctx->vs_shader->so.stride;
}
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->vs_shader->current->bo);
static void r600_bind_gs_state(struct pipe_context *ctx, void *state)
{
struct r600_context *rctx = (struct r600_context *)ctx;
/* Update clip misc state. */
if (rctx->vs_shader->current->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->vs_shader->current->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
rctx->clip_misc_state.pa_cl_vs_out_cntl = rctx->vs_shader->current->pa_cl_vs_out_cntl;
rctx->clip_misc_state.clip_dist_write = rctx->vs_shader->current->shader.clip_dist_write;
rctx->clip_misc_state.atom.dirty = true;
}
rctx->gs_shader = (struct r600_pipe_shader_selector *)state;
if (!state)
return;
rctx->b.streamout.stride_in_dw = rctx->gs_shader->so.stride;
}
static void r600_delete_shader_selector(struct pipe_context *ctx,
@@ -905,6 +872,20 @@ static void r600_delete_vs_state(struct pipe_context *ctx, void *state)
r600_delete_shader_selector(ctx, sel);
}
static void r600_delete_gs_state(struct pipe_context *ctx, void *state)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_pipe_shader_selector *sel = (struct r600_pipe_shader_selector *)state;
if (rctx->gs_shader == sel) {
rctx->gs_shader = NULL;
}
r600_delete_shader_selector(ctx, sel);
}
void r600_constant_buffers_dirty(struct r600_context *rctx, struct r600_constbuf_state *state)
{
if (state->dirty_mask) {
@@ -1098,10 +1079,65 @@ static void r600_setup_txq_cube_array_constants(struct r600_context *rctx, int s
pipe_resource_reference(&cb.buffer, NULL);
}
static void update_shader_atom(struct pipe_context *ctx,
struct r600_shader_state *state,
struct r600_pipe_shader *shader)
{
state->shader = shader;
if (shader) {
state->atom.num_dw = shader->command_buffer.num_dw;
state->atom.dirty = true;
r600_context_add_resource_size(ctx, (struct pipe_resource *)shader->bo);
} else {
state->atom.num_dw = 0;
state->atom.dirty = false;
}
}
static void update_gs_block_state(struct r600_context *rctx, unsigned enable)
{
if (rctx->shader_stages.geom_enable != enable) {
rctx->shader_stages.geom_enable = enable;
rctx->shader_stages.atom.dirty = true;
}
if (rctx->gs_rings.enable != enable) {
rctx->gs_rings.enable = enable;
rctx->gs_rings.atom.dirty = true;
if (enable && !rctx->gs_rings.esgs_ring.buffer) {
unsigned size = 0x1C000;
rctx->gs_rings.esgs_ring.buffer =
pipe_buffer_create(rctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_STATIC, size);
rctx->gs_rings.esgs_ring.buffer_size = size;
size = 0x4000000;
rctx->gs_rings.gsvs_ring.buffer =
pipe_buffer_create(rctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_STATIC, size);
rctx->gs_rings.gsvs_ring.buffer_size = size;
}
if (enable) {
r600_set_constant_buffer(&rctx->b.b, PIPE_SHADER_GEOMETRY,
R600_GS_RING_CONST_BUFFER, &rctx->gs_rings.esgs_ring);
r600_set_constant_buffer(&rctx->b.b, PIPE_SHADER_VERTEX,
R600_GS_RING_CONST_BUFFER, &rctx->gs_rings.gsvs_ring);
} else {
r600_set_constant_buffer(&rctx->b.b, PIPE_SHADER_GEOMETRY,
R600_GS_RING_CONST_BUFFER, NULL);
r600_set_constant_buffer(&rctx->b.b, PIPE_SHADER_VERTEX,
R600_GS_RING_CONST_BUFFER, NULL);
}
}
}
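The hard-coded ring sizes are easy to misread in hex: 0x1C000 bytes for the ESGS ring and 0x4000000 for the GSVS ring, both then exposed to the VS/GS stages through R600_GS_RING_CONST_BUFFER. A sanity check of the conversions:
#include <assert.h>
int main(void)
{
    assert(0x1C000   == 112 * 1024);       /* ESGS ring: 112 KiB */
    assert(0x4000000 == 64 * 1024 * 1024); /* GSVS ring: 64 MiB  */
    return 0;
}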
static bool r600_update_derived_state(struct r600_context *rctx)
{
struct pipe_context * ctx = (struct pipe_context*)rctx;
bool ps_dirty = false;
bool ps_dirty = false, vs_dirty = false, gs_dirty = false;
bool blend_disable;
if (!rctx->blitter->running) {
@@ -1119,23 +1155,101 @@ static bool r600_update_derived_state(struct r600_context *rctx)
}
}
r600_shader_select(ctx, rctx->ps_shader, &ps_dirty);
update_gs_block_state(rctx, rctx->gs_shader != NULL);
if (rctx->ps_shader && rctx->rasterizer &&
((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
(rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade))) {
if (rctx->gs_shader) {
r600_shader_select(ctx, rctx->gs_shader, &gs_dirty);
if (unlikely(!rctx->gs_shader->current))
return false;
if (rctx->b.chip_class >= EVERGREEN)
evergreen_update_ps_state(ctx, rctx->ps_shader->current);
else
r600_update_ps_state(ctx, rctx->ps_shader->current);
if (!rctx->shader_stages.geom_enable) {
rctx->shader_stages.geom_enable = true;
rctx->shader_stages.atom.dirty = true;
}
ps_dirty = true;
/* gs_shader provides GS and VS (copy shader) */
if (unlikely(rctx->geometry_shader.shader != rctx->gs_shader->current)) {
update_shader_atom(ctx, &rctx->geometry_shader, rctx->gs_shader->current);
update_shader_atom(ctx, &rctx->vertex_shader, rctx->gs_shader->current->gs_copy_shader);
/* Update clip misc state. */
if (rctx->gs_shader->current->gs_copy_shader->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->gs_shader->current->gs_copy_shader->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
rctx->clip_misc_state.pa_cl_vs_out_cntl = rctx->gs_shader->current->gs_copy_shader->pa_cl_vs_out_cntl;
rctx->clip_misc_state.clip_dist_write = rctx->gs_shader->current->gs_copy_shader->shader.clip_dist_write;
rctx->clip_misc_state.atom.dirty = true;
}
}
r600_shader_select(ctx, rctx->vs_shader, &vs_dirty);
if (unlikely(!rctx->vs_shader->current))
return false;
/* vs_shader is used as ES */
if (unlikely(vs_dirty || rctx->export_shader.shader != rctx->vs_shader->current)) {
update_shader_atom(ctx, &rctx->export_shader, rctx->vs_shader->current);
}
} else {
if (unlikely(rctx->geometry_shader.shader)) {
update_shader_atom(ctx, &rctx->geometry_shader, NULL);
update_shader_atom(ctx, &rctx->export_shader, NULL);
rctx->shader_stages.geom_enable = false;
rctx->shader_stages.atom.dirty = true;
}
r600_shader_select(ctx, rctx->vs_shader, &vs_dirty);
if (unlikely(!rctx->vs_shader->current))
return false;
if (unlikely(vs_dirty || rctx->vertex_shader.shader != rctx->vs_shader->current)) {
update_shader_atom(ctx, &rctx->vertex_shader, rctx->vs_shader->current);
/* Update clip misc state. */
if (rctx->vs_shader->current->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->vs_shader->current->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
rctx->clip_misc_state.pa_cl_vs_out_cntl = rctx->vs_shader->current->pa_cl_vs_out_cntl;
rctx->clip_misc_state.clip_dist_write = rctx->vs_shader->current->shader.clip_dist_write;
rctx->clip_misc_state.atom.dirty = true;
}
}
}
if (ps_dirty) {
rctx->pixel_shader.atom.num_dw = rctx->ps_shader->current->command_buffer.num_dw;
rctx->pixel_shader.atom.dirty = true;
r600_shader_select(ctx, rctx->ps_shader, &ps_dirty);
if (unlikely(!rctx->ps_shader->current))
return false;
if (unlikely(ps_dirty || rctx->pixel_shader.shader != rctx->ps_shader->current)) {
if (rctx->cb_misc_state.nr_ps_color_outputs != rctx->ps_shader->current->nr_ps_color_outputs) {
rctx->cb_misc_state.nr_ps_color_outputs = rctx->ps_shader->current->nr_ps_color_outputs;
rctx->cb_misc_state.atom.dirty = true;
}
if (rctx->b.chip_class <= R700) {
bool multiwrite = rctx->ps_shader->current->shader.fs_write_all;
if (rctx->cb_misc_state.multiwrite != multiwrite) {
rctx->cb_misc_state.multiwrite = multiwrite;
rctx->cb_misc_state.atom.dirty = true;
}
}
if (rctx->b.chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
if (unlikely(!ps_dirty && rctx->ps_shader && rctx->rasterizer &&
((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
(rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade)))) {
if (rctx->b.chip_class >= EVERGREEN)
evergreen_update_ps_state(ctx, rctx->ps_shader->current);
else
r600_update_ps_state(ctx, rctx->ps_shader->current);
}
update_shader_atom(ctx, &rctx->pixel_shader, rctx->ps_shader->current);
}
/* on R600 we stuff masks + txq info into one constant buffer */
@@ -1145,11 +1259,15 @@ static bool r600_update_derived_state(struct r600_context *rctx)
r600_setup_buffer_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.uses_tex_buffers)
r600_setup_buffer_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.uses_tex_buffers)
r600_setup_buffer_constants(rctx, PIPE_SHADER_GEOMETRY);
} else {
if (rctx->ps_shader && rctx->ps_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_GEOMETRY);
}
@@ -1157,6 +1275,8 @@ static bool r600_update_derived_state(struct r600_context *rctx)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.has_txq_cube_array_z_comp)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.has_txq_cube_array_z_comp)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_GEOMETRY);
if (rctx->b.chip_class < EVERGREEN && rctx->ps_shader && rctx->vs_shader) {
if (!r600_adjust_gprs(rctx)) {
@@ -1174,33 +1294,10 @@ static bool r600_update_derived_state(struct r600_context *rctx)
rctx->blend_state.cso,
blend_disable);
}
return true;
}
static unsigned r600_conv_prim_to_gs_out(unsigned mode)
{
static const int prim_conv[] = {
V_028A6C_OUTPRIM_TYPE_POINTLIST,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP
};
assert(mode < Elements(prim_conv));
return prim_conv[mode];
}
void r600_emit_clip_misc_state(struct r600_context *rctx, struct r600_atom *atom)
{
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
@@ -1227,7 +1324,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
return;
}
if (!rctx->vs_shader) {
if (!rctx->vs_shader || !rctx->ps_shader) {
assert(0);
return;
}
@@ -1330,8 +1427,6 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
r600_write_context_reg(cs, R_028A0C_PA_SC_LINE_STIPPLE,
S_028A0C_AUTO_RESET_CNTL(ls_mask) |
(rctx->rasterizer ? rctx->rasterizer->pa_sc_line_stipple : 0));
r600_write_context_reg(cs, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(info.mode));
r600_write_config_reg(cs, R_008958_VGT_PRIMITIVE_TYPE,
r600_conv_pipe_prim(info.mode));
@@ -1615,11 +1710,14 @@ bool sampler_state_needs_border_color(const struct pipe_sampler_state *state)
void r600_emit_shader(struct r600_context *rctx, struct r600_atom *a)
{
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_pipe_shader *shader = ((struct r600_shader_state*)a)->shader->current;
struct r600_pipe_shader *shader = ((struct r600_shader_state*)a)->shader;
if (!shader)
return;
r600_emit_command_buffer(cs, &shader->command_buffer);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, shader->bo, RADEON_USAGE_READ));
}
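The NOP + relocation pair emitted here is two dwords: a type-3 packet header whose count of 0 means one payload dword follows, then the relocation slot the winsys patches to the shader BO's address. A self-contained sketch of the header encoding, assuming the usual radeon PKT3 layout (bits 31:30 type, 29:16 count, 15:8 opcode) and PKT3_NOP = 0x10:
#include <assert.h>
#include <stdint.h>
static uint32_t pkt3_header(unsigned opcode, unsigned count)
{
    return (3u << 30) | ((count & 0x3fffu) << 16) | ((opcode & 0xffu) << 8);
}
int main(void)
{
    assert(pkt3_header(0x10, 0) == 0xC0001000u); /* NOP, one payload dword */
    return 0;
}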
@@ -1633,7 +1731,6 @@ struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe,
assert(templ->u.tex.first_layer <= util_max_layer(texture, templ->u.tex.level));
assert(templ->u.tex.last_layer <= util_max_layer(texture, templ->u.tex.level));
assert(templ->u.tex.first_layer == templ->u.tex.last_layer);
if (surface == NULL)
return NULL;
pipe_reference_init(&surface->base.reference, 1);
@@ -2148,6 +2245,7 @@ void r600_init_common_state_functions(struct r600_context *rctx)
{
rctx->b.b.create_fs_state = r600_create_ps_state;
rctx->b.b.create_vs_state = r600_create_vs_state;
rctx->b.b.create_gs_state = r600_create_gs_state;
rctx->b.b.create_vertex_elements_state = r600_create_vertex_fetch_shader;
rctx->b.b.bind_blend_state = r600_bind_blend_state;
rctx->b.b.bind_depth_stencil_alpha_state = r600_bind_dsa_state;
@@ -2156,6 +2254,7 @@ void r600_init_common_state_functions(struct r600_context *rctx)
rctx->b.b.bind_rasterizer_state = r600_bind_rs_state;
rctx->b.b.bind_vertex_elements_state = r600_bind_vertex_elements;
rctx->b.b.bind_vs_state = r600_bind_vs_state;
rctx->b.b.bind_gs_state = r600_bind_gs_state;
rctx->b.b.delete_blend_state = r600_delete_blend_state;
rctx->b.b.delete_depth_stencil_alpha_state = r600_delete_dsa_state;
rctx->b.b.delete_fs_state = r600_delete_ps_state;
@@ -2163,6 +2262,7 @@ void r600_init_common_state_functions(struct r600_context *rctx)
rctx->b.b.delete_sampler_state = r600_delete_sampler_state;
rctx->b.b.delete_vertex_elements_state = r600_delete_vertex_elements;
rctx->b.b.delete_vs_state = r600_delete_vs_state;
rctx->b.b.delete_gs_state = r600_delete_gs_state;
rctx->b.b.set_blend_color = r600_set_blend_color;
rctx->b.b.set_clip_state = r600_set_clip_state;
rctx->b.b.set_constant_buffer = r600_set_constant_buffer;

View File

@@ -123,6 +123,7 @@
#define EVENT_TYPE_SO_VGTSTREAMOUT_FLUSH 0x1f
#define EVENT_TYPE_SAMPLE_STREAMOUTSTATS 0x20
#define EVENT_TYPE_FLUSH_AND_INV_DB_META 0x2c /* supported on r700+ */
#define EVENT_TYPE_VGT_FLUSH 0x24
#define EVENT_TYPE_FLUSH_AND_INV_CB_META 0x2E /* supported on r700+ */
#define EVENT_TYPE(x) ((x) << 0)
#define EVENT_INDEX(x) ((x) << 8)
@@ -200,6 +201,19 @@
/* Registers */
#define R_008490_CP_STRMOUT_CNTL 0x008490
#define S_008490_OFFSET_UPDATE_DONE(x) (((x) & 0x1) << 0)
#define R_008C40_SQ_ESGS_RING_BASE 0x008C40
#define R_008C44_SQ_ESGS_RING_SIZE 0x008C44
#define R_008C48_SQ_GSVS_RING_BASE 0x008C48
#define R_008C4C_SQ_GSVS_RING_SIZE 0x008C4C
#define R_008C50_SQ_ESTMP_RING_BASE 0x008C50
#define R_008C54_SQ_ESTMP_RING_SIZE 0x008C54
#define R_008C58_SQ_GSTMP_RING_BASE 0x008C58
#define R_008C5C_SQ_GSTMP_RING_SIZE 0x008C5C
#define R_0088C8_VGT_GS_PER_ES 0x0088C8
#define R_0088CC_VGT_ES_PER_GS 0x0088CC
#define R_0088E8_VGT_GS_PER_VS 0x0088E8
#define R_008960_VGT_STRMOUT_BUFFER_FILLED_SIZE_0 0x008960 /* read-only */
#define R_008964_VGT_STRMOUT_BUFFER_FILLED_SIZE_1 0x008964 /* read-only */
#define R_008968_VGT_STRMOUT_BUFFER_FILLED_SIZE_2 0x008968 /* read-only */
@@ -1824,12 +1838,20 @@
#define S_028A40_MODE(x) (((x) & 0x3) << 0)
#define G_028A40_MODE(x) (((x) >> 0) & 0x3)
#define C_028A40_MODE 0xFFFFFFFC
#define V_028A40_GS_OFF 0
#define V_028A40_GS_SCENARIO_A 1
#define V_028A40_GS_SCENARIO_B 2
#define V_028A40_GS_SCENARIO_G 3
#define S_028A40_ES_PASSTHRU(x) (((x) & 0x1) << 2)
#define G_028A40_ES_PASSTHRU(x) (((x) >> 2) & 0x1)
#define C_028A40_ES_PASSTHRU 0xFFFFFFFB
#define S_028A40_CUT_MODE(x) (((x) & 0x3) << 3)
#define G_028A40_CUT_MODE(x) (((x) >> 3) & 0x3)
#define C_028A40_CUT_MODE 0xFFFFFFE7
#define V_028A40_GS_CUT_1024 0
#define V_028A40_GS_CUT_512 1
#define V_028A40_GS_CUT_256 2
#define V_028A40_GS_CUT_128 3
#define R_008DFC_SQ_CF_WORD0 0x008DFC
#define S_008DFC_ADDR(x) (((x) & 0xFFFFFFFF) << 0)
#define G_008DFC_ADDR(x) (((x) >> 0) & 0xFFFFFFFF)
@@ -2332,6 +2354,26 @@
#define S_028D44_ALPHA_TO_MASK_OFFSET3(x) (((x) & 0x3) << 14)
#define S_028D44_OFFSET_ROUND(x) (((x) & 0x1) << 16)
#define R_028868_SQ_PGM_RESOURCES_VS 0x028868
#define R_028890_SQ_PGM_RESOURCES_ES 0x028890
#define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0)
#define G_028890_NUM_GPRS(x) (((x) >> 0) & 0xFF)
#define C_028890_NUM_GPRS 0xFFFFFF00
#define S_028890_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_028890_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_028890_STACK_SIZE 0xFFFF00FF
#define S_028890_DX10_CLAMP(x) (((x) & 0x1) << 21)
#define G_028890_DX10_CLAMP(x) (((x) >> 21) & 0x1)
#define C_028890_DX10_CLAMP 0xFFDFFFFF
#define R_02887C_SQ_PGM_RESOURCES_GS 0x02887C
#define S_02887C_NUM_GPRS(x) (((x) & 0xFF) << 0)
#define G_02887C_NUM_GPRS(x) (((x) >> 0) & 0xFF)
#define C_02887C_NUM_GPRS 0xFFFFFF00
#define S_02887C_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_02887C_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_02887C_STACK_SIZE 0xFFFF00FF
#define S_02887C_DX10_CLAMP(x) (((x) & 0x1) << 21)
#define G_02887C_DX10_CLAMP(x) (((x) >> 21) & 0x1)
#define C_02887C_DX10_CLAMP 0xFFDFFFFF
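These follow the file's S_ (shift a value into the field), G_ (extract it), and C_ (AND-mask to clear it) convention. A short in-context sketch using the SQ_PGM_RESOURCES_GS macros just defined:
uint32_t rsrc = S_02887C_NUM_GPRS(64) | S_02887C_STACK_SIZE(8);
assert(G_02887C_NUM_GPRS(rsrc) == 64);
rsrc = (rsrc & C_02887C_STACK_SIZE) | S_02887C_STACK_SIZE(16); /* rewrite one field */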
#define R_0286CC_SPI_PS_IN_CONTROL_0 0x0286CC
#define R_0286D0_SPI_PS_IN_CONTROL_1 0x0286D0
#define R_028644_SPI_PS_INPUT_CNTL_0 0x028644
@@ -2421,11 +2463,15 @@
#define G_028C04_MAX_SAMPLE_DIST(x) (((x) >> 13) & 0xF)
#define C_028C04_MAX_SAMPLE_DIST 0xFFFE1FFF
#define R_0288CC_SQ_PGM_CF_OFFSET_PS 0x0288CC
#define R_0288DC_SQ_PGM_CF_OFFSET_FS 0x0288DC
#define R_0288D0_SQ_PGM_CF_OFFSET_VS 0x0288D0
#define R_0288D4_SQ_PGM_CF_OFFSET_GS 0x0288D4
#define R_0288D8_SQ_PGM_CF_OFFSET_ES 0x0288D8
#define R_0288DC_SQ_PGM_CF_OFFSET_FS 0x0288DC
#define R_028840_SQ_PGM_START_PS 0x028840
#define R_028894_SQ_PGM_START_FS 0x028894
#define R_028858_SQ_PGM_START_VS 0x028858
#define R_02886C_SQ_PGM_START_GS 0x02886C
#define R_028880_SQ_PGM_START_ES 0x028880
#define R_028080_CB_COLOR0_VIEW 0x028080
#define S_028080_SLICE_START(x) (((x) & 0x7FF) << 0)
#define G_028080_SLICE_START(x) (((x) >> 0) & 0x7FF)
@@ -2863,6 +2909,7 @@
#define R_0283F4_SQ_VTX_SEMANTIC_29 0x0283F4
#define R_0283F8_SQ_VTX_SEMANTIC_30 0x0283F8
#define R_0283FC_SQ_VTX_SEMANTIC_31 0x0283FC
#define R_0288C8_SQ_GS_VERT_ITEMSIZE 0x0288C8
#define R_0288E0_SQ_VTX_SEMANTIC_CLEAR 0x0288E0
#define R_028400_VGT_MAX_VTX_INDX 0x028400
#define S_028400_MAX_INDX(x) (((x) & 0xFFFFFFFF) << 0)
@@ -3287,6 +3334,8 @@
#define R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET 0x028B28
#define R_028B2C_VGT_STRMOUT_DRAW_OPAQUE_BUFFER_FILLED_SIZE 0x028B2C
#define R_028B30_VGT_STRMOUT_DRAW_OPAQUE_VERTEX_STRIDE 0x028B30
#define R_028B38_VGT_GS_MAX_VERT_OUT 0x028B38 /* r7xx */
#define S_028B38_MAX_VERT_OUT(x) (((x) & 0x7FF) << 0)
#define R_028B44_VGT_STRMOUT_BASE_OFFSET_HI_0 0x028B44
#define R_028B48_VGT_STRMOUT_BASE_OFFSET_HI_1 0x028B48
#define R_028B4C_VGT_STRMOUT_BASE_OFFSET_HI_2 0x028B4C

View File

@@ -169,8 +169,10 @@ enum shader_target
{
TARGET_UNKNOWN,
TARGET_VS,
TARGET_ES,
TARGET_PS,
TARGET_GS,
TARGET_GS_COPY,
TARGET_COMPUTE,
TARGET_FETCH,

View File

@@ -137,7 +137,7 @@ void bc_dump::dump(cf_node& n) {
for (int k = 0; k < 4; ++k)
s << chans[n.bc.sel[k]];
} else if (n.bc.op_ptr->flags & (CF_STRM | CF_RAT)) {
} else if (n.bc.op_ptr->flags & CF_MEM) {
static const char *exp_type[] = {"WRITE", "WRITE_IND", "WRITE_ACK",
"WRITE_IND_ACK"};
fill_to(s, 18);
@@ -150,6 +150,9 @@ void bc_dump::dump(cf_node& n) {
if ((n.bc.op_ptr->flags & CF_RAT) && (n.bc.type & 1)) {
s << ", @R" << n.bc.index_gpr << ".xyz";
}
if ((n.bc.op_ptr->flags & CF_MEM) && (n.bc.type & 1)) {
s << ", @R" << n.bc.index_gpr << ".x";
}
s << " ES:" << n.bc.elem_size;

View File

@@ -63,7 +63,7 @@ int bc_finalizer::run() {
// workaround for some problems on r6xx/7xx
// add an ALU NOP to each vertex/export (VS/ES) shader
if (!ctx.is_egcm() && sh.target == TARGET_VS) {
if (!ctx.is_egcm() && (sh.target == TARGET_VS || sh.target == TARGET_ES)) {
cf_node *c = sh.create_clause(NST_ALU_CLAUSE);
alu_group_node *g = sh.create_alu_group();
@@ -695,7 +695,7 @@ void bc_finalizer::finalize_cf(cf_node* c) {
c->bc.rw_gpr = reg >= 0 ? reg : 0;
c->bc.comp_mask = mask;
if ((flags & CF_RAT) && (c->bc.type & 1)) {
if (((flags & CF_RAT) || (!(flags & CF_STRM))) && (c->bc.type & 1)) {
reg = -1;

View File

@@ -58,7 +58,10 @@ int bc_parser::decode() {
if (pshader) {
switch (bc->type) {
case TGSI_PROCESSOR_FRAGMENT: t = TARGET_PS; break;
case TGSI_PROCESSOR_VERTEX: t = TARGET_VS; break;
case TGSI_PROCESSOR_VERTEX:
t = pshader->vs_as_es ? TARGET_ES : TARGET_VS;
break;
case TGSI_PROCESSOR_GEOMETRY: t = TARGET_GS; break;
case TGSI_PROCESSOR_COMPUTE: t = TARGET_COMPUTE; break;
default: assert(!"unknown shader target"); return -1; break;
}
@@ -134,8 +137,12 @@ int bc_parser::parse_decls() {
}
}
if (sh->target == TARGET_VS)
if (sh->target == TARGET_VS || sh->target == TARGET_ES)
sh->add_input(0, 1, 0x0F);
else if (sh->target == TARGET_GS) {
sh->add_input(0, 1, 0x0F);
sh->add_input(1, 1, 0x0F);
}
bool ps_interp = ctx.hw_class >= HW_CLASS_EVERGREEN
&& sh->target == TARGET_PS;
@@ -202,7 +209,7 @@ int bc_parser::decode_cf(unsigned &i, bool &eop) {
if (cf->bc.rw_rel)
gpr_reladdr = true;
assert(!cf->bc.rw_rel);
} else if (flags & (CF_STRM | CF_RAT)) {
} else if (flags & CF_MEM) {
if (cf->bc.rw_rel)
gpr_reladdr = true;
assert(!cf->bc.rw_rel);
@@ -676,7 +683,7 @@ int bc_parser::prepare_ir() {
} while (1);
c->bc.end_of_program = eop;
} else if (flags & (CF_STRM | CF_RAT)) {
} else if (flags & CF_MEM) {
unsigned burst_count = c->bc.burst_count;
unsigned eop = c->bc.end_of_program;
@@ -694,7 +701,7 @@ int bc_parser::prepare_ir() {
sh->get_gpr_value(true, c->bc.rw_gpr, s, false);
}
if ((flags & CF_RAT) && (c->bc.type & 1)) { // indexed write
if (((flags & CF_RAT) || (!(flags & CF_STRM))) && (c->bc.type & 1)) { // indexed write
c->src.resize(8);
for(int s = 0; s < 3; ++s) {
c->src[4 + s] =

View File

@@ -349,7 +349,7 @@ void dump::dump_op(node &n, const char *name) {
static const char *exp_type[] = {"PIXEL", "POS ", "PARAM"};
sblog << " " << exp_type[c->bc.type] << " " << c->bc.array_base;
has_dst = false;
} else if (c->bc.op_ptr->flags & CF_STRM) {
} else if (c->bc.op_ptr->flags & (CF_MEM)) {
static const char *exp_type[] = {"WRITE", "WRITE_IND", "WRITE_ACK",
"WRITE_IND_ACK"};
sblog << " " << exp_type[c->bc.type] << " " << c->bc.array_base

View File

@@ -215,7 +215,7 @@ void shader::init() {
void shader::init_call_fs(cf_node* cf) {
unsigned gpr = 0;
assert(target == TARGET_VS);
assert(target == TARGET_VS || target == TARGET_ES);
for(inputs_vec::const_iterator I = inputs.begin(),
E = inputs.end(); I != E; ++I, ++gpr) {
@@ -433,6 +433,7 @@ std::string shader::get_full_target_name() {
const char* shader::get_shader_target_name() {
switch (target) {
case TARGET_VS: return "VS";
case TARGET_ES: return "ES";
case TARGET_PS: return "PS";
case TARGET_GS: return "GS";
case TARGET_COMPUTE: return "COMPUTE";

View File

@@ -59,7 +59,7 @@ void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx,
rusage = RADEON_USAGE_WRITE;
}
if (ctx->rings.gfx.cs->cdw &&
if (ctx->rings.gfx.cs->cdw != ctx->initial_gfx_cs_size &&
ctx->ws->cs_is_buffer_referenced(ctx->rings.gfx.cs,
resource->cs_buf, rusage)) {
if (usage & PIPE_TRANSFER_DONTBLOCK) {

View File

@@ -137,7 +137,7 @@ static const struct debug_named_value common_debug_options[] = {
{ "ps", DBG_PS, "Print pixel shaders" },
{ "cs", DBG_CS, "Print compute shaders" },
{ "nohyperz", DBG_NO_HYPERZ, "Disable Hyper-Z" },
{ "hyperz", DBG_HYPERZ, "Enable Hyper-Z" },
/* GL uses the word INVALIDATE, gallium uses the word DISCARD */
{ "noinvalrange", DBG_NO_DISCARD_RANGE, "Disable handling of INVALIDATE_RANGE map flags" },

View File

@@ -83,7 +83,7 @@
#define DBG_PS (1 << 11)
#define DBG_CS (1 << 12)
/* features */
#define DBG_NO_HYPERZ (1 << 13)
#define DBG_HYPERZ (1 << 13)
#define DBG_NO_DISCARD_RANGE (1 << 14)
/* The maximum allowed bit is 15. */
@@ -241,6 +241,7 @@ struct r600_common_context {
enum radeon_family family;
enum chip_class chip_class;
struct r600_rings rings;
unsigned initial_gfx_cs_size;
struct u_upload_mgr *uploader;
struct u_suballocator *allocator_so_filled_size;

View File

@@ -596,7 +596,7 @@ r600_texture_create_object(struct pipe_screen *screen,
if (rtex->is_depth) {
if (!(base->flags & (R600_RESOURCE_FLAG_TRANSFER |
R600_RESOURCE_FLAG_FLUSHED_DEPTH)) &&
!(rscreen->debug_flags & DBG_NO_HYPERZ)) {
(rscreen->debug_flags & DBG_HYPERZ)) {
r600_texture_allocate_htile(rscreen, rtex);
}
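Taken together with the DBG_HYPERZ rename above, this flips Hyper-Z from opt-out to opt-in: HTILE is now only allocated for depth textures when the debug flag is set. Assuming the usual R600_DEBUG plumbing for common_debug_options, that means setting R600_DEBUG=hyperz in the environment to re-enable it.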

View File

@@ -58,6 +58,9 @@
#define NUM_H264_REFS 17
#define NUM_VC1_REFS 5
#define FB_BUFFER_OFFSET 0x1000
#define FB_BUFFER_SIZE 2048
/* UVD buffer representation */
struct ruvd_buffer
{
@@ -81,6 +84,7 @@ struct ruvd_decoder {
struct ruvd_buffer msg_fb_buffers[NUM_BUFFERS];
struct ruvd_msg *msg;
uint32_t *fb;
struct ruvd_buffer bs_buffers[NUM_BUFFERS];
void* bs_ptr;
@@ -131,16 +135,21 @@ static void send_cmd(struct ruvd_decoder *dec, unsigned cmd,
set_reg(dec, RUVD_GPCOM_VCPU_CMD, cmd << 1);
}
/* map the next available message buffer */
static void map_msg_buf(struct ruvd_decoder *dec)
/* map the next available message/feedback buffer */
static void map_msg_fb_buf(struct ruvd_decoder *dec)
{
struct ruvd_buffer* buf;
uint8_t *ptr;
/* grap the current message buffer */
/* grab the current message/feedback buffer */
buf = &dec->msg_fb_buffers[dec->cur_buffer];
/* copy the message into it */
dec->msg = dec->ws->buffer_map(buf->cs_handle, dec->cs, PIPE_TRANSFER_WRITE);
/* and map it for CPU access */
ptr = dec->ws->buffer_map(buf->cs_handle, dec->cs, PIPE_TRANSFER_WRITE);
/* calc buffer offsets */
dec->msg = (struct ruvd_msg *)ptr;
dec->fb = (uint32_t *)(ptr + FB_BUFFER_OFFSET);
}
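The message and feedback areas now share one buffer: the ruvd_msg struct sits at offset 0, and the 2048-byte feedback area starts at FB_BUFFER_OFFSET (0x1000), which the STATIC_ASSERT below guarantees the message cannot overrun. A quick check of the per-buffer size, assuming the message struct fits in one 4 KiB page (so the old layout was 8 KiB):
#include <assert.h>
#define FB_BUFFER_OFFSET 0x1000
#define FB_BUFFER_SIZE   2048
int main(void)
{
    unsigned msg_fb_size = FB_BUFFER_OFFSET + FB_BUFFER_SIZE;
    assert(msg_fb_size == 0x1800); /* 6 KiB per buffer, down from 8 KiB */
    return 0;
}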
/* unmap and send a message command to the VCPU */
@@ -148,8 +157,8 @@ static void send_msg_buf(struct ruvd_decoder *dec)
{
struct ruvd_buffer* buf;
/* ignore the request if message buffer isn't mapped */
if (!dec->msg)
/* ignore the request if message/feedback buffer isn't mapped */
if (!dec->msg || !dec->fb)
return;
/* grab the current message/feedback buffer */
@@ -157,6 +166,8 @@ static void send_msg_buf(struct ruvd_decoder *dec)
/* unmap the buffer */
dec->ws->buffer_unmap(buf->cs_handle);
dec->msg = NULL;
dec->fb = NULL;
/* and send it to the hardware */
send_cmd(dec, RUVD_CMD_MSG_BUFFER, buf->cs_handle, 0,
@@ -644,7 +655,7 @@ static void ruvd_destroy(struct pipe_video_codec *decoder)
assert(decoder);
map_msg_buf(dec);
map_msg_fb_buf(dec);
memset(dec->msg, 0, sizeof(*dec->msg));
dec->msg->size = sizeof(*dec->msg);
dec->msg->msg_type = RUVD_MSG_DESTROY;
@@ -773,7 +784,7 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
memset(dec->bs_ptr, 0, bs_size - dec->bs_size);
dec->ws->buffer_unmap(bs_buf->cs_handle);
map_msg_buf(dec);
map_msg_fb_buf(dec);
dec->msg->size = sizeof(*dec->msg);
dec->msg->msg_type = RUVD_MSG_DECODE;
dec->msg->stream_handle = dec->stream_handle;
@@ -813,6 +824,10 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
dec->msg->body.decode.db_surf_tile_config = dec->msg->body.decode.dt_surf_tile_config;
dec->msg->body.decode.extension_support = 0x1;
/* set at least the feedback buffer size */
dec->fb[0] = FB_BUFFER_SIZE;
send_msg_buf(dec);
send_cmd(dec, RUVD_CMD_DPB_BUFFER, dec->dpb.cs_handle, 0,
@@ -822,7 +837,7 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
send_cmd(dec, RUVD_CMD_DECODING_TARGET_BUFFER, dt, 0,
RADEON_USAGE_WRITE, RADEON_DOMAIN_VRAM);
send_cmd(dec, RUVD_CMD_FEEDBACK_BUFFER, msg_fb_buf->cs_handle,
0x1000, RADEON_USAGE_WRITE, RADEON_DOMAIN_GTT);
FB_BUFFER_OFFSET, RADEON_USAGE_WRITE, RADEON_DOMAIN_GTT);
set_reg(dec, RUVD_ENGINE_CNTL, 1);
flush(dec);
@@ -898,7 +913,8 @@ struct pipe_video_codec *ruvd_create_decoder(struct pipe_context *context,
bs_buf_size = width * height * 512 / (16 * 16);
for (i = 0; i < NUM_BUFFERS; ++i) {
unsigned msg_fb_size = align(sizeof(struct ruvd_msg), 0x1000) + 0x1000;
unsigned msg_fb_size = FB_BUFFER_OFFSET + FB_BUFFER_SIZE;
STATIC_ASSERT(sizeof(struct ruvd_msg) <= FB_BUFFER_OFFSET);
if (!create_buffer(dec, &dec->msg_fb_buffers[i], msg_fb_size)) {
RUVD_ERR("Can't allocated message buffers.\n");
goto error;
@@ -920,7 +936,7 @@ struct pipe_video_codec *ruvd_create_decoder(struct pipe_context *context,
clear_buffer(dec, &dec->dpb);
map_msg_buf(dec);
map_msg_fb_buf(dec);
dec->msg->size = sizeof(*dec->msg);
dec->msg->msg_type = RUVD_MSG_CREATE;
dec->msg->stream_handle = dec->stream_handle;

View File

@@ -81,7 +81,7 @@ void si_context_flush(struct si_context *ctx, unsigned flags)
{
struct radeon_winsys_cs *cs = ctx->b.rings.gfx.cs;
if (!cs->cdw)
if (cs->cdw == ctx->b.initial_gfx_cs_size)
return;
/* suspend queries */
@@ -177,6 +177,8 @@ void si_begin_new_cs(struct si_context *ctx)
}
si_all_descriptors_begin_new_cs(ctx);
ctx->b.initial_gfx_cs_size = ctx->b.rings.gfx.cs->cdw;
}
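This is the same idiom as the r600_buffer_map_sync_with_rings change above: record cs->cdw right after the invariant start-of-CS state is emitted, then treat the CS as empty until it grows past that mark. A minimal in-context sketch of the test, assuming cdw counts dwords already written:
static bool cs_has_user_work(struct r600_common_context *ctx)
{
    /* anything beyond the begin-of-CS state counts as real work */
    return ctx->rings.gfx.cs->cdw != ctx->initial_gfx_cs_size;
}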
#if SI_TRACE_CS

View File

@@ -299,6 +299,12 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return sscreen->b.has_streamout ? 32*4 : 0;
/* Geometry shader output. */
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
return 1024;
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 4095;
/* Texturing. */
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:

View File

@@ -121,6 +121,9 @@ softpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 16*4;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 0;
case PIPE_CAP_PRIMITIVE_RESTART:
return 1;
case PIPE_CAP_SHADER_STENCIL_EXPORT:

View File

@@ -253,6 +253,8 @@ svga_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:

View File

@@ -524,7 +524,9 @@ enum pipe_cap {
PIPE_CAP_MAX_VIEWPORTS = 84,
PIPE_CAP_ENDIANNESS = 85,
PIPE_CAP_MIXED_FRAMEBUFFER_SIZES = 86,
PIPE_CAP_TGSI_VS_LAYER = 87
PIPE_CAP_TGSI_VS_LAYER = 87,
PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES = 88,
PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS = 89
};
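State trackers read the two new caps through the ordinary get_param hook; a sketch (get_param takes the screen and a pipe_cap and returns int):
static int max_gs_output_vertices(struct pipe_screen *screen)
{
    return screen->get_param(screen, PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES);
}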
#define PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_NV50 (1 << 0)

View File

@@ -44,17 +44,13 @@ libvdpau_r600_la_LIBADD = \
$(GALLIUM_DRI_LIB_DEPS) \
$(RADEON_LIBS)
if HAVE_MESA_LLVM
libvdpau_r600_la_LINK = $(CXXLINK) $(libvdpau_r600_la_LDFLAGS)
# Mention a dummy pure C++ file to trigger generation of the $(LINK) variable
nodist_EXTRA_libvdpau_r600_la_SOURCES = dummy-cpp.cpp
if HAVE_MESA_LLVM
libvdpau_r600_la_LDFLAGS += $(LLVM_LDFLAGS)
libvdpau_r600_la_LIBADD += $(LLVM_LIBS)
else
libvdpau_r600_la_LINK = $(LINK) $(libvdpau_r600_la_LDFLAGS)
# Mention a dummy pure C file to trigger generation of the $(LINK) variable
nodist_EXTRA_libvdpau_r600_la_SOURCES = dummy-c.c
endif
# Provide compatibility with scripts for the old Mesa build system for

View File

@@ -888,14 +888,13 @@ public:
ast_node *body;
private:
/**
* Generate IR from the condition of a loop
*
* This is factored out of ::hir because some loops have the condition
* test at the top (for and while), and others have it at the end (do-while).
*/
void condition_to_hir(class ir_loop *, struct _mesa_glsl_parse_state *);
void condition_to_hir(exec_list *, struct _mesa_glsl_parse_state *);
};

View File

@@ -2853,10 +2853,17 @@ validate_identifier(const char *identifier, YYLTYPE loc,
* "In addition, all identifiers containing two
* consecutive underscores (__) are reserved as
* possible future keywords."
*
* The intention is that names containing __ are reserved for internal
* use by the implementation, and names prefixed with GL_ are reserved
* for use by Khronos. Names simply containing __ are dangerous to use,
* but should be allowed.
*
* A future version of the GLSL specification will clarify this.
*/
_mesa_glsl_error(&loc, state,
"identifier `%s' uses reserved `__' string",
identifier);
_mesa_glsl_warning(&loc, state,
"identifier `%s' uses reserved `__' string",
identifier);
}
}
@@ -4029,17 +4036,22 @@ ast_jump_statement::hir(exec_list *instructions,
_mesa_glsl_error(& loc, state,
"break may only appear in a loop or a switch");
} else {
/* For a loop, inline the for loop expression again,
* since we don't know where near the end of
* the loop body the normal copy of it
* is going to be placed.
/* For a loop, inline the for loop expression again, since we don't
* know where near the end of the loop body the normal copy of it is
* going to be placed. Same goes for the condition for a do-while
* loop.
*/
if (state->loop_nesting_ast != NULL &&
mode == ast_continue &&
state->loop_nesting_ast->rest_expression) {
state->loop_nesting_ast->rest_expression->hir(instructions,
state);
}
mode == ast_continue) {
if (state->loop_nesting_ast->rest_expression) {
state->loop_nesting_ast->rest_expression->hir(instructions,
state);
}
if (state->loop_nesting_ast->mode ==
ast_iteration_statement::ast_do_while) {
state->loop_nesting_ast->condition_to_hir(instructions, state);
}
}
if (state->switch_state.is_switch_innermost &&
mode == ast_break) {
@@ -4369,14 +4381,14 @@ ast_case_label::hir(exec_list *instructions,
}
void
ast_iteration_statement::condition_to_hir(ir_loop *stmt,
ast_iteration_statement::condition_to_hir(exec_list *instructions,
struct _mesa_glsl_parse_state *state)
{
void *ctx = state;
if (condition != NULL) {
ir_rvalue *const cond =
condition->hir(& stmt->body_instructions, state);
condition->hir(instructions, state);
if ((cond == NULL)
|| !cond->type->is_boolean() || !cond->type->is_scalar()) {
@@ -4397,7 +4409,7 @@ ast_iteration_statement::condition_to_hir(ir_loop *stmt,
new(ctx) ir_loop_jump(ir_loop_jump::jump_break);
if_stmt->then_instructions.push_tail(break_stmt);
stmt->body_instructions.push_tail(if_stmt);
instructions->push_tail(if_stmt);
}
}
}
@@ -4432,7 +4444,7 @@ ast_iteration_statement::hir(exec_list *instructions,
state->switch_state.is_switch_innermost = false;
if (mode != ast_do_while)
condition_to_hir(stmt, state);
condition_to_hir(&stmt->body_instructions, state);
if (body != NULL)
body->hir(& stmt->body_instructions, state);
@@ -4441,7 +4453,7 @@ ast_iteration_statement::hir(exec_list *instructions,
rest_expression->hir(& stmt->body_instructions, state);
if (mode == ast_do_while)
condition_to_hir(stmt, state);
condition_to_hir(&stmt->body_instructions, state);
if (mode != ast_do_while)
state->symbols->pop_scope();

View File

@@ -118,6 +118,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
ubo_layout_mask.flags.q.shared = 1;
ast_type_qualifier ubo_binding_mask;
ubo_binding_mask.flags.i = 0;
ubo_binding_mask.flags.q.explicit_binding = 1;
ubo_binding_mask.flags.q.explicit_offset = 1;

View File

@@ -4100,6 +4100,7 @@ builtin_builder::_mid3(const glsl_type *type)
/* The singleton instance of builtin_builder. */
static builtin_builder builtins;
_glthread_DECLARE_STATIC_MUTEX(builtins_lock);
/**
* External API (exposing the built-in module to the rest of the compiler):
@@ -4108,20 +4109,28 @@ static builtin_builder builtins;
void
_mesa_glsl_initialize_builtin_functions()
{
_glthread_LOCK_MUTEX(builtins_lock);
builtins.initialize();
_glthread_UNLOCK_MUTEX(builtins_lock);
}
void
_mesa_glsl_release_builtin_functions()
{
_glthread_LOCK_MUTEX(builtins_lock);
builtins.release();
_glthread_UNLOCK_MUTEX(builtins_lock);
}
ir_function_signature *
_mesa_glsl_find_builtin_function(_mesa_glsl_parse_state *state,
const char *name, exec_list *actual_parameters)
{
return builtins.find(state, name, actual_parameters);
ir_function_signature * s;
_glthread_LOCK_MUTEX(builtins_lock);
s = builtins.find(state, name, actual_parameters);
_glthread_UNLOCK_MUTEX(builtins_lock);
return s;
}
gl_shader *

View File

@@ -1770,11 +1770,27 @@ static void
_check_for_reserved_macro_name (glcpp_parser_t *parser, YYLTYPE *loc,
const char *identifier)
{
/* According to the GLSL specification, macro names starting with "__"
* or "GL_" are reserved for future use. So, don't allow them.
/* Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and
* the GLSL ES spec (all versions) say:
*
* "All macro names containing two consecutive underscores ( __ )
* are reserved for future use as predefined macro names. All
* macro names prefixed with "GL_" ("GL" followed by a single
* underscore) are also reserved."
*
* The intention is that names containing __ are reserved for internal
* use by the implementation, and names prefixed with GL_ are reserved
* for use by Khronos. Since every extension adds a name prefixed
* with GL_ (i.e., the name of the extension), that should be an
* error. Names simply containing __ are dangerous to use, but should
* be allowed.
*
* A future version of the GLSL specification will clarify this.
*/
if (strstr(identifier, "__")) {
glcpp_error (loc, parser, "Macro names containing \"__\" are reserved.\n");
glcpp_warning(loc, parser,
"Macro names containing \"__\" are reserved "
"for use by the implementation.\n");
}
if (strncmp(identifier, "GL_", 3) == 0) {
glcpp_error (loc, parser, "Macro names starting with \"GL_\" are reserved.\n");

View File

@@ -1,8 +1,8 @@
0:1(10): preprocessor error: Macro names containing "__" are reserved.
0:1(10): preprocessor warning: Macro names containing "__" are reserved for use by the implementation.
0:2(9): preprocessor error: Macro names starting with "GL_" are reserved.
0:3(9): preprocessor error: Macro names containing "__" are reserved.
0:3(9): preprocessor warning: Macro names containing "__" are reserved for use by the implementation.

View File

@@ -1466,7 +1466,7 @@ type_qualifier:
"just before storage qualifiers");
}
$$ = $1;
$$.flags.i |= $2.flags.i;
$$.merge_qualifier(&@1, state, $2);
}
| storage_qualifier type_qualifier
{

View File

@@ -2329,11 +2329,12 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
goto done;
/* OpenGL ES requires that a vertex shader and a fragment shader both be
* present in a linked program. By checking prog->IsES, we also
* catch the GL_ARB_ES2_compatibility case.
* present in a linked program. GL_ARB_ES2_compatibility doesn't say
* anything about shader linking when one of the shaders (vertex or
* fragment shader) is absent. So, the extension shouldn't change the
* behavior specified in GLSL specification.
*/
if (!prog->InternalSeparateShader &&
(ctx->API == API_OPENGLES2 || prog->IsES)) {
if (!prog->InternalSeparateShader && ctx->API == API_OPENGLES2) {
if (prog->_LinkedShaders[MESA_SHADER_VERTEX] == NULL) {
linker_error(prog, "program lacks a vertex shader\n");
} else if (prog->_LinkedShaders[MESA_SHADER_FRAGMENT] == NULL) {

View File

@@ -82,6 +82,7 @@ public:
virtual ir_visitor_status visit_enter(ir_assignment *);
virtual ir_visitor_status visit_enter(ir_swizzle *);
virtual ir_visitor_status visit_enter(ir_dereference_array *);
virtual ir_visitor_status visit_enter(ir_if *);
virtual ir_visitor_status visit_enter(ir_loop *);
@@ -289,6 +290,19 @@ ir_vectorize_visitor::visit_enter(ir_swizzle *ir)
return visit_continue;
}
/* Upon entering an ir_dereference_array, remove the current assignment from
* further consideration. Since the index of an array dereference must be
* scalar, we are not able to vectorize it.
*
* FINISHME: If all of the scalar indices are identical, we could vectorize.
*/
ir_visitor_status
ir_vectorize_visitor::visit_enter(ir_dereference_array *ir)
{
this->current_assignment = NULL;
return visit_continue_with_parent;
}
/* Since there is no statement to visit between the "then" and "else"
* instructions try to vectorize before, in between, and after them to avoid
* combining statements from different basic blocks.

View File

@@ -141,6 +141,7 @@ dri2_bind_context(struct glx_context *context, struct glx_context *old,
struct dri2_context *pcp = (struct dri2_context *) context;
struct dri2_screen *psc = (struct dri2_screen *) pcp->base.psc;
struct dri2_drawable *pdraw, *pread;
__DRIdrawable *dri_draw = NULL, *dri_read = NULL;
struct dri2_display *pdp;
pdraw = (struct dri2_drawable *) driFetchDrawable(context, draw);
@@ -148,20 +149,26 @@ dri2_bind_context(struct glx_context *context, struct glx_context *old,
driReleaseDrawables(&pcp->base);
if (pdraw == NULL || pread == NULL)
if (pdraw)
dri_draw = pdraw->driDrawable;
else if (draw != None)
return GLXBadDrawable;
if (!(*psc->core->bindContext) (pcp->driContext,
pdraw->driDrawable, pread->driDrawable))
if (pread)
dri_read = pread->driDrawable;
else if (read != None)
return GLXBadDrawable;
if (!(*psc->core->bindContext) (pcp->driContext, dri_draw, dri_read))
return GLXBadContext;
/* If the server doesn't send invalidate events, we may miss a
* resize before the rendering starts. Invalidate the buffers now
* so the driver will recheck before rendering starts. */
pdp = (struct dri2_display *) psc->base.display;
if (!pdp->invalidateAvailable) {
if (!pdp->invalidateAvailable && pdraw) {
dri2InvalidateBuffers(psc->base.dpy, pdraw->base.xDrawable);
if (pread != pdraw)
if (pread != pdraw && pread)
dri2InvalidateBuffers(psc->base.dpy, pread->base.xDrawable);
}
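What the NULL-drawable plumbing enables: with OpenGL 3.0 contexts (GLX_ARB_create_context), a context may legally be made current with no default framebuffer at all. A hedged usage sketch, assuming ctx came from glXCreateContextAttribsARB:
/* Bind a context without draw/read drawables; rendering then targets FBOs. */
glXMakeContextCurrent(dpy, None, None, ctx);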

View File

@@ -392,6 +392,9 @@ driFetchDrawable(struct glx_context *gc, GLXDrawable glxDrawable)
if (priv == NULL)
return NULL;
if (glxDrawable == None)
return NULL;
psc = priv->screens[gc->screen];
if (priv->drawHash == NULL)
return NULL;

View File

@@ -595,8 +595,8 @@ i830_render_target_supported(struct intel_context *intel,
{
mesa_format format = rb->Format;
if (format == MESA_FORMAT_Z24_UNORM_X8_UINT ||
format == MESA_FORMAT_Z24_UNORM_S8_UINT ||
if (format == MESA_FORMAT_Z24_UNORM_S8_UINT ||
format == MESA_FORMAT_Z24_UNORM_X8_UINT ||
format == MESA_FORMAT_Z_UNORM16) {
return true;
}
@@ -804,7 +804,7 @@ i830_update_draw_buffer(struct intel_context *intel)
/* Check for stencil fallback. */
if (irbStencil && irbStencil->mt) {
assert(intel_rb_format(irbStencil) == MESA_FORMAT_Z24_UNORM_X8_UINT);
assert(intel_rb_format(irbStencil) == MESA_FORMAT_Z24_UNORM_S8_UINT);
FALLBACK(intel, INTEL_FALLBACK_STENCIL_BUFFER, false);
} else if (irbStencil && !irbStencil->mt) {
FALLBACK(intel, INTEL_FALLBACK_STENCIL_BUFFER, true);
@@ -817,7 +817,7 @@ i830_update_draw_buffer(struct intel_context *intel)
* we still need to set up the shared depth/stencil state so we can use it.
*/
if (depthRegion == NULL && irbStencil && irbStencil->mt
&& intel_rb_format(irbStencil) == MESA_FORMAT_Z24_UNORM_X8_UINT) {
&& intel_rb_format(irbStencil) == MESA_FORMAT_Z24_UNORM_S8_UINT) {
depthRegion = irbStencil->mt->region;
}

View File

@@ -114,8 +114,8 @@ intel_init_texture_formats(struct gl_context *ctx)
ctx->TextureFormatSupported[MESA_FORMAT_L8A8_UNORM] = true;
/* Depth and stencil */
ctx->TextureFormatSupported[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
ctx->TextureFormatSupported[MESA_FORMAT_Z24_UNORM_S8_UINT] = true;
ctx->TextureFormatSupported[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
/*
* This was disabled in initial FBO enabling to avoid combinations

View File

@@ -88,8 +88,8 @@ translate_texture_format(mesa_format mesa_format, GLenum DepthMode)
case MESA_FORMAT_RGBA_DXT5:
case MESA_FORMAT_SRGBA_DXT5:
return (MAPSURF_COMPRESSED | MT_COMPRESS_DXT4_5);
case MESA_FORMAT_Z24_UNORM_X8_UINT:
case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_Z24_UNORM_X8_UINT:
if (DepthMode == GL_ALPHA)
return (MAPSURF_32BIT | MT_32BIT_x8A24);
else if (DepthMode == GL_INTENSITY)

View File

@@ -562,8 +562,8 @@ i915_render_target_supported(struct intel_context *intel,
{
mesa_format format = rb->Format;
if (format == MESA_FORMAT_Z24_UNORM_X8_UINT ||
format == MESA_FORMAT_Z24_UNORM_S8_UINT ||
if (format == MESA_FORMAT_Z24_UNORM_S8_UINT ||
format == MESA_FORMAT_Z24_UNORM_X8_UINT ||
format == MESA_FORMAT_Z_UNORM16) {
return true;
}
@@ -777,7 +777,7 @@ i915_update_draw_buffer(struct intel_context *intel)
/* Check for stencil fallback. */
if (irbStencil && irbStencil->mt) {
assert(intel_rb_format(irbStencil) == MESA_FORMAT_Z24_UNORM_X8_UINT);
assert(intel_rb_format(irbStencil) == MESA_FORMAT_Z24_UNORM_S8_UINT);
FALLBACK(intel, INTEL_FALLBACK_STENCIL_BUFFER, false);
} else if (irbStencil && !irbStencil->mt) {
FALLBACK(intel, INTEL_FALLBACK_STENCIL_BUFFER, true);
@@ -790,7 +790,7 @@ i915_update_draw_buffer(struct intel_context *intel)
* we still need to set up the shared depth/stencil state so we can use it.
*/
if (depthRegion == NULL && irbStencil && irbStencil->mt
&& intel_rb_format(irbStencil) == MESA_FORMAT_Z24_UNORM_X8_UINT) {
&& intel_rb_format(irbStencil) == MESA_FORMAT_Z24_UNORM_S8_UINT) {
depthRegion = irbStencil->mt->region;
}

View File

@@ -194,7 +194,7 @@ intel_alloc_renderbuffer_storage(struct gl_context * ctx, struct gl_renderbuffer
case GL_STENCIL_INDEX8_EXT:
case GL_STENCIL_INDEX16_EXT:
/* These aren't actual texture formats, so force them here. */
rb->Format = MESA_FORMAT_Z24_UNORM_X8_UINT;
rb->Format = MESA_FORMAT_Z24_UNORM_S8_UINT;
break;
}

View File

@@ -889,7 +889,7 @@ intelCreateBuffer(__DRIscreen * driScrnPriv,
* Use combined depth/stencil. Note that the renderbuffer is
* attached to two attachment points.
*/
rb = intel_create_private_renderbuffer(MESA_FORMAT_Z24_UNORM_X8_UINT);
rb = intel_create_private_renderbuffer(MESA_FORMAT_Z24_UNORM_S8_UINT);
_mesa_add_renderbuffer(fb, BUFFER_DEPTH, &rb->Base.Base);
_mesa_add_renderbuffer(fb, BUFFER_STENCIL, &rb->Base.Base);
}

View File

@@ -95,7 +95,7 @@ brw_blorp_surface_info::set(struct brw_context *brw,
this->map_stencil_as_y_tiled = true;
this->brw_surfaceformat = BRW_SURFACEFORMAT_R8_UNORM;
break;
case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_Z24_UNORM_X8_UINT:
/* It would make sense to use BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS
* here, but unfortunately it isn't supported as a render target, which
* would prevent us from blitting to 24-bit depth.
@@ -328,7 +328,7 @@ brw_hiz_op_params::brw_hiz_op_params(struct intel_mipmap_tree *mt,
switch (mt->format) {
case MESA_FORMAT_Z_UNORM16: depth_format = BRW_DEPTHFORMAT_D16_UNORM; break;
case MESA_FORMAT_Z_FLOAT32: depth_format = BRW_DEPTHFORMAT_D32_FLOAT; break;
case MESA_FORMAT_Z24_UNORM_S8_UINT: depth_format = BRW_DEPTHFORMAT_D24_UNORM_X8_UINT; break;
case MESA_FORMAT_Z24_UNORM_X8_UINT: depth_format = BRW_DEPTHFORMAT_D24_UNORM_X8_UINT; break;
default: assert(0); break;
}
}

View File

@@ -219,7 +219,7 @@ formats_match(GLbitfield buffer_bit, struct intel_renderbuffer *src_irb,
{
/* Note: don't just check gl_renderbuffer::Format, because in some cases
* multiple gl_formats resolve to the same native type in the miptree (for
* example MESA_FORMAT_Z24_UNORM_S8_UINT and MESA_FORMAT_Z24_UNORM_X8_UINT), and we can blit
* example MESA_FORMAT_Z24_UNORM_X8_UINT and MESA_FORMAT_Z24_UNORM_S8_UINT), and we can blit
* between those formats.
*/
mesa_format src_format = find_miptree(buffer_bit, src_irb)->format;
@@ -368,8 +368,8 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
* we have to lie about the surface format. See the comments in
* brw_blorp_surface_info::set().
*/
if ((src_mt->format == MESA_FORMAT_Z24_UNORM_S8_UINT) !=
(dst_mt->format == MESA_FORMAT_Z24_UNORM_S8_UINT)) {
if ((src_mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT) !=
(dst_mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT)) {
return false;
}

View File

@@ -130,7 +130,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
uint32_t depth_clear_value;
switch (mt->format) {
case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
case MESA_FORMAT_Z24_UNORM_X8_UINT:
case MESA_FORMAT_Z24_UNORM_S8_UINT:
/* From the Sandy Bridge PRM, volume 2 part 1, page 314:
*
* "[DevSNB+]: Several cases exist where Depth Buffer Clear cannot be

View File

@@ -682,12 +682,6 @@ brwCreateContext(gl_api api,
intel_batchbuffer_init(brw);
brw_init_state(brw);
intelInitExtensions(ctx);
intel_fbo_init(brw);
if (brw->gen >= 6) {
/* Create a new hardware context. Using a hardware context means that
* our GPU state will be saved/restored on context switch, allowing us
@@ -705,6 +699,12 @@ brwCreateContext(gl_api api,
}
}
brw_init_state(brw);
intelInitExtensions(ctx);
intel_fbo_init(brw);
brw_init_surface_formats(brw);
if (brw->is_g4x || brw->gen >= 5) {

View File

@@ -142,7 +142,7 @@ brw_depthbuffer_format(struct brw_context *brw)
if (!drb &&
(srb = intel_get_renderbuffer(fb, BUFFER_STENCIL)) &&
!srb->mt->stencil_mt &&
(intel_rb_format(srb) == MESA_FORMAT_Z24_UNORM_X8_UINT ||
(intel_rb_format(srb) == MESA_FORMAT_Z24_UNORM_S8_UINT ||
intel_rb_format(srb) == MESA_FORMAT_Z32_FLOAT_S8X24_UINT)) {
drb = srb;
}
@@ -155,7 +155,7 @@ brw_depthbuffer_format(struct brw_context *brw)
return BRW_DEPTHFORMAT_D16_UNORM;
case MESA_FORMAT_Z_FLOAT32:
return BRW_DEPTHFORMAT_D32_FLOAT;
case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_Z24_UNORM_X8_UINT:
if (brw->gen >= 6) {
return BRW_DEPTHFORMAT_D24_UNORM_X8_UINT;
} else {
@@ -173,7 +173,7 @@ brw_depthbuffer_format(struct brw_context *brw)
*/
return BRW_DEPTHFORMAT_D24_UNORM_S8_UINT;
}
case MESA_FORMAT_Z24_UNORM_X8_UINT:
case MESA_FORMAT_Z24_UNORM_S8_UINT:
return BRW_DEPTHFORMAT_D24_UNORM_S8_UINT;
case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
return BRW_DEPTHFORMAT_D32_FLOAT_S8X24_UINT;

View File

@@ -358,9 +358,9 @@ brw_format_for_mesa_format(mesa_format mesa_format)
[MESA_FORMAT_G16R16_UNORM] = 0,
[MESA_FORMAT_B10G10R10A2_UNORM] = BRW_SURFACEFORMAT_B10G10R10A2_UNORM,
[MESA_FORMAT_S8_UINT_Z24_UNORM] = 0,
[MESA_FORMAT_Z24_UNORM_X8_UINT] = 0,
[MESA_FORMAT_Z_UNORM16] = 0,
[MESA_FORMAT_Z24_UNORM_S8_UINT] = 0,
[MESA_FORMAT_Z_UNORM16] = 0,
[MESA_FORMAT_Z24_UNORM_X8_UINT] = 0,
[MESA_FORMAT_X8Z24_UNORM] = 0,
[MESA_FORMAT_Z_UNORM32] = 0,
[MESA_FORMAT_S_UINT8] = 0,
@@ -600,8 +600,8 @@ brw_init_surface_formats(struct brw_context *brw)
/* We will check this table for FBO completeness, but the surface format
* table above only covered color rendering.
*/
brw->format_supported_as_render_target[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
brw->format_supported_as_render_target[MESA_FORMAT_Z24_UNORM_S8_UINT] = true;
brw->format_supported_as_render_target[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
brw->format_supported_as_render_target[MESA_FORMAT_S_UINT8] = true;
brw->format_supported_as_render_target[MESA_FORMAT_Z_UNORM16] = true;
brw->format_supported_as_render_target[MESA_FORMAT_Z_FLOAT32] = true;
@@ -610,8 +610,8 @@ brw_init_surface_formats(struct brw_context *brw)
/* We remap depth formats to a supported texturing format in
* translate_tex_format().
*/
ctx->TextureFormatSupported[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
ctx->TextureFormatSupported[MESA_FORMAT_Z24_UNORM_S8_UINT] = true;
ctx->TextureFormatSupported[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
ctx->TextureFormatSupported[MESA_FORMAT_Z_FLOAT32] = true;
ctx->TextureFormatSupported[MESA_FORMAT_Z32_FLOAT_S8X24_UINT] = true;
@@ -697,8 +697,8 @@ translate_tex_format(struct brw_context *brw,
case MESA_FORMAT_Z_UNORM16:
return BRW_SURFACEFORMAT_R16_UNORM;
case MESA_FORMAT_Z24_UNORM_X8_UINT:
case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_Z24_UNORM_X8_UINT:
return BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS;
case MESA_FORMAT_Z_FLOAT32:
@@ -740,8 +740,8 @@ brw_is_hiz_depth_format(struct brw_context *brw, mesa_format format)
switch (format) {
case MESA_FORMAT_Z_FLOAT32:
case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_Z24_UNORM_X8_UINT:
case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_Z_UNORM16:
return true;
default:

View File

@@ -254,26 +254,6 @@ gen6_blorp_emit_blend_state(struct brw_context *brw,
blend->blend1.write_disable_b = params->color_write_disable[2];
blend->blend1.write_disable_a = params->color_write_disable[3];
/* When blitting from an XRGB source to a ARGB destination, we need to
* interpret the missing channel as 1.0. Blending can do that for us:
* we simply use the RGB values from the fragment shader ("source RGB"),
* but smash the alpha channel to 1.
*/
if (params->src.mt &&
_mesa_get_format_bits(params->dst.mt->format, GL_ALPHA_BITS) > 0 &&
_mesa_get_format_bits(params->src.mt->format, GL_ALPHA_BITS) == 0) {
blend->blend0.blend_enable = 1;
blend->blend0.ia_blend_enable = 1;
blend->blend0.blend_func = BRW_BLENDFUNCTION_ADD;
blend->blend0.ia_blend_func = BRW_BLENDFUNCTION_ADD;
blend->blend0.source_blend_factor = BRW_BLENDFACTOR_SRC_COLOR;
blend->blend0.dest_blend_factor = BRW_BLENDFACTOR_ZERO;
blend->blend0.ia_source_blend_factor = BRW_BLENDFACTOR_ONE;
blend->blend0.ia_dest_blend_factor = BRW_BLENDFACTOR_ZERO;
}
return cc_blend_state_offset;
}

View File

@@ -209,7 +209,7 @@ intel_alloc_renderbuffer_storage(struct gl_context * ctx, struct gl_renderbuffer
rb->Format = MESA_FORMAT_S_UINT8;
} else {
assert(!brw->must_use_separate_stencil);
rb->Format = MESA_FORMAT_Z24_UNORM_X8_UINT;
rb->Format = MESA_FORMAT_Z24_UNORM_S8_UINT;
}
break;
}

View File

@@ -370,8 +370,8 @@ intel_miptree_create_layout(struct brw_context *brw,
/* Fix up the Z miptree format for how we're splitting out separate
* stencil. Gen7 expects there to be no stencil bits in its depth buffer.
*/
if (mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT) {
mt->format = MESA_FORMAT_Z24_UNORM_S8_UINT;
if (mt->format == MESA_FORMAT_Z24_UNORM_S8_UINT) {
mt->format = MESA_FORMAT_Z24_UNORM_X8_UINT;
} else if (mt->format == MESA_FORMAT_Z32_FLOAT_S8X24_UINT) {
mt->format = MESA_FORMAT_Z_FLOAT32;
mt->cpp = 4;
@@ -918,8 +918,8 @@ intel_miptree_match_image(struct intel_mipmap_tree *mt,
assert(target_to_target(image->TexObject->Target) == mt->target);
mesa_format mt_format = mt->format;
if (mt->format == MESA_FORMAT_Z24_UNORM_S8_UINT && mt->stencil_mt)
mt_format = MESA_FORMAT_Z24_UNORM_X8_UINT;
if (mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT && mt->stencil_mt)
mt_format = MESA_FORMAT_Z24_UNORM_S8_UINT;
if (mt->format == MESA_FORMAT_Z_FLOAT32 && mt->stencil_mt)
mt_format = MESA_FORMAT_Z32_FLOAT_S8X24_UINT;
if (mt->etc_format != MESA_FORMAT_NONE)


@@ -279,8 +279,8 @@ struct intel_mipmap_tree
* on hardware where we want or need to use separate stencil, there will be
* two miptrees for storing the data. If the depthstencil texture or rb is
* MESA_FORMAT_Z32_FLOAT_S8X24_UINT, then mt->format will be
- * MESA_FORMAT_Z_FLOAT32, otherwise for MESA_FORMAT_Z24_UNORM_X8_UINT objects it will be
- * MESA_FORMAT_Z24_UNORM_S8_UINT.
+ * MESA_FORMAT_Z_FLOAT32, otherwise for MESA_FORMAT_Z24_UNORM_S8_UINT objects it will be
+ * MESA_FORMAT_Z24_UNORM_X8_UINT.
*
* For ETC1/ETC2 textures, this is one of the uncompressed mesa texture
* formats if the hardware lacks support for ETC1/ETC2. See @ref wraps_etc.


@@ -66,6 +66,8 @@ do_blit_copypixels(struct gl_context * ctx,
/* Update draw buffer bounds */
_mesa_update_state(ctx);
+intel_prepare_render(brw);
switch (type) {
case GL_COLOR:
if (fb->_NumColorDrawBuffers != 1) {
@@ -148,8 +150,6 @@ do_blit_copypixels(struct gl_context * ctx,
return false;
}
-intel_prepare_render(brw);
intel_batchbuffer_flush(brw);
/* Clip to destination buffer. */


@@ -72,6 +72,8 @@ do_blit_drawpixels(struct gl_context * ctx,
return false;
}
+intel_prepare_render(brw);
struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0];
struct intel_renderbuffer *irb = intel_renderbuffer(rb);
@@ -101,8 +103,6 @@ do_blit_drawpixels(struct gl_context * ctx,
src_offset += _mesa_image_offset(2, unpack, width, height,
format, type, 0, 0, 0);
-intel_prepare_render(brw);
src_buffer = intel_bufferobj_buffer(brw, src,
src_offset, width * height *
irb->mt->cpp);


@@ -1007,7 +1007,7 @@ intelCreateBuffer(__DRIscreen * driScrnPriv,
assert(mesaVis->stencilBits == 8);
if (screen->devinfo->has_hiz_and_separate_stencil) {
-rb = intel_create_private_renderbuffer(MESA_FORMAT_Z24_UNORM_S8_UINT,
+rb = intel_create_private_renderbuffer(MESA_FORMAT_Z24_UNORM_X8_UINT,
num_samples);
_mesa_add_renderbuffer(fb, BUFFER_DEPTH, &rb->Base.Base);
rb = intel_create_private_renderbuffer(MESA_FORMAT_S_UINT8,
@@ -1018,7 +1018,7 @@ intelCreateBuffer(__DRIscreen * driScrnPriv,
* Use combined depth/stencil. Note that the renderbuffer is
* attached to two attachment points.
*/
-rb = intel_create_private_renderbuffer(MESA_FORMAT_Z24_UNORM_X8_UINT,
+rb = intel_create_private_renderbuffer(MESA_FORMAT_Z24_UNORM_S8_UINT,
num_samples);
_mesa_add_renderbuffer(fb, BUFFER_DEPTH, &rb->Base.Base);
_mesa_add_renderbuffer(fb, BUFFER_STENCIL, &rb->Base.Base);


@@ -72,7 +72,7 @@ nouveau_context_create(gl_api api,
return false;
}
-ctx = screen->driver->context_create(screen, visual, share_ctx);
+ctx = screen->driver->context_create(screen, api, visual, share_ctx);
if (!ctx) {
*error = __DRI_CTX_ERROR_NO_MEMORY;
return GL_FALSE;
@@ -107,7 +107,8 @@ nouveau_context_create(gl_api api,
}
GLboolean
-nouveau_context_init(struct gl_context *ctx, struct nouveau_screen *screen,
+nouveau_context_init(struct gl_context *ctx, gl_api api,
+struct nouveau_screen *screen,
const struct gl_config *visual, struct gl_context *share_ctx)
{
struct nouveau_context *nctx = to_nouveau_context(ctx);
@@ -125,7 +126,7 @@ nouveau_context_init(struct gl_context *ctx, struct nouveau_screen *screen,
nouveau_fbo_functions_init(&functions);
/* Initialize the mesa context. */
-_mesa_initialize_context(ctx, API_OPENGL_COMPAT, visual,
+_mesa_initialize_context(ctx, api, visual,
share_ctx, &functions);
nouveau_state_init(ctx);


@@ -115,7 +115,8 @@ nouveau_context_create(gl_api api,
void *share_ctx);
GLboolean
-nouveau_context_init(struct gl_context *ctx, struct nouveau_screen *screen,
+nouveau_context_init(struct gl_context *ctx, gl_api api,
+struct nouveau_screen *screen,
const struct gl_config *visual, struct gl_context *share_ctx);
void


@@ -48,6 +48,7 @@
struct nouveau_driver {
struct gl_context *(*context_create)(struct nouveau_screen *screen,
+gl_api api,
const struct gl_config *visual,
struct gl_context *share_ctx);
void (*context_destroy)(struct gl_context *ctx);


@@ -138,7 +138,8 @@ nv04_context_destroy(struct gl_context *ctx)
}
static struct gl_context *
-nv04_context_create(struct nouveau_screen *screen, const struct gl_config *visual,
+nv04_context_create(struct nouveau_screen *screen, gl_api api,
+const struct gl_config *visual,
struct gl_context *share_ctx)
{
struct nv04_context *nctx;
@@ -153,7 +154,7 @@ nv04_context_create(struct nouveau_screen *screen, const struct gl_config *visua
ctx = &nctx->base.base;
hw = &nctx->base.hw;
-if (!nouveau_context_init(ctx, screen, visual, share_ctx))
+if (!nouveau_context_init(ctx, api, screen, visual, share_ctx))
goto fail;
/* GL constants. */


@@ -62,7 +62,7 @@ swzsurf_format(mesa_format format)
case MESA_FORMAT_B8G8R8X8_UNORM:
case MESA_FORMAT_B8G8R8A8_UNORM:
case MESA_FORMAT_A8R8G8B8_UNORM:
-case MESA_FORMAT_Z24_UNORM_X8_UINT:
+case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_S8_UINT_Z24_UNORM:
case MESA_FORMAT_Z_UNORM32:
return NV04_SWIZZLED_SURFACE_FORMAT_COLOR_A8R8G8B8;
@@ -101,7 +101,7 @@ surf2d_format(mesa_format format)
case MESA_FORMAT_B8G8R8X8_UNORM:
case MESA_FORMAT_B8G8R8A8_UNORM:
case MESA_FORMAT_A8R8G8B8_UNORM:
-case MESA_FORMAT_Z24_UNORM_X8_UINT:
+case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_S8_UINT_Z24_UNORM:
case MESA_FORMAT_Z_UNORM32:
return NV04_CONTEXT_SURFACES_2D_FORMAT_Y32;
@@ -140,7 +140,7 @@ rect_format(mesa_format format)
case MESA_FORMAT_B8G8R8X8_UNORM:
case MESA_FORMAT_B8G8R8A8_UNORM:
case MESA_FORMAT_A8R8G8B8_UNORM:
-case MESA_FORMAT_Z24_UNORM_X8_UINT:
+case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_S8_UINT_Z24_UNORM:
case MESA_FORMAT_Z_UNORM32:
return NV04_GDI_RECTANGLE_TEXT_COLOR_FORMAT_A8R8G8B8;
@@ -179,7 +179,7 @@ sifm_format(mesa_format format)
case MESA_FORMAT_B8G8R8X8_UNORM:
case MESA_FORMAT_B8G8R8A8_UNORM:
case MESA_FORMAT_A8R8G8B8_UNORM:
-case MESA_FORMAT_Z24_UNORM_X8_UINT:
+case MESA_FORMAT_Z24_UNORM_S8_UINT:
case MESA_FORMAT_S8_UINT_Z24_UNORM:
case MESA_FORMAT_Z_UNORM32:
return NV03_SCALED_IMAGE_FROM_MEMORY_COLOR_FORMAT_A8R8G8B8;


@@ -63,7 +63,7 @@ nv10_use_viewport_zclear(struct gl_context *ctx)
struct gl_framebuffer *fb = ctx->DrawBuffer;
struct gl_renderbuffer *depthRb = fb->Attachment[BUFFER_DEPTH].Renderbuffer;
-return context_chipset(ctx) < 0x17 &&
+return context_eng3d(ctx)->oclass < NV17_3D_CLASS &&
!nctx->hierz.clear_blocked && depthRb &&
(_mesa_get_format_bits(depthRb->Format,
GL_DEPTH_BITS) >= 24);
@@ -184,7 +184,7 @@ nv10_clear(struct gl_context *ctx, GLbitfield buffers)
}
if ((buffers & BUFFER_BIT_DEPTH) && ctx->Depth.Mask) {
-if (context_chipset(ctx) >= 0x17)
+if (context_eng3d(ctx)->oclass >= NV17_3D_CLASS)
nv17_zclear(ctx, &buffers);
else
nv10_zclear(ctx, &buffers);
@@ -245,7 +245,7 @@ nv10_hwctx_init(struct gl_context *ctx)
BEGIN_NV04(push, NV04_GRAPH(3D, NOP), 1);
PUSH_DATA (push, 0);
-if (context_chipset(ctx) >= 0x17) {
+if (context_eng3d(ctx)->oclass >= NV17_3D_CLASS) {
BEGIN_NV04(push, NV17_3D(UNK01AC), 2);
PUSH_DATA (push, fifo->vram);
PUSH_DATA (push, fifo->vram);
@@ -257,7 +257,7 @@ nv10_hwctx_init(struct gl_context *ctx)
PUSH_DATA (push, 1);
}
-if (context_chipset(ctx) >= 0x11) {
+if (context_eng3d(ctx)->oclass >= NV15_3D_CLASS) {
BEGIN_NV04(push, SUBC_3D(0x120), 3);
PUSH_DATA (push, 0);
PUSH_DATA (push, 1);
@@ -427,7 +427,8 @@ nv10_context_destroy(struct gl_context *ctx)
}
static struct gl_context *
-nv10_context_create(struct nouveau_screen *screen, const struct gl_config *visual,
+nv10_context_create(struct nouveau_screen *screen, gl_api api,
+const struct gl_config *visual,
struct gl_context *share_ctx)
{
struct nouveau_context *nctx;
@@ -441,7 +442,7 @@ nv10_context_create(struct nouveau_screen *screen, const struct gl_config *visua
ctx = &nctx->base;
-if (!nouveau_context_init(ctx, screen, visual, share_ctx))
+if (!nouveau_context_init(ctx, api, screen, visual, share_ctx))
goto fail;
ctx->Extensions.ARB_texture_env_crossbar = true;


@@ -106,7 +106,7 @@ nv10_emit_framebuffer(struct gl_context *ctx, int emit)
/* At least nv11 seems to get sad if we don't do this before
* swapping RTs.*/
-if (context_chipset(ctx) < 0x17) {
+if (context_eng3d(ctx)->oclass < NV17_3D_CLASS) {
int i;
for (i = 0; i < 6; i++) {
@@ -140,7 +140,7 @@ nv10_emit_framebuffer(struct gl_context *ctx, int emit)
PUSH_MTHDl(push, NV10_3D(ZETA_OFFSET), BUFCTX_FB,
s->bo, 0, bo_flags);
-if (context_chipset(ctx) >= 0x17) {
+if (context_eng3d(ctx)->oclass >= NV17_3D_CLASS) {
setup_hierz_buffer(ctx);
context_dirty(ctx, ZCLEAR);
}


@@ -28,6 +28,7 @@
#include "nouveau_context.h"
#include "nouveau_gldefs.h"
#include "nouveau_util.h"
#include "nv_object.xml.h"
#include "nv10_3d.xml.h"
#include "nv10_driver.h"
@@ -120,7 +121,7 @@ nv10_emit_logic_opcode(struct gl_context *ctx, int emit)
struct nouveau_pushbuf *push = context_push(ctx);
assert(!ctx->Color.ColorLogicOpEnabled
-|| context_chipset(ctx) >= 0x11);
+|| context_eng3d(ctx)->oclass >= NV15_3D_CLASS);
BEGIN_NV04(push, NV11_3D(COLOR_LOGIC_OP_ENABLE), 2);
PUSH_DATAb(push, ctx->Color.ColorLogicOpEnabled);


@@ -438,7 +438,8 @@ nv20_context_destroy(struct gl_context *ctx)
}
static struct gl_context *
-nv20_context_create(struct nouveau_screen *screen, const struct gl_config *visual,
+nv20_context_create(struct nouveau_screen *screen, gl_api api,
+const struct gl_config *visual,
struct gl_context *share_ctx)
{
struct nouveau_context *nctx;
@@ -452,7 +453,7 @@ nv20_context_create(struct nouveau_screen *screen, const struct gl_config *visua
ctx = &nctx->base;
-if (!nouveau_context_init(ctx, screen, visual, share_ctx))
+if (!nouveau_context_init(ctx, api, screen, visual, share_ctx))
goto fail;
ctx->Extensions.ARB_texture_env_crossbar = true;


@@ -310,7 +310,7 @@ radeon_map_renderbuffer(struct gl_context *ctx,
}
if ((rmesa->radeonScreen->chip_flags & RADEON_CHIPSET_DEPTH_ALWAYS_TILED) && !rrb->has_surface) {
-if (rb->Format == MESA_FORMAT_Z24_UNORM_X8_UINT || rb->Format == MESA_FORMAT_Z24_UNORM_S8_UINT) {
+if (rb->Format == MESA_FORMAT_Z24_UNORM_S8_UINT || rb->Format == MESA_FORMAT_Z24_UNORM_X8_UINT) {
radeon_map_renderbuffer_s8z24(ctx, rb, x, y, w, h,
mode, out_map, out_stride);
return;
@@ -419,7 +419,7 @@ radeon_unmap_renderbuffer(struct gl_context *ctx,
GLboolean ok;
if ((rmesa->radeonScreen->chip_flags & RADEON_CHIPSET_DEPTH_ALWAYS_TILED) && !rrb->has_surface) {
-if (rb->Format == MESA_FORMAT_Z24_UNORM_X8_UINT || rb->Format == MESA_FORMAT_Z24_UNORM_S8_UINT) {
+if (rb->Format == MESA_FORMAT_Z24_UNORM_S8_UINT || rb->Format == MESA_FORMAT_Z24_UNORM_X8_UINT) {
radeon_unmap_renderbuffer_s8z24(ctx, rb);
return;
}
@@ -507,7 +507,7 @@ radeon_alloc_renderbuffer_storage(struct gl_context * ctx, struct gl_renderbuffe
case GL_STENCIL_INDEX8_EXT:
case GL_STENCIL_INDEX16_EXT:
/* alloc a depth+stencil buffer */
-rb->Format = MESA_FORMAT_Z24_UNORM_X8_UINT;
+rb->Format = MESA_FORMAT_Z24_UNORM_S8_UINT;
cpp = 4;
break;
case GL_DEPTH_COMPONENT16:
@@ -517,12 +517,12 @@ radeon_alloc_renderbuffer_storage(struct gl_context * ctx, struct gl_renderbuffe
case GL_DEPTH_COMPONENT:
case GL_DEPTH_COMPONENT24:
case GL_DEPTH_COMPONENT32:
-rb->Format = MESA_FORMAT_Z24_UNORM_S8_UINT;
+rb->Format = MESA_FORMAT_Z24_UNORM_X8_UINT;
cpp = 4;
break;
case GL_DEPTH_STENCIL_EXT:
case GL_DEPTH24_STENCIL8_EXT:
-rb->Format = MESA_FORMAT_Z24_UNORM_X8_UINT;
+rb->Format = MESA_FORMAT_Z24_UNORM_S8_UINT;
cpp = 4;
break;
default:


@@ -632,14 +632,14 @@ radeonCreateBuffer( __DRIscreen *driScrnPriv,
if (mesaVis->depthBits == 24) {
if (mesaVis->stencilBits == 8) {
struct radeon_renderbuffer *depthStencilRb =
-radeon_create_renderbuffer(MESA_FORMAT_Z24_UNORM_X8_UINT, driDrawPriv);
+radeon_create_renderbuffer(MESA_FORMAT_Z24_UNORM_S8_UINT, driDrawPriv);
_mesa_add_renderbuffer(&rfb->base, BUFFER_DEPTH, &depthStencilRb->base.Base);
_mesa_add_renderbuffer(&rfb->base, BUFFER_STENCIL, &depthStencilRb->base.Base);
depthStencilRb->has_surface = screen->depthHasSurface;
} else {
/* depth renderbuffer */
struct radeon_renderbuffer *depth =
-radeon_create_renderbuffer(MESA_FORMAT_Z24_UNORM_S8_UINT, driDrawPriv);
+radeon_create_renderbuffer(MESA_FORMAT_Z24_UNORM_X8_UINT, driDrawPriv);
_mesa_add_renderbuffer(&rfb->base, BUFFER_DEPTH, &depth->base.Base);
depth->has_surface = screen->depthHasSurface;
}


@@ -433,7 +433,7 @@ mesa_format radeonChooseTextureFormat(struct gl_context * ctx,
case GL_DEPTH_COMPONENT32:
case GL_DEPTH_STENCIL_EXT:
case GL_DEPTH24_STENCIL8_EXT:
-return MESA_FORMAT_Z24_UNORM_X8_UINT;
+return MESA_FORMAT_Z24_UNORM_S8_UINT;
/* EXT_texture_sRGB */
case GL_SRGB:
@@ -513,7 +513,7 @@ unsigned radeonIsFormatRenderable(mesa_format mesa_format)
switch (mesa_format)
{
case MESA_FORMAT_Z_UNORM16:
-case MESA_FORMAT_Z24_UNORM_X8_UINT:
+case MESA_FORMAT_Z24_UNORM_S8_UINT:
return 1;
default:
return 0;


@@ -1457,6 +1457,7 @@ copy_array_object(struct gl_context *ctx,
/* _Enabled must be the same as on push */
dest->_Enabled = src->_Enabled;
dest->NewArrays = src->NewArrays;
+dest->_MaxElement = src->_MaxElement;
}

Some files were not shown because too many files have changed in this diff.