Compare commits

301 Commits

Author SHA1 Message Date
Carl Worth
42f86ef025 docs: Add sha256 sums for the 10.1.6 release files
Just after generating these files and tagging the release.
2014-06-24 21:24:53 -07:00
Carl Worth
5f41cae633 docs: Add release notes for the 10.1.6 release. 2014-06-24 21:17:37 -07:00
Carl Worth
0e76bc55ed Update VERSION to 10.1.6
In preparation for the 10.1.6 release.
2014-06-24 21:14:06 -07:00
Carl Worth
ce6877491f cherry-ignore: Add a patch to ignore
This patch is not needed on the 10.1 branch, (just 10.2), as confirmed by
Emil.
2014-06-24 12:52:21 -07:00
Tobias Klausmann
f9b6457986 nv50/ir: clear subop when folding constant expressions
Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
After folding, make sure that it is cleared.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

(cherry picked from commit 3164bfc734)
2014-06-24 12:48:06 -07:00
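The subop issue in f9b6457986 can be illustrated with a minimal sketch. The `insn` struct, the `OP_*`/`SUBOP_*` constants and `fold_mul` below are hypothetical stand-ins, not the nv50 IR types; the point is only that an instruction rewritten into a constant move should not keep the subop of the operation it replaced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature IR; names are illustrative only. */
enum op { OP_MOV, OP_MUL };
enum { SUBOP_NONE = 0, SUBOP_MUL_HIGH = 1 };

struct insn {
    enum op  op;
    int      subop;
    uint32_t src[2];  /* constant operands */
    uint32_t imm;     /* folded result */
};

/* Fold a multiply of two constants into a MOV of an immediate. */
static void fold_mul(struct insn *i)
{
    assert(i->op == OP_MUL);

    uint64_t prod = (uint64_t)i->src[0] * i->src[1];
    i->imm = (i->subop == SUBOP_MUL_HIGH) ? (uint32_t)(prod >> 32)
                                          : (uint32_t)prod;
    i->op = OP_MOV;
    /* Clear the subop: a stale value would be misread as a modifier
     * of the new MOV instruction. */
    i->subop = SUBOP_NONE;
}
```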
Roland Scheidegger
04ca4cef97 draw: (trivial) fix clamping of viewport index
The old logic would let all negative values go through unclamped, with
potentially disastrous results (probably trying to fetch viewport values
from random memory locations). GL has undefined rendering for vp indices
outside valid range but that's a bit too undefined...
(The logic is now the same as in llvmpipe.)

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 604e54de78)
2014-06-23 16:50:38 -07:00
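A minimal sketch of the clamping problem described in 04ca4cef97, assuming a hypothetical `MAX_VIEWPORTS` limit and helper names that are not taken from the Mesa source. A signed upper-bound check alone lets negative indices through; comparing as unsigned rejects both out-of-range cases.

```c
#include <assert.h>

/* Hypothetical limit for illustration; not taken from the Mesa source. */
#define MAX_VIEWPORTS 16

/* Flawed form: only the upper bound is checked, so a negative
 * index passes through unchanged. */
static int clamp_viewport_idx_old(int idx)
{
    return idx < MAX_VIEWPORTS ? idx : 0;
}

/* Fixed form: the cast turns negative values into very large unsigned
 * ones, so a single comparison rejects both out-of-range cases. */
static int clamp_viewport_idx(int idx)
{
    return (unsigned)idx < MAX_VIEWPORTS ? idx : 0;
}
```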
Beren Minor
b574944a05 egl/main: Fix eglMakeCurrent when releasing context from current thread.
EGL 1.4 Specification says that
eglMakeCurrent(display, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT)
can be used to release the current thread's ownership on the surfaces
and context.

Mesa's EGL implementation was only accepting these parameters when the
KHR_surfaceless_context extension was supported.

[chadv] Add quote from the EGL 1.4 spec.
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>

(cherry picked from commit 0ca0d5743f)
2014-06-23 15:37:38 -07:00
Daniel Manjarres
838b0d9928 glx: Don't crash on swap event for a Window (non-GLXWindow)
Prior to GLX 1.3 there was the glXMakeCurrent() function that took a
single drawable handle. The Drawable could be either a bare XID for a
Window or an XID for a GLXPixmap.

GLX 1.3 added glXMakeContextCurrent that takes 2 handles: one for
reading, one for writing. Nowadays the old glXMakeCurrent call is
implemented as a call to glXMakeContextCurrent with the single handle
duplicated.

Because of this it is allowed to use a plain-old Window ID as an
argument to glXMakeContextCurrent, although nobody really documents this
sort of thing. The manpage for the NEW call specifies the arguments as
GLXPixmaps, but the actual code accepts Window XIDs too, and handles
them correctly.

Similarly, the glXSelectEvent function can also take a bare Window XID.

The "piglit" tests all use GLXWindows and/or GLXPixmaps. You never
tested swap events with a bare Window XID. That is what my app was
doing.

The swap_events code worked with Window XIDs in mesa 7.x.y. The new code
added in versions 8, 9, and 10 assumes that all buffer swap events have
a GLXPixmap associated with them. Because of the historical quirks
above, this is not true. Swap events for bare Window XIDs do NOT have a
GLXPixmap, resulting in a segfault.

Any app that uses the old school glXMakeCurrent call with a Window XID
while trying to use swap_events will crash when the libs try to look up
the nonexistent GLXPixmap associated with the incoming swap event.

I believe that the people who wrote the spec overlooked this, because
the "sbc" field comes from the OML_sync extension that is defined in
terms of GLXPixmaps only.

v2 (idr): Formatting changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 86bd2196b4)
2014-06-23 11:21:09 -07:00
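One plausible shape of the guard described in 838b0d9928, sketched with hypothetical types and helpers (`glx_drawable`, `lookup_glx_drawable`, `handle_swap_event`) that are not the real libGL internals. The swap-event handler has to tolerate a failed lookup, because a bare Window XID has no tracked GLX drawable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GLX client-side structures. */
struct glx_drawable { unsigned long xid; };
struct swap_event   { unsigned long drawable; long long sbc; };

/* Returns the tracked GLX drawable for an XID, or NULL when the XID
 * is a bare Window that was never registered. */
static struct glx_drawable *
lookup_glx_drawable(struct glx_drawable *table, size_t n, unsigned long xid)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].xid == xid)
            return &table[i];
    return NULL;
}

/* Fills in the event only when a GLX drawable exists. A bare Window
 * XID yields false instead of a NULL dereference. */
static bool
handle_swap_event(struct glx_drawable *table, size_t n,
                  unsigned long xid, long long sbc, struct swap_event *out)
{
    struct glx_drawable *d = lookup_glx_drawable(table, n, xid);
    if (d == NULL)
        return false;

    out->drawable = d->xid;
    out->sbc = sbc;
    return true;
}
```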
Iago Toral Quiroga
c2dc58fe96 mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 96a95f48ea)
2014-06-23 11:21:09 -07:00
Tom Stellard
d947156407 clover: Don't use llvm's global context
An LLVMContext should only be accessed by a single thread, and using the
global context was causing crashes in multi-threaded environments.  Now we
use a separate context for each compile.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4aa128a123)
2014-06-23 11:21:09 -07:00
Tom Stellard
bf50129ba6 clover: Prevent Clang from printing number of errors and warnings to stderr.
https://bugs.freedesktop.org/show_bug.cgi?id=78581

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0cc391f013)
2014-06-23 11:21:08 -07:00
Kristian Høgsberg
68af044a0c mesa: Remove glClear optimization based on drawable size
A drawable size of 0x0 means that we don't have buffers for a drawable yet,
not that we have a zero-sized buffer.  Core mesa shouldn't be optimizing out
drawing based on buffer size, since the draw call could be what triggers
the driver to go and get buffers.  As discussed in the referenced bug report,
the optimization was added as part of a scatter-shot attempt to fix a
different problem.  There's no other example in mesa core of using the
buffer size in this way.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74005
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7928b946ad)
2014-06-23 11:21:08 -07:00
Neil Roberts
151e7ac3cf i965: Set the fast clear color value for texture surfaces
When a multisampled texture is used for sampling the fast clear color value
needs to be programmed into the surface state. This was being left as all
zeroes so if the surface was cleared to a value other than black then it
wouldn't work properly. This doesn't matter for single-sample textures because
in that case the MCS buffer is resolved before it is used as a texture source.

https://bugs.freedesktop.org/show_bug.cgi?id=79729

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 765efeef88)
2014-06-23 11:21:08 -07:00
Michel Dänzer
920428a30a configure: Only check for OpenCL without LLVM when the latter is certain
LLVM is enabled by default for some architectures, but the test was failing
before that.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d399bb183)
2014-06-23 11:21:08 -07:00
Adrian Negreanu
77619d927b android, dricore: undefined reference to _mesa_streaming_load_memcpy
_mesa_streaming_load_memcpy is defined in main/streaming-load-memcpy.c
I'm adding it to the dricore lib

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 357a8b6f33)
2014-06-23 11:21:08 -07:00
Adrian Negreanu
e7537b3410 android, mesa_gen_matypes: pull in timespec POSIX definition
This fixes:
  include/c11/threads_posix.h: In function 'cnd_timedwait':
  include/c11/threads_posix.h:140:21: error: storage size of 'abs_time' isn't known

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 6eb3888c86)
2014-06-23 11:21:08 -07:00
Adrian Negreanu
39c599a666 android, egl: add correct drm include for libmesa_egl_dri2
Fixes:
  src/egl/drivers/dri2/platform_android.c:38:
  include/GL/internal/dri_interface.h:51:17:
    fatal error: drm.h: No such file or directory

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 4dc5545eff)
2014-06-23 11:21:08 -07:00
Adrian Negreanu
8e8fab2ef6 android: add src/gallium/auxiliary as include path for libmesa_dricore
This fixes:
In file included from
/home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_exec_api.c:445:0:
/home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_attrib_tmp.h:28:38:
fatal error: util/u_format_r11g11b10f.h: No such file or directory

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 0048483f73)
2014-06-23 11:21:08 -07:00
Adrian Negreanu
283acc26e4 android: add libloader to libGLES_mesa and libmesa_egl_dri2
This fixes
  src/egl/drivers/dri2/platform_android.c:664: error: undefined reference to 'loader_set_logger'
  src/egl/drivers/dri2/platform_android.c:678: error: undefined reference to 'loader_get_driver_for_fd'

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit a49ebfab1d)
2014-06-23 11:21:08 -07:00
Adrian Negreanu
9f5eea0cc9 android: adapt to the megadriver mechanism
Fixes linker error:
  ld:
  .../libmesa_dri_common_intermediates/libmesa_dri_common.a(dri_util.o):
    in function globalDriverAPI:dri_util.c(.data.rel+0x0): error:
    undefined reference to 'driDriverAPI'

As an example, you can see that mesa_dri_drivers
also uses common/libmegadriver_stub (src/mesa/drivers/dri/Makefile.am)

The _stub part might be confusing, but
it actually provides the dri-driver shared lib constructor,
megadriver_stub_init, which will later on load the real
platform dependent part and call
__driDriverGetExtensions_<platform>

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit aba0f152be)
2014-06-23 11:21:08 -07:00
Adrian Negreanu
e03020abbc add megadriver_stub_FILES
So that android part can also use $(megadriver_stub_FILES)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit eb3f80dbba)
2014-06-23 11:21:08 -07:00
Emil Velikov
e4c65664ea configure: error out when building opencl without LLVM
Cc: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 93257a56b5)
2014-06-23 11:21:08 -07:00
José Fonseca
f6bf295924 mesa/main: Prevent segfault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING).
A recent ApiTrace change, that tries to dump more buffer state,
causes Mesa from my distro (10.1.4) to segfault here.

I haven't actually confirmed this fixes it (I can't repro on master),
but it seems a good idea to be defensive here anyway.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit eb58aa9cf0)
2014-06-23 11:21:08 -07:00
José Fonseca
1d7b8bc085 mesa: Make glGetIntegerv(GL_*_ARRAY_SIZE) return GL_BGRA.
Same as b026b6bbfe, but for
COLOR_ARRAY_SIZE/SECONDARY_COLOR_ARRAY_SIZE.

Ideally we wouldn't munge the incoming state, so that we wouldn't need
to unmunge it back on glGet*.  But the array size state is copied and
referred to in many places, many of which couldn't take a GLenum like
GL_BGRA instead of a plain integer.  So just hack around on glGet*,
to ensure there is no risk of introducing regressions elsewhere.

This bug causes problems for Apitrace, resulting in wrong traces.  See
https://github.com/apitrace/apitrace/issues/261 for details.

Tested with piglit arb_vertex_array_bgra-get, which was created for this
purpose.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e3e13d6b85)
2014-06-23 11:21:03 -07:00
José Fonseca
2889608534 mesa/main: Make get_hash.c values constant.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 53468dee03)
2014-06-23 10:58:10 -07:00
Carl Worth
45bf29b53d docs: Add SHA256 checksums for the 10.1.5 release
Immediately after tagging the commit used to create the tar files.
2014-06-06 19:20:55 -07:00
Carl Worth
feb4c7284c Add release notes for the 10.1.5 release. 2014-06-06 19:10:33 -07:00
Carl Worth
b614628a3c Update version to 10.1.5
In preparation for the 10.1.5 release, of course.
2014-06-06 17:21:44 -07:00
Carl Worth
1f08d1bf46 Ignore a patch that is not needed for the 10.1 branch.
The function being modified does not exist in 10.1.
2014-06-06 17:14:31 -07:00
Carl Worth
a73894a7ed cherry-ignore: Ignore two commits.
The second of these two is simply a "git revert" of the first. So skipping
both of them gives us the same final result in a simpler way.
2014-06-02 13:03:06 -07:00
Roland Scheidegger
cf08c24750 llvmpipe: fix crash when not all attachments are populated in a fb
Framebuffers have been able to have NULL attachments for a while. llvmpipe
handled that properly for lp_rast_shade_quads_mask but it seems the change
didn't make it to lp_rast_shade_tile.
This fixes piglit fbo-drawbuffers-none test (though I need to increase
the FB_SIZE from 32 to 256 so the tris cover some tiles fully).
https://bugs.freedesktop.org/show_bug.cgi?id=79421

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 576868140b)
2014-06-02 11:37:13 -07:00
Pavel Popov
4d676c5ed2 i965: Fix Line Stipple enable bit in 3DSTATE_SF for Haswell.
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Pavel Popov <pavel.e.popov@intel.com>
(cherry picked from commit d292d40207)
2014-06-02 11:35:57 -07:00
Brian Paul
4942eae869 glsl: fix use-after-free bug/crash in ast_declarator_list::hir()
The call to get_variable_being_redeclared() may delete 'var' so we
can't reference var->name afterward.  We fix that by examining the
var's name before making that call.

Fixes valgrind warnings and possible crash when running the piglit
tests/spec/glsl-1.30/execution/clipping/vs-clip-distance-in-param.shader_test
test (and probably others).

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f9cecca7a6)
2014-06-02 11:31:55 -07:00
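The ordering fix in 4942eae869 can be sketched in plain C. Everything here is hypothetical: `redeclare` only mimics the property that matters (it may free the variable passed in), and the `gl_` prefix check is just an example of something that needs `var->name`. The pattern is to read what is needed from `var` before the call that may delete it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct variable { char *name; };

static struct variable *new_variable(const char *name)
{
    struct variable *v = malloc(sizeof *v);
    assert(v != NULL);
    v->name = malloc(strlen(name) + 1);
    assert(v->name != NULL);
    strcpy(v->name, name);
    return v;
}

/* Hypothetical stand-in for a redeclaration lookup: when the name
 * matches an earlier declaration, the new variable is freed and the
 * earlier one is returned. */
static struct variable *
redeclare(struct variable *var, struct variable *earlier)
{
    if (earlier != NULL && strcmp(var->name, earlier->name) == 0) {
        free(var->name);
        free(var);
        return earlier;
    }
    return NULL;
}

/* Examine var->name BEFORE the call that may free var. */
static bool declare(struct variable *var, struct variable *earlier)
{
    bool is_builtin = strncmp(var->name, "gl_", 3) == 0; /* read first */
    (void)redeclare(var, earlier);                       /* may free var */
    return is_builtin;              /* var is not touched after the call */
}
```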
Emil Velikov
1776a562b4 glx: do not leak dri3Display
v2: Do not wrap the code in ifdef HAVE_DRI3 (suggested by Keith)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit eb2241f8a9)
2014-06-02 11:30:15 -07:00
Pavel Popov
d2f5638ade i965: Properly return *RESET* status in glGetGraphicsResetStatusARB
The glGetGraphicsResetStatusARB from ARB_robustness extension always
returns GUILTY_CONTEXT_RESET_ARB and never returns NO_ERROR for guilty
context with LOSE_CONTEXT_ON_RESET_ARB strategy.  This is because Mesa
returns GUILTY_CONTEXT_RESET_ARB if batch_active != 0 whereas the kernel
driver never resets batch_active and this variable is always > 0 for a
guilty context.  The same behaviour can also be observed for batch_pending
and INNOCENT_CONTEXT_RESET_ARB.

But ARB_robustness spec says:

  If a reset status other than NO_ERROR is returned and subsequent calls
  return NO_ERROR, the context reset was encountered and completed. If a
  reset status is repeatedly returned, the context may be in the process
  of resetting.

  8. How should the application react to a reset context event?
  RESOLVED: For this extension, the application is expected to query the
  reset status until NO_ERROR is returned. If a reset is encountered, at
  least one *RESET* status will be returned. Once NO_ERROR is
  encountered, the application can safely destroy the old context and
  create a new one.

The main problem is that the context may be in the process of resetting
and in this case a reset status should be repeatedly returned.  But it
looks like the kernel driver returns nonzero active/pending only if the
context reset has already been encountered and completed.  For this
reason the *RESET* status cannot be repeatedly returned and should be
returned only once.

The reset_count and brw->reset_count variables can be used to ensure
that glGetGraphicsResetStatusARB returns *RESET* status only once for
each context.  Note that i915 triggers reset_count twice, which allows
returning the correct reset count immediately after active/pending have
been incremented.

v2 (idr): Trivial reformatting of comments.

Signed-off-by: Pavel Popov <pavel.e.popov@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8dc4a98c44)
2014-06-02 11:28:46 -07:00
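A minimal sketch of the "report each reset only once" idea from d2f5638ade. The structs and enum names are hypothetical; only `reset_count`, `active` and `pending` echo names used in the commit message. The context remembers the last reset count it reported and answers NO_ERROR until the kernel-side count changes again.

```c
#include <assert.h>
#include <stdint.h>

enum reset_status {
    STATUS_NO_ERROR,
    STATUS_GUILTY_CONTEXT_RESET,
    STATUS_INNOCENT_CONTEXT_RESET
};

/* Hypothetical mirror of the per-context values the kernel reports. */
struct kernel_reset_stats {
    uint32_t reset_count;
    uint32_t active;   /* batches active when a reset happened */
    uint32_t pending;  /* batches pending when a reset happened */
};

struct context {
    uint32_t reset_count;  /* reset count that was last reported */
};

static enum reset_status
get_graphics_reset_status(struct context *ctx,
                          const struct kernel_reset_stats *k)
{
    /* Only a reset count that has not been seen yet can produce a
     * *RESET* status; active/pending never drop back to zero. */
    if (k->reset_count != ctx->reset_count) {
        if (k->active != 0) {
            ctx->reset_count = k->reset_count;
            return STATUS_GUILTY_CONTEXT_RESET;
        }
        if (k->pending != 0) {
            ctx->reset_count = k->reset_count;
            return STATUS_INNOCENT_CONTEXT_RESET;
        }
    }
    return STATUS_NO_ERROR;
}
```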
James Legg
ee0207a212 mesa: Fix unbinding GL_DEPTH_STENCIL_ATTACHMENT
glFramebufferRender(..., GL_DEPTH_STENCIL_ATTACHMENT, ..., 0) only
detached the depth buffer and not the stencil buffer.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=79115
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 846c715abb)
2014-06-02 11:25:24 -07:00
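The behaviour fixed in ee0207a212 can be modelled with a small sketch. The `framebuffer` struct, the `ATT_*` enum and `framebuffer_renderbuffer` are hypothetical and much simpler than Mesa's framebuffer objects; renderbuffer name 0 stands for "detach". A combined depth-stencil attachment point must update both slots on attach and on detach.

```c
#include <assert.h>

enum attachment { ATT_DEPTH, ATT_STENCIL, ATT_DEPTH_STENCIL };

/* Hypothetical framebuffer with one renderbuffer name per slot;
 * 0 means nothing is attached. */
struct framebuffer {
    unsigned depth_rb;
    unsigned stencil_rb;
};

static void
framebuffer_renderbuffer(struct framebuffer *fb, enum attachment att,
                         unsigned rb)
{
    /* The combined attachment point touches BOTH slots, including the
     * detach case (rb == 0). */
    if (att == ATT_DEPTH || att == ATT_DEPTH_STENCIL)
        fb->depth_rb = rb;
    if (att == ATT_STENCIL || att == ATT_DEPTH_STENCIL)
        fb->stencil_rb = rb;
}
```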
Ilia Mirkin
36e0e9c5e7 nv50/ir: fix constant folding for OP_MUL subop HIGH
These instructions can come in either through IMUL_HI/UMUL_HI TGSI
opcodes, or from OP_DIV constant folding.

Also make sure that the constant foldings which delete the original
instruction still get counted as having done something.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit d2a3de19c6)
2014-05-22 11:33:40 -07:00
Ilia Mirkin
5d8e60dcc7 nv50/ir: fix s32 x s32 -> high s32 multiply logic
Retrieving the high 32 bits of a signed multiply is rather annoying. It
appears that the simplest way to do this is to compute the absolute
value of the arguments, and perform a u32 x u32 -> u64 operation. If the
arguments' signs differ, then negate the result. Since there is no u64
support in the cvt instruction, we have to perform the 2's complement
negation "by hand".

This logic can come into use by the IMUL_HI instruction (very unlikely
to be seen), as well as from constant folding of division by a constant.
Fixes dolphin's divisions by 255.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit d3a5cf052c)
2014-05-22 11:32:51 -07:00
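The approach described in 5d8e60dcc7 can be sketched in portable C. This is an illustration of the arithmetic only, not the nv50 lowering code; `mul_high_s32` is a hypothetical name. It takes absolute values, multiplies as unsigned 32 x 32 -> 64, and negates the 64-bit result on its two 32-bit halves when the signs differ.

```c
#include <assert.h>
#include <stdint.h>

/* High 32 bits of a signed 32 x 32 multiply, built from an unsigned
 * multiply plus a "by hand" two's complement negation. */
static int32_t mul_high_s32(int32_t a, int32_t b)
{
    /* Absolute values as unsigned; negating in unsigned arithmetic is
     * well defined even for INT32_MIN. */
    uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;

    uint64_t prod = (uint64_t)ua * ub;
    uint32_t lo = (uint32_t)prod;
    uint32_t hi = (uint32_t)(prod >> 32);

    if ((a < 0) != (b < 0)) {
        /* Negate the 64-bit value held in (hi, lo): invert both halves,
         * add 1 to the low half and carry into the high half when the
         * low half wraps to zero. */
        lo = ~lo + 1u;
        hi = ~hi + (lo == 0 ? 1u : 0u);
    }

    /* Reinterpret the high word as signed (two's complement targets). */
    return (int32_t)hi;
}
```

For example, `mul_high_s32(-1073741824, 8)` multiplies -2^30 by 8, giving -2^33, whose high word is -2.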
Carl Worth
a23e73e00d Merge remote-tracking branch 'origin/10.1' into 10.1 2014-05-20 15:28:57 -07:00
Carl Worth
a02f6639f7 docs: Add md5sums for 10.1.4 release
After making the tar files.
2014-05-20 15:26:00 -07:00
Carl Worth
cc9b282f8a docs: Add release notes for the 10.1.4 release. 2014-05-20 14:22:34 -07:00
Carl Worth
edab352b25 VERSION: Update to 10.1.4
In preparation for the 10.1.4 release.
2014-05-20 14:19:05 -07:00
Jeremy Huddleston Sequoia
ec83a39e2b darwin: Fix test for kCGLPFAOpenGLProfile support at runtime
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit 7a109268ab)
2014-05-20 10:55:27 -07:00
Jeremy Huddleston Sequoia
ea5839c8fe glapi: Avoid heap corruption in _glapi_table
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Chia-I Wu <olv@lunarg.com>
(cherry picked from commit ff5456d1ac)
2014-05-20 01:39:32 -07:00
Ilia Mirkin
2d6f733979 nv50/ir: fix integer mul lowering for u32 x u32 -> high u32
UNION appears to expect that all of its sources are conditionally
defined. Otherwise it inserts an unpredicated mov instruction which
overwrites the desired result. This fixes tests that use UMUL_HI, and
much less directly, unsigned integer division by a constant, which uses
this functionality in a peephole pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit 5b8f1a0f7c)
2014-05-19 14:50:03 -07:00
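To show why unsigned division by a constant depends on a correct u32 x u32 -> high u32 multiply, here is a sketch using the widely known multiply-and-shift identity for dividing by 255. The helper names are hypothetical, and the magic constant is the standard one for this divisor, not necessarily what the nv50 peephole pass emits.

```c
#include <assert.h>
#include <stdint.h>

/* High 32 bits of an unsigned 32 x 32 multiply (what UMUL_HI computes). */
static uint32_t mul_high_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* x / 255 without a divide: 0x80808081 is ceil(2^39 / 255), so taking
 * the high word and shifting right by 7 divides by 2^39 overall. */
static uint32_t div255(uint32_t x)
{
    return mul_high_u32(x, 0x80808081u) >> 7;
}
```

If the high-multiply lowering is wrong, every division rewritten this way produces a wrong quotient, which is how a bug in UMUL_HI surfaces in shaders that only contain a plain division.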
Carl Worth
507d2e523c cherry-ignore: Roland and Michel agreed to drop these patches.
The first was apparently not entirely suitable for stable, (and buggy). And
the second existed only to fix a bug in the first. So without the first, we
don't need either.
2014-05-16 17:22:12 -07:00
Brian Paul
07ada102cb mesa: fix double-freeing of dispatch tables inside glBegin/End.
We allocate dispatch tables for BeginEnd and OutsideBeginEnd.  But
when we destroy the context we were freeing the BeginEnd and Exec
tables.  If Exec==BeginEnd we did a double-free.  This would happen
if the context was destroyed while inside a glBegin/End pair.  Now
free the BeginEnd and OutsideBeginEnd pointers.

Cc: "10.1", "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit ef6b6658f9)
2014-05-16 16:07:30 -07:00
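The ownership rule behind 07ada102cb can be sketched as follows. The struct and functions are hypothetical simplifications; only the field names `BeginEnd`, `OutsideBeginEnd` and `Exec` echo the commit message. `Exec` is treated as an alias of one of the two allocations, so teardown frees the two owning pointers and never frees through the alias.

```c
#include <assert.h>
#include <stdlib.h>

struct dispatch_table { int placeholder; };

struct context {
    struct dispatch_table *BeginEnd;         /* owned allocation */
    struct dispatch_table *OutsideBeginEnd;  /* owned allocation */
    struct dispatch_table *Exec;             /* alias of one of the above */
};

static int free_calls;  /* counts frees so the example can be checked */

static void free_table(struct dispatch_table *t)
{
    if (t != NULL) {
        free(t);
        free_calls++;
    }
}

static void context_init(struct context *ctx)
{
    ctx->BeginEnd = calloc(1, sizeof *ctx->BeginEnd);
    ctx->OutsideBeginEnd = calloc(1, sizeof *ctx->OutsideBeginEnd);
    assert(ctx->BeginEnd != NULL && ctx->OutsideBeginEnd != NULL);
    ctx->Exec = ctx->OutsideBeginEnd;
}

/* Entering a glBegin/End pair switches the alias, not the ownership. */
static void context_begin(struct context *ctx)
{
    ctx->Exec = ctx->BeginEnd;
}

static void context_destroy(struct context *ctx)
{
    /* Free only the owners. Freeing BeginEnd and Exec instead would
     * free the same table twice when Exec == BeginEnd. */
    free_table(ctx->BeginEnd);
    free_table(ctx->OutsideBeginEnd);
    ctx->BeginEnd = ctx->OutsideBeginEnd = ctx->Exec = NULL;
}
```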
Eric Anholt
13b142a420 i965/vs: Use samplers for UBOs in the VS like we do for non-UBO pulls.
Improves performance of a dolphin emulator trace I had lying around by
3.60131% +/- 0.995887% (n=128).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9245206cbf)
2014-05-14 12:53:08 -07:00
Michel Dänzer
1ba2298131 radeonsi: Fix anisotropic filtering state setup
Bring it back in line with r600g. I broke this in the original radeonsi
bringup. :(

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78537

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c5828b0599)
2014-05-14 12:42:02 -07:00
Ilia Mirkin
736e16288b nv50: fix setting of texture ms info to be per-stage
Different textures may be bound to each slot for each stage. So we need
to be able to upload ms parameters for each one without stages
overwriting each other.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 863573b9cb)
2014-05-14 12:39:11 -07:00
Ilia Mirkin
7396efb19a nv50/ir: make sure to reverse cond codes on all the OP_SET variants
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68f47cad0d)
2014-05-13 17:35:21 -07:00
Emil Velikov
c8e24aa5a9 configure: error out if building GBM without dri
Both backends require --enable-dri, and building an empty libgbm
makes little to no sense. Error out at configure to prevent the
user from shooting themselves in the foot.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78225
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e477d12c33)
2014-05-13 17:34:58 -07:00
Tom Stellard
16dfaf495a radeonsi: Enable geometry shaders with LLVM 3.4.1
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 93c2ebbd83)
2014-05-13 17:34:31 -07:00
Tom Stellard
f3eb3455c8 configure.ac: Add LLVM_VERSION_PATCH to DEFINES
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c5d0008325)

Conflicts:
	configure.ac
2014-05-13 17:33:52 -07:00
Carl Worth
9e1eb6fb93 docs: Add MD5 sums for 10.1.3
Just after making the release tar files.
2014-05-09 07:41:41 -07:00
Carl Worth
0028eb1083 docs: Add release notes for Mesa 10.1.3.
This is an emergency release to make a performance-regression fix available.
2014-05-09 07:38:16 -07:00
Carl Worth
d4c7ca04c1 VERSION: Update to 10.1.3
For the emergency 10.1.3 release.
2014-05-09 07:17:36 -07:00
Thomas Hellstrom
e16de70a90 st/xa: Fix performance regression introduced by commit "Cache render target surface"
The mentioned commit has the nasty side-effect of turning off accelerated
copies.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 9306b7c171)
2014-05-09 07:16:13 -07:00
Kenneth Graunke
f7b949723a i965: Fix depth (array slices) computation for 1D_ARRAY render targets.
1D array targets store the number of slices in the Height field.

Fixes Piglit's spec/!OpenGL 3.2/layered-rendering/clear-color-all-types
1d_array single_level, at least when used with Meta clears.

Cc: "10.2 10.1 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit e6967270c7)
2014-05-08 08:24:31 -07:00
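The rule stated in f7b949723a (and repeated in 558c20fa95) is that 1D array targets keep their slice count in the Height field. A minimal sketch, with a hypothetical `tex_target` enum and `tex_image` struct standing in for the real Mesa types:

```c
#include <assert.h>

/* Hypothetical target enum and image description for illustration. */
enum tex_target { TARGET_2D, TARGET_1D_ARRAY, TARGET_2D_ARRAY };

struct tex_image {
    unsigned Width;
    unsigned Height;
    unsigned Depth;
};

/* Number of array slices (layers) in an image. */
static unsigned image_num_layers(enum tex_target target,
                                 const struct tex_image *img)
{
    switch (target) {
    case TARGET_1D_ARRAY:
        return img->Height;  /* slices live in Height, not Depth */
    case TARGET_2D_ARRAY:
        return img->Depth;
    default:
        return 1;            /* non-array targets have a single layer */
    }
}
```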
Kenneth Graunke
558c20fa95 mesa: Fix MaxNumLayers for 1D array textures.
1D array targets store the number of slices in the Height field.

Cc: "10.2 10.1 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 5c399ca8e4)
2014-05-08 08:24:03 -07:00
Tapani Pälli
31462dc748 glsl: fix bogus layout qualifier warnings
Print out GL_ARB_explicit_attrib_location warnings only
when parsing attribute that uses "location" qualifier.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77245
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e65917f94e)
2014-05-07 15:42:31 -07:00
Carl Worth
994203bf5e get-pick-list.sh: Require explicit "10.1" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as being
aimed at the 10.2 branch, (which was recently opened).
2014-05-05 13:27:11 -07:00
Carl Worth
08da743a97 docs: Add MD5 sums for Mesa 10.1.2
Immediately after creating the 10.1.2 tar files.
2014-05-05 11:30:13 -07:00
Carl Worth
bde3135717 docs: Add notes for the 10.1.2 release. 2014-05-05 11:23:21 -07:00
Carl Worth
75049062d5 Update VERSION to 10.1.2
In preparation for the 10.1.2 release.
2014-05-05 11:23:20 -07:00
Ian Romanick
3d648f0f50 dri3: Enable GLX_MESA_query_renderer on DRI3 too
This should have happened around the time of commit 4680d23, but Keith's
DRI3 patches and my GLX_MESA_query_renderer patches crossed in the mail.

I don't have a working DRI3 setup, so I haven't been able to actually
verify this.  I'm hoping that someone can piglit this for me on DRI3...
It's also unfortunate that DRI2 and DRI3 can't share more code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Keith Packard <keithp@keithp.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 625bdd64e5)

Conflicts:
	src/glx/dri3_glx.c

During the cherry-pick, the following commit was squashed in as well:

glx: Conditionally compile GLX_MESA_query_renderer DRI3 support

Missed out with commit 625bdd64e5.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 0b307afd57)
2014-05-05 11:23:20 -07:00
Anuj Phogat
ff117336b7 glsl: Apply the link error conditions to GL_ARB_fragment_coord_conventions
Link error conditions added in previous patch are equally applicable
to GL_ARB_fragment_coord_conventions implementation. Extension's spec
says:
   "If gl_FragCoord is redeclared in any fragment shader in a program,
    it must be redeclared in all the fragment shaders in that program
    that have a static use of gl_FragCoord. All redeclarations of
    gl_FragCoord in all fragment shaders in a single program must have
    the same set of qualifiers."

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 9bcb0a8532 with some
manual backporting)
2014-05-05 11:23:20 -07:00
Anuj Phogat
2cd8ce4c67 glsl: Link error if fs defines conflicting qualifiers for gl_FragCoord
GLSL 1.50 spec says:
   "If gl_FragCoord is redeclared in any fragment shader in a program,
    it must be redeclared in all the fragment shaders in that
    program that have a static use gl_FragCoord. All redeclarations of
    gl_FragCoord in all fragment shaders in a single program must
    have the same set of qualifiers."

This patch causes the shader link to fail if we have multiple fragment
shaders with conflicting layout qualifiers for gl_FragCoord.

V2: Restructure the code and add conditions to correctly handle the
    following case:

fragment shader 1:
layout(origin_upper_left) in vec4 gl_FragCoord;
void main()
{
    foo();
    gl_FragColor = gl_FragCoord;
}

fragment shader 2:
layout(pixel_center_integer) in vec4 gl_FragCoord;
void foo()
{
}

V3:
Allow linking in the following case:
fragment shader 1:
void main()
{
    foo();
    gl_FragColor = gl_FragCoord;
}

fragment shader 2:
in vec4 gl_FragCoord;
void foo()
{
   ...
}

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

(cherry picked from commit 35f11e85cb with some
manual backporting)
2014-05-05 11:23:20 -07:00
Anuj Phogat
ec70be5628 glsl: Use switch to allow adding more shader types
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
2014-05-05 11:23:20 -07:00
Carl Worth
2d9bfe4bf4 cherry-ignore: Drop an ignored patch now that piglit has been updated.
This patch was ignored when we saw it causing a piglit test to regress. That
piglit test has been determined to have been incorrect. It has been fixed so
that this patch is now clearly a bug fix, not a regression.
2014-05-05 11:23:20 -07:00
Anuj Phogat
efba496d03 i965: Add glBlitFramebuffer to commands affected by conditional rendering
Fixes failures in Khronos OpenGL CTS test conditional_render_test9

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1d350b9e22)
2014-05-05 11:23:20 -07:00
Anuj Phogat
853c313ce3 mesa: Allow FLOAT_32_UNSIGNED_INT_24_8_REV in get_tex_depth_stencil()
Fixes a crash in Khronos OpenGL CTS packed_pixels tests.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit c1743707a1)
2014-05-05 11:23:20 -07:00
Anuj Phogat
2981bc9ff8 mesa: Add support to unpack depth-stencil texture in to FLOAT_32_UNSIGNED_INT_24_8_REV
V2: Follow the new naming convention for unpack functions.
    Use double precision for converting Z24 to a float.
V3: Unpack stencil value to most significant byte.
    Use 'struct z32f_x24s8' type.
V4: Unpack stencil value to least significant byte.
    Add a comment to clarify stencil packing.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 29b8e894d1)
2014-05-05 11:23:20 -07:00
Anuj Phogat
18e6cd5e61 mesa: Add new helper function _mesa_unpack_depth_stencil_row()
This patch makes non-functional changes in the code. New helper
function added here will make it easier to support more data
types in the following patches.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7a8045d2f7)
2014-05-05 11:23:20 -07:00
Anuj Phogat
4ee60a14df mesa: Allow srcFormat=GL_DEPTH_STENCIL in _mesa_texstore_xx_xx() functions
_mesa_texstore_z24_s8() and _mesa_texstore_z32f_x24s8() are capable of
handling GL_DEPTH_STENCIL format. So, allow it in both the functions.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 1a8f9ba9b3)
2014-05-05 11:23:20 -07:00
Anuj Phogat
2b5ad9baa1 mesa: Add missing types in _mesa_texstore_xx_xx() functions
Depth-stencil texture targets are allowed to use source data of type
GL_UNSIGNED_INT_24_8_EXT and GL_FLOAT_32_UNSIGNED_INT_24_8_REV.

Fixes a few crashes in Khronos OpenGL CTS packed_pixels tests.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit aeb9d4495d)
2014-05-05 11:23:20 -07:00
Anuj Phogat
87173023b2 i965: Fix crash in do_blit_readpixels()
Fixes a crash in Khronos CTS packed_pixels tests.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit d714b20eb4)
2014-05-05 11:23:20 -07:00
Anuj Phogat
51e80d1a8b mesa: Add error condition for format=STENCIL_INDEX in glGetTexImage()
From OpenGL 4.0 spec, page 306:
   "Calling GetTexImage with a format of STENCIL_INDEX
    causes the error INVALID_ENUM."

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5388fc157e)
2014-05-05 11:23:19 -07:00
Anuj Phogat
755bf62c2e mesa: Add entry for extension ARB_texture_stencil8
V2: Alphabetize the new entry

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 340658e44f)
2014-05-05 11:23:19 -07:00
Anuj Phogat
81f98ffb61 glsl: Compile error if fs uses gl_FragCoord before first redeclaration
Section 4.3.8.1, page 39 of GLSL 1.50 spec says:
  "Within any shader, the first redeclarations of gl_FragCoord
   must appear before any use of gl_FragCoord."

The GLSL compiler should generate an error in the following case:

vec4 p = gl_FragCoord;
layout(origin_upper_left) in vec4 gl_FragCoord;

void main()
{
}

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit a751adf071)
2014-05-05 11:23:19 -07:00
Anuj Phogat
072a79b188 glsl: Compile error if fs defines conflicting qualifiers for gl_FragCoord
GLSL 1.50 spec says:
   "If gl_FragCoord is redeclared in any fragment shader in a program,
    it must be redeclared in all the fragment shaders in that
    program that have a static use gl_FragCoord. All redeclarations of
    gl_FragCoord in all fragment shaders in a single program must
    have the same set of qualifiers."

This patch makes the GLSL compiler generate an error if a fragment
shader is defined with conflicting layout qualifier declarations
for gl_FragCoord. For example:

layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;
layout(pixel_center_integer) in vec4 gl_FragCoord;

void main()
{
}

V2: Some code refactoring for better readability.
    Add compiler error conditions for redeclarations like:

layout(origin_upper_left) in vec4 gl_FragCoord;
layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;

and

in vec4 gl_FragCoord;
layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;

V3: Simplify function is_conflicting_fragcoord_redeclaration()
V4: Check for null pointer before doing strcmp(var->name, "gl_FragCoord").
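
A minimal C sketch of the per-shader check, assuming the qualifiers of each
redeclaration are collected into a bitmask. The names below are hypothetical
and this is not the body of is_conflicting_fragcoord_redeclaration(); it only
illustrates that every example above reduces to "the qualifier set differs
from the one seen in an earlier redeclaration".

```c
#include <stdbool.h>

/* Hypothetical qualifier bits for the sketch. */
enum {
    FRAGCOORD_ORIGIN_UPPER_LEFT    = 1 << 0,
    FRAGCOORD_PIXEL_CENTER_INTEGER = 1 << 1
};

/* Hypothetical per-shader state tracked while parsing. */
struct fragcoord_state {
    bool redeclared;       /* a redeclaration has already been seen */
    unsigned qualifiers;   /* qualifier bits of that redeclaration */
};

/* Records a redeclaration of gl_FragCoord.
 * Returns false when it conflicts with an earlier redeclaration. */
bool record_fragcoord_redeclaration(struct fragcoord_state *st,
                                    unsigned qualifiers)
{
    if (st->redeclared && st->qualifiers != qualifiers)
        return false;

    st->redeclared = true;
    st->qualifiers = qualifiers;
    return true;
}
```

Repeating an identical redeclaration is accepted by this sketch; whether the
real compiler accepts it is not stated above.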

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 581e4acb0d)
2014-05-05 11:23:19 -07:00
Anuj Phogat
87655f1805 mesa: Use location VERT_ATTRIB_GENERIC0 for vertex attribute 0
In OpenGL 3.1 attribute 0 becomes non-magic, just like in
OpenGL ES 2.0. Earlier versions of OpenGL used attribute 0
exclusively for vertex position.

V2: Add a utility function _mesa_attr_zero_aliases_vertex() in
    varray.h

Fixes 4 Khronos OpenGL CTS failures:
glGetVertexAttrib
depth24_basic
depth24_precision
rgb8_rgba8_rgb

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 49c71050de)
2014-05-05 11:23:19 -07:00
Anuj Phogat
3c6c42b86d mesa: Fix querying location of nth element of an array variable
This patch makes changes to the behavior of glGetAttribLocation(),
glGetFragDataLocation() and glGetFragDataIndex() functions.

The code changes handle the case described in the following example:

shader program:
layout(location = 1) in vec4[4] a;
void main()
{
}

Currently, glGetAttribLocation("a") returns 1.
glGetAttribLocation("a[i]"), where i = {0, 1, 2, 3}, returns -1.
But the expected locations for array elements are: 1, 2, 3 and 4
respectively.

This clarification came up with the addition of
ARB_program_interface_query to OpenGL 4.3.

From Page 326 (page 347 of the PDF) of OpenGL 4.3 spec:
   "Otherwise, the command is equivalent to

    GetProgramResourceLocation(program, PROGRAM_INPUT, name);"

And, From Page 101 (page 122 of the PDF) of OpenGL 4.3 spec:

   "A string provided to GetProgramResourceLocation or
    GetProgramResourceLocationIndex is considered to match an active
    variable if

    • the string exactly matches the name of the active variable;
    • if the string identifies the base name of an active array, where
      the string would exactly match the name of the variable if the
      suffix "[0]" were appended to the string; or
    • if the string identifies an active element of the array, where
      the string ends with the concatenation of the "[" character, an
      integer (with no "+" sign, extra leading zeroes, or whitespace)
      identifying an array element, and the "]" character, the integer
      is less than the number of active elements of the array variable,
      and where the string would exactly match the enumerated name of
      the array if the decimal integer were replaced with zero."

V2: Simplify get_matching_index() function.
    Add relevant text from OpenGL spec in commit message.
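
The matching rules quoted above can be sketched in C as below. This is not
the get_matching_index() from the patch; match_array_index() and
query_location() are hypothetical helpers written only to illustrate the
spec text, and the sketch assumes a single array variable whose elements
occupy consecutive locations starting at the base location.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns the array index named by `query`, 0 when `query` is the bare base
 * name (treated like "name[0]"), or -1 when `query` does not match. */
long match_array_index(const char *query, const char *base_name,
                       size_t array_len)
{
    size_t base_len = strlen(base_name);

    if (strncmp(query, base_name, base_len) != 0)
        return -1;

    const char *p = query + base_len;
    if (*p == '\0')
        return 0;                      /* "a" matches as if it were "a[0]" */
    if (*p != '[')
        return -1;
    p++;

    /* Rejects an empty index, a "+" sign and leading whitespace. */
    if (!isdigit((unsigned char)*p))
        return -1;
    /* Rejects extra leading zeroes such as "a[01]". */
    if (p[0] == '0' && isdigit((unsigned char)p[1]))
        return -1;

    unsigned long index = 0;
    while (isdigit((unsigned char)*p)) {
        index = index * 10 + (unsigned long)(*p - '0');
        if (index >= array_len)
            return -1;                 /* not an active element */
        p++;
    }

    /* The string must end right after the closing bracket. */
    if (p[0] != ']' || p[1] != '\0')
        return -1;

    return (long)index;
}

/* Location of `query`, assuming consecutive locations per element. */
int query_location(const char *query, const char *base_name,
                   int base_location, size_t array_len)
{
    long index = match_array_index(query, base_name, array_len);
    return index < 0 ? -1 : base_location + (int)index;
}
```

For the shader above, query_location("a[2]", "a", 1, 4) gives 3, and "a"
gives the base location 1.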

Fixes failures in Khronos OpenGL CTS tests:
explicit_attrib_location_room
draw_instanced_max_vertex_attribs

NVIDIA's proprietary Linux driver (331.49) matches the behavior
expected by the OpenGL 4.3 spec.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit dc75479b7a)
2014-05-05 11:23:19 -07:00
Anuj Phogat
8792eda0eb glsl: Allow overlapping locations for vertex input attributes
Currently, overlapping locations of input variables are disallowed for all
shader types in OpenGL and OpenGL ES.

From OpenGL ES 3.0 spec, page 56:
   "Binding more than one attribute name to the same location is referred
    to as aliasing, and is not permitted in OpenGL ES Shading Language
    3.00 vertex shaders. LinkProgram will fail when this condition exists.
    However, aliasing is possible in OpenGL ES Shading Language 1.00 vertex
    shaders."

Taking into account what different versions of OpenGL and OpenGL ES specs
say about aliasing:
   - It is allowed only on vertex shader input attributes in OpenGL (2.0 and
     above) and OpenGL ES 2.0.
   - It is explicitly disallowed in OpenGL ES 3.0.

Fixes Khronos CTS failing test:
explicit_attrib_location_vertex_input_aliased.test
See the Khronos bug referenced below for more details.

V2: Fix the case where location exceeds the maximum allowed attribute
    location.
V3: Simplify the condition added in V2.
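
The aliasing rules summarized above can be written as a small predicate.
This is a sketch under stated assumptions, not the condition used in the
patch: the enums and input_location_aliasing_allowed() are hypothetical, and
only the major version is considered.

```c
#include <stdbool.h>

/* Hypothetical identifiers for the sketch; not Mesa enums. */
enum sketch_api   { SKETCH_API_GL, SKETCH_API_GLES };
enum sketch_stage { SKETCH_STAGE_VERTEX, SKETCH_STAGE_GEOMETRY,
                    SKETCH_STAGE_FRAGMENT };

/* Returns true when two inputs may share a location. */
bool input_location_aliasing_allowed(enum sketch_api api, int major_version,
                                     enum sketch_stage stage)
{
    /* Aliasing is only ever allowed for vertex shader input attributes. */
    if (stage != SKETCH_STAGE_VERTEX)
        return false;

    /* OpenGL ES 2.0 allows it, OpenGL ES 3.0 explicitly disallows it. */
    if (api == SKETCH_API_GLES)
        return major_version == 2;

    /* Desktop OpenGL 2.0 and above allow it. */
    return major_version >= 2;
}
```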

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: Khronos #9609
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 8c61b6a99b)
2014-05-05 11:23:19 -07:00
Kenneth Graunke
0860c95d5c i965: Actually emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS.
For platforms using hardware contexts (currently Gen6+), we failed to
emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS, instead emitting MI_NOOP
for both.

During one of the context initialization reordering patches, we
accidentally moved brw_init_state before we set brw->CMD_PIPELINE_SELECT
and brw->CMD_VF_STATISTICS.  So, when brw_init_state uploaded initial
GPU state (brw_init_state -> brw_upload_initial_gpu_state ->
brw_upload_invariant_state), these would be 0 (MI_NOOP).

Storing the commands in the context is not worthwhile.  We have many
generation checks in our state upload code, and for platforms with
hardware contexts, this only gets called once per GL context anyway.
The cost is negligible, and it's easy to botch context creation
ordering.

This may fix hangs on Gen6+ when using the media pipeline.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit ac30e1adb4)

Conflicts:
	src/mesa/drivers/dri/i965/brw_context.h
2014-05-05 11:23:19 -07:00
Kenneth Graunke
f4b0b3a402 i965: Don't enable reset notification support on Gen4-5.
arekm reported that using Chrome with GPU acceleration enabled on GM45
triggered the hw_ctx != NULL assertion in brw_get_graphics_reset_status.

We definitely do not want to advertise reset notification support on
Gen4-5 systems, since it needs hardware contexts, and we never even
request a hardware context on those systems.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75723
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0380ec467d)
2014-05-05 11:23:19 -07:00
Eric Anholt
f9f10f681e i965: Fix render-to-texture in non-FinishRenderTexture cases.
We've had several problems now with FinishRenderTexture not getting called
enough, and we're ready to just give up on it ever doing what we need.  In
particular, an upcoming Steam title had rendering bugs that could be fixed
by always_flush_cache=true.

Instead of hoping Mesa core can figure out when we need to flush our
caches, just track what BOs we've rendered to in a set, and when we render
from a BO in that set, emit a flush and clear the set.

There's some overhead to keeping this set, but most of that is just
hashing the pointer -- it turns out our set never even gets very large,
because cache flushes are so common (even on cairo-gl).

No statistically significant performance difference in cairo-gl (n=100),
despite spending ~.5% CPU in these set operations.
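
The tracking scheme can be sketched in C as follows. The patch uses a hash
set keyed on the BO pointer; this sketch swaps in a small fixed-size array
with a linear search to stay self-contained, and every name in it is
hypothetical rather than taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>

#define RENDER_CACHE_MAX 64   /* arbitrary capacity chosen for the sketch */

struct render_cache {
    const void *bos[RENDER_CACHE_MAX];  /* BOs rendered to since last flush */
    size_t count;
    unsigned flushes;                   /* stands in for a real cache flush */
};

static bool render_cache_contains(const struct render_cache *rc,
                                  const void *bo)
{
    for (size_t i = 0; i < rc->count; i++) {
        if (rc->bos[i] == bo)
            return true;
    }
    return false;
}

/* A flush makes all earlier rendering visible, so the set can be emptied. */
static void render_cache_flush(struct render_cache *rc)
{
    rc->flushes++;
    rc->count = 0;
}

/* Call when `bo` is about to be used as a render target. */
void render_cache_add(struct render_cache *rc, const void *bo)
{
    if (render_cache_contains(rc, bo))
        return;
    if (rc->count == RENDER_CACHE_MAX)
        render_cache_flush(rc);         /* sketch-only fallback when full */
    rc->bos[rc->count++] = bo;
}

/* Call when `bo` is about to be read, for example sampled as a texture. */
void render_cache_check_flush(struct render_cache *rc, const void *bo)
{
    if (render_cache_contains(rc, bo))
        render_cache_flush(rc);
}
```

Reading a BO that was never rendered to costs only the lookup, which matches
the observation above that the set stays small.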

v1: (Original patch by Eric Anholt.)
v2: (Changes by Ken Graunke.)
  - Rebase forward from May 7th 2013 -> March 4th 2014.
  - Drop the FinishRenderTexture hook entirely; after rebasing the
    patch, the hook was just an empty function.
  - Move the brw_render_cache_set_clear() call from
    intel_batchbuffer_emit_flush() to brw_emit_pipe_control_flush().
    In theory, this could catch more cases where we've flushed.
  - Consider stencil as a possible texturing source.
v3: (changes by anholt):
  - Move set_clear() back to emit_mi_flush() -- it means we can drop
    more forced flushes from the code.  In the previous location, it
    wouldn't have been called when we wanted pre-gen6.
  - Move the set clear from batch init to reset -- it should be empty at
    the start of every batch, since the kernel handled any inter-batch
    flush for us.
v4: Drop the debug code in set.c that I accidentally committed.
v5: Back port to 10.1 stable branch (remove reference to stencil texture.)

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Dylan Baker <baker.dylan.c@gmail.com> [v2]

Conflicts:
	src/mesa/drivers/dri/i965/brw_draw.c
	src/mesa/drivers/dri/i965/intel_fbo.h
2014-05-05 11:23:19 -07:00
Carl Worth
996fbd4e2b cherry-ignore: Ignore a patch causing a regression
This may be just a bogus test. I'm waiting to see what the bugzilla decides:

https://bugs.freedesktop.org/show_bug.cgi?id=77702
2014-05-05 11:23:19 -07:00
Michel Dänzer
e19c702eac st/mesa: Fix NULL pointer dereference for incomplete framebuffers
This can happen with glamor, which uses EGL_KHR_surfaceless_context and
only explicitly binds GL_READ_FRAMEBUFFER for glReadPixels.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 136c437cea)
2014-05-05 11:23:19 -07:00
Ander Conselvan de Oliveira
5d680bc082 egl: Protect use of gbm_dri with ifdef HAVE_DRM_PLATFORM
Otherwise it fails to compile if the drm egl platform is disabled.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 17860309f1)
2014-05-05 11:23:19 -07:00
Neil Roberts
e43327bdd9 wayland: Fix the logic in disabling the prime capability
It looks like this bit of code is trying to disable the prime capability if
the driver doesn't support createImageFromFds. However the logic looks a bit
broken and what it would actually do is disable all other capabilities apart
from prime. This patch fixes it to actually disable prime.
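
One way a bug of this shape can arise is a missing bitwise NOT when masking
a capability out of a bitfield. The sketch below only illustrates that
idiom; the CAP_* values and both functions are hypothetical and are not
copied from the wayland code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits for the sketch. */
#define CAP_PRIME (1u << 0)
#define CAP_OTHER (1u << 1)

/* Broken variant: keeps only the prime bit and clears everything else. */
uint32_t disable_prime_buggy(uint32_t caps, bool can_import_fds)
{
    if (!can_import_fds)
        caps &= CAP_PRIME;
    return caps;
}

/* Intended variant: clears the prime bit and keeps everything else. */
uint32_t disable_prime_fixed(uint32_t caps, bool can_import_fds)
{
    if (!can_import_fds)
        caps &= ~CAP_PRIME;
    return caps;
}
```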

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 63d4661ab2)
2014-05-05 11:23:18 -07:00
Ander Conselvan de Oliveira
c8ac5294eb gbm/dri: Fix out-of-memory error path in dri_device_create()
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit aa91fe1c09)
2014-05-05 11:23:18 -07:00
Marek Olšák
d404180430 r600g: fix hang on RV740 by using DX_RASTERIZATION_KILL instead of SX_MISC
Changing SX_MISC hangs RV740. While we're at it, let's use DX_RASTERIZATION_KILL
on all R700 and later chipsets.

Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3a3b1bf60e)
2014-05-05 11:23:18 -07:00
Marek Olšák
c7adf5d1c7 r600g: fix for an MSAA hang on RV770
Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3d0c4f3b01)
2014-05-05 11:23:18 -07:00
Marek Olšák
62ba29b236 r600g: fix for broken CULL_FRONT behavior on R6xx
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ecc8a37ec5)

Conflicts:
	src/gallium/drivers/r600/r600_pipe.h
2014-05-05 11:23:18 -07:00
Marek Olšák
8ea3790d49 r600g: fix buffer copying on R600-R700
This fixes broken rendering in DOTA 2.

Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0967970768)
2014-05-05 11:23:11 -07:00
Marek Olšák
110b6af5f4 r600g: fix flushing on RV670, RS780, RS880 again
Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 042e40f67b)
2014-04-30 15:39:28 -07:00
Marek Olšák
5e688c0601 r600g: fix MSAA resolve on R6xx when the destination is 1D-tiled
Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 20a9b784da)
2014-04-30 15:38:46 -07:00
Marek Olšák
1602419b16 r600g: disable async DMA on R700
Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org
(cherry picked from commit 6dd045ef40)
2014-04-30 15:38:04 -07:00
Marek Olšák
081e37b3b6 r600g: fix edge flags and layered rendering on R600-R700
We forgot to set these bits.

Cc: 10.1 mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e5741f1e91)

Conflicts:
	src/gallium/drivers/r600/r600_state.c
2014-04-30 15:28:10 -07:00
Marek Olšák
4f3abcfee4 st/mesa: remove trailing NULL colorbuffers
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 8a1dfba73e)
2014-04-30 15:16:23 -07:00
Marek Olšák
d17b75f1e5 r300g: don't crash when getting NULL colorbuffers
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e522c455e4)
2014-04-30 15:15:07 -07:00
Brian Paul
1d0e7fb691 swrast: allocate swrast_texture_image::ImageSlices array if needed
Fixes a segmentation fault in conform divzero.c test.
This happens when glTexImage(level, width=0, height=0) is called.  We
don't allocate texture memory in that case so the ImageSlices array
was never allocated.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 7cc2e2e99d)
2014-04-30 15:13:25 -07:00
nick
468c1a2d46 swrast: Fix vertex color in _swsetup_Translate()
Straightforward fix to properly load dest->color with color data, as
opposed to position data as previously implemented.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27499
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 15c92464df)
2014-04-30 15:11:09 -07:00
Chris Forbes
74194a4bfc glsl: Only allow invariant on shader in/out between stages.
Previously this was special-cased for VS and FS; it never got updated
when geometry shaders came along. Generalize using is_varying_var() so
this won't be broken again with tessellation.

Note that there are two copies of the logic for `invariant`: It can be
present as part of a new declaration, and also as a redeclaration of an
existing variable or block member.

Fixes the four new piglits:
   spec/glsl-1.50/compiler/invariant-qualifier-*.geom

Note for stable: This won't quite pick cleanly due to whitespace and
state->target -> state->stage renames. Should be straightforward
adjustments though.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0dfa6e7cf5)

Conflicts:
	src/glsl/ast_to_hir.cpp
2014-04-23 00:25:07 -07:00
Anuj Phogat
b3e3ba5c37 mesa: Fix error code generation in glReadPixels()
Section 4.3.1, page 220, of OpenGL 3.3 specification explains
the error conditions for glReadPixels():

   "If the format is DEPTH_STENCIL, then values are taken from
    both the depth buffer and the stencil buffer. If there is
    no depth buffer or if there is no stencil buffer, then the
    error INVALID_OPERATION occurs. If the type parameter is
    not UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV,
    then the error INVALID_ENUM occurs."

Fixes failing Khronos CTS test packed_depth_stencil_error.test

V2: Avoid code duplication
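
The two error conditions quoted above can be sketched as a single check.
The enums and depth_stencil_read_error() are hypothetical stand-ins for the
GL enums and the Mesa code, and the order of the two checks is an
assumption, since the quoted text does not say which error wins when both
conditions hold.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GL error codes and pixel types. */
enum sketch_gl_error {
    SKETCH_NO_ERROR,
    SKETCH_INVALID_ENUM,
    SKETCH_INVALID_OPERATION
};

enum sketch_pixel_type {
    SKETCH_UNSIGNED_BYTE,
    SKETCH_UNSIGNED_INT_24_8,
    SKETCH_FLOAT_32_UNSIGNED_INT_24_8_REV
};

/* Error for a read with format DEPTH_STENCIL. */
enum sketch_gl_error depth_stencil_read_error(bool has_depth_buffer,
                                              bool has_stencil_buffer,
                                              enum sketch_pixel_type type)
{
    /* Both buffers must exist (assumed to be checked first). */
    if (!has_depth_buffer || !has_stencil_buffer)
        return SKETCH_INVALID_OPERATION;

    /* Only the two packed depth-stencil types are accepted. */
    if (type != SKETCH_UNSIGNED_INT_24_8 &&
        type != SKETCH_FLOAT_32_UNSIGNED_INT_24_8_REV)
        return SKETCH_INVALID_ENUM;

    return SKETCH_NO_ERROR;
}
```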

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f8ae2a56c6)
2014-04-23 00:25:07 -07:00
Anuj Phogat
62b1970ccc mesa: Add an error condition in glGetFramebufferAttachmentParameteriv()
From the OpenGL 4.4 spec page 275:
  "If pname is FRAMEBUFFER_ATTACHMENT_COMPONENT_TYPE, param will
   contain the format of components of the specified attachment,
   one of FLOAT, INT, UNSIGNED_INT, SIGNED_NORMALIZED, or
   UNSIGNED_NORMALIZED for floating-point, signed integer,
   unsigned integer, signed normalized fixed-point, or unsigned
   normalized fixed-point components respectively. If no data
   storage or texture image has been specified for the attachment,
   param will contain NONE. This query cannot be performed for a
   combined depth+stencil attachment, since it does not have a
   single format."

Fixes Khronos CTS test: packed_depth_stencil_parameters.test

Khronos Bug# 9170
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

(cherry picked from commit bd1880dfe8)
2014-04-23 00:25:07 -07:00
Anuj Phogat
6a154a4875 mesa: Add error condition for integer formats in glGetTexImage()
The OpenGL 4.0 spec, page 306, specifies an INVALID_OPERATION error in
glGetTexImage if:
   "format is one of the integer formats in table 3.3 and the internal
    format of the texture image is not integer, or format is not one of
    the integer formats in table 3.3 and the internal format is integer."

V2: Use helper function _mesa_is_format_integer()
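
Both halves of the quoted condition reduce to a mismatch between the
"integer-ness" of the two formats. A minimal sketch, with the hypothetical
function get_teximage_format_mismatch() taking the results of an
integer-format test such as _mesa_is_format_integer() as plain booleans:

```c
#include <stdbool.h>

/* Returns true when glGetTexImage should raise INVALID_OPERATION:
 * exactly one of the requested format and the texture's internal
 * format is an integer format. */
bool get_teximage_format_mismatch(bool format_is_integer,
                                  bool internal_format_is_integer)
{
    return format_is_integer != internal_format_is_integer;
}
```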

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cb6566f9df)
2014-04-23 00:25:07 -07:00
Anuj Phogat
66765bb6a6 mesa: Add helper function _mesa_is_format_integer()
This function will be used in the following patch.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 3135668254)
2014-04-23 00:25:06 -07:00
Anuj Phogat
657a185dc0 i965: Fix component mask and varying_to_slot mapping for gl_ViewportIndex
gl_ViewportIndex doesn't get its own varying slot. It is stored
in VARYING_SLOT_PSIZ.z. This patch fixes the issue for both gen7
and gen8 because gen7_upload_3dstate_so_decl_list() is shared
between them.

Fixes failures in OpenGL Khronos CTS test transform_feedback_builtins.
Makes new piglit test glsl-1.50-transform-feedback-builtins pass for
'gl_ViewportIndex'.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 48fc2703e5)
2014-04-23 00:25:06 -07:00
Anuj Phogat
b22bdb5cd2 i965: Fix component mask and varying_to_slot mapping for gl_Layer
gl_Layer doesn't get its own varying slot. It is stored in
VARYING_SLOT_PSIZ.y. This patch fixes the issue for both gen7
and gen8 because gen7_upload_3dstate_so_decl_list() is shared
between them.

Fixes failures in OpenGL Khronos CTS test transform_feedback_builtins.
Makes new piglit test glsl-1.50-transform-feedback-builtins pass for
'gl_Layer'.
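
A sketch of the slot and component selection described here and in the
gl_ViewportIndex patch above. All names are hypothetical, and the mask
encoding (one bit per component, x in bit 0) is an assumption made for the
illustration rather than a statement about the hardware packet layout.

```c
/* Hypothetical component bits: x, y, z, w in bits 0..3. */
#define SKETCH_COMP_X (1u << 0)
#define SKETCH_COMP_Y (1u << 1)
#define SKETCH_COMP_Z (1u << 2)
#define SKETCH_COMP_W (1u << 3)

/* Hypothetical slot id standing in for VARYING_SLOT_PSIZ. */
#define SKETCH_SLOT_PSIZ 1

enum sketch_builtin { SKETCH_GL_LAYER, SKETCH_GL_VIEWPORT_INDEX };

struct sketch_so_decl {
    int slot;                 /* varying slot that holds the value */
    unsigned component_mask;  /* which component of that slot to stream out */
};

/* Neither built-in has its own slot; both live inside the PSIZ slot. */
struct sketch_so_decl sketch_builtin_decl(enum sketch_builtin builtin)
{
    struct sketch_so_decl decl = { SKETCH_SLOT_PSIZ, 0 };

    switch (builtin) {
    case SKETCH_GL_LAYER:
        decl.component_mask = SKETCH_COMP_Y;   /* stored in PSIZ.y */
        break;
    case SKETCH_GL_VIEWPORT_INDEX:
        decl.component_mask = SKETCH_COMP_Z;   /* stored in PSIZ.z */
        break;
    }
    return decl;
}
```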

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7928b9c249)
2014-04-23 00:25:06 -07:00
Anuj Phogat
90eae12ae0 i965: Put an assertion to check valid varying_to_slot[varying]
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 969b461c2b)
2014-04-23 00:25:06 -07:00
Benjamin Bellec
c9ceb03147 mesa: fix GetStringi error message with correct function name
Signed-off-by: Benjamin Bellec <b.bellec@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9b3b9c613f)
2014-04-23 00:25:06 -07:00
Anuj Phogat
488f5b4390 mesa: Fix error condition for multisample proxy texture targets
Fixes failures in Khronos OpenGL CTS test proxy_textures_invalid_samples

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ee10e893cb)

Conflicts:
	src/mesa/main/teximage.c
2014-04-23 00:25:06 -07:00
Anuj Phogat
eed256688f swrast: Add glBlitFramebuffer to commands affected by conditional rendering
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 8ed42ddd7d)
2014-04-21 11:50:01 -07:00
Thomas Hellstrom
58ca56ddf5 st/xa: Cache render target surface
Otherwise it will trick the gallium driver into thinking that the render
target has actually changed (due to different pipe_surface pointing to
same underlying pipe_resource).  This is really bad for tiling GPUs
like adreno.

This also appears to fix a rendering error with Motif on vmwgfx.
Why that is is still under investigation.

Based on an idea by Rob Clark.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 09cd376353)
2014-04-21 11:43:31 -07:00
Samuel Iglesias Gonsalvez
e16df44a85 mesa: fix check for dummy renderbuffer in _mesa_FramebufferRenderbufferEXT()
According to the spec:
	<renderbuffertarget> must be RENDERBUFFER and <renderbuffer>
	should be set to the name of the renderbuffer object to be
	attached to the framebuffer.  <renderbuffer> must be either
	zero or the name of an existing renderbuffer object of type
	<renderbuffertarget>, otherwise an INVALID_OPERATION error is
	generated.

This patch changes the previous returned GL_INVALID_VALUE to
GL_INVALID_OPERATION.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76894

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
(cherry picked from commit 9927180714)
2014-04-21 11:43:06 -07:00
Anuj Phogat
b026b6bbfe mesa: Fix glGetVertexAttribi(GL_VERTEX_ATTRIB_ARRAY_SIZE)
Mesa currently returns 4 when GL_VERTEX_ATTRIB_ARRAY_SIZE is queried
for a vertex array initially set up with size=GL_BGRA. This patch
makes changes to return size=GL_BGRA as required by the spec.

Fixes Khronos OpenGL CTS test: vertex_array_bgra_basic.test

V2: Use array->Format instead of adding a new variable

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fdd8bebc22)
2014-04-21 11:18:54 -07:00
Michel Dänzer
cda6610d85 r600g: Disable LLVM by default at runtime for graphics
For graphics, the LLVM compiler backend currently has many shortcomings
compared to the non-LLVM one. E.g. it can't handle geometry shaders yet,
but that's just the tip of the iceberg.

So building Mesa with --enable-r600-llvm-compiler is currently not
recommended for anyone who doesn't want to work on fixing those issues.
However, for protection of users who end up enabling it anyway for some
reason, let's disable the LLVM backend at runtime by default. It can be
enabled with the environment variable R600_DEBUG=llvm.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 7286739b9b)
2014-04-21 11:18:34 -07:00
Carl Worth
4cd6530885 docs: Add the MD5 sums for the 10.1.1 release tar files.
Now that those files have been generated.
2014-04-18 17:46:20 -07:00
Carl Worth
780817af84 docs: Add release notes for 10.1.1 2014-04-18 17:15:08 -07:00
Carl Worth
e31f76bf66 Update VERSION to 10.1.1
In preparation for the 10.1.1 release.
2014-04-18 17:14:56 -07:00
Eric Anholt
527210f15d i965: Fix buffer overruns in MSAA MCS buffer clearing.
This manifested as rendering failures or sometimes GPU hangs in
compositors when they accidentally got MSAA visuals due to a bug in the X
Server.  Today we decided that the problem in compositors was equivalent
to a corruption bug we'd noticed recently in resizing MSAA-visual
glxgears, and debugging got a lot easier.

When we allocate our MCS MT, libdrm takes the size we request, aligns it
to Y tile size (blowing it up from 300x300=90000 bytes to 384*320=122880
bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes,
32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to it.
When we memset by the BO size in Mesa, between bytes 122880 and 131072 the
data gets stored to the first 20 or so scanlines of each of the 3 tiled
pages in that row, even though only 2 of those pages were allocated by
libdrm.  In the glxgears case, the missing 3rd page happened to
consistently be the static VBO that got mapped right after the first MCS
allocation, so corruption only appeared once window resize made us throw
out the old MCS and then allocate the same BO to back the new MCS.

Instead, just memset the amount of data we actually asked libdrm to
allocate for, which will be smaller (more efficient) and not overrun.
Thanks go to Kenneth for doing most of the hard debugging to eliminate a
lot of the search space for the bug.
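
The size arithmetic can be reproduced with a short C sketch. It assumes a Y
tile of 128 bytes by 32 rows, which is consistent with the 384*320 figure
above, and it models the BO bucket as a plain power-of-two round-up as
described in this message. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds `value` up to a multiple of `alignment` (alignment > 0). */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Smallest power of two that is >= value. */
uint32_t next_pow2(uint32_t value)
{
    assert(value <= 0x80000000u);   /* keeps the loop from overflowing */
    uint32_t p = 1;
    while (p < value)
        p <<= 1;
    return p;
}

/* Bytes actually requested for a Y-tiled surface (assumed 128B x 32 rows). */
uint32_t ytiled_size(uint32_t width_bytes, uint32_t rows)
{
    return align_up(width_bytes, 128) * align_up(rows, 32);
}

/* Size of the BO that backs the request, per the description above. */
uint32_t bo_bucket_size(uint32_t requested)
{
    return next_pow2(requested);
}
```

For a 300-byte by 300-row MCS, ytiled_size() gives 122880 and
bo_bucket_size() gives 131072, so clearing by BO size writes 8192 bytes
(two pages) past what was requested.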

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7ae870211d)
2014-04-18 15:19:00 -07:00
Mike Stroyan
a616fbd7aa i965: Avoid dependency hints on math opcodes
Putting NoDDClr and NoDDChk dependency control on instruction
sequences that include math opcodes can cause corruption of channels.
Treat math opcodes like send opcodes and suppress dependency hinting.

Signed-off-by: Mike Stroyan <mike@LunarG.com>
Tested-by: Tony Bertapelli <anthony.p.bertapelli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 602510395a)
2014-04-18 15:19:00 -07:00
Kenneth Graunke
9abd686e3f glsl: Try vectorizing when seeing a repeated assignment to a channel.
When considering assignment expressions like:

    v.x += u.x;
    v.x += u.x;

the vectorizer would incorrectly keep going, attempting to find more
instructions to vectorize.  It would overwrite the saved assignment
to point at the second one, and increment channels a second time,
resulting in try_vectorize thinking the expression was a vec2 instead of
a float.

Instead, if we see a repeated assignment to a channel, just try to
vectorize everything we've found so far.  This clears the saved state
so it will start over.

Fixes Piglit's repeated-channel-assignments.vert.
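
The fix can be pictured with a small state machine. This is a sketch with
hypothetical names, not the vectorizer's code, and it assumes the repeated
assignment becomes the first entry of a fresh run after the flush.

```c
#include <assert.h>

/* Hypothetical vectorizer state for the sketch. */
struct vectorizer_state {
    unsigned written_mask;  /* channels assigned in the current run */
    unsigned channels;      /* number of assignments collected so far */
    unsigned last_width;    /* width seen by the most recent vectorize try */
};

/* Stand-in for try_vectorize(): records the width and clears the state. */
void sketch_try_vectorize(struct vectorizer_state *st)
{
    st->last_width = st->channels;
    st->written_mask = 0;
    st->channels = 0;
}

/* Visits one scalar assignment to channel 0..3 (x, y, z, w). */
void sketch_visit_assignment(struct vectorizer_state *st, unsigned channel)
{
    assert(channel < 4);

    /* Repeated channel: vectorize what was found so far and start over,
     * instead of counting the same channel twice. */
    if (st->written_mask & (1u << channel))
        sketch_try_vectorize(st);

    st->written_mask |= 1u << channel;
    st->channels++;
}
```

With the two `v.x += u.x;` statements above, the second visit flushes a run
of width 1 rather than growing the run to a bogus width of 2.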

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ae2a03b573)
2014-04-18 15:19:00 -07:00
Ian Romanick
3a194cd77e glsl: Propagate explicit binding information from the AST all the way to the linker
Information about the binding was not being properly communicated from
the front-end compiler to the linker.  As a result, the linker never
knew that any UBOs had explicit bindings!

Fixes the piglit test arb_shading_language_420pack-binding-layout.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: github@socker.lepus.uberspace.de [v0]
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
(cherry picked from commit 625cf8c874)
2014-04-18 15:19:00 -07:00
Ian Romanick
a38db439df linker: Set binding for all elements of UBO array
Previously, a UBO like

    layout(binding=2) uniform U {
        ...
    } my_constants[4];

wouldn't get any bindings set.  The code would try to set the binding of
U, but that would fail.  It should instead set the bindings for U[0],
U[1], ...
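
A rough sketch of the per-element approach, assuming that element i of the
array takes binding point (binding + i), which is how I read the GLSL rules
for block arrays.  The struct block_entry and set_block_array_bindings names
are hypothetical and only stand in for the linker's real data structures.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one linked uniform block entry. */
struct block_entry {
    const char *name;   /* e.g. "U[2]" */
    int binding;        /* -1 means "not set" */
};

/* Look up "U[0]", "U[1]", ... by name and give each one a consecutive
 * binding, starting at first_binding.  Returns how many were found. */
unsigned set_block_array_bindings(struct block_entry *blocks,
                                  unsigned num_blocks,
                                  const char *block_name,
                                  unsigned array_size,
                                  int first_binding)
{
    unsigned matched = 0;

    for (unsigned i = 0; i < array_size; i++) {
        char element[64];
        int len = snprintf(element, sizeof(element), "%s[%u]", block_name, i);
        assert(len > 0 && (size_t)len < sizeof(element));

        for (unsigned b = 0; b < num_blocks; b++) {
            if (strcmp(blocks[b].name, element) == 0) {
                blocks[b].binding = first_binding + (int)i;
                matched++;
                break;
            }
        }
    }
    return matched;
}
```

Looking up plain "U" in such a table finds nothing, which matches the failure
described above.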

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
(cherry picked from commit 25a6656875)
2014-04-18 15:18:58 -07:00
Ian Romanick
5f685d9925 linker: Set block bindings based on UniformBlocks rather than UniformStorage
For blocks, gl_shader_program::UniformStorage isn't very useful.  The
names stored there are the names of the elements of the block, so
finding blocks with an instance name is hard.  There is also only one
entry in ::UniformStorage for each element of a block array, and that is
a deal breaker.

Using ::UniformBlocks is what _mesa_GetUniformBlockIndex does.  I
contemplated sharing code between set_block_binding and
_mesa_GetUniformBlockIndex, but building the stand-alone compiler and
the unit tests make this hard.  I plan to return to this effort shortly.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
(cherry picked from commit cc42717b50)
2014-04-18 15:17:31 -07:00
Ian Romanick
b3f1ee8b18 linker: Clean up "unused parameter" warnings
../../src/glsl/link_uniform_initializers.cpp:87:1: warning: unused parameter 'mem_ctx' [-Wunused-parameter]
../../src/glsl/link_uniform_initializers.cpp:87:1: warning: unused parameter 'type' [-Wunused-parameter]
../../src/glsl/link_uniform_initializers.cpp:127:1: warning: unused parameter 'mem_ctx' [-Wunused-parameter]
../../src/glsl/link_uniform_initializers.cpp:127:1: warning: unused parameter 'type' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
(cherry picked from commit 157391a41b)
2014-04-18 15:16:20 -07:00
Carl Worth
c862a14676 glsl: Allow explicit binding on atomics again
As of 943b2d52bf, layout(binding) on an atomic would fail the assertion
here.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 92840aabf7)

Conflicts:
	src/glsl/link_uniform_initializers.cpp
2014-04-18 15:14:37 -07:00
Ian Romanick
23e42eeab0 linker: Fold set_uniform_binding into call site
In the next patch, we'll see that using
gl_shader_program::UniformStorage is not correct for uniform blocks.
That means we can't use ::UniformStorage to select between the sampler
path and the block path.  Instead we want to just use the type of the
variable.  That's never passed to set_uniform_binding, and it's easier
to just remove the function (especially for later patches in the series)
than to add another parameter.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
(cherry picked from commit 943b2d52bf)
2014-04-16 10:29:40 -07:00
Ian Romanick
cc0e6d87be linker: Various trivial clean-ups in set_sampler_binding
- Remove the spurious block left from the previous commit and re-indent.

- Constify elements.

- Make the spec reference in the code look like other spec references in
  the compiler.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
(cherry picked from commit 881c52f13f)
2014-04-16 10:29:13 -07:00
Ian Romanick
dab5a7a9f9 linker: Split set_uniform_binding into separate functions for blocks and samplers
The two code paths are quite different, and there are some problems in
the handling of uniform blocks.  Future changes will cause these paths
to diverge further.  Ultimately, selecting between the two functions
will happen at the set_uniform_binding call site, and
set_uniform_binding will be deleted.

NOTE: This patch just moves code around.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
(cherry picked from commit 6e2f63b69e)
2014-04-16 10:28:45 -07:00
Jonathan Gray
358d05617a configure: don't require libudev for gbm or egl drm/wayland
After the loader changes libudev is no longer required for
gbm or the egl drm/wayland platforms.  This lets them build and run
on OpenBSD.

v2: preserve the libudev requirement for Linux as suggested
by Emil Velikov.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 81799c82e4)
2014-04-16 10:28:06 -07:00
Emil Velikov
df9e7ee445 configure: cleanup libudev handling
Add the explicit note about the required version during configure.
Require the same version (151) of udev when building the pipe-loader.
Mention the udev version requirement in GBM Requires.private.

v2: Resolve a couple of silly typos. Spotted by Ilia
v3: Cleanup platfrom/platform typo. Spotted by Stefan

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 118c36adb4)
2014-04-16 10:27:53 -07:00
Tom Stellard
933215ac63 configure: Use LLVM shared libraries by default
Linking with LLVM static libraries is easily broken by changes to
the llvm-config program or when LLVM adds, removes, or changes library
components.  Keeping up with these changes requires a lot of maintenance
effort to keep the build working on the master and stable branches.

Also, because of past issues with LLVM static libraries, the release
manager is currently configuring with --with-llvm-shared-libs when
checking the build before release.  Enabling shared libraries by
default would allow the release manager to run ./configure with
no arguments, and be reasonably confident that the build would succeed.

Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a4c734297f)
2014-04-16 10:07:08 -07:00
Matt Turner
c755ebfecf i965/fs: Don't propagate saturation modifiers if there are source modifiers.
Which would lead to translating

   mad     vgrf9:F,  vgrf3:F, u0:F, vgrf6:F
   mov.sat vgrf7:F, -vgrf9:F

into

   mad.sat vgrf9:F,  vgrf3:F, u0:F, vgrf6:F
   mov     vgrf7:F, -vgrf9:F

Fixes some lighting effects in Dota2.
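
To see why the two sequences differ, here is a small arithmetic sketch.  It
ignores the mad itself and only models the saturate and the negate source
modifier on its result t; the function names are made up for illustration.

```c
#include <assert.h>

/* Clamp to [0, 1], as the .sat modifier does. */
static float saturate(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* mov.sat dst, -t : negate first, then saturate (the original code). */
float sat_of_neg(float t)
{
    assert(t == t);  /* NaN is outside the scope of this sketch */
    return saturate(-t);
}

/* mad.sat t ; mov dst, -t : saturate first, then negate (the bad rewrite). */
float neg_of_sat(float t)
{
    assert(t == t);  /* NaN is outside the scope of this sketch */
    return -saturate(t);
}
```

For t = 0.5 the original sequence gives 0.0 while the rewritten one gives
-0.5, so the saturate cannot be moved across a negated source.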

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76749
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 92d03f7f28)
2014-04-15 17:52:10 -07:00
Matt Turner
c60b97e9ba i965/fs: Don't propagate saturate modifiers into partial writes.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7a7b8a02be)
2014-04-15 17:51:56 -07:00
Matt Turner
37c4ba3e69 i965/fs: Fix off-by-one in saturate propagation.
ip needs to be initialized to start_ip - 1, since the first thing in the
main loop is ip++. Otherwise we would incorrectly propagate the saturate
from the mov to the mad:

   mad     a, b, c, d
   mov.sat x, a
   add     y, z, a
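
The off-by-one can be reproduced with a toy loop.  This is only a sketch of
the counting pattern, with a hypothetical label_instructions helper; it is
not the code from the optimization pass.

```c
#include <assert.h>

/* Record the ip that the loop associates with each instruction of a block
 * that starts at start_ip.  The loop increments ip before doing anything
 * else, mirroring the main loop described above. */
void label_instructions(int count, int initial_ip, int *labels)
{
    assert(count >= 0);
    int ip = initial_ip;

    for (int i = 0; i < count; i++) {
        ip++;            /* first statement of the loop body */
        labels[i] = ip;
    }
}
```

Starting from start_ip - 1 labels the first instruction start_ip; starting
from start_ip labels it start_ip + 1, so every later comparison against ip is
shifted by one instruction.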

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 86ae6f477d)
2014-04-15 17:51:45 -07:00
Alexander von Gluck IV
9f1fe12fd1 haiku: Fix build through scons corrections and viewport fixes
* Add HAVE_PTHREAD, we do have pthread support wrappers now for
  non-native Haiku threaded applications.
* Viewport changed behavior recently breaking the build.
  We fix this by looking at the gl_context ViewportArray
  (Thanks Brian for the idea)

Acked-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 7683fce878)
2014-04-15 17:41:20 -07:00
Jonathan Gray
92c43a3a88 egl/dri2: use drm macros to construct device name
Don't hardcode /dev/dri/card0 but instead use the drm
macros which allows the correct /dev/drm0 device to be
opened on OpenBSD.

v2: use snprintf and fallback to /dev/dri/card0
v3: check for snprintf truncation
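
A minimal sketch of the snprintf pattern from v2/v3.  The SKETCH_* macros are
local stand-ins whose values are assumed to match what libdrm's xf86drm.h
provides on Linux; build_device_name is a hypothetical helper, not the
function changed by this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the libdrm macros (assumed Linux values). */
#define SKETCH_DRM_DIR_NAME "/dev/dri"
#define SKETCH_DRM_DEV_NAME "%s/card%d"

/* Build the device node name for the given minor.
 * Returns 0 on success, -1 on error or if the name was truncated. */
int build_device_name(char *buf, size_t size, int minor)
{
    assert(buf != NULL && size > 0);

    int n = snprintf(buf, size, SKETCH_DRM_DEV_NAME,
                     SKETCH_DRM_DIR_NAME, minor);

    /* snprintf returns the length it wanted to write; a value >= size
     * means the result did not fit. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

A caller would fall back to a fixed path such as /dev/dri/card0 when this
returns -1.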

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit c973e440d5)
2014-04-14 12:12:55 -07:00
Carl Worth
b8e0e34555 cherry-ignore: Ignore a few patches
These were recently discussed with the patch authors who agreed these can be
skipped for the 10.1.1 release.
2014-04-14 12:07:42 -07:00
Marek Olšák
0c6be6e146 r600g: implement edge flags
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1337da5115)

Conflicts:
	src/gallium/drivers/r600/evergreen_state.c
	src/gallium/drivers/r600/r600_shader.c
	src/gallium/drivers/r600/r600_shader.h
2014-04-14 11:48:54 -07:00
Michel Dänzer
30be758fd2 r600g: Don't leak bytecode on shader compile failure
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74868

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ee2bcf38a4)
2014-04-14 11:48:53 -07:00
Emil Velikov
aae5cf54a2 glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path
With commit 1f1928db001(glx: Drop _Xglobal_lock while we create and
initialize glx display) we've split the big _Xglobal_lock handling in
a more fine grained manner.

Unfortunately we forgot to drop the unlock_mutex on the error paths,
leading to undefined behaviour as the mutex is already unlocked.

Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: "9.2 10.0 10.1"  <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f9832f960f)
2014-04-14 11:48:53 -07:00
Brian Paul
c5612ba549 svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()
Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
for AA lines (when the device doesn't support that feature).  We need to
initialize this list before we setup the swtnl pieces.

Found/fixed by Charmaine Lee.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit e853ade544)

Conflicts:
	src/gallium/drivers/svga/svga_context.c
2014-04-14 11:48:53 -07:00
Kenneth Graunke
e52117cefb i965: Stop advertising GL_MESA_ycbcr_texture.
The "new" fragment shader backend has never supported the necessary
color conversion code for this to work.  We began using the new backend
in Mesa 7.10 for GLSL (commit a81d423d93, October 2010),
and for ARB_fragment_program in Mesa 9.1 (commit 97615b2d8c,
August 2012).

I haven't heard any complaints, so I don't think anyone will miss this
feature.  I believe mplayer used it at one point, but these days
defaults to other paths anyway.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 26ae030fcc)
2014-04-14 11:48:53 -07:00
Courtney Goeltzenleuchter
18055f9136 mesa: add bounds checking to eliminate buffer overrun
Decompressing ETC2 textures was causing intermittent segfaults
by copying resulting 4x4 texel block to the destination texture
regardless of the size of the destination texture. Issue found
via application crash in GLBenchmark 3.0's Manhattan test.
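
The kind of clamping involved can be sketched as follows.  This assumes
one-byte texels and a tightly packed destination for brevity; store_block and
its parameters are hypothetical and much simpler than the real decompression
code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_DIM 4

static unsigned min_u(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

/* Copy one decoded 4x4 block (16 bytes, row-major) into dst at (x, y),
 * clipped against the destination size. */
void store_block(uint8_t *dst, unsigned dst_width, unsigned dst_height,
                 unsigned x, unsigned y, const uint8_t *block)
{
    assert(x < dst_width && y < dst_height);

    /* Compute the limits once, outside the inner loop. */
    const unsigned w = min_u(BLOCK_DIM, dst_width - x);
    const unsigned h = min_u(BLOCK_DIM, dst_height - y);

    for (unsigned j = 0; j < h; j++)
        memcpy(dst + (size_t)(y + j) * dst_width + x,
               block + j * BLOCK_DIM, w);
}
```

For a 5x5 destination, the block at (4, 4) then writes a single texel instead
of running past the end of the image.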

v2: add more detail comment. Compute limit outside inner loops.
v3: add bugzilla reference
v4: Correct cc syntax in commit log
v5: really grab the right patch

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1, suggested v2-3]
(cherry picked from commit cb4ad13685)
2014-04-14 11:48:53 -07:00
Brian Paul
563fd9d736 svga: replace sampler assertion with conditional
For TEX instructions, the set of samplers and sampler views should
be consistent.  The XA state tracker sometimes passes an inconsistent
set of samplers and sampler views.  Rather than assert and die, issue
a warning.

v2: add debugging code to detect inconsistent state.
v3: also check for null sampler in svga_state_tss.c

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit 9bb2ec6fd1)
2014-04-14 11:48:53 -07:00
Chia-I Wu
9aa0b296f8 i965/vec4: fix record clearing in copy propagation
Given

  mov vgrf7, vgrf9.xyxz
  add vgrf9.xyz, vgrf4.xyzw, vgrf5.xyzw
  add vgrf10.x, vgrf6.xyzw, vgrf7.wwww

the last instruction would be wrongly changed to

  add vgrf10.x, vgrf6.xyzw, vgrf9.zzzz

during copy propagation.

The issue is that when deciding if a record should be cleared, the old code
checked for

  inst->dst.writemask & (1 << ch)

instead of

  inst->dst.writemask & (1 << BRW_GET_SWZ(src->swizzle, ch))
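
The difference between the two checks can be shown with the example above.
The SKETCH_* macros are local stand-ins that assume a swizzle packs two bits
per channel, which is my understanding of BRW_GET_SWZ; the two functions are
illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins, assuming 2 bits per channel in a packed swizzle. */
#define SKETCH_GET_SWZ(swz, ch)     (((swz) >> ((ch) * 2)) & 0x3)
#define SKETCH_SWIZZLE4(a, b, c, d) ((a) | ((b) << 2) | ((c) << 4) | ((d) << 6))

enum { X = 0, Y = 1, Z = 2, W = 3 };

/* Old check: tests the destination channel of the copy record. */
bool record_clobbered_old(unsigned writemask, unsigned src_swizzle, int ch)
{
    (void)src_swizzle;
    assert(ch >= 0 && ch < 4);
    return (writemask & (1u << ch)) != 0;
}

/* New check: tests the source channel that the record actually reads. */
bool record_clobbered_new(unsigned writemask, unsigned src_swizzle, int ch)
{
    assert(ch >= 0 && ch < 4);
    return (writemask & (1u << SKETCH_GET_SWZ(src_swizzle, ch))) != 0;
}
```

For the record vgrf7.w = vgrf9.z (swizzle xyxz, channel w) and a later write
to vgrf9.xyz, the old check reports "not clobbered" and the new one reports
"clobbered".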

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76749
Signed-off-by: Chia-I Wu <olv@lunarg.com>
Cc: Jordan Justen <jljusten@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.1" <mesa-stable@freedesktop.org>
(cherry picked from commit 4ddf51db6a)
2014-04-14 11:48:53 -07:00
Kenneth Graunke
82db52a55e glsl: Fix lack of i2u in lower_ubo_reference.
ir_binop_ubo_load takes unsigned integer operands.  However, the array
index used to compute these offsets may be a signed integer.  (For
example, see Piglit's spec/glsl-1.40/uniform_buffer/fs-bvec-array).

For some reason, we were missing an ir_binop_i2u cast, and ir_validator
was failing to catch that.

Without this change, ir_builder's type inference code broke for me when
writing a new optimization pass.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit e14b93371c)
2014-04-14 11:48:53 -07:00
Thomas Hellstrom
8f4fb58dbf st/xa: Make sure unused samplers are set to NULL
renderer_copy_prepare was setting the first sampler but never telling
the cso code how many samplers were actually used. Fix this.

Cc: "10.1" <mesa-stable@freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 47f60cbb71)
2014-04-14 11:48:53 -07:00
Thomas Hellstrom
8f85bf57fc st/xa: Bind destination before setting new state
Binding a new destination may cause the svga driver to emit draw calls
while propagating the surface. Make sure this doesn't happen in the middle
of sampler state setup where state may be inconsistent.

In practice, surface propagation should never happen here and even if it did,
it wouldn't be a valid reason for the svga driver to emit partially set up
state, but to avoid future uncertainties, make sure this doesn't happen
anyway.

Found while auditing the state tracker for inconsistent sampler state /
sampler view setup.

Cc: "10.1" <mesa-stable@freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit e5d2c5b899)
2014-04-14 11:48:53 -07:00
Ilia Mirkin
76c84a0f75 nouveau: fix firmware check on nvd7/nvd9
The kernel driver expects the class to be based on chipset generation
rather than VP generation. Make sure to pass 90b1 for NVDX chipsets
instead of 95b1.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77102
Fixes: 40dd777b33
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@ubunutu.com>
(cherry picked from commit 89c5b56be6)
2014-04-14 11:48:53 -07:00
Thomas Hellstrom
b38c141850 winsys/svga: Fix prime surface references also for guest-backed surfaces
Implement guest-backed surface sharing using prime fds. Previously only
legacy surfaces could use this functionality. Also use the vmwgfx 2.6
single-ioctl prime fd reference if available.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit 2f6fcd65f2)
2014-04-14 11:48:53 -07:00
Thomas Hellstrom
b088d4649c winsys/svga: Update the vmwgfx_drm.h header to latest version from kernel
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit 0887b499e9)
2014-04-14 11:48:52 -07:00
Jonathan Gray
8d6eea9824 egl/dri2: don't require libudev to build drm/wayland platforms
After the loader changes libudev is no longer required to
build gbm or the egl drm/wayland platforms.

Remove a libudev ifdef, which allows the drm egl driver
to be loaded on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 0295953c5d)
2014-04-14 11:48:52 -07:00
Johannes Nixdorf
91543aef3b configure.ac: fix the detection of expat with pkg-config
The pkg-config module was called "EXPAT" instead of "expat" in
PKG_CHECK_EXISTS. This seems to have been wrong because the wrong
argument was copied from PKG_CHECK_MODULES.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 476db98e03)
2014-04-14 11:48:52 -07:00
Jonathan Gray
4d7504c0c4 megadriver_stub.c: don't use _GNU_SOURCE to gate the compat code
_GNU_SOURCE is only set/required for linux*|*-gnu*|gnu*) and as the
functionality is available on other systems check for RTLD_DEFAULT instead.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1cc742d912)
2014-04-14 11:48:52 -07:00
Jonathan Gray
d7df21d08b loader: don't limit the non-udev path to only android
Platforms that lack libudev (OpenBSD and possibly others) need
this change in order to load the correct dri driver.
Under Linux we unconditionally require libudev, thus this code
will never get built.

v2: Add commit message (Emil Velikov)

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 380f05ccc3)
2014-04-14 11:48:52 -07:00
Jonathan Gray
b853dceb4b loader: use 0 instead of FALSE which isn't defined
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 727f54a76e)
2014-04-14 11:48:52 -07:00
Brian Paul
71c4e4f420 cso: fix sampler view count in cso_set_sampler_views()
We want to call pipe->set_sampler_views() with count being the
maximum of the old number of sampler views and the new number.
This makes sure we null-out any old sampler views.

We already do the same thing for sampler states in single_sampler_done().
Fixes some assertions seen in the VMware driver with XA tracker.
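
The counting rule can be sketched like this.  struct view_state, MAX_VIEWS
and set_views are hypothetical stand-ins for the cso context state, and the
return value models the count that would be passed on to the driver.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VIEWS 16  /* hypothetical limit for this sketch */

struct view_state {
    const void *views[MAX_VIEWS];
    unsigned count;
};

/* Install new_count views and null out any stale ones.  Returns the count
 * to hand to the driver: the maximum of the old and new counts, so the
 * driver also sees the slots that were just cleared. */
unsigned set_views(struct view_state *st, const void *const *new_views,
                   unsigned new_count)
{
    assert(new_count <= MAX_VIEWS);

    const unsigned old_count = st->count;
    unsigned i;

    for (i = 0; i < new_count; i++)
        st->views[i] = new_views[i];
    for (; i < old_count; i++)
        st->views[i] = NULL;   /* null-out old sampler views */

    st->count = new_count;
    return old_count > new_count ? old_count : new_count;
}
```

Going from three views to one therefore still reports three slots, two of
which are now NULL.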

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Tested-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 2355a64414)
2014-04-14 11:48:52 -07:00
Thomas Hellstrom
95bba69398 winsys/svga: Replace the query mm buffer pool with a slab pool v3
This is to avoid running out of query buffer space due to winsys
limitations. Instead of a fixed size per screen pool of query buffers,
use a slab allocator that allocates a new slab if we run out of space
in the first one.

v2: Correct email addresses.
v3: s/8192/VMW_QUERY_POOL_SIZE/. Improve documentation and log message.

Reported-and-tested-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5dc206525b)
2014-04-14 11:48:52 -07:00
Emil Velikov
e144e14c0a configure: enable dri3 only for linux
Currently only linux can make use of dri3, so it would make sense to
enable it explicitly for the platform.
Drop a duplicated libudev check while we're at it.

v3: Properly handle dri3 and reword commit message.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76377
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 23740ed031)
2014-04-14 11:48:52 -07:00
Brian Paul
c1fbfa4859 mesa: fix glMultiDrawArrays inside a display list
The underlying glDrawArrays() calls weren't getting compiled into
the display list.  We simply need to use the current dispatch table
so the CALL_DrawArrays() is routed to the display list save function.

This patch also fixes glMultiModeDrawArraysIBM and
glMultiModeDrawElementsIBM.

Fixes the new piglit gl-1.4-dlist-multidrawarrays test.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit e341856294)
2014-04-14 11:48:52 -07:00
Brian Paul
2ea18474a6 st/mesa: add null pointer checking in query object functions
Don't pass null query object pointers into gallium functions.
This avoids segfaulting in the VMware driver (and others?) if the
pipe_context::create_query() call fails and returns NULL.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 488d4c4826)
2014-04-14 11:48:52 -07:00
Brian Paul
740bd738a9 mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up
And use the z32f_x24s8 helper struct in unpack_Z32_FLOAT_X24S8().
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 1f4ebfaa88)
2014-04-14 11:48:52 -07:00
Christian König
e801b8d677 st/mesa: fix sampler view handling with shared textures v4
Release the references to the sampler views before
destroying the pipe context.

v2: remove TODO and unrelated change
v3: move to st_texture.[ch], rename callback, add comment
v4: fix rebase mess up and add further cleanups

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d117ddbe31)
2014-04-14 11:48:52 -07:00
José Fonseca
da726d3d93 draw: Duplicate TGSI tokens in draw_pipe_pstipple module.
As done in draw_pipe_aaline and draw_pipe_aapoint modules.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ee89432a47)
2014-04-14 11:48:51 -07:00
Christian König
1d32349f70 st/mesa: recreate sampler view on context change v3
With shared glx contexts it is possible that a texture is created and used
in one context and then used in another one resulting in incorrect
sampler view usage.

v2: avoid template copy
v3: add XXX comment

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 92e543c45d)
2014-04-14 11:48:51 -07:00
Ilia Mirkin
b3c9041cca nvc0/ir: move sample id to second source arg to fix sampler2DMS
The nvc0 texfetch instruction expects the sample id to be in the second
source (usually used for the offset) rather than as part of the texture
coordinate.

This fixes all the sampler2DMS/Array tests on nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 19ba573a57)
2014-04-14 11:48:51 -07:00
Marek Olšák
6619158873 st/mesa: drop the lowering of quad strips to triangle strips
This fallback to triangle strips is silly and should be done in drivers
if they need it.

This should fix the case when quad strips are used with flatshading that is
enabled by the "flat" GLSL varying modifier. It also fixes primitive restart
for quad strips.

This fixes piglit:
  NV_primitive_restart/primitive-restart-draw-mode-quad_strip

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit e5f6b6d0fe)
2014-04-14 11:48:51 -07:00
Marek Olšák
6466e99aa0 st/mesa: fix generating mipmaps for cube arrays
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit db722bdcab)
2014-04-14 11:48:51 -07:00
Marek Olšák
f5fe91c087 mesa: fix software fallback for generating mipmaps for 3D textures
It didn't use the driver-provided src/dstRowStride at all.
This was broken for the cases when stride != width*bpp.
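
As a reminder of why the stride matters, here is a small addressing sketch.
texel_address is a hypothetical helper, not a function from the mipmap code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of texel (x, y) in an image whose rows are row_stride bytes
 * apart.  row_stride may be larger than width * bpp when rows are padded. */
const uint8_t *texel_address(const uint8_t *base, size_t row_stride,
                             size_t bpp, size_t x, size_t y)
{
    assert(row_stride >= bpp);
    return base + y * row_stride + x * bpp;
}
```

With width 3, bpp 4 and a 16-byte stride, texel (1, 2) lives at byte offset
36, whereas assuming stride == width*bpp would give 28.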

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 91df26842f)
2014-04-14 11:48:51 -07:00
Marek Olšák
cb89da2e61 mesa: fix software fallback for generating mipmaps for cube arrays
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 78c60d1b63)
2014-04-14 11:48:51 -07:00
Marek Olšák
1dba985788 mesa: allow generating mipmaps for cube arrays
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 185ad78ffd)
2014-04-14 11:48:51 -07:00
Marek Olšák
3a4a0882cb mesa: fix texture border handling for cube arrays
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 55cf320ed8)
2014-04-14 11:48:51 -07:00
Brian Paul
352b9e8faf c11/threads: don't include assert.h if the assert macro is already defined
In the gallium code, the assert() macro could come from either the
system's assert.h file (via c11/threads.h) or from gallium's u_debug.h.
It looks like all known assert.h files unconditionally #undef assert
before defining their own version.  So the assert you get depends on
whether threads.h or u_debug.h was included last.

In the gallium code we really want to use the assert() from u_debug.h
(it behaves better on Windows).  In gallium, c11/threads.h is only
included after u_debug.h in the os_thread.h wrapper.  So adding
an #ifndef assert test in the threads*.h files avoids using the system's
assert().

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit eaf9affa5e)
2014-04-14 11:48:51 -07:00
Ilia Mirkin
87587c6683 nouveau: there may not have been a texture if the fbo was incomplete
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e58071355e)
2014-04-14 11:48:51 -07:00
Ilia Mirkin
ee71a08f23 nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b676df9abf)
2014-04-14 11:48:51 -07:00
Ilia Mirkin
9cb82a0319 mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture
EXT_packed_depth_stencil is supported by all drivers, but
ARB_depth_texture isn't (notably nouveau_vieux). This should avoid
passing unexpected values down to ChooseTextureFormat.

The EXT_packed_depth_stencil spec does not make any explicit references
to requiring ARB_depth_texture in order to allow textures with that
format, however if there is no dependency, ARB_depth_texture would be
practically implied by the extension.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>

Note for 10.0 backport: This will produce a conflict, the solution is to
move the surrounding if as well.

(cherry picked from commit 18690995a6)
2014-04-14 11:48:50 -07:00
Ilia Mirkin
ec6be857f3 loader: add special logic to distinguish nouveau from nouveau_vieux
There are a lot of different pci ids supported by nouveau, and more are
added all the time. The relevant distinguisher between drivers is the
chipset id.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 51989817e6)
2014-04-14 11:48:50 -07:00
Marek Olšák
365975e5eb mesa: mark GL_RGB9_E5 as not color-renderable
The GL 4.4 spec says it's not color-renderable and not accepted
by RenderBufferStorage. The EXT_texture_shared_exponent spec says
it's not color-renderable but it's accepted by RenderBufferStorageEXT.
This seems to be a bug in the extension spec.

Let's do what GL 4.4 says.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2e361160ff)
2014-04-14 11:48:50 -07:00
Marek Olšák
3b7fd0351d st/mesa: fix per-vertex edge flags and GLSL support (v2)
This fixes piglit/gl-2.0-edgeflag.

v2: use StrideB to recognize per-vertex edge flags

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3d42696d10)
2014-04-14 11:48:50 -07:00
Kenneth Graunke
676cc8e39d i965/fs: Fix register comparisons in saturate propagation.
opt_saturate_propagation_local compares scan_inst->dst.reg/reg_offset
with inst->src[0].reg/reg_offset, and ensures that scan_inst->dst.file
is GRF.  But nothing ensured that inst->src[0].file was GRF.

In the following program, this resulted in u1:F matching vgrf1:UW,
and a saturate being incorrectly propagated from instruction 8 to
instruction 1.

{  1}    0: add vgrf0:UW, hw_reg1+8:UW, hw_reg0:V
{  1}    1: add vgrf1:UW, hw_reg1+10:UW, hw_reg0:V
{  1}    2: linterp vgrf6:F, hw_reg2:F, hw_reg3:F, hw_reg0:F
{  2}    3: linterp vgrf27:F, hw_reg2:F, hw_reg3:F, hw_reg0+16:F
{  4}    4: mov vgrf10+0.0:F, vgrf6:F
{  3}    5: mov vgrf10+1.0:F, vgrf27:F
{  6}    6: tex vgrf8+0.0:F, vgrf10+0.0:F
{  5}    7: mov vgrf32:F, u1:F
{  5}    8: mov.sat vgrf12:F, u1:F

From shader-db:
   total instructions in shared programs: 1841932 -> 1841957 (0.00%)
   instructions in affected programs:     5823 -> 5848 (0.43%)
I inspected two of the 25 hurt shaders, and concluded that they were
both hitting this bug, and not legitimately optimized.

This fixes bugs in Left 4 Dead 2 and Team Fortress 2, possibly among
others.  The optimization pass didn't exist in 10.0, so this is only
a candidate for 10.1.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 4d2e79269a)
2014-04-14 11:48:50 -07:00
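The register comparison described in 676cc8e39d above can be sketched as follows. This is a minimal illustration, not the i965 code: the `reg_file` enum, the `reg` struct, and `same_grf_location()` are hypothetical names. The point is that comparing register numbers only makes sense once both operands are known to live in the same register file.

```c
#include <stdbool.h>

/* Hypothetical register files: general registers and uniforms. */
enum reg_file { FILE_GRF, FILE_UNIFORM };

/* Hypothetical register reference: file plus number and offset. */
struct reg {
    enum reg_file file;
    int reg;
    int reg_offset;
};

/* True only if dst and src name the same GRF location.  Checking the
 * file of only one operand would let u1 match vgrf1, because both
 * carry register number 1. */
static bool same_grf_location(const struct reg *dst, const struct reg *src)
{
    return dst->file == FILE_GRF &&
           src->file == FILE_GRF &&
           dst->reg == src->reg &&
           dst->reg_offset == src->reg_offset;
}
```

With this check, a uniform `u1` and a virtual GRF `vgrf1` no longer compare equal, which is the mismatch the commit message describes.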
Emil Velikov
c2c1c902f9 mesa: return v.value_int64 when the requested type is TYPE_INT64
Fixes "Operands don't affect result" defect reported by Coverity.

Cc: "9.2 10.0 10.1"  <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit a9cf3aa208)
2014-04-14 11:48:50 -07:00
Emil Velikov
d78131f695 nv50: add missing brackets when handling the samplers array
Commit 3805a864b1d (nv50: assert before trying to out-of-bounds access
samplers) introduced a series of asserts as a precaution against a previous
illegal memory access.

However, it failed to enclose the loop body within nv50_sampler_state_delete
in brackets, effectively failing to clear the sampler state and exacerbating
the illegal memory access issue.

Fixes gcc warning "array subscript is above array bounds" and
"Nesting level does not match indentation" and "Out-of-bounds read"
defects reported by Coverity.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit c26b488088)
2014-04-14 11:48:50 -07:00
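The class of bug fixed in d78131f695 above (a loop whose body was meant to span several statements but had no braces) can be illustrated with a small sketch. The names `NUM_SLOTS` and `clear_matching()` are hypothetical and the code is not taken from the nv50 driver.

```c
#include <stddef.h>

#define NUM_SLOTS 4

/* Hypothetical helper: clear every slot that points at target and
 * return how many slots were cleared. */
static unsigned clear_matching(const void *slots[NUM_SLOTS], const void *target)
{
    unsigned cleared = 0;

    /* The braces make both the comparison and the clear part of the
     * loop body.  Without them, only the first statement after the
     * for() would repeat, and the rest would run once with
     * i == NUM_SLOTS, reading one element past the end of the array. */
    for (unsigned i = 0; i < NUM_SLOTS; ++i) {
        if (slots[i] == target) {
            slots[i] = NULL;
            ++cleared;
        }
    }
    return cleared;
}
```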
Marek Olšák
8648c2b2a0 r600g,radeonsi: fix MAX_TEXTURE_3D_LEVELS and MAX_TEXTURE_ARRAY_LAYERS limits
CB_COLORi_VIEW.SLICE_MAX can be at most 2047.

This fixes the maxlayers piglit test.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 4f1f32306a)

Conflicts:
	src/gallium/drivers/r600/r600_pipe.c
	src/gallium/drivers/radeonsi/si_pipe.c
2014-04-14 11:48:50 -07:00
Jonathan Gray
9bfe1b3773 gallium: add endian detection for OpenBSD
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 40214267ab)
2014-04-14 11:48:50 -07:00
Ilia Mirkin
a77c10a1a4 nv50: adjust blit_3d handling of ms output textures
This fixes some unwanted scaling when the output is multisampled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 253314d487)

And squashed with:

Revert nvc0 part of "nv50: adjust blit_3d handling of ms output textures"

The nvc0 bits don't appear to work, and I thought I had removed them
from the commit. Oops.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 897f40f25d)
2014-04-14 11:48:50 -07:00
Ilia Mirkin
ac14d741ee nouveau: fix fence waiting logic in screen destroy
nouveau_fence_wait has the expectation that an external entity is
holding onto the fence being waited on, not that it is merely held onto
by the current pointer. Fixes a use-after-free in nouveau_fence_wait
when used on the screen's current fence.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75279
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 507f0230d4)

Conflicts:
	src/gallium/drivers/nouveau/nv30/nv30_screen.c
2014-04-14 11:48:50 -07:00
Marek Olšák
acfb3f7f02 mesa: fix the format of glEdgeFlagPointer
Softpipe expects a float in the vertex shader, which is what glEdgeFlag
generates.

This fixes piglit/gl-2.0-edgeflag.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 780ce576bb)
2014-04-14 11:48:50 -07:00
Marek Olšák
228ad18b84 r600g: fix blitting the last 2 mipmap levels for Evergreen
This fixes a lot of compressedteximage piglit tests.

R600-R700 don't have this issue.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit fcdf6fa86c)
2014-04-14 11:48:49 -07:00
Marek Olšák
7c09f4bb44 r600g: fix texelFetchOffset GLSL functions
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8a08051e2a)
2014-04-14 11:48:49 -07:00
Matt Turner
3eb2103ce0 mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__.
Because people insist on doing things like explicitly disabling SSE 4.1.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberger <david.heidelberger@ixit.cz>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71547
(cherry picked from commit 8d3f739383)
2014-04-14 11:48:49 -07:00
Brian Paul
58fe564607 mesa: fix copy & paste bugs in pack_ubyte_SRGB8()
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 1e25aa4cdb)
2014-04-14 11:48:49 -07:00
Brian Paul
16fc050e07 mesa: fix copy & paste bugs in pack_ubyte_SARGB8()
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 9493fc729e)
2014-04-14 11:48:49 -07:00
Aaron Watry
248d82515f gallium/util: Fix memory leak
Fix a leaked vertex shader in u_blitter.c

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

CC: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fb78152678)
2014-04-14 11:48:49 -07:00
Anuj Phogat
728f58c534 mesa: Allow GL_DEPTH_COMPONENT and GL_DEPTH_STENCIL combinations in glTexImage{123}D()
From OpenGL 3.3 spec, page 141:
   "Textures with a base internal format of DEPTH_COMPONENT or DEPTH_STENCIL
    require either depth component data or depth/stencil component data.
    Textures with other base internal formats require RGBA component data.
    The error INVALID_OPERATION is generated if one of the base internal
    format and format is DEPTH_COMPONENT or DEPTH_STENCIL, and the other
    is neither of these values."

Fixes Khronos OpenGL CTS test failure: proxy_textures_invalid_size

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 079bff5a99)
2014-04-14 11:48:49 -07:00
Anuj Phogat
df15372b65 mesa: Set initial internal format of a texture to GL_RGBA
From OpenGL 4.0 spec, page 398:
   "The initial internal format of a texel array is RGBA
    instead of 1. TEXTURE_COMPONENTS is deprecated; always
    use TEXTURE_INTERNAL_FORMAT."

Fixes Khronos OpenGL CTS test failure: proxy_textures_invalid_size

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 063980151e)
2014-04-14 11:48:49 -07:00
Brian Paul
50d65b4374 st/osmesa: check buffer size when searching for buffers
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75543
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cbacee207f)
2014-04-14 11:48:49 -07:00
José Fonseca
f4e348d008 c11/threads: Don't implement thrd_current on Windows.
GetCurrentThread() returns a pseudo-handle (a constant which only makes
sense when used within the calling thread) and not a real handle.

DuplicateHandle() will return a real handle, but it will create a new
handle every time we call.  Calling DuplicateHandle() here means we will
leak handles, which can cause serious problems.

In short, the Windows implementation of thrd_t needs a thorough
makeover, and it won't be pretty.  It looks like the C11 committee
over-simplified things: it would be much better to have separate objects
for threads and thread IDs like C++11 does.

For now, just comment out the thrd_current() implementation, so we get
build errors if anybody tries to use it.

Thanks to Brian Paul for spotting and diagnosing this problem.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit a61d859519)
2014-04-14 11:48:49 -07:00
José Fonseca
34e8881ac7 mapi/u_thread: Use GetCurrentThreadId
u_thread_self() expects thrd_current() to return a unique numeric ID
for the current thread, but this is not feasible on Windows.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit e8d85034da)
2014-04-14 11:48:49 -07:00
José Fonseca
30274cf36c c11/threads: Fix nano to millisecond conversion.
Per https://gist.github.com/yohhoy/2223710/#comment-710118

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
(cherry picked from commit f34d75d6f6)
2014-04-14 11:48:49 -07:00
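For reference, a nanosecond-to-millisecond conversion of the kind 30274cf36c above is about might look like the sketch below. It assumes a standard `struct timespec` rather than the emulation's own time type, and `timespec_to_ms()` is a hypothetical helper name.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: convert a timespec to whole milliseconds.
 * 1 s = 1000 ms and 1 ms = 1,000,000 ns, so the seconds are
 * multiplied while the nanoseconds are divided. */
static uint64_t timespec_to_ms(const struct timespec *ts)
{
    return (uint64_t)ts->tv_sec * 1000u +
           (uint64_t)ts->tv_nsec / 1000000u;
}
```

Sub-millisecond remainders are truncated here; a caller that needs a timeout to last at least as long as requested may prefer to round up instead.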
Hans
64f54cc8f4 mesa: don't define c99 math functions for MSVC >= 1800
Signed-off-by: Brian Paul <brianp@vmware.com>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 837da9bdae)
2014-04-10 11:01:54 -07:00
Hans
4f668babc6 util: don't define isfinite(), isnan() for MSVC >= 1800
Signed-off-by: Brian Paul <brianp@vmware.com>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bf25660325)
2014-04-10 11:01:23 -07:00
Brian Paul
ee51c6aae7 mesa: don't call ctx->Driver.ClearBufferSubData() if size==0
Fixes failed assertion when trying to map zero-length region.

https://bugs.freedesktop.org/show_bug.cgi?id=75660
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit aff7c5e78a)
2014-04-10 11:01:00 -07:00
Brian Paul
3491c57bd9 softpipe: use 64-bit arithmetic in softpipe_resource_layout()
To avoid 32-bit integer overflow for large textures.  Note: we're
already doing this in llvmpipe.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 465b2c42bc)
2014-04-10 11:00:16 -07:00
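The overflow that 3491c57bd9 above avoids can be shown with a small sketch. The helper `image_size_bytes()` and its parameters are hypothetical; the only point is that the product has to be computed in 64 bits, not merely stored in a 64-bit variable afterwards.

```c
#include <stdint.h>

/* Hypothetical helper: byte size of a tightly packed 3D image. */
static uint64_t image_size_bytes(unsigned width, unsigned height,
                                 unsigned depth, unsigned bytes_per_texel)
{
    /* Widening the first operand makes every following multiplication
     * a 64-bit one.  With plain unsigned operands the product would
     * wrap at 2^32 before being converted. */
    return (uint64_t)width * height * depth * bytes_per_texel;
}
```

For example, a 16384 x 16384 image with 16 bytes per texel needs exactly 2^32 bytes, which is 0 when truncated to 32 bits.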
Ian Romanick
4a86465f47 mesa: Bump version to 10.1 (final)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-03-05 08:59:46 +02:00
Julien Cristau
03d0c9fd30 glx/dri2: fix build failure on HURD
Patch from Debian package.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6f0e2731e8)
2014-03-05 08:38:38 +02:00
Chris Forbes
4c0702b05c i965: Validate (and resolve) all the bound textures.
BRW_MAX_TEX_UNIT is the static limit on the number of textures we
support per-stage, not in total.

Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which
is significantly larger, and across the various shader stages, up to
ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually
used.

Fixes invisible bad behavior in piglit's max-samplers test (although
this escalated to an assertion failure on HSW with texture_view, since
non-immutable textures only have _Format set by validation.)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit befbda56a2)
2014-03-03 09:36:41 +02:00
Chris Forbes
5fbd649451 i965: Widen sampler key bitfields for 32 samplers
Previously the `high` 16 samplers on Haswell+ would not get sampler
workarounds applied.

Don't bother widening YUV fields, since they're ignored and going away
soon anyway.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 590920f93e)
2014-03-03 09:36:01 +02:00
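Why a 16-bit field silently loses the upper 16 samplers, as described in 5fbd649451 above, can be seen in the following sketch. The function names are hypothetical and the masks stand in for whatever per-sampler bits a key carries; `sampler` is assumed to be below 32.

```c
#include <stdint.h>

/* Hypothetical 16-bit mask with one bit per sampler index. */
static uint16_t mark_sampler_16(uint16_t mask, unsigned sampler)
{
    /* Bits 16..31 do not fit in a 16-bit field and are dropped by the
     * conversion, so samplers 16 and up are never recorded. */
    return (uint16_t)(mask | (UINT32_C(1) << sampler));
}

/* The widened variant keeps a bit for each of the 32 samplers. */
static uint32_t mark_sampler_32(uint32_t mask, unsigned sampler)
{
    return mask | (UINT32_C(1) << sampler);
}
```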
Ian Romanick
05b9e6a963 mesa: Bump version to 10.1-rc3
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-03-01 08:53:32 -08:00
Emil Velikov
92e8c52340 dri/i9*5: correctly calculate the amount of system memory
The variable name states megabytes, while we calculate the amount in
kilobytes. Correct this by dividing by the correct amount.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit fc25956bad)
2014-03-01 08:53:32 -08:00
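The unit mix-up fixed in 92e8c52340 above comes down to the choice of divisor. The sketch below is an illustration under assumptions: `system_memory_megabytes()` is a hypothetical helper, and the page count and page size are passed in by the caller instead of being queried from the system.

```c
#include <stdint.h>

/* Hypothetical helper: total system memory in megabytes, given the
 * number of physical pages and the page size in bytes. */
static uint64_t system_memory_megabytes(uint64_t phys_pages, uint64_t page_size)
{
    /* Dividing the byte count by 1024 would yield kilobytes;
     * megabytes need a divisor of 1024 * 1024. */
    return phys_pages * page_size / (UINT64_C(1024) * 1024);
}
```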
Brian Paul
3f0011edfd mesa: add unpacking code for MESA_FORMAT_Z32_FLOAT_S8X24_UINT
Fixes glGetTexImage() when converting from MESA_FORMAT_Z32_FLOAT_S8X24_UINT
to GL_UNSIGNED_INT_24_8.  Hit by the piglit
ext_packed_depth_stencil-getteximage test.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a12d4d0398)
2014-03-01 08:29:42 -08:00
Ian Romanick
6e3ce7997a i915: Allocate the sys_buffer using _mesa_align_malloc
Though it won't matter on Linux, use _mesa_align_free to release it.
Since i965 doesn't have sys_buffer, I overlooked this in the
GL_ARB_map_buffer_alignment work a few months ago.  Fixes i915 (and
presumably i830) regressions in ARB_map_buffer_range tests and the
failure in arb_map_buffer_alignment-sanity_test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74960
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ff2cbf9e0c)
2014-02-28 15:23:59 -08:00
Ian Romanick
1b6aad2234 i915: Only allow 8 vertex texture units
There's no reason to have more vertex texture units than fragment
texture units on this hardware.  Since increasing the default maximum
number of texture units from 16 to 32, this has triggered a segfault
in the i915 driver.  There's probably some array or bitfield that isn't
properly sized now.  This really papers over the bug, but I don't think
I'll lose any sleep over that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74071
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 8ba157006f)
2014-02-28 15:23:57 -08:00
Petri Latvala
b34f05f6a7 i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
v2: Don't add function parameters, pass the required size in
prog_data->nr_params.

v3:
- Use the name uniform_array_size instead of uniform_param_count.
- Round up when dividing param_count by 4.
- Use MAX2() instead of taking the maximum by hand.
- Don't crash if prog_data passed to vec4_visitor constructor is NULL

v4: Rebase for current master

v5 (idr): Trivial whitespace change.

Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7189fce237)
2014-02-28 15:22:50 -08:00
Tom Stellard
677fde5ca0 r600g/compute: PIPE_CAP_COMPUTE should be false for pre-evergreen GPUs
This prevents clover from using unsupported devices.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

CC: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f61e382f0a)
2014-02-28 14:32:42 -08:00
Matt Turner
3305b9c96b glsl: Don't vectorize horizontal expressions.
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75224
(cherry picked from commit 4bd7f1d044)
2014-02-28 14:32:39 -08:00
Matt Turner
a43b8bfa78 glsl: Add is_horizontal() method to ir_expression.
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5eff8576ba)
2014-02-28 14:32:37 -08:00
Brian Paul
862572b205 mesa: do depth/stencil format conversion in glGetTexImage
glGetTexImage(GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8) was just
using memcpy() instead of _mesa_unpack_uint_24_8_depth_stencil_row()
to convert texels from the hardware format to the GL format.

Fixes issue reported by David Meng at Intel.  The new piglit
ext_packed_depth_stencil-getteximage test checks for this bug.

Also, add some format/type assertions.  We don't yet handle the
GL_FLOAT_32_UNSIGNED_INT_24_8_REV type.  That should be fixed in
a follow-on patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43dee0295e)
2014-02-28 14:32:34 -08:00
Thomas Hellstrom
037f357564 winsys/svga: Avoid calling drm getparam for max surface size on older kernels
This avoids the kernel driver spewing out errors about the param not being
supported.

Also correct the max surface size used when the kernel does not support the
query.

Reported-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f5e681f3fa)
2014-02-28 14:32:31 -08:00
Anuj Phogat
bef5554092 i965: Fix the region's pitch condition to use blitter
intelEmitCopyBlit uses a signed 16-bit integer to represent
buffer pitch, so it can only handle buffer pitches < 32k.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit b3094d9927)
2014-02-26 18:07:14 -08:00
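The pitch limit mentioned in bef5554092 above follows directly from the range of a signed 16-bit integer. A minimal sketch, with `pitch_fits_int16()` as a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: can this pitch, in bytes, be stored in a signed
 * 16-bit field?  INT16_MAX is 32767, so the answer is yes only for
 * pitches strictly below 32768 (32k). */
static bool pitch_fits_int16(uint32_t pitch)
{
    return pitch <= (uint32_t)INT16_MAX;
}
```

A caller would fall back to a different copy path when this returns false.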
Kenneth Graunke
09b03dcee6 i965: Don't try to dump shader source for fixed-function FS programs.
sh->Source is NULL and this will segfault.

Fixes MESA_GLSL=dump with "The Swapper".

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit f896e82301)
2014-02-26 18:07:11 -08:00
Kenneth Graunke
45cb6063e7 glsl: Delete LRP_TO_ARITH lowering pass flag.
It's kind of a trap---calling do_common_optimization() after
lower_instructions() may cause opt_algebraic() to reintroduce
ir_triop_lrp expressions that were lowered, effectively defeating the
point.  Because of this, nobody uses it.

v2: Delete more code (caught by Ian Romanick).

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit ac0a8b9540)
2014-02-26 18:07:08 -08:00
Kenneth Graunke
9cc1bbcaf4 i965: Stop lowering ir_triop_lrp.
Both the vector and scalar backends now support it natively, so there's
no point in lowering it.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 2fdea48e21)
2014-02-26 18:07:04 -08:00
Kenneth Graunke
24abd48ac0 i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.
When the vec4 backend encountered an ir_triop_lrp, it always emitted an
actual LRP instruction, which only exists on Gen6+.  Gen4-5 used
lower_instructions() to decompose ir_triop_lrp at the IR level.

Since commit 8d37e9915a ("glsl: Optimize open-coded lrp into lrp."),
we've had a bug where lower_instructions translates ir_triop_lrp into
arithmetic, but opt_algebraic reassembles it back into a lrp.

To avoid this ordering concern, just handle ir_triop_lrp in the backend.
The FS backend already does this, so we may as well do likewise.

v2: Add a comment reminding us that we could emit better assembly if we
    implemented the infrastructure necessary to support using MAC.
    (Assembly code provided by Eric Anholt).

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75253
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 56879a7ac4)
2014-02-26 18:07:00 -08:00
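The arithmetic that a linear interpolation decomposes into, as discussed in 24abd48ac0 above, can be written out as a scalar sketch. The operand order here is assumed to follow GLSL's mix(x, y, a), and `lrp_arith()` is a hypothetical name; a real backend would emit the equivalent instructions per vector component.

```c
/* Hypothetical scalar form of a lowered lrp:
 * the result is x when a == 0 and y when a == 1. */
static float lrp_arith(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}
```

On hardware without a native LRP instruction, a backend can emit this multiply-and-add sequence itself, so it does not depend on an IR-level lowering pass that a later optimization may undo.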
Kenneth Graunke
2475db34a0 i965/vec4: Add a brw->gen >= 6 assertion in three-source emitters.
Three source instructions didn't exist until Gen6.  vec4_generator has
assertions to catch this, but catching it in the visitor provides a
nicer backtrace.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit ffde483f3c)
2014-02-26 18:06:15 -08:00
Francisco Jerez
3efb934dee i965/vec4: Add non-mutating helper functions to modify src_reg::swizzle and ::negate.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit 98306e727b)
2014-02-26 17:58:20 -08:00
Brian Paul
29876a4d28 gallium/pipebuffer: change pb_cache_manager_create() size_factor to float
Requested by Marek.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e4a5a9fd2f)
2014-02-26 13:35:59 -07:00
Thomas Hellstrom
00769d0322 svga/winsys: Propagate surface shared information to the winsys
The linux winsys needs to know whether a surface is shared.
For guest-backed surfaces we need this information to avoid allocating a
mob out of the mob cache for shared surfaces, but instead allocate a shared
mob, that is never put in the mob cache, from the kernel.

Also previously, all surfaces were given the "shareable" attribute when
allocated from the kernel. This is too permissive for client-local surfaces.
Now that we have the needed info, only set the "shareable" attribute if the
client indicates that it needs to share the surface.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 141e39a893)
2014-02-26 13:35:59 -07:00
Brian Paul
e9a3a8997d svga/winsys: implement GBS support
This is a squash commit of many commits by Thomas Hellstrom.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fe6a854477)
2014-02-26 13:35:59 -07:00
Thomas Hellstrom
a809de8bd9 gallium/util: Add flush/map debug utility code
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 59e7c59621)
2014-02-26 13:35:59 -07:00
Thomas Hellstrom
19740e3085 gallium/pipebuffer: Add a cache buffer manager bypass mask
In some situations, it may be desirable to bypass the cache at buffer
creation but to insert the buffer in the cache at buffer destruction.
One such situation is where we already have a kernel representation of a
buffer that we want to use, but we also want to insert it in the cache when
it's freed up.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8af358d8bc)
2014-02-26 13:35:59 -07:00
Thomas Hellstrom
a44639d826 pipebuffer, winsys: Add a size match parameter to the cached buffer manager
In some situations it's important to restrict the sizes of buffers that the
cached buffer manager is allowed to return.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c9e9b1862b)
2014-02-26 13:35:59 -07:00
Brian Paul
03035d6074 svga: update texture code for GBS
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3d1fd6df53)
2014-02-26 13:35:59 -07:00
Brian Paul
ab7074e024 svga: update buffer code for GBS
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 72b0e959fc)
2014-02-26 13:35:59 -07:00
Brian Paul
a0423a5be2 svga: add new helper functions for GBS buffers
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e0a6fb09bd)
2014-02-26 13:35:59 -07:00
Brian Paul
5abf1526d7 svga: remove a couple unneeded assertions
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6476bcbc50)
2014-02-26 13:35:59 -07:00
Brian Paul
6fc2e0a942 svga: adjust adjustment for point coordinates
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f8bbd8261d)
2014-02-26 13:35:59 -07:00
Brian Paul
6d5e27d19e svga: track which textures are rendered to
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d0c22a6d53)
2014-02-26 13:35:58 -07:00
Brian Paul
3a32f9773a svga: add helpers for tracking rendering to textures
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c1e60a61e8)
2014-02-26 13:35:58 -07:00
Brian Paul
18a7c83765 svga: update shader code for GBS
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f84c830b14)
2014-02-26 13:35:58 -07:00
Brian Paul
98c4fe0f5a svga: update constant buffer code for GBS
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2f1fc8db10)
2014-02-26 13:35:58 -07:00
Brian Paul
2e2719246a svga: add svga_have_gb_objects/dma() functions
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 31dfefc47f)
2014-02-26 13:35:58 -07:00
Brian Paul
5f69eb6caa svga: add new GBS commands
And update some existing commands.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 823fbfdca7)
2014-02-26 13:35:58 -07:00
Brian Paul
0ced104930 svga: update svga_winsys interface for GBS
This adds new interface functions for guest-backed surfaces and
adds a mobid parameter to the surface_relocation() function.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d993ada50c)
2014-02-26 13:35:58 -07:00
Brian Paul
8ae30c1fc4 svga: update dumping code with new GBS commands, etc
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 024711385e)
2014-02-26 13:35:57 -07:00
Brian Paul
ec9aef9ac2 svga: split / update svga3d header files
The old svga3d_reg.h file is split into separate header files and we
add new items for guest-backed surfaces.

Plus some minor code fixes because of renamed symbols.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e0c90847f)
2014-02-26 13:35:57 -07:00
Brian Paul
5b338d4b35 svga: replace out-of-temps assertion with debug warning
Signed-off-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 23d4ff53d4)
2014-02-26 13:35:57 -07:00
Brian Paul
07d1e7f12f svga: check shader size against max command buffer size
If the shader is too large, plug in a dummy shader.  This patch also
reworks the existing dummy shader code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 97fdace6d7)
2014-02-26 13:35:57 -07:00
Brian Paul
16c03a004d svga: refactor some shader code
Put common code in new svga_shader.c file.  Consolidate separate vertex/
fragment shader ID generation.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4686f610b1)
2014-02-26 13:35:57 -07:00
Fredrik Höglund
3d2979f83b glx: Fix the GLXFBConfig attrib sort priorities
The sort priorities for GLX_SAMPLES and GLX_SAMPLE_BUFFERS are
not defined in GL_ARB_multisample, but they are defined in
the GLX 1.4 specification.

Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3616e862f2)
2014-02-26 09:15:58 -08:00
Fredrik Höglund
3224f0c978 glx: Fix the default values for GLXFBConfig attributes
The default values for GLX_DRAWABLE_TYPE and GLX_RENDER_TYPE are
GLX_WINDOW_BIT and GLX_RGBA_BIT respectively, as specified in
the GLX 1.4 specification.

This fixes the glx-choosefbconfig-defaults piglit test.

Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f41c2f6c33)
2014-02-26 09:15:54 -08:00
Emil Velikov
fc2834f5ad nv50: correctly calculate the number of vertical blocks during transfer map
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 882070cc81)
2014-02-25 07:59:25 -08:00
Ilia Mirkin
e32e2836a3 nv50: make sure to clear _all_ layers of all attachments
Unfortunately there's only one RT_ARRAY_MODE setting for all
attachments, so clears were previously truncated to the minimum number
of layers any attachment had. Instead set the RT_ARRAY_MODE to 512 (the
max number of layers) before doing the clear. This fixes
gl-3.2-layered-rendering-clear-color-mismatched-layer-count.

Also fix clears of individual layered rt/zeta, in case it ever happens.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6152ba0894)
2014-02-24 11:09:52 -08:00
Christoph Bumiller
d8012560d5 nv50/ir/ra: fix SpillCodeInserter::offsetSlot usage
We were turning non-memory spill slots into NULL.

Cc: 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1f4bfb8797)
2014-02-24 11:09:47 -08:00
Christian König
8e4fec994c st/vdpau: add flush on unmap
Flush the context when we unmap a buffer, otherwise VDPAU might
start rendering the next frame while we still reference that buffer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: StrangeNoises (rachel@strangenoises.org)
(cherry picked from commit db54fca9b8)
2014-02-24 10:37:01 -08:00
Marek Olšák
5437d38fac vdpau: flush the context before exporting the surface v2
Bugzilla (bug needs XBMC changes as well):
https://bugs.freedesktop.org/show_bug.cgi?id=73191

When VL uploads vertex buffers, it uses PIPE_TRANSFER_DONTBLOCK, which always
flushes the context in the winsys if the buffer being mapped is busy. Since
I added handling of DISCARD_RANGE, DONTBLOCK has had no effect when combined
with DISCARD_RANGE and I think the context isn't flushed anywhere else,
so no commands are submitted to the GPU until the IB is full, which takes
a lot of frames.

Using DISCARD_RANGE is not the only way to trigger this bug. The other way
is to reallocate the vertex buffer before every upload.

BTW, I'm not sure if this is the right place for flushing, but it does fix
the bug.

v2 (chk): move the flush to the right place.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: StrangeNoises (rachel@strangenoises.org)
(cherry picked from commit 3f98053fc9)
2014-02-24 10:36:42 -08:00
Ian Romanick
fcb4eabb5f mesa: Bump version to 10.1-rc2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-02-21 14:29:44 -08:00
Kenneth Graunke
7cb4dea765 i965: Create a hardware context before initializing state module.
brw_init_state() calls brw_upload_initial_gpu_state().  If hardware
contexts are enabled (brw->hw_ctx != NULL), this will upload some
initial invariant state for the GPU.  Without hardware contexts, we
rely on this state being uploaded via atoms that subscribe to the
BRW_NEW_CONTEXT bit.

Commit 46d3c2bf4d accidentally moved
the call to brw_init_state() before creating a hardware context.
This meant brw_upload_initial_gpu_state would always early return.
Except on Gen6+, we stopped uploading the initial GPU state via
state atoms, so it never happened.

Fixes a regression since 46d3c2bf4d.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3663bbe773)
2014-02-21 14:28:37 -08:00
Ian Romanick
b498fb9586 glsl: Only warn for macro names containing __
From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:

    "In addition, all identifiers containing two consecutive underscores
     (__) are reserved as possible future keywords."

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Names simply containing __ are dangerous to use, but should
be allowed.

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 2c85fd5a96)
2014-02-20 13:41:14 -08:00
Ian Romanick
3731a4fae4 glcpp: Only warn for macro names containing __
Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:

    "All macro names containing two consecutive underscores ( __ ) are
    reserved for future use as predefined macro names. All macro names
    prefixed with "GL_" ("GL" followed by a single underscore) are also
    reserved."

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error.  Names simply
containing __ are dangerous to use, but should be allowed.  In similar
cases, the C++ preprocessor specification says, "no diagnostic is
required."
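
The rule described above can be sketched as a small classifier. This is an
illustration only, not the glcpp code: the enum and the function name
check_macro_name are hypothetical, and the real preprocessor reports
diagnostics through its own parser state.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes, for illustration only. */
enum macro_name_status {
    MACRO_NAME_OK,
    MACRO_NAME_WARNING,
    MACRO_NAME_ERROR
};

/* Names prefixed with "GL_" are reserved for Khronos, so they are
 * rejected.  Names that merely contain "__" are risky but allowed,
 * so they only draw a warning. */
enum macro_name_status check_macro_name(const char *name)
{
    assert(name != NULL);

    if (strncmp(name, "GL_", 3) == 0)
        return MACRO_NAME_ERROR;

    if (strstr(name, "__") != NULL)
        return MACRO_NAME_WARNING;

    return MACRO_NAME_OK;
}
```

The "GL_" check runs first, so a name such as GL_a__b is still treated as
an error rather than a warning.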

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 0bd7892630)
2014-02-20 13:41:09 -08:00
Anuj Phogat
d623eeb37a glsl: Fix condition to generate shader link error
GL_ARB_ES2_compatibility doesn't say anything about shader linking
when one of the shaders (vertex or fragment shader) is absent. So,
the extension shouldn't change the behavior specified in GLSL
specification.

Tested the behavior on proprietary linux drivers of NVIDIA and AMD.
Both of them allow linking a version 100 shader program in OpenGL
context, when one of the shaders is absent.

Makes the following Khronos CTS tests pass:
successfulcompilevert_linkprogram.test
successfulcompilefrag_linkprogram.test

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 03597cf802)
2014-02-18 16:47:33 -08:00
Anuj Phogat
6534d80cca mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target()
Fixes failing Khronos CTS test packed_depth_stencil_init.test

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6bd2472a8b)
2014-02-18 16:47:33 -08:00
Michel Dänzer
20eb466999 r600g,radeonsi: Consolidate logic for short-circuiting flushes
Fixes radeonsi emitting command streams to the kernel even when there
have been no draw calls before a flush, potentially powering up the GPU
needlessly.

Incidentally, this also cuts the runtime of piglit gpu.py in about half
on my Kaveri system, probably because an X11 client going away no longer
always results in a command stream being submitted to the kernel via
glamor.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65761
Cc: "10.1" mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit cf0172d46a)
2014-02-18 16:47:33 -08:00
Kusanagi Kouichi
706eef0cfe targets/vdpau: Always use c++ to link
If built without llvm, the following error occurs with mplayer:

Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE
[vo/vdpau] Error when calling vdp_device_create_x11: 1

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 61f6cddef7)
2014-02-18 16:47:33 -08:00
Carl Worth
a889b8e9aa main: Avoid double-free of shader Label
As documented, the _mesa_free_shader_program_data function:

	"Frees all the data that hangs off a shader program object, but not
	the object itself."

This means that this function may be called multiple times on the same object,
(and has been observed to). Meanwhile, the shProg->Label field was not being
set to NULL after its free(). This led to a second call to free() of the same
address on the second call to this function.

Fix this by setting this field to NULL after free(), (just as with all other
calls to free() in this function).
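
A minimal sketch of the pattern, assuming a simplified object. The struct
and the functions free_program_data and set_label are hypothetical
stand-ins, not the Mesa definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the shader program object. */
struct program {
    char *label;
};

/* Frees the data hanging off the object, but not the object itself.
 * Resetting the pointer makes a repeated call harmless, because
 * free(NULL) is defined to do nothing. */
void free_program_data(struct program *prog)
{
    assert(prog != NULL);

    free(prog->label);
    prog->label = NULL;
}

/* Illustration helper: gives the object a heap-allocated label.
 * Returns 0 on success, -1 if the allocation fails. */
int set_label(struct program *prog, const char *text)
{
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);

    if (copy == NULL)
        return -1;

    memcpy(copy, text, n);
    free(prog->label);
    prog->label = copy;
    return 0;
}
```

Without the assignment to NULL, the second call to free_program_data would
pass a dangling pointer to free().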

Reviewed-by: Brian Paul <brianp@vmware.com>

CC: mesa-stable@lists.freedesktop.org
(cherry picked from commit a92581acf2)
2014-02-18 16:47:33 -08:00
Alex Deucher
02d96b7e9f radeon: reverse DBG_NO_HYPERZ logic
Change the flag to DBG_HYPERZ and reverse the logic
so setting the flag enables the feature.  This disables
hyperz on r600g and radeonsi by default.  It can be
enabled by setting the env var.  There are just too
many issues with certain apps so leave it disabled for
now until we sort out the issues with the problematic
apps.

Bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=58660
https://bugs.freedesktop.org/show_bug.cgi?id=64471
https://bugs.freedesktop.org/show_bug.cgi?id=66352
https://bugs.freedesktop.org/show_bug.cgi?id=68799
https://bugs.freedesktop.org/show_bug.cgi?id=72685
https://bugs.freedesktop.org/show_bug.cgi?id=73088
https://bugs.freedesktop.org/show_bug.cgi?id=74428
https://bugs.freedesktop.org/show_bug.cgi?id=74803
https://bugs.freedesktop.org/show_bug.cgi?id=74863
https://bugs.freedesktop.org/show_bug.cgi?id=74892
https://bugzilla.kernel.org/show_bug.cgi?id=70411

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>
Acked-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 01e6371149)
2014-02-18 16:45:05 -08:00
Ilia Mirkin
a1b6aa9fe2 nouveau: fix chipset checks for nv1a by using the oclass instead
Commit f4ebcd133b ("dri/nouveau: NV17_3D class is not available for
NV1a chipset") fixed this partially by using the correct 3d class.
However there were a lot of checks left over comparing against the
chipset.

Reported-and-tested-by: John F. Godfrey <jfgodfrey@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0c8b165366)
2014-02-18 16:45:01 -08:00
Fredrik Höglund
150b1f0aac mesa: Preserve the NewArrays state when copying a VAO
Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72895
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 9afbd04d89)
2014-02-18 16:44:58 -08:00
Matt Turner
50066dc544 glsl: Do not vectorize vector array dereferences.
Array dereferences must have scalar indices, so we cannot vectorize
them.

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reported-by: Andrew Guertin <lists@dolphinling.net>
Tested-by: Andrew Guertin <lists@dolphinling.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 025d99ce3c)
2014-02-18 16:44:53 -08:00
Emil Velikov
088d642b8f dri/nouveau: Pass the API into _mesa_initialize_context
Currently we create a OPENGL_COMPAT context regardless of
what was requested by the program. Correct that by retaining
the program's request and passing it into _mesa_initialize_context.

Based on a similar commit for radeon/r200 by Ian Romanick.

Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 76d9f6d972)
2014-02-18 16:44:50 -08:00
Daniel Kurtz
69bd4ed017 glsl: Add locking to builtin_builder singleton
Consider a multithreaded program with two contexts A and B, and the
following scenario:

1. Context A calls initialize(), which allocates mem_ctx and starts
   building built-ins.
2. Context B calls initialize(), which sees mem_ctx != NULL and assumes
   everything is already set up.  It returns.
3. Context B calls find(), which fails to find the built-in since it
   hasn't been created yet.
4. Context A finally finishes initializing the built-ins.

This will break at step 3.  Adding a lock ensures that subsequent
callers of initialize() will wait until initialization is actually
complete.

Similarly, if any thread calls release while another thread is still
initializing, or calling find(), the mem_ctx/shader would get freed out
from under it, leading to corruption or use-after-free crashes.
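
The locking idea can be sketched in plain C with a POSIX mutex. This is
an assumption-laden illustration, not the builtin_builder code: the table,
its contents, and the builtins_* function names are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BUILTIN_COUNT 4

/* One lock guards the table pointer and everything reachable from it. */
static pthread_mutex_t builtins_lock = PTHREAD_MUTEX_INITIALIZER;
static int builtins_storage[BUILTIN_COUNT];
static int *builtins_table = NULL;  /* non-NULL once setup has started */
static int builtins_count = 0;

/* The whole setup runs under the lock, so a second caller cannot see
 * a non-NULL table and return before the entries actually exist. */
void builtins_initialize(void)
{
    pthread_mutex_lock(&builtins_lock);
    if (builtins_table == NULL) {
        builtins_table = builtins_storage;
        for (int i = 0; i < BUILTIN_COUNT; i++) {
            builtins_table[i] = i * i;
            builtins_count++;
        }
        assert(builtins_count == BUILTIN_COUNT);
    }
    pthread_mutex_unlock(&builtins_lock);
}

/* Lookups take the same lock, so they never race with setup or release.
 * Returns -1 when the entry does not exist. */
int builtins_find(int index)
{
    int result = -1;

    pthread_mutex_lock(&builtins_lock);
    if (builtins_table != NULL && index >= 0 && index < builtins_count)
        result = builtins_table[index];
    pthread_mutex_unlock(&builtins_lock);
    return result;
}

/* Release also takes the lock, so it cannot pull the table away from
 * a thread that is in the middle of initialize() or find(). */
void builtins_release(void)
{
    pthread_mutex_lock(&builtins_lock);
    builtins_table = NULL;
    builtins_count = 0;
    pthread_mutex_unlock(&builtins_lock);
}
```

Holding a single lock across all three entry points is the simplest
choice; it serializes lookups, which may or may not matter in practice.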

Fixes sporadic failures in Piglit's glx-multithread-shader-compile.

Bugzilla: https://bugs.freedesktop.org/69200
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b47d231526)
2014-02-18 16:44:46 -08:00
Kenneth Graunke
62a358892f mesa: Fix MESA_FORMAT_Z24_UNORM_S8_UINT vs. X8_UINT mix-up.
In commit eeed49f5f2, Mark accidentally
renamed MESA_FORMAT_S8_Z24 to MESA_FORMAT_Z24_UNORM_X8_UINT and
MESA_FORMAT_X8_Z24 to MESA_FORMAT_Z24_UNORM_S8_UINT, reversing their
sense.  The commit message was correct, but what sed commands actually
got run didn't match that.

This patch swaps the two enum names, reversing them.  This should undo
the damage, but might break things if people have manually fixed a few
instances in the meantime...

Mark's commit also failed to mention renames:
s/MESA_FORMAT_ARGB2101010_UINT\b/MESA_FORMAT_B10G10R10A2_UINT/g
s/MESA_FORMAT_ABGR2101010\b/MESA_FORMAT_R10G10B10A2_UNORM/g
but those seem okay.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a487ef87fe)
2014-02-10 08:45:16 -08:00
Ilia Mirkin
290648b076 nouveau/video: make sure that firmware is present when checking caps
Apparently some players are ill-prepared for us claiming that a decoder
exists only to have creating it fail, and express this poor preparation
with crashes (e.g. flash). Check that firmware is there to increase the
chances of there being a high correlation between reported capabilities
and ability to create a decoder.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 40dd777b33)
2014-02-10 08:44:56 -08:00
Grigori Goronzy
7f97f1fce4 gallium: add geometry shader output limits
v2: adjust limits for radeonsi and llvmpipe
v3: add documentation

Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit d34d5fddf8)
2014-02-10 08:44:52 -08:00
Ilia Mirkin
33169597f7 nv30: report 8 maximum inputs
nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for
nv40/nv30. This fixes compilation of the varying-packing tests.
Furthermore it appears that the last 2 inputs on nv4x don't seem to
work in those tests, so just report 8 everywhere for now.

Tested on NV42, NV44. NV4B appears to have additional problems.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.1 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 356aff3a5c)
2014-02-10 08:44:48 -08:00
Christoph Bumiller
966f2d3db8 nv50/ir/ra: some register spilling fixes
Cc: 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e9ee44797)
2014-02-10 08:44:41 -08:00
Brian Paul
3cefbe5cf5 mesa: update assertion in detach_shader() for geom shaders
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74723
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
(cherry picked from commit c325ec8965)
2014-02-10 08:44:37 -08:00
Ian Romanick
1e6bba58d8 mesa: Bump version to 10.1-rc1
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-02-07 18:34:08 -08:00
Christoph Bumiller
137a0fe5c8 nvc0: handle TGSI_SEMANTIC_LAYER
Cc: 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 882e98e5e6)
2014-02-07 17:10:11 -08:00
Kristian Høgsberg
70e8ec38b5 glx: Pass NULL DRI drawables into the DRI driver for None GLX drawables
GLX_ARB_create_context allows making a GLX context current with None
drawable and readables, but this was never implemented correctly in GLX.
We would create a __DRIdrawable for the None GLX drawable and pass that
to the DRI driver and that would somehow work.  Now it's somehow broken.

The way this should have worked is that we pass a NULL DRI drawable
to the DRI driver when the GLX user calls glXMakeContextCurrent()
with None for drawable and readables.

https://bugs.freedesktop.org/show_bug.cgi?id=74143
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
(cherry picked from commit f658150639)
2014-02-07 17:10:11 -08:00
Kristian Høgsberg
c79a7ef9a3 i965: Move intel_prepare_render() above first buffer access
The driver is supposed to ensure buffers before any drawing operation, but in
do_blit_drawpixels() and do_blit_copypixels() we inspect the buffer format
before calling intel_prepare_render().  That was covered up by the
unconditional call to intel_prepare_render() in intelMakeCurrent(), but we
now only do this on the initial intelMakeCurrent call for a context
(to get the size for the initial viewport values).

https://bugs.freedesktop.org/show_bug.cgi?id=74083

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Alexander Monakov <amonakov@gmail.com>
(cherry picked from commit 44338cd826)
2014-02-07 17:10:11 -08:00
Christoph Bumiller
17aeb3fdc9 nvc0/ir/emit: hardcode vertex output stream to 0 for now
(cherry picked from commit b7233acf78)
2014-02-07 17:10:11 -08:00
Kenneth Graunke
ecaf9259e9 glsl: Don't lose precision qualifiers when encountering "centroid".
Mesa fails to retain the precision qualifier when parsing:

   #version 300 es
   centroid in mediump vec2 v;

Consider how the parser's type_qualifier production is applied.
First, the precision_qualifier rule creates a new ast_type_qualifier:

    <precision: mediump>

Then the storage_qualifier rule creates a second one:

    <flags: in>

and calls merge_qualifier() to fold in any previous qualifications,
returning:

    <flags: in, precision: mediump>

Finally, the auxiliary_storage_qualifier creates one for "centroid":

    <flags: centroid>

it then does $$ = $1 and $$.flags |= $2.flags, resulting in:

    <flags: centroid, in>

Since precision isn't stored in the flags bitfield, it is lost.  We need
to instead call merge_qualifier to combine all the fields.
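
A simplified sketch of why OR-ing the flags is not enough. The struct,
the constants, and both functions are hypothetical; the real
ast_type_qualifier has many more fields and merge_qualifier also reports
errors for conflicting qualifiers.

```c
#include <assert.h>

enum { FLAG_IN = 1 << 0, FLAG_CENTROID = 1 << 1 };
enum { PRECISION_NONE = 0, PRECISION_MEDIUM = 2 };

/* Hypothetical, reduced qualifier: a flag bitfield plus one field
 * that lives outside the bitfield. */
struct qualifier {
    unsigned flags;
    int precision;
};

/* Mirrors "$$ = $1; $$.flags |= $2.flags": only the bitfield of the
 * second operand survives, its precision is silently dropped. */
struct qualifier combine_flags_only(struct qualifier first,
                                    struct qualifier rest)
{
    struct qualifier out = first;

    out.flags |= rest.flags;
    return out;
}

/* Merges every field, which is the behaviour the fix relies on. */
struct qualifier merge_all_fields(struct qualifier first,
                                  struct qualifier rest)
{
    struct qualifier out = first;

    /* This sketch assumes at most one side carries a precision. */
    assert(first.precision == PRECISION_NONE ||
           rest.precision == PRECISION_NONE);

    out.flags |= rest.flags;
    if (rest.precision != PRECISION_NONE)
        out.precision = rest.precision;
    return out;
}
```

With first = <flags: centroid> and rest = <flags: in, precision: mediump>,
the first function returns no precision while the second keeps mediump.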

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2062f40d81)
2014-02-07 17:10:10 -08:00
Brian Paul
0fb761b404 st/mesa: avoid sw fallback for getting/decompressing textures
If st_GetTexImage() is to decompress the texture, avoid the fallback
path even if prefer_blit_based_texture_transfer = false.  For drivers
that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we
were always taking the fallback path for texture decompression rather
than rendering a quad.  The latter is a lot faster.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f47e596288)
2014-02-07 17:10:10 -08:00
Ilia Mirkin
31911f8d37 nv50: only over-allocate by a page for code
The pre-fetching doesn't go too far. Tested with over-allocating by only
a page, and didn't see any errors in dmesg. Saves ~512KB of VRAM.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
(cherry picked from commit f76c7ad5b1)
2014-02-07 17:10:10 -08:00
Ilia Mirkin
142f6cc0b4 nv50: fix layerid to be the fp input number rather than vp output number
In the tests they were the same so it didn't matter, but indications are
that this is the correct behaviour. Also take this opportunity to
(trivially) support using gl_Layer in fp.

Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
(cherry picked from commit 364bdd2419)
2014-02-07 17:10:10 -08:00
Ilia Mirkin
156ac628a8 nv50: rework primid logic
Functionally identical but much simpler. Should also better integrate
with future layer/viewport changes/fixes.

Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
(cherry picked from commit c7373b7dc7)
2014-02-07 17:10:10 -08:00
Matt Turner
7aa84761b6 glsl: Initialize ubo_binding_mask flags to zero.
Missed in commit e63bb298. Caused sporadic test failures, like
incorrect-in-layout-qualifier-repeated-prim.geom.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit e2ef93cf94)
2014-02-07 17:10:10 -08:00
Marek Olšák
61219adb3d st/mesa: fix crash when a shader uses a TBO and it's not bound
This binds a NULL sampler view in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74251

Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit c6dbcf10df)
2014-02-07 17:10:10 -08:00
Paul Berry
ee632e68bd glsl: Fix continue statements in do-while loops.
From the GLSL 4.40 spec, section 6.4 (Jumps):

    The continue jump is used only in loops. It skips the remainder of
    the body of the inner most loop of which it is inside. For while
    and do-while loops, this jump is to the next evaluation of the
    loop condition-expression from which the loop continues as
    previously defined.

Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.

This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR.  (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).
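
The lowering can be illustrated in C, whose do-while has the same
"continue" semantics the GLSL spec describes. This is a sketch of the
idea only; the function names are hypothetical and the real change
operates on GLSL IR, not on C source.

```c
#include <assert.h>

/* Reference semantics: "continue" jumps to the condition check. */
int count_odd_reference(int limit)
{
    int i = 0, odd = 0;

    do {
        i++;
        if (i % 2 == 0)
            continue;
        odd++;
    } while (i < limit);
    return odd;
}

/* The old behaviour: the loop is lowered to an unconditional loop and
 * "continue" jumps to the top without testing the condition. */
int count_odd_buggy(int limit)
{
    int i = 0, odd = 0;

    for (;;) {
        i++;
        if (i % 2 == 0)
            continue;
        odd++;
        if (!(i < limit))
            break;
    }
    return odd;
}

/* The fix: replicate the loop condition in front of the "continue". */
int count_odd_lowered(int limit)
{
    int i = 0, odd = 0;

    for (;;) {
        i++;
        if (i % 2 == 0) {
            if (!(i < limit))   /* replicated loop condition */
                break;
            continue;
        }
        odd++;
        if (!(i < limit))
            break;
    }
    return odd;
}

/* Convenience check: the lowered loop matches the reference. */
void check_lowering(int limit)
{
    assert(count_odd_reference(limit) == count_odd_lowered(limit));
}
```

With an even limit such as 4, the buggy variant runs one extra iteration
because the exit condition is skipped on the iteration that continues.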

Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test

Cc: mesa-stable@lists.freedesktop.org

Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7f5740899f)
2014-02-07 17:10:10 -08:00
Paul Berry
b5c99be4af glsl: Make condition_to_hir() callable from outside ast_iteration_statement.
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).

This will be necessary in order to make continue statements work
properly in do-while loops.

Cc: mesa-stable@lists.freedesktop.org

Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 56790856b3)
2014-02-07 17:10:10 -08:00
Topi Pohjolainen
165868d45e i965/blorp: do not use unnecessary hw-blending support
This is really not needed as blorp blit programs already sample
XRGB normally and get alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message and hence there is no need for
any additional blending support from the pixel processing pipeline.

The blending formula is anyway broken for color components, it
multiplies the color component with itself (blend factor is the
component itself).
Alpha blending in turn would not fix the alpha to one independent
of the source but simply used the source alpha as is instead
(1.0 * src_alpha + 0.0 * dst_alpha).
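
A small numeric sketch of the two formulas. The function names are
hypothetical, and the destination factor of zero for the color path is an
assumption made for the illustration; the message only states that the
source factor is the component itself.

```c
#include <assert.h>

/* Generic blend equation: src * src_factor + dst * dst_factor. */
float blend(float src, float src_factor, float dst, float dst_factor)
{
    assert(src_factor >= 0.0f && src_factor <= 1.0f);
    assert(dst_factor >= 0.0f && dst_factor <= 1.0f);
    return src * src_factor + dst * dst_factor;
}

/* Color path: the blend factor is the component itself, so the
 * component ends up squared (destination factor assumed to be 0). */
float blended_color(float src, float dst)
{
    return blend(src, src, dst, 0.0f);
}

/* Alpha path: 1.0 * src_alpha + 0.0 * dst_alpha, i.e. the source
 * alpha passes through unchanged instead of being forced to 1.0. */
float blended_alpha(float src_alpha, float dst_alpha)
{
    return blend(src_alpha, 1.0f, dst_alpha, 0.0f);
}
```

Squaring leaves 0.0 and 1.0 unchanged, which is consistent with the
problem only showing up for component values other than zero or one.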

Quoting Eric:

 "If we want to actually make the no-alpha-bits-present thing work,
  we need to override the bits in the surface state or in the
  generated code.  In the normal draw path, it's done for sampling
  by the swizzling code in brw_wm_surface_state.c, and the blending
  overrides is just to fix up the alpha blending stage which
  doesn't pay attention to that for the destination surface."

If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.

This is effectively a revert of c0554141a9:

    i965/blorp: Support overriding destination alpha to 1.0.

    Currently, Blorp requires the source and destination formats to be
    equal.  However, we'd really like to be able to blit between XRGB and
    ARGB formats; our BLT engine paths have supported this for a long time.

    For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
    interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
    channel to 1.0 when writing the destination colors.  This is fairly
    straightforward with blending.

    For now, this code is never used, as the source and destination formats
    still must be equal.  The next patch will relax that restriction.

    NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 933be19cdf)
2014-02-07 17:10:10 -08:00
Christian König
bbcd975881 radeon/uvd: fix feedback buffer handling v2
Without the correct feedback buffer size UVD runs
into an error on each frame, reducing the maximum FPS.

v2: fixing Michel's comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "10.1" "10.0" "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c3c24c3acc)
2014-02-07 17:10:10 -08:00
Brian Paul
6cfcc4fccf draw: fix incorrect color of flat-shaded clipped lines
When we clipped a line we weren't copying the provoking vertex
color to the second vertex.  We also weren't checking for
first vs. last provoking vertex.

Fixes failures found with the new piglit line-flat-clip-color test.

Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit fc3fcd1e01)
2014-02-07 17:10:09 -08:00
Brian Paul
39a3b0313b gallium/auxiliary/indices: replace free() with FREE()
To match the CALLOC_STRUCT() call.

Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 307fd76053)
2014-02-07 17:10:09 -08:00
Dave Airlie
9e59e41266 docs: update 10.1 relnotes to note GL 3.3 on r600 and radeonsi.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-02-06 01:03:09 +00:00
Dave Airlie
1289080c4d r600g: Add GL 3.3 support for 10.1 release
All patches on master below, except max samplers
which was removed on master.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>

commit 57c6bb18822ebf88a98b98714c846608ff3ba42b
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Feb 6 00:48:57 2014 +0000

    bump max samplers

commit 2e4bd244493bebd41edf725a2c3c4e793282a5bb
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Jan 30 04:19:57 2014 +0000

    r600g: add support for geom shaders to r600/r700 chipsets (v2)

    This is my first attempt at enabling r600/r700 geometry shaders,
    the basic tests pass on both my rv770 and my rv635,

    It requires this kernel patch:
    http://www.spinics.net/lists/dri-devel/msg52745.html

    v2: address Alex's comments.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 0ed4f769d77c4db2259befba5fc1707f1cb5cb98
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jan 29 21:48:09 2014 +0000

    r600g: enable GLSL 3.30 on evergreen GPUs

    This throws the switch to enable GL 3.3 and GLSL 330.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit aeca8f21dd42b9ecd3932ef028fa8846036c1307
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Feb 4 10:48:42 2014 +1000

    r600g: properly propagate clip dist write value

    This moves the value from the GS shader to the copy shader so the registers
    are setup correctly.

    fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit e1bc410fe670bb17078a55876f1700a504127fef
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Feb 3 15:31:26 2014 +1000

    r600g: calculate a better value for array_size (v2)

    attempt to calculate a better value for array size to avoid breaking apps.

    v2: use 0xfff like streamout, suggested by Grigori

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 6f2f117dec51eb51c1b09e86e829e176a98e3bfc
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 31 03:35:51 2014 +0000

    r600g: fix CAYMAN geometry shader support

    cayman has a different end of program bit, so do that properly.

    fixes hangs with geom shader tests on cayman.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 305ea22fd517f83406aba3e3930d710fd42a3049
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jan 29 00:17:15 2014 +0000

    r600g: fix up shader out misc stuff for copy shader

    set the correct values so the misc out register is setup correctly
    for the copy shader.

    This also updates the state for the gs copy shader so the hw
    gets programmed correctly.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 53630e14c8791a84798a03d74653bf46bd013fc7
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jan 28 23:15:29 2014 +0000

    r600g: port the layered surface rendering patch from radeonsi

    This just makes r600 and evergreen do what the radeonsi codepaths do
    for layered rendering. This makes the 2d amd_vertex_shader_layer test
    pass on evergreen.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit aa4cd3b9bed1ea23468fba4aa5c428153e8cddc1
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jan 28 13:04:00 2014 +1000

    r600g: initial VS output layer support

    This just adds support for emitting the proper value in the VS out misc.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 75a93f2e1e0f4d6015cdf63570ec4d3d12478b8d
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jan 28 12:06:49 2014 +1000

    r600g: setup const texture buffers for geom shaders

    This just enables the workarounds we have for vertex/pixel shaders
    for geom shaders as well.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 88697a860635aae54e56dce2d6a839a06dea0c5a
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 24 17:14:26 2014 +1000

    r600g: calculate correct cut value

    This selects the cut value depending on the shader selected.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit dfb88bef3e13112a838773e700c35052774f8a63
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 24 14:46:37 2014 +1000

    r600g: fix dynamic_input_array_index.shader_test

    This follows what fglrx does, it unpacks the input we are
    going to indirect into a bunch of registers and indirects
    inside them.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit a3c6373f8cf3aab750399654a4b77150ec30bce9
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 24 13:39:36 2014 +1000

    r600g: add support for indirect geom ring writes

    We need to be able to write to the ring using a base register
    for when we emit vertices in a loop, in theory the SB compiler
    could collapse these indirect writes to direct writes if the
    register value is constant and known, but that is outside my
    pay grade.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit dbc6a13adf935b118eaa6b396593f50d7b7e16e6
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Dec 24 05:59:19 2013 +0000

    r600g: write proper output prim type

    Vadim's code derived it from the info.mode, but it needs
    to be taken from the geometry shader output primitive.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit f7f51b0b775f652967e2b972cf7c183482a771be
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Dec 24 05:30:37 2013 +0000

    r600g: enable instance cnt register with new enough kernel

    The instance cnt register was missing for a few kernels,
    with a new enough kernel we can output it.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 9e6ce37f66372018ec5398f74c3b43ff5f5bf309
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Dec 23 01:30:03 2013 +0000

    r600g: add primitive input support for gs

    only enable prim id if gs uses it

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit fa932dfc7df3cf9ff63d08fb0e1db2119fc2ac93
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Dec 19 05:17:00 2013 +0000

    r600g: emit streamout from dma copy shader

    This enables streamout with GS in the mix, from the
    VS dma shader.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 205defb542ac185b7f46508fd51a4077a4702107
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Dec 18 15:55:07 2013 +1000

    r600g/gs: fix cases where number of gs inputs != number of gs outputs

    this fixes a bunch of the geom shader built-in tests

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit d9e7ab40bc45644194c86f842599c76d0675243c
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jan 28 10:21:03 2014 +1000

    r600g: increase array base for exported parameters

    Trivial fix to Vadim's code.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 82d67fbd3b96b6b2cc0124a19b6f31b7912ec152
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jan 24 16:41:32 2014 +1000

    r600g: initialise the geom shader loop registers.

    As we do for vertex and pixel shaders.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 78be55d98d290d708bd1b3df3ef6cd5fa89865c7
Author: Dave Airlie <airlied@redhat.com>
Date:   Sat Nov 30 06:26:13 2013 +0000

    r600g: emit NOPs at end of shaders in more cases

    If the shader has no CF clauses at all, emit a NOP.
    If the last instruction is an ENDLOOP, add a NOP for the LOOP to go to.
    If the last instruction is CALL_FS, add a NOP.

    These fix a bunch of hangs in the geometry shader tests.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 634b2498dc73efa3cca5a6fc3ed35c5bea6bb2e9
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Nov 28 23:38:35 2013 +0000

    r600g: don't enable SB for geom shaders

    SB needs fixes for three GS instructions; it seems to raise
    them outside loops etc. despite my best efforts.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 5b61dd0e917e54625ac227b8b1c2c82955f51ab1
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Dec 24 04:56:25 2013 +0000

    r600g/sb: add MEM_RING support

    Although we don't use SB on geom shaders, the VS copy shader will use it
    so we might as well implement MEM_RING support in sb.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 0247375aec4681c154ae4d14b8cd637e7a9e0e3e
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jan 29 04:08:43 2014 +0000

    r600g: don't fail if we can't map VS->GS ring entries

    This can happen in normal operation, so don't report an error on it,
    just continue.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 2c986600fac6cb5692e9e377cb04f9f50389172c
Author: Vadim Girlin <vadimgirlin@gmail.com>
Date:   Fri Aug 2 06:38:23 2013 +0400

    r600g: initial support for geometry shaders on evergreen (v2)

    This is Vadim's initial work with a few regression fixes squashed in.

    v2: (airlied)
    fix regression in glsl-max-varyings - need to use vs and ps_dirty
    fix regression in shader exports from rebasing.
    whitespace fixing.
    v2.1: squash fix assert

    Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit ce23c43e2b611f30964afe4d1c02c4d0361ba430
Author: Vadim Girlin <vadimgirlin@gmail.com>
Date:   Fri Aug 2 06:32:32 2013 +0400

    r600g: add hw register definitions for GS block setup

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit b0ec79c28d6373930ca0dc19168dd504204456b5
Author: Vadim Girlin <vadimgirlin@gmail.com>
Date:   Wed Jul 31 23:09:39 2013 +0400

    r600g: defer shader variant selection and depending state updates

    [airlied: fix dropped streamout line - fix for master]

    Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit e41cbfb4d15d519f9301699f39d7dd0153f2edf4
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Jan 13 10:19:00 2014 +1000

    r600g/bc: add support for indexed memory writes.

    It looks like we need these for geom shaders in the future.

    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit 46efb1648e883b2cb231cca38c1540e7e9ec1ecc
Author: Vadim Girlin <vadimgirlin@gmail.com>
Date:   Wed Jul 31 20:02:22 2013 +0400

    r600g: move barrier and end_of_program bits from output to cf struct (v2)

    v2: fix regression on r600 NOP instructions.

    Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

commit 42802d5d8d145f07cf3fca1bb6e8ab0cd1fd5c85
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jan 29 01:33:14 2014 +0000

    r600g: split streamout emit code into a separate function

    For geometry shaders we need to call this code from a second place.

    Just move it out for now to keep future patches cleaner.

    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-02-06 00:49:58 +00:00
337 changed files with 16599 additions and 4170 deletions


@@ -1 +1 @@
-10.1.0-devel
+10.1.6

28
bin/.cherry-ignore Normal file

@@ -0,0 +1,28 @@
# This patch does not apply cleanly, author says it can be skipped.
dff3eccd158d648482bb47118ef5d57a9186e5a4
# And this one depends on the above, author says it too can be skipped.
ac35ded4733883037316d556af596524e5e02535
# This patch introduces some regressions. See:
# https://bugs.freedesktop.org/show_bug.cgi?id=77443
1afe3359258a9e89b62c8638761f52d78f6d1cbc
# Author retracted this from consideration for stable branch
3e817e7e56806d8adb8f16c35136045c29908944
# And this one was simply a bug fix for the previously-retracted commit
2bab95973d8ad3a84f62670143d6f26c230d9582
# Here we have a commit, and its subsequent "revert" both proposed within a
# single window of the stable release. So we can achieve the same final effect
# by ignoring both of the commits.
e3cc0d90e14e62a0a787b6c07a6df0f5c84039be
0d5ec2c615784929be095951f9269773a790a2dd
# The function being modified here (_eglCreateWindowSurfaceCommon) does not
# exist in the 10.1 branch.
91ff0d4c6510dc38f279c586ced17fba917873e7
# This patch is not needed (modifies work only in 10.2)
6980cae6aeb6671b6b0245e20a2d34957c1fff0a
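
The ignore file above is a flat list of commit SHAs with `#` comment lines in between. A minimal sketch of how a script might consult such a list follows. The helper name `is_ignored` and the throwaway demo file are assumptions made for illustration and are not taken from the Mesa scripts.

```sh
#!/bin/sh
# Hypothetical helper, not taken from the Mesa scripts: succeed (exit 0)
# when the given SHA is listed in the ignore file. Lines starting with '#'
# are treated as comments and skipped.
is_ignored() {
    sha="$1"
    ignore_file="$2"
    [ -f "$ignore_file" ] || return 1
    grep -v '^#' "$ignore_file" | grep -q "^$sha"
}

# Demo with a throwaway ignore file holding one entry from the list above.
demo_file=$(mktemp)
printf '%s\n' \
    '# This patch does not apply cleanly, author says it can be skipped.' \
    'dff3eccd158d648482bb47118ef5d57a9186e5a4' > "$demo_file"

for sha in dff3eccd158d648482bb47118ef5d57a9186e5a4 \
           0000000000000000000000000000000000000000; do
    if is_ignored "$sha" "$demo_file"; then
        echo "$sha: ignored"
    else
        echo "$sha: still a candidate"
    fi
done

rm -f "$demo_file"
```

Filtering the comment lines first keeps a SHA that is only mentioned inside a comment from counting as an ignore entry.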


@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
-git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
+git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*10\.1.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
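
The effect of the tightened `--grep` pattern in this hunk can be checked outside git. The sketch below is an illustration under stated assumptions: it translates the two patterns from the basic-regex form used above into extended-regex form for `grep -E`, and it runs them over four made-up commit-message lines. The function names `sample_lines` and `count_matches` are hypothetical.

```sh
#!/bin/sh
# ERE translations of the old and new --grep patterns from the hunk above.
old='^([[:space:]]*NOTE: .*[Cc]andidate|CC:.*mesa-stable)'
new='^([[:space:]]*NOTE: .*[Cc]andidate|CC:.*10\.1.*mesa-stable)'

# Made-up commit-message lines, one nomination style per line.
sample_lines() {
    cat <<'EOF'
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
NOTE: This is a candidate for the stable branches.
EOF
}

# Count the sample lines a pattern selects (case-insensitive, like -i above).
count_matches() {
    sample_lines | grep -E -i -c "$1"
}

echo "old pattern matches: $(count_matches "$old")"
echo "new pattern matches: $(count_matches "$new")"
```

With these samples the old pattern selects all four lines, while the new one selects only the line that names 10.1 explicitly plus the `NOTE:` line, so a nomination for 10.2 alone, or one with no version at all, no longer lands on the 10.1 pick list.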


@@ -542,11 +542,20 @@ AC_ARG_ENABLE([dri],
[enable DRI modules @<:@default=enabled@:>@])],
[enable_dri="$enableval"],
[enable_dri=yes])
+case "$host_os" in
+linux*)
+dri3_default=yes
+;;
+*)
+dri3_default=no
+;;
+esac
AC_ARG_ENABLE([dri3],
[AS_HELP_STRING([--enable-dri3],
-[enable DRI3 @<:@default=enabled@:>@])],
+[enable DRI3 @<:@default=auto@:>@])],
[enable_dri3="$enableval"],
-[enable_dri3=yes])
+[enable_dri3="$dri3_default"])
AC_ARG_ENABLE([glx],
[AS_HELP_STRING([--enable-glx],
[enable GLX library @<:@default=enabled@:>@])],
@@ -771,6 +780,13 @@ if test "x$have_libdrm" = xyes; then
DEFINES="$DEFINES -DHAVE_LIBDRM"
fi
+case "$host_os" in
+linux*)
+need_libudev=yes ;;
+*)
+need_libudev=no ;;
+esac
PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED],
have_libudev=yes, have_libudev=no)
@@ -830,9 +846,6 @@ xyesno)
PKG_CHECK_MODULES([DRI2PROTO], [dri2proto >= $DRI2PROTO_REQUIRED])
GL_PC_REQ_PRIV="$GL_PC_REQ_PRIV libdrm >= $LIBDRM_REQUIRED"
if test x"$enable_dri3" = xyes; then
-if test x"$have_libudev" != xyes; then
-AC_MSG_ERROR([DRI3 requires libudev >= $LIBUDEV_REQUIRED])
-fi
PKG_CHECK_MODULES([DRI3PROTO], [dri3proto >= $DRI3PROTO_REQUIRED])
PKG_CHECK_MODULES([PRESENTPROTO], [presentproto >= $PRESENTPROTO_REQUIRED])
fi
@@ -1017,7 +1030,7 @@ if test "x$enable_dri" = xyes; then
gnu*)
DEFINES="$DEFINES -DUSE_EXTERNAL_DXTN_LIB=1"
DEFINES="$DEFINES -DHAVE_ALIAS"
-;;
+;;
solaris*)
DEFINES="$DEFINES -DUSE_EXTERNAL_DXTN_LIB=1"
;;
@@ -1037,7 +1050,7 @@ if test "x$enable_dri" = xyes; then
DRI_DIRS=`echo "$DRI_DIRS" | $SED 's/ */ /g'`
# Check for expat
-PKG_CHECK_EXISTS([EXPAT], [have_expat=yes], [have_expat=no])
+PKG_CHECK_EXISTS([expat], [have_expat=yes], [have_expat=no])
if test "x$have_expat" = "xyes"; then
PKG_CHECK_MODULES([EXPAT], [expat], [],
AC_MSG_ERROR([Expat required for DRI.]))
@@ -1178,8 +1191,8 @@ if test "x$enable_gbm" = xauto; then
esac
fi
if test "x$enable_gbm" = xyes; then
-if test x"$have_libudev" != xyes; then
-AC_MSG_ERROR([gbm needs udev])
+if test "x$need_libudev$have_libudev" = xyesno; then
+AC_MSG_ERROR([gbm requires udev >= $LIBUDEV_REQUIRED])
fi
if test "x$enable_dri" = xyes; then
@@ -1187,10 +1200,21 @@ if test "x$enable_gbm" = xyes; then
if test "x$enable_shared_glapi" = xno; then
AC_MSG_ERROR([gbm_dri requires --enable-shared-glapi])
fi
+else
+# Strictly speaking libgbm does not require --enable-dri, although
+# both of its backends do. Thus one can build libgbm without any
+# backends if --disable-dri is set.
+# To avoid unnecessary complexity of checking if at least one backend
+# is available when building, just mandate --enable-dri.
+AC_MSG_ERROR([gbm requires --enable-dri])
fi
fi
AM_CONDITIONAL(HAVE_GBM, test "x$enable_gbm" = xyes)
-GBM_PC_REQ_PRIV="libudev"
+if test "x$need_libudev" = xyes; then
+GBM_PC_REQ_PRIV="libudev >= $LIBUDEV_REQUIRED"
+else
+GBM_PC_REQ_PRIV=""
+fi
GBM_PC_LIB_PRIV="$DLOPEN_LIBS"
AC_SUBST([GBM_PC_REQ_PRIV])
AC_SUBST([GBM_PC_LIB_PRIV])
@@ -1461,9 +1485,9 @@ for plat in $egl_platforms; do
;;
esac
-case "$plat$have_libudev" in
-waylandno|drmno)
-AC_MSG_ERROR([cannot build $plat platfrom without udev]) ;;
+case "$plat$need_libudev$have_libudev" in
+waylandyesno|drmyesno)
+AC_MSG_ERROR([cannot build $plat platform without udev >= $LIBUDEV_REQUIRED]) ;;
esac
done
@@ -1529,11 +1553,11 @@ AC_ARG_ENABLE([gallium-llvm],
[enable_gallium_llvm="$enableval"],
[enable_gallium_llvm=auto])
-AC_ARG_WITH([llvm-shared-libs],
-[AS_HELP_STRING([--with-llvm-shared-libs],
-[link with LLVM shared libraries @<:@default=disabled@:>@])],
+AC_ARG_ENABLE([llvm-shared-libs],
+[AS_HELP_STRING([--enable-llvm-shared-libs],
+[link with LLVM shared libraries @<:@default=enabled@:>@])],
[],
-[with_llvm_shared_libs=no])
+[with_llvm_shared_libs=yes])
AC_ARG_WITH([llvm-prefix],
[AS_HELP_STRING([--with-llvm-prefix],
@@ -1588,6 +1612,12 @@ if test "x$enable_gallium_llvm" = xyes; then
AC_COMPUTE_INT([LLVM_VERSION_MINOR], [LLVM_VERSION_MINOR],
[#include "${LLVM_INCLUDEDIR}/llvm/Config/llvm-config.h"])
+dnl In LLVM 3.4.1 patch level was defined in config.h and not
+dnl llvm-config.h
+AC_COMPUTE_INT([LLVM_VERSION_PATCH], [LLVM_VERSION_PATCH],
+[#include "${LLVM_INCLUDEDIR}/llvm/Config/config.h"],
+LLVM_VERSION_PATCH=0) dnl Default if LLVM_VERSION_PATCH not found
if test "x${LLVM_VERSION_MAJOR}" != x; then
LLVM_VERSION_INT="${LLVM_VERSION_MAJOR}0${LLVM_VERSION_MINOR}"
else
@@ -1610,7 +1640,7 @@ if test "x$enable_gallium_llvm" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} option"
fi
fi
-DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT"
+DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT -DLLVM_VERSION_PATCH=$LLVM_VERSION_PATCH"
MESA_LLVM=1
dnl Check for Clang internal headers
@@ -1629,6 +1659,10 @@ if test "x$enable_gallium_llvm" = xyes; then
else
MESA_LLVM=0
LLVM_VERSION_INT=0
+if test "x$enable_opencl" = xyes; then
+AC_MSG_ERROR([cannot enable OpenCL without LLVM])
+fi
fi
dnl Directory for XVMC libs
@@ -1702,8 +1736,9 @@ gallium_require_llvm() {
gallium_require_drm_loader() {
if test "x$enable_gallium_loader" = xyes; then
-PKG_CHECK_MODULES([LIBUDEV], [libudev], [],
-AC_MSG_ERROR([Gallium drm loader requires libudev]))
+if test "x$need_libudev$have_libudev" = xyesno; then
+AC_MSG_ERROR([Gallium drm loader requires libudev >= $LIBUDEV_REQUIRED])
+fi
if test "x$have_libdrm" != xyes; then
AC_MSG_ERROR([Gallium drm loader requires libdrm >= $LIBDRM_REQUIRED])
fi
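
Several hunks above rely on the same two shell idioms: a `case` on `$host_os` that picks a Linux-only default, and a test on the concatenation `$need_libudev$have_libudev` that fails only when udev is both needed and missing (`yesno`). The sketch below isolates them in plain shell, outside of autoconf. The function names `linux_only_default` and `udev_missing` and the sample host strings are assumptions for illustration and do not appear in configure.ac.

```sh
#!/bin/sh
# Hypothetical stand-in for the host_os case blocks: "yes" on Linux hosts,
# "no" everywhere else.
linux_only_default() {
    case "$1" in
        linux*) echo yes ;;
        *)      echo no ;;
    esac
}

# Hypothetical stand-in for the "xyesno" test: succeed (exit 0) only when
# udev is needed ($1 = yes) and was not found ($2 = no).
udev_missing() {
    test "x$1$2" = xyesno
}

# Walk two sample hosts, pretending pkg-config did not find libudev on either.
for host_os in linux-gnu openbsd5.5; do
    need_libudev=$(linux_only_default "$host_os")
    have_libudev=no
    if udev_missing "$need_libudev" "$have_libudev"; then
        echo "$host_os: configure would stop, udev is required"
    else
        echo "$host_os: configure would continue without udev"
    fi
done
```

Concatenating the two flags turns a two-variable condition into one string comparison, which is why a missing libudev is an error on Linux but not on other hosts.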

254
docs/relnotes/10.1.1.html Normal file

@@ -0,0 +1,254 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.1.1 Release Notes / (April 18, 2014)</h1>
<p>
Mesa 10.1.1 is a bug fix release which fixes bugs found since the 10.1 release.
</p>
<p>
Mesa 10.1.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
96e63674ccfa98e7ec6eb4fee3f770c3 MesaLib-10.1.1.tar.gz
1fde7ed079df7aeb9b6a744ca033de8d MesaLib-10.1.1.tar.bz2
e64d0a562638664b13d2edf22321df59 MesaLib-10.1.1.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71547">Bug 71547</a> - compilation failure :#error &quot;SSE4.1 instruction set not enabled&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74868">Bug 74868</a> - r600g: Diablo III Crashes After a few minutes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74988">Bug 74988</a> - Buffer overrun (segfault) decompressing ETC2 texture in GLBenchmark 3.0 Manhattan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75279">Bug 75279</a> - XCloseDisplay() takes one minute around nouveau_dri.so, freezing Firefox startup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75543">Bug 75543</a> - OSMesa Gallium OSMesaMakeCurrent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75660">Bug 75660</a> - u_inlines.h:277:pipe_buffer_map_range: Assertion `length' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76323">Bug 76323</a> - GLSL compiler ignores layout(binding=N) on uniform blocks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76377">Bug 76377</a> - DRI3 should only be enabled on Linux due to a udev dependency</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76749">Bug 76749</a> - [HSW] DOTA world lighting has no effect</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77102">Bug 77102</a> - gallium nouveau has no profile in vdpau and libva</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77207">Bug 77207</a> - [ivb/hsw] batch overwritten with garbage</li>
</ul>
<h2>Changes</h2>
<p>Aaron Watry (1):</p>
<ul>
<li>gallium/util: Fix memory leak</li>
</ul>
<p>Alexander von Gluck IV (1):</p>
<ul>
<li>haiku: Fix build through scons corrections and viewport fixes</li>
</ul>
<p>Anuj Phogat (2):</p>
<ul>
<li>mesa: Set initial internal format of a texture to GL_RGBA</li>
<li>mesa: Allow GL_DEPTH_COMPONENT and GL_DEPTH_STENCIL combinations in glTexImage{123}D()</li>
</ul>
<p>Brian Paul (12):</p>
<ul>
<li>softpipe: use 64-bit arithmetic in softpipe_resource_layout()</li>
<li>mesa: don't call ctx-&gt;Driver.ClearBufferSubData() if size==0</li>
<li>st/osmesa: check buffer size when searching for buffers</li>
<li>mesa: fix copy &amp; paste bugs in pack_ubyte_SARGB8()</li>
<li>mesa: fix copy &amp; paste bugs in pack_ubyte_SRGB8()</li>
<li>c11/threads: don't include assert.h if the assert macro is already defined</li>
<li>mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up</li>
<li>st/mesa: add null pointer checking in query object functions</li>
<li>mesa: fix glMultiDrawArrays inside a display list</li>
<li>cso: fix sampler view count in cso_set_sampler_views()</li>
<li>svga: replace sampler assertion with conditional</li>
<li>svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()</li>
</ul>
<p>Carl Worth (3):</p>
<ul>
<li>cherry-ignore: Ignore a few patches</li>
<li>glsl: Allow explicit binding on atomics again</li>
<li>Update VERSION to 10.1.1</li>
</ul>
<p>Chia-I Wu (1):</p>
<ul>
<li>i965/vec4: fix record clearing in copy propagation</li>
</ul>
<p>Christian König (2):</p>
<ul>
<li>st/mesa: recreate sampler view on context change v3</li>
<li>st/mesa: fix sampler view handling with shared textures v4</li>
</ul>
<p>Courtney Goeltzenleuchter (1):</p>
<ul>
<li>mesa: add bounds checking to eliminate buffer overrun</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>nv50: add missing brackets when handling the samplers array</li>
<li>mesa: return v.value_int64 when the requested type is TYPE_INT64</li>
<li>configure: enable dri3 only for linux</li>
<li>glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path</li>
<li>configure: cleanup libudev handling</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>i965: Fix buffer overruns in MSAA MCS buffer clearing.</li>
</ul>
<p>Hans (2):</p>
<ul>
<li>util: don't define isfinite(), isnan() for MSVC &gt;= 1800</li>
<li>mesa: don't define c99 math functions for MSVC &gt;= 1800</li>
</ul>
<p>Ian Romanick (7):</p>
<ul>
<li>linker: Split set_uniform_binding into separate functions for blocks and samplers</li>
<li>linker: Various trivial clean-ups in set_sampler_binding</li>
<li>linker: Fold set_uniform_binding into call site</li>
<li>linker: Clean up "unused parameter" warnings</li>
<li>linker: Set block bindings based on UniformBlocks rather than UniformStorage</li>
<li>linker: Set binding for all elements of UBO array</li>
<li>glsl: Propagate explicit binding information from the AST all the way to the linker</li>
</ul>
<p>Ilia Mirkin (8):</p>
<ul>
<li>nouveau: fix fence waiting logic in screen destroy</li>
<li>nv50: adjust blit_3d handling of ms output textures</li>
<li>loader: add special logic to distinguish nouveau from nouveau_vieux</li>
<li>mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture</li>
<li>nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list</li>
<li>nouveau: there may not have been a texture if the fbo was incomplete</li>
<li>nvc0/ir: move sample id to second source arg to fix sampler2DMS</li>
<li>nouveau: fix firmware check on nvd7/nvd9</li>
</ul>
<p>Johannes Nixdorf (1):</p>
<ul>
<li>configure.ac: fix the detection of expat with pkg-config</li>
</ul>
<p>Jonathan Gray (7):</p>
<ul>
<li>gallium: add endian detection for OpenBSD</li>
<li>loader: use 0 instead of FALSE which isn't defined</li>
<li>loader: don't limit the non-udev path to only android</li>
<li>megadriver_stub.c: don't use _GNU_SOURCE to gate the compat code</li>
<li>egl/dri2: don't require libudev to build drm/wayland platforms</li>
<li>egl/dri2: use drm macros to construct device name</li>
<li>configure: don't require libudev for gbm or egl drm/wayland</li>
</ul>
<p>José Fonseca (4):</p>
<ul>
<li>c11/threads: Fix nano to millisecond conversion.</li>
<li>mapi/u_thread: Use GetCurrentThreadId</li>
<li>c11/threads: Don't implement thrd_current on Windows.</li>
<li>draw: Duplicate TGSI tokens in draw_pipe_pstipple module.</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>i965/fs: Fix register comparisons in saturate propagation.</li>
<li>glsl: Fix lack of i2u in lower_ubo_reference.</li>
<li>i965: Stop advertising GL_MESA_ycbcr_texture.</li>
<li>glsl: Try vectorizing when seeing a repeated assignment to a channel.</li>
</ul>
<p>Marek Olšák (13):</p>
<ul>
<li>r600g: fix texelFetchOffset GLSL functions</li>
<li>r600g: fix blitting the last 2 mipmap levels for Evergreen</li>
<li>mesa: fix the format of glEdgeFlagPointer</li>
<li>r600g,radeonsi: fix MAX_TEXTURE_3D_LEVELS and MAX_TEXTURE_ARRAY_LAYERS limits</li>
<li>st/mesa: fix per-vertex edge flags and GLSL support (v2)</li>
<li>mesa: mark GL_RGB9_E5 as not color-renderable</li>
<li>mesa: fix texture border handling for cube arrays</li>
<li>mesa: allow generating mipmaps for cube arrays</li>
<li>mesa: fix software fallback for generating mipmaps for cube arrays</li>
<li>mesa: fix software fallback for generating mipmaps for 3D textures</li>
<li>st/mesa: fix generating mipmaps for cube arrays</li>
<li>st/mesa: drop the lowering of quad strips to triangle strips</li>
<li>r600g: implement edge flags</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__.</li>
<li>i965/fs: Fix off-by-one in saturate propagation.</li>
<li>i965/fs: Don't propagate saturate modifiers into partial writes.</li>
<li>i965/fs: Don't propagate saturation modifiers if there are source modifiers.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>r600g: Don't leak bytecode on shader compile failure</li>
</ul>
<p>Mike Stroyan (1):</p>
<ul>
<li>i965: Avoid dependency hints on math opcodes</li>
</ul>
<p>Thomas Hellstrom (5):</p>
<ul>
<li>winsys/svga: Replace the query mm buffer pool with a slab pool v3</li>
<li>winsys/svga: Update the vmwgfx_drm.h header to latest version from kernel</li>
<li>winsys/svga: Fix prime surface references also for guest-backed surfaces</li>
<li>st/xa: Bind destination before setting new state</li>
<li>st/xa: Make sure unused samplers are set to NULL</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>configure: Use LLVM shared libraries by default</li>
</ul>
</div>
</body>
</html>

179
docs/relnotes/10.1.2.html Normal file

@@ -0,0 +1,179 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.1.2 Release Notes / (May 5, 2014)</h1>
<p>
Mesa 10.1.2 is a bug fix release which fixes bugs found since the 10.1.1 release.
</p>
<p>
Mesa 10.1.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
37d79f94b1f41852a89d1fc3900bea76 MesaLib-10.1.2.tar.gz
28b60d15ac9f364da1e0155911eaf44e MesaLib-10.1.2.tar.bz2
05300039085a65fc53c5472c4bb5747a MesaLib-10.1.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27499">Bug 27499</a> - [855GM i915] GL_LINE_STIPPLE displays incorrect colors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75723">Bug 75723</a> - (regression since Linux 3.14?) brw_get_graphics_reset_status: Assertion `brw-&gt;hw_ctx != ((void *)0)' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76894">Bug 76894</a> - Piglit/spec/EXT_framebuffer_object/fbo-bind-renderbuffer failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77702">Bug 77702</a> - [i965 Bisected]Piglit spec/NV_conditional_render_blitframebuffer fails</li>
</ul>
<h2>Changes</h2>
<p>Ander Conselvan de Oliveira (2):</p>
<ul>
<li>gbm/dri: Fix out-of-memory error path in dri_device_create()</li>
<li>egl: Protect use of gbm_dri with ifdef HAVE_DRM_PLATFORM</li>
</ul>
<p>Anuj Phogat (27):</p>
<ul>
<li>mesa: Fix glGetVertexAttribi(GL_VERTEX_ATTRIB_ARRAY_SIZE)</li>
<li>swrast: Add glBlitFramebuffer to commands affected by conditional rendering</li>
<li>mesa: Fix error condition for multisample proxy texture targets</li>
<li>i965: Put an assertion to check valid varying_to_slot[varying]</li>
<li>i965: Fix component mask and varying_to_slot mapping for gl_Layer</li>
<li>i965: Fix component mask and varying_to_slot mapping for gl_ViewportIndex</li>
<li>mesa: Add helper function _mesa_is_format_integer()</li>
<li>mesa: Add error condition for integer formats in glGetTexImage()</li>
<li>mesa: Add an error condition in glGetFramebufferAttachmentParameteriv()</li>
<li>mesa: Fix error code generation in glReadPixels()</li>
<li>glsl: Allow overlapping locations for vertex input attributes</li>
<li>mesa: Fix querying location of nth element of an array variable</li>
<li>mesa: Use location VERT_ATTRIB_GENERIC0 for vertex attribute 0</li>
<li>glsl: Compile error if fs defines conflicting qualifiers for gl_FragCoord</li>
<li>glsl: Compile error if fs uses gl_FragCoord before first redeclaration</li>
<li>mesa: Add entry for extension ARB_texture_stencil8</li>
<li>mesa: Add error condition for format=STENCIL_INDEX in glGetTexImage()</li>
<li>i965: Fix crash in do_blit_readpixels()</li>
<li>mesa: Add missing types in _mesa_texstore_xx_xx() functions</li>
<li>mesa: Allow srcFormat=GL_DEPTH_STENCIL in _mesa_texstore_xx_xx() functions</li>
<li>mesa: Add new helper function _mesa_unpack_depth_stencil_row()</li>
<li>mesa: Add support to unpack depth-stencil texture in to FLOAT_32_UNSIGNED_INT_24_8_REV</li>
<li>mesa: Allow FLOAT_32_UNSIGNED_INT_24_8_REV in get_tex_depth_stencil()</li>
<li>i965: Add glBlitFramebuffer to commands affected by conditional rendering</li>
<li>glsl: Use switch to allow adding more shader types</li>
<li>glsl: Link error if fs defines conflicting qualifiers for gl_FragCoord</li>
<li>glsl: Apply the link error conditions to GL_ARB_fragment_coord_conventions</li>
</ul>
<p>Benjamin Bellec (1):</p>
<ul>
<li>mesa: fix GetStringi error message with correct function name</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>swrast: allocate swrast_texture_image::ImageSlices array if needed</li>
</ul>
<p>Carl Worth (4):</p>
<ul>
<li>docs: Add the MD5 sums for the 10.1.1 release tar files.</li>
<li>cherry-ignore: Ignore a patch causing a regression</li>
<li>cherry-ignore: Drop an ignored patch now that piglit has been updated.</li>
<li>Update VERSION to 10.1.2</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>glsl: Only allow `invariant` on shader in/out between stages.</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>i965: Fix render-to-texture in non-FinishRenderTexture cases.</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>dri3: Enable GLX_MESA_query_renderer on DRI3 too</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Don't enable reset notification support on Gen4-5.</li>
<li>i965: Actually emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS.</li>
</ul>
<p>Marek Olšák (10):</p>
<ul>
<li>r300g: don't crash when getting NULL colorbuffers</li>
<li>st/mesa: remove trailing NULL colorbuffers</li>
<li>r600g: fix edge flags and layered rendering on R600-R700</li>
<li>r600g: disable async DMA on R700</li>
<li>r600g: fix MSAA resolve on R6xx when the destination is 1D-tiled</li>
<li>r600g: fix flushing on RV670, RS780, RS880 again</li>
<li>r600g: fix buffer copying on R600-R700</li>
<li>r600g: fix for broken CULL_FRONT behavior on R6xx</li>
<li>r600g: fix for an MSAA hang on RV770</li>
<li>r600g: fix hang on RV740 by using DX_RASTERIZATION_KILL instead of SX_MISC</li>
</ul>
<p>Michel Dänzer (2):</p>
<ul>
<li>r600g: Disable LLVM by default at runtime for graphics</li>
<li>st/mesa: Fix NULL pointer dereference for incomplete framebuffers</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>wayland: Fix the logic in disabling the prime capability</li>
</ul>
<p>Samuel Iglesias Gonsalvez (1):</p>
<ul>
<li>mesa: fix check for dummy renderbuffer in _mesa_FramebufferRenderbufferEXT()</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>st/xa: Cache render target surface</li>
</ul>
<p>nick (1):</p>
<ul>
<li>swrast: Fix vertex color in _swsetup_Translate()</li>
</ul>
</div>
</body>
</html>

90
docs/relnotes/10.1.3.html Normal file
View File

@@ -0,0 +1,90 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.1.3 Release Notes / (May 9, 2014)</h1>
<p>
Mesa 10.1.3 is a bug fix release which fixes bugs found since the 10.1.2 release.
</p>
<p>
Note: Mesa 10.1.3 is being released sooner than originally scheduled to make
available a fix for a performance regression that was inadvertently introduced
in Mesa 10.1.2. The performance regression is reported to make vmware
swapbuffers fall back to software.
</p>
<p>
Mesa 10.1.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
665fe1656aaa2c37b32042068aff92cb MesaLib-10.1.3.tar.gz
ba6dbe2b9cab0b4de840c996b9b6a3ad MesaLib-10.1.3.tar.bz2
4e6f26330a63d3c47e62ac4bdead39e8 MesaLib-10.1.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77245">Bug 77245</a> - Bogus GL_ARB_explicit_attrib_location layout identifier warnings</li>
</ul>
<h2>Changes</h2>
<p>Carl Worth (3):</p>
<ul>
<li>docs: Add MD5 sums for Mesa 10.1.2</li>
<li>get-pick-list.sh: Require explicit "10.1" for nominating stable patches</li>
<li>VERSION: Update to 10.1.3</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>mesa: Fix MaxNumLayers for 1D array textures.</li>
<li>i965: Fix depth (array slices) computation for 1D_ARRAY render targets.</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>glsl: fix bogus layout qualifier warnings</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>st/xa: Fix performance regression introduced by commit "Cache render target surface"</li>
</ul>
</div>
</body>
</html>

100
docs/relnotes/10.1.4.html Normal file
View File

@@ -0,0 +1,100 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.1.4 Release Notes / (May 20, 2014)</h1>
<p>
Mesa 10.1.4 is a bug fix release which fixes bugs found since the 10.1.3 release.
</p>
<p>
Mesa 10.1.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
e934365d77f384bfaec844999440bef8 MesaLib-10.1.4.tar.gz
6fddee101f49b7409cd29994c34ddee7 MesaLib-10.1.4.tar.bz2
ba5f48e7d5e373922c804c2651fec6c1 MesaLib-10.1.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78225">Bug 78225</a> - Compile error due to undefined reference to `gbm_dri_backend', fix attached</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78537">Bug 78537</a> - no anisotropic filtering in a native Half-Life 2</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix double-freeing of dispatch tables inside glBegin/End.</li>
</ul>
<p>Carl Worth (3):</p>
<ul>
<li>docs: Add MD5 sums for 10.1.3</li>
<li>cherry-ignore: Roland and Michel agreed to drop these patches.</li>
<li>VERSION: Update to 10.1.4</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>configure: error out if building GBM without dri</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>i965/vs: Use samplers for UBOs in the VS like we do for non-UBO pulls.</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nv50/ir: make sure to reverse cond codes on all the OP_SET variants</li>
<li>nv50: fix setting of texture ms info to be per-stage</li>
<li>nv50/ir: fix integer mul lowering for u32 x u32 -&gt; high u32</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Fix anisotropic filtering state setup</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>configure.ac: Add LLVM_VERSION_PATCH to DEFINES</li>
<li>radeonsi: Enable geometry shaders with LLVM 3.4.1</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.1.5.html (new file, 105 lines)

@@ -0,0 +1,105 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.1.5 Release Notes / (June 6, 2014)</h1>
<p>
Mesa 10.1.5 is a bug fix release which fixes bugs found since the 10.1.4 release.
</p>
<p>
Mesa 10.1.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
b0aceaa75bc9a9b2d9215a113e2ad488b5cf85c99005a7624f8cf7c37c5d0eaa MesaLib-10.1.5.tar.gz
bc6c5ec7836f254a49d055a29d9aa34c97c54c038f47ad3a00fa57a5fef15bbc MesaLib-10.1.5.tar.bz2
78b7255cab0af7918945452a84de7989096ebcdd27e99b31c56c0589274cbc77 MesaLib-10.1.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79115">Bug 79115</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79421">Bug 79421</a> - </li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>glsl: fix use-after free bug/crash in ast_declarator_list::hir()</li>
</ul>
<p>Carl Worth (5):</p>
<ul>
<li>docs: Add md5sums for 10.1.4 release</li>
<li>Merge remote-tracking branch 'origin/10.1' into 10.1</li>
<li>cherry-ignore: Ignore two commits.</li>
<li>Ignore a patch that is not needed for the 10.1 branch.</li>
<li>Update version to 10.1.5</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>glx: do not leak dri3Display</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50/ir: fix s32 x s32 -&gt; high s32 multiply logic</li>
<li>nv50/ir: fix constant folding for OP_MUL subop HIGH</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>mesa: Fix unbinding GL_DEPTH_STENCIL_ATTACHMENT</li>
</ul>
<p>Jeremy Huddleston Sequoia (2):</p>
<ul>
<li>glapi: Avoid heap corruption in _glapi_table</li>
<li>darwin: Fix test for kCGLPFAOpenGLProfile support at runtime</li>
</ul>
<p>Pavel Popov (2):</p>
<ul>
<li>i965: Properly return *RESET* status in glGetGraphicsResetStatusARB</li>
<li>i965: Fix Line Stipple enable bit in 3DSTATE_SF for Haswell.</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>llvmpipe: fix crash when not all attachments are populated in a fb</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.1.6.html (new file, 138 lines)

@@ -0,0 +1,138 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.1.6 Release Notes / (June 24, 2014)</h1>
<p>
Mesa 10.1.6 is a bug fix release which fixes bugs found since the 10.1.5 release.
</p>
<p>
Mesa 10.1.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
cde60e06b340d7598802fe4a4484b3fb8befd714f9ab9caabe1f27d3149e8815 MesaLib-10.1.6.tar.bz2
e4e726d7805a442f7ed07d12f71335e6126796ec85328a5989eb5348a8042d00 MesaLib-10.1.6.tar.gz
bf7e3f721a7ad0c2057a034834b6fea688e64f26a66cf8d1caa2827e405e72dd MesaLib-10.1.6.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74005">Bug 74005</a> - [i965 Bisected]Piglit/glx_glx-make-glxdrawable-current fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78581">Bug 78581</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79729">Bug 79729</a> - [i965] glClear on a multisample texture doesn't work</li>
</ul>
<h2>Changes</h2>
<p>Adrian Negreanu (7):</p>
<ul>
<li>add megadriver_stub_FILES</li>
<li>android: adapt to the megadriver mechanism</li>
<li>android: add libloader to libGLES_mesa and libmesa_egl_dri2</li>
<li>android: add src/gallium/auxiliary as include path for libmesa_dricore</li>
<li>android, egl: add correct drm include for libmesa_egl_dri2</li>
<li>android, mesa_gen_matypes: pull in timespec POSIX definition</li>
<li>android, dricore: undefined reference to _mesa_streaming_load_memcpy</li>
</ul>
<p>Beren Minor (1):</p>
<ul>
<li>egl/main: Fix eglMakeCurrent when releasing context from current thread.</li>
</ul>
<p>Carl Worth (3):</p>
<ul>
<li>docs: Add SHA256 checksums for the 10.1.5 release</li>
<li>cherry-ignore: Add a patch to ignore</li>
<li>Update VERSION to 10.1.6</li>
</ul>
<p>Daniel Manjarres (1):</p>
<ul>
<li>glx: Don't crash on swap event for a Window (non-GLXWindow)</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>configure: error out when building opencl without LLVM</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.</li>
</ul>
<p>José Fonseca (3):</p>
<ul>
<li>mesa/main: Make get_hash.c values constant.</li>
<li>mesa: Make glGetIntegerv(GL_*_ARRAY_SIZE) return GL_BGRA.</li>
<li>mesa/main: Prevent segfault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING).</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>mesa: Remove glClear optimization based on drawable size</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>configure: Only check for OpenCL without LLVM when the latter is certain</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>i965: Set the fast clear color value for texture surfaces</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>draw: (trivial) fix clamping of viewport index</li>
</ul>
<p>Tobias Klausmann (1):</p>
<ul>
<li>nv50/ir: clear subop when folding constant expressions</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>clover: Prevent Clang from printing number of errors and warnings to stderr.</li>
<li>clover: Don't use llvm's global context</li>
</ul>
</div>
</body>
</html>


@@ -52,7 +52,7 @@ it.</li>
<li>GL_AMD_shader_trinary_minmax.</li>
<li>GL_EXT_framebuffer_blit on r200 and radeon.</li>
<li>Reduced memory usage for display lists.</li>
<li>OpenGL 3.3 support on nv50, nvc0</li>
<li>OpenGL 3.3 support on nv50, nvc0, r600 and radeonsi</li>
</ul>


@@ -27,7 +27,9 @@
* DEALINGS IN THE SOFTWARE.
*/
#include <stdlib.h>
#ifndef assert
#include <assert.h>
#endif
#include <limits.h>
#include <errno.h>
#include <unistd.h>


@@ -26,7 +26,9 @@
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
#ifndef assert
#include <assert.h>
#endif
#include <limits.h>
#include <errno.h>
#include <process.h> // MSVCRT
@@ -146,7 +148,7 @@ static unsigned __stdcall impl_thrd_routine(void *p)
static DWORD impl_xtime2msec(const xtime *xt)
{
return (DWORD)((xt->sec * 1000u) + (xt->nsec / 1000));
return (DWORD)((xt->sec * 1000U) + (xt->nsec / 1000000L));
}
#ifdef EMULATED_THREADS_USE_NATIVE_CALL_ONCE
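The impl_xtime2msec() hunk above corrects a unit error: a nanosecond count has to be divided by 1,000,000, not 1,000, to give milliseconds. A minimal sketch of the corrected conversion follows. It assumes a hypothetical `struct xtime_sketch` in place of the real xtime type and `uint32_t` in place of the Windows DWORD; neither name comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the xtime type used in the diff. */
struct xtime_sketch {
    int64_t sec;  /* whole seconds */
    long nsec;    /* nanoseconds, expected in 0..999999999 */
};

/* Convert a seconds/nanoseconds pair to milliseconds.
 * 1 s = 1000 ms and 1 ms = 1000000 ns, so the nanosecond part is
 * divided by 1000000 (the old code divided by 1000, yielding microseconds). */
uint32_t
xtime_to_msec(const struct xtime_sketch *xt)
{
    assert(xt->nsec >= 0 && xt->nsec < 1000000000L);
    return (uint32_t)(xt->sec * 1000 + xt->nsec / 1000000L);
}
```

With this helper, a value of 2 s plus 500,000,000 ns converts to 2500 ms, whereas the pre-fix arithmetic would have produced 502,000.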
@@ -492,12 +494,42 @@ thrd_create(thrd_t *thr, thrd_start_t func, void *arg)
return thrd_success;
}
#if 0
// 7.25.5.2
static inline thrd_t
thrd_current(void)
{
return GetCurrentThread();
HANDLE hCurrentThread;
BOOL bRet;
/* GetCurrentThread() returns a pseudo-handle, which is useless. We need
* to call DuplicateHandle to get a real handle. However the handle value
* will not match the one returned by thrd_create.
*
* Other potential solutions would be:
* - define thrd_t as a thread ID, but this would mean we'd need to call OpenThread for many operations
* - use malloc'ed memory for thrd_t. This would imply using TLS for current thread.
*
* Neither is particularly nice.
*
* Life would be much easier if C11 threads had different abstractions for
* threads and thread IDs, just like C++11 threads do...
*/
bRet = DuplicateHandle(GetCurrentProcess(), // source process (pseudo) handle
GetCurrentThread(), // source (pseudo) handle
GetCurrentProcess(), // target process
&hCurrentThread, // target handle
0,
FALSE,
DUPLICATE_SAME_ACCESS);
assert(bRet);
if (!bRet) {
hCurrentThread = GetCurrentThread();
}
return hCurrentThread;
}
#endif
// 7.25.5.3
static inline int
@@ -511,7 +543,7 @@ thrd_detach(thrd_t thr)
static inline int
thrd_equal(thrd_t thr0, thrd_t thr1)
{
return (thr0 == thr1);
return GetThreadId(thr0) == GetThreadId(thr1);
}
// 7.25.5.5


@@ -269,6 +269,11 @@ def generate(env):
cppdefines += ['HAVE_ALIAS']
else:
cppdefines += ['GLX_ALIAS_UNSUPPORTED']
if env['platform'] == 'haiku':
cppdefines += [
'HAVE_PTHREAD',
'HAVE_POSIX_MEMALIGN'
]
if platform == 'windows':
cppdefines += [
'WIN32',


@@ -40,8 +40,12 @@ LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/egl/main \
$(MESA_TOP)/src/loader \
$(DRM_TOP)/include/drm \
$(DRM_GRALLOC_TOP)
LOCAL_STATIC_LIBRARIES := \
libloader
LOCAL_MODULE := libmesa_egl_dri2
include $(MESA_COMMON_MK)


@@ -626,7 +626,6 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
return dri2_initialize_x11(drv, disp);
#endif
#ifdef HAVE_LIBUDEV
#ifdef HAVE_DRM_PLATFORM
case _EGL_PLATFORM_DRM:
if (disp->Options.TestOnly)
@@ -639,7 +638,6 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
return EGL_TRUE;
return dri2_initialize_wayland(drv, disp);
#endif
#endif
#ifdef HAVE_ANDROID_PLATFORM
case _EGL_PLATFORM_ANDROID:
if (disp->Options.TestOnly)
@@ -1894,10 +1892,12 @@ dri2_bind_wayland_display_wl(_EGLDriver *drv, _EGLDisplay *disp,
if (!dri2_dpy->wl_server_drm)
return EGL_FALSE;
#ifdef HAVE_DRM_PLATFORM
/* We have to share the wl_drm instance with gbm, so gbm can convert
* wl_buffers to gbm bos. */
if (dri2_dpy->gbm_dri)
dri2_dpy->gbm_dri->wl_drm = dri2_dpy->wl_server_drm;
#endif
return EGL_TRUE;
}


@@ -458,7 +458,12 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
gbm = disp->PlatformDisplay;
if (gbm == NULL) {
fd = open("/dev/dri/card0", O_RDWR);
char buf[64];
int n = snprintf(buf, sizeof(buf), DRM_DEV_NAME, DRM_DIR_NAME, 0);
if (n != -1 && n < sizeof(buf))
fd = open(buf, O_RDWR);
if (fd < 0)
fd = open("/dev/dri/card0", O_RDWR);
dri2_dpy->own_device = 1;
gbm = gbm_create_device(fd);
if (gbm == NULL)
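The dri2_initialize_drm() hunk above builds the device path with snprintf() and only uses it when the result fits in the buffer. A minimal sketch of that truncation check follows. The two `SKETCH_*` macros are assumptions that mirror the values libdrm normally defines for DRM_DIR_NAME and DRM_DEV_NAME on Linux, and `build_card_path` is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed values; the real code takes these from libdrm's headers. */
#define SKETCH_DIR_NAME "/dev/dri"
#define SKETCH_DEV_NAME "%s/card%d"

/* Format the path of DRM card `minor` into buf.
 * Returns 0 on success, -1 if formatting failed or the path was truncated. */
int
build_card_path(char *buf, size_t size, int minor)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, SKETCH_DEV_NAME, SKETCH_DIR_NAME, minor);

    /* snprintf returns the length it wanted to write; a value >= size
     * means the output was cut short and must not be used. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Casting `n` to `size_t` only after ruling out negative values avoids the signed/unsigned comparison that a bare `n < sizeof(buf)` relies on.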


@@ -1045,7 +1045,7 @@ dri2_initialize_wayland(_EGLDriver *drv, _EGLDisplay *disp)
if (dri2_dpy->image->base.version < 7 ||
dri2_dpy->image->createImageFromFds == NULL)
dri2_dpy->capabilities &= WL_DRM_CAPABILITY_PRIME;
dri2_dpy->capabilities &= ~WL_DRM_CAPABILITY_PRIME;
types = EGL_WINDOW_BIT;
for (i = 0; dri2_dpy->driver_configs[i]; i++) {
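The one-character fix above is the difference between keeping only a flag and clearing it: `caps &= FLAG` discards every other bit, while `caps &= ~FLAG` removes just that flag. A minimal sketch, using made-up flag values (`SKETCH_CAP_PRIME` and `SKETCH_CAP_OTHER` are illustrative and not the real WL_DRM_CAPABILITY_* values):

```c
#include <assert.h>

/* Illustrative flag values only. */
#define SKETCH_CAP_PRIME 0x1u
#define SKETCH_CAP_OTHER 0x2u

/* Pre-fix behaviour: keeps ONLY the PRIME bit, dropping all others. */
unsigned
caps_keep_only_prime(unsigned caps)
{
    return caps & SKETCH_CAP_PRIME;
}

/* Post-fix behaviour: clears the PRIME bit and leaves the rest alone. */
unsigned
caps_clear_prime(unsigned caps)
{
    unsigned result = caps & ~SKETCH_CAP_PRIME;

    assert((result & SKETCH_CAP_PRIME) == 0);
    return result;
}
```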


@@ -154,11 +154,14 @@ LOCAL_STATIC_LIBRARIES := \
libmesa_glsl \
libmesa_glsl_utils \
libmesa_gallium \
libloader \
$(LOCAL_STATIC_LIBRARIES)
endif # MESA_BUILD_GALLIUM
LOCAL_STATIC_LIBRARIES := \
$(LOCAL_STATIC_LIBRARIES) \
libloader
LOCAL_MODULE := libGLES_mesa
LOCAL_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/egl


@@ -490,8 +490,12 @@ eglMakeCurrent(EGLDisplay dpy, EGLSurface draw, EGLSurface read,
if (!context && ctx != EGL_NO_CONTEXT)
RETURN_EGL_ERROR(disp, EGL_BAD_CONTEXT, EGL_FALSE);
if (!draw_surf || !read_surf) {
/* surfaces may be NULL if surfaceless */
if (!disp->Extensions.KHR_surfaceless_context)
/* From the EGL 1.4 (20130211) spec:
*
* To release the current context without assigning a new one, set ctx
* to EGL_NO_CONTEXT and set draw and read to EGL_NO_SURFACE.
*/
if (!disp->Extensions.KHR_surfaceless_context && ctx != EGL_NO_CONTEXT)
RETURN_EGL_ERROR(disp, EGL_BAD_SURFACE, EGL_FALSE);
if ((!draw_surf && draw != EGL_NO_SURFACE) ||
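The eglMakeCurrent() hunk above relaxes the missing-surface check so that releasing the current context (no surfaces and no context) is accepted even without KHR_surfaceless_context. A minimal model of just that condition follows. The function, enum, and parameter names are hypothetical, and the later checks in the real function are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_result { SKETCH_OK, SKETCH_BAD_SURFACE };

/* Model of the patched check: a missing draw or read surface is an error
 * only when a context is being bound and the surfaceless extension is
 * unavailable. */
enum sketch_result
check_missing_surfaces(bool has_draw, bool has_read,
                       bool has_ctx, bool surfaceless_ext)
{
    if (!has_draw || !has_read) {
        if (!surfaceless_ext && has_ctx)
            return SKETCH_BAD_SURFACE;
    }
    return SKETCH_OK;
}
```

Before the patch, the `has_ctx` term was absent, so the release call was rejected on any display without the extension.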


@@ -91,6 +91,7 @@ C_SOURCES := \
translate/translate_sse.c \
util/u_debug.c \
util/u_debug_describe.c \
util/u_debug_flush.c \
util/u_debug_memory.c \
util/u_debug_refcnt.c \
util/u_debug_stack.c \


@@ -1187,11 +1187,12 @@ cso_set_sampler_views(struct cso_context *ctx,
pipe_sampler_view_reference(&info->views[i], NULL);
}
info->nr_views = count;
/* bind the new sampler views */
ctx->pipe->set_sampler_views(ctx->pipe, shader_stage, 0, count,
ctx->pipe->set_sampler_views(ctx->pipe, shader_stage, 0,
MAX2(info->nr_views, count),
info->views);
info->nr_views = count;
}


@@ -588,7 +588,12 @@ do_clip_line( struct draw_stage *stage,
if (v0->clipmask) {
interp( clipper, stage->tmp[0], t0, v0, v1, viewport_index );
copy_flat(stage, stage->tmp[0], v0);
if (stage->draw->rasterizer->flatshade_first) {
copy_flat(stage, stage->tmp[0], v0); /* copy v0 color to tmp[0] */
}
else {
copy_flat(stage, stage->tmp[0], v1); /* copy v1 color to tmp[0] */
}
newprim.v[0] = stage->tmp[0];
}
else {
@@ -597,6 +602,12 @@ do_clip_line( struct draw_stage *stage,
if (v1->clipmask) {
interp( clipper, stage->tmp[1], t1, v1, v0, viewport_index );
if (stage->draw->rasterizer->flatshade_first) {
copy_flat(stage, stage->tmp[1], v0); /* copy v0 color to tmp[1] */
}
else {
copy_flat(stage, stage->tmp[1], v1); /* copy v1 color to tmp[1] */
}
newprim.v[1] = stage->tmp[1];
}
else {


@@ -673,7 +673,7 @@ pstip_create_fs_state(struct pipe_context *pipe,
struct pstip_fragment_shader *pstipfs = CALLOC_STRUCT(pstip_fragment_shader);
if (pstipfs) {
pstipfs->state = *fs;
pstipfs->state.tokens = tgsi_dup_tokens(fs->tokens);
/* pass-through */
pstipfs->driver_fs = pstip->driver_create_fs_state(pstip->pipe, fs);
@@ -707,6 +707,7 @@ pstip_delete_fs_state(struct pipe_context *pipe, void *fs)
if (pstipfs->pstip_fs)
pstip->driver_delete_fs_state(pstip->pipe, pstipfs->pstip_fs);
FREE((void*)pstipfs->state.tokens);
FREE(pstipfs);
}


@@ -495,7 +495,7 @@ draw_stats_clipper_primitives(struct draw_context *draw,
static INLINE unsigned
draw_clamp_viewport_idx(int idx)
{
return ((PIPE_MAX_VIEWPORTS > idx || idx < 0) ? idx : 0);
return ((PIPE_MAX_VIEWPORTS > idx && idx >= 0) ? idx : 0);
}
/**
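The draw_clamp_viewport_idx() fix above replaces `||` with `&&` so that an index passes through only when it is both non-negative and below the limit; the old expression was true for every negative value. A minimal sketch of the corrected range check, assuming a limit of 16 (`SKETCH_MAX_VIEWPORTS` is an illustrative constant, not taken from the patch):

```c
#include <assert.h>

/* Assumed limit for illustration. */
#define SKETCH_MAX_VIEWPORTS 16

/* Return idx if it lies in [0, SKETCH_MAX_VIEWPORTS), otherwise 0. */
unsigned
clamp_viewport_idx(int idx)
{
    unsigned result =
        (idx >= 0 && idx < SKETCH_MAX_VIEWPORTS) ? (unsigned)idx : 0u;

    assert(result < SKETCH_MAX_VIEWPORTS);
    return result;
}
```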


@@ -74,7 +74,7 @@ void
util_primconvert_destroy(struct primconvert_context *pc)
{
util_primconvert_save_index_buffer(pc, NULL);
free(pc);
FREE(pc);
}
void


@@ -161,7 +161,9 @@ pb_slab_range_manager_create(struct pb_manager *provider,
*/
struct pb_manager *
pb_cache_manager_create(struct pb_manager *provider,
unsigned usecs);
unsigned usecs,
float size_factor,
unsigned bypass_usage);
struct pb_fence_ops;


@@ -82,6 +82,8 @@ struct pb_cache_manager
struct list_head delayed;
pb_size numDelayed;
float size_factor;
unsigned bypass_usage;
};
@@ -227,11 +229,14 @@ pb_cache_is_buffer_compat(struct pb_cache_buffer *buf,
pb_size size,
const struct pb_desc *desc)
{
if (desc->usage & buf->mgr->bypass_usage)
return 0;
if(buf->base.size < size)
return 0;
/* be lenient with size */
if(buf->base.size >= 2*size)
if(buf->base.size > (unsigned) (buf->mgr->size_factor * size))
return 0;
if(!pb_check_alignment(desc->alignment, buf->base.alignment))
@@ -338,7 +343,7 @@ pb_cache_manager_create_buffer(struct pb_manager *_mgr,
assert(pipe_is_referenced(&buf->buffer->reference));
assert(pb_check_alignment(desc->alignment, buf->buffer->alignment));
assert(pb_check_usage(desc->usage, buf->buffer->usage));
assert(pb_check_usage(desc->usage & ~mgr->bypass_usage, buf->buffer->usage));
assert(buf->buffer->size >= size);
pipe_reference_init(&buf->base.reference, 1);
@@ -384,10 +389,23 @@ pb_cache_manager_destroy(struct pb_manager *mgr)
FREE(mgr);
}
/**
* Create a caching buffer manager
*
* @param provider The buffer manager to which cache miss buffer requests
* should be redirected.
* @param usecs Unused buffers may be released from the cache after this
* time
* @param size_factor Buffers up to size_factor times bigger than the
* requested size count as cache hits.
* @param bypass_usage Bitmask. If (requested usage & bypass_usage) != 0,
* buffer allocation requests are redirected to the provider.
*/
struct pb_manager *
pb_cache_manager_create(struct pb_manager *provider,
unsigned usecs)
unsigned usecs,
float size_factor,
unsigned bypass_usage)
{
struct pb_cache_manager *mgr;
@@ -403,6 +421,8 @@ pb_cache_manager_create(struct pb_manager *provider,
mgr->base.flush = pb_cache_manager_flush;
mgr->provider = provider;
mgr->usecs = usecs;
mgr->size_factor = size_factor;
mgr->bypass_usage = bypass_usage;
LIST_INITHEAD(&mgr->delayed);
mgr->numDelayed = 0;
pipe_mutex_init(mgr->mutex);
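The pb_cache changes above make two parts of the cache-hit test configurable: requests whose usage intersects `bypass_usage` never hit the cache, and a cached buffer is accepted only if its size lies between the requested size and `size_factor` times that size. A minimal sketch of those two rules follows. `cache_buffer_is_compat` is a hypothetical function, the alignment and usage checks of the real code are omitted, and the `size_factor >= 1.0f` precondition is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if a cached buffer of buf_size may serve a request of req_size
 * bytes with usage flags req_usage, 0 otherwise. */
int
cache_buffer_is_compat(size_t buf_size, size_t req_size, unsigned req_usage,
                       float size_factor, unsigned bypass_usage)
{
    assert(size_factor >= 1.0f);  /* assumption of this sketch */

    /* Requests with any bypass flag set always go to the provider. */
    if (req_usage & bypass_usage)
        return 0;

    /* The cached buffer must be large enough... */
    if (buf_size < req_size)
        return 0;

    /* ...but not more than size_factor times the requested size. */
    if (buf_size > (size_t)(size_factor * req_size))
        return 0;

    return 1;
}
```

Compared with the previous hard-coded `>= 2*size` rejection, a factor of 2.0 now also accepts a buffer of exactly twice the requested size.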


@@ -364,6 +364,8 @@ void util_blitter_destroy(struct blitter_context *blitter)
pipe->delete_vs_state(pipe, ctx->vs);
if (ctx->vs_pos_only)
pipe->delete_vs_state(pipe, ctx->vs_pos_only);
if (ctx->vs_layered)
pipe->delete_vs_state(pipe, ctx->vs_layered);
pipe->delete_vertex_elements_state(pipe, ctx->velem_state);
for (i = 0; i < 4; i++) {
if (ctx->velem_state_readbuf[i]) {


@@ -0,0 +1,391 @@
/**************************************************************************
*
* Copyright 2012 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
/**
* @file
* u_debug_flush.c - Debugging of flush- and map-related issues:
* - Flush while synchronously mapped.
* - Command stream reference while synchronously mapped.
* - Synchronous map while referenced on command stream.
* - Recursive maps.
* - Unmap while not mapped.
*
* @author Thomas Hellstrom <thellstrom@vmware.com>
*/
#ifdef DEBUG
#include "pipe/p_compiler.h"
#include "util/u_debug_stack.h"
#include "util/u_debug.h"
#include "util/u_memory.h"
#include "util/u_debug_flush.h"
#include "util/u_hash_table.h"
#include "util/u_double_list.h"
#include "util/u_inlines.h"
#include "os/os_thread.h"
#include <stdio.h>
struct debug_flush_buf {
/* Atomic */
struct pipe_reference reference; /* Must be the first member. */
pipe_mutex mutex;
/* Immutable */
boolean supports_unsync;
unsigned bt_depth;
/* Protected by mutex */
boolean mapped;
boolean mapped_sync;
struct debug_stack_frame *map_frame;
};
struct debug_flush_item {
struct debug_flush_buf *fbuf;
unsigned bt_depth;
struct debug_stack_frame *ref_frame;
};
struct debug_flush_ctx {
/* Contexts are used by a single thread at a time */
unsigned bt_depth;
boolean catch_map_of_referenced;
struct util_hash_table *ref_hash;
struct list_head head;
};
pipe_static_mutex(list_mutex);
static struct list_head ctx_list = {&ctx_list, &ctx_list};
static struct debug_stack_frame *
debug_flush_capture_frame(int start, int depth)
{
struct debug_stack_frame *frames;
frames = CALLOC(depth, sizeof(*frames));
if (!frames)
return NULL;
debug_backtrace_capture(frames, start, depth);
return frames;
}
static int
debug_flush_pointer_compare(void *key1, void *key2)
{
return (key1 == key2) ? 0 : 1;
}
static unsigned
debug_flush_pointer_hash(void *key)
{
return (unsigned) (unsigned long) key;
}
struct debug_flush_buf *
debug_flush_buf_create(boolean supports_unsync, unsigned bt_depth)
{
struct debug_flush_buf *fbuf = CALLOC_STRUCT(debug_flush_buf);
if (!fbuf)
goto out_no_buf;
fbuf->supports_unsync = supports_unsync;
fbuf->bt_depth = bt_depth;
pipe_reference_init(&fbuf->reference, 1);
pipe_mutex_init(fbuf->mutex);
return fbuf;
out_no_buf:
debug_printf("Debug flush buffer creation failed.\n");
debug_printf("Debug flush checking for this buffer will be incomplete.\n");
return NULL;
}
void
debug_flush_buf_reference(struct debug_flush_buf **dst,
struct debug_flush_buf *src)
{
struct debug_flush_buf *fbuf = *dst;
if (pipe_reference(&(*dst)->reference, &src->reference)) {
if (fbuf->map_frame)
FREE(fbuf->map_frame);
FREE(fbuf);
}
*dst = src;
}
static void
debug_flush_item_destroy(struct debug_flush_item *item)
{
debug_flush_buf_reference(&item->fbuf, NULL);
if (item->ref_frame)
FREE(item->ref_frame);
FREE(item);
}
struct debug_flush_ctx *
debug_flush_ctx_create(boolean catch_reference_of_mapped, unsigned bt_depth)
{
struct debug_flush_ctx *fctx = CALLOC_STRUCT(debug_flush_ctx);
if (!fctx)
goto out_no_ctx;
fctx->ref_hash = util_hash_table_create(debug_flush_pointer_hash,
debug_flush_pointer_compare);
if (!fctx->ref_hash)
goto out_no_ref_hash;
fctx->bt_depth = bt_depth;
pipe_mutex_lock(list_mutex);
list_addtail(&fctx->head, &ctx_list);
pipe_mutex_unlock(list_mutex);
return fctx;
out_no_ref_hash:
FREE(fctx);
out_no_ctx:
debug_printf("Debug flush context creation failed.\n");
debug_printf("Debug flush checking for this context will be incomplete.\n");
return NULL;
}
static void
debug_flush_alert(const char *s, const char *op,
unsigned start, unsigned depth,
boolean continued,
boolean capture,
const struct debug_stack_frame *frame)
{
if (capture)
frame = debug_flush_capture_frame(start, depth);
if (s)
debug_printf("%s ", s);
if (frame) {
debug_printf("%s backtrace follows:\n", op);
debug_backtrace_dump(frame, depth);
} else
debug_printf("No %s backtrace was captured.\n", op);
if (continued)
debug_printf("**********************************\n");
else
debug_printf("*********END OF MESSAGE***********\n\n\n");
if (capture)
FREE((void *)frame);
}
void
debug_flush_map(struct debug_flush_buf *fbuf, unsigned flags)
{
boolean mapped_sync = FALSE;
if (!fbuf)
return;
pipe_mutex_lock(fbuf->mutex);
if (fbuf->mapped) {
debug_flush_alert("Recursive map detected.", "Map",
2, fbuf->bt_depth, TRUE, TRUE, NULL);
debug_flush_alert(NULL, "Previous map", 0, fbuf->bt_depth, FALSE,
FALSE, fbuf->map_frame);
} else if (!(flags & PIPE_TRANSFER_UNSYNCHRONIZED) ||
!fbuf->supports_unsync) {
fbuf->mapped_sync = mapped_sync = TRUE;
}
fbuf->map_frame = debug_flush_capture_frame(1, fbuf->bt_depth);
fbuf->mapped = TRUE;
pipe_mutex_unlock(fbuf->mutex);
if (mapped_sync) {
struct debug_flush_ctx *fctx;
pipe_mutex_lock(list_mutex);
LIST_FOR_EACH_ENTRY(fctx, &ctx_list, head) {
struct debug_flush_item *item =
util_hash_table_get(fctx->ref_hash, fbuf);
if (item && fctx->catch_map_of_referenced) {
debug_flush_alert("Already referenced map detected.",
"Map", 2, fbuf->bt_depth, TRUE, TRUE, NULL);
debug_flush_alert(NULL, "Reference", 0, item->bt_depth,
FALSE, FALSE, item->ref_frame);
}
}
pipe_mutex_unlock(list_mutex);
}
}
void
debug_flush_unmap(struct debug_flush_buf *fbuf)
{
if (!fbuf)
return;
pipe_mutex_lock(fbuf->mutex);
if (!fbuf->mapped)
debug_flush_alert("Unmap not previously mapped detected.", "Map",
2, fbuf->bt_depth, FALSE, TRUE, NULL);
fbuf->mapped_sync = FALSE;
fbuf->mapped = FALSE;
if (fbuf->map_frame) {
FREE(fbuf->map_frame);
fbuf->map_frame = NULL;
}
pipe_mutex_unlock(fbuf->mutex);
}
void
debug_flush_cb_reference(struct debug_flush_ctx *fctx,
struct debug_flush_buf *fbuf)
{
struct debug_flush_item *item;
if (!fctx || !fbuf)
return;
item = util_hash_table_get(fctx->ref_hash, fbuf);
pipe_mutex_lock(fbuf->mutex);
if (fbuf->mapped_sync) {
debug_flush_alert("Reference of mapped buffer detected.", "Reference",
2, fctx->bt_depth, TRUE, TRUE, NULL);
debug_flush_alert(NULL, "Map", 0, fbuf->bt_depth, FALSE,
FALSE, fbuf->map_frame);
}
pipe_mutex_unlock(fbuf->mutex);
if (!item) {
item = CALLOC_STRUCT(debug_flush_item);
if (item) {
debug_flush_buf_reference(&item->fbuf, fbuf);
item->bt_depth = fctx->bt_depth;
item->ref_frame = debug_flush_capture_frame(2, item->bt_depth);
if (util_hash_table_set(fctx->ref_hash, fbuf, item) != PIPE_OK) {
debug_flush_item_destroy(item);
goto out_no_item;
}
return;
}
goto out_no_item;
}
return;
out_no_item:
debug_printf("Debug flush command buffer reference creation failed.\n");
debug_printf("Debug flush checking will be incomplete "
"for this command batch.\n");
}
static enum pipe_error
debug_flush_might_flush_cb(void *key, void *value, void *data)
{
struct debug_flush_item *item =
(struct debug_flush_item *) value;
struct debug_flush_buf *fbuf = item->fbuf;
const char *reason = (const char *) data;
char message[80];
snprintf(message, sizeof(message),
"%s referenced mapped buffer detected.", reason);
pipe_mutex_lock(fbuf->mutex);
if (fbuf->mapped_sync) {
debug_flush_alert(message, reason, 3, item->bt_depth, TRUE, TRUE, NULL);
debug_flush_alert(NULL, "Map", 0, fbuf->bt_depth, TRUE, FALSE,
fbuf->map_frame);
debug_flush_alert(NULL, "First reference", 0, item->bt_depth, FALSE,
FALSE, item->ref_frame);
}
pipe_mutex_unlock(fbuf->mutex);
return PIPE_OK;
}
void
debug_flush_might_flush(struct debug_flush_ctx *fctx)
{
if (!fctx)
return;
util_hash_table_foreach(fctx->ref_hash,
debug_flush_might_flush_cb,
"Might flush");
}
static enum pipe_error
debug_flush_flush_cb(void *key, void *value, void *data)
{
struct debug_flush_item *item =
(struct debug_flush_item *) value;
debug_flush_item_destroy(item);
return PIPE_OK;
}
void
debug_flush_flush(struct debug_flush_ctx *fctx)
{
if (!fctx)
return;
util_hash_table_foreach(fctx->ref_hash,
debug_flush_might_flush_cb,
"Flush");
util_hash_table_foreach(fctx->ref_hash,
debug_flush_flush_cb,
NULL);
util_hash_table_clear(fctx->ref_hash);
}
void
debug_flush_ctx_destroy(struct debug_flush_ctx *fctx)
{
if (!fctx)
return;
list_del(&fctx->head);
util_hash_table_foreach(fctx->ref_hash,
debug_flush_flush_cb,
NULL);
util_hash_table_clear(fctx->ref_hash);
util_hash_table_destroy(fctx->ref_hash);
FREE(fctx);
}
#endif


@@ -0,0 +1,138 @@
/**************************************************************************
*
* Copyright 2012 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
/**
* @file
* u_debug_flush.h - Header for debugging flush- and map-related issues:
* - Flush while synchronously mapped.
* - Command stream reference while synchronously mapped.
* - Synchronous map while referenced on command stream.
* - Recursive maps.
* - Unmap while not mapped.
*
* @author Thomas Hellstrom <thellstrom@vmware.com>
*/
#ifdef DEBUG
#ifndef U_DEBUG_FLUSH_H_
#define U_DEBUG_FLUSH_H_
struct debug_flush_buf;
struct debug_flush_ctx;
/**
* Create a buffer (AKA allocation) representation.
*
* @param supports_unsync Whether unsynchronized maps are truly supported.
* @param bt_depth Depth of backtrace to be captured for this buffer
* representation.
*/
struct debug_flush_buf *
debug_flush_buf_create(boolean supports_unsync, unsigned bt_depth);
/**
* Reference a buffer representation.
*
* @param dst Pointer copy destination
* @param src Pointer copy source (may be NULL).
*
* Replace a pointer to a buffer representation with proper refcounting.
*/
void
debug_flush_buf_reference(struct debug_flush_buf **dst,
struct debug_flush_buf *src);
/**
* Create a context representation.
*
* @param catch_map_of_referenced Whether to catch synchronous maps of buffers
* already present on the command stream.
* @param bt_depth Depth of backtrace to be captured for this context
* representation.
*/
struct debug_flush_ctx *
debug_flush_ctx_create(boolean catch_map_of_referenced, unsigned bt_depth);
/**
* Destroy a context representation.
*
* @param fctx The context representation to destroy.
*/
void
debug_flush_ctx_destroy(struct debug_flush_ctx *fctx);
/**
* Map annotation
*
* @param fbuf The buffer representation to map.
* @param flags Pipebuffer flags for the map.
*
* Used to annotate a map of the buffer described by the buffer representation.
*/
void debug_flush_map(struct debug_flush_buf *fbuf, unsigned flags);
/**
* Unmap annotation
*
* @param fbuf The buffer representation to unmap.
*
* Used to annotate an unmap of the buffer described by the
* buffer representation.
*/
void debug_flush_unmap(struct debug_flush_buf *fbuf);
/**
* Might flush annotation
*
* @param fctx The context representation that might be flushed.
*
* Used to annotate a conditional (possible) flush of the given context.
*/
void debug_flush_might_flush(struct debug_flush_ctx *fctx);
/**
* Flush annotation
*
* @param fctx The context representation that is flushed.
*
* Used to annotate a real flush of the given context.
*/
void debug_flush_flush(struct debug_flush_ctx *fctx);
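/**
* Command stream reference annotation
*
* @param fctx The context representation whose command stream references
* the buffer.
* @param fbuf The buffer representation being referenced.
*
* Used to annotate a reference to the buffer on the context's command stream.
*/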
void debug_flush_cb_reference(struct debug_flush_ctx *fctx,
struct debug_flush_buf *fbuf);
#endif
#endif


@@ -1382,7 +1382,7 @@ get_next_slot(struct gen_mipmap_state *ctx)
static unsigned
set_vertex_data(struct gen_mipmap_state *ctx,
enum pipe_texture_target tex_target,
uint layer, float r)
uint face, float r)
{
unsigned offset;
@@ -1403,14 +1403,21 @@ set_vertex_data(struct gen_mipmap_state *ctx,
ctx->vertices[3][0][1] = 1.0f;
/* Setup vertex texcoords. This is a little tricky for cube maps. */
if (tex_target == PIPE_TEXTURE_CUBE) {
if (tex_target == PIPE_TEXTURE_CUBE ||
tex_target == PIPE_TEXTURE_CUBE_ARRAY) {
static const float st[4][2] = {
{0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f}
};
util_map_texcoords2d_onto_cubemap(layer, &st[0][0], 2,
util_map_texcoords2d_onto_cubemap(face, &st[0][0], 2,
&ctx->vertices[0][1][0], 8,
FALSE);
/* set the layer for cube arrays */
ctx->vertices[0][1][3] = r;
ctx->vertices[1][1][3] = r;
ctx->vertices[2][1][3] = r;
ctx->vertices[3][1][3] = r;
}
else if (tex_target == PIPE_TEXTURE_1D_ARRAY) {
/* 1D texture array */
@@ -1520,29 +1527,7 @@ util_gen_mipmap(struct gen_mipmap_state *ctx,
assert(filter == PIPE_TEX_FILTER_LINEAR ||
filter == PIPE_TEX_FILTER_NEAREST);
switch (pt->target) {
case PIPE_TEXTURE_1D:
type = TGSI_TEXTURE_1D;
break;
case PIPE_TEXTURE_2D:
type = TGSI_TEXTURE_2D;
break;
case PIPE_TEXTURE_3D:
type = TGSI_TEXTURE_3D;
break;
case PIPE_TEXTURE_CUBE:
type = TGSI_TEXTURE_CUBE;
break;
case PIPE_TEXTURE_1D_ARRAY:
type = TGSI_TEXTURE_1D_ARRAY;
break;
case PIPE_TEXTURE_2D_ARRAY:
type = TGSI_TEXTURE_2D_ARRAY;
break;
default:
assert(0);
type = TGSI_TEXTURE_2D;
}
type = util_pipe_tex_to_tgsi_tex(pt->target, 1);
/* check if we can render in the texture's format */
if (!screen->is_format_supported(screen, psv->format, pt->target,
@@ -1600,7 +1585,9 @@ util_gen_mipmap(struct gen_mipmap_state *ctx,
if (pt->target == PIPE_TEXTURE_3D)
nr_layers = u_minify(pt->depth0, dstLevel);
else if (pt->target == PIPE_TEXTURE_2D_ARRAY || pt->target == PIPE_TEXTURE_1D_ARRAY)
else if (pt->target == PIPE_TEXTURE_2D_ARRAY ||
pt->target == PIPE_TEXTURE_1D_ARRAY ||
pt->target == PIPE_TEXTURE_CUBE_ARRAY)
nr_layers = pt->array_size;
else
nr_layers = 1;
@@ -1613,9 +1600,14 @@ util_gen_mipmap(struct gen_mipmap_state *ctx,
layer = i;
/* XXX hmm really? */
rcoord = (float)layer / (float)nr_layers + 1.0f / (float)(nr_layers * 2);
} else if (pt->target == PIPE_TEXTURE_2D_ARRAY || pt->target == PIPE_TEXTURE_1D_ARRAY) {
} else if (pt->target == PIPE_TEXTURE_2D_ARRAY ||
pt->target == PIPE_TEXTURE_1D_ARRAY) {
layer = i;
rcoord = (float)layer;
} else if (pt->target == PIPE_TEXTURE_CUBE_ARRAY) {
layer = i;
face = layer % 6;
rcoord = layer / 6;
} else
layer = face;
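
For cube-map arrays, the last hunk above splits the flat layer index into a cube face and an array index, which is then passed on as the `r` coordinate. A minimal C sketch of that decomposition follows; the struct and function names are illustrative assumptions and are not part of the patch.

```c
#include <assert.h>

/* Hypothetical helper type, used only for this illustration. */
struct cube_array_slice {
   unsigned face;  /* 0..5, selects the cube face */
   unsigned cube;  /* which cube of the array; passed as the r coordinate */
};

/* A cube-map array stores six consecutive layers per cube. */
static struct cube_array_slice
split_cube_array_layer(unsigned layer)
{
   struct cube_array_slice s;

   assert(layer / 6 <= layer); /* trivial sanity check, keeps assert.h in use */
   s.face = layer % 6;
   s.cube = layer / 6;
   return s;
}
```

Under this layout, layer 13 would be face 1 of cube 2.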


@@ -112,10 +112,13 @@ static INLINE float logf( float f )
#define logf(x) ((float)log((double)(x)))
#endif /* logf */
#if _MSC_VER < 1800
#define isfinite(x) _finite((double)(x))
#define isnan(x) _isnan((double)(x))
#endif /* _MSC_VER < 1800 */
#endif /* _MSC_VER < 1400 && !defined(__cplusplus) */
#if _MSC_VER < 1800
static INLINE double log2( double x )
{
const double invln2 = 1.442695041;
@@ -133,6 +136,7 @@ roundf(float x)
{
return x >= 0.0f ? floorf(x + 0.5f) : ceilf(x - 0.5f);
}
#endif
#define INFINITY (DBL_MAX + DBL_MAX)
#define NAN (INFINITY - INFINITY)


@@ -178,6 +178,12 @@ The integer capabilities:
ARB_framebuffer_object is provided.
* ``PIPE_CAP_TGSI_VS_LAYER``: Whether TGSI_SEMANTIC_LAYER is supported
as a vertex shader output.
* ``PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES``: The maximum number of vertices
output by a single invocation of a geometry shader.
* ``PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS``: The maximum number of
vertex components output by a single invocation of a geometry shader.
This is the product of the number of attribute components per vertex and
the number of output vertices.
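
The relationship between the two geometry shader caps can be sketched as a simple check. This is only an illustration in C: the function name and the idea of validating a shader this way are assumptions, and only the product rule comes from the text above.

```c
#include <assert.h>

/*
 * Hypothetical check: does a geometry shader that emits `output_vertices`
 * vertices with `components_per_vertex` components each fit within the two
 * limits a driver reports?
 */
static int
gs_output_fits(unsigned max_output_vertices,
               unsigned max_total_output_components,
               unsigned components_per_vertex,
               unsigned output_vertices)
{
   /* Total components = components per vertex * number of output vertices. */
   unsigned total = components_per_vertex * output_vertices;

   assert(components_per_vertex == 0 ||
          total / components_per_vertex == output_vertices); /* no overflow */

   return output_vertices <= max_output_vertices &&
          total <= max_total_output_components;
}
```

With both limits at 1024, a shader emitting 256 vertices of 4 components each would fit, while 256 vertices of 8 components would not.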
.. _pipe_capf:


@@ -210,6 +210,11 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 0;
/* Geometry shader output, unsupported. */
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 0;
/* Texturing. */
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:


@@ -264,6 +264,11 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap)
case PIPE_CAP_MAX_RENDER_TARGETS:
return 1;
/* Geometry shader output, unsupported. */
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 0;
/* Fragment coordinate conventions. */
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT:
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:


@@ -374,6 +374,9 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
return ILO_MAX_SO_BINDINGS / ILO_MAX_SO_BUFFERS;
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return ILO_MAX_SO_BINDINGS;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 0;
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
if (is->dev.gen >= ILO_GEN(7))
return is->dev.has_gen7_sol_reset;


@@ -376,9 +376,15 @@ lp_rast_shade_tile(struct lp_rasterizer_task *task,
/* color buffer */
for (i = 0; i < scene->fb.nr_cbufs; i++){
stride[i] = scene->cbufs[i].stride;
color[i] = lp_rast_get_unswizzled_color_block_pointer(task, i, tile_x + x,
tile_y + y, inputs->layer);
if (scene->fb.cbufs[i]) {
stride[i] = scene->cbufs[i].stride;
color[i] = lp_rast_get_unswizzled_color_block_pointer(task, i, tile_x + x,
tile_y + y, inputs->layer);
}
else {
stride[i] = 0;
color[i] = NULL;
}
}
/* depth buffer */


@@ -189,6 +189,9 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 16*4;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 1024;
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
return 1;
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:


@@ -71,7 +71,6 @@ struct nv50_ir_varying
#define NV50_SEMANTIC_CLIPDISTANCE (TGSI_SEMANTIC_COUNT + 0)
#define NV50_SEMANTIC_VIEWPORTINDEX (TGSI_SEMANTIC_COUNT + 4)
#define NV50_SEMANTIC_LAYER (TGSI_SEMANTIC_COUNT + 5)
#define NV50_SEMANTIC_INVOCATIONID (TGSI_SEMANTIC_COUNT + 6)
#define NV50_SEMANTIC_TESSFACTOR (TGSI_SEMANTIC_COUNT + 7)
#define NV50_SEMANTIC_TESSCOORD (TGSI_SEMANTIC_COUNT + 8)


@@ -1488,8 +1488,13 @@ CodeEmitterNVC0::emitOUT(const Instruction *i)
// vertex stream
if (i->src(1).getFile() == FILE_IMMEDIATE) {
code[1] |= 0xc000;
code[0] |= SDATA(i->src(1)).u32 << 26;
// Using immediate encoding here triggers an invalid opcode error
// or random results when error reporting is disabled.
// TODO: figure this out when we get multiple vertex streams
assert(SDATA(i->src(1)).u32 == 0);
srcId(NULL, 26);
// code[1] |= 0xc000;
// code[0] |= SDATA(i->src(1)).u32 << 26;
} else {
srcId(i->src(1), 26);
}


@@ -861,8 +861,8 @@ int Source::inferSysValDirection(unsigned sn) const
case TGSI_SEMANTIC_INSTANCEID:
case TGSI_SEMANTIC_VERTEXID:
return 1;
#if 0
case TGSI_SEMANTIC_LAYER:
#if 0
case TGSI_SEMANTIC_VIEWPORTINDEX:
return 0;
#endif


@@ -37,18 +37,25 @@ namespace nv50_ir {
// ah*bl 00
//
// fffe0001 + fffe0001
//
// Note that this sort of splitting doesn't work for signed values, so we
// compute the sign on those manually and then perform an unsigned multiply.
static bool
expandIntegerMUL(BuildUtil *bld, Instruction *mul)
{
const bool highResult = mul->subOp == NV50_IR_SUBOP_MUL_HIGH;
DataType fTy = mul->sType; // full type
DataType hTy;
DataType fTy; // full type
switch (mul->sType) {
case TYPE_S32: fTy = TYPE_U32; break;
case TYPE_S64: fTy = TYPE_U64; break;
default: fTy = mul->sType; break;
}
DataType hTy; // half type
switch (fTy) {
case TYPE_S32: hTy = TYPE_S16; break;
case TYPE_U32: hTy = TYPE_U16; break;
case TYPE_U64: hTy = TYPE_U32; break;
case TYPE_S64: hTy = TYPE_S32; break;
default:
return false;
}
@@ -59,15 +66,25 @@ expandIntegerMUL(BuildUtil *bld, Instruction *mul)
bld->setPosition(mul, true);
Value *s[2];
Value *a[2], *b[2];
Value *c[2];
Value *t[4];
for (int j = 0; j < 4; ++j)
t[j] = bld->getSSA(fullSize);
s[0] = mul->getSrc(0);
s[1] = mul->getSrc(1);
if (isSignedType(mul->sType)) {
s[0] = bld->getSSA(fullSize);
s[1] = bld->getSSA(fullSize);
bld->mkOp1(OP_ABS, mul->sType, s[0], mul->getSrc(0));
bld->mkOp1(OP_ABS, mul->sType, s[1], mul->getSrc(1));
}
// split sources into halves
i[0] = bld->mkSplit(a, halfSize, mul->getSrc(0));
i[1] = bld->mkSplit(b, halfSize, mul->getSrc(1));
i[0] = bld->mkSplit(a, halfSize, s[0]);
i[1] = bld->mkSplit(b, halfSize, s[1]);
i[2] = bld->mkOp2(OP_MUL, fTy, t[0], a[0], b[1]);
i[3] = bld->mkOp3(OP_MAD, fTy, t[1], a[1], b[0], t[0]);
@@ -75,23 +92,76 @@ expandIntegerMUL(BuildUtil *bld, Instruction *mul)
i[4] = bld->mkOp3(OP_MAD, fTy, t[3], a[0], b[0], t[2]);
if (highResult) {
Value *r[3];
Value *c[2];
Value *r[5];
Value *imm = bld->loadImm(NULL, 1 << (halfSize * 8));
c[0] = bld->getSSA(1, FILE_FLAGS);
c[1] = bld->getSSA(1, FILE_FLAGS);
for (int j = 0; j < 3; ++j)
for (int j = 0; j < 5; ++j)
r[j] = bld->getSSA(fullSize);
i[8] = bld->mkOp2(OP_SHR, fTy, r[0], t[1], bld->mkImm(halfSize * 8));
i[6] = bld->mkOp2(OP_ADD, fTy, r[1], r[0], imm);
bld->mkOp2(OP_UNION, TYPE_U32, r[2], r[1], r[0]);
i[5] = bld->mkOp3(OP_MAD, fTy, mul->getDef(0), a[1], b[1], r[2]);
bld->mkMov(r[3], r[0])->setPredicate(CC_NC, c[0]);
bld->mkOp2(OP_UNION, TYPE_U32, r[2], r[1], r[3]);
i[5] = bld->mkOp3(OP_MAD, fTy, r[4], a[1], b[1], r[2]);
// set carry defs / sources
i[3]->setFlagsDef(1, c[0]);
i[4]->setFlagsDef(0, c[1]); // actual result not required, just the carry
// actual result required in negative case, but ignored for
// unsigned. for some reason the compiler ends up dropping the whole
// instruction if the destination is unused, even though the flags are used.
if (isSignedType(mul->sType))
i[4]->setFlagsDef(1, c[1]);
else
i[4]->setFlagsDef(0, c[1]);
i[6]->setPredicate(CC_C, c[0]);
i[5]->setFlagsSrc(3, c[1]);
if (isSignedType(mul->sType)) {
Value *cc[2];
Value *rr[7];
Value *one = bld->getSSA(fullSize);
bld->loadImm(one, 1);
for (int j = 0; j < 7; j++)
rr[j] = bld->getSSA(fullSize);
// NOTE: this logic uses predicates because splitting basic blocks is
// ~impossible during the SSA phase. The RA relies on a correlation
// between edge order and phi node sources.
// Set the sign of the result based on the inputs
bld->mkOp2(OP_XOR, fTy, NULL, mul->getSrc(0), mul->getSrc(1))
->setFlagsDef(0, (cc[0] = bld->getSSA(1, FILE_FLAGS)));
// 1s complement of 64-bit value
bld->mkOp1(OP_NOT, fTy, rr[0], r[4])
->setPredicate(CC_S, cc[0]);
bld->mkOp1(OP_NOT, fTy, rr[1], t[3])
->setPredicate(CC_S, cc[0]);
// add to low 32-bits, keep track of the carry
Instruction *n = bld->mkOp2(OP_ADD, fTy, NULL, rr[1], one);
n->setPredicate(CC_S, cc[0]);
n->setFlagsDef(0, (cc[1] = bld->getSSA(1, FILE_FLAGS)));
// If there was a carry, add 1 to the upper 32 bits
// XXX: These get executed even if they shouldn't be
bld->mkOp2(OP_ADD, fTy, rr[2], rr[0], one)
->setPredicate(CC_C, cc[1]);
bld->mkMov(rr[3], rr[0])
->setPredicate(CC_NC, cc[1]);
bld->mkOp2(OP_UNION, fTy, rr[4], rr[2], rr[3]);
// Merge the results from the negative and non-negative paths
bld->mkMov(rr[5], rr[4])
->setPredicate(CC_S, cc[0]);
bld->mkMov(rr[6], r[4])
->setPredicate(CC_NS, cc[0]);
bld->mkOp2(OP_UNION, mul->sType, mul->getDef(0), rr[5], rr[6]);
} else {
bld->mkMov(mul->getDef(0), r[4]);
}
} else {
bld->mkMov(mul->getDef(0), t[3]);
}
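
The signed high-multiply strategy used in this hunk (take absolute values, do an unsigned multiply built from half-width pieces, then negate the full-width result if the input signs differ) can be sketched in plain C. This is an illustration of the arithmetic only, not of the IR that the pass emits; the function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 32x32 -> 64 multiply assembled from 16-bit halves. */
static uint64_t
mul_u32_by_halves(uint32_t a, uint32_t b)
{
   uint32_t al = a & 0xffff, ah = a >> 16;
   uint32_t bl = b & 0xffff, bh = b >> 16;
   uint64_t lo  = (uint64_t)al * bl;
   uint64_t mid = (uint64_t)al * bh + (uint64_t)ah * bl; /* cross terms */
   uint64_t hi  = (uint64_t)ah * bh;

   return lo + (mid << 16) + (hi << 32);
}

/* High 32 bits of a signed 32x32 multiply, computed via the unsigned path. */
static int32_t
mul_high_s32(int32_t a, int32_t b)
{
   /* Absolute values as unsigned; 0u - x also handles INT32_MIN. */
   uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
   uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;
   uint64_t p = mul_u32_by_halves(ua, ub);

   /* Signs differ: two's complement negate (one's complement, then add 1,
    * with the carry propagating into the upper half). */
   if ((a ^ b) < 0)
      p = ~p + 1;

   /* Conversion of values above INT32_MAX is implementation-defined in C,
    * but wraps as expected on common compilers. */
   return (int32_t)(uint32_t)(p >> 32);
}
```

Negating a zero product leaves it at zero, so a sign test based only on the XOR of the inputs stays correct when one operand is 0.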
@@ -590,6 +660,10 @@ void NV50LoweringPreSSA::loadTexMsInfo(uint32_t off, Value **ms,
Value *tmp = new_LValue(func, FILE_GPR);
uint8_t b = prog->driver->io.resInfoCBSlot;
off += prog->driver->io.suInfoBase;
if (prog->getType() > Program::TYPE_VERTEX)
off += 16 * 2 * 4;
if (prog->getType() > Program::TYPE_GEOMETRY)
off += 16 * 2 * 4;
*ms_x = bld.mkLoadv(TYPE_U32, bld.mkSymbol(
FILE_MEMORY_CONST, b, TYPE_U32, off + 0), NULL);
*ms_y = bld.mkLoadv(TYPE_U32, bld.mkSymbol(
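
The per-stage adjustment added to `loadTexMsInfo` can be sketched as follows. This is a simplified C model under stated assumptions: the enum is hypothetical and only mirrors the ordering implied by the two comparisons (vertex before geometry before fragment), and reading `16 * 2 * 4` as 16 texture slots of two 4-byte values is an interpretation based on the constant buffer layout comment elsewhere in this diff.

```c
#include <assert.h>

/* Hypothetical stage enum; only the relative order matters here. */
enum toy_stage { TOY_STAGE_VERTEX, TOY_STAGE_GEOMETRY, TOY_STAGE_FRAGMENT };

#define TOY_TEX_SLOTS_PER_STAGE 16
#define TOY_BYTES_PER_SLOT (2 * 4) /* ms_x and ms_y, one u32 each */

/* Each later stage skips over the blocks of the stages before it. */
static unsigned
tex_ms_info_offset(unsigned base, enum toy_stage stage)
{
   unsigned off = base;

   assert(stage <= TOY_STAGE_FRAGMENT);
   if (stage > TOY_STAGE_VERTEX)
      off += TOY_TEX_SLOTS_PER_STAGE * TOY_BYTES_PER_SLOT;
   if (stage > TOY_STAGE_GEOMETRY)
      off += TOY_TEX_SLOTS_PER_STAGE * TOY_BYTES_PER_SLOT;
   return off;
}
```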


@@ -666,8 +666,9 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
const int dim = i->tex.target.getDim() + i->tex.target.isCube();
const int arg = i->tex.target.getArgCount();
const int lyr = arg - (i->tex.target.isMS() ? 2 : 1);
const int chipset = prog->getTarget()->getChipset();
if (prog->getTarget()->getChipset() >= NVISA_GK104_CHIPSET) {
if (chipset >= NVISA_GK104_CHIPSET) {
if (i->tex.rIndirectSrc >= 0 || i->tex.sIndirectSrc >= 0) {
WARN("indirect TEX not implemented\n");
}
@@ -697,7 +698,7 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
}
} else
// (nvc0) generate and move the tsc/tic/array source to the front
if (dim != arg || i->tex.rIndirectSrc >= 0 || i->tex.sIndirectSrc >= 0) {
if (i->tex.target.isArray() || i->tex.rIndirectSrc >= 0 || i->tex.sIndirectSrc >= 0) {
LValue *src = new_LValue(func, FILE_GPR); // 0xttxsaaaa
Value *arrayIndex = i->tex.target.isArray() ? i->getSrc(lyr) : NULL;
@@ -728,6 +729,13 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
i->setSrc(0, src);
}
// For nvc0, the sample id has to be in the second operand, as the offset
// does. Right now we don't know how to pass both in, and this case can't
// happen with OpenGL. On nve0, the sample id is part of the texture
// coordinate argument.
assert(chipset >= NVISA_GK104_CHIPSET ||
!i->tex.useOffsets || !i->tex.target.isMS());
// offset is last source (lod 1st, dc 2nd)
if (i->tex.useOffsets) {
uint32_t value = 0;
@@ -741,7 +749,7 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
i->setSrc(s, bld.loadImm(NULL, value));
}
if (prog->getTarget()->getChipset() >= NVISA_GK104_CHIPSET) {
if (chipset >= NVISA_GK104_CHIPSET) {
//
// If TEX requires more than 4 sources, the 2nd register tuple must be
// aligned to 4, even if it consists of just a single 4-byte register.


@@ -187,7 +187,8 @@ LoadPropagation::checkSwapSrc01(Instruction *insn)
return;
}
if (insn->op == OP_SET)
if (insn->op == OP_SET || insn->op == OP_SET_AND ||
insn->op == OP_SET_OR || insn->op == OP_SET_XOR)
insn->asCmp()->setCond = reverseCondCode(insn->asCmp()->setCond);
else
if (insn->op == OP_SLCT)
@@ -417,7 +418,17 @@ ConstantFolding::expr(Instruction *i,
case TYPE_F32: res.data.f32 = a->data.f32 * b->data.f32; break;
case TYPE_F64: res.data.f64 = a->data.f64 * b->data.f64; break;
case TYPE_S32:
case TYPE_U32: res.data.u32 = a->data.u32 * b->data.u32; break;
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH) {
res.data.s32 = ((int64_t)a->data.s32 * b->data.s32) >> 32;
break;
}
/* fallthrough */
case TYPE_U32:
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH) {
res.data.u32 = ((uint64_t)a->data.u32 * b->data.u32) >> 32;
break;
}
res.data.u32 = a->data.u32 * b->data.u32; break;
default:
return;
}
@@ -524,6 +535,7 @@ ConstantFolding::expr(Instruction *i,
} else {
i->op = OP_MOV;
}
i->subOp = 0;
}
void
@@ -625,12 +637,41 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
{
const int t = !s;
const operation op = i->op;
Instruction *newi = i;
switch (i->op) {
case OP_MUL:
if (i->dType == TYPE_F32)
tryCollapseChainedMULs(i, s, imm0);
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH) {
assert(!isFloatType(i->sType));
if (imm0.isInteger(1) && i->dType == TYPE_S32) {
bld.setPosition(i, false);
// Need to set to the sign value, which is a compare.
newi = bld.mkCmp(OP_SET, CC_LT, TYPE_S32, i->getDef(0),
TYPE_S32, i->getSrc(t), bld.mkImm(0));
delete_Instruction(prog, i);
} else if (imm0.isInteger(0) || imm0.isInteger(1)) {
// The high bits can't be set in this case (either mul by 0 or
// unsigned by 1)
i->op = OP_MOV;
i->subOp = 0;
i->setSrc(0, new_ImmediateValue(prog, 0u));
i->src(0).mod = Modifier(0);
i->setSrc(1, NULL);
} else if (!imm0.isNegative() && imm0.isPow2()) {
// Translate into a shift
imm0.applyLog2();
i->op = OP_SHR;
i->subOp = 0;
imm0.reg.data.u32 = 32 - imm0.reg.data.u32;
i->setSrc(0, i->getSrc(t));
i->src(0).mod = i->src(t).mod;
i->setSrc(1, new_ImmediateValue(prog, imm0.reg.data.u32));
i->src(1).mod = 0;
}
} else
if (imm0.isInteger(0)) {
i->op = OP_MOV;
i->setSrc(0, new_ImmediateValue(prog, 0u));
@@ -721,7 +762,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
else
tA = tB;
tB = s ? bld.getSSA() : i->getDef(0);
bld.mkOp2(OP_ADD, TYPE_U32, tB, mul->getDef(0), tA);
newi = bld.mkOp2(OP_ADD, TYPE_U32, tB, mul->getDef(0), tA);
if (s)
bld.mkOp2(OP_SHR, TYPE_U32, i->getDef(0), tB, bld.mkImm(s));
@@ -753,7 +794,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
tA = bld.getSSA();
bld.mkCmp(OP_SET, CC_LT, TYPE_S32, tA, TYPE_S32, i->getSrc(0), bld.mkImm(0));
tD = (d < 0) ? bld.getSSA() : i->getDef(0)->asLValue();
bld.mkOp2(OP_SUB, TYPE_U32, tD, tB, tA);
newi = bld.mkOp2(OP_SUB, TYPE_U32, tD, tB, tA);
if (d < 0)
bld.mkOp1(OP_NEG, TYPE_S32, i->getDef(0), tB);
@@ -831,7 +872,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
default:
return;
}
if (i->op != op)
if (newi->op != op)
foldCount++;
}
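
The `MUL_HIGH` folds above rest on a few arithmetic identities, which can be checked in plain C. The sketch below covers the unsigned power-of-two case and the signed multiply by 1; the function names are illustrative assumptions and nothing here models the IR itself.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: high 32 bits of an unsigned 32x32 multiply. */
static uint32_t
mul_high_u32(uint32_t a, uint32_t b)
{
   return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Multiplying by 2^k and keeping the high word equals a right shift by
 * 32 - k. Valid for k in 1..31; k == 0 would need a shift by 32. */
static uint32_t
mul_high_u32_pow2(uint32_t a, unsigned k)
{
   assert(k >= 1 && k <= 31);
   return a >> (32 - k);
}

/* High word of a signed multiply by 1 is just the sign extension. */
static int32_t
mul_high_s32_by_one(int32_t a)
{
   return a < 0 ? -1 : 0;
}

/* Returns 1 if the shift identity holds for every valid k. */
static int
check_pow2_identity(uint32_t a)
{
   unsigned k;

   for (k = 1; k <= 31; ++k)
      if (mul_high_u32(a, 1u << k) != mul_high_u32_pow2(a, k))
         return 0;
   return 1;
}
```

This is also why multiplying by 0, or an unsigned multiply by 1, folds to a constant zero: the product never reaches the upper word.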


@@ -284,6 +284,7 @@ public:
bool run(const std::list<ValuePair>&);
Symbol *assignSlot(const Interval&, const unsigned int size);
Value *offsetSlot(Value *, const LValue *);
inline int32_t getStackSize() const { return stackSize; }
private:
@@ -774,6 +775,7 @@ GCRA::RIG_Node::init(const RegisterSet& regs, LValue *lval)
weight = std::numeric_limits<float>::infinity();
degree = 0;
degreeLimit = regs.getFileSize(f, lval->reg.size);
degreeLimit -= relDegree[1][colors] - 1;
livei.insert(lval->livei);
}
@@ -1466,10 +1468,25 @@ SpillCodeInserter::assignSlot(const Interval &livei, const unsigned int size)
return slot.sym;
}
Value *
SpillCodeInserter::offsetSlot(Value *base, const LValue *lval)
{
if (!lval->compound || (lval->compMask & 0x1))
return base;
Value *slot = cloneShallow(func, base);
slot->reg.data.offset += (ffs(lval->compMask) - 1) * lval->reg.size;
slot->reg.size = lval->reg.size;
return slot;
}
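
The offset computation in `offsetSlot` can be modelled with plain integers. In this hedged C sketch, a mask of 0 stands in for the non-compound case, and the helper names are assumptions; `lowest_set_bit` follows the same contract as POSIX `ffs`.

```c
#include <assert.h>

/* 1-based index of the lowest set bit, 0 if no bit is set. */
static int
lowest_set_bit(unsigned mask)
{
   int i;

   for (i = 0; i < (int)(sizeof(mask) * 8); ++i)
      if (mask & (1u << i))
         return i + 1;
   return 0;
}

/*
 * Byte offset of a sub-value inside a spilled compound value.
 * comp_mask marks which components the sub-value covers; comp_size is the
 * size of the sub-value in bytes.
 */
static unsigned
spill_slot_offset(unsigned base_offset, unsigned comp_mask, unsigned comp_size)
{
   /* Not part of a compound, or it starts at component 0: use the base. */
   if (comp_mask == 0 || (comp_mask & 0x1))
      return base_offset;

   assert(lowest_set_bit(comp_mask) >= 2);
   return base_offset + (unsigned)(lowest_set_bit(comp_mask) - 1) * comp_size;
}
```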
void
SpillCodeInserter::spill(Instruction *defi, Value *slot, LValue *lval)
{
const DataType ty = typeOfSize(slot->reg.size);
const DataType ty = typeOfSize(lval->reg.size);
slot = offsetSlot(slot, lval);
Instruction *st;
if (slot->reg.file == FILE_MEMORY_LOCAL) {
@@ -1488,8 +1505,9 @@ SpillCodeInserter::spill(Instruction *defi, Value *slot, LValue *lval)
LValue *
SpillCodeInserter::unspill(Instruction *usei, LValue *lval, Value *slot)
{
const DataType ty = typeOfSize(slot->reg.size);
const DataType ty = typeOfSize(lval->reg.size);
slot = offsetSlot(slot, lval);
lval = cloneShallow(func, lval);
Instruction *ld;
@@ -1506,6 +1524,16 @@ SpillCodeInserter::unspill(Instruction *usei, LValue *lval, Value *slot)
return lval;
}
// For each value that is to be spilled, go through all its definitions.
// A value can have multiple definitions if it has been coalesced before.
// For each definition, first go through all its uses and insert an unspill
// instruction before it, then replace the use with the temporary register.
// Unspill can be either a load from memory or simply a move to another
// register file.
// For "Pseudo" instructions (like PHI, SPLIT, MERGE) we can erase the use
// if we have spilled to a memory location, or simply replace it with the
// new register.
// No load or conversion instruction should be needed.
bool
SpillCodeInserter::run(const std::list<ValuePair>& lst)
{
@@ -1524,12 +1552,13 @@ SpillCodeInserter::run(const std::list<ValuePair>& lst)
LValue *dval = (*d)->get()->asLValue();
Instruction *defi = (*d)->getInsn();
// handle uses first or they'll contain the spill stores
// Unspill at each use *before* inserting spill instructions;
// we don't want to have the spill instructions in the use list here.
while (!dval->uses.empty()) {
ValueRef *u = dval->uses.front();
Instruction *usei = u->getInsn();
assert(usei);
if (usei->op == OP_PHI) {
if (usei->isPseudo()) {
tmp = (slot->reg.file == FILE_MEMORY_LOCAL) ? NULL : slot;
last = NULL;
} else
@@ -1541,7 +1570,7 @@ SpillCodeInserter::run(const std::list<ValuePair>& lst)
}
assert(defi);
if (defi->op == OP_PHI) {
if (defi->isPseudo()) {
d = lval->defs.erase(d);
--d;
if (slot->reg.file == FILE_MEMORY_LOCAL)
@@ -1885,7 +1914,7 @@ RegAlloc::InsertConstraintsPass::texConstraintNVC0(TexInstruction *tex)
s = tex->srcCount(0xff);
n = 0;
} else {
s = tex->tex.target.getArgCount();
s = tex->tex.target.getArgCount() - tex->tex.target.isMS();
if (!tex->tex.target.isArray() &&
(tex->tex.rIndirectSrc >= 0 || tex->tex.sIndirectSrc >= 0))
++s;


@@ -329,6 +329,8 @@ TargetNV50::insnCanLoad(const Instruction *i, int s,
return false;
if (sf == FILE_IMMEDIATE)
return false;
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH && sf == FILE_MEMORY_CONST)
return false;
ldSize = 2;
} else {
ldSize = typeSizeof(ld->dType);
@@ -532,7 +534,7 @@ recordLocation(uint16_t *locs, uint8_t *masks,
case TGSI_SEMANTIC_INSTANCEID: locs[SV_INSTANCE_ID] = addr; break;
case TGSI_SEMANTIC_VERTEXID: locs[SV_VERTEX_ID] = addr; break;
case TGSI_SEMANTIC_PRIMID: locs[SV_PRIMITIVE_ID] = addr; break;
case NV50_SEMANTIC_LAYER: locs[SV_LAYER] = addr; break;
case TGSI_SEMANTIC_LAYER: locs[SV_LAYER] = addr; break;
case NV50_SEMANTIC_VIEWPORTINDEX: locs[SV_VIEWPORT_INDEX] = addr; break;
default:
break;


@@ -49,6 +49,11 @@ struct nouveau_screen {
boolean hint_buf_keep_sysmem_copy;
struct {
unsigned profiles_checked;
unsigned profiles_present;
} firmware_info;
#ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS
union {
uint64_t v[29];


@@ -21,6 +21,7 @@
*/
#include <sys/mman.h>
#include <sys/stat.h>
#include <stdio.h>
#include <fcntl.h>
@@ -350,6 +351,77 @@ nouveau_vp3_load_firmware(struct nouveau_vp3_decoder *dec,
return 0;
}
static int
firmware_present(struct pipe_screen *pscreen, enum pipe_video_profile profile)
{
struct nouveau_screen *screen = nouveau_screen(pscreen);
int chipset = screen->device->chipset;
int vp3 = chipset < 0xa3 || chipset == 0xaa || chipset == 0xac;
int vp5 = chipset >= 0xd0;
int ret;
/* For all chipsets, try to create a BSP object. Assume that if firmware
* is present for it, firmware is also present for VP/PPP */
if (!(screen->firmware_info.profiles_checked & 1)) {
struct nouveau_object *channel = NULL, *bsp = NULL;
struct nv04_fifo nv04_data = {.vram = 0xbeef0201, .gart = 0xbeef0202};
struct nvc0_fifo nvc0_args = {};
struct nve0_fifo nve0_args = {.engine = NVE0_FIFO_ENGINE_BSP};
void *data = NULL;
int size, oclass;
if (chipset < 0xc0)
oclass = 0x85b1;
else if (chipset < 0xe0)
oclass = 0x90b1;
else
oclass = 0x95b1;
if (chipset < 0xc0) {
data = &nv04_data;
size = sizeof(nv04_data);
} else if (chipset < 0xe0) {
data = &nvc0_args;
size = sizeof(nvc0_args);
} else {
data = &nve0_args;
size = sizeof(nve0_args);
}
/* kepler must have its own channel, so just do this for everyone */
nouveau_object_new(&screen->device->object, 0,
NOUVEAU_FIFO_CHANNEL_CLASS,
data, size, &channel);
if (channel) {
nouveau_object_new(channel, 0, oclass, NULL, 0, &bsp);
if (bsp)
screen->firmware_info.profiles_present |= 1;
nouveau_object_del(&bsp);
nouveau_object_del(&channel);
}
screen->firmware_info.profiles_checked |= 1;
}
if (!(screen->firmware_info.profiles_present & 1))
return 0;
/* For vp3/vp4 chipsets, make sure that the relevant firmware is present */
if (!vp5 && !(screen->firmware_info.profiles_checked & (1 << profile))) {
char path[PATH_MAX];
struct stat s;
if (vp3)
vp3_getpath(profile, path);
else
vp4_getpath(profile, path);
ret = stat(path, &s);
if (!ret && s.st_size > 1000)
screen->firmware_info.profiles_present |= (1 << profile);
screen->firmware_info.profiles_checked |= (1 << profile);
}
return vp5 || (screen->firmware_info.profiles_present & (1 << profile));
}
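
`firmware_present` caches its results in two bitmasks, one recording which profiles were checked and one recording which were found, so the expensive probe runs at most once per profile. A minimal, self-contained C sketch of that pattern is shown below; the probe function and all names are hypothetical stand-ins, not nouveau code.

```c
#include <assert.h>

struct probe_cache {
   unsigned checked;  /* bit n set: item n has already been probed */
   unsigned present;  /* bit n set: item n was found */
};

static int probe_calls; /* counts how often the expensive probe runs */

/* Hypothetical stand-in for an expensive check, e.g. stat()ing a file. */
static int
expensive_probe(unsigned item)
{
   probe_calls++;
   return (item % 2) == 0; /* pretend even-numbered items exist */
}

static int
cached_present(struct probe_cache *cache, unsigned item)
{
   unsigned bit;

   assert(item < sizeof(cache->checked) * 8);
   bit = 1u << item;
   if (!(cache->checked & bit)) {
      if (expensive_probe(item))
         cache->present |= bit;
      cache->checked |= bit; /* remember the answer, even if negative */
   }
   return (cache->present & bit) != 0;
}

/* Queries two items twice each and reports how many probes actually ran. */
static int
demo_probe_count(void)
{
   struct probe_cache cache = { 0, 0 };

   probe_calls = 0;
   cached_present(&cache, 2);
   cached_present(&cache, 2);
   cached_present(&cache, 3);
   cached_present(&cache, 3);
   return probe_calls;
}
```

Keeping a separate "checked" mask is what lets a negative result be cached as well, instead of being re-probed on every query.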
int
nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
enum pipe_video_profile profile,
@@ -363,8 +435,10 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
/* VP3 does not support MPEG4, VP4+ do. */
return profile >= PIPE_VIDEO_PROFILE_MPEG1 && (
!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4);
return entrypoint == PIPE_VIDEO_ENTRYPOINT_BITSTREAM &&
profile >= PIPE_VIDEO_PROFILE_MPEG1 &&
(!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4) &&
firmware_present(pscreen, profile);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
case PIPE_VIDEO_CAP_MAX_WIDTH:


@@ -108,6 +108,8 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXEL_OFFSET:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
@@ -219,7 +221,7 @@ nv30_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH:
return 0;
case PIPE_SHADER_CAP_MAX_INPUTS:
return (eng3d->oclass >= NV40_3D_CLASS) ? 12 : 10;
return 8; /* should be possible to do 10 with nv4x */
case PIPE_SHADER_CAP_MAX_CONSTS:
return (eng3d->oclass >= NV40_3D_CLASS) ? 224 : 32;
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
@@ -300,10 +302,16 @@ nv30_screen_destroy(struct pipe_screen *pscreen)
{
struct nv30_screen *screen = nv30_screen(pscreen);
if (screen->base.fence.current &&
screen->base.fence.current->state >= NOUVEAU_FENCE_STATE_EMITTED) {
nouveau_fence_wait(screen->base.fence.current);
nouveau_fence_ref (NULL, &screen->base.fence.current);
if (screen->base.fence.current) {
struct nouveau_fence *current = NULL;
/* nouveau_fence_wait will create a new current fence, so wait on the
* _current_ one, and remove both.
*/
nouveau_fence_ref(screen->base.fence.current, &current);
nouveau_fence_wait(current);
nouveau_fence_ref(NULL, &current);
nouveau_fence_ref(NULL, &screen->base.fence.current);
}
nouveau_object_del(&screen->query);
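
The reworked teardown takes its own reference to the current fence before waiting, because (per the comment in the hunk) waiting installs a new current fence. The toy C model below illustrates why both references must be dropped. Every type and function here is a hypothetical simplification, not the nouveau fence API.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_fence { int refcount; };
struct toy_screen { struct toy_fence *current; };

static int toy_fences_freed;

/* Point *ref at fence, adjusting refcounts and freeing on the last unref. */
static void
toy_fence_ref(struct toy_fence *fence, struct toy_fence **ref)
{
   if (fence)
      fence->refcount++;
   if (*ref && --(*ref)->refcount == 0) {
      free(*ref);
      toy_fences_freed++;
   }
   *ref = fence;
}

/* Models the side effect: waiting replaces the screen's current fence. */
static void
toy_fence_wait(struct toy_screen *screen, struct toy_fence *fence)
{
   struct toy_fence *next = calloc(1, sizeof(*next));

   assert(fence != NULL);
   toy_fence_ref(next, &screen->current);
}

static void
toy_screen_destroy(struct toy_screen *screen)
{
   if (screen->current) {
      struct toy_fence *current = NULL;

      /* Hold the fence we wait on so it survives the replacement. */
      toy_fence_ref(screen->current, &current);
      toy_fence_wait(screen, current);
      toy_fence_ref(NULL, &current);          /* drop the waited-on fence */
      toy_fence_ref(NULL, &screen->current);  /* drop the newly created one */
   }
}

/* Returns how many fences were freed by a full teardown. */
static int
toy_destroy_demo(void)
{
   struct toy_screen screen = { NULL };
   struct toy_fence *first = calloc(1, sizeof(*first));

   toy_fences_freed = 0;
   toy_fence_ref(first, &screen.current);
   toy_screen_destroy(&screen);
   return toy_fences_freed;
}
```

In this model, dropping only one of the two references would leak a fence.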


@@ -77,13 +77,13 @@
/* 8 user clip planes, at 4 32-bit floats each */
#define NV50_CB_AUX_UCP_OFFSET 0x0000
#define NV50_CB_AUX_UCP_SIZE (8 * 4 * 4)
/* 256 textures, each with ms_x, ms_y u32 pairs */
/* 16 textures * 3 shaders, each with ms_x, ms_y u32 pairs */
#define NV50_CB_AUX_TEX_MS_OFFSET 0x0080
#define NV50_CB_AUX_TEX_MS_SIZE (256 * 2 * 4)
#define NV50_CB_AUX_TEX_MS_SIZE (16 * 3 * 2 * 4)
/* For each MS level (4), 8 sets of 32-bit integer pairs sample offsets */
#define NV50_CB_AUX_MS_OFFSET 0x880
#define NV50_CB_AUX_MS_OFFSET 0x200
#define NV50_CB_AUX_MS_SIZE (4 * 8 * 4 * 2)
/* next spot: 0x980 */
/* next spot: 0x300 */
/* 4 32-bit floats for the vertex runout, put at the end */
#define NV50_CB_AUX_RUNOUT_OFFSET (NV50_CB_AUX_SIZE - 0x10)
@@ -171,6 +171,8 @@ struct nv50_context {
boolean vbo_push_hint;
uint32_t rt_array_mode;
struct pipe_query *cond_query;
boolean cond_cond;
uint cond_mode;
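
The new auxiliary constant buffer offsets in the first hunk of this file follow directly from the region sizes. The arithmetic can be checked with a small C sketch; the `TOY_` macros copy the new-side values from the hunk and are otherwise illustrative.

```c
#include <assert.h>

#define TOY_UCP_OFFSET     0x0000
#define TOY_UCP_SIZE       (8 * 4 * 4)       /* 8 planes * 4 floats * 4 bytes */
#define TOY_TEX_MS_OFFSET  0x0080
#define TOY_TEX_MS_SIZE    (16 * 3 * 2 * 4)  /* 16 textures * 3 shaders * 2 u32 */
#define TOY_MS_OFFSET      0x200
#define TOY_MS_SIZE        (4 * 8 * 4 * 2)   /* 4 levels * 8 pairs of u32 */
#define TOY_NEXT_SPOT      0x300

/* Returns 1 if each region starts exactly where the previous one ends. */
static int
regions_are_contiguous(void)
{
   assert(TOY_UCP_SIZE > 0);
   return TOY_UCP_OFFSET + TOY_UCP_SIZE == TOY_TEX_MS_OFFSET &&
          TOY_TEX_MS_OFFSET + TOY_TEX_MS_SIZE == TOY_MS_OFFSET &&
          TOY_MS_OFFSET + TOY_MS_SIZE == TOY_NEXT_SPOT;
}
```

The old values are consistent in the same way: 0x80 + 256 * 2 * 4 = 0x880, and 0x880 + 0x100 = 0x980.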


@@ -104,7 +104,7 @@ nv50_vertprog_assign_slots(struct nv50_ir_prog_info *info)
prog->vp.bfc[info->out[i].si] = i;
break;
case TGSI_SEMANTIC_LAYER:
prog->gp.has_layer = true;
prog->gp.has_layer = TRUE;
prog->gp.layerid = n;
break;
default:
@@ -170,10 +170,8 @@ nv50_fragprog_assign_slots(struct nv50_ir_prog_info *info)
if (info->in[i].sn == TGSI_SEMANTIC_COLOR)
prog->vp.bfc[info->in[i].si] = j;
else if (info->in[i].sn == TGSI_SEMANTIC_PRIMID) {
else if (info->in[i].sn == TGSI_SEMANTIC_PRIMID)
prog->vp.attrs[2] |= NV50_3D_VP_GP_BUILTIN_ATTR_EN_PRIMITIVE_ID;
prog->gp.primid = j;
}
prog->in[j].id = i;
prog->in[j].mask = info->in[i].mask;
@@ -345,7 +343,6 @@ nv50_program_translate(struct nv50_program *prog, uint16_t chipset)
prog->vp.clpd[0] = map_undef;
prog->vp.clpd[1] = map_undef;
prog->vp.psiz = map_undef;
prog->gp.primid = 0x80;
prog->gp.has_layer = 0;
info->driverPriv = prog;


@@ -88,9 +88,8 @@ struct nv50_program {
struct {
uint32_t vert_count;
ubyte primid; /* primitive id output register */
uint8_t prim_type; /* point, line strip or tri strip */
bool has_layer;
uint8_t has_layer;
ubyte layerid; /* hw value of layer output */
} gp;


@@ -143,6 +143,9 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
return 64;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 1024;
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
return (class_3d >= NVA0_3D_CLASS) ? 1 : 0;
case PIPE_CAP_BLEND_EQUATION_SEPARATE:
@@ -287,8 +290,15 @@ nv50_screen_destroy(struct pipe_screen *pscreen)
struct nv50_screen *screen = nv50_screen(pscreen);
if (screen->base.fence.current) {
nouveau_fence_wait(screen->base.fence.current);
nouveau_fence_ref (NULL, &screen->base.fence.current);
struct nouveau_fence *current = NULL;
/* nouveau_fence_wait will create a new current fence, so wait on the
* _current_ one, and remove both.
*/
nouveau_fence_ref(screen->base.fence.current, &current);
nouveau_fence_wait(current);
nouveau_fence_ref(NULL, &current);
nouveau_fence_ref(NULL, &screen->base.fence.current);
}
if (screen->base.pushbuf)
screen->base.pushbuf->user_priv = NULL;
@@ -741,12 +751,13 @@ nv50_screen_create(struct nouveau_device *dev)
goto fail;
}
/* This over-allocates by a whole code BO. The GP, which would execute at
* the end of the last page, would trigger faults. The going theory is that
* it prefetches up to a certain amount. This avoids dmesg spam.
/* This over-allocates by a page. The GP, which would execute at the end of
* the last page, would trigger faults. The going theory is that it
* prefetches up to a certain amount.
*/
ret = nouveau_bo_new(dev, NOUVEAU_BO_VRAM, 1 << 16,
4 << NV50_CODE_BO_SIZE_LOG2, NULL, &screen->code);
(3 << NV50_CODE_BO_SIZE_LOG2) + 0x1000,
NULL, &screen->code);
if (ret) {
NOUVEAU_ERR("Failed to allocate code bo: %d\n", ret);
goto fail;


@@ -346,7 +346,7 @@ nv50_fp_linkage_validate(struct nv50_context *nv50)
struct nv50_varying dummy;
int i, n, c, m;
uint32_t primid = 0;
uint32_t layerid = vp->gp.layerid;
uint32_t layerid = 0;
uint32_t psiz = 0x000;
uint32_t interp = fp->fp.interp;
uint32_t colors = fp->fp.colors;
@@ -401,17 +401,21 @@ nv50_fp_linkage_validate(struct nv50_context *nv50)
if (vp->out[n].sn == fp->in[i].sn &&
vp->out[n].si == fp->in[i].si)
break;
if (i == fp->gp.primid) {
switch (fp->in[i].sn) {
case TGSI_SEMANTIC_PRIMID:
primid = m;
break;
case TGSI_SEMANTIC_LAYER:
layerid = m;
break;
}
m = nv50_vec4_map(map, m, lin,
&fp->in[i], (n < vp->out_nr) ? &vp->out[n] : &dummy);
}
if (vp->gp.has_layer) {
// In GL4.x, layer can be an fp input, but not in 3.x. Make sure to add
// it to the output map.
map[m++] = layerid;
if (vp->gp.has_layer && !layerid) {
layerid = m;
map[m++] = vp->gp.layerid;
}
if (nv50->rast->pipe.point_size_per_vertex) {


@@ -556,11 +556,12 @@ nv50_sampler_state_delete(struct pipe_context *pipe, void *hwcso)
{
unsigned s, i;
for (s = 0; s < 3; ++s)
for (s = 0; s < 3; ++s) {
assert(nv50_context(pipe)->num_samplers[s] <= PIPE_MAX_SAMPLERS);
for (i = 0; i < nv50_context(pipe)->num_samplers[s]; ++i)
if (nv50_context(pipe)->samplers[s][i] == hwcso)
nv50_context(pipe)->samplers[s][i] = NULL;
}
nv50_screen_tsc_free(nv50_context(pipe)->screen, nv50_tsc_entry(hwcso));


@@ -65,6 +65,7 @@ nv50_validate_fb(struct nv50_context *nv50)
PUSH_DATA (push, sf->height);
BEGIN_NV04(push, NV50_3D(RT_ARRAY_MODE), 1);
PUSH_DATA (push, array_mode | array_size);
nv50->rt_array_mode = array_mode | array_size;
} else {
PUSH_DATA (push, 0);
PUSH_DATA (push, 0);


@@ -295,7 +295,7 @@ nv50_clear_render_target(struct pipe_context *pipe,
PUSH_DATA (push, bo->offset + sf->offset);
PUSH_DATA (push, nv50_format_table[dst->format].rt);
PUSH_DATA (push, mt->level[sf->base.u.tex.level].tile_mode);
PUSH_DATA (push, 0);
PUSH_DATA (push, mt->layer_stride >> 2);
BEGIN_NV04(push, NV50_3D(RT_HORIZ(0)), 2);
if (nouveau_bo_memtype(bo))
PUSH_DATA(push, sf->width);
@@ -303,7 +303,10 @@ nv50_clear_render_target(struct pipe_context *pipe,
PUSH_DATA(push, NV50_3D_RT_HORIZ_LINEAR | mt->level[0].pitch);
PUSH_DATA (push, sf->height);
BEGIN_NV04(push, NV50_3D(RT_ARRAY_MODE), 1);
PUSH_DATA (push, 1);
if (mt->layout_3d)
PUSH_DATA(push, NV50_3D_RT_ARRAY_MODE_MODE_3D | 512);
else
PUSH_DATA(push, 512);
if (!nouveau_bo_memtype(bo)) {
BEGIN_NV04(push, NV50_3D(ZETA_ENABLE), 1);
@@ -366,7 +369,7 @@ nv50_clear_depth_stencil(struct pipe_context *pipe,
PUSH_DATA (push, bo->offset + sf->offset);
PUSH_DATA (push, nv50_format_table[dst->format].rt);
PUSH_DATA (push, mt->level[sf->base.u.tex.level].tile_mode);
PUSH_DATA (push, 0);
PUSH_DATA (push, mt->layer_stride >> 2);
BEGIN_NV04(push, NV50_3D(ZETA_ENABLE), 1);
PUSH_DATA (push, 1);
BEGIN_NV04(push, NV50_3D(ZETA_HORIZ), 3);
@@ -374,6 +377,9 @@ nv50_clear_depth_stencil(struct pipe_context *pipe,
PUSH_DATA (push, sf->height);
PUSH_DATA (push, (1 << 16) | 1);
BEGIN_NV04(push, NV50_3D(RT_ARRAY_MODE), 1);
PUSH_DATA (push, 512);
BEGIN_NV04(push, NV50_3D(VIEWPORT_HORIZ(0)), 2);
PUSH_DATA (push, (width << 16) | dstx);
PUSH_DATA (push, (height << 16) | dsty);
@@ -402,6 +408,11 @@ nv50_clear(struct pipe_context *pipe, unsigned buffers,
if (!nv50_state_validate(nv50, NV50_NEW_FRAMEBUFFER, 9 + (fb->nr_cbufs * 2)))
return;
/* We have to clear ALL of the layers, not up to the min number of layers
* of any attachment. */
BEGIN_NV04(push, NV50_3D(RT_ARRAY_MODE), 1);
PUSH_DATA (push, (nv50->rt_array_mode & NV50_3D_RT_ARRAY_MODE_MODE_3D) | 512);
if (buffers & PIPE_CLEAR_COLOR && fb->nr_cbufs) {
BEGIN_NV04(push, NV50_3D(CLEAR_COLOR(0)), 4);
PUSH_DATAf(push, color->f[0]);
@@ -459,6 +470,10 @@ nv50_clear(struct pipe_context *pipe, unsigned buffers,
(j << NV50_3D_CLEAR_BUFFERS_LAYER__SHIFT));
}
}
/* restore the array mode */
BEGIN_NV04(push, NV50_3D(RT_ARRAY_MODE), 1);
PUSH_DATA (push, nv50->rt_array_mode);
}
@@ -962,6 +977,7 @@ nv50_blit_3d(struct nv50_context *nv50, const struct pipe_blit_info *info)
float x0, x1, y0, y1, z;
float dz;
float x_range, y_range;
float tri_x, tri_y;
blit->mode = nv50_blit_select_mode(info);
blit->color_mask = nv50_blit_derive_color_mask(info);
@@ -981,11 +997,14 @@ nv50_blit_3d(struct nv50_context *nv50, const struct pipe_blit_info *info)
x_range = (float)info->src.box.width / (float)info->dst.box.width;
y_range = (float)info->src.box.height / (float)info->dst.box.height;
tri_x = 16384 << nv50_miptree(dst)->ms_x;
tri_y = 16384 << nv50_miptree(dst)->ms_y;
x0 = (float)info->src.box.x - x_range * (float)info->dst.box.x;
y0 = (float)info->src.box.y - y_range * (float)info->dst.box.y;
x1 = x0 + 16384.0f * x_range;
y1 = y0 + 16384.0f * y_range;
x1 = x0 + tri_x * x_range;
y1 = y0 + tri_y * y_range;
x0 *= (float)(1 << nv50_miptree(src)->ms_x);
x1 *= (float)(1 << nv50_miptree(src)->ms_x);
@@ -1054,7 +1073,7 @@ nv50_blit_3d(struct nv50_context *nv50, const struct pipe_blit_info *info)
PUSH_DATAf(push, y0);
PUSH_DATAf(push, z);
BEGIN_NV04(push, NV50_3D(VTX_ATTR_2F_X(0)), 2);
PUSH_DATAf(push, 16384 << nv50_miptree(dst)->ms_x);
PUSH_DATAf(push, tri_x);
PUSH_DATAf(push, 0.0f);
BEGIN_NV04(push, NV50_3D(VTX_ATTR_3F_X(1)), 3);
PUSH_DATAf(push, x0);
@@ -1062,7 +1081,7 @@ nv50_blit_3d(struct nv50_context *nv50, const struct pipe_blit_info *info)
PUSH_DATAf(push, z);
BEGIN_NV04(push, NV50_3D(VTX_ATTR_2F_X(0)), 2);
PUSH_DATAf(push, 0.0f);
PUSH_DATAf(push, 16384 << nv50_miptree(dst)->ms_y);
PUSH_DATAf(push, tri_y);
BEGIN_NV04(push, NV50_3D(VERTEX_END_GL), 1);
PUSH_DATA (push, 0);
}

@@ -286,7 +286,7 @@ nv50_validate_tic(struct nv50_context *nv50, int s)
}
if (nv50->num_textures[s]) {
BEGIN_NV04(push, NV50_3D(CB_ADDR), 1);
PUSH_DATA (push, (NV50_CB_AUX_TEX_MS_OFFSET << (8 - 2)) | NV50_CB_AUX);
PUSH_DATA (push, ((NV50_CB_AUX_TEX_MS_OFFSET + 16 * s * 2 * 4) << (8 - 2)) | NV50_CB_AUX);
BEGIN_NI04(push, NV50_3D(CB_DATA(0)), nv50->num_textures[s] * 2);
for (i = 0; i < nv50->num_textures[s]; i++) {
struct nv50_tic_entry *tic = nv50_tic_entry(nv50->textures[s][i]);

@@ -278,7 +278,7 @@ nv50_miptree_transfer_map(struct pipe_context *pctx,
if (util_format_is_plain(res->format)) {
tx->nblocksx = box->width << mt->ms_x;
tx->nblocksy = box->height << mt->ms_x;
tx->nblocksy = box->height << mt->ms_y;
} else {
tx->nblocksx = util_format_get_nblocksx(res->format, box->width);
tx->nblocksy = util_format_get_nblocksy(res->format, box->height);
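Review note: the one-character change in this hunk matters whenever the two sample shifts differ. A minimal sketch of the per-axis scaling, assuming a hypothetical `struct ms_shift` in place of the driver's miptree (the names `ms_shift`, `block_count` and `sample_block_count` are illustrative and not part of Mesa):

```c
#include <assert.h>

/* Hypothetical stand-in for the two per-axis sample shifts of a miptree. */
struct ms_shift {
   unsigned ms_x; /* log2 of the samples per pixel along x */
   unsigned ms_y; /* log2 of the samples per pixel along y */
};

struct block_count {
   unsigned nblocksx;
   unsigned nblocksy;
};

/* Scale a box given in pixels to a box given in samples, one shift per axis.
 * Using ms_x for the height would over- or under-count rows whenever
 * ms_x != ms_y. */
static struct block_count
sample_block_count(struct ms_shift ms, unsigned width, unsigned height)
{
   struct block_count n;

   assert(ms.ms_x < 32 && ms.ms_y < 32);
   n.nblocksx = width << ms.ms_x;
   n.nblocksy = height << ms.ms_y;
   return n;
}
```

For a layout with `ms_x = 1` and `ms_y = 0`, an 8x4 pixel box becomes 16x4 samples; shifting the height by `ms_x` would have produced 16x8 instead.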

@@ -741,16 +741,80 @@ error:
return NULL;
}
#define FIRMWARE_BSP_KERN 0x01
#define FIRMWARE_VP_KERN 0x02
#define FIRMWARE_BSP_H264 0x04
#define FIRMWARE_VP_MPEG2 0x08
#define FIRMWARE_VP_H264_1 0x10
#define FIRMWARE_VP_H264_2 0x20
#define FIRMWARE_PRESENT(val, fw) (val & FIRMWARE_ ## fw)
static int
firmware_present(struct pipe_screen *pscreen, enum pipe_video_format codec)
{
struct nouveau_screen *screen = nouveau_screen(pscreen);
struct nouveau_object *obj = NULL;
struct stat s;
int checked = screen->firmware_info.profiles_checked;
int present, ret;
if (!FIRMWARE_PRESENT(checked, VP_KERN)) {
nouveau_object_new(screen->channel, 0, 0x7476, NULL, 0, &obj);
if (obj)
screen->firmware_info.profiles_present |= FIRMWARE_VP_KERN;
nouveau_object_del(&obj);
screen->firmware_info.profiles_checked |= FIRMWARE_VP_KERN;
}
if (codec == PIPE_VIDEO_FORMAT_MPEG4_AVC) {
if (!FIRMWARE_PRESENT(checked, BSP_KERN)) {
nouveau_object_new(screen->channel, 0, 0x74b0, NULL, 0, &obj);
if (obj)
screen->firmware_info.profiles_present |= FIRMWARE_BSP_KERN;
nouveau_object_del(&obj);
screen->firmware_info.profiles_checked |= FIRMWARE_BSP_KERN;
}
if (!FIRMWARE_PRESENT(checked, VP_H264_1)) {
ret = stat("/lib/firmware/nouveau/nv84_vp-h264-1", &s);
if (!ret && s.st_size > 1000)
screen->firmware_info.profiles_present |= FIRMWARE_VP_H264_1;
screen->firmware_info.profiles_checked |= FIRMWARE_VP_H264_1;
}
/* should probably check the others, but assume that 1 means all */
present = screen->firmware_info.profiles_present;
return FIRMWARE_PRESENT(present, VP_KERN) &&
FIRMWARE_PRESENT(present, BSP_KERN) &&
FIRMWARE_PRESENT(present, VP_H264_1);
} else {
if (!FIRMWARE_PRESENT(checked, VP_MPEG2)) {
ret = stat("/lib/firmware/nouveau/nv84_vp-mpeg12", &s);
if (!ret && s.st_size > 1000)
screen->firmware_info.profiles_present |= FIRMWARE_VP_MPEG2;
screen->firmware_info.profiles_checked |= FIRMWARE_VP_MPEG2;
}
present = screen->firmware_info.profiles_present;
return FIRMWARE_PRESENT(present, VP_KERN) &&
FIRMWARE_PRESENT(present, VP_MPEG2);
}
}
int
nv84_screen_get_video_param(struct pipe_screen *pscreen,
enum pipe_video_profile profile,
enum pipe_video_entrypoint entrypoint,
enum pipe_video_cap param)
{
enum pipe_video_format codec;
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
return u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_MPEG4_AVC ||
u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_MPEG12;
codec = u_reduce_video_profile(profile);
return (codec == PIPE_VIDEO_FORMAT_MPEG4_AVC ||
codec == PIPE_VIDEO_FORMAT_MPEG12) &&
firmware_present(pscreen, codec);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
case PIPE_VIDEO_CAP_MAX_WIDTH:
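Review note: `firmware_present()` above keeps two bitmasks, one recording which probes have already run and one recording which of them succeeded, so each probe is paid for at most once per screen. A reduced sketch of that pattern follows. The `fw_cache` struct, the `FW_A`/`FW_B` bits and `fw_probe_file()` are hypothetical names for illustration, and only the `stat()`-based file probe is modelled, not the kernel object probe.

```c
#include <assert.h>
#include <sys/stat.h>

/* Hypothetical firmware bits, one per probe. */
#define FW_A 0x01
#define FW_B 0x02

struct fw_cache {
   int checked; /* bits for probes that have already run */
   int present; /* bits for probes that found their firmware */
};

/* Probe a firmware file at most once and remember the outcome.
 * The file counts as present if it exists and is larger than min_size. */
static int
fw_probe_file(struct fw_cache *c, int bit, const char *path, long min_size)
{
   if (!(c->checked & bit)) {
      struct stat s;

      if (stat(path, &s) == 0 && s.st_size > min_size)
         c->present |= bit;
      c->checked |= bit;
   }
   return (c->present & bit) != 0;
}
```

A second call with the same bit returns the cached answer without touching the filesystem, which is why the result does not change if the firmware is installed while the process is running.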

@@ -64,7 +64,7 @@ nvc0_shader_output_address(unsigned sn, unsigned si, unsigned ubase)
switch (sn) {
case NV50_SEMANTIC_TESSFACTOR: return 0x000 + si * 0x4;
case TGSI_SEMANTIC_PRIMID: return 0x060;
case NV50_SEMANTIC_LAYER: return 0x064;
case TGSI_SEMANTIC_LAYER: return 0x064;
case NV50_SEMANTIC_VIEWPORTINDEX: return 0x068;
case TGSI_SEMANTIC_PSIZE: return 0x06c;
case TGSI_SEMANTIC_POSITION: return 0x070;

@@ -127,6 +127,9 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 128;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 1024;
case PIPE_CAP_BLEND_EQUATION_SEPARATE:
case PIPE_CAP_INDEP_BLEND_ENABLE:
case PIPE_CAP_INDEP_BLEND_FUNC:
@@ -331,7 +334,14 @@ nvc0_screen_destroy(struct pipe_screen *pscreen)
struct nvc0_screen *screen = nvc0_screen(pscreen);
if (screen->base.fence.current) {
nouveau_fence_wait(screen->base.fence.current);
struct nouveau_fence *current = NULL;
/* nouveau_fence_wait will create a new current fence, so wait on the
* _current_ one, and remove both.
*/
nouveau_fence_ref(screen->base.fence.current, &current);
nouveau_fence_wait(current);
nouveau_fence_ref(NULL, &current);
nouveau_fence_ref(NULL, &screen->base.fence.current);
}
if (screen->base.pushbuf)

@@ -190,7 +190,7 @@ nvc0_gmtyprog_validate(struct nvc0_context *nvc0)
/* we allow GPs with no code for specifying stream output state only */
if (gp && gp->code_size) {
const boolean gp_selects_layer = gp->hdr[13] & (1 << 9);
const boolean gp_selects_layer = !!(gp->hdr[13] & (1 << 9));
BEGIN_NVC0(push, NVC0_3D(MACRO_GP_SELECT), 1);
PUSH_DATA (push, 0x41);
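Review note: the `!!` added in this hunk is needed if `boolean` is narrower than the header word, because bit 9 does not survive conversion to an 8-bit type. The sketch below assumes an 8-bit boolean typedef; `bool8`, `flag_truncated` and `flag_normalized` are hypothetical names used only to show the difference.

```c
#include <assert.h>

/* Assumption: an 8-bit boolean type, similar to a typedef of unsigned char. */
typedef unsigned char bool8;

/* Bit 9 is 0x200; converting it to an 8-bit type keeps only the low
 * 8 bits, so the result is always 0. */
static bool8
flag_truncated(unsigned hdr)
{
   return hdr & (1u << 9);
}

/* Double negation collapses any non-zero value to 1 before the conversion. */
static bool8
flag_normalized(unsigned hdr)
{
   return !!(hdr & (1u << 9));
}
```

With bit 9 set, `flag_truncated()` yields 0 while `flag_normalized()` yields 1, which matches the symptom the change addresses: a layer-selecting shader being treated as if it did not select a layer.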

@@ -130,7 +130,7 @@ static boolean r300_cbzb_clear_allowed(struct r300_context *r300,
(struct pipe_framebuffer_state*)r300->fb_state.state;
/* Only color clear allowed, and only one colorbuffer. */
if ((clear_buffers & ~PIPE_CLEAR_COLOR) != 0 || fb->nr_cbufs != 1)
if ((clear_buffers & ~PIPE_CLEAR_COLOR) != 0 || fb->nr_cbufs != 1 || !fb->cbufs[0])
return FALSE;
return r300_surface(fb->cbufs[0])->cbzb_allowed;
@@ -313,7 +313,7 @@ static void r300_clear(struct pipe_context* pipe,
/* Use fast color clear for an AA colorbuffer.
* The CMASK is shared between all colorbuffers, so we use it
* if there is only one colorbuffer bound. */
if ((buffers & PIPE_CLEAR_COLOR) && fb->nr_cbufs == 1 &&
if ((buffers & PIPE_CLEAR_COLOR) && fb->nr_cbufs == 1 && fb->cbufs[0] &&
r300_resource(fb->cbufs[0]->texture)->tex.cmask_dwords) {
/* Try to obtain the access to the CMASK if we don't have one. */
if (!r300->cmask_access) {

@@ -688,6 +688,20 @@ static INLINE void r300_mark_atom_dirty(struct r300_context *r300,
}
}
static INLINE struct pipe_surface *
r300_get_nonnull_cb(struct pipe_framebuffer_state *fb, unsigned i)
{
if (fb->cbufs[i])
return fb->cbufs[i];
/* The i-th framebuffer is NULL, return any non-NULL one. */
for (i = 0; i < fb->nr_cbufs; i++)
if (fb->cbufs[i])
return fb->cbufs[i];
return NULL;
}
struct pipe_context* r300_create_context(struct pipe_screen* screen,
void *priv);
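Review note: `r300_get_nonnull_cb()` above returns the requested colorbuffer if it is bound and otherwise falls back to the first bound one. A self-contained sketch of the same lookup, using hypothetical `surface` and `framebuffer` structs in place of the Gallium types; unlike the helper in the diff, this version also guards against an out-of-range index, which is an addition for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CBUFS 8 /* hypothetical limit for this sketch */

struct surface {
   int format;
};

struct framebuffer {
   unsigned nr_cbufs;
   struct surface *cbufs[MAX_CBUFS];
};

/* Return slot i if it is bound, else the first bound slot, else NULL. */
static struct surface *
get_nonnull_cb(const struct framebuffer *fb, unsigned i)
{
   unsigned j;

   assert(fb->nr_cbufs <= MAX_CBUFS);
   if (i < fb->nr_cbufs && fb->cbufs[i])
      return fb->cbufs[i];

   for (j = 0; j < fb->nr_cbufs; j++)
      if (fb->cbufs[j])
         return fb->cbufs[j];

   return NULL;
}
```

Callers still have to handle a NULL result, since a framebuffer with only unbound slots has no surface to fall back to.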

@@ -42,15 +42,18 @@ void r300_emit_blend_state(struct r300_context* r300,
struct r300_blend_state* blend = (struct r300_blend_state*)state;
struct pipe_framebuffer_state* fb =
(struct pipe_framebuffer_state*)r300->fb_state.state;
struct pipe_surface *cb;
CS_LOCALS(r300);
if (fb->nr_cbufs) {
if (fb->cbufs[0]->format == PIPE_FORMAT_R16G16B16A16_FLOAT) {
cb = fb->nr_cbufs ? r300_get_nonnull_cb(fb, 0) : NULL;
if (cb) {
if (cb->format == PIPE_FORMAT_R16G16B16A16_FLOAT) {
WRITE_CS_TABLE(blend->cb_noclamp, size);
} else if (fb->cbufs[0]->format == PIPE_FORMAT_R16G16B16X16_FLOAT) {
} else if (cb->format == PIPE_FORMAT_R16G16B16X16_FLOAT) {
WRITE_CS_TABLE(blend->cb_noclamp_noalpha, size);
} else {
unsigned swz = r300_surface(fb->cbufs[0])->colormask_swizzle;
unsigned swz = r300_surface(cb)->colormask_swizzle;
WRITE_CS_TABLE(blend->cb_clamp[swz], size);
}
} else {
@@ -88,9 +91,11 @@ void r300_emit_dsa_state(struct r300_context* r300, unsigned size, void* state)
/* Choose the alpha ref value between 8-bit (FG_ALPHA_FUNC.AM_VAL) and
* 16-bit (FG_ALPHA_VALUE). */
if (is_r500 && (alpha_func & R300_FG_ALPHA_FUNC_ENABLE)) {
if (fb->nr_cbufs &&
(fb->cbufs[0]->format == PIPE_FORMAT_R16G16B16A16_FLOAT ||
fb->cbufs[0]->format == PIPE_FORMAT_R16G16B16X16_FLOAT)) {
struct pipe_surface *cb = fb->nr_cbufs ? r300_get_nonnull_cb(fb, 0) : NULL;
if (cb &&
(cb->format == PIPE_FORMAT_R16G16B16A16_FLOAT ||
cb->format == PIPE_FORMAT_R16G16B16X16_FLOAT)) {
alpha_func |= R500_FG_ALPHA_FUNC_FP16_ENABLE;
} else {
alpha_func |= R500_FG_ALPHA_FUNC_8BIT;
@@ -419,7 +424,7 @@ void r300_emit_fb_state(struct r300_context* r300, unsigned size, void* state)
/* Set up colorbuffers. */
for (i = 0; i < fb->nr_cbufs; i++) {
surf = r300_surface(fb->cbufs[i]);
surf = r300_surface(r300_get_nonnull_cb(fb, i));
OUT_CS_REG(R300_RB3D_COLOROFFSET0 + (4 * i), surf->offset);
OUT_CS_RELOC(surf);
@@ -600,7 +605,7 @@ void r300_emit_fb_state_pipelined(struct r300_context *r300,
* (must be written after unpipelined regs) */
OUT_CS_REG_SEQ(R300_US_OUT_FMT_0, 4);
for (i = 0; i < num_cbufs; i++) {
OUT_CS(r300_surface(fb->cbufs[i])->format);
OUT_CS(r300_surface(r300_get_nonnull_cb(fb, i))->format);
}
for (; i < 1; i++) {
OUT_CS(R300_US_OUT_FMT_C4_8 |
@@ -1310,6 +1315,8 @@ validate:
if (r300->fb_state.dirty) {
/* Color buffers... */
for (i = 0; i < fb->nr_cbufs; i++) {
if (!fb->cbufs[i])
continue;
tex = r300_resource(fb->cbufs[i]->texture);
assert(tex && tex->buf && "cbuf is marked, but NULL!");
r300->rws->cs_add_reloc(r300->cs, tex->cs_buf,

@@ -151,6 +151,8 @@ static int r300_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:

@@ -579,16 +579,17 @@ static void r300_set_blend_color(struct pipe_context* pipe,
struct r300_blend_color_state *state =
(struct r300_blend_color_state*)r300->blend_color_state.state;
struct pipe_blend_color c;
enum pipe_format format = fb->nr_cbufs ? fb->cbufs[0]->format : 0;
struct pipe_surface *cb;
float tmp;
CB_LOCALS;
state->state = *color; /* Save it, so that we can reuse it in set_fb_state */
c = *color;
cb = fb->nr_cbufs ? r300_get_nonnull_cb(fb, 0) : NULL;
/* The blend color is dependent on the colorbuffer format. */
if (fb->nr_cbufs) {
switch (format) {
if (cb) {
switch (cb->format) {
case PIPE_FORMAT_R8_UNORM:
case PIPE_FORMAT_L8_UNORM:
case PIPE_FORMAT_I8_UNORM:
@@ -623,7 +624,7 @@ static void r300_set_blend_color(struct pipe_context* pipe,
BEGIN_CB(state->cb, 3);
OUT_CB_REG_SEQ(R500_RB3D_CONSTANT_COLOR_AR, 2);
switch (format) {
switch (cb ? cb->format : 0) {
case PIPE_FORMAT_R16G16B16A16_FLOAT:
case PIPE_FORMAT_R16G16B16X16_FLOAT:
OUT_CB(util_float_to_half(c.color[2]) |
@@ -858,6 +859,9 @@ static void r300_fb_set_tiling_flags(struct r300_context *r300,
/* Set tiling flags for new surfaces. */
for (i = 0; i < state->nr_cbufs; i++) {
if (!state->cbufs[i])
continue;
r300_tex_set_tiling_flags(r300,
r300_resource(state->cbufs[i]->texture),
state->cbufs[i]->u.tex.level);
@@ -950,7 +954,8 @@ static unsigned r300_get_num_samples(struct r300_context *r300)
num_samples = 6;
for (i = 0; i < fb->nr_cbufs; i++)
num_samples = MIN2(num_samples, fb->cbufs[i]->texture->nr_samples);
if (fb->cbufs[i])
num_samples = MIN2(num_samples, fb->cbufs[i]->texture->nr_samples);
if (fb->zsbuf)
num_samples = MIN2(num_samples, fb->zsbuf->texture->nr_samples);
@@ -967,7 +972,7 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
{
struct r300_context* r300 = r300_context(pipe);
struct r300_aa_state *aa = (struct r300_aa_state*)r300->aa_state.state;
struct pipe_framebuffer_state *old_state = r300->fb_state.state;
struct pipe_framebuffer_state *current_state = r300->fb_state.state;
unsigned max_width, max_height, i;
uint32_t zbuffer_bpp = 0;
boolean unlock_zbuffer = FALSE;
@@ -986,17 +991,17 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
return;
}
if (old_state->zsbuf && r300->zmask_in_use && !r300->locked_zbuffer) {
if (current_state->zsbuf && r300->zmask_in_use && !r300->locked_zbuffer) {
/* There is a zmask in use, what are we gonna do? */
if (state->zsbuf) {
if (!pipe_surface_equal(old_state->zsbuf, state->zsbuf)) {
if (!pipe_surface_equal(current_state->zsbuf, state->zsbuf)) {
/* Decompress the currently bound zbuffer before we bind another one. */
r300_decompress_zmask(r300);
r300->hiz_in_use = FALSE;
}
} else {
/* We don't bind another zbuffer, so lock the current one. */
pipe_surface_reference(&r300->locked_zbuffer, old_state->zsbuf);
pipe_surface_reference(&r300->locked_zbuffer, current_state->zsbuf);
}
} else if (r300->locked_zbuffer) {
/* We have a locked zbuffer now, what are we gonna do? */
@@ -1014,9 +1019,20 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
}
assert(state->zsbuf || (r300->locked_zbuffer && !unlock_zbuffer) || !r300->zmask_in_use);
/* If zsbuf is set from NULL to non-NULL or vice versa.. */
if (!!current_state->zsbuf != !!state->zsbuf) {
r300_mark_atom_dirty(r300, &r300->dsa_state);
}
util_copy_framebuffer_state(r300->fb_state.state, state);
/* Remove trailing NULL colorbuffers. */
while (current_state->nr_cbufs && !current_state->cbufs[current_state->nr_cbufs-1])
current_state->nr_cbufs--;
/* Set whether CMASK can be used. */
r300->cmask_in_use =
state->nr_cbufs == 1 &&
state->nr_cbufs == 1 && state->cbufs[0] &&
r300->screen->cmask_resource == state->cbufs[0]->texture;
/* Need to reset clamping or colormask. */
@@ -1025,11 +1041,6 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
/* Re-swizzle the blend color. */
r300_set_blend_color(pipe, &((struct r300_blend_color_state*)r300->blend_color_state.state)->state);
/* If zsbuf is set from NULL to non-NULL or vice versa.. */
if (!!old_state->zsbuf != !!state->zsbuf) {
r300_mark_atom_dirty(r300, &r300->dsa_state);
}
if (r300->screen->info.drm_minor < 12) {
/* The tiling flags are dependent on the surface miplevel, unfortunately.
* This workarounds a bad design decision in old kernels which were
@@ -1037,8 +1048,6 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
r300_fb_set_tiling_flags(r300, state);
}
util_copy_framebuffer_state(r300->fb_state.state, state);
if (unlock_zbuffer) {
pipe_surface_reference(&r300->locked_zbuffer, NULL);
}
@@ -1089,7 +1098,8 @@ r300_set_framebuffer_state(struct pipe_context* pipe,
if (DBG_ON(r300, DBG_FB)) {
fprintf(stderr, "r300: set_framebuffer_state:\n");
for (i = 0; i < state->nr_cbufs; i++) {
r300_print_fb_surf_info(state->cbufs[i], i, "CB");
if (state->cbufs[i])
r300_print_fb_surf_info(state->cbufs[i], i, "CB");
}
if (state->zsbuf) {
r300_print_fb_surf_info(state->zsbuf, 0, "ZB");
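Review note: the loop in `r300_set_framebuffer_state()` that removes trailing NULL colorbuffers only shrinks the count from the end, so NULL entries in the middle of the array remain and still need the per-slot checks added elsewhere in this diff. A minimal sketch of that trimming step, with the hypothetical helper name `trim_trailing_null`:

```c
#include <stddef.h>

/* Return the count with trailing NULL slots dropped.
 * Leading and interior NULL slots are kept, since removing them would
 * renumber the slots that follow. */
static unsigned
trim_trailing_null(void *const *slots, unsigned count)
{
   while (count && !slots[count - 1])
      count--;
   return count;
}
```

For the slot pattern bound, NULL, bound, NULL the trimmed count is 3, and an array with no bound slots trims to 0.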

@@ -79,45 +79,49 @@ int eg_bytecode_cf_build(struct r600_bytecode *bc, struct r600_bytecode_cf *cf)
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD0_RW_GPR(cf->output.gpr) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ELEM_SIZE(cf->output.elem_size) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ARRAY_BASE(cf->output.array_base) |
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type);
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type) |
S_SQ_CF_ALLOC_EXPORT_WORD0_INDEX_GPR(cf->output.index_gpr);
bc->bytecode[id] =
S_SQ_CF_ALLOC_EXPORT_WORD1_BURST_COUNT(cf->output.burst_count - 1) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_X(cf->output.swizzle_x) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_Y(cf->output.swizzle_y) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_Z(cf->output.swizzle_z) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_W(cf->output.swizzle_w) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->output.barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_CF_INST(opcode);
if (bc->chip_class == EVERGREEN) /* no EOP on cayman */
bc->bytecode[id] |= S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->output.end_of_program);
bc->bytecode[id] |= S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->end_of_program);
id++;
} else if (cfop->flags & CF_STRM) {
/* MEM_STREAM instructions */
} else if (cfop->flags & CF_MEM) {
/* MEM_STREAM, MEM_RING instructions */
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD0_RW_GPR(cf->output.gpr) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ELEM_SIZE(cf->output.elem_size) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ARRAY_BASE(cf->output.array_base) |
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type);
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type) |
S_SQ_CF_ALLOC_EXPORT_WORD0_INDEX_GPR(cf->output.index_gpr);
bc->bytecode[id] = S_SQ_CF_ALLOC_EXPORT_WORD1_BURST_COUNT(cf->output.burst_count - 1) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->output.barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_CF_INST(opcode) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BUF_COMP_MASK(cf->output.comp_mask) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BUF_ARRAY_SIZE(cf->output.array_size);
if (bc->chip_class == EVERGREEN) /* no EOP on cayman */
bc->bytecode[id] |= S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->output.end_of_program);
bc->bytecode[id] |= S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->end_of_program);
id++;
} else {
/* branch, loop, call, return instructions */
/* other instructions */
bc->bytecode[id++] = S_SQ_CF_WORD0_ADDR(cf->cf_addr >> 1);
bc->bytecode[id++] = S_SQ_CF_WORD1_CF_INST(opcode)|
S_SQ_CF_WORD1_BARRIER(1) |
S_SQ_CF_WORD1_COND(cf->cond) |
S_SQ_CF_WORD1_POP_COUNT(cf->pop_count);
S_SQ_CF_WORD1_POP_COUNT(cf->pop_count) |
S_SQ_CF_WORD1_END_OF_PROGRAM(cf->end_of_program);
}
}
return 0;
}
#if 0
void eg_bytecode_export_read(struct r600_bytecode *bc,
struct r600_bytecode_output *output, uint32_t word0, uint32_t word1)
{
@@ -138,3 +142,4 @@ void eg_bytecode_export_read(struct r600_bytecode *bc,
output->array_size = G_SQ_CF_ALLOC_EXPORT_WORD1_BUF_ARRAY_SIZE(word1);
output->comp_mask = G_SQ_CF_ALLOC_EXPORT_WORD1_BUF_COMP_MASK(word1);
}
#endif

@@ -927,7 +927,8 @@ static void *evergreen_create_rs_state(struct pipe_context *ctx,
S_028810_PS_UCP_MODE(3) |
S_028810_ZCLIP_NEAR_DISABLE(!state->depth_clip) |
S_028810_ZCLIP_FAR_DISABLE(!state->depth_clip) |
S_028810_DX_LINEAR_ATTR_CLIP_ENA(1);
S_028810_DX_LINEAR_ATTR_CLIP_ENA(1) |
S_028810_DX_RASTERIZATION_KILL(state->rasterizer_discard);
rs->multisample_enable = state->multisample;
/* offset */
@@ -996,7 +997,6 @@ static void *evergreen_create_rs_state(struct pipe_context *ctx,
state->fill_back != PIPE_POLYGON_MODE_FILL) |
S_028814_POLYMODE_FRONT_PTYPE(r600_translate_fill(state->fill_front)) |
S_028814_POLYMODE_BACK_PTYPE(r600_translate_fill(state->fill_back)));
r600_store_context_reg(&rs->buffer, R_028350_SX_MISC, S_028350_MULTIPASS(state->rasterizer_discard));
return rs;
}
@@ -1097,7 +1097,8 @@ struct pipe_sampler_view *
evergreen_create_sampler_view_custom(struct pipe_context *ctx,
struct pipe_resource *texture,
const struct pipe_sampler_view *state,
unsigned width0, unsigned height0)
unsigned width0, unsigned height0,
unsigned force_level)
{
struct r600_screen *rscreen = (struct r600_screen*)ctx->screen;
struct r600_pipe_sampler_view *view = CALLOC_STRUCT(r600_pipe_sampler_view);
@@ -1109,6 +1110,8 @@ evergreen_create_sampler_view_custom(struct pipe_context *ctx,
unsigned macro_aspect, tile_split, bankh, bankw, nbanks, fmask_bankh;
enum pipe_format pipe_format = state->format;
struct radeon_surface_level *surflevel;
unsigned base_level, first_level, last_level;
uint64_t va;
if (view == NULL)
return NULL;
@@ -1165,13 +1168,26 @@ evergreen_create_sampler_view_custom(struct pipe_context *ctx,
endian = r600_colorformat_endian_swap(format);
base_level = 0;
first_level = state->u.tex.first_level;
last_level = state->u.tex.last_level;
width = width0;
height = height0;
depth = texture->depth0;
pitch = surflevel[0].nblk_x * util_format_get_blockwidth(pipe_format);
if (force_level) {
base_level = force_level;
first_level = 0;
last_level = 0;
width = u_minify(width, force_level);
height = u_minify(height, force_level);
depth = u_minify(depth, force_level);
}
pitch = surflevel[base_level].nblk_x * util_format_get_blockwidth(pipe_format);
non_disp_tiling = tmp->non_disp_tiling;
switch (surflevel[0].mode) {
switch (surflevel[base_level].mode) {
case RADEON_SURF_MODE_LINEAR_ALIGNED:
array_mode = V_028C70_ARRAY_LINEAR_ALIGNED;
break;
@@ -1210,6 +1226,8 @@ evergreen_create_sampler_view_custom(struct pipe_context *ctx,
} else if (texture->target == PIPE_TEXTURE_CUBE_ARRAY)
depth = texture->array_size / 6;
va = r600_resource_va(ctx->screen, texture);
view->tex_resource = &tmp->resource;
view->tex_resource_words[0] = (S_030000_DIM(r600_tex_dim(texture->target, texture->nr_samples)) |
S_030000_PITCH((pitch / 8) - 1) |
@@ -1221,7 +1239,7 @@ evergreen_create_sampler_view_custom(struct pipe_context *ctx,
view->tex_resource_words[1] = (S_030004_TEX_HEIGHT(height - 1) |
S_030004_TEX_DEPTH(depth - 1) |
S_030004_ARRAY_MODE(array_mode));
view->tex_resource_words[2] = (surflevel[0].offset + r600_resource_va(ctx->screen, texture)) >> 8;
view->tex_resource_words[2] = (surflevel[base_level].offset + va) >> 8;
/* TEX_RESOURCE_WORD3.MIP_ADDRESS */
if (texture->nr_samples > 1 && rscreen->has_compressed_msaa_texturing) {
@@ -1231,12 +1249,12 @@ evergreen_create_sampler_view_custom(struct pipe_context *ctx,
view->skip_mip_address_reloc = true;
} else {
/* FMASK should be in MIP_ADDRESS for multisample textures */
view->tex_resource_words[3] = (tmp->fmask.offset + r600_resource_va(ctx->screen, texture)) >> 8;
view->tex_resource_words[3] = (tmp->fmask.offset + va) >> 8;
}
} else if (state->u.tex.last_level && texture->nr_samples <= 1) {
view->tex_resource_words[3] = (surflevel[1].offset + r600_resource_va(ctx->screen, texture)) >> 8;
} else if (last_level && texture->nr_samples <= 1) {
view->tex_resource_words[3] = (surflevel[1].offset + va) >> 8;
} else {
view->tex_resource_words[3] = (surflevel[0].offset + r600_resource_va(ctx->screen, texture)) >> 8;
view->tex_resource_words[3] = (surflevel[base_level].offset + va) >> 8;
}
view->tex_resource_words[4] = (word4 |
@@ -1255,8 +1273,8 @@ evergreen_create_sampler_view_custom(struct pipe_context *ctx,
view->tex_resource_words[5] |= S_030014_LAST_LEVEL(log_samples);
view->tex_resource_words[6] |= S_030018_FMASK_BANK_HEIGHT(fmask_bankh);
} else {
view->tex_resource_words[4] |= S_030010_BASE_LEVEL(state->u.tex.first_level);
view->tex_resource_words[5] |= S_030014_LAST_LEVEL(state->u.tex.last_level);
view->tex_resource_words[4] |= S_030010_BASE_LEVEL(first_level);
view->tex_resource_words[5] |= S_030014_LAST_LEVEL(last_level);
/* aniso max 16 samples */
view->tex_resource_words[6] |= S_030018_MAX_ANISO(4);
}
@@ -1277,7 +1295,7 @@ evergreen_create_sampler_view(struct pipe_context *ctx,
const struct pipe_sampler_view *state)
{
return evergreen_create_sampler_view_custom(ctx, tex, state,
tex->width0, tex->height0);
tex->width0, tex->height0, 0);
}
static void evergreen_emit_clip_state(struct r600_context *rctx, struct r600_atom *atom)
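Review note: when `force_level` is non-zero, the sampler view code above re-bases the view on a single mip level and shrinks width, height and depth with `u_minify()`. The sketch below reproduces that size calculation with a local `minify()` helper that halves a dimension per level and clamps at 1. The `level_dims` struct and `dims_at_level()` are hypothetical names; the real code also switches the surface level used for pitch and offset, which is not modelled here.

```c
#include <assert.h>

/* Size of one dimension after descending `levels` mip levels, never below 1. */
static unsigned
minify(unsigned value, unsigned levels)
{
   unsigned v;

   assert(levels < 32);
   v = value >> levels;
   return v ? v : 1;
}

struct level_dims {
   unsigned width;
   unsigned height;
   unsigned depth;
};

/* Dimensions of a texture at a given mip level, from the level-0 size. */
static struct level_dims
dims_at_level(unsigned width0, unsigned height0, unsigned depth0,
              unsigned level)
{
   struct level_dims d;

   d.width = minify(width0, level);
   d.height = minify(height0, level);
   d.depth = minify(depth0, level);
   return d;
}
```

A 256x64x1 texture viewed at level 3 comes out as 32x8x1; the depth stays at 1 because of the clamp.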
@@ -1407,7 +1425,7 @@ void evergreen_init_color_surface(struct r600_context *rctx,
struct pipe_resource *pipe_tex = surf->base.texture;
unsigned level = surf->base.u.tex.level;
unsigned pitch, slice;
unsigned color_info, color_attrib, color_dim = 0;
unsigned color_info, color_attrib, color_dim = 0, color_view;
unsigned format, swap, ntype, endian;
uint64_t offset, base_offset;
unsigned non_disp_tiling, macro_aspect, tile_split, bankh, bankw, fmask_bankh, nbanks;
@@ -1416,10 +1434,15 @@ void evergreen_init_color_surface(struct r600_context *rctx,
bool blend_clamp = 0, blend_bypass = 0;
offset = rtex->surface.level[level].offset;
if (rtex->surface.level[level].mode < RADEON_SURF_MODE_1D) {
if (rtex->surface.level[level].mode == RADEON_SURF_MODE_LINEAR) {
assert(surf->base.u.tex.first_layer == surf->base.u.tex.last_layer);
offset += rtex->surface.level[level].slice_size *
surf->base.u.tex.first_layer;
}
color_view = 0;
} else
color_view = S_028C6C_SLICE_START(surf->base.u.tex.first_layer) |
S_028C6C_SLICE_MAX(surf->base.u.tex.last_layer);
pitch = (rtex->surface.level[level].nblk_x) / 8 - 1;
slice = (rtex->surface.level[level].nblk_x * rtex->surface.level[level].nblk_y) / 64;
if (slice) {
@@ -1569,12 +1592,7 @@ void evergreen_init_color_surface(struct r600_context *rctx,
surf->cb_color_info = color_info;
surf->cb_color_pitch = S_028C64_PITCH_TILE_MAX(pitch);
surf->cb_color_slice = S_028C68_SLICE_TILE_MAX(slice);
if (rtex->surface.level[level].mode < RADEON_SURF_MODE_1D) {
surf->cb_color_view = 0;
} else {
surf->cb_color_view = S_028C6C_SLICE_START(surf->base.u.tex.first_layer) |
S_028C6C_SLICE_MAX(surf->base.u.tex.last_layer);
}
surf->cb_color_view = color_view;
surf->cb_color_attrib = color_attrib;
if (rtex->fmask.size) {
surf->cb_color_fmask = (base_offset + rtex->fmask.offset) >> 8;
@@ -1824,12 +1842,14 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
}
log_samples = util_logbase2(rctx->framebuffer.nr_samples);
if (rctx->b.chip_class == CAYMAN && rctx->db_misc_state.log_samples != log_samples) {
/* This is for Cayman to program SAMPLE_RATE, and for RV770 to fix a hw bug. */
if ((rctx->b.chip_class == CAYMAN ||
rctx->b.family == CHIP_RV770) &&
rctx->db_misc_state.log_samples != log_samples) {
rctx->db_misc_state.log_samples = log_samples;
rctx->db_misc_state.atom.dirty = true;
}
evergreen_update_db_shader_control(rctx);
/* Calculate the CS size. */
rctx->framebuffer.atom.num_dw = 4; /* SCISSOR */
@@ -2519,6 +2539,7 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
struct r600_resource *rbuffer;
uint64_t va;
unsigned buffer_index = ffs(dirty_mask) - 1;
unsigned gs_ring_buffer = (buffer_index == R600_GS_RING_CONST_BUFFER);
cb = &state->cb[buffer_index];
rbuffer = (struct r600_resource*)cb->buffer;
@@ -2527,10 +2548,12 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
va = r600_resource_va(&rctx->screen->b.b, &rbuffer->b.b);
va += cb->buffer_offset;
r600_write_context_reg_flag(cs, reg_alu_constbuf_size + buffer_index * 4,
ALIGN_DIVUP(cb->buffer_size >> 4, 16), pkt_flags);
r600_write_context_reg_flag(cs, reg_alu_const_cache + buffer_index * 4, va >> 8,
pkt_flags);
if (!gs_ring_buffer) {
r600_write_context_reg_flag(cs, reg_alu_constbuf_size + buffer_index * 4,
ALIGN_DIVUP(cb->buffer_size >> 4, 16), pkt_flags);
r600_write_context_reg_flag(cs, reg_alu_const_cache + buffer_index * 4, va >> 8,
pkt_flags);
}
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0) | pkt_flags);
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READ));
@@ -2540,10 +2563,12 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
radeon_emit(cs, va); /* RESOURCEi_WORD0 */
radeon_emit(cs, rbuffer->buf->size - cb->buffer_offset - 1); /* RESOURCEi_WORD1 */
radeon_emit(cs, /* RESOURCEi_WORD2 */
S_030008_ENDIAN_SWAP(r600_endian_swap(32)) |
S_030008_STRIDE(16) |
S_030008_BASE_ADDRESS_HI(va >> 32UL));
S_030008_ENDIAN_SWAP(gs_ring_buffer ? ENDIAN_NONE : r600_endian_swap(32)) |
S_030008_STRIDE(gs_ring_buffer ? 4 : 16) |
S_030008_BASE_ADDRESS_HI(va >> 32UL) |
S_030008_DATA_FORMAT(FMT_32_32_32_32_FLOAT));
radeon_emit(cs, /* RESOURCEi_WORD3 */
S_03000C_UNCACHED(gs_ring_buffer ? 1 : 0) |
S_03000C_DST_SEL_X(V_03000C_SQ_SEL_X) |
S_03000C_DST_SEL_Y(V_03000C_SQ_SEL_Y) |
S_03000C_DST_SEL_Z(V_03000C_SQ_SEL_Z) |
@@ -2551,7 +2576,8 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
radeon_emit(cs, 0); /* RESOURCEi_WORD4 */
radeon_emit(cs, 0); /* RESOURCEi_WORD5 */
radeon_emit(cs, 0); /* RESOURCEi_WORD6 */
radeon_emit(cs, 0xc0000000); /* RESOURCEi_WORD7 */
radeon_emit(cs, /* RESOURCEi_WORD7 */
S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER));
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0) | pkt_flags);
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READ));
@@ -2715,6 +2741,77 @@ static void evergreen_emit_vertex_fetch_shader(struct r600_context *rctx, struct
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, shader->buffer, RADEON_USAGE_READ));
}
static void evergreen_emit_shader_stages(struct r600_context *rctx, struct r600_atom *a)
{
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_shader_stages_state *state = (struct r600_shader_stages_state*)a;
uint32_t v = 0, v2 = 0, primid = 0;
if (state->geom_enable) {
uint32_t cut_val;
if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 128)
cut_val = V_028A40_GS_CUT_128;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 256)
cut_val = V_028A40_GS_CUT_256;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 512)
cut_val = V_028A40_GS_CUT_512;
else
cut_val = V_028A40_GS_CUT_1024;
v = S_028B54_ES_EN(V_028B54_ES_STAGE_REAL) |
S_028B54_GS_EN(1) |
S_028B54_VS_EN(V_028B54_VS_STAGE_COPY_SHADER);
v2 = S_028A40_MODE(V_028A40_GS_SCENARIO_G) |
S_028A40_CUT_MODE(cut_val);
if (rctx->gs_shader->current->shader.gs_prim_id_input)
primid = 1;
}
r600_write_context_reg(cs, R_028B54_VGT_SHADER_STAGES_EN, v);
r600_write_context_reg(cs, R_028A40_VGT_GS_MODE, v2);
r600_write_context_reg(cs, R_028A84_VGT_PRIMITIVEID_EN, primid);
}
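The cut-mode selection in `evergreen_emit_shader_stages` rounds the shader's declared maximum output vertex count up to the next supported bucket, then packs that bucket next to the scenario mode. A minimal standalone sketch of that mapping follows. The helper names `gs_cut_mode` and `pack_vgt_gs_mode` are hypothetical and the upper-bound assertion is an assumption of the sketch; the numeric encodings and bit positions are copied from the `V_028A40_*` and `S_028A40_*` definitions in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Encodings copied from the V_028A40_* defines in this diff. */
enum {
    GS_CUT_1024   = 0,
    GS_CUT_512    = 1,
    GS_CUT_256    = 2,
    GS_CUT_128    = 3,
    GS_SCENARIO_G = 3
};

/* Hypothetical helper: pick the smallest bucket that holds max_out_vertices. */
static uint32_t gs_cut_mode(unsigned max_out_vertices)
{
    /* Assumption of this sketch: the count never exceeds the 1024 limit
     * the driver reports for geometry shader output vertices. */
    assert(max_out_vertices <= 1024);

    if (max_out_vertices <= 128)
        return GS_CUT_128;
    if (max_out_vertices <= 256)
        return GS_CUT_256;
    if (max_out_vertices <= 512)
        return GS_CUT_512;
    return GS_CUT_1024;
}

/* Hypothetical helper: MODE sits in bits 0-1, CUT_MODE in bits 3-4,
 * matching S_028A40_MODE and S_028A40_CUT_MODE. */
static uint32_t pack_vgt_gs_mode(uint32_t mode, uint32_t cut_mode)
{
    return ((mode & 0x3) << 0) | ((cut_mode & 0x3) << 3);
}
```

Under these definitions a shader declaring 200 output vertices lands in the 256 bucket, so `pack_vgt_gs_mode(GS_SCENARIO_G, gs_cut_mode(200))` combines mode 3 with cut mode 2.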
static void evergreen_emit_gs_rings(struct r600_context *rctx, struct r600_atom *a)
{
struct pipe_screen *screen = rctx->b.b.screen;
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_gs_rings_state *state = (struct r600_gs_rings_state*)a;
struct r600_resource *rbuffer;
r600_write_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_VGT_FLUSH));
if (state->enable) {
rbuffer =(struct r600_resource*)state->esgs_ring.buffer;
r600_write_config_reg(cs, R_008C40_SQ_ESGS_RING_BASE,
(r600_resource_va(screen, &rbuffer->b.b)) >> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READWRITE));
r600_write_config_reg(cs, R_008C44_SQ_ESGS_RING_SIZE,
state->esgs_ring.buffer_size >> 8);
rbuffer =(struct r600_resource*)state->gsvs_ring.buffer;
r600_write_config_reg(cs, R_008C48_SQ_GSVS_RING_BASE,
(r600_resource_va(screen, &rbuffer->b.b)) >> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READWRITE));
r600_write_config_reg(cs, R_008C4C_SQ_GSVS_RING_SIZE,
state->gsvs_ring.buffer_size >> 8);
} else {
r600_write_config_reg(cs, R_008C44_SQ_ESGS_RING_SIZE, 0);
r600_write_config_reg(cs, R_008C4C_SQ_GSVS_RING_SIZE, 0);
}
r600_write_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_VGT_FLUSH));
}
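Both ring base addresses and ring sizes in `evergreen_emit_gs_rings` are shifted right by 8 before being written, which suggests the registers are expressed in 256-byte units. That reading is an assumption based on the shift alone. The sketch below shows the conversion with a hypothetical helper name and an alignment check that the driver code itself does not perform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a byte address or byte size into the
 * 256-byte units implied by the ">> 8" in evergreen_emit_gs_rings. */
static uint32_t ring_bytes_to_reg(uint64_t bytes)
{
    /* Assumption of this sketch: callers pass 256-byte aligned values;
     * otherwise the low 8 bits would be dropped silently. */
    assert((bytes & 0xFF) == 0);
    return (uint32_t)(bytes >> 8);
}
```

With this helper a 1 MiB ring would be programmed as `0x1000`, and a size of 0 corresponds to the value written in the disabled branch.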
void cayman_init_common_regs(struct r600_command_buffer *cb,
enum chip_class ctx_chip_class,
enum radeon_family ctx_family,
@@ -2733,7 +2830,9 @@ void cayman_init_common_regs(struct r600_command_buffer *cb,
r600_store_context_reg(cb, R_028A4C_PA_SC_MODE_CNTL_1, 0);
r600_store_context_reg(cb, R_028354_SX_SURFACE_SYNC, S_028354_SURFACE_SYNC_MASK(0xf));
r600_store_context_reg_seq(cb, R_028350_SX_MISC, 2);
r600_store_value(cb, 0);
r600_store_value(cb, S_028354_SURFACE_SYNC_MASK(0xf));
r600_store_context_reg(cb, R_028800_DB_DEPTH_CONTROL, 0);
}
@@ -2905,6 +3004,7 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx)
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0, 0x01000FFF);
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0 + (32 * 4), 0x01000FFF);
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0 + (64 * 4), 0x01000FFF);
}
void evergreen_init_common_regs(struct r600_command_buffer *cb,
@@ -3008,7 +3108,9 @@ void evergreen_init_common_regs(struct r600_command_buffer *cb,
/* The cs checker requires this register to be set. */
r600_store_context_reg(cb, R_028800_DB_DEPTH_CONTROL, 0);
r600_store_context_reg(cb, R_028354_SX_SURFACE_SYNC, S_028354_SURFACE_SYNC_MASK(0xf));
r600_store_context_reg_seq(cb, R_028350_SX_MISC, 2);
r600_store_value(cb, 0);
r600_store_value(cb, S_028354_SURFACE_SYNC_MASK(0xf));
return;
}
@@ -3363,6 +3465,7 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx)
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0, 0x01000FFF);
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0 + (32 * 4), 0x01000FFF);
eg_store_loop_const(cb, R_03A200_SQ_LOOP_CONST_0 + (64 * 4), 0x01000FFF);
}
void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
@@ -3510,6 +3613,78 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader
shader->flatshade = rctx->rasterizer->flatshade;
}
void evergreen_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_command_buffer *cb = &shader->command_buffer;
struct r600_shader *rshader = &shader->shader;
r600_init_command_buffer(cb, 32);
r600_store_context_reg(cb, R_028890_SQ_PGM_RESOURCES_ES,
S_028890_NUM_GPRS(rshader->bc.ngpr) |
S_028890_STACK_SIZE(rshader->bc.nstack));
r600_store_context_reg(cb, R_02888C_SQ_PGM_START_ES,
r600_resource_va(ctx->screen, (void *)shader->bo) >> 8);
/* After that, the NOP relocation packet must be emitted (shader->bo, RADEON_USAGE_READ). */
}
void evergreen_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_command_buffer *cb = &shader->command_buffer;
struct r600_shader *rshader = &shader->shader;
struct r600_shader *cp_shader = &shader->gs_copy_shader->shader;
unsigned gsvs_itemsize =
(cp_shader->ring_item_size * rshader->gs_max_out_vertices) >> 2;
r600_init_command_buffer(cb, 64);
/* VGT_GS_MODE is written by evergreen_emit_shader_stages */
r600_store_context_reg(cb, R_028AB8_VGT_VTX_CNT_EN, 1);
r600_store_context_reg(cb, R_028B38_VGT_GS_MAX_VERT_OUT,
S_028B38_MAX_VERT_OUT(rshader->gs_max_out_vertices));
r600_store_context_reg(cb, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(rshader->gs_output_prim));
if (rctx->screen->b.info.drm_minor >= 35) {
r600_store_context_reg(cb, R_028B90_VGT_GS_INSTANCE_CNT,
S_028B90_CNT(0) |
S_028B90_ENABLE(0));
}
r600_store_context_reg_seq(cb, R_02891C_SQ_GS_VERT_ITEMSIZE, 4);
r600_store_value(cb, cp_shader->ring_item_size >> 2);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_context_reg(cb, R_028900_SQ_ESGS_RING_ITEMSIZE,
(rshader->ring_item_size) >> 2);
r600_store_context_reg(cb, R_028904_SQ_GSVS_RING_ITEMSIZE,
gsvs_itemsize);
r600_store_context_reg_seq(cb, R_02892C_SQ_GSVS_RING_OFFSET_1, 3);
r600_store_value(cb, gsvs_itemsize);
r600_store_value(cb, gsvs_itemsize);
r600_store_value(cb, gsvs_itemsize);
/* FIXME calculate these values somehow ??? */
r600_store_context_reg_seq(cb, R_028A54_GS_PER_ES, 3);
r600_store_value(cb, 0x80); /* GS_PER_ES */
r600_store_value(cb, 0x100); /* ES_PER_GS */
r600_store_value(cb, 0x2); /* GS_PER_VS */
r600_store_context_reg(cb, R_028878_SQ_PGM_RESOURCES_GS,
S_028878_NUM_GPRS(rshader->bc.ngpr) |
S_028878_STACK_SIZE(rshader->bc.nstack));
r600_store_context_reg(cb, R_028874_SQ_PGM_START_GS,
r600_resource_va(ctx->screen, (void *)shader->bo) >> 8);
/* After that, the NOP relocation packet must be emitted (shader->bo, RADEON_USAGE_READ). */
}
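The GSVS item size in `evergreen_update_gs_state` is the copy shader's per-vertex ring item size multiplied by the maximum number of output vertices, then shifted right by 2. A plausible reading, stated here as an assumption, is that `ring_item_size` is in bytes and the registers take dwords. A small sketch of the arithmetic with a hypothetical helper name:

```c
#include <assert.h>

/* Hypothetical helper mirroring the gsvs_itemsize expression in the diff.
 * Assumption: ring_item_size_bytes is a byte count and the result is in
 * 32-bit dwords. */
static unsigned gsvs_itemsize_dw(unsigned ring_item_size_bytes,
                                 unsigned gs_max_out_vertices)
{
    /* A size that is not a whole number of dwords would be truncated. */
    assert(ring_item_size_bytes % 4 == 0);
    return (ring_item_size_bytes * gs_max_out_vertices) >> 2;
}
```

For example, one vec4 per vertex (16 bytes) and 4 output vertices gives 64 bytes, that is 16 dwords per primitive.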
void evergreen_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_command_buffer *cb = &shader->command_buffer;
@@ -3552,7 +3727,9 @@ void evergreen_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader
S_02881C_VS_OUT_CCDIST0_VEC_ENA((rshader->clip_dist_write & 0x0F) != 0) |
S_02881C_VS_OUT_CCDIST1_VEC_ENA((rshader->clip_dist_write & 0xF0) != 0) |
S_02881C_VS_OUT_MISC_VEC_ENA(rshader->vs_out_misc_write) |
S_02881C_USE_VTX_POINT_SIZE(rshader->vs_out_point_size);
S_02881C_USE_VTX_POINT_SIZE(rshader->vs_out_point_size) |
S_02881C_USE_VTX_EDGE_FLAG(rshader->vs_out_edgeflag) |
S_02881C_USE_VTX_RENDER_TARGET_INDX(rshader->vs_out_layer);
}
void *evergreen_create_resolve_blend(struct r600_context *rctx)
@@ -3919,6 +4096,10 @@ void evergreen_init_state_functions(struct r600_context *rctx)
rctx->atoms[id++] = &rctx->b.streamout.begin_atom;
r600_init_atom(rctx, &rctx->vertex_shader.atom, id++, r600_emit_shader, 23);
r600_init_atom(rctx, &rctx->pixel_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->geometry_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->export_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->shader_stages.atom, id++, evergreen_emit_shader_stages, 6);
r600_init_atom(rctx, &rctx->gs_rings.atom, id++, evergreen_emit_gs_rings, 26);
rctx->b.b.create_blend_state = evergreen_create_blend_state;
rctx->b.b.create_depth_stencil_alpha_state = evergreen_create_dsa_state;

@@ -48,6 +48,7 @@
#define EVENT_TYPE_ZPASS_DONE 0x15
#define EVENT_TYPE_CACHE_FLUSH_AND_INV_EVENT 0x16
#define EVENT_TYPE_SO_VGTSTREAMOUT_FLUSH 0x1f
#define EVENT_TYPE_VGT_FLUSH 0x24
#define EVENT_TYPE_FLUSH_AND_INV_DB_META 0x2c
#define EVENT_TYPE(x) ((x) << 0)
@@ -274,6 +275,11 @@
#define G_008E2C_NUM_LS_LDS(x) (((x) >> 16) & 0xFFFF)
#define C_008E2C_NUM_LS_LDS(x) 0xFFFF0000
#define R_008C40_SQ_ESGS_RING_BASE 0x00008C40
#define R_008C44_SQ_ESGS_RING_SIZE 0x00008C44
#define R_008C48_SQ_GSVS_RING_BASE 0x00008C48
#define R_008C4C_SQ_GSVS_RING_SIZE 0x00008C4C
#define R_008CF0_SQ_MS_FIFO_SIZES 0x00008CF0
#define S_008CF0_CACHE_FIFO_SIZE(x) (((x) & 0xFF) << 0)
#define G_008CF0_CACHE_FIFO_SIZE(x) (((x) >> 0) & 0xFF)
@@ -576,6 +582,9 @@
#define S_028810_VTX_KILL_OR(x) (((x) & 0x1) << 21)
#define G_028810_VTX_KILL_OR(x) (((x) >> 21) & 0x1)
#define C_028810_VTX_KILL_OR 0xFFDFFFFF
#define S_028810_DX_RASTERIZATION_KILL(x) (((x) & 0x1) << 22)
#define G_028810_DX_RASTERIZATION_KILL(x) (((x) >> 22) & 0x1)
#define C_028810_DX_RASTERIZATION_KILL 0xFFBFFFFF
#define S_028810_DX_LINEAR_ATTR_CLIP_ENA(x) (((x) & 0x1) << 24)
#define G_028810_DX_LINEAR_ATTR_CLIP_ENA(x) (((x) >> 24) & 0x1)
#define C_028810_DX_LINEAR_ATTR_CLIP_ENA 0xFEFFFFFF
@@ -821,12 +830,22 @@
#define S_028A40_MODE(x) (((x) & 0x3) << 0)
#define G_028A40_MODE(x) (((x) >> 0) & 0x3)
#define C_028A40_MODE 0xFFFFFFFC
#define V_028A40_GS_OFF 0
#define V_028A40_GS_SCENARIO_A 1
#define V_028A40_GS_SCENARIO_B 2
#define V_028A40_GS_SCENARIO_G 3
#define V_028A40_GS_SCENARIO_C 4
#define V_028A40_SPRITE_EN 5
#define S_028A40_ES_PASSTHRU(x) (((x) & 0x1) << 2)
#define G_028A40_ES_PASSTHRU(x) (((x) >> 2) & 0x1)
#define C_028A40_ES_PASSTHRU 0xFFFFFFFB
#define S_028A40_CUT_MODE(x) (((x) & 0x3) << 3)
#define G_028A40_CUT_MODE(x) (((x) >> 3) & 0x3)
#define C_028A40_CUT_MODE 0xFFFFFFE7
#define V_028A40_GS_CUT_1024 0
#define V_028A40_GS_CUT_512 1
#define V_028A40_GS_CUT_256 2
#define V_028A40_GS_CUT_128 3
#define S_028A40_COMPUTE_MODE(x) (x << 14)
#define S_028A40_PARTIAL_THD_AT_EOI(x) (x << 17)
#define R_028A6C_VGT_GS_OUT_PRIM_TYPE 0x028A6C
@@ -1201,6 +1220,7 @@
#define C_030008_ENDIAN_SWAP 0x3FFFFFFF
#define R_03000C_SQ_VTX_CONSTANT_WORD3_0 0x03000C
#define S_03000C_UNCACHED(x) (((x) & 0x1) << 2)
#define S_03000C_DST_SEL_X(x) (((x) & 0x7) << 3)
#define G_03000C_DST_SEL_X(x) (((x) >> 3) & 0x7)
#define V_03000C_SQ_SEL_X 0x00000000
@@ -1457,6 +1477,34 @@
#define G_028860_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028860_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_028878_SQ_PGM_RESOURCES_GS 0x028878
#define S_028878_NUM_GPRS(x) (((x) & 0xFF) << 0)
#define G_028878_NUM_GPRS(x) (((x) >> 0) & 0xFF)
#define C_028878_NUM_GPRS 0xFFFFFF00
#define S_028878_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_028878_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_028878_STACK_SIZE 0xFFFF00FF
#define S_028878_DX10_CLAMP(x) (((x) & 0x1) << 21)
#define G_028878_DX10_CLAMP(x) (((x) >> 21) & 0x1)
#define C_028878_DX10_CLAMP 0xFFDFFFFF
#define S_028878_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28)
#define G_028878_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028878_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_028890_SQ_PGM_RESOURCES_ES 0x028890
#define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0)
#define G_028890_NUM_GPRS(x) (((x) >> 0) & 0xFF)
#define C_028890_NUM_GPRS 0xFFFFFF00
#define S_028890_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_028890_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_028890_STACK_SIZE 0xFFFF00FF
#define S_028890_DX10_CLAMP(x) (((x) & 0x1) << 21)
#define G_028890_DX10_CLAMP(x) (((x) >> 21) & 0x1)
#define C_028890_DX10_CLAMP 0xFFDFFFFF
#define S_028890_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28)
#define G_028890_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1)
#define C_028890_UNCACHED_FIRST_INST 0xEFFFFFFF
#define R_028864_SQ_PGM_RESOURCES_2_VS 0x028864
#define S_028864_SINGLE_ROUND(x) (((x) & 0x3) << 0)
#define G_028864_SINGLE_ROUND(x) (((x) >> 0) & 0x3)
@@ -1880,6 +1928,8 @@
#define G_02884C_EXPORT_Z(x) (((x) >> 0) & 0x1)
#define C_02884C_EXPORT_Z 0xFFFFFFFE
#define R_02885C_SQ_PGM_START_VS 0x0002885C
#define R_028874_SQ_PGM_START_GS 0x00028874
#define R_02888C_SQ_PGM_START_ES 0x0002888C
#define R_0288A4_SQ_PGM_START_FS 0x000288A4
#define R_0288D0_SQ_PGM_START_LS 0x000288d0
#define R_0288A8_SQ_PGM_RESOURCES_FS 0x000288A8
@@ -1894,6 +1944,9 @@
#define R_028920_SQ_GS_VERT_ITEMSIZE_1 0x00028920
#define R_028924_SQ_GS_VERT_ITEMSIZE_2 0x00028924
#define R_028928_SQ_GS_VERT_ITEMSIZE_3 0x00028928
#define R_02892C_SQ_GSVS_RING_OFFSET_1 0x0002892C
#define R_028930_SQ_GSVS_RING_OFFSET_2 0x00028930
#define R_028934_SQ_GSVS_RING_OFFSET_3 0x00028934
#define R_028940_ALU_CONST_CACHE_PS_0 0x00028940
#define R_028944_ALU_CONST_CACHE_PS_1 0x00028944
#define R_028980_ALU_CONST_CACHE_VS_0 0x00028980
@@ -1928,6 +1981,15 @@
#define S_028A48_VPORT_SCISSOR_ENABLE(x) (((x) & 0x1) << 1)
#define S_028A48_LINE_STIPPLE_ENABLE(x) (((x) & 0x1) << 2)
#define R_028A4C_PA_SC_MODE_CNTL_1 0x00028A4C
#define R_028A54_GS_PER_ES 0x00028A54
#define R_028A58_ES_PER_GS 0x00028A58
#define R_028A5C_GS_PER_VS 0x00028A5C
#define R_028A84_VGT_PRIMITIVEID_EN 0x028A84
#define S_028A84_PRIMITIVEID_EN(x) (((x) & 0x1) << 0)
#define G_028A84_PRIMITIVEID_EN(x) (((x) >> 0) & 0x1)
#define C_028A84_PRIMITIVEID_EN 0xFFFFFFFE
#define R_028A94_VGT_MULTI_PRIM_IB_RESET_EN 0x00028A94
#define S_028A94_RESET_EN(x) (((x) & 0x1) << 0)
#define G_028A94_RESET_EN(x) (((x) >> 0) & 0x1)
@@ -1962,11 +2024,27 @@
#define R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET 0x028B28
#define R_028B2C_VGT_STRMOUT_DRAW_OPAQUE_BUFFER_FILLED_SIZE 0x028B2C
#define R_028B30_VGT_STRMOUT_DRAW_OPAQUE_VERTEX_STRIDE 0x028B30
#define R_028B38_VGT_GS_MAX_VERT_OUT 0x028B38
#define S_028B38_MAX_VERT_OUT(x) (((x) & 0x7FF) << 0)
#define R_028B44_VGT_STRMOUT_BASE_OFFSET_HI_0 0x028B44
#define R_028B48_VGT_STRMOUT_BASE_OFFSET_HI_1 0x028B48
#define R_028B4C_VGT_STRMOUT_BASE_OFFSET_HI_2 0x028B4C
#define R_028B50_VGT_STRMOUT_BASE_OFFSET_HI_3 0x028B50
#define R_028B54_VGT_SHADER_STAGES_EN 0x00028B54
#define S_028B54_LS_EN(x) (((x) & 0x3) << 0)
#define V_028B54_LS_STAGE_OFF 0x00
#define V_028B54_LS_STAGE_ON 0x01
#define V_028B54_CS_STAGE_ON 0x02
#define S_028B54_HS_EN(x) (((x) & 0x1) << 2)
#define S_028B54_ES_EN(x) (((x) & 0x3) << 3)
#define V_028B54_ES_STAGE_OFF 0x00
#define V_028B54_ES_STAGE_DS 0x01
#define V_028B54_ES_STAGE_REAL 0x02
#define S_028B54_GS_EN(x) (((x) & 0x1) << 5)
#define S_028B54_VS_EN(x) (((x) & 0x3) << 6)
#define V_028B54_VS_STAGE_REAL 0x00
#define V_028B54_VS_STAGE_DS 0x01
#define V_028B54_VS_STAGE_COPY_SHADER 0x02
#define R_028B70_DB_ALPHA_TO_MASK 0x00028B70
#define S_028B70_ALPHA_TO_MASK_ENABLE(x) (((x) & 0x1) << 0)
#define S_028B70_ALPHA_TO_MASK_OFFSET0(x) (((x) & 0x3) << 8)
@@ -1998,12 +2076,9 @@
#define S_028B8C_OFFSET(x) (((x) & 0xFFFFFFFF) << 0)
#define G_028B8C_OFFSET(x) (((x) >> 0) & 0xFFFFFFFF)
#define C_028B8C_OFFSET 0x00000000
#define R_028B94_VGT_STRMOUT_CONFIG 0x028B94
#define S_028B94_STREAMOUT_0_EN(x) (((x) & 0x1) << 0)
#define S_028B94_STREAMOUT_1_EN(x) (((x) & 0x1) << 1)
#define S_028B94_STREAMOUT_2_EN(x) (((x) & 0x1) << 2)
#define S_028B94_STREAMOUT_3_EN(x) (((x) & 0x1) << 3)
#define S_028B94_RAST_STREAM(x) (((x) & 0x07) << 4)
#define R_028B90_VGT_GS_INSTANCE_CNT 0x00028B90
#define S_028B90_ENABLE(x) (((x) & 0x1) << 0)
#define S_028B90_CNT(x) (((x) & 0x7F) << 2)
#define R_028B98_VGT_STRMOUT_BUFFER_CONFIG 0x028B98
#define S_028B98_STREAM_0_BUFFER_EN(x) (((x) & 0x0F) << 0)
#define S_028B98_STREAM_1_BUFFER_EN(x) (((x) & 0x0F) << 4)
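The register definitions in this header follow a fixed naming pattern: `R_` is the register offset, `S_` shifts a field value into place, `G_` extracts the field from a register value, and `C_` is the mask that clears it. The sketch below exercises that pattern with the `SQ_PGM_RESOURCES_GS` macros copied from this diff; `set_stack_size` is a hypothetical helper, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the SQ_PGM_RESOURCES_GS definitions in this diff. */
#define S_028878_NUM_GPRS(x)   (((x) & 0xFF) << 0)
#define G_028878_NUM_GPRS(x)   (((x) >> 0) & 0xFF)
#define C_028878_NUM_GPRS      0xFFFFFF00
#define S_028878_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_028878_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_028878_STACK_SIZE    0xFFFF00FF

/* Hypothetical helper: replace STACK_SIZE and leave every other field
 * of the register value untouched (clear with C_, then set with S_). */
static uint32_t set_stack_size(uint32_t reg, uint32_t stack_size)
{
    /* The field is 8 bits wide; larger values would be masked off by S_. */
    assert(stack_size <= 0xFF);
    return (reg & C_028878_STACK_SIZE) | S_028878_STACK_SIZE(stack_size);
}
```

The `S_` macros mask their argument, so an out-of-range value is truncated rather than spilling into a neighbouring field; the assertion in the helper only makes that case visible.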

@@ -193,7 +193,6 @@ int r600_bytecode_add_output(struct r600_bytecode *bc,
if ((output->gpr + output->burst_count) == bc->cf_last->output.gpr &&
(output->array_base + output->burst_count) == bc->cf_last->output.array_base) {
bc->cf_last->output.end_of_program |= output->end_of_program;
bc->cf_last->op = bc->cf_last->output.op = output->op;
bc->cf_last->output.gpr = output->gpr;
bc->cf_last->output.array_base = output->array_base;
@@ -203,7 +202,6 @@ int r600_bytecode_add_output(struct r600_bytecode *bc,
} else if (output->gpr == (bc->cf_last->output.gpr + bc->cf_last->output.burst_count) &&
output->array_base == (bc->cf_last->output.array_base + bc->cf_last->output.burst_count)) {
bc->cf_last->output.end_of_program |= output->end_of_program;
bc->cf_last->op = bc->cf_last->output.op = output->op;
bc->cf_last->output.burst_count += output->burst_count;
return 0;
@@ -215,6 +213,7 @@ int r600_bytecode_add_output(struct r600_bytecode *bc,
return r;
bc->cf_last->op = output->op;
memcpy(&bc->cf_last->output, output, sizeof(struct r600_bytecode_output));
bc->cf_last->barrier = 1;
return 0;
}
@@ -1526,24 +1525,26 @@ static int r600_bytecode_cf_build(struct r600_bytecode *bc, struct r600_bytecode
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD0_RW_GPR(cf->output.gpr) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ELEM_SIZE(cf->output.elem_size) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ARRAY_BASE(cf->output.array_base) |
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type);
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type) |
S_SQ_CF_ALLOC_EXPORT_WORD0_INDEX_GPR(cf->output.index_gpr);
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD1_BURST_COUNT(cf->output.burst_count - 1) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_X(cf->output.swizzle_x) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_Y(cf->output.swizzle_y) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_Z(cf->output.swizzle_z) |
S_SQ_CF_ALLOC_EXPORT_WORD1_SWIZ_SEL_W(cf->output.swizzle_w) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->output.barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_CF_INST(opcode) |
S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->output.end_of_program);
} else if (cfop->flags & CF_STRM) {
S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->end_of_program);
} else if (cfop->flags & CF_MEM) {
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD0_RW_GPR(cf->output.gpr) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ELEM_SIZE(cf->output.elem_size) |
S_SQ_CF_ALLOC_EXPORT_WORD0_ARRAY_BASE(cf->output.array_base) |
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type);
S_SQ_CF_ALLOC_EXPORT_WORD0_TYPE(cf->output.type) |
S_SQ_CF_ALLOC_EXPORT_WORD0_INDEX_GPR(cf->output.index_gpr);
bc->bytecode[id++] = S_SQ_CF_ALLOC_EXPORT_WORD1_BURST_COUNT(cf->output.burst_count - 1) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->output.barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BARRIER(cf->barrier) |
S_SQ_CF_ALLOC_EXPORT_WORD1_CF_INST(opcode) |
S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->output.end_of_program) |
S_SQ_CF_ALLOC_EXPORT_WORD1_END_OF_PROGRAM(cf->end_of_program) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BUF_ARRAY_SIZE(cf->output.array_size) |
S_SQ_CF_ALLOC_EXPORT_WORD1_BUF_COMP_MASK(cf->output.comp_mask);
} else {
@@ -1551,7 +1552,8 @@ static int r600_bytecode_cf_build(struct r600_bytecode *bc, struct r600_bytecode
bc->bytecode[id++] = S_SQ_CF_WORD1_CF_INST(opcode) |
S_SQ_CF_WORD1_BARRIER(1) |
S_SQ_CF_WORD1_COND(cf->cond) |
S_SQ_CF_WORD1_POP_COUNT(cf->pop_count);
S_SQ_CF_WORD1_POP_COUNT(cf->pop_count) |
S_SQ_CF_WORD1_END_OF_PROGRAM(cf->end_of_program);
}
return 0;
}
@@ -1932,12 +1934,12 @@ void r600_bytecode_disasm(struct r600_bytecode *bc)
print_indent(o, 67);
fprintf(stderr, " ES:%X ", cf->output.elem_size);
if (!cf->output.barrier)
if (!cf->barrier)
fprintf(stderr, "NO_BARRIER ");
if (cf->output.end_of_program)
if (cf->end_of_program)
fprintf(stderr, "EOP ");
fprintf(stderr, "\n");
} else if (r600_isa_cf(cf->op)->flags & CF_STRM) {
} else if (r600_isa_cf(cf->op)->flags & CF_MEM) {
int o = 0;
const char *exp_type[] = {"WRITE", "WRITE_IND", "WRITE_ACK",
"WRITE_IND_ACK"};
@@ -1963,14 +1965,17 @@ void r600_bytecode_disasm(struct r600_bytecode *bc)
o += print_swizzle(7);
}
if (cf->output.type == V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_WRITE_IND)
o += fprintf(stderr, " R%d", cf->output.index_gpr);
o += print_indent(o, 67);
fprintf(stderr, " ES:%i ", cf->output.elem_size);
if (cf->output.array_size != 0xFFF)
fprintf(stderr, "AS:%i ", cf->output.array_size);
if (!cf->output.barrier)
if (!cf->barrier)
fprintf(stderr, "NO_BARRIER ");
if (cf->output.end_of_program)
if (cf->end_of_program)
fprintf(stderr, "EOP ");
fprintf(stderr, "\n");
} else {
@@ -2486,6 +2491,7 @@ void r600_bytecode_alu_read(struct r600_bytecode *bc,
}
}
#if 0
void r600_bytecode_export_read(struct r600_bytecode *bc,
struct r600_bytecode_output *output, uint32_t word0, uint32_t word1)
{
@@ -2506,3 +2512,4 @@ void r600_bytecode_export_read(struct r600_bytecode *bc,
output->array_size = G_SQ_CF_ALLOC_EXPORT_WORD1_BUF_ARRAY_SIZE(word1);
output->comp_mask = G_SQ_CF_ALLOC_EXPORT_WORD1_BUF_COMP_MASK(word1);
}
#endif
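The two branches touched in `r600_bytecode_add_output` fold a new export into the previous one when its GPR and array base are contiguous with it, either immediately before or immediately after. The sketch below is a simplified model of that test. `struct export_run` and `try_merge_export` are hypothetical names, the real function checks further preconditions that are not visible in these hunks, and the burst-count update in the first branch is an assumption because that line falls outside the hunk.

```c
#include <assert.h>

/* Simplified stand-in for the r600_bytecode_output fields used here. */
struct export_run {
    unsigned gpr;
    unsigned array_base;
    unsigned burst_count;
};

/* Hypothetical helper: returns 1 if `next` was folded into `last`. */
static int try_merge_export(struct export_run *last,
                            const struct export_run *next)
{
    assert(last && next);

    /* `next` ends exactly where `last` starts: extend downwards. */
    if (next->gpr + next->burst_count == last->gpr &&
        next->array_base + next->burst_count == last->array_base) {
        last->gpr = next->gpr;
        last->array_base = next->array_base;
        last->burst_count += next->burst_count; /* assumed, see lead-in */
        return 1;
    }

    /* `next` starts exactly where `last` ends: extend upwards. */
    if (next->gpr == last->gpr + last->burst_count &&
        next->array_base == last->array_base + last->burst_count) {
        last->burst_count += next->burst_count;
        return 1;
    }

    return 0; /* not contiguous: a separate CF instruction is needed */
}
```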

@@ -115,7 +115,6 @@ struct r600_bytecode_output {
unsigned array_size;
unsigned comp_mask;
unsigned type;
unsigned end_of_program;
unsigned op;
@@ -126,7 +125,7 @@ struct r600_bytecode_output {
unsigned swizzle_z;
unsigned swizzle_w;
unsigned burst_count;
unsigned barrier;
unsigned index_gpr;
};
struct r600_bytecode_kcache {
@@ -148,6 +147,8 @@ struct r600_bytecode_cf {
struct r600_bytecode_kcache kcache[4];
unsigned r6xx_uses_waterfall;
unsigned eg_alu_extended;
unsigned barrier;
unsigned end_of_program;
struct list_head alu;
struct list_head tex;
struct list_head vtx;

@@ -59,6 +59,7 @@ static void r600_blitter_begin(struct pipe_context *ctx, enum r600_blitter_op op
util_blitter_save_vertex_buffer_slot(rctx->blitter, rctx->vertex_buffer_state.vb);
util_blitter_save_vertex_elements(rctx->blitter, rctx->vertex_fetch_shader.cso);
util_blitter_save_vertex_shader(rctx->blitter, rctx->vs_shader);
util_blitter_save_geometry_shader(rctx->blitter, rctx->gs_shader);
util_blitter_save_so_targets(rctx->blitter, rctx->b.streamout.num_targets,
(struct pipe_stream_output_target**)rctx->b.streamout.targets);
util_blitter_save_rasterizer(rctx->blitter, rctx->rasterizer_state.cso);
@@ -598,6 +599,12 @@ static void r600_copy_buffer(struct pipe_context *ctx, struct pipe_resource *dst
} else {
util_resource_copy_region(ctx, dst, 0, dstx, 0, 0, src, 0, src_box);
}
/* The index buffer (VGT) doesn't seem to see the result of the copying.
* Can we somehow flush the index buffer cache? Starting a new IB seems
* to do the trick. */
if (rctx->b.chip_class <= R700)
rctx->b.rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC);
}
/**
@@ -678,6 +685,7 @@ static void r600_resource_copy_region(struct pipe_context *ctx,
struct pipe_surface *dst_view, dst_templ;
struct pipe_sampler_view src_templ, *src_view;
unsigned dst_width, dst_height, src_width0, src_height0, src_widthFL, src_heightFL;
unsigned src_force_level = 0;
struct pipe_box sbox, dstbox;
/* Handle buffers first. */
@@ -736,6 +744,8 @@ static void r600_resource_copy_region(struct pipe_context *ctx,
sbox.height = util_format_get_nblocksy(src->format, src_box->height);
sbox.depth = src_box->depth;
src_box = &sbox;
src_force_level = src_level;
} else if (!util_blitter_is_copy_supported(rctx->blitter, dst, src)) {
if (util_format_is_subsampled_2x1_32bpp(src->format)) {
@@ -788,7 +798,8 @@ static void r600_resource_copy_region(struct pipe_context *ctx,
if (rctx->b.chip_class >= EVERGREEN) {
src_view = evergreen_create_sampler_view_custom(ctx, src, &src_templ,
src_width0, src_height0);
src_width0, src_height0,
src_force_level);
} else {
src_view = r600_create_sampler_view_custom(ctx, src, &src_templ,
src_widthFL, src_heightFL);

@@ -81,7 +81,7 @@ void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw,
}
/* SX_MISC */
if (ctx->b.chip_class <= R700) {
if (ctx->b.chip_class == R600) {
num_dw += 3;
}
@@ -210,6 +210,15 @@ void r600_flush_emit(struct r600_context *rctx)
S_0085F0_SMX_ACTION_ENA(1);
}
/* Workaround for buggy flushing on some R6xx chipsets. */
if (rctx->b.flags & R600_CONTEXT_FLUSH_AND_INV &&
(rctx->b.family == CHIP_RV670 ||
rctx->b.family == CHIP_RS780 ||
rctx->b.family == CHIP_RS880)) {
cp_coher_cntl |= S_0085F0_CB1_DEST_BASE_ENA(1) |
S_0085F0_DEST_BASE_0_ENA(1);
}
if (cp_coher_cntl) {
cs->buf[cs->cdw++] = PKT3(PKT3_SURFACE_SYNC, 3, 0);
cs->buf[cs->cdw++] = cp_coher_cntl; /* CP_COHER_CNTL */
@@ -260,7 +269,7 @@ void r600_context_flush(struct r600_context *ctx, unsigned flags)
r600_flush_emit(ctx);
/* old kernels and userspace don't set SX_MISC, so we must reset it to 0 here */
if (ctx->b.chip_class <= R700) {
if (ctx->b.chip_class == R600) {
r600_write_context_reg(cs, R_028350_SX_MISC, 0);
}
@@ -301,6 +310,12 @@ void r600_begin_new_cs(struct r600_context *ctx)
ctx->config_state.atom.dirty = true;
ctx->stencil_ref.atom.dirty = true;
ctx->vertex_fetch_shader.atom.dirty = true;
ctx->export_shader.atom.dirty = true;
if (ctx->gs_shader) {
ctx->geometry_shader.atom.dirty = true;
ctx->shader_stages.atom.dirty = true;
ctx->gs_rings.atom.dirty = true;
}
ctx->vertex_shader.atom.dirty = true;
ctx->viewport.atom.dirty = true;
@@ -346,7 +361,7 @@ void r600_begin_new_cs(struct r600_context *ctx)
ctx->last_primitive_type = -1;
ctx->last_start_instance = -1;
ctx->initial_gfx_cs_size = ctx->b.rings.gfx.cs->cdw;
ctx->b.initial_gfx_cs_size = ctx->b.rings.gfx.cs->cdw;
}
/* The max number of bytes to copy per packet. */

@@ -44,7 +44,7 @@
static const struct debug_named_value r600_debug_options[] = {
/* features */
#if defined(R600_USE_LLVM)
{ "nollvm", DBG_NO_LLVM, "Disable the LLVM shader compiler" },
{ "llvm", DBG_LLVM, "Enable the LLVM shader compiler" },
#endif
{ "nocpdma", DBG_NO_CP_DMA, "Disable CP DMA" },
{ "nodma", DBG_NO_ASYNC_DMA, "Disable asynchronous DMA" },
@@ -73,7 +73,7 @@ static void r600_flush(struct pipe_context *ctx, unsigned flags)
unsigned render_cond_mode = 0;
boolean render_cond_cond = FALSE;
if (rctx->b.rings.gfx.cs->cdw == rctx->initial_gfx_cs_size)
if (rctx->b.rings.gfx.cs->cdw == rctx->b.initial_gfx_cs_size)
return;
rctx->b.rings.gfx.flushing = true;
@@ -94,7 +94,7 @@ static void r600_flush(struct pipe_context *ctx, unsigned flags)
ctx->render_condition(ctx, render_cond, render_cond_cond, render_cond_mode);
}
rctx->initial_gfx_cs_size = rctx->b.rings.gfx.cs->cdw;
rctx->b.initial_gfx_cs_size = rctx->b.rings.gfx.cs->cdw;
}
static void r600_flush_from_st(struct pipe_context *ctx,
@@ -347,7 +347,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_USER_INDEX_BUFFERS:
case PIPE_CAP_USER_CONSTANT_BUFFERS:
case PIPE_CAP_COMPUTE:
case PIPE_CAP_START_INSTANCE:
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
case PIPE_CAP_TEXTURE_BUFFER_OBJECTS:
@@ -356,6 +355,9 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_TEXTURE_MULTISAMPLE:
return 1;
case PIPE_CAP_COMPUTE:
return rscreen->b.chip_class > R700;
case PIPE_CAP_TGSI_TEXCOORD:
return 0;
@@ -372,6 +374,11 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
return 1;
case PIPE_CAP_GLSL_FEATURE_LEVEL:
if (family >= CHIP_CEDAR)
return 330;
/* pre-evergreen geom shaders need newer kernel */
if (rscreen->b.info.drm_minor >= 37)
return 330;
return 140;
/* Supported except the original R600. */
@@ -383,6 +390,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
/* Supported on Evergreen. */
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
case PIPE_CAP_CUBE_MAP_ARRAY:
case PIPE_CAP_TGSI_VS_LAYER:
return family >= CHIP_CEDAR ? 1 : 0;
/* Unsupported features. */
@@ -392,7 +400,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
case PIPE_CAP_USER_VERTEX_BUFFERS:
case PIPE_CAP_TGSI_VS_LAYER:
return 0;
/* Stream output. */
@@ -404,19 +411,27 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 32*4;
/* Geometry shader output. */
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
return 1024;
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 16384;
/* Texturing. */
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS:
if (family >= CHIP_CEDAR)
return 15;
else
return 14;
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:
/* textures support 8192, but layered rendering supports 2048 */
return 12;
case PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS:
return rscreen->b.info.drm_minor >= 9 ?
(family >= CHIP_CEDAR ? 16384 : 8192) : 0;
/* textures support 8192, but layered rendering supports 2048 */
return rscreen->b.info.drm_minor >= 9 ? 2048 : 0;
case PIPE_CAP_MAX_COMBINED_SAMPLERS:
return 32;
return 48;
/* Render targets. */
case PIPE_CAP_MAX_RENDER_TARGETS:
@@ -449,14 +464,20 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
static int r600_get_shader_param(struct pipe_screen* pscreen, unsigned shader, enum pipe_shader_cap param)
{
struct r600_screen *rscreen = (struct r600_screen *)pscreen;
switch(shader)
{
case PIPE_SHADER_FRAGMENT:
case PIPE_SHADER_VERTEX:
case PIPE_SHADER_COMPUTE:
case PIPE_SHADER_COMPUTE:
break;
case PIPE_SHADER_GEOMETRY:
/* XXX: support and enable geometry programs */
if (rscreen->b.family >= CHIP_CEDAR)
break;
/* pre-evergreen geom shaders need newer kernel */
if (rscreen->b.info.drm_minor >= 37)
break;
return 0;
default:
/* XXX: support tessellation on Evergreen */
@@ -568,10 +589,10 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
rscreen->b.debug_flags |= DBG_COMPUTE;
if (debug_get_bool_option("R600_DUMP_SHADERS", FALSE))
rscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS | DBG_PS | DBG_CS;
if (!debug_get_bool_option("R600_HYPERZ", TRUE))
rscreen->b.debug_flags |= DBG_NO_HYPERZ;
if (!debug_get_bool_option("R600_LLVM", TRUE))
rscreen->b.debug_flags |= DBG_NO_LLVM;
if (debug_get_bool_option("R600_HYPERZ", FALSE))
rscreen->b.debug_flags |= DBG_HYPERZ;
if (debug_get_bool_option("R600_LLVM", FALSE))
rscreen->b.debug_flags |= DBG_LLVM;
if (rscreen->b.family == CHIP_UNKNOWN) {
fprintf(stderr, "r600: Unknown chipset 0x%04X\n", rscreen->b.info.pci_id);
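The `PIPE_CAP_GLSL_FEATURE_LEVEL` and `PIPE_SHADER_GEOMETRY` hunks in this file apply the same gate: Evergreen and newer always qualify, while older chips qualify only when the kernel DRM minor version is at least 37. A minimal sketch of that decision follows. The function names are hypothetical, and the boolean parameter stands in for the `family >= CHIP_CEDAR` comparison in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when geometry shaders can be exposed. */
static bool geometry_shaders_supported(bool evergreen_or_newer, int drm_minor)
{
    assert(drm_minor >= 0);
    /* Pre-Evergreen geometry shaders need a newer kernel. */
    return evergreen_or_newer || drm_minor >= 37;
}

/* Hypothetical helper: GLSL level reported for the same inputs. */
static int glsl_feature_level(bool evergreen_or_newer, int drm_minor)
{
    return geometry_shaders_supported(evergreen_or_newer, drm_minor) ? 330
                                                                     : 140;
}
```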

@@ -38,21 +38,22 @@
#include "util/u_double_list.h"
#include "util/u_transfer.h"
#define R600_NUM_ATOMS 41
#define R600_NUM_ATOMS 42
/* the number of CS dwords for flushing and drawing */
#define R600_MAX_FLUSH_CS_DWORDS 16
#define R600_MAX_DRAW_CS_DWORDS 34
#define R600_MAX_DRAW_CS_DWORDS 37
#define R600_TRACE_CS_DWORDS 7
#define R600_MAX_USER_CONST_BUFFERS 13
#define R600_MAX_DRIVER_CONST_BUFFERS 3
#define R600_MAX_DRIVER_CONST_BUFFERS 4
#define R600_MAX_CONST_BUFFERS (R600_MAX_USER_CONST_BUFFERS + R600_MAX_DRIVER_CONST_BUFFERS)
/* start driver buffers after user buffers */
#define R600_UCP_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS)
#define R600_TXQ_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 1)
#define R600_BUFFER_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 2)
#define R600_GS_RING_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 3)
#define R600_MAX_CONST_BUFFER_SIZE 4096
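The driver-owned constant buffers are numbered directly after the user buffers, so adding `R600_GS_RING_CONST_BUFFER` at offset 3 requires the driver buffer count to grow from 3 to 4. The sketch below restates the post-change values from this hunk and checks the resulting slot layout; `is_driver_const_buffer` is a hypothetical helper added only for illustration.

```c
#include <assert.h>

/* Values as they stand after this change. */
#define R600_MAX_USER_CONST_BUFFERS   13
#define R600_MAX_DRIVER_CONST_BUFFERS 4
#define R600_MAX_CONST_BUFFERS \
    (R600_MAX_USER_CONST_BUFFERS + R600_MAX_DRIVER_CONST_BUFFERS)

/* Driver buffers start after the user buffers. */
#define R600_UCP_CONST_BUFFER         (R600_MAX_USER_CONST_BUFFERS)
#define R600_TXQ_CONST_BUFFER         (R600_MAX_USER_CONST_BUFFERS + 1)
#define R600_BUFFER_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 2)
#define R600_GS_RING_CONST_BUFFER     (R600_MAX_USER_CONST_BUFFERS + 3)

/* Hypothetical helper: 1 if the slot is reserved for the driver. */
static int is_driver_const_buffer(unsigned slot)
{
    assert(slot < R600_MAX_CONST_BUFFERS);
    return slot >= R600_MAX_USER_CONST_BUFFERS;
}
```

Slots 0 through 12 stay available to the state tracker, and slots 13 through 16 are driver-internal, with the GS ring descriptor in the last one.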
@@ -159,6 +160,7 @@ struct r600_sample_mask {
struct r600_config_state {
struct r600_atom atom;
unsigned sq_gpr_resource_mgmt_1;
unsigned sq_gpr_resource_mgmt_2;
};
struct r600_stencil_ref
@@ -179,9 +181,21 @@ struct r600_viewport_state {
struct pipe_viewport_state state;
};
struct r600_shader_stages_state {
struct r600_atom atom;
unsigned geom_enable;
};
struct r600_gs_rings_state {
struct r600_atom atom;
unsigned enable;
struct pipe_constant_buffer esgs_ring;
struct pipe_constant_buffer gsvs_ring;
};
/* This must start from 16. */
/* features */
#define DBG_NO_LLVM (1 << 17)
#define DBG_LLVM (1 << 17)
#define DBG_NO_CP_DMA (1 << 18)
#define DBG_NO_ASYNC_DMA (1 << 19)
/* shader backend */
@@ -221,6 +235,7 @@ struct r600_rasterizer_state {
unsigned clip_plane_enable;
unsigned pa_sc_line_stipple;
unsigned pa_cl_clip_cntl;
unsigned pa_su_sc_mode_cntl;
float offset_units;
float offset_scale;
bool offset_enable;
@@ -353,7 +368,7 @@ struct r600_fetch_shader {
struct r600_shader_state {
struct r600_atom atom;
struct r600_pipe_shader_selector *shader;
struct r600_pipe_shader *shader;
};
struct r600_context {
@@ -361,7 +376,6 @@ struct r600_context {
struct r600_screen *screen;
struct blitter_context *blitter;
struct u_suballocator *allocator_fetch_shader;
unsigned initial_gfx_cs_size;
/* Hardware info. */
boolean has_vertex_cache;
@@ -415,7 +429,11 @@ struct r600_context {
struct r600_cso_state vertex_fetch_shader;
struct r600_shader_state vertex_shader;
struct r600_shader_state pixel_shader;
struct r600_shader_state geometry_shader;
struct r600_shader_state export_shader;
struct r600_cs_shader_state cs_shader_state;
struct r600_shader_stages_state shader_stages;
struct r600_gs_rings_state gs_rings;
struct r600_constbuf_state constbuf_state[PIPE_SHADER_TYPES];
struct r600_textures_info samplers[PIPE_SHADER_TYPES];
/** Vertex buffers for fetch shaders */
@@ -427,6 +445,7 @@ struct r600_context {
unsigned compute_cb_target_mask;
struct r600_pipe_shader_selector *ps_shader;
struct r600_pipe_shader_selector *vs_shader;
struct r600_pipe_shader_selector *gs_shader;
struct r600_rasterizer_state *rasterizer;
bool alpha_to_one;
bool force_blend_disable;
@@ -493,7 +512,8 @@ struct pipe_sampler_view *
evergreen_create_sampler_view_custom(struct pipe_context *ctx,
struct pipe_resource *texture,
const struct pipe_sampler_view *state,
unsigned width0, unsigned height0);
unsigned width0, unsigned height0,
unsigned force_level);
void evergreen_init_common_regs(struct r600_command_buffer *cb,
enum chip_class ctx_chip_class,
enum radeon_family ctx_family,
@@ -506,6 +526,8 @@ void cayman_init_common_regs(struct r600_command_buffer *cb,
void evergreen_init_state_functions(struct r600_context *rctx);
void evergreen_init_atom_start_cs(struct r600_context *rctx);
void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void evergreen_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void evergreen_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void evergreen_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void *evergreen_create_db_flush_dsa(struct r600_context *rctx);
void *evergreen_create_resolve_blend(struct r600_context *rctx);
@@ -545,6 +567,8 @@ r600_create_sampler_view_custom(struct pipe_context *ctx,
void r600_init_state_functions(struct r600_context *rctx);
void r600_init_atom_start_cs(struct r600_context *rctx);
void r600_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void r600_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void r600_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void r600_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader);
void *r600_create_db_flush_dsa(struct r600_context *rctx);
void *r600_create_resolve_blend(struct r600_context *rctx);
@@ -803,15 +827,6 @@ static INLINE uint32_t S_FIXED(float value, uint32_t frac_bits)
}
#define ALIGN_DIVUP(x, y) (((x) + (y) - 1) / (y))
static inline unsigned r600_tex_aniso_filter(unsigned filter)
{
if (filter <= 1) return 0;
if (filter <= 2) return 1;
if (filter <= 4) return 2;
if (filter <= 8) return 3;
/* else */ return 4;
}
/* 12.4 fixed-point */
static INLINE unsigned r600_pack_float_12p4(float x)
{
@@ -819,4 +834,32 @@ static INLINE unsigned r600_pack_float_12p4(float x)
x >= 4096 ? 0xffff : x * 16;
}
#define V_028A6C_OUTPRIM_TYPE_POINTLIST 0
#define V_028A6C_OUTPRIM_TYPE_LINESTRIP 1
#define V_028A6C_OUTPRIM_TYPE_TRISTRIP 2
static INLINE unsigned r600_conv_prim_to_gs_out(unsigned mode)
{
static const int prim_conv[] = {
V_028A6C_OUTPRIM_TYPE_POINTLIST,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP
};
assert(mode < Elements(prim_conv));
return prim_conv[mode];
}
#endif
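
Note on the `r600_conv_prim_to_gs_out` table added above: it is a plain lookup from an input primitive index to one of three geometry-shader output primitive types. The following is a minimal standalone sketch of that lookup, not the driver code. It assumes the index follows Gallium's `PIPE_PRIM_*` ordering (points, lines, line loop, line strip, triangles, ...), covers only the first fourteen entries, and uses hypothetical enum names (`OUT_*`, `PRIM_*`) in place of the real macros. Unlike the diff, which asserts on an out-of-range index, the sketch returns -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for V_028A6C_OUTPRIM_TYPE_{POINTLIST,LINESTRIP,TRISTRIP}. */
enum gs_out_prim { OUT_POINTLIST = 0, OUT_LINESTRIP = 1, OUT_TRISTRIP = 2 };

/* Assumed input ordering (mirrors Gallium's PIPE_PRIM_* enum; names are illustrative). */
enum in_prim {
    PRIM_POINTS, PRIM_LINES, PRIM_LINE_LOOP, PRIM_LINE_STRIP,
    PRIM_TRIANGLES, PRIM_TRIANGLE_STRIP, PRIM_TRIANGLE_FAN,
    PRIM_QUADS, PRIM_QUAD_STRIP, PRIM_POLYGON,
    PRIM_LINES_ADJ, PRIM_LINE_STRIP_ADJ,
    PRIM_TRIANGLES_ADJ, PRIM_TRIANGLE_STRIP_ADJ
};

/* Map an input primitive to the GS output primitive type, or -1 if out of range. */
static int conv_prim_to_gs_out(unsigned mode)
{
    static const int table[] = {
        OUT_POINTLIST,                              /* points */
        OUT_LINESTRIP, OUT_LINESTRIP, OUT_LINESTRIP,/* lines, line loop, line strip */
        OUT_TRISTRIP, OUT_TRISTRIP, OUT_TRISTRIP,   /* triangles, strip, fan */
        OUT_TRISTRIP, OUT_TRISTRIP, OUT_TRISTRIP,   /* quads, quad strip, polygon */
        OUT_LINESTRIP, OUT_LINESTRIP,               /* line adjacency variants */
        OUT_TRISTRIP, OUT_TRISTRIP                  /* triangle adjacency variants */
    };
    const size_t count = sizeof(table) / sizeof(table[0]);

    if (mode >= count)
        return -1; /* the driver asserts here instead of returning an error */
    return table[mode];
}
```

Every line-like input collapses to a line strip and every filled primitive to a triangle strip, which is why the table needs only three distinct output values.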

File diff suppressed because it is too large


@@ -37,6 +37,7 @@ struct r600_shader_io {
unsigned lds_pos; /* for evergreen */
unsigned back_color_input;
unsigned write_mask;
int ring_offset;
};
struct r600_shader {
@@ -61,12 +62,23 @@ struct r600_shader {
/* flag is set if the shader writes VS_OUT_MISC_VEC (e.g. for PSIZE) */
boolean vs_out_misc_write;
boolean vs_out_point_size;
boolean vs_out_layer;
boolean vs_out_edgeflag;
boolean has_txq_cube_array_z_comp;
boolean uses_tex_buffers;
boolean gs_prim_id_input;
/* geometry shader properties */
unsigned gs_input_prim;
unsigned gs_output_prim;
unsigned gs_max_out_vertices;
/* size in bytes of a data item in the ring (single vertex data) */
unsigned ring_item_size;
unsigned indirect_files;
unsigned max_arrays;
unsigned num_arrays;
unsigned vs_as_es;
struct r600_shader_array * arrays;
};
@@ -74,6 +86,7 @@ struct r600_shader_key {
unsigned color_two_side:1;
unsigned alpha_to_one:1;
unsigned nr_cbufs:4;
unsigned vs_as_es:1;
};
struct r600_shader_array {
@@ -85,6 +98,8 @@ struct r600_shader_array {
struct r600_pipe_shader {
struct r600_pipe_shader_selector *selector;
struct r600_pipe_shader *next_variant;
/* for GS - corresponding copy shader (installed as VS) */
struct r600_pipe_shader *gs_copy_shader;
struct r600_shader shader;
struct r600_command_buffer command_buffer; /* register writes */
struct r600_resource *bo;


@@ -911,6 +911,10 @@ static void *r600_create_rs_state(struct pipe_context *ctx,
S_028810_ZCLIP_NEAR_DISABLE(!state->depth_clip) |
S_028810_ZCLIP_FAR_DISABLE(!state->depth_clip) |
S_028810_DX_LINEAR_ATTR_CLIP_ENA(1);
if (rctx->b.chip_class == R700) {
rs->pa_cl_clip_cntl |=
S_028810_DX_RASTERIZATION_KILL(state->rasterizer_discard);
}
rs->multisample_enable = state->multisample;
/* offset */
@@ -968,19 +972,25 @@ static void *r600_create_rs_state(struct pipe_context *ctx,
S_028C08_PIX_CENTER_HALF(state->half_pixel_center) |
S_028C08_QUANT_MODE(V_028C08_X_1_256TH));
r600_store_context_reg(&rs->buffer, R_028DFC_PA_SU_POLY_OFFSET_CLAMP, fui(state->offset_clamp));
r600_store_context_reg(&rs->buffer, R_028814_PA_SU_SC_MODE_CNTL,
S_028814_PROVOKING_VTX_LAST(!state->flatshade_first) |
S_028814_CULL_FRONT(state->cull_face & PIPE_FACE_FRONT ? 1 : 0) |
S_028814_CULL_BACK(state->cull_face & PIPE_FACE_BACK ? 1 : 0) |
S_028814_FACE(!state->front_ccw) |
S_028814_POLY_OFFSET_FRONT_ENABLE(state->offset_tri) |
S_028814_POLY_OFFSET_BACK_ENABLE(state->offset_tri) |
S_028814_POLY_OFFSET_PARA_ENABLE(state->offset_tri) |
S_028814_POLY_MODE(state->fill_front != PIPE_POLYGON_MODE_FILL ||
state->fill_back != PIPE_POLYGON_MODE_FILL) |
S_028814_POLYMODE_FRONT_PTYPE(r600_translate_fill(state->fill_front)) |
S_028814_POLYMODE_BACK_PTYPE(r600_translate_fill(state->fill_back)));
r600_store_context_reg(&rs->buffer, R_028350_SX_MISC, S_028350_MULTIPASS(state->rasterizer_discard));
rs->pa_su_sc_mode_cntl = S_028814_PROVOKING_VTX_LAST(!state->flatshade_first) |
S_028814_CULL_FRONT(state->cull_face & PIPE_FACE_FRONT ? 1 : 0) |
S_028814_CULL_BACK(state->cull_face & PIPE_FACE_BACK ? 1 : 0) |
S_028814_FACE(!state->front_ccw) |
S_028814_POLY_OFFSET_FRONT_ENABLE(state->offset_tri) |
S_028814_POLY_OFFSET_BACK_ENABLE(state->offset_tri) |
S_028814_POLY_OFFSET_PARA_ENABLE(state->offset_tri) |
S_028814_POLY_MODE(state->fill_front != PIPE_POLYGON_MODE_FILL ||
state->fill_back != PIPE_POLYGON_MODE_FILL) |
S_028814_POLYMODE_FRONT_PTYPE(r600_translate_fill(state->fill_front)) |
S_028814_POLYMODE_BACK_PTYPE(r600_translate_fill(state->fill_back));
if (rctx->b.chip_class == R700) {
r600_store_context_reg(&rs->buffer, R_028814_PA_SU_SC_MODE_CNTL, rs->pa_su_sc_mode_cntl);
}
if (rctx->b.chip_class == R600) {
r600_store_context_reg(&rs->buffer, R_028350_SX_MISC,
S_028350_MULTIPASS(state->rasterizer_discard));
}
return rs;
}
@@ -1264,6 +1274,7 @@ static void r600_init_color_surface(struct r600_context *rctx,
unsigned level = surf->base.u.tex.level;
unsigned pitch, slice;
unsigned color_info;
unsigned color_view;
unsigned format, swap, ntype, endian;
unsigned offset;
const struct util_format_description *desc;
@@ -1277,10 +1288,15 @@ static void r600_init_color_surface(struct r600_context *rctx,
}
offset = rtex->surface.level[level].offset;
if (rtex->surface.level[level].mode < RADEON_SURF_MODE_1D) {
if (rtex->surface.level[level].mode == RADEON_SURF_MODE_LINEAR) {
assert(surf->base.u.tex.first_layer == surf->base.u.tex.last_layer);
offset += rtex->surface.level[level].slice_size *
surf->base.u.tex.first_layer;
}
surf->base.u.tex.first_layer;
color_view = 0;
} else
color_view = S_028080_SLICE_START(surf->base.u.tex.first_layer) |
S_028080_SLICE_MAX(surf->base.u.tex.last_layer);
pitch = rtex->surface.level[level].nblk_x / 8 - 1;
slice = (rtex->surface.level[level].nblk_x * rtex->surface.level[level].nblk_y) / 64;
if (slice) {
@@ -1466,14 +1482,7 @@ static void r600_init_color_surface(struct r600_context *rctx,
}
surf->cb_color_info = color_info;
if (rtex->surface.level[level].mode < RADEON_SURF_MODE_1D) {
surf->cb_color_view = 0;
} else {
surf->cb_color_view = S_028080_SLICE_START(surf->base.u.tex.first_layer) |
S_028080_SLICE_MAX(surf->base.u.tex.last_layer);
}
surf->cb_color_view = color_view;
surf->color_initialized = true;
}
@@ -1667,8 +1676,6 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
rctx->alphatest_state.atom.dirty = true;
}
r600_update_db_shader_control(rctx);
/* Calculate the CS size. */
rctx->framebuffer.atom.num_dw =
10 /*COLOR_INFO*/ + 4 /*SCISSOR*/ + 3 /*SHADER_CONTROL*/ + 8 /*MSAA*/;
@@ -2055,6 +2062,11 @@ static void r600_emit_db_misc_state(struct r600_context *rctx, struct r600_atom
db_render_control |= S_028D0C_DEPTH_CLEAR_ENABLE(1);
}
/* RV770 workaround for a hang with 8x MSAA. */
if (rctx->b.family == CHIP_RV770 && a->log_samples == 3) {
db_render_override |= S_028D10_MAX_TILES_IN_DTT(6);
}
r600_write_context_reg_seq(cs, R_028D0C_DB_RENDER_CONTROL, 2);
radeon_emit(cs, db_render_control); /* R_028D0C_DB_RENDER_CONTROL */
radeon_emit(cs, db_render_override); /* R_028D10_DB_RENDER_OVERRIDE */
@@ -2067,6 +2079,7 @@ static void r600_emit_config_state(struct r600_context *rctx, struct r600_atom *
struct r600_config_state *a = (struct r600_config_state*)atom;
r600_write_config_reg(cs, R_008C04_SQ_GPR_RESOURCE_MGMT_1, a->sq_gpr_resource_mgmt_1);
r600_write_config_reg(cs, R_008C08_SQ_GPR_RESOURCE_MGMT_2, a->sq_gpr_resource_mgmt_2);
}
static void r600_emit_vertex_buffers(struct r600_context *rctx, struct r600_atom *atom)
@@ -2118,16 +2131,18 @@ static void r600_emit_constant_buffers(struct r600_context *rctx,
struct r600_resource *rbuffer;
unsigned offset;
unsigned buffer_index = ffs(dirty_mask) - 1;
unsigned gs_ring_buffer = (buffer_index == R600_GS_RING_CONST_BUFFER);
cb = &state->cb[buffer_index];
rbuffer = (struct r600_resource*)cb->buffer;
assert(rbuffer);
offset = cb->buffer_offset;
r600_write_context_reg(cs, reg_alu_constbuf_size + buffer_index * 4,
ALIGN_DIVUP(cb->buffer_size >> 4, 16));
r600_write_context_reg(cs, reg_alu_const_cache + buffer_index * 4, offset >> 8);
if (!gs_ring_buffer) {
r600_write_context_reg(cs, reg_alu_constbuf_size + buffer_index * 4,
ALIGN_DIVUP(cb->buffer_size >> 4, 16));
r600_write_context_reg(cs, reg_alu_const_cache + buffer_index * 4, offset >> 8);
}
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READ));
@@ -2137,8 +2152,8 @@ static void r600_emit_constant_buffers(struct r600_context *rctx,
radeon_emit(cs, offset); /* RESOURCEi_WORD0 */
radeon_emit(cs, rbuffer->buf->size - offset - 1); /* RESOURCEi_WORD1 */
radeon_emit(cs, /* RESOURCEi_WORD2 */
S_038008_ENDIAN_SWAP(r600_endian_swap(32)) |
S_038008_STRIDE(16));
S_038008_ENDIAN_SWAP(gs_ring_buffer ? ENDIAN_NONE : r600_endian_swap(32)) |
S_038008_STRIDE(gs_ring_buffer ? 4 : 16));
radeon_emit(cs, 0); /* RESOURCEi_WORD3 */
radeon_emit(cs, 0); /* RESOURCEi_WORD4 */
radeon_emit(cs, 0); /* RESOURCEi_WORD5 */
@@ -2323,34 +2338,124 @@ static void r600_emit_vertex_fetch_shader(struct r600_context *rctx, struct r600
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, shader->buffer, RADEON_USAGE_READ));
}
static void r600_emit_shader_stages(struct r600_context *rctx, struct r600_atom *a)
{
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_shader_stages_state *state = (struct r600_shader_stages_state*)a;
uint32_t v2 = 0, primid = 0;
if (state->geom_enable) {
uint32_t cut_val;
if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 128)
cut_val = V_028A40_GS_CUT_128;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 256)
cut_val = V_028A40_GS_CUT_256;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 512)
cut_val = V_028A40_GS_CUT_512;
else
cut_val = V_028A40_GS_CUT_1024;
v2 = S_028A40_MODE(V_028A40_GS_SCENARIO_G) |
S_028A40_CUT_MODE(cut_val);
if (rctx->gs_shader->current->shader.gs_prim_id_input)
primid = 1;
}
r600_write_context_reg(cs, R_028A40_VGT_GS_MODE, v2);
r600_write_context_reg(cs, R_028A84_VGT_PRIMITIVEID_EN, primid);
}
static void r600_emit_gs_rings(struct r600_context *rctx, struct r600_atom *a)
{
struct pipe_screen *screen = rctx->b.b.screen;
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_gs_rings_state *state = (struct r600_gs_rings_state*)a;
struct r600_resource *rbuffer;
r600_write_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_VGT_FLUSH));
if (state->enable) {
rbuffer =(struct r600_resource*)state->esgs_ring.buffer;
r600_write_config_reg(cs, R_008C40_SQ_ESGS_RING_BASE,
(r600_resource_va(screen, &rbuffer->b.b)) >> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READWRITE));
r600_write_config_reg(cs, R_008C44_SQ_ESGS_RING_SIZE,
state->esgs_ring.buffer_size >> 8);
rbuffer =(struct r600_resource*)state->gsvs_ring.buffer;
r600_write_config_reg(cs, R_008C48_SQ_GSVS_RING_BASE,
(r600_resource_va(screen, &rbuffer->b.b)) >> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer, RADEON_USAGE_READWRITE));
r600_write_config_reg(cs, R_008C4C_SQ_GSVS_RING_SIZE,
state->gsvs_ring.buffer_size >> 8);
} else {
r600_write_config_reg(cs, R_008C44_SQ_ESGS_RING_SIZE, 0);
r600_write_config_reg(cs, R_008C4C_SQ_GSVS_RING_SIZE, 0);
}
r600_write_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_VGT_FLUSH));
}
/* Adjust GPR allocation on R6xx/R7xx */
bool r600_adjust_gprs(struct r600_context *rctx)
{
unsigned num_ps_gprs = rctx->ps_shader->current->shader.bc.ngpr;
unsigned num_vs_gprs = rctx->vs_shader->current->shader.bc.ngpr;
unsigned num_vs_gprs, num_es_gprs, num_gs_gprs;
unsigned new_num_ps_gprs = num_ps_gprs;
unsigned new_num_vs_gprs = num_vs_gprs;
unsigned new_num_vs_gprs, new_num_es_gprs, new_num_gs_gprs;
unsigned cur_num_ps_gprs = G_008C04_NUM_PS_GPRS(rctx->config_state.sq_gpr_resource_mgmt_1);
unsigned cur_num_vs_gprs = G_008C04_NUM_VS_GPRS(rctx->config_state.sq_gpr_resource_mgmt_1);
unsigned cur_num_gs_gprs = G_008C08_NUM_GS_GPRS(rctx->config_state.sq_gpr_resource_mgmt_2);
unsigned cur_num_es_gprs = G_008C08_NUM_ES_GPRS(rctx->config_state.sq_gpr_resource_mgmt_2);
unsigned def_num_ps_gprs = rctx->default_ps_gprs;
unsigned def_num_vs_gprs = rctx->default_vs_gprs;
unsigned def_num_gs_gprs = 0;
unsigned def_num_es_gprs = 0;
unsigned def_num_clause_temp_gprs = rctx->r6xx_num_clause_temp_gprs;
/* hardware will reserve twice num_clause_temp_gprs */
unsigned max_gprs = def_num_ps_gprs + def_num_vs_gprs + def_num_clause_temp_gprs * 2;
unsigned tmp;
unsigned max_gprs = def_num_gs_gprs + def_num_es_gprs + def_num_ps_gprs + def_num_vs_gprs + def_num_clause_temp_gprs * 2;
unsigned tmp, tmp2;
if (rctx->gs_shader) {
num_es_gprs = rctx->vs_shader->current->shader.bc.ngpr;
num_gs_gprs = rctx->gs_shader->current->shader.bc.ngpr;
num_vs_gprs = rctx->gs_shader->current->gs_copy_shader->shader.bc.ngpr;
} else {
num_es_gprs = 0;
num_gs_gprs = 0;
num_vs_gprs = rctx->vs_shader->current->shader.bc.ngpr;
}
new_num_vs_gprs = num_vs_gprs;
new_num_es_gprs = num_es_gprs;
new_num_gs_gprs = num_gs_gprs;
/* the sum of all SQ_GPR_RESOURCE_MGMT*.NUM_*_GPRS must be <= max_gprs */
if (new_num_ps_gprs > cur_num_ps_gprs || new_num_vs_gprs > cur_num_vs_gprs) {
if (new_num_ps_gprs > cur_num_ps_gprs || new_num_vs_gprs > cur_num_vs_gprs ||
new_num_es_gprs > cur_num_es_gprs || new_num_gs_gprs > cur_num_gs_gprs) {
/* try to switch back to default */
if (new_num_ps_gprs > def_num_ps_gprs || new_num_vs_gprs > def_num_vs_gprs) {
if (new_num_ps_gprs > def_num_ps_gprs || new_num_vs_gprs > def_num_vs_gprs ||
new_num_gs_gprs > def_num_gs_gprs || new_num_es_gprs > def_num_es_gprs) {
/* always privilege vs stage so that at worst we have the
* pixel stage producing wrong output (not the vertex
* stage) */
new_num_ps_gprs = max_gprs - (new_num_vs_gprs + def_num_clause_temp_gprs * 2);
new_num_ps_gprs = max_gprs - ((new_num_vs_gprs - new_num_es_gprs - new_num_gs_gprs) + def_num_clause_temp_gprs * 2);
new_num_vs_gprs = num_vs_gprs;
new_num_gs_gprs = num_gs_gprs;
new_num_es_gprs = num_es_gprs;
} else {
new_num_ps_gprs = def_num_ps_gprs;
new_num_vs_gprs = def_num_vs_gprs;
new_num_es_gprs = def_num_es_gprs;
new_num_gs_gprs = def_num_gs_gprs;
}
} else {
return true;
@@ -2362,10 +2467,11 @@ bool r600_adjust_gprs(struct r600_context *rctx)
* it will lockup. So in this case just discard the draw command
* and don't change the current gprs repartitions.
*/
if (num_ps_gprs > new_num_ps_gprs || num_vs_gprs > new_num_vs_gprs) {
R600_ERR("ps & vs shader require too many register (%d + %d) "
if (num_ps_gprs > new_num_ps_gprs || num_vs_gprs > new_num_vs_gprs ||
num_gs_gprs > new_num_gs_gprs || num_es_gprs > new_num_es_gprs) {
R600_ERR("shaders require too many register (%d + %d + %d + %d) "
"for a combined maximum of %d\n",
num_ps_gprs, num_vs_gprs, max_gprs);
num_ps_gprs, num_vs_gprs, num_es_gprs, num_gs_gprs, max_gprs);
return false;
}
@@ -2373,8 +2479,12 @@ bool r600_adjust_gprs(struct r600_context *rctx)
tmp = S_008C04_NUM_PS_GPRS(new_num_ps_gprs) |
S_008C04_NUM_VS_GPRS(new_num_vs_gprs) |
S_008C04_NUM_CLAUSE_TEMP_GPRS(def_num_clause_temp_gprs);
if (rctx->config_state.sq_gpr_resource_mgmt_1 != tmp) {
tmp2 = S_008C08_NUM_ES_GPRS(new_num_es_gprs) |
S_008C08_NUM_GS_GPRS(new_num_gs_gprs);
if (rctx->config_state.sq_gpr_resource_mgmt_1 != tmp || rctx->config_state.sq_gpr_resource_mgmt_2 != tmp2) {
rctx->config_state.sq_gpr_resource_mgmt_1 = tmp;
rctx->config_state.sq_gpr_resource_mgmt_2 = tmp2;
rctx->config_state.atom.dirty = true;
rctx->b.flags |= R600_CONTEXT_WAIT_3D_IDLE;
}
@@ -2492,19 +2602,19 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
num_es_stack_entries = 16;
break;
case CHIP_RV770:
num_ps_gprs = 192;
num_ps_gprs = 130;
num_vs_gprs = 56;
num_temp_gprs = 4;
num_gs_gprs = 0;
num_es_gprs = 0;
num_ps_threads = 188;
num_gs_gprs = 31;
num_es_gprs = 31;
num_ps_threads = 180;
num_vs_threads = 60;
num_gs_threads = 0;
num_es_threads = 0;
num_ps_stack_entries = 256;
num_vs_stack_entries = 256;
num_gs_stack_entries = 0;
num_es_stack_entries = 0;
num_gs_threads = 4;
num_es_threads = 4;
num_ps_stack_entries = 128;
num_vs_stack_entries = 128;
num_gs_stack_entries = 128;
num_es_stack_entries = 128;
break;
case CHIP_RV730:
case CHIP_RV740:
@@ -2513,10 +2623,10 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
num_temp_gprs = 4;
num_gs_gprs = 0;
num_es_gprs = 0;
num_ps_threads = 188;
num_ps_threads = 180;
num_vs_threads = 60;
num_gs_threads = 0;
num_es_threads = 0;
num_gs_threads = 4;
num_es_threads = 4;
num_ps_stack_entries = 128;
num_vs_stack_entries = 128;
num_gs_stack_entries = 0;
@@ -2528,10 +2638,10 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
num_temp_gprs = 4;
num_gs_gprs = 0;
num_es_gprs = 0;
num_ps_threads = 144;
num_ps_threads = 136;
num_vs_threads = 48;
num_gs_threads = 0;
num_es_threads = 0;
num_gs_threads = 4;
num_es_threads = 4;
num_ps_stack_entries = 128;
num_vs_stack_entries = 128;
num_gs_stack_entries = 0;
@@ -2707,9 +2817,12 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
r600_store_value(cb, 0); /* R_028240_PA_SC_GENERIC_SCISSOR_TL */
r600_store_value(cb, S_028244_BR_X(8192) | S_028244_BR_Y(8192)); /* R_028244_PA_SC_GENERIC_SCISSOR_BR */
r600_store_context_reg_seq(cb, R_0288CC_SQ_PGM_CF_OFFSET_PS, 2);
r600_store_context_reg_seq(cb, R_0288CC_SQ_PGM_CF_OFFSET_PS, 5);
r600_store_value(cb, 0); /* R_0288CC_SQ_PGM_CF_OFFSET_PS */
r600_store_value(cb, 0); /* R_0288D0_SQ_PGM_CF_OFFSET_VS */
r600_store_value(cb, 0); /* R_0288D4_SQ_PGM_CF_OFFSET_GS */
r600_store_value(cb, 0); /* R_0288D8_SQ_PGM_CF_OFFSET_ES */
r600_store_value(cb, 0); /* R_0288DC_SQ_PGM_CF_OFFSET_FS */
r600_store_context_reg(cb, R_0288E0_SQ_VTX_SEMANTIC_CLEAR, ~0);
@@ -2718,10 +2831,12 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
r600_store_value(cb, 0); /* R_028404_VGT_MIN_VTX_INDX */
r600_store_context_reg(cb, R_0288A4_SQ_PGM_RESOURCES_FS, 0);
r600_store_context_reg(cb, R_0288DC_SQ_PGM_CF_OFFSET_FS, 0);
if (rctx->b.chip_class == R700)
r600_store_context_reg(cb, R_028350_SX_MISC, 0);
if (rctx->b.chip_class == R700 && rctx->screen->b.has_streamout)
r600_store_context_reg(cb, R_028354_SX_SURFACE_SYNC, S_028354_SURFACE_SYNC_MASK(0xf));
r600_store_context_reg(cb, R_028800_DB_DEPTH_CONTROL, 0);
if (rctx->screen->b.has_streamout) {
r600_store_context_reg(cb, R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET, 0);
@@ -2729,6 +2844,7 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
r600_store_loop_const(cb, R_03E200_SQ_LOOP_CONST_0, 0x1000FFF);
r600_store_loop_const(cb, R_03E200_SQ_LOOP_CONST_0 + (32 * 4), 0x1000FFF);
r600_store_loop_const(cb, R_03E200_SQ_LOOP_CONST_0 + (64 * 4), 0x1000FFF);
}
void r600_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
@@ -2898,9 +3014,75 @@ void r600_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
S_02881C_VS_OUT_CCDIST0_VEC_ENA((rshader->clip_dist_write & 0x0F) != 0) |
S_02881C_VS_OUT_CCDIST1_VEC_ENA((rshader->clip_dist_write & 0xF0) != 0) |
S_02881C_VS_OUT_MISC_VEC_ENA(rshader->vs_out_misc_write) |
S_02881C_USE_VTX_POINT_SIZE(rshader->vs_out_point_size);
S_02881C_USE_VTX_POINT_SIZE(rshader->vs_out_point_size) |
S_02881C_USE_VTX_EDGE_FLAG(rshader->vs_out_edgeflag) |
S_02881C_USE_VTX_RENDER_TARGET_INDX(rshader->vs_out_layer);
}
void r600_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_command_buffer *cb = &shader->command_buffer;
struct r600_shader *rshader = &shader->shader;
struct r600_shader *cp_shader = &shader->gs_copy_shader->shader;
unsigned gsvs_itemsize =
(cp_shader->ring_item_size * rshader->gs_max_out_vertices) >> 2;
r600_init_command_buffer(cb, 64);
/* VGT_GS_MODE is written by r600_emit_shader_stages */
r600_store_context_reg(cb, R_028AB8_VGT_VTX_CNT_EN, 1);
if (rctx->b.chip_class >= R700) {
r600_store_context_reg(cb, R_028B38_VGT_GS_MAX_VERT_OUT,
S_028B38_MAX_VERT_OUT(rshader->gs_max_out_vertices));
}
r600_store_context_reg(cb, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(rshader->gs_output_prim));
r600_store_context_reg_seq(cb, R_0288C8_SQ_GS_VERT_ITEMSIZE, 4);
r600_store_value(cb, cp_shader->ring_item_size >> 2);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_context_reg(cb, R_0288A8_SQ_ESGS_RING_ITEMSIZE,
(rshader->ring_item_size) >> 2);
r600_store_context_reg(cb, R_0288AC_SQ_GSVS_RING_ITEMSIZE,
gsvs_itemsize);
/* FIXME calculate these values somehow ??? */
r600_store_config_reg_seq(cb, R_0088C8_VGT_GS_PER_ES, 2);
r600_store_value(cb, 0x80); /* GS_PER_ES */
r600_store_value(cb, 0x100); /* ES_PER_GS */
r600_store_config_reg_seq(cb, R_0088E8_VGT_GS_PER_VS, 1);
r600_store_value(cb, 0x2); /* GS_PER_VS */
r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_GS,
S_02887C_NUM_GPRS(rshader->bc.ngpr) |
S_02887C_STACK_SIZE(rshader->bc.nstack));
r600_store_context_reg(cb, R_02886C_SQ_PGM_START_GS,
r600_resource_va(ctx->screen, (void *)shader->bo) >> 8);
/* After that, the NOP relocation packet must be emitted (shader->bo, RADEON_USAGE_READ). */
}
void r600_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
{
struct r600_command_buffer *cb = &shader->command_buffer;
struct r600_shader *rshader = &shader->shader;
r600_init_command_buffer(cb, 32);
r600_store_context_reg(cb, R_028890_SQ_PGM_RESOURCES_ES,
S_028890_NUM_GPRS(rshader->bc.ngpr) |
S_028890_STACK_SIZE(rshader->bc.nstack));
r600_store_context_reg(cb, R_028880_SQ_PGM_START_ES,
r600_resource_va(ctx->screen, (void *)shader->bo) >> 8);
/* After that, the NOP relocation packet must be emitted (shader->bo, RADEON_USAGE_READ). */
}
void *r600_create_resolve_blend(struct r600_context *rctx)
{
struct pipe_blend_state blend;
@@ -3262,6 +3444,10 @@ void r600_init_state_functions(struct r600_context *rctx)
rctx->atoms[id++] = &rctx->b.streamout.begin_atom;
r600_init_atom(rctx, &rctx->vertex_shader.atom, id++, r600_emit_shader, 23);
r600_init_atom(rctx, &rctx->pixel_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->geometry_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->export_shader.atom, id++, r600_emit_shader, 0);
r600_init_atom(rctx, &rctx->shader_stages.atom, id++, r600_emit_shader_stages, 0);
r600_init_atom(rctx, &rctx->gs_rings.atom, id++, r600_emit_gs_rings, 0);
rctx->b.b.create_blend_state = r600_create_blend_state;
rctx->b.b.create_depth_stencil_alpha_state = r600_create_dsa_state;
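
Note on two small calculations in the hunks above: `r600_emit_shader_stages` picks a GS cut mode from the shader's maximum output vertex count, and `r600_update_gs_state` derives the GSVS ring item size from the copy shader's per-vertex size. The sketch below restates both as standalone helpers. It is illustrative only: the enum labels are hypothetical and are not the `V_028A40_GS_CUT_*` register encodings, and the helper names do not exist in the driver.

```c
#include <assert.h>

/* Hypothetical labels for the four cut modes; NOT the hardware encodings. */
enum gs_cut_mode { CUT_128, CUT_256, CUT_512, CUT_1024 };

/* Choose the smallest cut bucket that can hold max_out_vertices,
 * following the threshold chain in r600_emit_shader_stages. */
static enum gs_cut_mode pick_cut_mode(unsigned max_out_vertices)
{
    if (max_out_vertices <= 128)
        return CUT_128;
    if (max_out_vertices <= 256)
        return CUT_256;
    if (max_out_vertices <= 512)
        return CUT_512;
    return CUT_1024;
}

/* GSVS ring item size in dwords: bytes per vertex times the maximum
 * number of emitted vertices, divided by 4 (the ">> 2" in the diff). */
static unsigned gsvs_itemsize_dwords(unsigned vertex_bytes, unsigned max_out_vertices)
{
    return (vertex_bytes * max_out_vertices) >> 2;
}
```

For example, a shader that emits at most 4 vertices of 32 bytes each falls in the 128-vertex bucket and needs a 32-dword GSVS item.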


@@ -301,11 +301,6 @@ static void r600_bind_dsa_state(struct pipe_context *ctx, void *state)
rctx->alphatest_state.sx_alpha_test_control = dsa->sx_alpha_test_control;
rctx->alphatest_state.sx_alpha_ref = dsa->alpha_ref;
rctx->alphatest_state.atom.dirty = true;
if (rctx->b.chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
}
}
@@ -698,6 +693,8 @@ static INLINE struct r600_shader_key r600_shader_selector_key(struct pipe_contex
/* Dual-source blending only makes sense with nr_cbufs == 1. */
if (key.nr_cbufs == 1 && rctx->dual_src_blend)
key.nr_cbufs = 2;
} else if (sel->type == PIPE_SHADER_VERTEX) {
key.vs_as_es = (rctx->gs_shader != NULL);
}
return key;
}
@@ -709,7 +706,6 @@ static int r600_shader_select(struct pipe_context *ctx,
bool *dirty)
{
struct r600_shader_key key;
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_pipe_shader * shader = NULL;
int r;
@@ -771,11 +767,6 @@ static int r600_shader_select(struct pipe_context *ctx,
shader->next_variant = sel->current;
sel->current = shader;
if (rctx->ps_shader &&
rctx->cb_misc_state.nr_ps_color_outputs != rctx->ps_shader->current->nr_ps_color_outputs) {
rctx->cb_misc_state.nr_ps_color_outputs = rctx->ps_shader->current->nr_ps_color_outputs;
rctx->cb_misc_state.atom.dirty = true;
}
return 0;
}
@@ -784,16 +775,10 @@ static void *r600_create_shader_state(struct pipe_context *ctx,
unsigned pipe_shader_type)
{
struct r600_pipe_shader_selector *sel = CALLOC_STRUCT(r600_pipe_shader_selector);
int r;
sel->type = pipe_shader_type;
sel->tokens = tgsi_dup_tokens(state->tokens);
sel->so = state->stream_output;
r = r600_shader_select(ctx, sel, NULL);
if (r)
return NULL;
return sel;
}
@@ -809,6 +794,12 @@ static void *r600_create_vs_state(struct pipe_context *ctx,
return r600_create_shader_state(ctx, state, PIPE_SHADER_VERTEX);
}
static void *r600_create_gs_state(struct pipe_context *ctx,
const struct pipe_shader_state *state)
{
return r600_create_shader_state(ctx, state, PIPE_SHADER_GEOMETRY);
}
static void r600_bind_ps_state(struct pipe_context *ctx, void *state)
{
struct r600_context *rctx = (struct r600_context *)ctx;
@@ -816,31 +807,7 @@ static void r600_bind_ps_state(struct pipe_context *ctx, void *state)
if (!state)
state = rctx->dummy_pixel_shader;
rctx->pixel_shader.shader = rctx->ps_shader = (struct r600_pipe_shader_selector *)state;
rctx->pixel_shader.atom.num_dw = rctx->ps_shader->current->command_buffer.num_dw;
rctx->pixel_shader.atom.dirty = true;
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->ps_shader->current->bo);
if (rctx->b.chip_class <= R700) {
bool multiwrite = rctx->ps_shader->current->shader.fs_write_all;
if (rctx->cb_misc_state.multiwrite != multiwrite) {
rctx->cb_misc_state.multiwrite = multiwrite;
rctx->cb_misc_state.atom.dirty = true;
}
}
if (rctx->cb_misc_state.nr_ps_color_outputs != rctx->ps_shader->current->nr_ps_color_outputs) {
rctx->cb_misc_state.nr_ps_color_outputs = rctx->ps_shader->current->nr_ps_color_outputs;
rctx->cb_misc_state.atom.dirty = true;
}
if (rctx->b.chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
rctx->ps_shader = (struct r600_pipe_shader_selector *)state;
}
static void r600_bind_vs_state(struct pipe_context *ctx, void *state)
@@ -850,19 +817,19 @@ static void r600_bind_vs_state(struct pipe_context *ctx, void *state)
if (!state)
return;
rctx->vertex_shader.shader = rctx->vs_shader = (struct r600_pipe_shader_selector *)state;
rctx->vertex_shader.atom.dirty = true;
rctx->vs_shader = (struct r600_pipe_shader_selector *)state;
rctx->b.streamout.stride_in_dw = rctx->vs_shader->so.stride;
}
r600_context_add_resource_size(ctx, (struct pipe_resource *)rctx->vs_shader->current->bo);
static void r600_bind_gs_state(struct pipe_context *ctx, void *state)
{
struct r600_context *rctx = (struct r600_context *)ctx;
/* Update clip misc state. */
if (rctx->vs_shader->current->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->vs_shader->current->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
rctx->clip_misc_state.pa_cl_vs_out_cntl = rctx->vs_shader->current->pa_cl_vs_out_cntl;
rctx->clip_misc_state.clip_dist_write = rctx->vs_shader->current->shader.clip_dist_write;
rctx->clip_misc_state.atom.dirty = true;
}
rctx->gs_shader = (struct r600_pipe_shader_selector *)state;
if (!state)
return;
rctx->b.streamout.stride_in_dw = rctx->gs_shader->so.stride;
}
static void r600_delete_shader_selector(struct pipe_context *ctx,
@@ -905,6 +872,20 @@ static void r600_delete_vs_state(struct pipe_context *ctx, void *state)
r600_delete_shader_selector(ctx, sel);
}
static void r600_delete_gs_state(struct pipe_context *ctx, void *state)
{
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_pipe_shader_selector *sel = (struct r600_pipe_shader_selector *)state;
if (rctx->gs_shader == sel) {
rctx->gs_shader = NULL;
}
r600_delete_shader_selector(ctx, sel);
}
void r600_constant_buffers_dirty(struct r600_context *rctx, struct r600_constbuf_state *state)
{
if (state->dirty_mask) {
@@ -1098,10 +1079,65 @@ static void r600_setup_txq_cube_array_constants(struct r600_context *rctx, int s
pipe_resource_reference(&cb.buffer, NULL);
}
static void update_shader_atom(struct pipe_context *ctx,
struct r600_shader_state *state,
struct r600_pipe_shader *shader)
{
state->shader = shader;
if (shader) {
state->atom.num_dw = shader->command_buffer.num_dw;
state->atom.dirty = true;
r600_context_add_resource_size(ctx, (struct pipe_resource *)shader->bo);
} else {
state->atom.num_dw = 0;
state->atom.dirty = false;
}
}
static void update_gs_block_state(struct r600_context *rctx, unsigned enable)
{
if (rctx->shader_stages.geom_enable != enable) {
rctx->shader_stages.geom_enable = enable;
rctx->shader_stages.atom.dirty = true;
}
if (rctx->gs_rings.enable != enable) {
rctx->gs_rings.enable = enable;
rctx->gs_rings.atom.dirty = true;
if (enable && !rctx->gs_rings.esgs_ring.buffer) {
unsigned size = 0x1C000;
rctx->gs_rings.esgs_ring.buffer =
pipe_buffer_create(rctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_STATIC, size);
rctx->gs_rings.esgs_ring.buffer_size = size;
size = 0x4000000;
rctx->gs_rings.gsvs_ring.buffer =
pipe_buffer_create(rctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_STATIC, size);
rctx->gs_rings.gsvs_ring.buffer_size = size;
}
if (enable) {
r600_set_constant_buffer(&rctx->b.b, PIPE_SHADER_GEOMETRY,
R600_GS_RING_CONST_BUFFER, &rctx->gs_rings.esgs_ring);
r600_set_constant_buffer(&rctx->b.b, PIPE_SHADER_VERTEX,
R600_GS_RING_CONST_BUFFER, &rctx->gs_rings.gsvs_ring);
} else {
r600_set_constant_buffer(&rctx->b.b, PIPE_SHADER_GEOMETRY,
R600_GS_RING_CONST_BUFFER, NULL);
r600_set_constant_buffer(&rctx->b.b, PIPE_SHADER_VERTEX,
R600_GS_RING_CONST_BUFFER, NULL);
}
}
}
static bool r600_update_derived_state(struct r600_context *rctx)
{
struct pipe_context * ctx = (struct pipe_context*)rctx;
bool ps_dirty = false;
bool ps_dirty = false, vs_dirty = false, gs_dirty = false;
bool blend_disable;
if (!rctx->blitter->running) {
@@ -1119,23 +1155,101 @@ static bool r600_update_derived_state(struct r600_context *rctx)
}
}
r600_shader_select(ctx, rctx->ps_shader, &ps_dirty);
update_gs_block_state(rctx, rctx->gs_shader != NULL);
if (rctx->ps_shader && rctx->rasterizer &&
((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
(rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade))) {
if (rctx->gs_shader) {
r600_shader_select(ctx, rctx->gs_shader, &gs_dirty);
if (unlikely(!rctx->gs_shader->current))
return false;
if (rctx->b.chip_class >= EVERGREEN)
evergreen_update_ps_state(ctx, rctx->ps_shader->current);
else
r600_update_ps_state(ctx, rctx->ps_shader->current);
if (!rctx->shader_stages.geom_enable) {
rctx->shader_stages.geom_enable = true;
rctx->shader_stages.atom.dirty = true;
}
ps_dirty = true;
/* gs_shader provides GS and VS (copy shader) */
if (unlikely(rctx->geometry_shader.shader != rctx->gs_shader->current)) {
update_shader_atom(ctx, &rctx->geometry_shader, rctx->gs_shader->current);
update_shader_atom(ctx, &rctx->vertex_shader, rctx->gs_shader->current->gs_copy_shader);
/* Update clip misc state. */
if (rctx->gs_shader->current->gs_copy_shader->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->gs_shader->current->gs_copy_shader->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
rctx->clip_misc_state.pa_cl_vs_out_cntl = rctx->gs_shader->current->gs_copy_shader->pa_cl_vs_out_cntl;
rctx->clip_misc_state.clip_dist_write = rctx->gs_shader->current->gs_copy_shader->shader.clip_dist_write;
rctx->clip_misc_state.atom.dirty = true;
}
}
r600_shader_select(ctx, rctx->vs_shader, &vs_dirty);
if (unlikely(!rctx->vs_shader->current))
return false;
/* vs_shader is used as ES */
if (unlikely(vs_dirty || rctx->export_shader.shader != rctx->vs_shader->current)) {
update_shader_atom(ctx, &rctx->export_shader, rctx->vs_shader->current);
}
} else {
if (unlikely(rctx->geometry_shader.shader)) {
update_shader_atom(ctx, &rctx->geometry_shader, NULL);
update_shader_atom(ctx, &rctx->export_shader, NULL);
rctx->shader_stages.geom_enable = false;
rctx->shader_stages.atom.dirty = true;
}
r600_shader_select(ctx, rctx->vs_shader, &vs_dirty);
if (unlikely(!rctx->vs_shader->current))
return false;
if (unlikely(vs_dirty || rctx->vertex_shader.shader != rctx->vs_shader->current)) {
update_shader_atom(ctx, &rctx->vertex_shader, rctx->vs_shader->current);
/* Update clip misc state. */
if (rctx->vs_shader->current->pa_cl_vs_out_cntl != rctx->clip_misc_state.pa_cl_vs_out_cntl ||
rctx->vs_shader->current->shader.clip_dist_write != rctx->clip_misc_state.clip_dist_write) {
rctx->clip_misc_state.pa_cl_vs_out_cntl = rctx->vs_shader->current->pa_cl_vs_out_cntl;
rctx->clip_misc_state.clip_dist_write = rctx->vs_shader->current->shader.clip_dist_write;
rctx->clip_misc_state.atom.dirty = true;
}
}
}
if (ps_dirty) {
rctx->pixel_shader.atom.num_dw = rctx->ps_shader->current->command_buffer.num_dw;
rctx->pixel_shader.atom.dirty = true;
r600_shader_select(ctx, rctx->ps_shader, &ps_dirty);
if (unlikely(!rctx->ps_shader->current))
return false;
if (unlikely(ps_dirty || rctx->pixel_shader.shader != rctx->ps_shader->current)) {
if (rctx->cb_misc_state.nr_ps_color_outputs != rctx->ps_shader->current->nr_ps_color_outputs) {
rctx->cb_misc_state.nr_ps_color_outputs = rctx->ps_shader->current->nr_ps_color_outputs;
rctx->cb_misc_state.atom.dirty = true;
}
if (rctx->b.chip_class <= R700) {
bool multiwrite = rctx->ps_shader->current->shader.fs_write_all;
if (rctx->cb_misc_state.multiwrite != multiwrite) {
rctx->cb_misc_state.multiwrite = multiwrite;
rctx->cb_misc_state.atom.dirty = true;
}
}
if (rctx->b.chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
if (unlikely(!ps_dirty && rctx->ps_shader && rctx->rasterizer &&
((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
(rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade)))) {
if (rctx->b.chip_class >= EVERGREEN)
evergreen_update_ps_state(ctx, rctx->ps_shader->current);
else
r600_update_ps_state(ctx, rctx->ps_shader->current);
}
update_shader_atom(ctx, &rctx->pixel_shader, rctx->ps_shader->current);
}
/* on R600 we stuff masks + txq info into one constant buffer */
@@ -1145,11 +1259,15 @@ static bool r600_update_derived_state(struct r600_context *rctx)
r600_setup_buffer_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.uses_tex_buffers)
r600_setup_buffer_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.uses_tex_buffers)
r600_setup_buffer_constants(rctx, PIPE_SHADER_GEOMETRY);
} else {
if (rctx->ps_shader && rctx->ps_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_GEOMETRY);
}
@@ -1157,6 +1275,8 @@ static bool r600_update_derived_state(struct r600_context *rctx)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.has_txq_cube_array_z_comp)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.has_txq_cube_array_z_comp)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_GEOMETRY);
if (rctx->b.chip_class < EVERGREEN && rctx->ps_shader && rctx->vs_shader) {
if (!r600_adjust_gprs(rctx)) {
@@ -1174,33 +1294,10 @@ static bool r600_update_derived_state(struct r600_context *rctx)
rctx->blend_state.cso,
blend_disable);
}
return true;
}
static unsigned r600_conv_prim_to_gs_out(unsigned mode)
{
static const int prim_conv[] = {
V_028A6C_OUTPRIM_TYPE_POINTLIST,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP
};
assert(mode < Elements(prim_conv));
return prim_conv[mode];
}
void r600_emit_clip_misc_state(struct r600_context *rctx, struct r600_atom *atom)
{
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
@@ -1227,7 +1324,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
return;
}
if (!rctx->vs_shader) {
if (!rctx->vs_shader || !rctx->ps_shader) {
assert(0);
return;
}
@@ -1311,6 +1408,25 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
r600_emit_atom(rctx, rctx->atoms[i]);
}
/* On R6xx, CULL_FRONT=1 culls all points, lines, and rectangles,
* even though it should have no effect on those. */
if (rctx->b.chip_class == R600 && rctx->rasterizer) {
unsigned su_sc_mode_cntl = rctx->rasterizer->pa_su_sc_mode_cntl;
unsigned prim = info.mode;
if (rctx->gs_shader) {
prim = rctx->gs_shader->current->shader.gs_output_prim;
}
prim = r600_conv_prim_to_gs_out(prim); /* decrease the number of types to 3 */
if (prim == V_028A6C_OUTPRIM_TYPE_POINTLIST ||
prim == V_028A6C_OUTPRIM_TYPE_LINESTRIP ||
info.mode == R600_PRIM_RECTANGLE_LIST) {
su_sc_mode_cntl &= C_028814_CULL_FRONT;
}
r600_write_context_reg(cs, R_028814_PA_SU_SC_MODE_CNTL, su_sc_mode_cntl);
}
/* Update start instance. */
if (rctx->last_start_instance != info.start_instance) {
r600_write_ctl_const(cs, R_03CFF4_SQ_VTX_START_INST_LOC, info.start_instance);
@@ -1330,8 +1446,6 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
r600_write_context_reg(cs, R_028A0C_PA_SC_LINE_STIPPLE,
S_028A0C_AUTO_RESET_CNTL(ls_mask) |
(rctx->rasterizer ? rctx->rasterizer->pa_sc_line_stipple : 0));
r600_write_context_reg(cs, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(info.mode));
r600_write_config_reg(cs, R_008958_VGT_PRIMITIVE_TYPE,
r600_conv_pipe_prim(info.mode));
@@ -1615,11 +1729,14 @@ bool sampler_state_needs_border_color(const struct pipe_sampler_state *state)
void r600_emit_shader(struct r600_context *rctx, struct r600_atom *a)
{
struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
struct r600_pipe_shader *shader = ((struct r600_shader_state*)a)->shader->current;
struct r600_pipe_shader *shader = ((struct r600_shader_state*)a)->shader;
if (!shader)
return;
r600_emit_command_buffer(cs, &shader->command_buffer);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, shader->bo, RADEON_USAGE_READ));
}
@@ -1633,7 +1750,6 @@ struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe,
assert(templ->u.tex.first_layer <= util_max_layer(texture, templ->u.tex.level));
assert(templ->u.tex.last_layer <= util_max_layer(texture, templ->u.tex.level));
assert(templ->u.tex.first_layer == templ->u.tex.last_layer);
if (surface == NULL)
return NULL;
pipe_reference_init(&surface->base.reference, 1);
@@ -2148,6 +2264,7 @@ void r600_init_common_state_functions(struct r600_context *rctx)
{
rctx->b.b.create_fs_state = r600_create_ps_state;
rctx->b.b.create_vs_state = r600_create_vs_state;
rctx->b.b.create_gs_state = r600_create_gs_state;
rctx->b.b.create_vertex_elements_state = r600_create_vertex_fetch_shader;
rctx->b.b.bind_blend_state = r600_bind_blend_state;
rctx->b.b.bind_depth_stencil_alpha_state = r600_bind_dsa_state;
@@ -2156,6 +2273,7 @@ void r600_init_common_state_functions(struct r600_context *rctx)
rctx->b.b.bind_rasterizer_state = r600_bind_rs_state;
rctx->b.b.bind_vertex_elements_state = r600_bind_vertex_elements;
rctx->b.b.bind_vs_state = r600_bind_vs_state;
rctx->b.b.bind_gs_state = r600_bind_gs_state;
rctx->b.b.delete_blend_state = r600_delete_blend_state;
rctx->b.b.delete_depth_stencil_alpha_state = r600_delete_dsa_state;
rctx->b.b.delete_fs_state = r600_delete_ps_state;
@@ -2163,6 +2281,7 @@ void r600_init_common_state_functions(struct r600_context *rctx)
rctx->b.b.delete_sampler_state = r600_delete_sampler_state;
rctx->b.b.delete_vertex_elements_state = r600_delete_vertex_elements;
rctx->b.b.delete_vs_state = r600_delete_vs_state;
rctx->b.b.delete_gs_state = r600_delete_gs_state;
rctx->b.b.set_blend_color = r600_set_blend_color;
rctx->b.b.set_clip_state = r600_set_clip_state;
rctx->b.b.set_constant_buffer = r600_set_constant_buffer;
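
The R6xx workaround added to r600_draw_vbo above can be read in isolation as a small pure function. The following is a minimal sketch, not the driver code: the enum names are hypothetical stand-ins for the three reduced V_028A6C_OUTPRIM_TYPE_* values, and the mask value assumes that CULL_FRONT occupies bit 0 of PA_SU_SC_MODE_CNTL, which this diff does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three reduced output primitive types. */
enum out_prim { OUT_POINTLIST, OUT_LINESTRIP, OUT_TRISTRIP };

/* Assumption: CULL_FRONT is bit 0, so the "clear" mask drops only that bit. */
#define CULL_FRONT_CLEAR_MASK 0xFFFFFFFEu

/* Returns the value to program into the register: CULL_FRONT is cleared for
 * points, lines and rectangle lists, and left untouched for triangles. */
static uint32_t r6xx_fixup_cull_front(uint32_t su_sc_mode_cntl,
                                      enum out_prim prim,
                                      bool is_rect_list)
{
    assert(prim <= OUT_TRISTRIP);
    if (prim == OUT_POINTLIST || prim == OUT_LINESTRIP || is_rect_list)
        su_sc_mode_cntl &= CULL_FRONT_CLEAR_MASK;
    return su_sc_mode_cntl;
}
```

In the diff itself, the primitive type is taken from the geometry shader's gs_output_prim when a geometry shader is bound, and from info.mode otherwise, before it is reduced to one of the three types.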


@@ -123,6 +123,7 @@
#define EVENT_TYPE_SO_VGTSTREAMOUT_FLUSH 0x1f
#define EVENT_TYPE_SAMPLE_STREAMOUTSTATS 0x20
#define EVENT_TYPE_FLUSH_AND_INV_DB_META 0x2c /* supported on r700+ */
#define EVENT_TYPE_VGT_FLUSH 0x24
#define EVENT_TYPE_FLUSH_AND_INV_CB_META 46 /* supported on r700+ */
#define EVENT_TYPE(x) ((x) << 0)
#define EVENT_INDEX(x) ((x) << 8)
@@ -200,6 +201,19 @@
/* Registers */
#define R_008490_CP_STRMOUT_CNTL 0x008490
#define S_008490_OFFSET_UPDATE_DONE(x) (((x) & 0x1) << 0)
#define R_008C40_SQ_ESGS_RING_BASE 0x008C40
#define R_008C44_SQ_ESGS_RING_SIZE 0x008C44
#define R_008C48_SQ_GSVS_RING_BASE 0x008C48
#define R_008C4C_SQ_GSVS_RING_SIZE 0x008C4C
#define R_008C50_SQ_ESTMP_RING_BASE 0x008C50
#define R_008C54_SQ_ESTMP_RING_SIZE 0x008C54
#define R_008C50_SQ_GSTMP_RING_BASE 0x008C58
#define R_008C54_SQ_GSTMP_RING_SIZE 0x008C5C
#define R_0088C8_VGT_GS_PER_ES 0x0088C8
#define R_0088CC_VGT_ES_PER_GS 0x0088CC
#define R_0088E8_VGT_GS_PER_VS 0x0088E8
#define R_008960_VGT_STRMOUT_BUFFER_FILLED_SIZE_0 0x008960 /* read-only */
#define R_008964_VGT_STRMOUT_BUFFER_FILLED_SIZE_1 0x008964 /* read-only */
#define R_008968_VGT_STRMOUT_BUFFER_FILLED_SIZE_2 0x008968 /* read-only */
@@ -529,6 +543,9 @@
#define S_028810_VTX_KILL_OR(x) (((x) & 0x1) << 21)
#define G_028810_VTX_KILL_OR(x) (((x) >> 21) & 0x1)
#define C_028810_VTX_KILL_OR 0xFFDFFFFF
#define S_028810_DX_RASTERIZATION_KILL(x) (((x) & 0x1) << 22) /* R700 only? */
#define G_028810_DX_RASTERIZATION_KILL(x) (((x) >> 22) & 0x1)
#define C_028810_DX_RASTERIZATION_KILL 0xFFBFFFFF
#define S_028810_DX_LINEAR_ATTR_CLIP_ENA(x) (((x) & 0x1) << 24)
#define G_028810_DX_LINEAR_ATTR_CLIP_ENA(x) (((x) >> 24) & 0x1)
#define C_028810_DX_LINEAR_ATTR_CLIP_ENA 0xFEFFFFFF
@@ -804,6 +821,9 @@
#define S_028D10_IGNORE_SC_ZRANGE(x) (((x) & 0x1) << 17)
#define G_028D10_IGNORE_SC_ZRANGE(x) (((x) >> 17) & 0x1)
#define C_028D10_IGNORE_SC_ZRANGE 0xFFFDFFFF
#define S_028D10_MAX_TILES_IN_DTT(x) (((x) & 0x1F) << 21)
#define G_028D10_MAX_TILES_IN_DTT(x) (((x) >> 21) & 0x1F)
#define C_028D10_MAX_TILES_IN_DTT 0xFC1FFFFF
#define R_02880C_DB_SHADER_CONTROL 0x02880C
#define S_02880C_Z_EXPORT_ENABLE(x) (((x) & 0x1) << 0)
#define G_02880C_Z_EXPORT_ENABLE(x) (((x) >> 0) & 0x1)
@@ -1824,12 +1844,20 @@
#define S_028A40_MODE(x) (((x) & 0x3) << 0)
#define G_028A40_MODE(x) (((x) >> 0) & 0x3)
#define C_028A40_MODE 0xFFFFFFFC
#define V_028A40_GS_OFF 0
#define V_028A40_GS_SCENARIO_A 1
#define V_028A40_GS_SCENARIO_B 2
#define V_028A40_GS_SCENARIO_G 3
#define S_028A40_ES_PASSTHRU(x) (((x) & 0x1) << 2)
#define G_028A40_ES_PASSTHRU(x) (((x) >> 2) & 0x1)
#define C_028A40_ES_PASSTHRU 0xFFFFFFFB
#define S_028A40_CUT_MODE(x) (((x) & 0x3) << 3)
#define G_028A40_CUT_MODE(x) (((x) >> 3) & 0x3)
#define C_028A40_CUT_MODE 0xFFFFFFE7
#define V_028A40_GS_CUT_1024 0
#define V_028A40_GS_CUT_512 1
#define V_028A40_GS_CUT_256 2
#define V_028A40_GS_CUT_128 3
#define R_008DFC_SQ_CF_WORD0 0x008DFC
#define S_008DFC_ADDR(x) (((x) & 0xFFFFFFFF) << 0)
#define G_008DFC_ADDR(x) (((x) >> 0) & 0xFFFFFFFF)
@@ -2332,6 +2360,26 @@
#define S_028D44_ALPHA_TO_MASK_OFFSET3(x) (((x) & 0x3) << 14)
#define S_028D44_OFFSET_ROUND(x) (((x) & 0x1) << 16)
#define R_028868_SQ_PGM_RESOURCES_VS 0x028868
#define R_028890_SQ_PGM_RESOURCES_ES 0x028890
#define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0)
#define G_028890_NUM_GPRS(x) (((x) >> 0) & 0xFF)
#define C_028890_NUM_GPRS 0xFFFFFF00
#define S_028890_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_028890_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_028890_STACK_SIZE 0xFFFF00FF
#define S_028890_DX10_CLAMP(x) (((x) & 0x1) << 21)
#define G_028890_DX10_CLAMP(x) (((x) >> 21) & 0x1)
#define C_028890_DX10_CLAMP 0xFFDFFFFF
#define R_02887C_SQ_PGM_RESOURCES_GS 0x02887C
#define S_02887C_NUM_GPRS(x) (((x) & 0xFF) << 0)
#define G_02887C_NUM_GPRS(x) (((x) >> 0) & 0xFF)
#define C_02887C_NUM_GPRS 0xFFFFFF00
#define S_02887C_STACK_SIZE(x) (((x) & 0xFF) << 8)
#define G_02887C_STACK_SIZE(x) (((x) >> 8) & 0xFF)
#define C_02887C_STACK_SIZE 0xFFFF00FF
#define S_02887C_DX10_CLAMP(x) (((x) & 0x1) << 21)
#define G_02887C_DX10_CLAMP(x) (((x) >> 21) & 0x1)
#define C_02887C_DX10_CLAMP 0xFFDFFFFF
#define R_0286CC_SPI_PS_IN_CONTROL_0 0x0286CC
#define R_0286D0_SPI_PS_IN_CONTROL_1 0x0286D0
#define R_028644_SPI_PS_INPUT_CNTL_0 0x028644
@@ -2421,11 +2469,15 @@
#define G_028C04_MAX_SAMPLE_DIST(x) (((x) >> 13) & 0xF)
#define C_028C04_MAX_SAMPLE_DIST 0xFFFE1FFF
#define R_0288CC_SQ_PGM_CF_OFFSET_PS 0x0288CC
#define R_0288DC_SQ_PGM_CF_OFFSET_FS 0x0288DC
#define R_0288D0_SQ_PGM_CF_OFFSET_VS 0x0288D0
#define R_0288D4_SQ_PGM_CF_OFFSET_GS 0x0288D4
#define R_0288D8_SQ_PGM_CF_OFFSET_ES 0x0288D8
#define R_0288DC_SQ_PGM_CF_OFFSET_FS 0x0288DC
#define R_028840_SQ_PGM_START_PS 0x028840
#define R_028894_SQ_PGM_START_FS 0x028894
#define R_028858_SQ_PGM_START_VS 0x028858
#define R_02886C_SQ_PGM_START_GS 0x02886C
#define R_028880_SQ_PGM_START_ES 0x028880
#define R_028080_CB_COLOR0_VIEW 0x028080
#define S_028080_SLICE_START(x) (((x) & 0x7FF) << 0)
#define G_028080_SLICE_START(x) (((x) >> 0) & 0x7FF)
@@ -2863,6 +2915,7 @@
#define R_0283F4_SQ_VTX_SEMANTIC_29 0x0283F4
#define R_0283F8_SQ_VTX_SEMANTIC_30 0x0283F8
#define R_0283FC_SQ_VTX_SEMANTIC_31 0x0283FC
#define R_0288C8_SQ_GS_VERT_ITEMSIZE 0x0288C8
#define R_0288E0_SQ_VTX_SEMANTIC_CLEAR 0x0288E0
#define R_028400_VGT_MAX_VTX_INDX 0x028400
#define S_028400_MAX_INDX(x) (((x) & 0xFFFFFFFF) << 0)
@@ -3287,6 +3340,8 @@
#define R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET 0x028B28
#define R_028B2C_VGT_STRMOUT_DRAW_OPAQUE_BUFFER_FILLED_SIZE 0x028B2C
#define R_028B30_VGT_STRMOUT_DRAW_OPAQUE_VERTEX_STRIDE 0x028B30
#define R_028B38_VGT_GS_MAX_VERT_OUT 0x028B38 /* r7xx */
#define S_028B38_MAX_VERT_OUT(x) (((x) & 0x7FF) << 0)
#define R_028B44_VGT_STRMOUT_BASE_OFFSET_HI_0 0x028B44
#define R_028B48_VGT_STRMOUT_BASE_OFFSET_HI_1 0x028B48
#define R_028B4C_VGT_STRMOUT_BASE_OFFSET_HI_2 0x028B4C
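
The register header above follows a fixed naming convention: S_ macros shift a value into its field, G_ macros extract it, and C_ constants are the mask that clears the field. A minimal sketch of how the three fit together, using the CUT_MODE definitions copied from the hunk above; the helper set_cut_mode is hypothetical and not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Field macros copied from the 0x028A40 definitions in the diff. */
#define S_028A40_CUT_MODE(x)  (((x) & 0x3) << 3)
#define G_028A40_CUT_MODE(x)  (((x) >> 3) & 0x3)
#define C_028A40_CUT_MODE     0xFFFFFFE7
#define V_028A40_GS_CUT_512   1

/* Hypothetical helper: replace CUT_MODE while preserving all other bits. */
static uint32_t set_cut_mode(uint32_t reg, uint32_t mode)
{
    assert(mode <= 0x3);
    return (reg & C_028A40_CUT_MODE) | S_028A40_CUT_MODE(mode);
}
```

The same clear-then-set idiom appears in the draw path of this diff, where `su_sc_mode_cntl &= C_028814_CULL_FRONT` clears a single field without touching the rest of the register.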


@@ -169,8 +169,10 @@ enum shader_target
{
TARGET_UNKNOWN,
TARGET_VS,
TARGET_ES,
TARGET_PS,
TARGET_GS,
TARGET_GS_COPY,
TARGET_COMPUTE,
TARGET_FETCH,


@@ -137,7 +137,7 @@ void bc_dump::dump(cf_node& n) {
for (int k = 0; k < 4; ++k)
s << chans[n.bc.sel[k]];
} else if (n.bc.op_ptr->flags & (CF_STRM | CF_RAT)) {
} else if (n.bc.op_ptr->flags & CF_MEM) {
static const char *exp_type[] = {"WRITE", "WRITE_IND", "WRITE_ACK",
"WRITE_IND_ACK"};
fill_to(s, 18);
@@ -150,6 +150,9 @@ void bc_dump::dump(cf_node& n) {
if ((n.bc.op_ptr->flags & CF_RAT) && (n.bc.type & 1)) {
s << ", @R" << n.bc.index_gpr << ".xyz";
}
if ((n.bc.op_ptr->flags & CF_MEM) && (n.bc.type & 1)) {
s << ", @R" << n.bc.index_gpr << ".x";
}
s << " ES:" << n.bc.elem_size;


@@ -63,7 +63,7 @@ int bc_finalizer::run() {
// workaround for some problems on r6xx/7xx
// add ALU NOP to each vertex shader
if (!ctx.is_egcm() && sh.target == TARGET_VS) {
if (!ctx.is_egcm() && (sh.target == TARGET_VS || sh.target == TARGET_ES)) {
cf_node *c = sh.create_clause(NST_ALU_CLAUSE);
alu_group_node *g = sh.create_alu_group();
@@ -695,7 +695,7 @@ void bc_finalizer::finalize_cf(cf_node* c) {
c->bc.rw_gpr = reg >= 0 ? reg : 0;
c->bc.comp_mask = mask;
if ((flags & CF_RAT) && (c->bc.type & 1)) {
if (((flags & CF_RAT) || (!(flags & CF_STRM))) && (c->bc.type & 1)) {
reg = -1;


@@ -58,7 +58,10 @@ int bc_parser::decode() {
if (pshader) {
switch (bc->type) {
case TGSI_PROCESSOR_FRAGMENT: t = TARGET_PS; break;
case TGSI_PROCESSOR_VERTEX: t = TARGET_VS; break;
case TGSI_PROCESSOR_VERTEX:
t = pshader->vs_as_es ? TARGET_ES : TARGET_VS;
break;
case TGSI_PROCESSOR_GEOMETRY: t = TARGET_GS; break;
case TGSI_PROCESSOR_COMPUTE: t = TARGET_COMPUTE; break;
default: assert(!"unknown shader target"); return -1; break;
}
@@ -134,8 +137,12 @@ int bc_parser::parse_decls() {
}
}
if (sh->target == TARGET_VS)
if (sh->target == TARGET_VS || sh->target == TARGET_ES)
sh->add_input(0, 1, 0x0F);
else if (sh->target == TARGET_GS) {
sh->add_input(0, 1, 0x0F);
sh->add_input(1, 1, 0x0F);
}
bool ps_interp = ctx.hw_class >= HW_CLASS_EVERGREEN
&& sh->target == TARGET_PS;
@@ -202,7 +209,7 @@ int bc_parser::decode_cf(unsigned &i, bool &eop) {
if (cf->bc.rw_rel)
gpr_reladdr = true;
assert(!cf->bc.rw_rel);
} else if (flags & (CF_STRM | CF_RAT)) {
} else if (flags & CF_MEM) {
if (cf->bc.rw_rel)
gpr_reladdr = true;
assert(!cf->bc.rw_rel);
@@ -676,7 +683,7 @@ int bc_parser::prepare_ir() {
} while (1);
c->bc.end_of_program = eop;
} else if (flags & (CF_STRM | CF_RAT)) {
} else if (flags & CF_MEM) {
unsigned burst_count = c->bc.burst_count;
unsigned eop = c->bc.end_of_program;
@@ -694,7 +701,7 @@ int bc_parser::prepare_ir() {
sh->get_gpr_value(true, c->bc.rw_gpr, s, false);
}
if ((flags & CF_RAT) && (c->bc.type & 1)) { // indexed write
if (((flags & CF_RAT) || (!(flags & CF_STRM))) && (c->bc.type & 1)) { // indexed write
c->src.resize(8);
for(int s = 0; s < 3; ++s) {
c->src[4 + s] =


@@ -349,7 +349,7 @@ void dump::dump_op(node &n, const char *name) {
static const char *exp_type[] = {"PIXEL", "POS ", "PARAM"};
sblog << " " << exp_type[c->bc.type] << " " << c->bc.array_base;
has_dst = false;
} else if (c->bc.op_ptr->flags & CF_STRM) {
} else if (c->bc.op_ptr->flags & (CF_MEM)) {
static const char *exp_type[] = {"WRITE", "WRITE_IND", "WRITE_ACK",
"WRITE_IND_ACK"};
sblog << " " << exp_type[c->bc.type] << " " << c->bc.array_base


@@ -215,7 +215,7 @@ void shader::init() {
void shader::init_call_fs(cf_node* cf) {
unsigned gpr = 0;
assert(target == TARGET_VS);
assert(target == TARGET_VS || target == TARGET_ES);
for(inputs_vec::const_iterator I = inputs.begin(),
E = inputs.end(); I != E; ++I, ++gpr) {
@@ -433,6 +433,7 @@ std::string shader::get_full_target_name() {
const char* shader::get_shader_target_name() {
switch (target) {
case TARGET_VS: return "VS";
case TARGET_ES: return "ES";
case TARGET_PS: return "PS";
case TARGET_GS: return "GS";
case TARGET_COMPUTE: return "COMPUTE";


@@ -59,7 +59,7 @@ void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx,
rusage = RADEON_USAGE_WRITE;
}
if (ctx->rings.gfx.cs->cdw &&
if (ctx->rings.gfx.cs->cdw != ctx->initial_gfx_cs_size &&
ctx->ws->cs_is_buffer_referenced(ctx->rings.gfx.cs,
resource->cs_buf, rusage)) {
if (usage & PIPE_TRANSFER_DONTBLOCK) {


@@ -137,7 +137,7 @@ static const struct debug_named_value common_debug_options[] = {
{ "ps", DBG_PS, "Print pixel shaders" },
{ "cs", DBG_CS, "Print compute shaders" },
{ "nohyperz", DBG_NO_HYPERZ, "Disable Hyper-Z" },
{ "hyperz", DBG_HYPERZ, "Enable Hyper-Z" },
/* GL uses the word INVALIDATE, gallium uses the word DISCARD */
{ "noinvalrange", DBG_NO_DISCARD_RANGE, "Disable handling of INVALIDATE_RANGE map flags" },


@@ -83,7 +83,7 @@
#define DBG_PS (1 << 11)
#define DBG_CS (1 << 12)
/* features */
#define DBG_NO_HYPERZ (1 << 13)
#define DBG_HYPERZ (1 << 13)
#define DBG_NO_DISCARD_RANGE (1 << 14)
/* The maximum allowed bit is 15. */
@@ -241,6 +241,7 @@ struct r600_common_context {
enum radeon_family family;
enum chip_class chip_class;
struct r600_rings rings;
unsigned initial_gfx_cs_size;
struct u_upload_mgr *uploader;
struct u_suballocator *allocator_so_filled_size;
@@ -389,6 +390,15 @@ r600_resource_reference(struct r600_resource **ptr, struct r600_resource *res)
(struct pipe_resource *)res);
}
static inline unsigned r600_tex_aniso_filter(unsigned filter)
{
if (filter <= 1) return 0;
if (filter <= 2) return 1;
if (filter <= 4) return 2;
if (filter <= 8) return 3;
/* else */ return 4;
}
#define R600_ERR(fmt, args...) \
fprintf(stderr, "EE %s:%d %s - "fmt, __FILE__, __LINE__, __func__, ##args)
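
The r600_tex_aniso_filter helper added above maps a maximum anisotropy ratio (1, 2, 4, 8, 16) onto a small field value (0 to 4). The sketch below is an equivalent loop formulation, written under the assumption that the field is meant to encode the rounded-up base-2 logarithm of the ratio, capped at 4; the function name is hypothetical.

```c
#include <assert.h>

/* Equivalent to the if-chain in the diff: find the smallest field value
 * such that (1 << field) >= max_anisotropy, capped at 4 (16x). */
static unsigned aniso_filter_field(unsigned max_anisotropy)
{
    unsigned field = 0;

    while (field < 4 && (1u << field) < max_anisotropy)
        ++field;
    return field;
}
```

This also shows why the SI sampler state change later in this diff replaces `(state->max_anisotropy & 0x7) << 9`: masking the raw ratio maps both 8 and 16 to 0, whereas the bucketed value maps them to 3 and 4.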


@@ -296,6 +296,12 @@ void r600_texture_get_fmask_info(struct r600_common_screen *rscreen,
fmask.nsamples = 1;
fmask.flags |= RADEON_SURF_FMASK;
/* Force 2D tiling if it wasn't set. This may occur when creating
* FMASK for MSAA resolve on R6xx. On R6xx, the single-sample
* destination buffer must have an FMASK too. */
fmask.flags = RADEON_SURF_CLR(fmask.flags, MODE);
fmask.flags |= RADEON_SURF_SET(RADEON_SURF_MODE_2D, MODE);
if (rscreen->chip_class >= SI) {
fmask.flags |= RADEON_SURF_HAS_TILE_MODE_INDEX;
}
@@ -596,7 +602,7 @@ r600_texture_create_object(struct pipe_screen *screen,
if (rtex->is_depth) {
if (!(base->flags & (R600_RESOURCE_FLAG_TRANSFER |
R600_RESOURCE_FLAG_FLUSHED_DEPTH)) &&
!(rscreen->debug_flags & DBG_NO_HYPERZ)) {
(rscreen->debug_flags & DBG_HYPERZ)) {
r600_texture_allocate_htile(rscreen, rtex);
}


@@ -58,6 +58,9 @@
#define NUM_H264_REFS 17
#define NUM_VC1_REFS 5
#define FB_BUFFER_OFFSET 0x1000
#define FB_BUFFER_SIZE 2048
/* UVD buffer representation */
struct ruvd_buffer
{
@@ -81,6 +84,7 @@ struct ruvd_decoder {
struct ruvd_buffer msg_fb_buffers[NUM_BUFFERS];
struct ruvd_msg *msg;
uint32_t *fb;
struct ruvd_buffer bs_buffers[NUM_BUFFERS];
void* bs_ptr;
@@ -131,16 +135,21 @@ static void send_cmd(struct ruvd_decoder *dec, unsigned cmd,
set_reg(dec, RUVD_GPCOM_VCPU_CMD, cmd << 1);
}
/* map the next available message buffer */
static void map_msg_buf(struct ruvd_decoder *dec)
/* map the next available message/feedback buffer */
static void map_msg_fb_buf(struct ruvd_decoder *dec)
{
struct ruvd_buffer* buf;
uint8_t *ptr;
/* grap the current message buffer */
/* grab the current message/feedback buffer */
buf = &dec->msg_fb_buffers[dec->cur_buffer];
/* copy the message into it */
dec->msg = dec->ws->buffer_map(buf->cs_handle, dec->cs, PIPE_TRANSFER_WRITE);
/* and map it for CPU access */
ptr = dec->ws->buffer_map(buf->cs_handle, dec->cs, PIPE_TRANSFER_WRITE);
/* calc buffer offsets */
dec->msg = (struct ruvd_msg *)ptr;
dec->fb = (uint32_t *)(ptr + FB_BUFFER_OFFSET);
}
/* unmap and send a message command to the VCPU */
@@ -148,8 +157,8 @@ static void send_msg_buf(struct ruvd_decoder *dec)
{
struct ruvd_buffer* buf;
/* ignore the request if message buffer isn't mapped */
if (!dec->msg)
/* ignore the request if message/feedback buffer isn't mapped */
if (!dec->msg || !dec->fb)
return;
/* grab the current message buffer */
@@ -157,6 +166,8 @@ static void send_msg_buf(struct ruvd_decoder *dec)
/* unmap the buffer */
dec->ws->buffer_unmap(buf->cs_handle);
dec->msg = NULL;
dec->fb = NULL;
/* and send it to the hardware */
send_cmd(dec, RUVD_CMD_MSG_BUFFER, buf->cs_handle, 0,
@@ -644,7 +655,7 @@ static void ruvd_destroy(struct pipe_video_codec *decoder)
assert(decoder);
map_msg_buf(dec);
map_msg_fb_buf(dec);
memset(dec->msg, 0, sizeof(*dec->msg));
dec->msg->size = sizeof(*dec->msg);
dec->msg->msg_type = RUVD_MSG_DESTROY;
@@ -773,7 +784,7 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
memset(dec->bs_ptr, 0, bs_size - dec->bs_size);
dec->ws->buffer_unmap(bs_buf->cs_handle);
map_msg_buf(dec);
map_msg_fb_buf(dec);
dec->msg->size = sizeof(*dec->msg);
dec->msg->msg_type = RUVD_MSG_DECODE;
dec->msg->stream_handle = dec->stream_handle;
@@ -813,6 +824,10 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
dec->msg->body.decode.db_surf_tile_config = dec->msg->body.decode.dt_surf_tile_config;
dec->msg->body.decode.extension_support = 0x1;
/* set at least the feedback buffer size */
dec->fb[0] = FB_BUFFER_SIZE;
send_msg_buf(dec);
send_cmd(dec, RUVD_CMD_DPB_BUFFER, dec->dpb.cs_handle, 0,
@@ -822,7 +837,7 @@ static void ruvd_end_frame(struct pipe_video_codec *decoder,
send_cmd(dec, RUVD_CMD_DECODING_TARGET_BUFFER, dt, 0,
RADEON_USAGE_WRITE, RADEON_DOMAIN_VRAM);
send_cmd(dec, RUVD_CMD_FEEDBACK_BUFFER, msg_fb_buf->cs_handle,
0x1000, RADEON_USAGE_WRITE, RADEON_DOMAIN_GTT);
FB_BUFFER_OFFSET, RADEON_USAGE_WRITE, RADEON_DOMAIN_GTT);
set_reg(dec, RUVD_ENGINE_CNTL, 1);
flush(dec);
@@ -898,7 +913,8 @@ struct pipe_video_codec *ruvd_create_decoder(struct pipe_context *context,
bs_buf_size = width * height * 512 / (16 * 16);
for (i = 0; i < NUM_BUFFERS; ++i) {
unsigned msg_fb_size = align(sizeof(struct ruvd_msg), 0x1000) + 0x1000;
unsigned msg_fb_size = FB_BUFFER_OFFSET + FB_BUFFER_SIZE;
STATIC_ASSERT(sizeof(struct ruvd_msg) <= FB_BUFFER_OFFSET);
if (!create_buffer(dec, &dec->msg_fb_buffers[i], msg_fb_size)) {
RUVD_ERR("Can't allocated message buffers.\n");
goto error;
@@ -920,7 +936,7 @@ struct pipe_video_codec *ruvd_create_decoder(struct pipe_context *context,
clear_buffer(dec, &dec->dpb);
map_msg_buf(dec);
map_msg_fb_buf(dec);
dec->msg->size = sizeof(*dec->msg);
dec->msg->msg_type = RUVD_MSG_CREATE;
dec->msg->stream_handle = dec->stream_handle;
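
The UVD change above places the message and the feedback area in one allocation: the message starts at offset 0 and the feedback area starts at FB_BUFFER_OFFSET. A minimal sketch of that layout follows. The two constants are taken from the diff; struct demo_msg, struct demo_mapping and map_msg_fb are hypothetical stand-ins, since the real struct ruvd_msg is not shown here.

```c
#include <assert.h>
#include <stdint.h>

#define FB_BUFFER_OFFSET 0x1000
#define FB_BUFFER_SIZE   2048

/* Hypothetical stand-in for struct ruvd_msg; only its size matters here. */
struct demo_msg {
    uint32_t size;
    uint32_t msg_type;
    uint32_t stream_handle;
};

/* The message must fit in front of the feedback area. */
_Static_assert(sizeof(struct demo_msg) <= FB_BUFFER_OFFSET,
               "message overlaps feedback area");

struct demo_mapping {
    struct demo_msg *msg; /* start of the allocation */
    uint32_t *fb;         /* FB_BUFFER_OFFSET bytes into the allocation */
};

/* Split one mapped allocation into the message and feedback views.
 * The caller must pass a pointer that is suitably aligned for both. */
static struct demo_mapping map_msg_fb(uint8_t *ptr)
{
    struct demo_mapping m;

    m.msg = (struct demo_msg *)ptr;
    m.fb = (uint32_t *)(ptr + FB_BUFFER_OFFSET);
    return m;
}
```

With this layout each buffer needs FB_BUFFER_OFFSET + FB_BUFFER_SIZE bytes (0x1800), which matches the new msg_fb_size computation in ruvd_create_decoder.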


@@ -151,7 +151,7 @@ static void si_update_descriptors(struct si_context *sctx,
7 + /* copy */
(4 + desc->element_dw_size) * util_bitcount(desc->dirty_mask) + /* update */
4; /* pointer update */
#if HAVE_LLVM >= 0x0305
#if LLVM_SUPPORTS_GEOM_SHADERS
if (desc->shader_userdata_reg >= R_00B130_SPI_SHADER_USER_DATA_VS_0 &&
desc->shader_userdata_reg < R_00B230_SPI_SHADER_USER_DATA_GS_0)
desc->atom.num_dw += 4; /* second pointer update */
@@ -176,7 +176,7 @@ static void si_emit_shader_pointer(struct si_context *sctx,
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
#if HAVE_LLVM >= 0x0305
#if LLVM_SUPPORTS_GEOM_SHADERS
if (desc->shader_userdata_reg >= R_00B130_SPI_SHADER_USER_DATA_VS_0 &&
desc->shader_userdata_reg < R_00B230_SPI_SHADER_USER_DATA_GS_0) {
radeon_emit(cs, PKT3(PKT3_SET_SH_REG, 2, 0));


@@ -81,7 +81,7 @@ void si_context_flush(struct si_context *ctx, unsigned flags)
{
struct radeon_winsys_cs *cs = ctx->b.rings.gfx.cs;
if (!cs->cdw)
if (cs->cdw == ctx->b.initial_gfx_cs_size)
return;
/* suspend queries */
@@ -177,6 +177,8 @@ void si_begin_new_cs(struct si_context *ctx)
}
si_all_descriptors_begin_new_cs(ctx);
ctx->b.initial_gfx_cs_size = ctx->b.rings.gfx.cs->cdw;
}
#if SI_TRACE_CS
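
The flush check above changes from "the command stream is empty" to "the command stream contains nothing beyond what was emitted when it was started". A minimal sketch of that bookkeeping, with hypothetical demo_ types standing in for the real context and winsys structures:

```c
#include <assert.h>
#include <stdbool.h>

struct demo_cs {
    unsigned cdw; /* dwords written to the command stream so far */
};

struct demo_ctx {
    struct demo_cs cs;
    unsigned initial_gfx_cs_size; /* cdw right after the preamble */
};

/* Start a new command stream: emit the preamble, then record its size. */
static void demo_begin_new_cs(struct demo_ctx *ctx, unsigned preamble_dw)
{
    ctx->cs.cdw = preamble_dw;
    ctx->initial_gfx_cs_size = ctx->cs.cdw;
}

/* A flush is only worthwhile if something was added after the preamble. */
static bool demo_needs_flush(const struct demo_ctx *ctx)
{
    return ctx->cs.cdw != ctx->initial_gfx_cs_size;
}
```

The same comparison is used in r600_buffer_map_sync_with_rings earlier in this diff, so a buffer map does not trigger a flush of a command stream that holds only its initial state.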


@@ -269,7 +269,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
return 256;
case PIPE_CAP_GLSL_FEATURE_LEVEL:
return HAVE_LLVM >= 0x0305 ? 330 : 140;
return (LLVM_SUPPORTS_GEOM_SHADERS) ? 330 : 140;
case PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT:
return 1;
@@ -299,13 +299,22 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return sscreen->b.has_streamout ? 32*4 : 0;
/* Geometry shader output. */
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
return 1024;
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 4095;
/* Texturing. */
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS:
return 15;
return 15; /* 16384 */
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:
/* textures support 8192, but layered rendering supports 2048 */
return 12;
case PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS:
return 16384;
/* textures support 8192, but layered rendering supports 2048 */
return 2048;
case PIPE_CAP_MAX_COMBINED_SAMPLERS:
return HAVE_LLVM >= 0x0305 ? 48 : 32;
@@ -340,7 +349,7 @@ static int si_get_shader_param(struct pipe_screen* pscreen, unsigned shader, enu
case PIPE_SHADER_VERTEX:
break;
case PIPE_SHADER_GEOMETRY:
#if HAVE_LLVM < 0x0305
#if !(LLVM_SUPPORTS_GEOM_SHADERS)
return 0;
#endif
break;
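
The *_LEVELS caps above count mipmap levels rather than texels, which is why 15 is annotated with 16384 and the 3D cap of 12 corresponds to the 2048 limit in the comment. A minimal sketch of the conversion, assuming a power-of-two maximum size; the helper name is hypothetical:

```c
#include <assert.h>

/* Number of mip levels for a power-of-two maximum dimension:
 * 1 -> 1 level, 2 -> 2 levels, ..., 2^n -> n + 1 levels. */
static unsigned max_levels_for_size(unsigned max_size)
{
    unsigned levels = 0;

    while (max_size) {
        ++levels;
        max_size >>= 1;
    }
    return levels;
}
```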


@@ -39,6 +39,10 @@
#define SI_MAX_DRAW_CS_DWORDS 18
#define LLVM_SUPPORTS_GEOM_SHADERS \
((HAVE_LLVM >= 0x0305) || \
(HAVE_LLVM == 0x0304 && LLVM_VERSION_PATCH >= 1))
struct si_pipe_compute;
struct si_screen {
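
The LLVM_SUPPORTS_GEOM_SHADERS macro above widens the old `HAVE_LLVM >= 0x0305` test so that LLVM 3.4 with patch level 1 or later also qualifies. The sketch below restates the predicate as a function so individual version pairs can be checked; the function names are hypothetical, and the 330/140 values are the GLSL feature levels returned by si_get_param in this diff.

```c
#include <assert.h>
#include <stdbool.h>

/* have_llvm uses the 0xMMmm encoding from the diff (0x0305 is LLVM 3.5);
 * patch corresponds to LLVM_VERSION_PATCH. */
static bool llvm_supports_geom_shaders(unsigned have_llvm, unsigned patch)
{
    return have_llvm >= 0x0305 ||
           (have_llvm == 0x0304 && patch >= 1);
}

/* Mirrors the PIPE_CAP_GLSL_FEATURE_LEVEL case in si_get_param. */
static int glsl_feature_level(unsigned have_llvm, unsigned patch)
{
    return llvm_supports_geom_shaders(have_llvm, patch) ? 330 : 140;
}
```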


@@ -2307,7 +2307,7 @@ static void *si_create_fs_state(struct pipe_context *ctx,
return si_create_shader_state(ctx, state, PIPE_SHADER_FRAGMENT);
}
#if HAVE_LLVM >= 0x0305
#if LLVM_SUPPORTS_GEOM_SHADERS
static void *si_create_gs_state(struct pipe_context *ctx,
const struct pipe_shader_state *state)
@@ -2337,7 +2337,7 @@ static void si_bind_vs_shader(struct pipe_context *ctx, void *state)
sctx->vs_shader = sel;
}
#if HAVE_LLVM >= 0x0305
#if LLVM_SUPPORTS_GEOM_SHADERS
static void si_bind_gs_shader(struct pipe_context *ctx, void *state)
{
@@ -2396,7 +2396,7 @@ static void si_delete_vs_shader(struct pipe_context *ctx, void *state)
si_delete_shader_selector(ctx, sel);
}
#if HAVE_LLVM >= 0x0305
#if LLVM_SUPPORTS_GEOM_SHADERS
static void si_delete_gs_shader(struct pipe_context *ctx, void *state)
{
@@ -2723,16 +2723,15 @@ static void *si_create_sampler_state(struct pipe_context *ctx,
rstate->val[0] = (S_008F30_CLAMP_X(si_tex_wrap(state->wrap_s)) |
S_008F30_CLAMP_Y(si_tex_wrap(state->wrap_t)) |
S_008F30_CLAMP_Z(si_tex_wrap(state->wrap_r)) |
(state->max_anisotropy & 0x7) << 9 | /* XXX */
r600_tex_aniso_filter(state->max_anisotropy) << 9 |
S_008F30_DEPTH_COMPARE_FUNC(si_tex_compare(state->compare_func)) |
S_008F30_FORCE_UNNORMALIZED(!state->normalized_coords) |
aniso_flag_offset << 16 | /* XXX */
S_008F30_DISABLE_CUBE_WRAP(!state->seamless_cube_map));
rstate->val[1] = (S_008F34_MIN_LOD(S_FIXED(CLAMP(state->min_lod, 0, 15), 8)) |
S_008F34_MAX_LOD(S_FIXED(CLAMP(state->max_lod, 0, 15), 8)));
rstate->val[2] = (S_008F38_LOD_BIAS(S_FIXED(CLAMP(state->lod_bias, -16, 16), 8)) |
S_008F38_XY_MAG_FILTER(si_tex_filter(state->mag_img_filter)) |
S_008F38_XY_MIN_FILTER(si_tex_filter(state->min_img_filter)) |
S_008F38_XY_MAG_FILTER(si_tex_filter(state->mag_img_filter) | aniso_flag_offset) |
S_008F38_XY_MIN_FILTER(si_tex_filter(state->min_img_filter) | aniso_flag_offset) |
S_008F38_MIP_FILTER(si_tex_mipfilter(state->min_mip_filter)));
rstate->val[3] = S_008F3C_BORDER_COLOR_TYPE(border_color_type);
@@ -2890,7 +2889,7 @@ static void si_bind_vs_sampler_states(struct pipe_context *ctx, unsigned count,
si_set_sampler_states(sctx, pm4, count, states,
&sctx->samplers[PIPE_SHADER_VERTEX],
R_00B130_SPI_SHADER_USER_DATA_VS_0);
#if HAVE_LLVM >= 0x0305
#if LLVM_SUPPORTS_GEOM_SHADERS
si_set_sampler_states(sctx, pm4, count, states,
&sctx->samplers[PIPE_SHADER_VERTEX],
R_00B330_SPI_SHADER_USER_DATA_ES_0);
@@ -3166,7 +3165,7 @@ void si_init_state_functions(struct si_context *sctx)
sctx->b.b.bind_fs_state = si_bind_ps_shader;
sctx->b.b.delete_vs_state = si_delete_vs_shader;
sctx->b.b.delete_fs_state = si_delete_ps_shader;
#if HAVE_LLVM >= 0x0305
#if LLVM_SUPPORTS_GEOM_SHADERS
sctx->b.b.create_gs_state = si_create_gs_state;
sctx->b.b.bind_gs_state = si_bind_gs_shader;
sctx->b.b.delete_gs_state = si_delete_gs_shader;


@@ -121,6 +121,9 @@ softpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
return 16*4;
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 0;
case PIPE_CAP_PRIMITIVE_RESTART:
return 1;
case PIPE_CAP_SHADER_STENCIL_EXPORT:

Some files were not shown because too many files have changed in this diff